NATIONAL INFORMATICS CENTRE

Frequently Asked Questions:

Q. What is special about the Modelling & Simulation area?

Ans. It is a specialized area, like GIS, CAD or networking. Once you have the data or database(s), off-line as well as on-line analyses can be carried out using specialized quantitative, qualitative, management and simulation techniques. Depending on the pre-defined aim and objectives, a specific technique is applied for scientific processing of the information and in-depth analysis. The results are unbiased and optimal because they rest on scientifically proven and tested techniques. Each application is specific to a technique, and the basic assumptions of that technique must be fulfilled in order to obtain optimal results that are implementable.

Q. What are the pre-requisites?

Ans. The major pre-requisites include domain and subject-area knowledge, exposure to the software tools, the requisite databases/information, details of the inputs and outputs, and knowledge of the theoretical and practical aspects needed to interpret the output(s).

Q. What are the different areas of modelling and simulation techniques?

Ans. The major areas include pre-processing of large statistical information; statistical models and analyses; linear and non-linear optimization, including multi-objective criteria; econometric systems; time series analyses (univariate and multivariate); CPM/PERT-based project management, including project simulation; neural-network-based forecasting systems; and System Dynamics. Each of these areas is specific to a situation, and the pre-requisites for each must be known before developing a theoretical mathematical model and interpreting its output(s)/results.

Q. Why are there different techniques in each modelling area?

Ans. Statistical analysis, for example, is very broad and includes a large number of techniques; each technique has a specific application and applies to a particular situation. The theory behind the technique to be applied to a particular situation must be known in advance. Regression, for instance, is a specific technique within the statistical area and can be applied to build a relationship model between dependent and independent variables. Likewise, a number of other techniques fall into the area of statistical analysis, and the situation is similar in other areas, such as linear and non-linear optimization using operational research techniques. In-depth theoretical knowledge is a must for developing a system based on modelling technique(s).

Q. What is Time Series & Forecasting Analysis?

Ans. Once data are available at homogeneous (equal) time intervals for a number of periods, a number of time series techniques can be applied. Depending on the behaviour of the past data, a time series model is built so as to forecast the variable of interest for a number of periods ahead. Time series techniques, neural-network-based applications and econometric systems all fall broadly in the area of forecasting. Each of these areas has different requirements in terms of data availability and time period, and each technique has its own potential for a given situation.
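
As an illustration of building a model from past behaviour and forecasting ahead, the sketch below uses simple exponential smoothing, which is only one of many time series techniques and is chosen here purely as an example. The function name, the smoothing constant alpha and the sample figures are assumptions made for this sketch; the method assumes equally spaced observations with no trend or seasonality.

```python
def exponential_smoothing_forecast(series, alpha, periods_ahead):
    """Forecast with simple exponential smoothing (illustrative sketch only).

    series        -- past observations at equal time intervals, oldest first
    alpha         -- smoothing constant, 0 < alpha <= 1 (an assumed tuning value)
    periods_ahead -- number of future periods to forecast
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in the interval (0, 1]")

    # Start from the first observation and update the smoothed level
    # with each later observation.
    level = series[0]
    for observation in series[1:]:
        level = alpha * observation + (1 - alpha) * level

    # Simple exponential smoothing gives a flat forecast: the last
    # smoothed level is repeated for every future period.
    return [level] * periods_ahead


# Hypothetical figures, for illustration only.
print(exponential_smoothing_forecast([10, 12, 14], alpha=0.5, periods_ahead=2))
```

A series with a clear trend or seasonal pattern would need a different technique; the flat forecast here is a property of this particular method.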

Q. What is Statistical Analysis?

Ans. When there are a number of observations on a set of dependent and independent variables and one wishes to establish the relationship between them in order to understand their likely behaviour in the near future, different statistical techniques can be applied. In short, regression analysis (linear or non-linear) is performed to study the behaviour of the variables, the dependency between them and their likely behaviour in the future. A large number of other statistical techniques are also available, such as factor analysis, multivariate analysis, canonical analysis, design of experiments and analysis of variance (ANOVA).
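
The relationship between one dependent and one independent variable can be sketched with ordinary least squares. This is a minimal illustration, not a substitute for a statistical package: the function name and the sample data are assumptions, and a real analysis would also test the fit and the underlying assumptions.

```python
def fit_line(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x values must not all be identical")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations lying exactly on y = 1 + 2x.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The fitted line can then be used to estimate the dependent variable for a new value of the independent variable, within the range the data support.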

Q. What is an optimization model and what are its pre-requisites?

Ans. This falls in the area of operational research and deals with optimizing (maximizing or minimizing) an objective function subject to a number of constraints. Examples: resource planning; inventory control systems; resource allocation and planning; decentralized planning; optimal facilities location; transportation and distribution models. The model optimizes the objective function subject to the defined constraints of the system. It could be a linear, non-linear or multi-objective optimization system.
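
A small linear model of this kind can be sketched as follows. The problem, its coefficients and the function name are invented for illustration. The sketch handles two decision variables only and assumes a bounded feasible region, so that the optimum lies at a corner point where two constraint lines intersect; production work would use a proper optimization package.

```python
from itertools import combinations

# Hypothetical resource-allocation problem (all figures invented):
#   maximize  3x + 5y
#   subject to  x <= 4,  2y <= 12,  3x + 2y <= 18,  x >= 0,  y >= 0
OBJECTIVE = (3.0, 5.0)

# Each constraint (a, b, c) means a*x + b*y <= c.
CONSTRAINTS = [
    (1.0, 0.0, 4.0),
    (0.0, 2.0, 12.0),
    (3.0, 2.0, 18.0),
    (-1.0, 0.0, 0.0),  # x >= 0
    (0.0, -1.0, 0.0),  # y >= 0
]


def solve_2d_lp(objective, constraints, eps=1e-9):
    """Maximize a two-variable linear objective by checking corner points."""
    best_point, best_value = None, None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < eps:
            continue  # parallel lines never intersect in a single point
        # Intersection of the two constraint lines (Cramer's rule).
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        # Keep the point only if it satisfies every constraint.
        if all(a * x + b * y <= c + eps for a, b, c in constraints):
            value = objective[0] * x + objective[1] * y
            if best_value is None or value > best_value:
                best_point, best_value = (x, y), value
    return best_point, best_value


point, value = solve_2d_lp(OBJECTIVE, CONSTRAINTS)
print(point, value)
```

A minimization problem can be handled by negating the objective coefficients.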

Q. Project Management for monitoring & control

Ans. CPM/PERT techniques are applicable during the planning and execution of a project. Defining the activities, knowing the relationships between them and the duration of each activity are some of the pre-requisites. Resource allocation in a dynamic environment, time and cost over-runs, and the optimal resources required to complete an activity can be monitored on-line. The likely completion date and the resources/cost required can also be monitored on-line. The status of a project at any point in time, together with the likely slippage in terms of time, cost and resource requirements, can be obtained on-line. These are powerful techniques for on-line monitoring and control during the execution stage of a project. There are web-enabled project management software packages for monitoring at different levels during execution. This is one of the potential areas for on-line monitoring and control of the large number of projects being executed by the states, the centre and PSUs in the country.
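
The pre-requisites named above (activities, their relationships and durations) are exactly the inputs of a critical path calculation. The sketch below is a minimal CPM forward and backward pass; the activity names, durations and function name are hypothetical, and the activity network is assumed to contain no cycles.

```python
# Hypothetical activities: name -> (duration in days, list of predecessors).
ACTIVITIES = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (2, ["B", "C"]),
}


def critical_path(activities):
    """Return (project duration, slack per activity, critical activities)."""
    earliest_finish = {}

    def finish(name):
        # Forward pass: an activity starts when all predecessors have finished.
        if name not in earliest_finish:
            duration, predecessors = activities[name]
            start = max((finish(p) for p in predecessors), default=0)
            earliest_finish[name] = start + duration
        return earliest_finish[name]

    project_duration = max(finish(name) for name in activities)

    # Build the successor lists needed for the backward pass.
    successors = {name: [] for name in activities}
    for name, (_, predecessors) in activities.items():
        for p in predecessors:
            successors[p].append(name)

    latest_finish = {}

    def latest(name):
        # Backward pass: an activity must finish before its earliest-needed
        # successor has to start; end activities finish at project end.
        if name not in latest_finish:
            latest_finish[name] = min(
                (latest(s) - activities[s][0] for s in successors[name]),
                default=project_duration,
            )
        return latest_finish[name]

    slack = {name: latest(name) - finish(name) for name in activities}
    critical = [name for name in activities if slack[name] == 0]
    return project_duration, slack, critical


print(critical_path(ACTIVITIES))
```

Activities with zero slack form the critical path: any delay in them delays the whole project, which is why they are the first candidates for on-line monitoring.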

Q. Large national/state level Survey Design & Analysis

Ans. The availability of NICNET across the country is a powerful tool for undertaking on-line survey design as well as on-line analysis. A web-enabled survey can be planned using decentralized collection and validation, and the reports can be generated on-line. The time and resources required for a survey can be reduced significantly by using state-of-the-art survey design and analysis software tools, which also minimize time and cost over-runs and ease implementation.

Q.  Different software tools for analyses?

Ans. There are a number of modelling areas, and in each of them there are a number of techniques, each applicable to a particular type of analysis. In addition, a large number of software tools are available for each of these techniques, depending on complexity and use. One has to select a specific software tool depending on the local requirement. The tools can be classified into high, medium and low range in terms of cost and the facilities available. If the analysis is required only once, the application development is different from that for an on-line or long-term requirement.

Q.  Software tool(s) & Application development

Ans. The software tools are generic, whereas application development is specific to an application. For example, an RDBMS such as Oracle or DB2 is used to develop a specific application. Similarly, an SPSS-based module is used to develop a specific application or analysis using multivariate regression.

Q. Software & Application development Cost?

Ans. Most software in the area of modelling and simulation is costly, with or without an annual renewal fee. The development cost is also high, mainly because the software is used by specialized divisions and the demand is very small. As in the CAD and GIS areas, the number of users is small compared with RDBMS users, which results in a high cost for the software tools and technical expertise.

Q. Training pre-requisite for modelling and application development

Ans. In-depth subject knowledge is a must, besides an aptitude for the area. Econometric models can be developed faster and much more efficiently if a person has basic knowledge of econometric techniques. Similarly, optimization models require basic to in-depth knowledge of optimization techniques/theory and their applications in Operational Research, as well as in-depth knowledge of model building, testing and interpretation of the final results.

Q.  What is special about Neural Network based Forecasting application development?

Ans. This is the latest approach in the field for forecasting applications, and the domain knowledge required is very limited: the system automatically develops a class of models, tests them and selects the final forecast model. In-depth knowledge of statistical and econometric model building is not required, as the same has been incorporated in neural-network-based systems. However, a theoretical background is required to interpret the output.

Q. Minimum number of observations for modelling & analyses

Ans. A model requires a minimum number of observations, which depends on the area and the technique to be applied, because the underlying theory assumes a minimum number of observations. In statistical model building, the theory rests on the law of large numbers, and the minimum number of observations required is commonly taken as 30. Similarly, in optimization model building the number of constraints is normally greater than the number of variables in the objective function. In multi-variable regression analysis one can run the analysis on a set of twenty observations, but this is not statistically sound; there should be at least thirty. One will still get output from 20 or fewer observations, but it will be misleading because it does not satisfy the law of large numbers, a basic requirement for regression analysis.

Q. Homogeneous & heterogeneous data

Ans. This basically deals with the time interval as well as the observations for all the variables. In time series analysis the time intervals should be homogeneous; otherwise time series techniques cannot be applied. If one has one dependent variable and 20 independent variables, twenty observations should be available for every one of the variables (dependent and independent). If there are missing values, they must be treated; otherwise the corresponding statistical model will not be correct. The available techniques require homogeneous data only. In very few cases can heterogeneous data be used for analysis.
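
The two checks described above can be sketched as follows: one function tests whether the observation dates are equally spaced, and another treats missing values. Linear interpolation is used here only as one possible treatment; it is an assumption of this sketch, as are the function names and the sample figures. Spacing is measured in days, so calendar-month data would need a different test.

```python
from datetime import date


def has_homogeneous_intervals(dates):
    """True if consecutive observation dates are all the same number of days apart."""
    gaps = {(later - earlier).days for earlier, later in zip(dates, dates[1:])}
    return len(gaps) <= 1


def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours.

    The first and last observations must be present, because there is
    nothing to interpolate from outside the series.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    if not known or known[0] != 0 or known[-1] != len(filled) - 1:
        raise ValueError("first and last observations must be present")

    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical weekly series with gaps, for illustration only.
weekly = [date(2024, 1, 1), date(2024, 1, 8), date(2024, 1, 15)]
print(has_homogeneous_intervals(weekly))
print(interpolate_missing([10.0, None, 14.0, None, None, 20.0]))
```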

Q. Different software tools for modelling application development

Ans. There are a number of software tools in the various areas of modelling and simulation, depending on whether the requirement is short-term or long-term and on the associated cost. One has to select a software tool with an in-depth understanding of its capabilities as well as its limitations. The potential and limitations of the selected software should be known well in advance, before undertaking application development. A large number of software tools come without technical support, including training; this has to be kept in mind before finalizing a software tool.

Q. Why is the basic cost of software tools so high?

Ans. These are not generic tools; they are required by specialists for specific applications, which are limited in number compared with generalized applications such as RDBMS. The development cost is high and the usage is limited, and limited usage makes the cost high, including support and training. The scenario is changing fast, even in the developing world: RDBMS are now in place, and organizations are looking for decisions based on scientific processes, using quantitative, qualitative, management, simulation, data warehousing, OLAP and data mining tools. The basic cost and availability of these tools are improving with time.

Q.  Pre-requisite for Network based Computer Aided Project Management?

Ans. The CPM/PERT-based project management and monitoring software should be web-enabled, so that projects can be monitored and controlled on-line to reduce time and cost over-runs. Moreover, large projects are multi-locational and are executed by multiple agencies, so a network-based project management system is a must. A NICNET-based project monitoring system is the solution for such execution and monitoring, whether it is an on-line survey for a state or the whole country, or the monitoring of a state-level power project to minimize time and cost over-runs.

Q. What is the strength of NIC for modelling application development & implementation?

Ans. The basic data in almost all sectors of the economy generated by government agencies (central, state or district) is available on the NIC network. Most of the sectoral RDBMS are in place, and network availability across the country is fully functional and stable, which is the basic requirement for monitoring and control of government (centre and state) projects.

    There is a foolproof and tested mechanism to regularly update the various databases in almost all sectors of the economy. On-line analyses through the RDBMS systems are in place, in use and stable. There is a regular mechanism to download and upload data to the various levels of integration. Also, the data can be integrated at any level: within a state through the districts, or at the centre through the various states/regions. Centralized and decentralized processing can be undertaken at any point in time, and the analyses can be taken up to the desired level of usage. There are specific divisions responsible for developing applications and giving support to the centre and state departments/ministries.

    A data warehouse can be developed at any desired level for any government department for on-line analyses using data, text and graphical information, besides undertaking in-depth OLAP and data mining applications.

Q. What value does business intelligence provide to my organization?

Ans. Business intelligence (BI) provides timely and accurate information to better understand your organization and to make more informed, real-time decisions. Full utilization of BI solutions can optimize organization processes and resources, improve decision making and maximize profits or minimize costs.

Q. What real-world advantages will a data warehouse provide?

Ans. Some advantages of a well-constructed data warehouse include better data quality, accurate trend analysis, easier access to data and easier decision making.

Once the databases on heterogeneous platforms and heterogeneous RDBMS have been developed, the next logical step is on-line, in-depth data analysis, from simple to complex, for better decision making.

Q. What is the single most important objective in building a data warehouse?

Ans. The most difficult task in initially creating, and then maintaining, a data warehouse is ensuring the validity of the information stored in it, so data validity is the most important objective. Collecting and cleansing data from many systems is not an easy task, and when a data warehouse project fails it is usually because the data cannot be validated.

Q. What is the average cost of building a data warehouse?

Ans. It is very difficult to define an average cost for a data warehouse, as it depends on a number of factors, such as:

  size of the data warehouse 
  number of users 
  source data: number of source files, complexity, cleanliness, documentation, etc. 
  the platform(s)
  the software 
  contractors and consultants used 
  knowledge and capabilities of the team 
  what's in place today 
  how easy you will make it for the users 

Q. We want to implement a DW. Which tools and database should we select?

Ans. Before you even consider tools and an RDBMS, you need to figure out what you are doing and why you are doing it. What are your requirements? What type of DW architecture are you building? Who is your user community, and what are their analysis patterns and needs? Only when this is established and understood can you objectively analyze the tools and systems best suited to the task you have identified. You can't evaluate the appropriate solution unless you clearly see the problem that it's meant to solve. Features and functions are meaningless unless they directly apply to your problem.

Q. What is the purpose of using OLAP (On-Line Analytical Processing)?

Ans. OLAP servers organize, calculate and associate data so that intelligent relationships and deductions can be established. OLAP servers also let users model scenarios for predicting outcomes and determining "what-if" situations. Specifically, OLAP functionality is characterized by dynamic multidimensional analysis of consolidated organization data. An OLAP server also supports end-user analytical and navigational activities, including:

*   Calculations across dimensions

*   Trend analysis in sequential time periods

*   Slicing data for on-screen viewing

*   Drilling down the data to deeper levels of consolidation
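
The activities listed above can be illustrated on a tiny in-memory data set. This is only a sketch of the ideas, not how an OLAP server is built: the fact rows, dimension names and the aggregate function are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, state, district, amount). All values invented.
DIMENSIONS = ("year", "state", "district")
FACTS = [
    (2001, "State A", "District 1", 120),
    (2001, "State A", "District 2", 80),
    (2001, "State B", "District 3", 150),
    (2002, "State A", "District 1", 130),
    (2002, "State B", "District 3", 170),
]


def aggregate(facts, group_by, where=None):
    """Sum the amount per combination of the group_by dimensions.

    where -- optional dict of dimension -> value used to slice the data first
    """
    where = where or {}
    position = {name: i for i, name in enumerate(DIMENSIONS)}
    totals = defaultdict(int)
    for row in facts:
        if all(row[position[d]] == v for d, v in where.items()):
            key = tuple(row[position[d]] for d in group_by)
            totals[key] += row[-1]  # the amount is the last field
    return dict(totals)


# Trend analysis over sequential time periods.
print(aggregate(FACTS, ["year"]))
# Slice: only the year 2001, viewed by state.
print(aggregate(FACTS, ["state"], where={"year": 2001}))
# Drill down: from the State A total to its districts.
print(aggregate(FACTS, ["district"], where={"state": "State A"}))
```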

Q. What are the major differences between MOLAP (multidimensional OLAP) and ROLAP (relational OLAP)?

Ans. MOLAP tools use a pre-calculated data set, commonly referred to as a data cube, and therefore give a fast response. MOLAP systems are generally used for a bounded problem set. ROLAP tools are best suited to users who have an "unbounded" problem set.

ROLAP tools do not use pre-calculated data cubes. Instead, they query the standard relational database and its tables to answer each question, so ROLAP systems are comparatively slow.

Hybrid OLAP systems can use the capabilities of both pre-calculated cubes and relational data sources.
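
The contrast can be sketched with an in-memory SQLite database standing in for the relational source. The table, column and function names and the figures are assumptions made for this illustration; real MOLAP and ROLAP products are far more elaborate.

```python
import sqlite3

# Hypothetical rows: (year, state, amount). All values invented.
ROWS = [
    (2001, "State A", 120),
    (2001, "State A", 80),
    (2001, "State B", 150),
    (2002, "State A", 130),
]


def make_db(rows):
    """Create an in-memory relational table holding the detail rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (year INTEGER, state TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    return conn


def rolap_total(conn, year, state):
    """ROLAP style: every question becomes a fresh SQL query on the base table."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM sales WHERE year = ? AND state = ?",
        (year, state),
    ).fetchone()
    return row[0]


def build_cube(conn):
    """MOLAP style: aggregate once up front, then answer from the stored cube."""
    cursor = conn.execute(
        "SELECT year, state, SUM(amount) FROM sales GROUP BY year, state"
    )
    return {(year, state): total for year, state, total in cursor}


conn = make_db(ROWS)
cube = build_cube(conn)
print(rolap_total(conn, 2001, "State A"))  # computed by the database on demand
print(cube[(2001, "State A")])             # looked up in the pre-calculated cube
```

The cube answers only the questions it was built for (here, totals by year and state), which is the "bounded problem set"; the SQL route can answer any question the base table supports, at the cost of running a query each time.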

Q. What is the difference between OLAP and data mining?

Ans. OLAP is a technical concept where, most often, a reporting tool is configured to understand your database schema and composition (facts [values and attributes] and dimensions [levels and descriptors]). By simple point-and-click, a user can run any number of user-designed reports without having to know anything about SQL or the schema.

Data mining, by contrast, is used for detailed, often algorithmic analyses; mining may actually produce "models" that can then be analyzed. Data mining is the "automatic" extraction of patterns of information from historical data, enabling companies to focus on the most important aspects of their business by telling them what they did not know and had not even thought of asking.

Q. What is the major difference between data reporting and data mining?

Ans. There are a lot of software companies and experts claiming to do “data mining” when in fact they are performing “data reporting.” Data mining starts with data integration and cleansing and ends with predictions together with their accuracy. In addition, it ranks the predictive parameters and tells users how well they account for the observed variability.

The problem with data mining is that it can’t be done well without
a) time,
b) subject knowledge,
c) real data mining software(s), and
d) developing an application (DSS) for an end user.

Q. What are the requirements for data mining?

Ans. The basic requirements for obtaining simple to complex analyses from data mining tools are:

- Databases (mainly historical) and a mechanism for regular on-line updating of the data on a long-term basis.

- Subject as well as expert domain knowledge, besides the use of state-of-the-art data mining tools.

  Evaluation criteria to select a data mining tool should include:

a) Easy to Use: It should be easy to use for an expert. Data mining software should not be an MS Office look-alike, where people learning the job can intuitively play with it and get results. If they do, they’ll come to you with great graphs that point you in bizarre directions.

b) Speed: Three weeks to build a model is not something that can be implemented in today's business. Not in data mining at least. For any type of data mining project that will actually tell you something real you may very well take a couple of months just to prepare the data.

c) Integration: All high-end software packages can integrate on all platforms and file systems. All cheaper software packages say they do.

d) Robustness: Robustness, however, should be a function of the model. You need to develop an appropriate model, and that is where most data mining projects fail.

Q. What are the principal differences between data mining and statistical analysis of the data?

Ans. There is no real difference between data mining and statistical analysis of the data. Data mining uses some of the statistical tools for analysing the data and adds to them methods derived from heuristic techniques and, more recently, information theory.

Statistical analysis is itself a broad term; moreover, most statistical tools work off-line, while data mining can be done on-line.

Q. What is the relationship between Data Mining and Data Warehouses/Data Marts?

Ans. Data mining can use data from a data warehouse as well as from a data mart for analysis. Data warehousing is the process of integrating enterprise-wide corporate data into a single repository, whereas a "data mart" is a department- or function-oriented data warehouse. It is a scaled-down version of a data warehouse that focuses on the local needs of a specific department, such as finance or purchasing. An organization may have many data marts, each focused on a subset of a distinct organizational activity.

Q. What are the objectives of Text Mining?

Ans. The objectives of Text Mining can be varied:

  • To quantify a text or parts of a text in order to extract its strongest meaning structures,
  • To establish links between the terms and the documents,
  • To analyze documents by associating them with structured qualitative and quantitative information,
  • To lay down rules for the automatic classification of documents.
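
The first two objectives, quantifying a text and linking terms to documents, can be sketched with term counts and an inverted index. The sample documents, the crude tokenizer and the function names are assumptions made for this illustration; real text mining tools add language-specific processing on top.

```python
import re
from collections import Counter, defaultdict

# Hypothetical documents, for illustration only.
DOCUMENTS = {
    "doc1": "Data mining finds patterns in data.",
    "doc2": "Text mining finds patterns in documents.",
}


def tokenize(text):
    """Split text into lower-case alphabetic terms (a deliberately crude rule)."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(documents):
    """Return (term counts per document, inverted index term -> document ids)."""
    term_counts = {
        doc_id: Counter(tokenize(text)) for doc_id, text in documents.items()
    }
    index = defaultdict(set)
    for doc_id, counts in term_counts.items():
        for term in counts:
            index[term].add(doc_id)
    return term_counts, dict(index)


term_counts, index = build_index(DOCUMENTS)
print(term_counts["doc1"].most_common(1))  # the most frequent term in doc1
print(sorted(index["mining"]))             # documents that contain "mining"
```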

Q. How are data warehousing and data mining useful in government departments?

Ans. Most government departments have scattered databases which exist on different platforms and operating systems. To set up coordination between them and remove duplication, there is a strong need to build a data warehouse which can serve the organization as a whole and make it easier for higher-level professionals to take decisions.

Q. How can the National Informatics Centre help to build Decision Support Systems around DW and DM tools for government departments?

Ans. Since NIC maintains large databases for most government departments, it will be easy to develop a data warehouse/mart, which is otherwise a very time-consuming, costly and cumbersome process. Since NIC has an already-built infrastructure with network facilities at all state and district levels, on-line transmission of data from one place to another can be easily carried out, fulfilling the major requirement of regularly updating the databases and thus the data warehouse.
