Frequently Asked Questions:
Q. What is special about Modelling & Simulation Area?
Ans. It is one of the specialized areas like GIS, CAD, networking, etc. Once you
have the data or database(s), off-line as well as on-line analyses
using specialized quantitative, qualitative, management and simulation
technique(s) can be applied with pre-defined objective(s). Depending
upon the aim and objective, a specific technique can be applied
for scientific processing of information and in-depth analyses. The
results will be unbiased as well as optimal, being based on scientifically
proven and tested technique(s). Each application is specific to a technique,
and the basic assumptions behind a technique must be fulfilled in
order to obtain optimal result(s) which are implementable.
Q. What are the pre-requisites?
Ans. The pre-requisites include domain and subject-area knowledge, exposure to
the software tools, the requisite databases/information, input details
and outputs, besides knowledge of the theoretical and practical aspects for
interpreting the output(s).
Q. What are the different areas of modelling and simulation techniques?
Ans. The major areas include pre-processing of large statistical information;
statistical model(s) and analyses; linear and non-linear optimization
including multi-objective criteria; econometric systems; time series
analyses (univariate and multivariate); CPM/PERT-based project management
including project simulation; neural-network-based forecasting systems;
and system dynamics. Each of the above areas is specific to a
situation, and the pre-requisites for each one have to be known before
developing a theoretical mathematical model as well as interpreting its results.
Q. Why different techniques in each of the modelling area?
Ans. Statistical analysis is very broad and includes a large number of techniques;
each technique has a specific application and is applied to a particular
situation. The theory of each technique to be applied to a
particular situation has to be known in advance. For example, regression is a
specific application in the statistical area and can be applied to build
a relationship model between dependent and independent variables. Likewise,
there are a number of techniques which fall into the area of statistical
analysis, and the situation is similar in other areas, such as linear and
non-linear optimization using operational research
techniques. In-depth theoretical knowledge is a must for the development
of a system based on modelling technique(s).
Q. What is Time Series & Forecasting Analysis?
Ans. If the data is available at homogeneous time intervals for a number of
periods, then a number of time series techniques can be applied. Depending upon
the behaviour of the past data, a time series model is built so as to
forecast the dependent variable a number of periods ahead. Time
series techniques, neural-network-based applications as well as
econometric systems broadly fall in the area of forecasting.
Each of the above areas has different requirements in terms of data
availability and time period, as well as the potential of a
particular technique to be applied to a situation.
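As a minimal sketch of the idea, simple exponential smoothing builds a one-period-ahead forecast from past data at homogeneous intervals; the monthly demand figures below are hypothetical:

```python
# Simple exponential smoothing: the smoothed level is
# level_t = alpha * y_t + (1 - alpha) * level_{t-1},
# and the final level serves as the next-period forecast.
def ses_forecast(series, alpha=0.5):
    level = series[0]              # initialize the level at the first observation
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level

demand = [120, 132, 125, 140, 138, 150]   # homogeneous (monthly) intervals
forecast = ses_forecast(demand)
```

With these numbers the smoothed level works out to 142.6875, which becomes the forecast for the next month; the choice of alpha controls how strongly recent observations dominate.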
Q. What is Statistical Analysis?
Ans. When there are a number of observations on a set of dependent and independent
variables and one wishes to establish a relationship in order to understand
the likely behaviour in the near future, different statistical techniques
can be applied to understand the relationship between the variables over time.
In short, to study the behaviour of the variables, the dependency between them
and their likely future behaviour, regression (linear or non-linear)
analysis is performed. A large number of statistical techniques are
available, such as factor analysis, multivariate analyses, canonical
analysis, design of experiments, analysis of variance (ANOVA), etc.
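For example, a least-squares regression line relating one dependent to one independent variable can be fitted as below; the data points are hypothetical:

```python
# Ordinary least squares fit of y = a + b*x: the slope is
# covariance(x, y) / variance(x), and the intercept follows from the means.
def ols(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

xs = [1, 2, 3, 4, 5]               # independent variable
ys = [2.1, 4.0, 6.2, 7.9, 10.1]    # dependent variable
a, b = ols(xs, ys)                 # fitted intercept and slope
```

For these points the fitted line is approximately y = 0.09 + 1.99x, which can then be used to predict y for new values of x.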
Q. Optimization model, pre-requisites?
Ans. Optimization falls in the area of operational research and involves
optimizing (maximizing or minimizing) an objective function subject to a
number of constraints; examples include resource planning, inventory control
systems, resource allocation and planning, decentralized planning, optimal
facilities location, transportation and distribution models, etc. It
optimizes the objective function based on the defined constraints of the
system. The model could be linear, non-linear or multi-objective.
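As an illustration, a tiny two-variable linear model (maximize 3x + 2y subject to x + y <= 4 and x <= 2, a toy resource-allocation problem invented here, not taken from the text) can be solved by checking the corner points of the feasible region; real problems would use a proper LP solver:

```python
# Enumerate intersections of constraint lines; the optimum of a linear
# objective over a bounded feasible polygon lies at one of its vertices.
from itertools import combinations

cons = [(1, 1, 4), (1, 0, 2), (-1, 0, 0), (0, -1, 0)]  # a*x + b*y <= c

def solve(objective, cons):
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue                       # parallel constraint lines
        x = (c1 * b2 - c2 * b1) / det      # Cramer's rule for the 2x2 system
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c + 1e-9 for a, b, c in cons):
            val = objective(x, y)
            if best is None or val > best[0]:
                best = (val, x, y)
    return best

best = solve(lambda x, y: 3 * x + 2 * y, cons)  # optimum 10 at x = 2, y = 2
```

Vertex enumeration is only practical for toy problems; the simplex method or an interior-point solver is the standard tool, but the principle (optimum at a vertex of the constraint set) is the same.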
Q. Project Management for monitoring & control?
Ans. Applicable during the planning and execution of a project using CPM/PERT
techniques. Defining the activities, knowing the relationships between the
activities and the duration of each activity are some of the pre-requisites.
Resource allocation in a dynamic environment, on-line monitoring of time and
cost over-runs, as well as the optimal resources required to complete an
activity, can all be handled. The likely completion with respect to time, as
well as the resources/cost required, can be monitored on-line. The status of a
project at any point of time can be monitored, and the likely slippage
in terms of time, cost and resource requirements can be obtained
on-line. These are powerful techniques for on-line monitoring and control
during the execution stage of a project. There are web-enabled project
management software packages for monitoring at different levels during
execution. This is one of the potential areas for on-line monitoring and
control of the large number of projects being executed by the states, the
centre and PSUs in the country.
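The CPM forward and backward passes on a small activity network can be sketched as follows; the activity names, durations and precedence links are illustrative only:

```python
# Hypothetical activity network: name -> (duration, [predecessors]).
activities = {
    "A": (3, []),
    "B": (2, ["A"]),
    "C": (4, ["A"]),
    "D": (1, ["B", "C"]),
}

def cpm(acts):
    """Return (project duration, critical activities) via CPM."""
    order, seen = [], set()
    def visit(n):                        # depth-first topological sort
        if n in seen:
            return
        seen.add(n)
        for p in acts[n][1]:
            visit(p)
        order.append(n)
    for n in acts:
        visit(n)

    es, ef = {}, {}                      # forward pass: earliest start/finish
    for n in order:
        es[n] = max((ef[p] for p in acts[n][1]), default=0)
        ef[n] = es[n] + acts[n][0]
    duration = max(ef.values())

    ls, lf = {}, {}                      # backward pass: latest start/finish
    for n in reversed(order):
        succs = [m for m in acts if n in acts[m][1]]
        lf[n] = min((ls[m] for m in succs), default=duration)
        ls[n] = lf[n] - acts[n][0]
    # activities with zero slack form the critical path
    critical = [n for n in order if es[n] == ls[n]]
    return duration, critical

duration, critical = cpm(activities)
```

Here the project takes 8 time units and the critical path is A, C, D: any slippage on those activities delays the whole project, which is exactly what an on-line monitoring system watches for.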
Q. Large national/state level Survey Design & Analysis?
Ans. The availability of NICNET across the country makes it a powerful tool for
undertaking on-line survey design as well as on-line analysis. Web-enabled
surveys could be planned using decentralized collection and validation, and
the reports could be generated on-line. The time as well as the
resource requirement for undertaking an on-line survey could be reduced
significantly by using state-of-the-art survey design and analysis
software tools, besides minimizing time and cost over-runs, along with
ease of implementation.
Q. Different software tools for analyses?
Ans. There are a number of modelling areas, and in each of these areas there
are a number of techniques (applicable to the particular type of analysis
required), besides a large number of software tools available for each of
these techniques, depending upon the complexity and the intended use. One has
to select a specific software tool depending upon the local requirement.
The tools could be classified into high, medium and low range in terms of
cost as well as the facilities available. If the requirement for analysis
is one-time, then the application development is different from that for an
on-line analysis/long-term requirement.
Q. Software tool(s) & Application development?
Ans. Software tools are generic, whereas application development is specific to an
application. For example, in the RDBMS area, Oracle or DB2 is used for specific
application development. Similarly, an SPSS-based module is used for
developing a specific application/analysis using multivariate techniques.
Q. Software & Application development Cost?
Ans. Most of the software in the area of modelling and simulation is costly, with or
without an annual renewal fee. The development cost is also high, mainly
because it is to be used by a specialized division and the requirement is
very small. As in the CAD and GIS areas, the number of users is small
compared to RDBMS users, and this involves high cost in terms of the
software tools and technical expertise.
Q. Training pre-requisites for modelling and application development?
Ans. In-depth subject knowledge is a must, besides an aptitude in the area.
Econometric models can be developed faster and much more efficiently if a
person has basic knowledge of econometric techniques.
Similarly, optimization models require basic to in-depth
knowledge of optimization techniques/theory and their applications in
Operational Research, as well as in-depth knowledge of model
building, testing and interpretation of the final results.
Q. What is special about Neural Network based Forecasting application development?
Ans. It is the latest in the field for undertaking forecasting applications, and
the domain knowledge requirement is very limited: the system automatically
develops a class of models, tests them and finally selects the final forecast
model. In-depth knowledge of statistical and econometric model
building is not required, as these aspects have been incorporated in the
neural-network-based systems. Though, for the interpretation of the output,
a theoretical background is required.
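A toy illustration of the underlying idea (not any particular package): a single linear "neuron" trained by gradient descent learns to forecast the next value of a hypothetical series without an explicit statistical model being specified:

```python
# One linear neuron: prediction = w * previous_value. Gradient descent
# on squared error learns w directly from the (input, target) pairs.
series = [1, 2, 4, 8, 16]               # toy series: each value doubles
pairs = list(zip(series, series[1:]))   # (previous value, next value)

w, lr = 0.0, 0.005                      # weight and learning rate
for _ in range(200):
    grad = sum(2 * x * (w * x - y) for x, y in pairs)
    w -= lr * grad                      # batch gradient descent step

forecast = w * series[-1]               # one-step-ahead forecast
```

The weight converges to 2 (the doubling pattern) and the forecast after 16 is 32; real neural forecasting systems fit many such weights over several lagged inputs, but the training loop is the same in spirit.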
Q. Minimum number of observations for modelling & analyses?
Ans. A minimum number of observations is required for a model, depending upon the
area and the corresponding technique to be applied, as the
theoretical development is based on a minimum number of observations.
In the case of statistical model building, the whole theory is developed on
the basis of the law of large numbers, and the minimum number of observations
required is 30. Similarly, in optimization model building, the number of
constraints is normally more than the number of variables in the objective
function. In multi-variable regression analyses one can do the analysis based
on a set of twenty observations also, but it is statistically not correct;
there have to be at least thirty or more. Though one will get an output
based on 20 or fewer observations, it will be misleading, as it will not follow
the law of large numbers, the basic requirement for building a regression model.
Q. Homogeneous & heterogeneous Data?
Ans. These basically concern the time intervals as well as the observations for
all the variables. In time series analyses the time intervals should be
homogeneous; otherwise time series techniques cannot be applied. If one has
one dependent variable and 20 independent variables, observations
should be available for all the variables (dependent and independent).
In case there are some missing values, a treatment has to be applied for
the missing values; otherwise the corresponding statistical model
will not be correct. The available techniques require homogeneous data
only; in very few cases heterogeneous data could be used for analysis.
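One common and simple treatment for missing values is mean imputation, sketched below on a hypothetical series (real projects may call for more careful methods such as interpolation or model-based imputation):

```python
# Replace each missing observation (marked None) with the mean of the
# observations that are present, so downstream techniques see a full series.
def impute_mean(values):
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

filled = impute_mean([2.0, 4.0, None, 6.0])   # the gap becomes 4.0
```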
Q. Different software tools for modelling application development?
Ans. There are a number of software tools in the various areas of modelling and
simulation, depending upon short-term and long-term requirements and the
associated cost. One has to select a software tool with an in-depth
understanding of the capabilities as well as the limitations of the finally
selected software. The potential and limitations of the selected software
should be known well in advance before undertaking the application
development. There are a large number of software tools without technical
support, including training; this has to be kept in mind before finalizing
a software tool.
Q. Why is the basic cost of software tools so high?
Ans. These are not generic tools; they are required by specialists for specific
applications, which are limited in number compared to generalized applications
such as RDBMS. The development cost is high and the usage is limited, and
limited usage makes the cost high, including support and training.
The scenario is changing fast even in the developing world: RDBMS are in
place, and organizations are now looking for scientific-process-based
decisions using quantitative, qualitative, management, simulation, data
warehousing, OLAP and data mining tools. The basic cost and
availability of the above tools are improving with time.
Q. Pre-requisites for Network based Computer Aided Project Management?
Ans. A CPM/PERT-based project management and monitoring system should support
web-enabled functions so as to monitor and control on-line, for reducing time
over-runs and cost over-runs. Moreover, as large projects are multi-locational
and are executed by multiple agencies, a network-based project management
system is a must. A NICNET-based project monitoring system is the only
solution for executing and monitoring, whether it is an on-line survey for a
state or the whole country, or the monitoring of a state-level power project
for minimizing the time and cost over-runs.
Q. Strength of NIC for modelling Application development & support?
Ans. The basic data in almost all sectors of the economy generated by
government agencies (central, state or district) is available on the NIC
network. Most of the sectoral RDBMS are in place, and network availability
across the country is fully functional and stable, the basic requirement for
monitoring and control of government (centre & state) projects.
There is a fool-proof and tested mechanism to regularly update the
various databases in almost all sectors of the economy. On-line analyses
through the RDBMS systems are in place, in use and stable.
There is a regular mechanism to download and upload data at various levels
of integration; the data can be integrated at any level, within a state
through the districts or at the centre through the various states/regions.
Centralized and decentralized processing could be undertaken at any
point of time, and the analyses could be rolled up to the desired level of
usage. There are specific divisions responsible for developing
applications and giving support to the centre and the states. A
data warehouse can be developed at any desired level for any government
department for on-line analyses using data, text and graphical information,
besides undertaking in-depth OLAP & data mining applications.
Q. What value does business intelligence provide my organization?
Ans. Business intelligence (BI) provides timely and accurate information to
better understand your organization and to make more informed, real-time
decisions. Full utilization of BI solutions can optimize organizational
processes and resources, improve decision making and maximize
profits/minimize costs.
Q. What real-world advantages will a data warehouse provide?
Ans. Some advantages of a well-constructed data warehouse include better data
quality, accurate trend analysis, easier access to data and easier decision
making.
The next logical step, once the databases on heterogeneous platforms and
heterogeneous RDBMS are developed, is that data analysis can be performed
on-line and in depth, from simple to complex, for better decision making.
Q. What is the single most important objective in building a data warehouse?
Ans. The most difficult task in initially creating, and then maintaining, a
data warehouse is ensuring the validity of the information stored within the
database itself. Collecting and cleansing data from many systems is not an
easy task, and a data warehouse project usually fails because the data cannot
be validated.
Q. How much is the average cost for building a data warehouse?
Ans. It is very difficult to define an average cost for a data warehouse, as it
depends on a number of factors, such as:
- size of the data warehouse
- number of users
- source data: number of source files, complexity, cleanliness, documentation, etc.
- contractors and consultants used
- knowledge and capabilities of the team
- what's in place today
- how easy you will make it for the users
Q. We want to implement a DW. Which tools and database should we select?
Ans. Before you even consider tools and an RDBMS, you need to figure out what
you are doing and why you are doing it. What are your requirements? What type
of DW architecture are you building? Who is your user community and what are
their analysis patterns and needs? Only when this is established and
understood can you objectively analyze the tools and systems best suited to
the task you have identified. You can't evaluate the appropriate
solution unless you clearly see the problem that it's meant to solve.
Features and functions are meaningless unless they directly apply to your
needs.
Q. What is the purpose of using OLAP (On-Line Analytical Processing)?
Ans. OLAP servers organize, calculate and associate data so that intelligent
relationships and deductions can be established. OLAP servers also let users
model scenarios for predicting outcomes and determining "what-if" situations.
Specifically, OLAP functionality is characterized by dynamic multidimensional
analysis of consolidated organization data. OLAP servers also support
end-user analytical and navigational activities, including:
- analysis over sequential time periods
- slicing data for on-screen viewing
- drilling down the data to deeper levels of consolidation
Q. What are the major differences between MOLAP (multidimensional OLAP) and
ROLAP (relational OLAP)?
Ans. MOLAP tools utilize a pre-calculated data set, commonly referred to as a
data cube, and so MOLAP systems give fast responses; they are generally used
for a bounded problem set. ROLAP tools are best suited to users who have an
"unbounded" problem set: they do not use pre-calculated data cubes but query
the standard relational database and its tables in order to answer the
question, and so ROLAP systems are comparatively slow. Hybrid OLAP systems
can use the capabilities of both pre-calculated cubes and relational data
sources.
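The contrast can be sketched in a few lines: a MOLAP-style cube is pre-calculated once so that queries become simple lookups, while a ROLAP-style query scans the base table each time (the sales records below are toy data, illustrative only):

```python
# Toy base table of sales records: (region, product, amount).
records = [
    ("North", "TV", 100), ("North", "Radio", 40),
    ("South", "TV", 80), ("South", "TV", 20), ("South", "Radio", 60),
]

# MOLAP style: pre-calculate the cube once; each query is a dict lookup.
cube = {}
for region, product, amount in records:
    cube[(region, product)] = cube.get((region, product), 0) + amount

# ROLAP style: no pre-calculation; each query re-scans the base table.
def rolap_total(region, product):
    return sum(a for r, p, a in records if r == region and p == product)
```

Both paths answer the same question, e.g. total TV sales in the South are 100 either way; the trade-off is pre-computation time and cube storage (MOLAP) versus per-query scan cost (ROLAP).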
Q. What is the difference between OLAP and data mining?
Ans. OLAP is a technical concept where, most often, a reporting tool is
configured to understand your database schema and composition (facts [values
and attributes] and dimensions [levels and descriptors]). By simple
point-and-click, a user can run any number of user-designed
reports without having to know anything of SQL or the schema.
Data mining is used for detailed, often algorithmic analyses; mining
may actually produce "models" that can then be analyzed. Data
mining is the "automatic" extraction of patterns of information
from historical data, enabling organizations to focus on the most important
aspects of their business, telling them what they did not know and had
not even thought of asking.
Q. What is the major difference between data reporting and data mining?
Ans. There are a lot of software companies and experts claiming to do "data
mining" when in fact they are performing "data reporting". Data
mining starts with data integration and cleansing, and ends with predictions
and their accuracy; in addition, it ranks the predictive parameters and tells
users how well they account for the observed variability. The
problem with data mining is that it cannot be done well without:
a) subject knowledge,
b) real data mining software(s), and
c) developing an application (DSS) for an end user.
Q. What are the requirements for data mining?
Ans. The basic requirements to obtain simple to complex analyses from data
mining tools are:
- databases (mainly historical) and a mechanism for regular on-line updation
of data on a long-term basis
- subject as well as expert domain knowledge, besides the usage of
state-of-the-art data mining tools
Evaluation criteria to select a data mining tool should include:
a) Ease of use: it should be easy to use for an expert. Data mining
software should not be an MS Office look-alike where people learning the
job can intuitively play with it and get results; if they do, they will
come to you with great graphs that point you in bizarre directions.
b) Speed: three weeks to build a model is not something that can be
implemented in today's business, at least not in data mining. For any type
of data mining project that will actually tell you something real, you may
very well take a couple of months just to prepare the data.
c) Integration: all high-end software packages can integrate on all
platforms and file systems; all cheaper software packages say they do.
d) Robustness: robustness should be a function of the model. You
need to develop an appropriate model, and that is where most data mining
projects succeed or fail.
Q. What are the principal differences between Data Mining and statistical
analysis of data?
Ans. There is no real difference between Data Mining and statistical analysis
of data. Data Mining uses some of the statistical tools for analysis of the
data, while more recently adding methods resulting from heuristic techniques
and information theory. Statistical analysis itself is a broad term;
moreover, most statistical tools are off-line, while data mining can be
on-line.
Q. What is the relationship between Data Mining and Data Warehouses/Data Marts?
Ans. Data mining can use data from a data warehouse as well as from a data
mart for the purpose of analysis. Data warehousing is the process of
integrating enterprise-wide corporate data into a single repository, whereas
a "data mart" is a department- or function-oriented data warehouse: a
scaled-down version of a data warehouse that focuses on the local needs of a
specific department like finance or purchase. An organization may have many
data marts, each focused on a subset of a distinct organizational activity.
Q. What are the objectives of Text Mining?
Ans. The objectives of Text Mining can be varied:
- to quantify a text, or parts of a text, in order to extract its strongest
meaning structures
- to establish links between the terms and the documents
- to analyze the documents by associating their qualitative and quantitative
structured information
- to lay down rules for the automatic classification of documents
Q. How are data warehousing and data mining useful in government departments?
Ans. Most government departments have scattered databases which exist on
different platforms and operating systems. To set up coordination between
them and remove duplication, there is a strong need to build a data warehouse
which can serve the organization as a whole and make it easy for
professionals at higher levels to take decisions.
Q. How can National Informatics Centre help build Decision Support Systems
around DW and DM tools for government departments?
Ans. Since NIC maintains big databases for most government departments, it
will be easy to develop a data warehouse/mart, which otherwise is a very
time-consuming, costly and cumbersome process. Since NIC already has a
built-up infrastructure with network facilities at all state and district
levels, on-line transmission of data from one place to another can easily be
carried out to fulfill the major requirement of regularly updating the
databases.