 
The power of our analysis

Data Entry software
MicroStat Analytics Data Entry offers all functional possibilities for data entry…

Neural networks
The universality of the Neural Networks is one of their most precious features…

Logistic regressions
Regression models with a categorical dependent variable…

Six Sigma
A statistical approach for quality control, leading to a production close to perfection…

Time Series Analysis and Forecasting
Strategic and tactical forecasts…

 

 


Data Entry Software

 

 

Using dedicated data entry software for research data drastically increases the reliability of the entered data. For each questionnaire, MicroStat Analytics develops a separate data entry application based on MicroStat Analytics Data Entry.
Characteristics of MicroStat Analytics Data Entry:

  • MicroStat Analytics Data Entry is a complete software package for Windows that offers all the functionality needed for the data entry process: easy installation and uninstallation, a working environment for data entry with a card model of the questionnaire developed specifically for the particular survey, built-in controls for validation of the answers, logical and arithmetic checks, compulsory answers, checks for uniqueness of identifiers, the ability to generate a data file for sending and subsequent merging, etc.

  • MicroStat Analytics Data Entry is based on a free platform. This design decision keeps the price of the service affordable, because no license fees are added to it and the customer does not need to buy licenses for all of the operators' workstations.

  • MicroStat Analytics Data Entry takes up very little disk space. In practice it fits on a diskette. This allows the installation package to be distributed easily to the operators' workstations, regardless of where they work, in the office or at home.

  • MicroStat Analytics Data Entry is distributed to the operators together with guidelines for working with the software and instructions for correct data entry of every particular card.

  • MicroStat Analytics Data Entry allows operators to enter data comfortably using the numeric keypad, in contrast to other products with a similar function, which impose working with drop-down menus.

  • MicroStat Analytics provides free training for operators in working with MicroStat Analytics Data Entry.
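
The validation controls listed above (compulsory answers, allowed codes, arithmetic checks, uniqueness of identifiers) can be illustrated with a minimal sketch. It is not the actual MicroStat Analytics Data Entry code; the field names, the codes in `ALLOWED_CODES` and the function `validate_record` are hypothetical and chosen only for illustration.

```python
# Hypothetical questionnaire definition: field name -> allowed answer codes.
ALLOWED_CODES = {
    "q1_gender": (1, 2),
    "q2_employed": (1, 2),
}


def validate_record(record, seen_ids):
    """Return a list of validation errors for one questionnaire record."""
    errors = []

    # Uniqueness of identifiers.
    rec_id = record.get("id")
    if rec_id is None:
        errors.append("missing identifier")
    elif rec_id in seen_ids:
        errors.append(f"duplicate identifier: {rec_id}")

    # Compulsory answers and validation of the answer codes.
    for field, allowed in ALLOWED_CODES.items():
        value = record.get(field)
        if value is None:
            errors.append(f"missing compulsory answer: {field}")
        elif value not in allowed:
            errors.append(f"invalid code for {field}: {value}")

    # Arithmetic control: the parts must add up to the reported total.
    parts = (record.get("adults"), record.get("children"))
    total = record.get("household_size")
    if None not in parts and total is not None and sum(parts) != total:
        errors.append("household size does not equal adults + children")

    return errors


ok_record = {"id": 1, "q1_gender": 2, "q2_employed": 1,
             "adults": 2, "children": 1, "household_size": 3}
bad_record = dict(ok_record, q1_gender=5, household_size=4)
```

An operator-facing program would run such checks as each card is entered and refuse to save a record while the returned list is not empty.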

 


Neural networks

 

 

The first attempt to create and study artificial neural networks is considered to be the work of McCulloch and Pitts in 1943, in which they formulated the basic principles for constructing artificial neurons and neural networks. A major breakthrough in this field was made by Rosenblatt in 1962, who created a model of a single-layer neural network called the perceptron. It was used for a variety of tasks such as weather forecasting, electrocardiogram analysis and artificial vision. Later, Minsky and Papert proved in a strictly mathematical way that this kind of neural network is not able to solve some very simple tasks, such as ‘XOR’. The rigor of their proofs became the reason for a standstill in the development of neural networks. Only a few persistent scientists, such as Kohonen and Anderson, continued research in this direction. As became clear later, Minsky was too pessimistic in his conclusions about the future of neural networks, because the tasks he described as unsolvable are solved today by standard neural network models.


Today neural networks are used in various fields, from time series analysis to robot control and the study of rare diseases. Their universality is one of their most valuable features. Neural networks have also been shown to perform well in cases where traditional statistical models fail to give a result.


The literature does not yet offer a commonly adopted definition of ‘neural networks’ (NN). Terminological precision requires the full name, ‘artificial neural networks’ (ANN), but since biological neural networks are not examined here, only the short name will be used. The difficulty in defining the concept of neural networks originates from the interdisciplinary character of the subject. Besides, a great variety of algorithms is united under the name ‘neural networks’. The ways of implementing these algorithms are also conceptually different: in software and in hardware. Last but not least, the various fields of application of neural networks mean that experts from different fields attach a different meaning to the concept. Nevertheless, without claiming absolute precision, the following working definition of neural networks can be given.


Neural networks is the common name of several groups of algorithms that are able to train themselves from examples by extracting the relationships hidden in the data.

In order to understand the nature of neural networks, it is appropriate to begin with a description of the biological analogue that served as the basis for their creation. The human nervous system is composed of elements called neurons. Their number is about $10^{11}$. They have many unique features, but the most important is the ability to receive, process and transmit electrochemical signals along the neural pathways, thereby forming the communication system of the brain. The number of connections between the neurons is also huge, about $10^{15}$.

 

Biological Neuron

The figure shows the structure of two biological neurons. Dendrites extend from the cell body and connect the cell to other neurons. The points of junction are called synapses. The received input signals travel to the body of the neuron. There they are summed, with some of the signals tending to excite the neuron and others tending to prevent its excitation. If the total excitation of the neuron is larger than a given threshold, it sends a signal through the axon to other neurons. This scheme has, of course, exceptions and complications, but the majority of artificial neurons model exactly these simple properties of biological neurons.

Artificial neural networks are composed of simple elements, neurons, connected with each other through a huge number of connections. The artificial neuron imitates the properties of the real one. A set of input signals arrives at its input, each of which is the output signal of another neuron. Each input signal is multiplied by a corresponding weight (analogous to the strength of the synapse), and the products are added together. The sum is the so-called activation level of the neuron, which determines its subsequent behavior.

 

The simplest model of an artificial neuron was proposed by McCulloch and Pitts. A diagram of such a neuron is shown in the next figure.

Artificial Neuron

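The input signals $x_1, \dots, x_n$ are multiplied by the corresponding weights $w_1, \dots, w_n$ and are summed in the core. The sum is then compared with a threshold value $\theta$. The output signal $y$ is determined in the following way:

$$y = f\left(\sum_{i=1}^{n} w_i x_i - \theta\right)$$

Here $x_i$ are the input signals, $w_i$ are the weights and $\theta$ is the threshold.

The function $f$ is called the activation function. In the model of McCulloch and Pitts it is discrete:

$$f(s) = \begin{cases} 1, & s \ge 0 \\ 0, & \text{otherwise} \end{cases}$$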

A positive weight has an excitatory effect on the synapse, and a negative weight an inhibitory one. If the weight is zero, there is no connection between the neurons.


Because a discrete function of this type has its disadvantages, the following functions are most commonly used as activation functions; they ‘squash’ the output signal into a bounded range:


Logistic function, varying in the range between 0 and 1:

$$f(s) = \frac{1}{1 + e^{-s}}$$

Hyperbolic tangent, varying in the range between -1 and 1:

$$f(s) = \tanh(s) = \frac{e^{s} - e^{-s}}{e^{s} + e^{-s}}$$

The set of artificial neurons, arranged in layers, together with the connections between them, forms the architecture of the artificial neural network. The layers can be input, hidden (one or more) and output.
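
The artificial neuron and the activation functions described above can be sketched in a few lines of Python. The function names and the example weights are illustrative assumptions; the threshold neuron is configured here to reproduce the logical AND of two binary inputs.

```python
import math


def step(s):
    """Discrete (threshold) activation of the McCulloch and Pitts model."""
    return 1.0 if s >= 0 else 0.0


def logistic(s):
    """Logistic activation, bounded between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-s))


def neuron(inputs, weights, threshold, activation=step):
    """Weighted sum of the inputs minus the threshold, passed through the activation."""
    s = sum(w * x for w, x in zip(weights, inputs)) - threshold
    return activation(s)


# A single threshold neuron with weights (1, 1) and threshold 1.5 computes AND.
and_outputs = [neuron([a, b], [1.0, 1.0], 1.5) for a in (0, 1) for b in (0, 1)]

# The same neuron with smooth activations (math.tanh is bounded between -1 and 1).
smooth_logistic = neuron([1, 1], [1.0, 1.0], 1.5, activation=logistic)
smooth_tanh = neuron([1, 1], [1.0, 1.0], 1.5, activation=math.tanh)
```

No choice of weights and threshold makes this single neuron compute XOR, which is the limitation of one-layer networks mentioned earlier.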


The great variety of algorithms united under the name ‘neural networks’ also leads to a variety of classifications of their types. The most significant classification is by the way in which they learn. Learning is a process in which the dataset is passed to the network repeatedly, in steps called epochs. There are two types of learning: ‘Supervised Learning’ and ‘Unsupervised Learning’.


In the case of ‘Supervised Learning’ the network is given a set of input variables together with the corresponding output variable.


In the case of ‘Unsupervised Learning’ the network does not need an output variable. The process follows the properties of the input data and groups the similar objects in clusters.

At present a great variety of neural network architectures exists: the multilayer perceptron (MLP), probabilistic neural networks (PNN), radial basis function networks (RBF), the associative Hopfield network, the Elman model, self-organizing Kohonen maps (SOM), also called self-organizing feature maps (SOFM), etc.

 

 

Multilayer perceptron

The figure shows an example scheme of a neural network with one hidden layer. This architecture is probably the most widely used. There can also be more than one hidden layer, but in practice networks with more than two or three hidden layers are not applied. The reason is that ‘theoretically, a multilayer perceptron with two hidden layers is enough to model an arbitrary task (in its exact formulation this result is known as the Kolmogorov theorem)’.

Multilayer Perceptron with one hidden layer

This type of network is called a Multilayer Perceptron (MLP). It was proposed by Rumelhart and McClelland and is discussed in detail in all works on neural networks. Each element of the input layer is connected through a defined weight with each element of the next layer (in this case the hidden layer). In a similar way each element of the hidden layer is connected with each element of the output layer.

In order for the network to start working, it must be trained. The aim is to find values of the weights and thresholds that minimize the aggregate error of the network. This is done by passing the whole set of real data through the network and comparing the real values with the ones forecast by the network. All such differences are accumulated, and the resulting value is the error. Most often this is the mean squared error, in which case the following target function is minimized:

$$E = \frac{1}{P}\sum_{p=1}^{P}\sum_{j}\left(y_{j}^{(p)} - d_{j}^{(p)}\right)^{2}$$

where:

$y_{j}^{(p)}$ is the value of neuron $j$ of the output layer, calculated by the network for example $p$;

$d_{j}^{(p)}$ is the real value of this neuron on the basis of the input data;

$P$ is the number of examples.

The dataset is passed to the network over a defined number of epochs. At each successive epoch the network reduces its error until a predetermined stopping criterion is met, for example an error size, a rate of decrease of the error, a number of epochs, etc.

Unlike linear methods, in which the minimum of the function is found analytically, this is impossible for neural networks. The minimum is sought by iterative algorithms that ‘travel over’ the so-called error surface, a multidimensional surface with a complex relief. During this iterative process, however, there is a risk that the minimum found by the algorithm is only local and not global, which means that the best solution has not been found. This is the price paid for the nonlinear modeling capabilities of neural networks. In order to decrease the probability of ending in a local minimum, the network is trained many times and the results are compared.

Different algorithms are used for training the networks. The most popular of them is back propagation. There are also numerous other algorithms, such as the gradient method, the Levenberg-Marquardt algorithm, etc.
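
The iterative training described above can be illustrated with a minimal sketch of a multilayer perceptron with one hidden layer, trained by back propagation with plain gradient descent on the ‘XOR’ task. The network size, learning rate, number of epochs and random seed are assumptions made for the illustration, not recommended settings, and NumPy is assumed to be available.

```python
import numpy as np


def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))


def train_mlp(X, D, hidden=4, epochs=5000, lr=0.5, seed=0):
    """Train a one-hidden-layer perceptron; return the weights and the error per epoch."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 1.0, (X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 1.0, (hidden, D.shape[1]))
    b2 = np.zeros(D.shape[1])
    losses = []
    for _ in range(epochs):
        # Forward pass: the whole dataset goes through the network.
        H = sigmoid(X @ W1 + b1)
        Y = sigmoid(H @ W2 + b2)
        err = Y - D
        losses.append(float(np.mean(err ** 2)))  # mean squared error
        # Backward pass: propagate the error from the output to the hidden layer.
        # The constant factor of the MSE gradient is absorbed into the learning rate.
        dY = err * Y * (1.0 - Y)
        dH = (dY @ W2.T) * H * (1.0 - H)
        W2 -= lr * (H.T @ dY)
        b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), losses


X_xor = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
D_xor = np.array([[0.0], [1.0], [1.0], [0.0]])
weights, losses = train_mlp(X_xor, D_xor)
```

Because the algorithm may stop in a local minimum, a run with a different `seed` can end with a different final error; comparing several runs is the repeated training mentioned above.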

One of the biggest difficulties in training neural networks is choosing the parameters in such a way that the network is able to generalize later on. That is, the network should be able to process new, previously unseen data and produce a correct result. If this requirement is not taken into account during training, the network learns to fit the training data exactly but loses the ability to generalize. This problem is known as over-learning (overfitting).

An example of this problem can also be found in regression analysis. It is known that when the data are approximated by a polynomial function, the higher its degree, the higher the coefficient of determination. This, however, does not necessarily mean that the polynomial of higher degree should be preferred.
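
The polynomial example can be checked numerically. The sketch below generates synthetic data from a straight line with noise (an assumption made only for the illustration) and fits polynomials of increasing degree with NumPy; the coefficient of determination on the fitted data never decreases as the degree grows, although the true relationship is linear.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination of fitted values y_hat for observations y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 30)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.3, x.size)  # truly linear data plus noise

r2_by_degree = {}
for degree in range(1, 7):
    coeffs = np.polyfit(x, y, degree)
    r2_by_degree[degree] = r_squared(y, np.polyval(coeffs, x))
```

The higher-degree fits gain their extra accuracy by following the noise, which is exactly the loss of generalization described above.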

To avoid over-learning, so-called cross-validation is performed. A part of the input data is set aside as a kind of ‘independent control’ of the results. This dataset is called the ‘validation dataset’, and the dataset used during training is called the ‘training dataset’. At the beginning of training the errors on the training and validation datasets are approximately the same. During training, if the error on the validation dataset decreases together with the error on the training dataset, the network needs more training. If the error on the validation dataset stops decreasing or even starts increasing, training should be terminated, because the network has started to over-learn.

In practice, the search for a model and for the most suitable settings takes a lot of experimentation. This has an unpleasant consequence: the validation dataset starts to play a role in making the choice. It becomes a part of the training process and can no longer be used for ‘independent control’. That is why a third subset must be selected from the data, the so-called ‘test dataset’. It is used only once, at the end of training, to check the results.
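
The stopping rule described above can be sketched as a small helper that inspects the errors measured, epoch by epoch, on the held-out control dataset. The function name and the `patience` parameter (how many epochs without improvement are tolerated) are assumptions introduced for the illustration.

```python
def early_stopping_epoch(control_errors, patience=3):
    """Return the epoch with the lowest control error seen before training is stopped."""
    best_error = float("inf")
    best_epoch = 0
    for epoch, error in enumerate(control_errors):
        if error < best_error:
            # The control error still decreases: the network needs more training.
            best_error, best_epoch = error, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: the network has started to over-learn.
            break
    return best_epoch


errors_per_epoch = [0.90, 0.70, 0.50, 0.45, 0.47, 0.50, 0.56, 0.60]
stop_at = early_stopping_epoch(errors_per_epoch)
```

The weights saved at the returned epoch would then be evaluated once on the final, untouched dataset.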

 

Self Organizing Kohonen Maps

The idea of Kohonen networks (Self Organizing Maps, SOM, or Self Organizing Feature Maps, SOFM) also originated by analogy with some features of the human brain. The cerebral cortex is a large flat sheet (with an area of about 0.5 m², strongly folded in order to fit in the human skull) with topological properties. For example, the section responsible for the wrist is situated near the section responsible for the movement of the whole hand. In this way the image of the human body is mapped onto the two-dimensional surface of the cerebral cortex.

Kohonen maps are among the most popular kinds of neural networks. They are intended to identify clusters of similar data and also to determine how close the clusters are to one another. They work on the principle of ‘unsupervised learning’ and carry out a process of clustering. Only input data is passed to the network, and no output information is given in advance.

The algorithm behind Kohonen maps is a variant of clustering of multidimensional vectors. With its help a mapping is achieved from a higher-dimensional input space (determined by the number of indicators) to a lower-dimensional one (usually two-dimensional, but possibly one-dimensional) that preserves topological similarity. This means that all vectors that are adjacent on the topological map are also adjacent in the input space. It should be noted that the opposite is not always true.

The Kohonen network is trained by the method of successive approximations (Kaski). Each neuron on the topological map is an $n$-dimensional vector, where $n$ is the dimension of the input space (the number of indicators). The number of neurons on the topological map determines the level of detail of the results of the algorithm. Their initial positions are chosen randomly.

Kohonen Map

Beginning with these randomly placed cluster centers, the algorithm gradually improves their positions so as to capture the clustering of the input data (the objects in the input space are represented as dots). As a result of the iterative learning procedure the map organizes itself in such a way that the elements corresponding to centers situated near one another in the input space are also situated close to one another on the topological map (the output layer).

The algorithm is known as ‘the winner takes all’ and consists of the following:

1. The neuron-winner is chosen (the one situated closest to the input example, or object). In practice, learning in the Kohonen map is a correction of the positions of the vector-neurons on the topological map. At every step of the learning (the term used with neural networks is epoch) one of the vectors is chosen at random from the input dataset, and then the vector nearest to it among the neurons on the topological map is sought. In this way the neuron-winner, the one that most resembles the input vector, is chosen. Resemblance here means the distance between the vectors (usually Euclidean). The formula is:

$$\lVert x - w_c \rVert = \min_i \lVert x - w_i \rVert$$

where:

$x$ is the input vector;

$w_i$ is the vector of weights of neuron $i$ (of the output layer), and $c$ is the index of the neuron-winner.

2. The neuron-winner is corrected in such a way that it resembles the input example more closely (for this purpose the weighted sum of the previous center of the neuron and the input example is calculated). In doing so, the vector describing the neuron-winner and the vectors describing its neighbors on the topological map move in the direction of the input vector.

In this process of correction of the weights, the following formula is used:

$$w_i(t+1) = w_i(t) + h_{ci}(t)\,\bigl[x(t) - w_i(t)\bigr]$$

where $t$ is the number of the epoch (discrete time), $w_i$ is the weight vector of neuron $i$ and $c$ is the index of the neuron-winner. The vector $x(t)$ is chosen randomly from the input set of vectors at epoch $t$. The function $h_{ci}(t)$ is called the neighborhood function. It is a non-increasing function of time and of the distance between the neuron-winner and its neighbors.

In the training of the Kohonen network the concept of ‘neighborhood’ is used. The neighborhood is the set of neurons surrounding the neuron-winner. Its size decreases with time, and at the end it becomes equal to zero, that is, it consists only of the neuron-winner itself. As a result of this procedure, at first large sections of the network are attracted to the input examples (input objects), and later progressively smaller ones. In this way observations that are similar to one another activate groups of neurons situated close together on the topological map. The process is repeated for the chosen number of epochs.
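
The two steps above (choosing the neuron-winner and correcting it together with its neighbors) can be sketched with NumPy as follows. The Gaussian neighborhood function, the linear decay of the learning rate and of the neighborhood radius, the map size and the synthetic two-cluster data are all assumptions made for the illustration; other choices are equally possible.

```python
import numpy as np


def train_som(data, rows=5, cols=5, epochs=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a rows x cols Kohonen map; return weights of shape (rows, cols, n)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))  # random initial positions
    # Coordinates of every neuron on the topological map.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for t in range(epochs):
        x = data[rng.integers(len(data))]  # randomly chosen input vector
        # Step 1: the neuron-winner is the one closest to x (Euclidean distance).
        dist = np.linalg.norm(weights - x, axis=2)
        winner = np.unravel_index(np.argmin(dist), dist.shape)
        # Step 2: move the winner and its neighbors towards x.
        frac = t / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays with time
        sigma = sigma0 * (1.0 - frac) + 1e-3     # neighborhood shrinks with time
        grid_d2 = np.sum((grid - np.array(winner)) ** 2, axis=2)
        h = lr * np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += h[..., None] * (x - weights)
    return weights


def quantization_error(data, weights):
    """Mean distance from each input vector to its nearest neuron."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    return float(d.min(axis=1).mean())


data_rng = np.random.default_rng(42)
data = np.vstack([data_rng.normal(0.2, 0.05, (100, 3)),
                  data_rng.normal(0.8, 0.05, (100, 3))])
untrained = np.random.default_rng(0).random((5, 5, 3))  # same start as train_som(seed=0)
trained = train_som(data)
```

Comparing `quantization_error(data, untrained)` with `quantization_error(data, trained)` shows how far the neurons have moved towards the input objects.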

After the Kohonen network is trained, the so-called ‘Unified Distance Matrix’ (U-Matrix) is used to recognize the clusters. For this purpose the distance (usually Euclidean) from each neuron to its neighbors on the topological map is calculated. This distance determines the color in which the neuron is shown on the map. Small distances indicate that neighboring neurons are similar, and large ones that they differ. The coloring is done by analogy with altitude maps: small values are colored green and high ones brown. The clusters on the map should therefore form green areas, surrounded by beige, brown and red areas that mark the boundaries of the clusters. Another option is black and white coloring. In this option white corresponds to small distances and black to large ones, so the clusters are colored white and the boundaries black.

Unified Distance Matrix
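
A minimal sketch of how such a matrix of distances could be computed from the weights of a trained map is given below. Using the mean distance to the four direct neighbors on the grid is an assumption; implementations differ in which neighbors they include and how they aggregate the distances.

```python
import numpy as np


def u_matrix(weights):
    """Mean Euclidean distance from each neuron to its direct neighbors on the map."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    dists.append(np.linalg.norm(weights[r, c] - weights[rr, cc]))
            u[r, c] = np.mean(dists)
    return u


# A one-dimensional map of three neurons: two similar ones and one far away.
demo_weights = np.array([[[0.0], [0.0], [10.0]]])
demo_u = u_matrix(demo_weights)
```

In the demonstration the first neuron gets a small value (it lies inside a cluster), while the other two get larger values because the boundary runs between them.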

It is also possible to produce maps of the variables used to describe the input vectors. These show in which region of the map the corresponding variable has low values and in which region it has high ones. This makes it possible to draw ‘portraits’ of the clusters, that is, to compile their descriptions. The resulting set of maps forms a kind of ‘atlas’ that describes the position of the variables and clusters in the dataset.

 

Author: Research Associate Ist degree Alexander Tzvetkov, PhD

 


Logistic regressions

 

 

Logistic regressions are defined as regression models in which the dependent variable is measured on a weak scale. The literature treats separately logistic regressions with a binary dependent variable and those whose dependent variable takes more than two values.

Logistic regressions are usually used for the analysis of dependencies. In this case the dependent variable shows whether a given event has occurred or not. Often, however, logistic regression is used to solve a classification task, and then the dependent variable reflects the membership of the objects in a given group. The purpose is to use a set of properties to classify the objects into the given groups and to ‘produce a rule’ for the classification of new objects.

The logistic regression is described in general by the following equation:

$$y^{*} = \beta_0 + \sum_{k=1}^{K} \beta_k x_k + \varepsilon$$

where:

$y^{*}$ is the so-called ‘hidden’ or ‘latent’ variable, whose values cannot be observed;

$x_k$ are the independent variables;

$\beta_0, \beta_k$ are the parameters of the model;

$\varepsilon$ are the residuals.

What we observe is the variable:

$$y = \begin{cases} 1, & y^{*} > 0 \\ 0, & \text{otherwise} \end{cases}$$

The equation of the logistic regression model can be written in the following form:

$$\ln\frac{p}{1-p} = \beta_0 + \sum_{k=1}^{K} \beta_k x_k$$

where $p$ is the probability that $y$ equals 1. The dependent variable on the left side of the equation is called the ‘log of odds’. The ‘odds’ is the ratio of favorable to unfavorable events. In this case it is the ratio of the probability that the object belongs to a given group to the probability that it does not belong to it.

Transforming the above equation into the form:

$$p = \frac{e^{z}}{1 + e^{z}}, \qquad z = \beta_0 + \sum_{k=1}^{K} \beta_k x_k$$

shows that the probability of $y$ taking the value 1 is $p$. It can also be seen that the upper limit of $p$ is 1 and the lower limit is zero, even though the variable $z$ varies in the interval $(-\infty, +\infty)$.

Therefore $p$ on the left side of the equation represents the estimated probability that $y$ takes the value 1. Exactly this probability is used in the task of classifying objects. It is calculated by taking the antilogarithm of the log of odds, by the formula:

$$p = \frac{1}{1 + e^{-z}}$$

In the process of classification (unless explicitly indicated otherwise) the objects with $p$ higher than 0.5 are classified into the first group (for which $y$ equals 1), and the rest into the second one ($y$ equals 0). Another number from the interval [0, 1] can also be given as the classification cutoff.
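
The classification rule above can be sketched in Python as follows. The coefficients used in the example are made-up numbers, and the function names are illustrative only.

```python
import math


def probability(log_odds):
    """Convert the log of odds into the probability of belonging to the first group."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(x, beta0, betas, cutoff=0.5):
    """Return (group, probability) for one object with factor values x."""
    log_odds = beta0 + sum(b * xi for b, xi in zip(betas, x))
    p = probability(log_odds)
    return (1 if p > cutoff else 0), p


# Hypothetical model with an intercept of -1.0 and a single coefficient of 1.0.
group_a, p_a = classify([2.0], beta0=-1.0, betas=[1.0])
group_b, p_b = classify([0.0], beta0=-1.0, betas=[1.0])
```

Raising or lowering `cutoff` trades one kind of misclassification for the other, which is why a value other than 0.5 is sometimes preferred.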

The parameters of the logistic regression are interpreted similarly to those in regression analysis. They show by how much the log of odds changes with a one-unit change in the factor variable, provided that the values of the other factors included in the model stay constant. This is regarded as the so-called ‘pure influence’ of the given factor on the dependent variable. The influence of the factor variables on the change in the odds can also be estimated. This is done by calculating $e^{\beta_k}$ for each of them, where $\beta_k$ is the estimated coefficient of the factor.

The so-called ‘marginal effects’ are also calculated. They show how the probability of membership in a given group changes when the value of a factor variable changes by one unit, provided that the other variables are kept constant. For this purpose the antilogarithm of the log of odds is taken. When interpreting the marginal effects, the concept of the ‘basic group’ is used. It includes objects that have the average (basic) values of all the variables. If the factors are measured on strong scales, the objects from the basic group take their average values. For variables measured on weak scales, the objects from the basic group take the value of zero.
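
A minimal sketch of both quantities is shown below. Here the marginal effect is computed as the change in probability when one factor of the basic group is increased by one unit; this discrete-change reading, the function names and the example coefficients are assumptions made for the illustration.

```python
import math


def odds_ratio(beta):
    """Factor by which the odds change when the variable increases by one unit."""
    return math.exp(beta)


def marginal_effect(beta0, betas, base_values, k):
    """Change in probability when factor k of the basic group increases by one unit."""
    def prob(values):
        log_odds = beta0 + sum(b * v for b, v in zip(betas, values))
        return 1.0 / (1.0 + math.exp(-log_odds))

    shifted = list(base_values)
    shifted[k] += 1.0
    return prob(shifted) - prob(base_values)


# Hypothetical model: intercept 0.0, one factor with coefficient 1.0, basic value 0.0.
effect = marginal_effect(0.0, [1.0], [0.0], k=0)
ratio = odds_ratio(1.0)
```

Unlike the odds ratio, the marginal effect depends on the values chosen for the basic group, because the probability is a nonlinear function of the factors.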

The parameters of the logistic regression are estimated by the maximum likelihood method. The following requirements for the data are set:

  • A sufficiently large dataset.

  • Both alternatives of the dependent variable are sufficiently represented, that is, a large enough number of objects take the value 1 and a large enough number take the value 0. Otherwise a problem of heteroscedasticity may arise and the estimates of the parameters will not be efficient enough.
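
Maximum likelihood estimation of the parameters can be sketched with the Newton-Raphson method, one common way of maximizing the likelihood (the source does not say which numerical method is used). The synthetic data, the true coefficients and the number of iterations are assumptions made for the illustration; NumPy is assumed to be available.

```python
import numpy as np


def fit_logistic(X, y, iterations=25):
    """Estimate intercept and coefficients by Newton-Raphson on the log-likelihood."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add a column for the intercept
    beta = np.zeros(Xd.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-(Xd @ beta)))
        gradient = Xd.T @ (y - p)                       # score vector
        information = Xd.T @ (Xd * (p * (1.0 - p))[:, None])  # information matrix
        beta = beta + np.linalg.solve(information, gradient)
    return beta


rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
true_beta = np.array([-0.5, 1.0, -2.0])  # intercept and two coefficients
p_true = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = (rng.random(5000) < p_true).astype(float)
beta_hat = fit_logistic(X, y)
```

With a large dataset in which both outcomes are well represented, as required above, the estimates land close to the coefficients used to generate the data.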

 Author: Research Associate Ist degree Alexander Tzvetkov, PhD

 


Six Sigma

 

 

Six Sigma can be defined as a statistical approach to quality control that leads to production close to perfection. The name originates from the most widely used indicator of statistical dispersion, the standard deviation σ. With the help of Six Sigma it is possible in practice to eliminate the flaws of every business process, from production to the sale of goods and services.

What is the ultimate aim of Six Sigma? For a process to reach the Six Sigma level, it must NOT allow, on average, more than 3.4 defective items per 1 million products. Any product or service whose quality deviates from the user requirements is defined as defective.
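
The figure of 3.4 per million can be reproduced from the normal distribution. The sketch below relies on the conventional Six Sigma assumption, not stated in the text above, that the process mean may drift by 1.5 standard deviations, so a six sigma process is evaluated at 4.5 standard deviations from the nearest specification limit; the function name is illustrative.

```python
import math


def defects_per_million(sigma_level, shift=1.5):
    """One-sided normal tail beyond (sigma_level - shift), scaled to one million items."""
    z = sigma_level - shift
    tail_probability = 0.5 * math.erfc(z / math.sqrt(2.0))
    return tail_probability * 1_000_000


six_sigma_defects = defects_per_million(6.0)
three_sigma_defects = defects_per_million(3.0)
```

Evaluating the same function at lower sigma levels shows how quickly the expected number of defective items grows as the process variation increases.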

 

Standard Deviations from norms

 

Influence of defects on the total cost of companies
Source: Quality Associates International Inc.

Six Sigma includes two main approaches – DMAIC and DMADV.

DMAIC is mainly used for analyzing and improving the already existing business processes. It goes through the following five stages:

  • Define: defining the required aims and the processes that must be improved in accordance with the user demand and the company strategy.

  • Measure: measurement of the process and gathering of the needed information.

  • Analyze: analysis of the connections and dependencies between the factors that influence the process, with the help of suitable statistical methods.

  • Improve: improving and optimizing the processes through a number of statistical techniques, such as experiment design.

  • Control: control, in order to ensure that the processes run in the desired way.

DMADV is mainly used when developing new products and services for the market. It passes through the following five stages:

  • Define: defining the operations and processes that should be introduced, in accordance with the user demand and the company strategy.

  • Measure: estimation and determination of the factors influencing the quality, and of the corresponding risk.

  • Analyze: analysis and development of project alternatives. Choice of the best project.

  • Design: project design and process optimization. The use of simulations is possible.

  • Verify: confirmation of the project and its introduction into production.

 

Some of the best-known companies and organizations in the world that implement Six Sigma:

3M, A.B. Dick Company, Abbott Labs, Adolph Coors, Advanced Micro Devices, Aerospace Corp, Airborne, Alcoa, Allen Bradley, Allied Signal, Ampex, Apple Computers, Applied Magnetics, ASQC, Atmel, Baxter Pharmaseal, Beatrice Foods, Bell Helicopter, Boeing, Bombardier, Borden, Bristol-Myers Squibb, Bryn Mawr Hospital, Campbell Soup, Cellular 1, Chevron, Citicorp, City of Austin, TX, City of Dallas, TX, Clorox, Cooper Ind, Dannon, Defense Mapping Agency, Delnosa (Delco Electronics in Mexico), Digital Equipment Corp, DTM Corp, Eastman Kodak, Electronic Systems Center, Empak, Florida Dept. of Corrections, Ford Motor Company, GEC Marconi, General Dynamics, General Electric, Hazeltine Corp, Hewlett Packard, Holly Sugar, Honeywell, Intel, Junior Achievement, Kaiser Aluminum, Kraft General Foods, Larson & Darby, Inc, Laser Magnetic Storage, Lear Astronics, Lenox China, Litton Data Systems, Lockheed Martin, Loral, Los Alamos National Labs, Martin Marietta, McDonnell Douglas, Merix, Microsoft, Morton Int'l, Motorola, NASA, Nat'l Institute of Corrections, Nat'l Institute of Standards, Nat'l Semiconductor, Natural Gas Pipeline Company of America, Northrop Corp, PACE, Parkview Hospital, Pentagon, Pharmacia, PRC, Inc, Qualified Specialists, Ramtron Corp, Rockwell Int'l, Rohm & Haas, Seagate, Society of Plastics Engineers, Solar Optical, Sony, Star Quality, Storage Tek, Symbios Logic, Synthes, Technicomp, Tessco, Texaco, Texas Commerce Bank, Texas Dept. of Transportation, Texas Instruments, Titleist, Trane, TRW, Ultratech Stepper, United States Air Force, United States Army, United Technologies, UPS, USAA, Verbatim, Walbro Automotive, Walker Parking, Woodward Governor, Xerox.

 


Time series analysis and forecasting

 

 

Strategic forecasts

When business development strategies are elaborated, the key elements are researching and forecasting the development trends of the basic indicators and of the factors on which they depend, and preparing sufficiently accurate forecasts of their development over the next several years.

It is known that most statistical and econometric methods require a sufficiently long time series. If such data cannot be provided (especially for a starting business), MicroStat Analytics offers alternative statistical methods that ensure sufficient precision of the forecast.

Methods used:

Trend models
Method of exponential moving averages
Regression and autoregression models
ARIMA and VARMA models
SETAR
STAR
Markov processes and Markov-Switching models (MSW)
VAR models
Complex models
Neural networks
Impulsive functions
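
As an illustration of the simplest of the methods listed above, the sketch below implements simple exponential smoothing, in which each smoothed value is a weighted average of the newest observation and the previous smoothed value. The smoothing constant `alpha` and the sample series are assumptions; the last smoothed value serves as the forecast for the next period.

```python
def exponential_smoothing(series, alpha):
    """Return the exponentially smoothed series; alpha is the weight of the newest value."""
    smoothed = [series[0]]  # start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


monthly_sales = [10.0, 12.0, 14.0]  # made-up numbers
smoothed_sales = exponential_smoothing(monthly_sales, alpha=0.5)
next_period_forecast = smoothed_sales[-1]
```

A larger `alpha` makes the forecast react faster to recent changes, while a smaller one smooths out random fluctuations more strongly.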

Tactical forecasts

MicroStat Analytics offers research and analysis of the business components that are important from the viewpoint of tactical development (income, expenditures, workload of the resources over time, etc.) within the frame of one working day, week, month, etc. This provides an opportunity to plan the resources needed for the normal flow of business processes.

Models used:

Autospectral and cross-spectral analysis
Wavelets
Seasonal ARIMA models
ARCH/GARCH models

 


 



MicroStat Analytics Ltd © 2006-2010

All materials contained on this site are property of MicroStat Analytics Ltd and are protected by the Bulgarian copyright law. Any reproduction, distribution, transmission, display, publishing or broadcast of contents (in parts or as a whole) without the prior written permission of MicroStat Analytics Ltd is not allowed.