Gaussian Mixture Model (gaussian + mixture_model)
Selected Abstracts

Typologies of Advantage and Disadvantage: Socio-economic Outcomes in Australian Metropolitan Cities
GEOGRAPHICAL RESEARCH, Issue 4 2005
Scott Baum

Abstract: Australia's metropolitan cities have undergone significant social, economic and demographic change over the past several decades. In terms of socio-economic advantage and disadvantage, these changes, which are often associated with globalisation, wider economic and technological restructuring, the changing demographics of the population and shifts in public policy, are not evenly dispersed across cities but represent a range of often contrasting outcomes. The current paper develops a typology of socio-economic advantage and disadvantage for locations across Australian metropolitan cities. More specifically, the paper takes a range of Australian Bureau of Statistics data and applies a model-based approach in which the data are clustered with a parameterised Gaussian mixture model and discriminant analysis is used to examine the differences between the clusters. These clusters form the basis of a typology representing the range of socio-economic and demographic outcomes at the local community level. [source]

Improved GMM with parameter initialization for unsupervised adaptation of Brain–Computer interface
INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN BIOMEDICAL ENGINEERING, Issue 6 2010
Guangquan Liu

Abstract: An important property of brain signals is their nonstationarity. How to adapt a brain–computer interface (BCI) to changing brain states is one of the challenges faced by BCI researchers, especially in real applications where the subject's true intent is unknown to the system. The Gaussian mixture model (GMM) has been used for unsupervised adaptation of the classifier in BCIs. In this paper, a method for initializing the model parameters is proposed for expectation-maximization-based GMM parameter estimation. This improved GMM method and two other existing unsupervised adaptation methods are applied to groups of constructed artificial data with different properties, and the performance of the methods in different situations is analyzed. Compared with the other two unsupervised adaptation methods, the proposed method shows a better ability to adapt to changes and to discover class information from unlabelled data. The methods are also applied to real EEG data recorded in 19 experiments. On the real data, the proposed method achieves an error rate significantly lower than those of the other two unsupervised methods. The results on the real data agree with the analysis based on the artificial data, which confirms not only the effectiveness of the proposed method but also the validity of the constructed data. Copyright © 2009 John Wiley & Sons, Ltd. [source]
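The initialization problem noted in the abstract above is generic to EM-fitted GMMs: EM converges to a local optimum, so the starting values matter. As a minimal sketch of one common remedy (not the authors' algorithm), the Python snippet below seeds the component weights, means and precisions from class-wise statistics of labelled calibration data and then lets EM run unsupervised on a new session's unlabelled feature vectors; the function names and toy data are illustrative.

# Minimal sketch (not the paper's algorithm): seed EM for a GMM with class-wise
# statistics from labelled calibration trials, then let EM adapt the model
# unsupervised on a new session's unlabelled feature vectors.
# All function names and the toy data below are illustrative.
import numpy as np
from sklearn.mixture import GaussianMixture

def init_from_labels(X, y):
    """Class-wise weights, means and precisions used as EM starting values."""
    classes = np.unique(y)
    weights = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    precisions = np.array([np.linalg.inv(np.cov(X[y == c], rowvar=False))
                           for c in classes])
    return weights, means, precisions

def adapt_gmm(X_labelled, y_labelled, X_new):
    weights, means, precisions = init_from_labels(X_labelled, y_labelled)
    gmm = GaussianMixture(n_components=len(weights), covariance_type="full",
                          weights_init=weights, means_init=means,
                          precisions_init=precisions, max_iter=100)
    gmm.fit(X_new)          # EM runs unsupervised on the unlabelled new data
    return gmm

# Toy usage: two-class "calibration" features and a slightly shifted new session.
rng = np.random.default_rng(0)
X_old = np.vstack([rng.normal(0, 1, (100, 4)), rng.normal(2, 1, (100, 4))])
y_old = np.repeat([0, 1], 100)
X_new = np.vstack([rng.normal(0.5, 1, (80, 4)), rng.normal(2.5, 1, (80, 4))])
predicted = adapt_gmm(X_old, y_old, X_new).predict(X_new)

Seeding from a k-means partition instead of class-wise statistics would be another common choice; either way, the point is to reduce EM's sensitivity to arbitrary starting values.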
Nonstationary fault detection and diagnosis for multimode processes
AICHE JOURNAL, Issue 1 2010
Jialin Liu

Abstract: Fault isolation based on data-driven approaches usually assumes that abnormal-event data will form a new operating region, and measures the differences between normal and faulty states to identify the faulty variables. In practice, operators intervene in a process when they become aware that abnormalities are occurring, so the process behavior is nonstationary while the operators are trying to bring it back to its normal states. Therefore, the faulty variables have to be located as soon as the process leaves its normal operating regions. For an industrial process, multiple normal operating modes are common. On the basis of the assumption that the operating data follow a Gaussian distribution within an operating region, the Gaussian mixture model is employed to extract a series of operating modes from the historical process data. In this article, the local T² statistic and its normalized contribution chart are derived for detecting abnormalities early and isolating the faulty variables. © 2009 American Institute of Chemical Engineers AIChE J, 2010 [source]

Conditional Gaussian mixture modelling for dietary pattern analysis
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES A (STATISTICS IN SOCIETY), Issue 1 2007
Michael T. Fahey

Summary: Free-living individuals have multifaceted diets and consume foods in numerous combinations. In epidemiological studies it is desirable to characterize individual diets not only in terms of the quantity of individual dietary components but also in terms of dietary patterns. We describe the conditional Gaussian mixture model for dietary pattern analysis and show how it can be adapted to take account of important characteristics of self-reported dietary data. We illustrate this approach with an analysis of the 2000–2001 National Diet and Nutrition Survey of adults. The results strongly favoured a mixture model solution that allows clusters to vary in shape and size over the standard approach that has previously been used to find dietary patterns. [source]

Using unlabelled data to update classification rules with applications in food authenticity studies
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 1 2006
Nema Dean

Summary: An authentic food is one that is what it purports to be. Food processors and consumers need to be assured that, when they pay for a specific product or ingredient, they are receiving exactly what they pay for. Classification methods are an important tool in food authenticity studies, where they are used to assign food samples of unknown type to known types. A classification method is developed in which the classification rule is estimated using both the labelled and the unlabelled data, in contrast with many classical methods which use only the labelled data for estimation. This methodology models the data as arising from a Gaussian mixture model with parsimonious covariance structure, as is done in model-based clustering. A missing-data formulation of the mixture model is used and the models are fitted by the EM and classification EM algorithms. The methods are applied to the analysis of spectra of foodstuffs recorded over the visible and near-infrared wavelength range in food authenticity studies, and the performance of the proposed classification method is compared with that of model-based discriminant analysis. The proposed method is shown to yield very good misclassification rates; its correct classification rate was observed to be as much as 15% higher than that of model-based discriminant analysis. [source]
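The update scheme described in the last abstract treats the unknown labels of the unlabelled samples as missing data. Below is a minimal sketch of that general idea (not the paper's implementation, which uses parsimonious covariance structures as in model-based clustering): labelled rows keep fixed one-hot responsibilities while EM re-estimates soft responsibilities for the unlabelled rows. Function and variable names are illustrative, and class labels are assumed to be coded 0, ..., G-1.

# Minimal sketch of semi-supervised EM for a Gaussian mixture classifier
# (not the paper's implementation): labelled rows keep fixed one-hot
# responsibilities; unlabelled rows get soft responsibilities re-estimated
# at each E-step. Assumes class labels coded 0, ..., G-1; names illustrative.
import numpy as np
from scipy.stats import multivariate_normal

def semi_supervised_gmm(X_lab, y_lab, X_unlab, n_iter=50):
    G = len(np.unique(y_lab))
    X = np.vstack([X_lab, X_unlab])
    n, d = X.shape
    R_lab = np.eye(G)[y_lab]                       # fixed, known labels
    R_unlab = np.full((len(X_unlab), G), 1.0 / G)  # start from uniform
    for _ in range(n_iter):
        R = np.vstack([R_lab, R_unlab])
        # M-step: mixing proportions, means and covariances from all rows.
        pi = R.mean(axis=0)
        mu = (R.T @ X) / R.sum(axis=0)[:, None]
        dens = np.zeros((n, G))
        for g in range(G):
            diff = X - mu[g]
            cov = (R[:, g, None] * diff).T @ diff / R[:, g].sum()
            cov += 1e-6 * np.eye(d)                # small ridge for stability
            dens[:, g] = pi[g] * multivariate_normal.pdf(X, mu[g], cov)
        # E-step: update soft labels for the unlabelled rows only.
        R_unlab = dens[len(X_lab):]
        R_unlab = R_unlab / R_unlab.sum(axis=1, keepdims=True)
    return R_unlab.argmax(axis=1)                  # predicted classes

Replacing the soft responsibilities of the unlabelled rows with hard 0/1 assignments at each iteration would give the classification EM (CEM) variant mentioned in the abstract.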
Correlating two continuous variables subject to detection limits in the context of mixture distributions
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 5 2005
Haitao Chu

Summary: In individuals who are infected with human immunodeficiency virus (HIV), distributions of quantitative HIV ribonucleic acid measurements may be highly left censored, with an extra spike below the limit of detection LD of the assay. In univariate analysis, a two-component mixture model with the lower component entirely supported on [0, LD] is recommended to better model the extra spike. Let LD1 and LD2 be the limits of detection for two HIV viral load measurements. When estimating the correlation coefficient between two different measures of viral load obtained from each of a sample of patients, a bivariate Gaussian mixture model is recommended to better model the extra spikes on [0, LD1] and [0, LD2] when the proportion below LD is incompatible with the left-hand tail of a bivariate Gaussian distribution. When the proportion of both variables falling below LD is very large, the parameters of the lower component may not be estimable, since almost all observations from the lower component fall below LD; a partial solution is to assume that the lower component's entire support is on [0, LD1] × [0, LD2]. Maximum likelihood is used to estimate the parameters of the lower and higher components. To evaluate whether there is a lower component, we apply a Monte Carlo approach to assess the p-value of the likelihood ratio test and two information criteria: a bootstrap-based information criterion and a cross-validation-based information criterion. We provide simulation results to evaluate the performance of these methods and compare them with two ad hoc estimators and a single-component bivariate Gaussian likelihood estimator. The methods are applied to data from a cohort study of HIV-infected men in Rio de Janeiro, Brazil, and to data from the Women's Interagency HIV oral study. These results emphasize the need for caution when estimating correlation coefficients from data with a large proportion of non-detectable values when that proportion is incompatible with the left-hand tail of a bivariate Gaussian distribution. [source]

Using statistical image models for objective evaluation of spot detection in two-dimensional gels
PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 6 2003
Mike Rogers

Abstract: Protein spot detection is central to the analysis of two-dimensional electrophoresis gel images. There are many commercially available packages, each implementing a protein spot detection algorithm. Despite this, there have been relatively few studies comparing the performance characteristics of the different packages. This is partly because different packages employ different sets of user-adjustable parameters, and partly because the images are complex: to carry out an evaluation, "ground truth" data specifying spot positions, shapes and intensities need to be defined subjectively on selected test images. We address this problem by proposing a method of evaluation using synthetic images with an unambiguous interpretation. The characteristics of the spots in the synthetic images are determined from statistical models of the shape, intensity, size, spread and location of real spot data, with the distribution of parameters described by a Gaussian mixture model fitted to training images. The synthetic images allow us to investigate the effects of individual image properties, such as signal-to-noise ratio and degree of spot overlap, by measuring quantifiable outcomes, e.g. accuracy of spot position and false-positive and false-negative detection rates. We illustrate the approach by carrying out quantitative evaluations of spot detection in a number of widely used analysis packages. [source]
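As a loose illustration of how such synthetic test images can be built (not the paper's code), the sketch below fits a GMM to a table of per-spot parameters measured from real gels and samples new parameter vectors to render a synthetic image with exactly known ground truth; the column layout, the simple 2-D Gaussian spot shape and all names are assumptions.

# Loose illustration (not the paper's code): fit a GMM to measured per-spot
# parameters and sample new spots to paint a synthetic gel image whose ground
# truth is known exactly. The column layout (x, y, sigma_x, sigma_y, peak),
# the Gaussian spot shape and all names are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

def synthetic_gel(real_spot_params, shape=(256, 256), n_spots=50, seed=0):
    """real_spot_params: rows of (x, y, sigma_x, sigma_y, peak) from real gels."""
    gmm = GaussianMixture(n_components=5, covariance_type="full",
                          random_state=seed).fit(real_spot_params)
    params, _ = gmm.sample(n_spots)        # sampled spot list = ground truth
    img = np.zeros(shape)
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    for x, y, sx, sy, peak in params:
        # Keep sampled widths and intensities positive for rendering.
        sx, sy, peak = abs(sx) + 1.0, abs(sy) + 1.0, abs(peak)
        img += peak * np.exp(-0.5 * (((xx - x) / sx) ** 2 + ((yy - y) / sy) ** 2))
    img += np.random.default_rng(seed).normal(0, 0.01, shape)   # sensor noise
    return img, params

Varying the added noise level or the number of sampled spots then gives direct control over signal-to-noise ratio and spot overlap, the image properties the abstract singles out.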