Parallel Factor Analysis (parallel + factor_analysis)


Selected Abstracts


Multi-way models for sensory profiling data

JOURNAL OF CHEMOMETRICS, Issue 1 2008
Rasmus Bro
Abstract One of the problems in analyzing sensory profiling data is to handle the systematic individual differences in the assessments from different panelists. It is unavoidable that different persons have, at least to a certain degree, different perceptions of the samples as well as a different understanding of the attributes or of the scales used for quantifying the assessments. Hence, any model attempting to describe sensory profiling data needs to deal with individual differences, either implicitly or explicitly. In this paper, a unifying family of models is proposed based on the assumptions that (i) latent variables are appropriate for sensory data, and (ii) individual differences occur. Based on how individual differences occur, various mathematical models can be constructed, all aiming at modeling simultaneously the sample-specific variation and the panelist-specific variation. The model family includes Principal Component Analysis (PCA) and PARAllel FACtor analysis (PARAFAC). The paper can be viewed as extending the latent variable approach commonly based on PCA to multi-way models that specifically take certain panelist-variations into account. The proposed model family is focused on analyzing data from quantitative descriptive analysis with fixed vocabulary, but it also provides a foundation upon which comparisons, extensions and further developments can be made. An example is given which shows that even for well-working data, models handling individual differences can shed important light on differences between the quality of the data from individual panelists. Copyright © 2007 John Wiley & Sons, Ltd.


Application of Multiway Chemometric Techniques for Analysis of AC Voltammetric Data

ELECTROANALYSIS, Issue 3-5 2009
Aleksander Jaworski
Abstract Three multiway calibration techniques have been applied to determine the suppressor concentration in industrial copper electrometallization baths used in semiconductor manufacturing. Parallel factor analysis (PARAFAC) for multiway array decomposition coupled with inverse least squares (ILS) regression (PARAFAC/ILS), direct trilinear decomposition (DTLD) coupled with ILS (DTLD/ILS), and multilinear partial least squares (N-PLS) regression were employed to develop and test calibration models based on trilinear AC voltammetric data. All three techniques produce comparably reliable calibration models and provide quantitative information about their robustness.


A fully robust PARAFAC method for analyzing fluorescence data

JOURNAL OF CHEMOMETRICS, Issue 3 2009
Sanne Engelen
Abstract Parallel factor analysis (PARAFAC) is a widespread method for modeling fluorescence data by means of an alternating least squares procedure. Consequently, the PARAFAC estimates are highly influenced by outlying excitation–emission landscapes (EEM) and element-wise outliers, such as Raman and Rayleigh scatter. Recently, a robust PARAFAC method that circumvents the harmful effects of outlying samples has been developed. For removing the scatter effects on the final PARAFAC model, different techniques exist. More recently, an automated scatter identification tool has been constructed. However, there is still no robust method for handling fluorescence data that contain both outlying EEM landscapes and scatter. In this paper, we present an iterative algorithm where the robust PARAFAC method and the scatter identification tool are alternately performed. A fully automated robust PARAFAC method is obtained in that way. The method is assessed by means of simulations and a laboratory-made data set. Copyright © 2009 John Wiley & Sons, Ltd.
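
The alternating least squares procedure mentioned in the abstract above can be sketched in a few lines of NumPy. This is a minimal, non-robust illustration of plain PARAFAC-ALS for a three-way array, not the robust method of the paper; the function names (khatri_rao, parafac_als, reconstruct), the random initialization and the fixed iteration count are assumptions made here for illustration only.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product; row index runs over (row of U, row of V)."""
    r = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, r)

def reconstruct(A, B, C):
    """Trilinear model: X[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

def parafac_als(X, rank, n_iter=300, seed=0):
    """Plain (non-robust) PARAFAC fitted by alternating least squares."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    # Unfold the array along each mode; column order matches khatri_rao above.
    X1 = X.reshape(I, J * K)                     # columns indexed by (j, k)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # columns indexed by (i, k)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # columns indexed by (i, j)
    for _ in range(n_iter):
        # Each update is the least-squares solution with the other two modes fixed.
        A = X1 @ np.linalg.pinv(khatri_rao(B, C).T)
        B = X2 @ np.linalg.pinv(khatri_rao(A, C).T)
        C = X3 @ np.linalg.pinv(khatri_rao(A, B).T)
    return A, B, C
```

Because every update minimizes an ordinary sum of squared residuals, a single grossly outlying sample or a band of scatter can pull all three loading matrices, which is the sensitivity the abstract refers to.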


Accelerating the analyses of 3-way and 4-way PARAFAC models utilizing multi-dimensional wavelet compression

JOURNAL OF CHEMOMETRICS, Issue 11-12 2005
Jeff Cramer
Abstract Parallel factor analysis (PARAFAC) is one of the most popular methods for evaluating multi-way data sets, such as those typically acquired by hyphenated measurement techniques. One of the reasons for PARAFAC's popularity is the ability to extract directly interpretable chemometric models with little a priori information and the capability to handle unknown interferents and missing values. However, PARAFAC requires long computation times that often prohibit sufficiently fast analyses for applications such as online sensing. An additional challenge faced by PARAFAC users is the handling and storage of very large, high-dimensional data sets. Accelerating computations and reducing storage requirements in multi-way analyses are the topics of this manuscript. This study introduces a data pre-processing method based on multi-dimensional wavelet transforms (WTs), which enables highly efficient data compression applied prior to data evaluation. Because multi-dimensional WTs are linear, the intrinsic underlying linear data construction is preserved in the wavelet domain. In almost all studied examples, computation times for analyzing the much smaller, compressed data sets could be reduced so much that the additional effort for wavelet compression was more than compensated. For 3-way and 4-way synthetic and experimental data sets, acceleration factors up to 50 have been achieved; these data sets could be compressed down to a few per cent of the original size. Despite the high compression, accurate and interpretable models were derived, which are in good agreement with conventionally determined PARAFAC models. This study also found that the wavelet type used for compression is an important factor determining acceleration factors, data compression ratios and model quality. Copyright © 2006 John Wiley & Sons, Ltd.


PARAFASCA: ASCA combined with PARAFAC for the analysis of metabolic fingerprinting data

JOURNAL OF CHEMOMETRICS, Issue 2 2008
Jeroen J. Jansen
Abstract Novel post-genomics experiments such as metabolomics provide datasets that are highly multivariate and often reflect an underlying experimental design, developed with a specific experimental question in mind. ANOVA-simultaneous component analysis (ASCA) can be used for the analysis of multivariate data obtained from an experimental design instead of the widely used principal component analysis (PCA). This increases the interpretability of the model in terms of the experimental question. Aside from the levels of individual factors, variation that can be described by the experimental design may also depend on levels of multiple (crossed) factors simultaneously, e.g. the interactions. ASCA describes each contribution with a PCA model, but a contribution depending on crossed factors may be described more parsimoniously by multiway models like parallel factor analysis (PARAFAC). The combination of PARAFAC and ASCA, named PARAFASCA, provides a view on the data that is both parsimonious and focused on the experimental question. The novel method is used to analyze a dataset in which the effect of two doses of hydrazine on the urinary chemical composition of rats is investigated by time-resolved metabolic fingerprinting with nuclear magnetic resonance (NMR) spectroscopy. This experiment has been conducted to monitor the dose-specific changes in urine composition over time upon hydrazine administration. Comparison of the PCA, the ASCA and the PARAFASCA models shows that ASCA and PARAFASCA describe the data in a way that is more focused on the experimental question than PCA, but that PARAFASCA is more parsimonious than ASCA and better separates the variation underlying different effects. Copyright © 2008 John Wiley & Sons, Ltd.


Handling of Rayleigh and Raman scatter for PARAFAC modeling of fluorescence data using interpolation

JOURNAL OF CHEMOMETRICS, Issue 3-4 2006
Morteza Bahram
Abstract Fluorescence excitation-emission matrix (EEM) measurements are useful in fields such as food science, analytical chemistry, biochemistry and environmental science. EEMs contain information which can be modeled using the parallel factor analysis (PARAFAC) model, but the data analysis is often complicated by both Rayleigh and Raman scattering. There are several established ways to deal with scattering effects. However, all of these methods have associated problems. This paper develops a new method for handling scattering using interpolation in the areas affected by first- and second-order Rayleigh and Raman scatter in such a way that the interfering signal is, ideally, removed. The suggested method is fast and requires no additional input other than specifying the scattering region. The results of the proposed method were compared with those obtained from common alternative approaches used for preprocessing fluorescence data before analysis with PARAFAC and were shown to be equally good for various types of EEM data. The main advantage of the interpolation method lies in its lack of additional metaparameters, its algorithmic speed and the subsequent speed-up of PARAFAC modeling. It also allows for using EEM data in software not able to handle missing data. Copyright © 2007 John Wiley & Sons, Ltd.
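
The idea of interpolating across a user-specified scatter region can be illustrated with a short NumPy sketch. The abstract does not state which interpolation scheme the authors use, so the linear interpolation along the emission axis below is an assumption, as are the function name interpolate_scatter, the band half-width parameter and the choice to locate the band at order times the excitation wavelength (Rayleigh scatter only; Raman bands would need a different band position).

```python
import numpy as np

def interpolate_scatter(eem, ex, em, half_width=15.0, order=1):
    """Replace a Rayleigh scatter band in an EEM by interpolated values.

    eem        : array of shape (n_ex, n_em), one emission spectrum per row
    ex, em     : excitation and emission wavelengths (em must be increasing)
    half_width : assumed half-width of the scatter band, same units as em
    order      : 1 for first-order, 2 for second-order Rayleigh scatter
    """
    out = np.asarray(eem, dtype=float).copy()
    em = np.asarray(em, dtype=float)
    for i, ex_wl in enumerate(ex):
        # Emission channels lying inside the scatter band for this excitation.
        mask = np.abs(em - order * ex_wl) <= half_width
        if mask.any() and (~mask).sum() >= 2:
            # Linear interpolation from the unaffected channels.
            # Note: np.interp holds the edge value constant outside the data range.
            out[i, mask] = np.interp(em[mask], em[~mask], out[i, ~mask])
    return out
```

The result contains no missing values, so it can be passed directly to a PARAFAC routine that does not support missing data; the only input beyond the data is the band specification, in line with the abstract.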


Mathematical improvements to maximum likelihood parallel factor analysis: theory and simulations

JOURNAL OF CHEMOMETRICS, Issue 4 2005
Lorenzo Vega-Montoto
Abstract A number of simplified algorithms for carrying out maximum likelihood parallel factor analysis (MLPARAFAC) for three-way data affected by different error structures are described. The MLPARAFAC method was introduced to establish the theoretical basis to treat heteroscedastic and/or correlated noise affecting trilinear data. Unfortunately, the large size of the error covariance matrix employed in the general formulation of this algorithm prevents its application to solve standard three-way problems. The algorithms developed here are based on the principle of alternating least squares, but differ from the generalized MLPARAFAC algorithm in that they do not use equivalent alternatives of the objective function to estimate the loadings for the different modes. Instead, these simplified algorithms tackle the loss of symmetry of the PARAFAC model by using only one representation of the objective function to estimate the loadings of all of the modes. In addition, a compression step is introduced to allow the use of the generalized algorithm. Simulation studies carried out under a variety of measurement error conditions were used for statistical validation of the maximum likelihood properties of the algorithms and to assess the quality of the results and computation time. The simplified MLPARAFAC methods are also shown to produce more accurate results than PARAFAC under a variety of conditions. Copyright © 2005 John Wiley & Sons, Ltd.


A new efficient method for determining the number of components in PARAFAC models

JOURNAL OF CHEMOMETRICS, Issue 5 2003
Rasmus Bro
Abstract A new diagnostic called the core consistency diagnostic (CORCONDIA) is suggested for determining the proper number of components for multiway models. It applies especially to the parallel factor analysis (PARAFAC) model, but also to other models that can be considered as restricted Tucker3 models. It is based on scrutinizing the 'appropriateness' of the structural model based on the data and the estimated parameters of gradually augmented models. A PARAFAC model (employing dimension-wise combinations of components for all modes) is called appropriate if adding other combinations of the same components does not improve the fit considerably. It is proposed to choose the largest model that is still sufficiently appropriate. Using examples from a range of different types of data, it is shown that the core consistency diagnostic is an effective tool for determining the appropriate number of components in, for example, PARAFAC models. However, it is also shown, using simulated data, that the theoretical understanding of CORCONDIA is not yet complete. Copyright © 2003 John Wiley & Sons, Ltd.
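
A rough sketch of how such a diagnostic can be computed is given below. The abstract does not state a formula; the code uses the commonly quoted definition, 100 * (1 - sum((G - T)^2) / F), where G is the least-squares Tucker3 core for the fixed PARAFAC loadings, T is the superdiagonal array of ones and F is the number of components. The function name core_consistency and the assumption that loading matrices A, B, C come from a previously fitted three-way PARAFAC model are illustrative choices made here.

```python
import numpy as np

def core_consistency(X, A, B, C):
    """Core consistency (in percent) of a three-way PARAFAC model.

    X       : data array of shape (I, J, K)
    A, B, C : PARAFAC loading matrices of shapes (I, F), (J, F), (K, F)
    """
    F = A.shape[1]
    # Least-squares Tucker3 core with the loadings held fixed:
    # G = X x1 pinv(A) x2 pinv(B) x3 pinv(C)
    G = np.einsum(
        "pi,qj,rk,ijk->pqr",
        np.linalg.pinv(A), np.linalg.pinv(B), np.linalg.pinv(C), X,
        optimize=True,
    )
    # Ideal PARAFAC core: ones on the superdiagonal, zeros elsewhere.
    T = np.zeros((F, F, F))
    idx = np.arange(F)
    T[idx, idx, idx] = 1.0
    return 100.0 * (1.0 - np.sum((G - T) ** 2) / F)
```

A value near 100 means that allowing interactions between components of different modes adds essentially nothing, so the trilinear structure is appropriate; in practice the value is computed for models with an increasing number of components and the largest model that still scores high is retained.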