Multivariate Data (multivariate + data)

Terms modified by Multivariate Data

  • multivariate data analysis

Selected Abstracts


    Some contributions to the analysis of multivariate data

    BIOMETRICAL JOURNAL, Issue 2 2009
    Arne C. Bathke
    Abstract: In this paper, we provide an overview of recently developed methods for the analysis of multivariate data that do not necessarily emanate from a normal universe. Multivariate data occur naturally in the life sciences and in other research fields. When drawing inference, it is generally recommended to take the multivariate nature of the data into account, and not merely analyze each variable separately. Furthermore, it is often of major interest to select an appropriate set of important variables. We present contributions in three different, but closely related, research areas: first, a general approach to the comparison of mean vectors, which allows for profile analysis and tests of dimensionality; second, non-parametric and parametric methods for the comparison of independent samples of multivariate observations; and third, methods for the situation where the experimental units are observed repeatedly, for example, over time, and the main focus is on analyzing different time profiles when the number p of repeated observations per subject is larger than the number n of subjects.


    Visual Clustering in Parallel Coordinates

    COMPUTER GRAPHICS FORUM, Issue 3 2008
    Hong Zhou
    Abstract: Parallel coordinates have been widely applied to visualize high-dimensional and multivariate data, discerning patterns within the data through visual clustering. However, the effectiveness of this technique on large data is reduced by edge clutter. In this paper, we present a novel framework to reduce edge clutter, consequently improving the effectiveness of visual clustering. We exploit curved edges and optimize the arrangement of these curved edges by minimizing their curvature and maximizing the parallelism of adjacent edges. The overall visual clustering is improved by adjusting the shape of the edges while keeping their relative order. The experiments on several representative datasets demonstrate the effectiveness of our approach.


    Testing the capital asset pricing model efficiently under elliptical symmetry: a semiparametric approach

    JOURNAL OF APPLIED ECONOMETRICS, Issue 6 2002
    Douglas J. Hodgson
    We develop new tests of the capital asset pricing model that take account of and are valid under the assumption that the distribution generating returns is elliptically symmetric; this assumption is necessary and sufficient for the validity of the CAPM. Our test is based on semiparametric efficient estimation procedures for a seemingly unrelated regression model where the multivariate error density is elliptically symmetric, but otherwise unrestricted. The elliptical symmetry assumption allows us to avoid the curse of dimensionality problem that typically arises in multivariate semiparametric estimation procedures, because the multivariate elliptically symmetric density function can be written as a function of a scalar transformation of the observed multivariate data. The elliptically symmetric family includes a number of thick-tailed distributions and so is potentially relevant in financial applications. Our estimated betas are lower than the OLS estimates, and our parameter estimates are much less consistent with the CAPM restrictions than the corresponding OLS estimates. Copyright © 2002 John Wiley & Sons, Ltd.
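
    The "scalar transformation" mentioned above can be written out using the standard textbook form of an elliptically symmetric density. The symbols below (location \mu, scatter matrix \Sigma, generator g, dimension d) are notation assumed here for illustration and are not taken from the paper itself:

```latex
% Standard form of a d-dimensional elliptically symmetric density.
% \mu (location), \Sigma (positive definite scatter) and g (a non-negative
% function of one real argument) are assumed notation, not the paper's.
f(x) \;=\; |\Sigma|^{-1/2}\, g\!\left( (x-\mu)^{\top} \Sigma^{-1} (x-\mu) \right),
\qquad x \in \mathbb{R}^{d}.
```

    Under this form the only unknown function, g, has a single real argument (the squared Mahalanobis-type distance), which is why a nonparametric estimate of it does not degrade as the dimension d grows.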


    PARAFASCA: ASCA combined with PARAFAC for the analysis of metabolic fingerprinting data

    JOURNAL OF CHEMOMETRICS, Issue 2 2008
    Jeroen J. Jansen
    Abstract: Novel post-genomics experiments such as metabolomics provide datasets that are highly multivariate and often reflect an underlying experimental design, developed with a specific experimental question in mind. ANOVA-simultaneous component analysis (ASCA) can be used for the analysis of multivariate data obtained from an experimental design instead of the widely used principal component analysis (PCA). This increases the interpretability of the model in terms of the experimental question. Aside from the levels of individual factors, variation that can be described by the experimental design may also depend on levels of multiple (crossed) factors simultaneously, e.g. the interactions. ASCA describes each contribution with a PCA model, but a contribution depending on crossed factors may be described more parsimoniously by multiway models like parallel factor analysis (PARAFAC). The combination of PARAFAC and ASCA, named PARAFASCA, provides a view on the data that is both parsimonious and focused on the experimental question. The novel method is used to analyze a dataset in which the effect of two doses of hydrazine on the urinary chemical composition of rats is investigated by time-resolved metabolic fingerprinting with nuclear magnetic resonance (NMR) spectroscopy. This experiment has been conducted to monitor the dose-specific changes in urine composition over time upon hydrazine administration. Comparison of the PCA, the ASCA and the PARAFASCA models shows that ASCA and PARAFASCA describe the data in a way that is more focused on the experimental question than PCA does, but that PARAFASCA is more parsimonious than ASCA and separates the variation underlying different effects better. Copyright © 2008 John Wiley & Sons, Ltd.
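
    The ASCA step described above (split the data by the experimental design, then fit a PCA model to each contribution) might be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a balanced design with two crossed factors, the names `level_means` and `asca_two_factor` are hypothetical, and the PARAFAC modelling of the interaction that distinguishes PARAFASCA is not included.

```python
import numpy as np

def level_means(X, labels):
    """Replace each row of X by the mean of all rows sharing its label."""
    labels = np.asarray(labels)
    out = np.empty_like(X)
    for lab in np.unique(labels):
        mask = labels == lab
        out[mask] = X[mask].mean(axis=0)
    return out

def asca_two_factor(X, a, b):
    """ANOVA-style split of X into effect matrices, then PCA (via SVD) per effect.

    Assumes a balanced two-factor design (hypothetical helper, for illustration).
    """
    X = np.asarray(X, dtype=float)
    a, b = np.asarray(a), np.asarray(b)
    Xc = X - X.mean(axis=0)                      # remove the overall mean
    Xa = level_means(Xc, a)                      # main effect of factor A
    Xb = level_means(Xc, b)                      # main effect of factor B
    cell = np.array([f"{i}|{j}" for i, j in zip(a, b)])
    Xab = level_means(Xc, cell) - Xa - Xb        # interaction A x B
    E = Xc - Xa - Xb - Xab                       # residual variation
    effects = {"A": Xa, "B": Xb, "AB": Xab}
    models = {}
    for name, M in effects.items():
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        models[name] = {
            "scores": U * s,                     # PCA scores of this effect
            "loadings": Vt.T,                    # PCA loadings of this effect
            "ssq": float((M ** 2).sum()),        # variation carried by the effect
        }
    return effects, E, models

# Toy data: 2 doses x 3 time points x 4 replicates, 5 measured variables.
rng = np.random.default_rng(2)
dose = np.repeat(["low", "high"], 12)
time = np.tile(np.repeat([0, 1, 2], 4), 2)
X = rng.standard_normal((24, 5))
effects, E, models = asca_two_factor(X, dose, time)
```

    For a balanced design the effect matrices are mutually orthogonal, so their sums of squares add up to the total variation of the centred data; this is what makes the per-effect PCA models separately interpretable.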


    Maximum likelihood scaling (MALS)

    JOURNAL OF CHEMOMETRICS, Issue 3-4 2006
    Huub C. J. Hoefsloot
    Abstract: A filtering procedure is introduced for multivariate data that does not suffer from noise amplification by scaling. A maximum likelihood principal component analysis (MLPCA) step is used as a filter that partly removes noise. This filtering can be used prior to any subsequent scaling and multivariate analysis of the data and is especially useful for data with moderate and low signal-to-noise ratios, such as metabolomics, proteomics and transcriptomics data. Copyright © 2007 John Wiley & Sons, Ltd.


    Clustering of variables to analyze spectral data

    JOURNAL OF CHEMOMETRICS, Issue 3 2005
    E. Vigneau
    Abstract: A cluster analysis of variables around latent variables is presented and applied in order to identify groups among near-infrared (NIR) spectral variables. By organizing multivariate data into a small number of clusters, each of them being represented by a component, this approach makes it possible to reduce the dimensionality of the problem. For the NIR data considered herein, it turned out that the groups of spectral variables are associated with various spectral regions. This feature can be helpful for the interpretation of the outcomes. From a predictive perspective, the groups of variables can be used as blocks in multiblock partial least squares models. Alternatively, the latent variables associated with the various clusters can be used as predictors. The cluster analysis procedure, and the ways its outcomes can be used for prediction, are illustrated on the basis of sensory and NIR data on green peas. Copyright © 2005 John Wiley & Sons, Ltd.


    Target transform fitting: a new method for the non-linear fitting of multivariate data with separable parameters

    JOURNAL OF CHEMOMETRICS, Issue 6 2001
    Porn Jandanklang
    Abstract: Data fitting is an important technique in chemistry. The number of parameters to be fitted is a most significant aspect: the larger the number, the more difficult the task. A unique combination of target factor analysis with non-linear data fitting can result in complete or partial separation of the parameters, which can then be fitted independently. Thus, instead of one multiparameter fit, several fits are performed with only one or a few parameters. The procedure can also support model development. Applications in kinetics and chromatography are presented. Copyright © 2001 John Wiley & Sons, Ltd.


    Predictive and correlative techniques for the design, optimisation and manufacture of solid dosage forms

    JOURNAL OF PHARMACY AND PHARMACOLOGY: AN INTERNATIONAL JOURNAL OF PHARMACEUTICAL SCIENCE, Issue 1 2003
    Ian J. Hardy
    Abstract: There is much interest in predicting the properties of pharmaceutical dosage forms from the properties of the raw materials they contain. Achieving this with reasonable accuracy would aid the faster development and manufacture of dosage forms. A variety of approaches to prediction or correlation of properties are reviewed. These approaches have variable accuracy, with no single technique yet able to provide an accurate prediction of the overall properties of the dosage form. However, there have been some successes in predicting trends within a formulation series based on the physicochemical and mechanical properties of raw materials, predicting process scale-up through mechanical characterisation of materials and predicting product characteristics by process monitoring. Advances in information technology have increased predictive capability and accuracy by facilitating the analysis of complex multivariate data, mapping formulation characteristics and capturing past knowledge and experience.


    Invariant co-ordinate selection

    JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 3 2009
    David E. Tyler
    Summary. A general method for exploring multivariate data by comparing different estimates of multivariate scatter is presented. The method is based on the eigenvalue-eigenvector decomposition of one scatter matrix relative to another. In particular, it is shown that the eigenvectors can be used to generate an affine invariant co-ordinate system for the multivariate data. Consequently, we view this method as a method for invariant co-ordinate selection. By plotting the data with respect to this new invariant co-ordinate system, various data structures can be revealed. For example, under certain independent components models, it is shown that the invariant co-ordinates correspond to the independent components. Another example pertains to mixtures of elliptical distributions. In this case, it is shown that a subset of the invariant co-ordinates corresponds to Fisher's linear discriminant subspace, even though the class identifications of the data points are unknown. Some illustrative examples are given.
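
    The decomposition of one scatter matrix relative to another can be illustrated with a short NumPy sketch. The abstract does not say which pair of scatter estimates to use; the pair below (the sample covariance and a fourth-moment scatter matrix) is one common choice and is an assumption made here, as is the function name `invariant_coordinates`.

```python
import numpy as np

def invariant_coordinates(X):
    """Eigen-decompose a second scatter matrix relative to the covariance.

    Assumed scatter pair: sample covariance S1 and a fourth-moment scatter S2.
    Returns the relative eigenvalues (descending) and the new co-ordinates.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = np.cov(Xc, rowvar=False)
    # Symmetric inverse square root of S1, used to whiten the data.
    evals, evecs = np.linalg.eigh(S1)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ W
    # Fourth-moment scatter expressed in the whitened co-ordinates:
    # each observation is weighted by its squared Mahalanobis distance.
    r2 = np.sum(Z ** 2, axis=1)
    S2w = (Z * r2[:, None]).T @ Z / (n * (p + 2))
    # Eigenvectors of S2w solve the relative problem S1^{-1} S2 b = rho b.
    rho, U = np.linalg.eigh(S2w)
    order = np.argsort(rho)[::-1]
    rho, U = rho[order], U[:, order]
    return rho, Z @ U

# Toy data with one heavy-tailed column, plus an affine transform of it.
rng = np.random.default_rng(0)
X = rng.standard_normal((300, 3))
X[:, 0] = X[:, 0] ** 3
A = np.array([[2.0, 1.0, 0.0], [0.0, 1.0, 3.0], [1.0, 0.0, 1.0]])
rho_x, Z_x = invariant_coordinates(X)
rho_y, Z_y = invariant_coordinates(X @ A.T + np.array([1.0, -2.0, 0.5]))
```

    In this sketch the relative eigenvalues are unchanged by the affine transform, and the new co-ordinates agree up to sign when those eigenvalues are distinct, which is the invariance property the abstract refers to.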


    Biplots of compositional data

    JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 4 2002
    John Aitchison
    Summary. The singular value decomposition and its interpretation as a linear biplot have proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to the special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is applied to a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
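
    A relative variation biplot of this kind is usually built from the centred log-ratio (clr) transform followed by a singular value decomposition. The sketch below follows that usual construction; the abstract does not spell it out, so the details here are an assumption. The function name `relative_variation_biplot` and the toy data are hypothetical, and the paintings data set is not reproduced.

```python
import numpy as np

def relative_variation_biplot(comp):
    """Rank-2 biplot co-ordinates for compositional data (rows = samples).

    Steps: close rows to unit sum, take centred log-ratios, centre the
    columns, then apply the SVD. A sketch under the assumptions stated above.
    """
    comp = np.asarray(comp, dtype=float)
    comp = comp / comp.sum(axis=1, keepdims=True)     # enforce unit sum
    logc = np.log(comp)                               # requires positive parts
    clr = logc - logc.mean(axis=1, keepdims=True)     # centred log-ratios
    Z = clr - clr.mean(axis=0, keepdims=True)         # column centring
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    n = Z.shape[0]
    rows = np.sqrt(n - 1) * U[:, :2]                  # one marker per sample
    cols = (Vt[:2].T * s[:2]) / np.sqrt(n - 1)        # one vertex per part
    explained = (s[:2] ** 2).sum() / (s ** 2).sum()   # quality of the display
    return rows, cols, explained

# Toy three-part compositions (a rank-2 display is exact for three parts).
rng = np.random.default_rng(1)
comp = rng.uniform(0.1, 1.0, size=(22, 3))
rows, cols, explained = relative_variation_biplot(comp)
```

    With this scaling, the squared distance between the vertices of two parts approximates the variance of the log-ratio of those parts, which is how the biplot supports "the study of ratios" mentioned in the summary.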


    A three-year follow-up study of the psychosocial predictors of delayed and unresolved post-traumatic stress disorder in Taiwan Chi-Chi earthquake survivors

    PSYCHIATRY AND CLINICAL NEUROSCIENCES, Issue 3 2010
    Chao-Yueh Su MS
    Aims: To predict the longitudinal course of post-traumatic stress disorder (PTSD) in survivors three years following a catastrophic earthquake using multivariate data presented six months after the earthquake. Methods: Trained assistants and psychiatrists used the Disaster-related Psychological Screening Test (DRPST) to interview earthquake survivors 16 years and older and to assess current and incidental psychopathology. A total of 1756 respondents were surveyed over the three-year follow-up period. Results: A total of 38 (9.1%) of the original 418 PTSD subjects and 40 (3.0%) of the original 1338 non-PTSD subjects were identified as having PTSD at the three-year post-earthquake follow-up. Younger age, significant financial loss, and memory/attention impairment were predictive factors of unresolved PTSD and delayed PTSD. Conclusions: The longitudinal course of PTSD three years after the earthquake could be predicted as early as six months after the earthquake on the basis of demographic data, PTSD-related factors, and putative factors for PTSD.

