Component Regression (component + regression)


Kinds of Component Regression

  • principal component regression


  • Selected Abstracts


    DETERMINATION OF FIRMNESS AND SUGAR CONTENT OF APPLES USING NEAR-INFRARED DIFFUSE REFLECTANCE

    JOURNAL OF TEXTURE STUDIES, Issue 6 2000
    RENFU LU
    The objective of this research was to study the potential of near-infrared (NIR) diffuse reflectance between 800 nm and 1700 nm for determining the firmness and sugar content of apples and to ascertain the effects of apple peel and variety on the NIR prediction of these two quality attributes. The spectral reflectance data were acquired from both peeled and unpeeled 'Empire', 'Golden Delicious', and 'Red Delicious' apples. Statistical models were developed using principal component analysis/regression. Relatively low correlations of prediction (r=0.38 to 0.58) were obtained between NIR measurement and Magness-Taylor firmness for both unpeeled and peeled fruit, with standard errors of prediction (SEP) between 6.6 N and 10.1 N. Improved predictions were obtained when NIR reflectance was correlated with the slope of the Magness-Taylor force-deformation curves. Excellent predictions of the sugar content in peeled apples were obtained (r=0.93 to 0.97; SEP=0.37 to 0.42 °Brix). The SEP, on average, increased by about 0.17 °Brix for the unpeeled apples. Variety did not have a large effect on model performance for sugar content predictions. [source]
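The principal component analysis/regression modelling described above follows the standard PCR recipe: compress the spectra to a few principal components, then regress the quality attribute on the scores. The sketch below illustrates that recipe with scikit-learn; the `spectra` and `brix` arrays are random placeholders standing in for the NIR reflectance data and sugar-content reference values, and the component count and fold count are arbitrary choices.

```python
# Minimal PCR sketch: PCA compression of NIR spectra followed by linear
# regression on the scores, evaluated by cross-validation.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
spectra = rng.normal(size=(60, 900))    # placeholder spectra on an 800-1700 nm grid
brix = rng.normal(12.0, 1.0, size=60)   # placeholder sugar content (°Brix)

pcr = make_pipeline(PCA(n_components=8), LinearRegression())
pred = cross_val_predict(pcr, spectra, brix, cv=10)

r = np.corrcoef(brix, pred)[0, 1]
rmsecv = np.sqrt(np.mean((brix - pred) ** 2))   # cross-validated error, akin to SEP
print(f"r = {r:.2f}, RMSECV = {rmsecv:.2f} °Brix")
```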


    Incorporating affective customer needs for luxuriousness into product design attributes

    HUMAN FACTORS AND ERGONOMICS IN MANUFACTURING & SERVICE INDUSTRIES, Issue 2 2009
    Sangwoo Bahn
    In a highly competitive market, customers' affection for a product is a critical factor in its success. However, customers' affective needs are difficult to grasp, and product design practitioners often misunderstand what customers really want. In this study we report our experience in developing and using an affective design framework that identifies the critical affective features customers associate with products and systematically incorporates them into product design attributes. To identify key affective features such as luxuriousness, we utilized the Kansei engineering methodology. This approach consists of three steps: (1) selecting related affective features and product design attributes through a comprehensive literature survey, expert panel opinion, and focus group interviews; (2) conducting evaluation experiments; and (3) developing Kansei models using multivariate statistical analysis and analyzing critical product design attributes. To demonstrate the applicability of the proposed affective design framework, 30 customers and 30 product design practitioners participated in an evaluation experiment for car crash pads, and 44 customers and 20 designers participated in an evaluation experiment for two interior room products (wallpapers and flooring materials). The evaluation experiments were conducted via systematically developed questionnaires consisting of a 7-point semantic differential scale and a 100-point magnitude estimation scale. The results of the experiments were analyzed using principal component regression and the quantification theory type I method. Using the analyzed survey data, the relationships between luxuriousness, related affective features, and product design attributes were identified. These relationships indicated that there was a significant difference in the perception of luxuriousness between customers and designers. Consequently, it is expected that the results of this study could provide a foundation for developing affective products. © 2009 Wiley Periodicals, Inc. [source]


    Impartial graphical comparison of multivariate calibration methods and the harmony/parsimony tradeoff

    JOURNAL OF CHEMOMETRICS, Issue 11-12 2006
    Forrest Stout
    Abstract For multivariate calibration with the relationship y = Xb, it is often necessary to determine the degrees of freedom for parsimony consideration and for the error measure root mean square error of calibration (RMSEC). This paper shows that the model-fitting degrees of freedom can be estimated by an effective rank (ER) measure, and that the more parsimonious model has the smallest ER. This paper also shows that when such a measure is used on the X-axis, simultaneous graphing of model errors and other regression diagnostics is possible for ridge regression (RR), partial least squares (PLS) and principal component regression (PCR), and thus a fair comparison between all potential models can be accomplished. The ER approach is general and applicable to other multivariate calibration methods. It is often noted that more parsimonious models are obtained by selecting variables, typically by multiple linear regression (MLR). By using the ER, the more parsimonious model is graphically shown to not always be the MLR model. Additionally, a harmony measure is proposed that expresses the bias/variance tradeoff for a particular model. By plotting this new measure against the ER, the proper harmony/parsimony tradeoff can be graphically assessed for RR, PCR and PLS. Essentially, pluralistic criteria for fairly evaluating and characterizing models are better than the usual dualistic or single-criterion approach. Results are presented using spectral, industrial and quantitative structure-activity relationship (QSAR) data. Copyright © 2007 John Wiley & Sons, Ltd. [source]
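As a rough illustration of the kind of comparison the abstract describes, the sketch below tabulates a fit error (RMSEC), the regression-vector 2-norm (a variance indicator) and a model-complexity measure for RR, PCR and PLS on one synthetic data set. The complexity proxy used here (trace of the ridge hat matrix, and simply the component count for PCR/PLS) is an assumption for illustration only; it is not the paper's effective rank (ER) measure.

```python
# Compare RR, PCR and PLS on one data set: fit error, ||b||2 and a rough
# effective-degrees-of-freedom proxy per model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 200))
y = X[:, :5].sum(axis=1) + rng.normal(scale=0.1, size=50)
Xc, yc = X - X.mean(axis=0), y - y.mean()          # mean-center

def rmsec(pred):
    return np.sqrt(np.mean((yc - pred) ** 2))

rows = []
for lam in (0.1, 1.0, 10.0):                        # ridge regression (RR)
    rr = Ridge(alpha=lam, fit_intercept=False).fit(Xc, yc)
    # effective degrees of freedom = trace of the ridge hat matrix (a proxy)
    edf = np.trace(Xc @ np.linalg.solve(Xc.T @ Xc + lam * np.eye(Xc.shape[1]), Xc.T))
    rows.append(("RR", lam, edf, rmsec(rr.predict(Xc)), np.linalg.norm(rr.coef_)))
for k in (2, 5, 10):                                # PCR with k components
    pca = PCA(n_components=k).fit(Xc)
    lr = LinearRegression(fit_intercept=False).fit(pca.transform(Xc), yc)
    b = pca.components_.T @ lr.coef_
    rows.append(("PCR", k, float(k), rmsec(Xc @ b), np.linalg.norm(b)))
for k in (2, 5, 10):                                # PLS with k latent variables
    pls = PLSRegression(n_components=k, scale=False).fit(Xc, yc)
    rows.append(("PLS", k, float(k),
                 rmsec(pls.predict(Xc).ravel()), np.linalg.norm(pls.coef_.ravel())))

for name, par, edf, err, bnorm in rows:
    print(f"{name:3s} par={par:<5} edf~{edf:6.1f} RMSEC={err:.3f} ||b||2={bnorm:.3f}")
```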


    Tikhonov regularization in standardized and general form for multivariate calibration with application towards removing unwanted spectral artifacts

    JOURNAL OF CHEMOMETRICS, Issue 1-2 2006
    Forrest Stout
    Abstract Tikhonov regularization (TR) is an approach to forming a multivariate calibration model for y = Xb. It includes a regularization operator matrix L that is usually set to the identity matrix I, and in this situation TR is said to operate in standard form and is the same as ridge regression (RR). Alternatively, TR can function in general form with L ≠ I, where L is used to remove unwanted spectral artifacts. To simplify the computations for TR in general form, a standardization process can be used on X and y to transform the problem into TR in standard form, so that an RR algorithm can be used. The regression vector calculated in standardized space must then be back-transformed to the general form, which can be applied to spectra that have not been standardized. The calibration model building methods of principal component regression (PCR), partial least squares (PLS) and others can also be implemented with the standardized X and y. Regardless of the calibration method, armed with y, X and L, a regression vector is sought that can correct for irrelevant spectral variation in predicting y. In this study, L is set to various derivative operators to obtain smoothed TR, PCR and PLS regression vectors in order to generate models robust to noise and/or temperature effects. Results of this smoothing process are examined for spectral data without excessive noise or other artifacts, spectral data with additional noise added, and spectral data exhibiting temperature-induced peak shifts. When the noise level is small, derivative operator smoothing was found to slightly degrade the root mean square error of validation (RMSEV) as well as the prediction variance indicator represented by the regression vector 2-norm, thereby deteriorating the model harmony (bias/variance tradeoff). The effective rank (ER) (parsimony) was found to decrease with smoothing, and in doing so a harmony/parsimony tradeoff is formed. For the temperature-affected data and some of the noisy data, derivative operator smoothing decreases the RMSEV, but at a cost of greater values for the regression vector 2-norm. The ER was found to increase and hence the parsimony degraded. A simulated data set from a previous study that used TR in general form was reexamined. In the present study, the standardization process is used with L set to the spectral noise structure to eliminate undesirable spectral regions (wavelength selection), and TR, PCR and PLS are evaluated. There was a significant decrease in bias at a sacrifice to variance with wavelength selection, and the parsimony essentially remains the same. This paper includes discussion on the utility of using TR to remove other undesired spectral patterns resulting from chemical, environmental and/or instrumental influences. The discussion also incorporates using TR as a method for calibration transfer. Copyright © 2006 John Wiley & Sons, Ltd. [source]
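A minimal way to see TR in general form is to solve the penalized least-squares problem min ||Xb − y||² + λ²||Lb||² directly through an augmented system, as sketched below; setting L = I recovers ridge regression (standard form), while a second-derivative L favours smooth regression vectors, in the spirit of the smoothing described above. The data arrays and λ value are placeholders, and the paper's standardization route is not reproduced here.

```python
# Tikhonov regularization in general form via an augmented least-squares system.
import numpy as np

def tikhonov(X, y, L, lam):
    """Solve min ||Xb - y||^2 + lam^2 ||Lb||^2 by stacking [X; lam*L]."""
    A = np.vstack([X, lam * L])
    rhs = np.concatenate([y, np.zeros(L.shape[0])])
    b, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return b

def second_derivative_operator(p):
    """(p-2) x p finite-difference second-derivative operator."""
    L = np.zeros((p - 2, p))
    for i in range(p - 2):
        L[i, i:i + 3] = [1.0, -2.0, 1.0]
    return L

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 120))                        # placeholder spectra (rows = samples)
y = X[:, 30:35].mean(axis=1) + rng.normal(scale=0.05, size=40)

L = second_derivative_operator(X.shape[1])
b_smooth = tikhonov(X, y, L, lam=5.0)                 # general form: smoothed b
b_ridge = tikhonov(X, y, np.eye(X.shape[1]), lam=5.0) # standard form: ridge regression
print("roughness ||Lb|| smoothed vs ridge:",
      np.linalg.norm(L @ b_smooth), np.linalg.norm(L @ b_ridge))
```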


    Non-parametric statistical methods for multivariate calibration model selection and comparison

    JOURNAL OF CHEMOMETRICS, Issue 12 2003
    Edward V. Thomas
    Abstract Model selection is an important issue when constructing multivariate calibration models using methods based on latent variables (e.g. partial least squares regression and principal component regression). It is important to select an appropriate number of latent variables to build an accurate and precise calibration model. Inclusion of too few latent variables can result in a model that is inaccurate over the complete space of interest. Inclusion of too many latent variables can result in a model that produces noisy predictions through incorporation of low-order latent variables that have little or no predictive value. Commonly used metrics for selecting the number of latent variables are based on the predicted error sum of squares (PRESS) obtained via cross-validation. In this paper a new approach for selecting the number of latent variables is proposed. In this new approach the prediction errors of individual observations (obtained from cross-validation) are compared across models incorporating varying numbers of latent variables. Based on these comparisons, non-parametric statistical methods are used to select the simplest model (least number of latent variables) that provides prediction quality that is indistinguishable from that provided by more complex models. Unlike methods based on PRESS, this new approach is robust to the effects of anomalous observations. More generally, the same approach can be used to compare the performance of any models that are applied to the same data set where reference values are available. The proposed methodology is illustrated with an industrial example involving the prediction of gasoline octane numbers from near-infrared spectra. Published in 2004 by John Wiley & Sons, Ltd. [source]
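The selection idea can be sketched as follows: collect per-sample cross-validated errors for each candidate number of latent variables, find the lowest-PRESS model, then walk from the simplest model upward and keep the first one whose absolute errors are not distinguishably worse by a Wilcoxon signed-rank test. PLS, the 0.05 threshold and the synthetic data are illustrative assumptions, not details taken from the paper.

```python
# Non-parametric selection of the number of latent variables from per-sample
# cross-validated errors.
import numpy as np
from scipy.stats import wilcoxon
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
X = rng.normal(size=(80, 150))
y = X[:, :4] @ np.array([1.0, -0.5, 0.3, 0.8]) + rng.normal(scale=0.2, size=80)

max_lv = 10
abs_err = {}
for k in range(1, max_lv + 1):
    pred = cross_val_predict(PLSRegression(n_components=k), X, y, cv=10).ravel()
    abs_err[k] = np.abs(y - pred)

press = {k: np.sum(e ** 2) for k, e in abs_err.items()}
k_best = min(press, key=press.get)        # lowest-PRESS (most complex competitor)

chosen = k_best
for k in range(1, k_best):                # try the simplest models first
    stat, p = wilcoxon(abs_err[k], abs_err[k_best])
    if p > 0.05:                          # not distinguishably worse than k_best
        chosen = k
        break
print(f"lowest-PRESS model: {k_best} LVs; selected model: {chosen} LVs")
```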


    A robust PCR method for high-dimensional regressors

    JOURNAL OF CHEMOMETRICS, Issue 8-9 2003
    Mia Hubert
    Abstract We consider the multivariate calibration model which assumes that the concentrations of several constituents of a sample are linearly related to its spectrum. Principal component regression (PCR) is widely used for the estimation of the regression parameters in this model. In the classical approach it combines principal component analysis (PCA) on the regressors with least squares regression. However, both stages yield very unreliable results when the data set contains outlying observations. We present a robust PCR (RPCR) method which also consists of two parts. First we apply a robust PCA method for high-dimensional data on the regressors, then we regress the response variables on the scores using a robust regression method. A robust RMSECV value and a robust R2 value are proposed as exploratory tools to select the number of principal components. The prediction error is also estimated in a robust way. Moreover, we introduce several diagnostic plots which are helpful to visualize and classify the outliers. The robustness of RPCR is demonstrated through simulations and the analysis of a real data set. Copyright © 2003 John Wiley & Sons, Ltd. [source]
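The two-stage structure of RPCR (robust dimension reduction followed by robust regression on the scores) can be mimicked with standard tools, as in the sketch below. The paper's ROBPCA and robust regression steps are not available in scikit-learn; ordinary PCA plus a Huber-type robust regressor is used here purely as a stand-in to show the shape of the computation, with synthetic outliers added for illustration.

```python
# Two-stage "robust PCR"-style sketch: dimension reduction, then a robust
# regression on the scores so outlying samples get limited influence.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import HuberRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 300))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.1, size=60)
X[:5] += 8.0            # a few outlying spectra
y[:5] -= 10.0           # and outlying responses

pca = PCA(n_components=5).fit(X)
scores = pca.transform(X)
robust_fit = HuberRegressor().fit(scores, y)

resid = y - robust_fit.predict(scores)
print("samples with the largest |residuals| (candidate outliers):",
      np.argsort(-np.abs(resid))[:5])
```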


    Fast principal component analysis of large data sets based on information extraction

    JOURNAL OF CHEMOMETRICS, Issue 11 2002
    F. Vogt
    Abstract Principal component analysis (PCA) and principal component regression (PCR) are routinely used for calibration of measurement devices and for data evaluation. However, their use is hindered in some applications, e.g. hyperspectral imaging, by excessive data sets that imply unacceptable calculation time. This paper discusses a fast PCA achieved by a combination of data compression based on a wavelet transformation and a spectrum selection method prior to the PCA itself. The spectrum selection step can also be applied without previous data compression. The calculation speed increase is investigated based on original and compressed data sets, both simulated and measured. Two different data sets are used for assessment of the new approach. One set contains 65,536 synthetically generated spectra at four different noise levels with 256 measurement points each. Compared with the conventional PCA approach, these examples can be accelerated 20 times. Evaluation errors of the fast method were calculated and found to be comparable with those of the conventional approach. Four experimental spectra sets of similar size are also investigated. The novel method outperforms PCA in speed by factors of up to 12, depending on the data set. The principal components obtained by the novel algorithm show the same ability to model the measured spectra as the conventional time-consuming method. The acceleration factors also depend on the possible compression; in particular, if only a small compression is feasible, the acceleration lies purely with the novel spectrum selection step proposed in this paper. Copyright © 2002 John Wiley & Sons, Ltd. [source]
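The compression step can be sketched by replacing each spectrum with its wavelet approximation coefficients before the PCA, so the decomposition runs on far fewer variables. The wavelet, decomposition level and data below are arbitrary illustrative choices (using PyWavelets), and the paper's spectrum-selection step is not reproduced.

```python
# Fast PCA sketch: wavelet-compress each spectrum, then run PCA on the
# much smaller coefficient matrix.
import numpy as np
import pywt
from sklearn.decomposition import PCA

rng = np.random.default_rng(5)
spectra = rng.normal(size=(10000, 256))            # placeholder large data set

level = 3                                          # arbitrary decomposition depth
compressed = np.array([pywt.wavedec(s, "db4", level=level)[0] for s in spectra])
print("variables per spectrum:", spectra.shape[1], "->", compressed.shape[1])

pca_fast = PCA(n_components=4).fit(compressed)     # PCA on compressed data
pca_full = PCA(n_components=4).fit(spectra)        # conventional PCA, for reference
print("explained variance (compressed):", pca_fast.explained_variance_ratio_.round(3))
```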


    Multivariate calibration stability: a comparison of methods

    JOURNAL OF CHEMOMETRICS, Issue 3 2002
    Brian D. Marx
    Abstract In the multivariate calibration framework we revisit and investigate the prediction performance of three high-dimensional modeling strategies: partial least squares, principal component regression and P-spline signal regression. Specifically we are interested in comparing the stability and robustness of prediction under differing conditions, e.g. training the model under one temperature and using it to predict under differing temperatures. An example illustrates stability comparisons. Copyright © 2002 John Wiley & Sons, Ltd. [source]


    Improved calculation of the net analyte signal in inverse multivariate calibration

    JOURNAL OF CHEMOMETRICS, Issue 6 2001
    Joan Ferré
    Abstract The net analyte signal (NAS) is the part of the measured signal that a calibration model relates to the property of interest (e.g. analyte concentration). Accurate values of the NAS are required in multivariate calibration to calculate analytical figures of merit such as sensitivity, selectivity, signal-to-noise ratio and limit of detection. This paper presents an improved version of the calculation method for the NAS in inverse models proposed by Lorber et al. (Anal. Chem. 1997; 69: 1620). Model coefficients and predictions calculated with the improved NAS are the same as those from the common equations of principal component regression (PCR) and partial least squares (PLS) regression. The necessary alterations to the calculations of sensitivity, selectivity and the pseudounivariate presentation of the model are also provided. Copyright © 2001 John Wiley & Sons, Ltd. [source]
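For orientation, the sketch below computes NAS-based figures of merit for an inverse calibration model using the commonly cited Lorber-style definitions: the NAS vector of a sample is its projection onto the regression-vector direction, sensitivity is 1/||b|| and selectivity is ||nas||/||x||. This is a generic formulation, not the improved calculation proposed in the paper, and the vectors below are placeholders.

```python
# Generic NAS figures of merit for an inverse calibration model y = x'b.
import numpy as np

def nas_figures(b, x):
    b = b.ravel()
    nas = (b @ x) / (b @ b) * b              # projection of x onto the direction of b
    sensitivity = 1.0 / np.linalg.norm(b)    # smaller ||b|| -> higher sensitivity
    selectivity = np.linalg.norm(nas) / np.linalg.norm(x)
    y_hat = b @ x                            # prediction; |y_hat| = ||b|| * ||nas||
    return nas, sensitivity, selectivity, y_hat

rng = np.random.default_rng(6)
b = rng.normal(size=100)                     # regression vector from PCR/PLS (placeholder)
x = rng.normal(size=100)                     # one measured spectrum (placeholder)
nas, sen, sel, y_hat = nas_figures(b, x)
print(f"sensitivity={sen:.3f}  selectivity={sel:.3f}  prediction={y_hat:.3f}")
```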


    Simultaneous Spectrophotometric Determination of 2-Thiouracil and 2-Mercaptobenzimidazole in Animal Tissue Using Multivariate Calibration Methods: Concerns and Rapid Methods for Detection

    JOURNAL OF FOOD SCIENCE, Issue 2 2010
    Abolghasem Beheshti
    ABSTRACT: Two multivariate calibration methods, partial least squares (PLS) and principal component regression (PCR), were applied to the spectrophotometric simultaneous determination of 2-mercaptobenzimidazole (MB) and 2-thiouracil (TU). A genetic algorithm (GA) using partial least squares was successfully utilized as a variable selection method. The concentration model was based on the absorption spectra in the range of 200 to 350 nm for 25 different mixtures of MB and TU. The calibration curve was linear across the concentration range of 1 to 10 µg mL⁻¹ and 1.5 to 15 µg mL⁻¹ for MB and TU, respectively. The values of the root mean square error of prediction (RMSEP) were 0.3984, 0.1066, and 0.0713 for MB and 0.2010, 0.1667, and 0.1115 for TU, obtained using PCR, PLS, and GA-PLS, respectively. Finally, the practical applicability of the GA-PLS method was effectively evaluated by the concurrent detection of both analytes in animal tissues. The proposed method is simple and rapid, requires no preliminary separation steps, and can easily be used for the analysis of these compounds, especially in quality control laboratories. [source]


    Rapid Analysis of Glucose, Fructose, Sucrose, and Maltose in Honeys from Different Geographic Regions using Fourier Transform Infrared Spectroscopy and Multivariate Analysis

    JOURNAL OF FOOD SCIENCE, Issue 2 2010
    Jun Wang
    ABSTRACT: Quantitative analysis of glucose, fructose, sucrose, and maltose in honey samples of different geographic origins, using Fourier transform infrared (FTIR) spectroscopy and chemometric methods such as partial least squares (PLS) and principal component regression, was studied. The calibration series consisted of 45 standard mixtures made up of glucose, fructose, sucrose, and maltose. There were distinct peak variations for all sugar mixtures in the spectral "fingerprint" region between 1500 and 800 cm⁻¹. The calibration model was successfully validated using 7 synthetic blend sets of sugars. The PLS second-derivative model showed the highest degree of prediction accuracy, with the highest R2 value of 0.999. Along with canonical variate analysis, further validation of the calibration model by high-performance liquid chromatography measurements of commercial honey samples demonstrates that FTIR can qualitatively and quantitatively determine the presence of glucose, fructose, sucrose, and maltose in honey samples from multiple regions. [source]
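The combination mentioned above (a second-derivative pretreatment followed by PLS on several sugars at once) can be sketched with a Savitzky-Golay derivative and a multi-response PLS model, as below. The spectra and concentration arrays are random placeholders, and the window length, polynomial order and number of latent variables are arbitrary choices.

```python
# Second-derivative preprocessing + multi-response PLS for sugar quantification.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(7)
spectra = rng.normal(size=(45, 700))              # placeholder 1500-800 cm-1 region
sugars = rng.uniform(0, 40, size=(45, 4))         # glucose, fructose, sucrose, maltose

# Savitzky-Golay second derivative along the wavenumber axis
d2 = savgol_filter(spectra, window_length=11, polyorder=3, deriv=2, axis=1)

pls = PLSRegression(n_components=6)
pred = cross_val_predict(pls, d2, sugars, cv=5)

for j, name in enumerate(["glucose", "fructose", "sucrose", "maltose"]):
    r2 = np.corrcoef(sugars[:, j], pred[:, j])[0, 1] ** 2   # squared correlation
    print(f"{name:9s} squared correlation = {r2:.3f}")
```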


    An improved independent component regression modeling and quantitative calibration procedure

    AICHE JOURNAL, Issue 6 2010
    Chunhui Zhao
    Abstract An improved independent component regression (M-ICR) algorithm is proposed by constructing joint latent variable (LV) based regressors, and a quantitative statistical analysis procedure is designed using a bootstrap technique for model validation and performance evaluation. First, the drawbacks of the conventional regression modeling algorithms are analyzed. Then the proposed M-ICR algorithm is formulated for regressor design. It constructs a dual-objective optimization criterion function, simultaneously incorporating quality-relevance and independence into the feature extraction procedure. This ties together the ideas of partial least squares (PLS) and independent component regression (ICR) under the same mathematical umbrella. By adjusting the controllable suboptimization objective weights, it adds insight into the different roles of quality-relevant and independent characteristics in calibration modeling and thus provides possibilities to combine the advantages of PLS and ICR. Furthermore, a quantitative statistical analysis procedure based on a bootstrapping technique is designed to identify the effects of LVs, determine a better model rank and overcome ill-conditioning caused by model over-parameterization. A confidence interval on quality prediction is also approximated. The performance of the proposed method is demonstrated using both numerical and real-world data. © 2009 American Institute of Chemical Engineers AIChE J, 2010 [source]
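The bootstrap part of the procedure can be illustrated generically: refit the calibration on resampled calibration sets and examine how the validation error and its spread change with the number of latent variables, which gives a rough interval for judging model rank. PLS is used below as a stand-in for the M-ICR regressors, and all data are synthetic.

```python
# Bootstrap-based look at model rank: distribution of validation RMSE per LV count.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(8)
X = rng.normal(size=(70, 120))
y = X[:, :3].sum(axis=1) + rng.normal(scale=0.2, size=70)
X_cal, y_cal = X[:50], y[:50]
X_val, y_val = X[50:], y[50:]

n_boot = 200
for k in (1, 2, 3, 5, 8):
    rmse = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_cal), len(y_cal))     # bootstrap resample of rows
        model = PLSRegression(n_components=k).fit(X_cal[idx], y_cal[idx])
        err = y_val - model.predict(X_val).ravel()
        rmse.append(np.sqrt(np.mean(err ** 2)))
    lo, hi = np.percentile(rmse, [2.5, 97.5])             # rough 95% interval
    print(f"{k} LVs: RMSE {np.mean(rmse):.3f}  (95% interval {lo:.3f}-{hi:.3f})")
```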


    Influence of the season on the relationships between NMR transverse relaxation data and water-holding capacity of turkey breast meat

    JOURNAL OF THE SCIENCE OF FOOD AND AGRICULTURE, Issue 12 2004
    Maurizio Bianchi
    Abstract In the last few years the poultry industry has seen a significant deterioration in meat quality properties during the summer season. The objective of this study was to evaluate the seasonal effect (summer and winter) on turkey meat quality assessed by both conventional and low-resolution nuclear magnetic resonance (LR-NMR) analysis. Eighty-eight breast muscle samples (35 winter and 53 summer) from BUT-Big 6 turkeys belonging to 16 different flocks were randomly collected from a commercial processing plant. The samples were analysed for transverse relaxation times (T2) by LR-NMR and for initial pH (15 min post mortem), ultimate pH (24 h post mortem) and pH after cooking, temperature at 15 min post mortem, water-holding capacity (WHC: drip loss, filter paper press wetness and cooking loss) at 24 h post mortem, colour of raw and cooked meat, and chemical composition (moisture, lipids and proteins). The results indicate that, during the summer season, turkey breast meat undergoes a considerable decrease in WHC. Cluster analysis of the raw LR-NMR data revealed two groups corresponding to the samples harvested in the two seasons. Correlations between the LR-NMR signal and the conventional parameters measuring WHC were obtained by a recently proposed type of principal component regression (PCR) termed relative standard deviation PCR. Copyright © 2004 Society of Chemical Industry [source]


    Discrimination and classification of adulterants in maple syrup with the use of infrared spectroscopic techniques

    JOURNAL OF THE SCIENCE OF FOOD AND AGRICULTURE, Issue 5 2002
    M Paradkar
    Abstract Food adulteration is a profit-making business for some unscrupulous manufacturers. Maple syrup is a soft target for adulterators owing to the simplicity of its chemical composition. The use of infrared spectroscopic techniques such as Fourier transform infrared (FTIR) and near-infrared (NIR) spectroscopy as tools to detect adulterants such as cane and beet invert syrups as well as cane and beet sugar solutions in maple syrup was investigated. The FTIR spectra of adulterated samples were characterised, and the regions of 800–1200 cm⁻¹ (carbohydrates) and 1200–1800 and 2800–3200 cm⁻¹ (carbohydrates, carboxylic acids and amino acids) were used for detection. The NIR spectral region between 1100 and 1660 nm was used for analysis. Linear discriminant analysis (LDA) and canonical variate analysis (CVA) were used for discriminant analysis, while partial least squares (PLS) and principal component regression (PCR) were used for quantitative analysis. FTIR was more accurate in predicting adulteration using the two different regions (R2 > 0.93 and 0.98) compared with NIR (R2 > 0.93). Classification and quantification of adulterants in maple syrup show that both NIR and FTIR can be used for detecting adulterants such as pure beet and cane sugar solutions, but FTIR was superior to NIR in detecting invert syrups. © 2002 Society of Chemical Industry [source]


    Near Infrared Spectroscopy as a Tool for the Determination of Eumelanin in Human Hair

    PIGMENT CELL & MELANOMA RESEARCH, Issue 4 2004
    Marina Zoccola
    Eumelanins are brown-black pigments present in the hair and in the epidermis that are acknowledged as protection factors against cell damage caused by ultraviolet radiation. The quantity of eumelanin present in hair has recently been put forward as a means of identifying subjects with a higher risk of skin tumours. For epidemiological studies, chromatographic methods of determining pyrrole-2,3,5-tricarboxylic acid (PTCA; the principal marker of eumelanin) are long, laborious and unsuitable for screening large populations. We suggest near-infrared (NIR) spectroscopy as an alternative method of analysing eumelanin in hair samples. PTCA was determined on 93 samples of hair by oxidation with hydrogen peroxide in a basic environment followed by chromatographic separation. The same 93 samples were then subjected to NIR spectrophotometric analysis. The spectra were obtained in reflectance mode on hair samples which had not undergone any preliminary treatment, but had simply been pressed and placed on the measuring window of the spectrophotometer. The PTCA values obtained by HPLC were correlated with the near-infrared spectrum of the respective samples. A correlation between the PTCA values obtained by HPLC and the PTCA values obtained from an analysis of the spectra was established using the principal component regression (PCR) algorithm. The correlation obtained has a coefficient of determination (R2) of 0.89 and a standard error of prediction (SEP) of 13.8 for a mean value of 108.6 ng PTCA/mg hair. Considerations regarding the accuracy of the obtained correlation and the main sources of error are discussed, and some validation results are shown. [source]


    Dynamic Process Modelling using a PCA-based Output Integrated Recurrent Neural Network

    THE CANADIAN JOURNAL OF CHEMICAL ENGINEERING, Issue 4 2002
    Yu Qian
    Abstract A new methodology for modelling of dynamic process systems, the output integrated recurrent neural network (OIRNN), is presented in this paper. OIRNN can be regarded as a modified Jordan recurrent neural network, in which past values of the output variables over a certain number of steps are integrated with the input variables, and the original input variables are pre-processed using principal component analysis (PCA) for the purpose of dimension reduction. The main advantage of the PCA-based OIRNN is that the input dimension is reduced, so that the network can be used to model the dynamic behavior of multiple input multiple output (MIMO) systems effectively. The new method is illustrated with reference to the Tennessee-Eastman process and compared with principal component regression and feedforward neural networks. [source]
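The structure described above (PCA-compressed inputs augmented with past output values, feeding a neural network) can be sketched as follows. An MLP regressor is used as a simple stand-in for the Jordan-style recurrent network, and the lag order, network size and simulated dynamics are illustrative assumptions.

```python
# PCA-compressed inputs + lagged outputs as regressors for a neural network
# that models a dynamic process one step ahead.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(9)
n, n_inputs, lags = 500, 30, 3
U = rng.normal(size=(n, n_inputs))                 # process inputs
y = np.zeros(n)
for t in range(1, n):                              # a simple simulated dynamic response
    y[t] = 0.8 * y[t - 1] + 0.2 * U[t, :5].sum() + rng.normal(scale=0.05)

pca = PCA(n_components=5).fit(U)                   # dimension reduction of the inputs
scores = pca.transform(U)

# Regressors at time t: PCA scores of the inputs plus y(t-1), ..., y(t-lags)
feats = np.hstack([scores[lags:]] + [y[lags - i:-i, None] for i in range(1, lags + 1)])
target = y[lags:]

model = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
model.fit(feats[:400], target[:400])
print("one-step-ahead R^2 on held-out data:", model.score(feats[400:], target[400:]))
```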


    Functional Generalized Linear Models with Images as Predictors

    BIOMETRICS, Issue 1 2010
    Philip T. Reiss
    Summary Functional principal component regression (FPCR) is a promising new method for regressing scalar outcomes on functional predictors. In this article, we present a theoretical justification for the use of principal components in functional regression. FPCR is then extended in two directions: from linear to generalized linear modeling, and from univariate signal predictors to high-resolution image predictors. We show how to implement the method efficiently by adapting generalized additive model technology to the functional regression context. A technique is proposed for estimating simultaneous confidence bands for the coefficient function; in the neuroimaging setting, this yields a novel means to identify brain regions that are associated with a clinical outcome. A new application of likelihood ratio testing is described for assessing the null hypothesis of a constant coefficient function. The performance of the methodology is illustrated via simulations and real data analyses with positron emission tomography images as predictors. [source]


    Differential Kinetic Spectrophotometric Determination of Methamidophos and Fenitrothion in Water and Food Samples by Use of Chemometrics

    CHINESE JOURNAL OF CHEMISTRY, Issue 3 2010
    Na Deng
    Abstract A spectrophotometric method for simultaneous analysis of methamidophos and fenitrothion was proposed by applying chemometrics to spectral kinetic data. It is based upon the difference in the inhibitory effect of the two pesticides on acetylcholinesterase (AChE) and the use of 5,5′-dithiobis(2-nitrobenzoic acid) (DTNB) as a chromogenic reagent for the thiocholine iodide (TChI) released from the acetylthiocholine iodide (ATChI) substrate. The absorbance of the chromogenic product was measured at 412 nm. The different experimental conditions affecting the development and stability of the chromogenic product were carefully studied and optimized. Linear calibration graphs were obtained in the concentration ranges of 0.5–7.5 ng·mL⁻¹ and 5–75 ng·mL⁻¹ for methamidophos and fenitrothion, respectively. Synthetic mixtures of the two pesticides were analysed, and the data obtained were processed by chemometric methods such as partial least squares (PLS), principal component regression (PCR), back propagation-artificial neural network (BP-ANN), radial basis function-artificial neural network (RBF-ANN) and principal component-radial basis function-artificial neural network (PC-RBF-ANN). The results show that RBF-ANN gives the lowest prediction errors of the five chemometric methods. Following validation, the proposed method was applied to the determination of the pesticides in several commercial fruit and vegetable samples, and the standard addition method yielded satisfactory recoveries. [source]


    Application of PC-ANN to Acidity Constant Prediction of Various Phenols and Benzoic Acids in Water

    CHINESE JOURNAL OF CHEMISTRY, Issue 5 2008
    Aziz HABIBI-YANGJEH
    Abstract Principal component regression (PCR) and principal component-artificial neural network (PC-ANN) models were applied to the prediction of the acidity constant for various benzoic acids and phenols (242 compounds) in water at 25 °C. A large number of theoretical descriptors were calculated for each molecule. The first fifty principal components (PCs) were found to explain more than 95% of the variance in the original data matrix. From this pool of PCs, the eigenvalue ranking method was employed to select the best set of PCs for the PCR and PC-ANN models. The PC-ANN model with architecture 47-20-1 was generated using 47 principal components as inputs, with pKa as its output. For evaluation of the predictive power of the PCR and PC-ANN models, pKa values of 37 compounds in the prediction set were calculated. The mean percentage deviations (MPD) for the PCR and PC-ANN models are 18.45 and 0.6448, respectively. This improvement is due to the fact that the pKa values of the compounds show non-linear correlations with the principal components. Comparison of the results obtained by the models reveals the superiority of the PC-ANN model relative to the PCR model. [source]


    Performance of recalibration systems for GCM forecasts for southern Africa

    INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 12 2006
    Mxolisi E. Shongwe
    Abstract Two regression-based methods that recalibrate the ECHAM4.5 general circulation model (GCM) output during austral summer have been developed for southern Africa, and their performance assessed over a 12-year retroactive period, 1989/90–2000/01. A linear statistical model linking near-global sea-surface temperatures (SSTs) to regional rainfall has also been developed. The recalibration technique is model output statistics (MOS) using principal components regression (PCR) and canonical correlation analysis (CCA) to statistically link archived records of the GCM to regional rainfall over much of Africa south of the equator. The predictability of anomalously dry and wet conditions over each rainfall region during December–February (DJF) using the linear statistical model and the MOS models has been quantitatively evaluated. The MOS technique outperforms the raw-GCM ensembles and the linear statistical model. Neither the PCR-MOS nor the CCA-MOS model shows clear superiority over the other, probably because the two methods are closely related. The need to recalibrate GCM predictions at regional scales to improve their skill at smaller spatial scales is further demonstrated in this paper. Copyright © 2006 Royal Meteorological Society. [source]
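A PCR-flavoured MOS recalibration can be sketched generically: take principal components of the archived GCM fields and regress the observed regional rainfall on them, then verify over a held-out retroactive period. The arrays below are synthetic placeholders, the split is arbitrary, and the paper's CCA-based variant and skill scores are not reproduced.

```python
# Generic MOS recalibration sketch: PCs of GCM output -> observed regional rainfall.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(10)
n_years, n_gridpoints, n_regions = 40, 500, 6
gcm = rng.normal(size=(n_years, n_gridpoints))          # archived DJF GCM fields
obs = gcm[:, :10] @ rng.normal(size=(10, n_regions)) \
      + rng.normal(scale=0.5, size=(n_years, n_regions))  # observed regional rainfall

train, test = slice(0, 28), slice(28, 40)               # retroactive split
pca = PCA(n_components=8).fit(gcm[train])
mos = LinearRegression().fit(pca.transform(gcm[train]), obs[train])
pred = mos.predict(pca.transform(gcm[test]))

print("correlation per region over the retroactive period:",
      [round(np.corrcoef(obs[test][:, j], pred[:, j])[0, 1], 2) for j in range(n_regions)])
```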


    Desensitizing models using covariance matrix transforms or counter-balanced distortions

    JOURNAL OF CHEMOMETRICS, Issue 4 2005
    Rocco DiFoggio
    Abstract This paper presents a generalization of the Lagrange multiplier equation for a regression subject to constraints. It introduces two methods for desensitizing models to anticipated spectral artifacts such as baseline variations, wavelength shift, or trace contaminants. For models derived from a covariance matrix, such as multiple linear regression (MLR) and principal components regression (PCR) models, the first method shows how a covariance matrix can be desensitized to an artifact spectrum, v, by adding α²vᵀv to it. For models not derived from a covariance matrix, such as partial least squares (PLS) or neural network (NN) models, the second method shows how distorted copies of the original spectra can be prepared in a counter-balanced manner to achieve desensitization. Unlike earlier methods that added random distortions to spectra, these new methods never introduce any accidental correlations between the added distortions and the Y-block. The degree of desensitization is controlled by a parameter, α, for each artifact, ranging from zero (no desensitization) to infinity (complete desensitization, which is the Lagrange multiplier limit). Unlike Lagrange multipliers, these methods permit partial desensitization, so the degree of desensitization to each artifact can be varied individually, which is important when desensitization to one artifact inhibits desensitization to another. Copyright © 2005 John Wiley & Sons, Ltd. [source]
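Both desensitization routes can be sketched in a few lines, under the assumption that the covariance-matrix term α²vᵀv is the scaled outer product of the artifact spectrum (treated as a row vector). Route 1 adds that term to XᵀX before solving for a covariance-based model; route 2 appends counter-balanced copies x + αv and x − αv with unchanged y for methods such as PLS, so no accidental correlation with the Y-block is introduced. The data, the artifact shape and the α value are illustrative.

```python
# Two routes to desensitize a calibration model to an artifact spectrum v.
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(11)
X = rng.normal(size=(50, 80))
y = X[:, :4].sum(axis=1) + rng.normal(scale=0.05, size=50)
v = np.ones(80)                        # artifact spectrum, e.g. a baseline offset
alpha = 10.0                           # degree of desensitization

# Route 1: covariance-matrix desensitization (a small ridge term keeps C invertible)
C = X.T @ X + alpha ** 2 * np.outer(v, v) + 1e-3 * np.eye(80)
b = np.linalg.solve(C, X.T @ y)
print("|b . v| after covariance desensitization:", abs(b @ v))

# Route 2: counter-balanced distorted copies for spectra-based methods (PLS here)
X_aug = np.vstack([X, X + alpha * v, X - alpha * v])
y_aug = np.concatenate([y, y, y])      # y unchanged, so no correlation with the distortion
pls = PLSRegression(n_components=5).fit(X_aug, y_aug)
print("|b_PLS . v| after counter-balanced augmentation:", abs(pls.coef_.ravel() @ v))
```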