MLR Models (mlr + models)


Selected Abstracts


Potentialities of quantile regression to predict ozone concentrations

ENVIRONMETRICS, Issue 2 2009
S. I. V. Sousa
Abstract This paper aims: (i) to analyse the influence of ozone precursors (both meteorological variables and pollutant concentrations) on ozone concentrations at different ozone levels; and (ii) to predict next day hourly ozone concentrations using a new approach based on quantile regression (QR). The performance of this model was compared with multiple linear regressions (MLR) for the three following periods: daylight, night time and all day. QR has proven to be a useful mathematical tool to evidence the heterogeneity of ozone predictor influences at different ozone levels. Such heterogeneity is generally hidden when an ordinary least squares regression model is applied. The influence of previous concentrations of ozone and nitrogen monoxide on next day ozone concentrations was higher for lower quantiles. When QR was applied, the wind direction (WD) was found to be significant in the medium quantiles and the relative humidity (RH) in the higher quantiles. In contrast, in the MLR models neither variable was statistically significant. Moreover, QR allowed more efficient predictions of extreme values, which are very useful since the forecasting of higher concentrations is fundamental to developing strategies for protecting public health. Copyright © 2008 John Wiley & Sons, Ltd.
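The contrast between least squares and quantile regression drawn in this abstract comes down to the loss being minimised. The sketch below is not the authors' model: it fits only a constant (no predictors) to a made-up list of concentrations, and the names `pinball_loss`, `best_constant_quantile` and `ozone` are hypothetical. It shows that the check (pinball) loss recovers a chosen quantile, whereas the squared-error solution is the mean.

```python
def pinball_loss(y, pred, tau):
    """Mean check (pinball) loss of a constant prediction at quantile level tau."""
    total = 0.0
    for yi in y:
        residual = yi - pred
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * residual if residual >= 0 else (tau - 1.0) * residual
    return total / len(y)


def best_constant_quantile(y, tau):
    """Constant minimising the pinball loss; a minimiser always lies on a data point."""
    return min(sorted(y), key=lambda c: pinball_loss(y, c, tau))


def mean(y):
    """Constant minimising the squared error (the least-squares solution)."""
    return sum(y) / len(y)


# Hypothetical concentrations, invented for illustration only.
ozone = [20, 25, 30, 35, 40, 45, 50, 60, 80, 120, 180]

if __name__ == "__main__":
    for tau in (0.1, 0.5, 0.9):
        print(tau, best_constant_quantile(ozone, tau))
    print("mean", mean(ozone))
```

A full quantile regression replaces the constant with a linear function of the predictors and minimises the same loss, so the fitted coefficients can differ from one quantile level to the next.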


A self-adaptive genetic algorithm-artificial neural network algorithm with leave-one-out cross validation for descriptor selection in QSAR study

JOURNAL OF COMPUTATIONAL CHEMISTRY, Issue 10 2010
Jingheng Wu
Abstract Based on quantitative structure-activity relationship (QSAR) models developed by artificial neural networks (ANNs), a genetic algorithm (GA) was used for variable selection among molecular descriptors and also helped to improve the back-propagation training algorithm. Leave-one-out cross validation was used to investigate the validity of the generated ANN model and the preferable variable combinations derived by the GA. A self-adaptive GA-ANN model was successfully established by using a new estimate function to avoid over-fitting in ANN training. Compared with the variables selected in two recent QSAR studies that were based on stepwise multiple linear regression (MLR) models, the variables selected in the self-adaptive GA-ANN model are superior for constructing ANN models, as they yielded a higher cross validation (CV) coefficient (Q2) and a lower root mean square deviation both in the established model and in biological activity prediction. The validation methods introduced, including leave-multiple-out, Y-randomization, and external validation, demonstrated the superiority of the established GA-ANN models over MLR models in both stability and predictive power. The self-adaptive GA-ANN approach shows promise for improving QSAR models. © 2010 Wiley Periodicals, Inc. J Comput Chem, 2010
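The leave-one-out coefficient Q2 mentioned in this abstract can be illustrated on a one-descriptor linear model. This is a minimal sketch under assumptions, not the GA-ANN procedure of the paper: it uses one common convention, Q2 = 1 - PRESS / SS_tot with SS_tot taken about the mean of all observations, and the function names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(x, y):
    """Coefficient of determination of the model fitted on all points."""
    slope, intercept = fit_line(x, y)
    my = sum(y) / len(y)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - rss / ss_tot


def q2_loo(x, y):
    """Leave-one-out Q2: each point is predicted by a model fitted without it."""
    my = sum(y) / len(y)
    press = 0.0
    for i in range(len(x)):
        x_train, y_train = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities, invented for illustration.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

if __name__ == "__main__":
    print("R2 =", r_squared(descriptor, activity))
    print("Q2 =", q2_loo(descriptor, activity))
```

For a least-squares model Q2 never exceeds R2, because each left-out residual is at least as large in magnitude as the corresponding fitted residual. A large gap between the two is a usual warning sign of over-fitting.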


Are Mechanistic and Statistical QSAR Approaches Really Different? MLR Studies on 158 Cycloalkyl-Pyranones

MOLECULAR INFORMATICS, Issue 6-7 2010
Abstract Two parallel approaches for quantitative structure-activity relationships (QSAR) are predominant in the literature, one guided by mechanistic methods (including read-across) and another by the use of statistical methods. To bridge the gap between these two approaches and to verify their main differences, a comparative study of mechanistically relevant and statistically relevant QSAR models, developed on a case study of 158 cycloalkyl-pyranones, biologically active on inhibition (Ki) of HIV protease, was performed. Firstly, Multiple Linear Regression (MLR) based models were developed starting from a limited number of molecular descriptors which were widely proven to have mechanistic interpretation. Then robust and predictive MLR models were developed on the same set using two different statistical approaches unbiased with respect to input descriptors. Development of models based on the Statistical I method was guided by stepwise addition of descriptors, while Genetic Algorithm based selection of descriptors was used for Statistical II. Internal validation, the standard error of the estimate, and Fisher's significance test were performed for both statistical models. In addition, external validation was performed for the Statistical II model, and the Applicability Domain was verified as normally practiced in this approach. The relationships between the activity and the important descriptors selected in all the models were analyzed and compared. It is concluded that, despite the different type and number of input descriptors, and the applied descriptor selection tools or the algorithms used for developing the final model, the mechanistic and statistical approaches are comparable to each other in terms of quality and also in the mechanistic interpretability of the modelling descriptors. Agreement can be observed between these two approaches, and the better result could be a consensus prediction from both models.


Principles of QSAR models validation: internal and external

MOLECULAR INFORMATICS, Issue 5 2007
Paola Gramatica
Abstract The recent REACH Policy of the European Union has led scientists and regulators to focus their attention on establishing general validation principles for QSAR models in the context of chemical regulation (previously known as the Setubal principles and now as the OECD principles). This paper gives a brief analysis of some principles: unambiguous algorithm, Applicability Domain (AD), and statistical validation. Some concerns related to QSAR algorithm reproducibility and an example of a fast check of the applicability domain for MLR models are presented. Common myths and misconceptions related to popular techniques for verifying internal predictivity, particularly for MLR models (for instance cross-validation, bootstrap), are commented on and compared with commonly used statistical techniques for external validation. The differences in the two validating approaches are highlighted, and evidence is presented that only models that have been validated externally, after their internal validation, can be considered reliable and applicable for both external prediction and regulatory purposes.
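The abstract does not say which applicability-domain check it presents. One widely used check for MLR models is the leverage approach, and the sketch below assumes that technique purely for illustration. For a one-descriptor model with intercept the leverage has the closed form h = 1/n + (x - mean)^2 / Sxx, and a common warning threshold is h* = 3p'/n, where p' is the number of fitted parameters (here 2). All names and values are hypothetical.

```python
def leverage(x_query, x_train):
    """Leverage of a query point for a one-descriptor linear model with intercept."""
    n = len(x_train)
    mx = sum(x_train) / n
    sxx = sum((xi - mx) ** 2 for xi in x_train)
    return 1.0 / n + (x_query - mx) ** 2 / sxx


def warning_leverage(x_train, n_params=2):
    """Common rule-of-thumb threshold h* = 3 * p' / n (assumed convention)."""
    return 3.0 * n_params / len(x_train)


def in_domain(x_query, x_train):
    """True when the query leverage does not exceed the warning threshold."""
    return leverage(x_query, x_train) <= warning_leverage(x_train)


# Hypothetical training descriptor values, invented for illustration.
training = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

if __name__ == "__main__":
    for query in (6.0, 20.0):
        print(query, round(leverage(query, training), 3), in_domain(query, training))
```

A query compound near the centre of the training descriptor range has low leverage and is treated as interpolation. One far outside it exceeds h*, so its prediction would be an extrapolation and should be treated with caution.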


The role of procalcitonin in a decision tree for prediction of bloodstream infection in febrile patients

CLINICAL MICROBIOLOGY AND INFECTION, Issue 12 2006
R. P. H. Peters
Abstract Bloodstream infection (BSI) in febrile patients is associated with high mortality. Clinical and laboratory variables, such as procalcitonin (PCT), may predict BSI and help decision-making concerning empirical treatment. This study compared two models for prediction of BSI, and evaluated the role of PCT vs. clinical variables, collected daily in 300 consecutive febrile inpatients, for 48 h after onset of fever. Multiple logistic regression (MLR) and classification and regression tree (CART) models were compared for discriminatory power and diagnostic performance. BSI was present in 17% of cases. MLR identified the presence of intravascular devices, nadir albumin and thrombocyte counts, and peak temperature, respiratory rate and leukocyte counts, but not PCT, as independent predictors of BSI. In contrast, a peak PCT level of >2.45 ng/mL was the principal discriminator in the decision tree based on CART. The latter was more accurate (94%) than the model based on MLR (72%; p < 0.01). Hence, the presence of BSI in febrile patients is predicted more accurately and by different variables, e.g., PCT, in CART analysis, as compared with MLR models. This underlines the value of PCT plus CART analysis in the diagnosis of a febrile patient.
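A CART tree picks its principal discriminator by searching each variable for the cut-off that yields the purest child nodes. The sketch below shows that search for a single variable, using Gini impurity as the split criterion (the abstract does not state which criterion was used). The marker values and labels are toy numbers invented for illustration, not study data, and the function names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a node holding binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single cut-off."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut-off can separate identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [lab for val, lab in pairs if val <= threshold]
        right = [lab for val, lab in pairs if val > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Toy marker levels and outcomes (1 = positive), invented for illustration.
marker = [0.1, 0.3, 0.5, 0.8, 1.2, 2.0, 3.0, 4.5, 6.0, 9.0]
outcome = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    print(best_split(marker, outcome))
```

Unlike a logistic regression coefficient, which describes a smooth effect across the whole range of a variable, a split of this kind captures a threshold effect. This is one reason the two model types can rank the same variable differently.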