Multivariate Adaptive Regression Splines (multivariate + adaptive_regression_spline)
Selected Abstracts

Flexible and Robust Implementations of Multivariate Adaptive Regression Splines Within a Wastewater Treatment Stochastic Dynamic Program
QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 7 2005
Julia C. C. Tsai
Abstract: This paper presents an automatic and more robust implementation of multivariate adaptive regression splines (MARS) within the orthogonal array (OA)/MARS continuous-state stochastic dynamic programming (SDP) method. MARS is used to estimate the future value functions in each SDP level. The default stopping rule of MARS employs the maximum number of basis functions Mmax, specified by the user. To reduce the computational effort and improve the MARS fit for the wastewater treatment SDP model, two automatic stopping rules, which automatically determine an appropriate value for Mmax, and a robust version of MARS that prefers lower-order terms over higher-order terms are developed. Computational results demonstrate the success of these approaches. Copyright © 2005 John Wiley & Sons, Ltd.

A hierarchical Bayesian model for predicting the functional consequences of amino-acid polymorphisms
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 1 2005
Claudio J. Verzilli
Summary: Genetic polymorphisms in deoxyribonucleic acid coding regions may have a phenotypic effect on the carrier, e.g. by influencing susceptibility to disease. Detection of deleterious mutations via association studies is hampered by the large number of candidate sites; therefore methods are needed to narrow down the search to the most promising sites. For this, a possible approach is to use structural and sequence-based information of the encoded protein to predict whether a mutation at a particular site is likely to disrupt the functionality of the protein itself.
We propose a hierarchical Bayesian multivariate adaptive regression spline (BMARS) model for supervised learning in this context and assess its predictive performance by using data from mutagenesis experiments on lac repressor and lysozyme proteins. In these experiments, about 12 amino-acid substitutions were performed at each native amino-acid position and the effect on protein functionality was assessed. The training data thus consist of repeated observations at each position, which the hierarchical framework is needed to account for. The model is trained on the lac repressor data and tested on the lysozyme mutations and vice versa. In particular, we show that the hierarchical BMARS model, by allowing for the clustered nature of the data, yields lower out-of-sample misclassification rates compared with both a BMARS and a frequentist MARS model, a support vector machine classifier and an optimally pruned classification tree.

Predicting species distributions from museum and herbarium records using multiresponse models fitted with multivariate adaptive regression splines
DIVERSITY AND DISTRIBUTIONS, Issue 3 2007
Jane Elith
Abstract: Current circumstances, that the majority of species distribution records exist as presence-only data (e.g. from museums and herbaria), and that there is an established need for predictions of species distributions, mean that scientists and conservation managers seek to develop robust methods for using these data. Such methods must, in particular, accommodate the difficulties caused by lack of reliable information about sites where species are absent. Here we test two approaches for overcoming these difficulties, analysing a range of data sets using the technique of multivariate adaptive regression splines (MARS).
MARS is closely related to regression techniques such as generalized additive models (GAMs) that are commonly and successfully used in modelling species distributions, but has particular advantages in its analytical speed and the ease of transfer of analysis results to other computational environments such as a Geographic Information System. MARS also has the advantage that it can model multiple responses, meaning that it can combine information from a set of species to determine the dominant environmental drivers of variation in species composition. We use data from 226 species from six regions of the world, and demonstrate the use of MARS for distribution modelling using presence-only data. We test whether (1) the type of data used to represent absence or background and (2) the signal from multiple species affect predictive performance, by evaluating predictions at completely independent sites where genuine presence-absence data were recorded. Models developed with absences inferred from the total set of presence-only sites for a biological group, and using simultaneous analysis of multiple species to inform the choice of predictor variables, performed better than models in which species were analysed singly, or in which pseudo-absences were drawn randomly from the study area. The methods are fast, relatively simple to understand, and useful for situations where data are limited. A tutorial is included.

Nonlinear multiple regression methods: a survey and extensions
INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 1 2010
Kenneth O. Cogger
Abstract: This paper reviews some nonlinear statistical procedures useful in function approximation, classification, regression and time-series analysis. Primary emphasis is on piecewise linear models such as multivariate adaptive regression splines, adaptive logic networks, hinging hyperplanes and their conceptual differences. Potential and actual applications of these methods are cited.
Software for implementation is discussed, and practical suggestions are given for improvement. Examples show the relative capabilities of the various methods, including their ability for universal approximation. Copyright © 2010 John Wiley & Sons, Ltd.

Nonlinear modelling of periodic threshold autoregressions using TSMARS
JOURNAL OF TIME SERIES ANALYSIS, Issue 4 2002
Peter A. W. Lewis
We present new methods for modelling nonlinear threshold-type autoregressive behaviour in periodically correlated time series. The methods are illustrated using a series of average monthly flows of the Fraser River in British Columbia. Commonly used nonlinearity tests of the river flow data in each month indicate nonlinear behaviour in certain months. The periodic nonlinear correlation structure is modelled nonparametrically using TSMARS, a time series version of Friedman's extended multivariate adaptive regression splines (MARS) algorithm, which allows for categorical predictor variables. We discuss two methods of using the computational algorithm in TSMARS for modelling and fitting periodically correlated data. The first method applies the algorithm to data from each period separately. The second method models data from all periods simultaneously by incorporating an additional predictor variable to distinguish different behaviour in different periods, and allows for coalescing of data from periods with similar behaviour. The models obtained using TSMARS provide better short-term forecasts for the Fraser River data than a corresponding linear periodic AR model.

Bayesian Adaptive Regression Splines for Hierarchical Data
BIOMETRICS, Issue 3 2007
Jamie L. Bigelow
Summary: This article considers methodology for hierarchical functional data analysis, motivated by studies of reproductive hormone profiles in the menstrual cycle. Current methods standardize the cycle lengths and ignore the timing of ovulation within the cycle, both of which are biologically informative. Methods are needed that avoid standardization, while flexibly incorporating information on covariates and the timing of reference events, such as ovulation and onset of menses. In addition, it is necessary to account for within-woman dependency when data are collected for multiple cycles. We propose an approach based on a hierarchical generalization of Bayesian multivariate adaptive regression splines. Our formulation allows for an unknown set of basis functions characterizing the population-averaged and woman-specific trajectories in relation to covariates. A reversible jump Markov chain Monte Carlo algorithm is developed for posterior computation.
Applying the methods to data from the North Carolina Early Pregnancy Study, we investigate differences in urinary progesterone profiles between conception and nonconception cycles.
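
Illustrative note: several of the abstracts above refer to MARS basis functions and to the default stopping rule, which caps the number of basis functions at a user-specified Mmax. The following is a minimal sketch of that idea, not the implementation used in any of the cited papers. It assumes additive hinge terms only (no interactions), no backward pruning, and that the intercept counts toward the cap; the names `hinge`, `forward_pass`, and `m_max` are hypothetical.

```python
import numpy as np


def hinge(x, knot, sign):
    """Piecewise linear hinge basis function max(0, sign * (x - knot))."""
    return np.maximum(0.0, sign * (x - knot))


def forward_pass(X, y, m_max):
    """Greedily add mirrored hinge pairs until m_max basis functions is reached.

    Assumptions for illustration: additive terms only, exhaustive knot search
    over interior data values, and the intercept counted as one basis function.
    """
    n, p = X.shape
    columns = [np.ones(n)]  # start from the constant basis function
    terms = []              # (variable index, knot) for each added pair
    while len(columns) + 2 <= m_max:
        best = None
        for j in range(p):
            for knot in np.unique(X[:, j])[1:-1]:  # interior candidate knots
                pair = [hinge(X[:, j], knot, +1), hinge(X[:, j], knot, -1)]
                trial = np.column_stack(columns + pair)
                coef, *_ = np.linalg.lstsq(trial, y, rcond=None)
                rss = float(np.sum((y - trial @ coef) ** 2))
                if best is None or rss < best[0]:
                    best = (rss, j, knot)
        if best is None:  # no candidate knots available
            break
        _, j, knot = best
        columns += [hinge(X[:, j], knot, +1), hinge(X[:, j], knot, -1)]
        terms.append((j, knot))
    B = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(B, y, rcond=None)
    return terms, coef, B


# Usage: y = |x| is exactly one mirrored hinge pair with its knot at zero.
X = np.arange(-10.0, 11.0).reshape(-1, 1)
y = np.abs(X[:, 0])
terms, coef, B = forward_pass(X, y, m_max=3)
```

With `m_max=3` the loop adds a single hinge pair and then stops, which is the behaviour of a fixed cap. An automatic stopping rule of the kind the first abstract mentions would replace the `while` condition with a data-driven criterion; the abstract does not specify that criterion, so none is sketched here.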