Bayesian Information Criterion (bayesian + information_criterion)

Selected Abstracts


Extreme College Drinking and Alcohol-Related Injury Risk

ALCOHOLISM, Issue 9 2009
Marlon P. Mundt
Background: Despite the enormous burden of alcohol-related injuries, the direct connection between college drinking and physical injury has not been well understood. The goal of this study was to assess the connection between alcohol consumption levels and college alcohol-related injury risk. Methods: A total of 12,900 college students seeking routine care in 5 college health clinics completed a general Health Screening Survey. Of these, 2,090 students exceeded at-risk alcohol use levels and participated in a face-to-face interview to determine eligibility for a brief alcohol intervention trial. The eligibility interview assessed past 28-day alcohol use and alcohol-related injuries in the past 6 months. Risk of alcohol-related injury was compared across daily drinking quantities and frequencies. Logistic regression analysis and the Bayesian Information Criterion were applied to compute the odds of alcohol-related injury based on daily drinking totals after adjusting for age, race, site, body weight, and sensation seeking. Results: Male college students in the study were 19% more likely (95% CI: 1.12, 1.26) to suffer an alcohol-related injury with each additional day of consuming 8 or more drinks. Injury risks among males increased marginally with each day of consuming 5 to 7 drinks (odds ratio = 1.03, 95% CI: 0.94, 1.13). Female participants were 10% more likely (95% CI: 1.04, 1.16) to suffer an alcohol-related injury with each additional day of drinking 5 or more drinks. Males (OR = 1.69, 95% CI: 1.14, 2.50) and females (OR = 1.81, 95% CI: 1.27, 2.57) with higher sensation-seeking scores were more likely to suffer alcohol-related injuries. Conclusions: College health clinics may want to focus limited alcohol injury prevention resources on students who frequently engage in extreme drinking, defined in this study as 8+M/5+F drinks per day, and who score high on sensation-seeking disposition.


Comparison of repeatability and multiple trait threshold models for litter size in sheep using observed and simulated data in Bayesian analyses

JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 4 2010
W. Mekkawy
Summary: Bayesian analyses were used to estimate genetic parameters on 5580 records of litter size in the first four parities from 1758 Mule ewes. To examine the appropriateness of fitting repeatability (RM) or multiple trait threshold models (MTM) to litter size of different parities, both models were used to estimate genetic parameters on the observed data and were thereafter compared in a simulation study. Posterior means of the heritabilities of litter size in different parities using an MTM ranged from 0.12 to 0.18 and were higher than the heritability based on the RM (0.08). Posterior means of the genetic correlations between litter sizes of different parities were positive and ranged from 0.24 to 0.71. Data sets were simulated based on the same pedigree structure and genetic parameters of the Mule ewe population obtained from both models. The simulation showed that the relative loss in accuracy and increase in mean squared error (MSE) were substantially higher when using the RM, given that the parameters estimated from the observed data using the opposite model are the true parameters. In contrast, the Bayesian information criterion (BIC) selected the RM as the most appropriate model given the data because of the substantial penalty for the higher number of parameters to be estimated in the MTM. In conclusion, when the relative change in accuracy and MSE is of main interest for estimation of breeding values of litter size of different parities, the MTM is recommended for the given population. When reduction in risk of using the wrong model is the main aim, the BIC suggests that the RM is the most appropriate model.
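
The parameter penalty mentioned in the abstract above can be illustrated with the standard definition BIC = -2 ln L + k ln n. The sketch below is only an illustration: the log-likelihoods, parameter counts, and sample size are hypothetical values chosen for the example, not figures from the study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: -2 * log-likelihood + (number of parameters) * ln(sample size)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Hypothetical numbers, for illustration only (not taken from the abstract).
n_obs = 5000
bic_small = bic(log_likelihood=-1000.0, n_params=3, n_obs=n_obs)   # fewer parameters
bic_large = bic(log_likelihood=-990.0, n_params=15, n_obs=n_obs)   # better fit, more parameters

# Lower BIC is preferred; here the extra 12 parameters cost more than the fit gains.
preferred = "small" if bic_small < bic_large else "large"
print(bic_small, bic_large, preferred)
```

In this made-up case the larger model has the higher likelihood, yet the ln(n) penalty per parameter outweighs the improvement, which is the kind of trade-off the abstract describes.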


Related-variables selection in temporal disaggregation

JOURNAL OF FORECASTING, Issue 4 2009
Kosei Fukuda
Abstract: Two related-variables selection methods for temporal disaggregation are proposed. In the first method, hypothesis tests for a common feature (cointegration or serial correlation) are first performed. If there is a common feature between the observed aggregated series and the related variables, the conventional Chow-Lin procedure is applied. In the second method, alternative Chow-Lin disaggregating models with and without related variables are first estimated and the corresponding values of the Bayesian information criterion (BIC) are stored. The model selected by BIC then determines whether related variables should be included in the Chow-Lin model. The efficacy of these methods is examined via simulations and empirical applications. Copyright © 2008 John Wiley & Sons, Ltd.


Predicting the regenerative capacity of conifer somatic embryogenic cultures by metabolomics

PLANT BIOTECHNOLOGY JOURNAL, Issue 9 2009
Andrew R. Robinson
Summary: Somatic embryogenesis in gymnosperms is an effective approach to clonally propagating germplasm. However, embryogenic cultures frequently lose regenerative capacity. The interactions between metabolic composition, physiological state, genotype and embryogenic capacity in Pinus taeda (loblolly pine) somatic embryogenic cultures were explored using metabolomics. A stepwise modelling procedure, using the Bayesian information criterion, generated a 47-metabolite predictive model that could explain culture productivity. The model performed extremely well in cross-validation, achieving a correlation coefficient of 0.98 between actual and predicted mature embryo production. The metabolic composition and structure of the model implied that variation in culture regenerative capacity was closely linked to the physiological transition of cultures from the proliferation phase to the maturation phase of development. The propensity of cultures to advance into this transition appears to relate to nutrient uptake and allocation in vivo, and to be associated with the tolerance and response of cultures to stress, during the proliferation phase.
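
As a rough illustration of stepwise modelling driven by BIC, the sketch below runs forward selection on an ordinary least-squares regression. The abstract does not specify the exact procedure used, so the forward-only search, the Gaussian BIC, the function names, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def gaussian_bic(y, X):
    """BIC of an OLS fit, up to an additive constant shared by all models."""
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + X.shape[1] * np.log(n)

def forward_stepwise_bic(X, y):
    """Add one predictor at a time while doing so lowers the BIC."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    selected = []
    best_bic = gaussian_bic(y, intercept)
    while len(selected) < p:
        candidates = [j for j in range(p) if j not in selected]
        scores = {
            j: gaussian_bic(y, np.hstack([intercept, X[:, selected + [j]]]))
            for j in candidates
        }
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_bic:
            break  # no candidate improves the criterion
        selected.append(j_best)
        best_bic = scores[j_best]
    return selected, best_bic

# Simulated data (hypothetical): only columns 0 and 3 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=200)

selected, final_bic = forward_stepwise_bic(X, y)
print(selected, final_bic)
```

A backward or bidirectional search would follow the same pattern, with removal steps scored by the same criterion.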


Selecting explanatory variables with the modified version of the Bayesian information criterion

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 6 2008
Małgorzata Bogdan
Abstract: We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is considerable evidence that, in this case, classical model selection criteria, such as the Akaike information criterion or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed the modified version of BIC (mBIC), which enables the incorporation of prior knowledge on the number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and its rank version, which is designed to deal with data that contain some outlying observations. Copyright © 2008 John Wiley & Sons, Ltd.


Model selection for generalized linear models with factor-augmented predictors

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 3 2009
Tomohiro Ando
Abstract: This paper considers generalized linear models in a data-rich environment in which a large number of potentially useful explanatory variables are available. In particular, it deals with the case in which the sample size and the number of explanatory variables are of similar sizes. We adopt the idea that the relevant information of explanatory variables concerning the dependent variable can be represented by a small number of common factors and investigate the issue of selecting the number of common factors while taking into account the effect of estimated regressors. We develop an information criterion under model mis-specification for both the distributional and structural assumptions and show that the proposed criterion is a natural extension of the Akaike information criterion (AIC). Simulations and empirical data analysis demonstrate that the proposed new criterion outperforms the AIC and the Bayesian information criterion. Copyright © 2009 John Wiley & Sons, Ltd.


UPPER BOUNDS ON THE MINIMUM COVERAGE PROBABILITY OF CONFIDENCE INTERVALS IN REGRESSION AFTER MODEL SELECTION

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2009
Paul Kabaila
Summary: We consider a linear regression model, with the parameter of interest a specified linear combination of the components of the regression parameter vector. We suppose that, as a first step, a data-based model selection (e.g. by preliminary hypothesis tests or minimizing the Akaike information criterion, AIC) is used to select a model. It is common statistical practice to then construct a confidence interval for the parameter of interest, based on the assumption that the selected model had been given to us a priori. This assumption is false, and it can lead to a confidence interval with poor coverage properties. We provide an easily computed finite-sample upper bound (calculated by repeated numerical evaluation of a double integral) to the minimum coverage probability of this confidence interval. This bound applies for model selection by any of the following methods: minimum AIC, minimum Bayesian information criterion (BIC), maximum adjusted R², minimum Mallows' Cp and t-tests. The importance of this upper bound is that it delineates general categories of design matrices and model selection procedures for which this confidence interval has poor coverage properties. This upper bound is shown to be a finite-sample analogue of an earlier large-sample upper bound due to Kabaila and Leeb.


PREDICTION-FOCUSED MODEL SELECTION FOR AUTOREGRESSIVE MODELS

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 4 2007
Gerda Claeskens
Summary: In order to make predictions of future values of a time series, one needs to specify a forecasting model. A popular choice is an autoregressive time-series model, for which the order of the model is chosen by an information criterion. We propose an extension of the focused information criterion (FIC) for model-order selection, with emphasis on a high predictive accuracy (i.e. the mean squared forecast error is low). We obtain theoretical results and illustrate by means of a simulation study and some real data examples that the FIC is a valid alternative to the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) for selection of a prediction model. We also illustrate the possibility of using the FIC for purposes other than forecasting, and explore its use in an extended model.
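
For context, the baseline that the FIC is compared against, choosing an autoregressive order by AIC or BIC, can be sketched as follows. This is a minimal illustration and not the FIC itself: the OLS fitting on a common effective sample, the Gaussian form of the criteria, the function names, and the simulated AR(2) series are assumptions of the example.

```python
import numpy as np

def ar_rss(y, p, max_p):
    """Fit AR(p) with intercept by OLS on the same effective sample for every p."""
    n_eff = len(y) - max_p
    target = y[max_p:]
    # Column for lag k holds y[t - k] aligned with target y[t].
    cols = [np.ones(n_eff)] + [y[max_p - lag: len(y) - lag] for lag in range(1, p + 1)]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return float(resid @ resid), n_eff

def order_criteria(y, max_p):
    """Gaussian AIC and BIC (up to a shared constant) for orders 0..max_p."""
    aic, bic = {}, {}
    for p in range(max_p + 1):
        rss, n = ar_rss(y, p, max_p)
        k = p + 1  # intercept plus p lag coefficients
        aic[p] = n * np.log(rss / n) + 2 * k
        bic[p] = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

# Simulated AR(2) series with hypothetical coefficients 0.6 and -0.3.
rng = np.random.default_rng(0)
n_total = 2000
noise = rng.normal(size=n_total)
y = np.zeros(n_total)
for t in range(2, n_total):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + noise[t]

aic, bic = order_criteria(y, max_p=6)
order_aic = min(aic, key=aic.get)
order_bic = min(bic, key=bic.get)
print(order_aic, order_bic)
```

Because ln(n) exceeds 2 for this sample size, the BIC penalty per parameter is heavier than the AIC penalty, so the order chosen by BIC is never larger than the one chosen by AIC on the same fits.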


Variable Selection for Clustering with Gaussian Mixture Models

BIOMETRICS, Issue 3 2009
Cathy Maugis
Summary: This article is concerned with variable selection for cluster analysis. The problem is regarded as a model selection problem in the model-based cluster analysis context. A model generalizing the model of Raftery and Dean (2006, Journal of the American Statistical Association 101, 168-178) is proposed to specify the role of each variable. This model does not need any prior assumptions about the linear link between the selected and discarded variables. Models are compared using the Bayesian information criterion. The role of each variable is obtained through an algorithm embedding two backward stepwise algorithms for variable selection for clustering and linear regression. The model identifiability is established and the consistency of the resulting criterion is proved under regularity conditions. Numerical experiments on simulated datasets and a genomic application highlight the interest of the procedure.