Continuous Covariates (continuous + covariate)


Selected Abstracts


Evaluating uses of data mining techniques in propensity score estimation: a simulation study

PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, Issue 6 2008
Soko Setoguchi MD, DrPH
Abstract Background In propensity score modeling, it is a standard practice to optimize the prediction of exposure status based on the covariate information. In a simulation study, we examined in what situations analyses based on various types of exposure propensity score (EPS) models using data mining techniques such as recursive partitioning (RP) and neural networks (NN) produce unbiased and/or efficient results. Method We simulated data for a hypothetical cohort study (n = 2000) with a binary exposure/outcome and 10 binary/continuous covariates with seven scenarios differing by non-linear and/or non-additive associations between exposure and covariates. EPS models used logistic regression (LR) (all possible main effects), RP1 (without pruning), RP2 (with pruning), and NN. We calculated c-statistics (C), standard errors (SE), and bias of exposure-effect estimates from outcome models for the PS-matched dataset. Results Data mining techniques yielded higher C than LR (mean: NN, 0.86; RP1, 0.79; RP2, 0.72; and LR, 0.76). SE tended to be greater in models with higher C. Overall bias was small for each strategy, although NN estimates tended to be the least biased. C was not correlated with the magnitude of bias (correlation coefficient [COR] = -0.3, p = 0.1) but increased SE (COR = 0.7, p < 0.001). Conclusions Effect estimates from EPS models by simple LR were generally robust. NN models generally provided the least numerically biased estimates. C was not associated with the magnitude of bias but was associated with increased SE. Copyright © 2008 John Wiley & Sons, Ltd.
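The c-statistic reported in this abstract measures how well a propensity model separates exposed from unexposed subjects. A minimal sketch of the usual pairwise definition follows, assuming the propensity scores have already been estimated by some model; the function name `c_statistic` and the toy inputs are illustrative and do not come from the study.

```python
def c_statistic(scores, exposed):
    """Concordance (c) statistic of predicted scores against a binary label.

    It is the share of (exposed, unexposed) pairs in which the exposed
    subject has the higher score; tied scores count as half a pair.
    """
    pos = [s for s, e in zip(scores, exposed) if e == 1]
    neg = [s for s, e in zip(scores, exposed) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one exposed and one unexposed subject")

    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0
            elif p == q:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


# Hypothetical propensity scores for four subjects (two exposed, two not).
toy_scores = [0.2, 0.8, 0.6, 0.4]
toy_exposed = [1, 1, 0, 0]
toy_c = c_statistic(toy_scores, toy_exposed)
```

A value of 0.5 means the scores discriminate no better than chance and 1.0 means perfect separation. The abstract's finding is that a higher value did not go with lower bias in the exposure-effect estimate.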


Analysis of Capture–Recapture Models with Individual Covariates Using Data Augmentation

BIOMETRICS, Issue 1 2009
J. Andrew Royle
Summary I consider the analysis of capture–recapture models with individual covariates that influence detection probability. Bayesian analysis of the joint likelihood is carried out using a flexible data augmentation scheme that facilitates analysis by Markov chain Monte Carlo methods, and a simple and straightforward implementation in freely available software. This approach is applied to a study of meadow voles (Microtus pennsylvanicus) in which auxiliary data on a continuous covariate (body mass) are recorded, and it is thought that detection probability is related to body mass. In a second example, the model is applied to an aerial waterfowl survey in which a double-observer protocol is used. The fundamental unit of observation is the cluster of individual birds, and the size of the cluster (a discrete covariate) is used as a covariate on detection probability.


Interpreting analyses of continuous covariates in affected sibling pair linkage studies

GENETIC EPIDEMIOLOGY, Issue 6 2007
Silke Schmidt
Abstract Datasets collected for linkage analyses of complex human diseases often include a number of clinical or environmental covariates. In this study, we evaluated the performance of three linkage analysis methods when the relationship between continuous covariates and disease risk or linkage heterogeneity was modeled in three different ways: (1) The covariate distribution is determined by a quantitative trait locus (QTL), which contributes indirectly to the disease risk; (2) the covariate is not genetically determined, but influences the disease risk through statistical interaction with a disease susceptibility locus; (3) the covariate distribution differs in families linked or unlinked to a particular disease susceptibility locus. We analyzed simulated datasets with a regression-based QTL analysis, a nonparametric analysis of the binary affection status, and the ordered subset analysis (OSA). We found that a significant OSA result may be due to a gene that influences variability in the population distribution of a continuous disease risk factor. Conversely, a regression-based QTL analysis may detect the presence of gene-environment (G × E) interaction in a sample of primarily affected individuals. The contribution of unaffected siblings and the size of baseline lod scores may help distinguish between QTL and G × E models. As illustrated by a linkage study of multiplex families with age-related macular degeneration, our findings assist in the interpretation of analysis results in real datasets. They suggest that the side-by-side evaluation of OSA and QTL results may provide important information about the relationship of measured covariates with either disease risk or linkage heterogeneity. Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc.


Failure time regression with continuous covariates measured with error

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 4 2000
Haibo Zhou
We consider failure time regression analysis with an auxiliary variable in the presence of a validation sample. We extend the nonparametric inference procedure of Zhou and Pepe to handle a continuous auxiliary or proxy covariate. We estimate the induced relative risk function with a kernel smoother and allow the selection probability of the validation set to depend on the observed covariates. We present some asymptotic properties for the kernel estimator and provide some simulation results. The method proposed is illustrated with a data set from an on-going epidemiologic study.
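For readers unfamiliar with kernel smoothing of a function of a continuous covariate, the generic idea can be sketched with a Nadaraya-Watson estimator. This is not the estimator of the paper, which smooths an induced relative risk function using a validation sample; the Gaussian kernel, the bandwidth value, and the names `gaussian_kernel` and `kernel_smooth` are assumptions made for illustration only.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate at x0: a kernel-weighted average of ys.

    Observations whose covariate value lies close to x0 (relative to the
    bandwidth) receive more weight than distant ones.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical data: responses observed at two covariate values.
estimate_midpoint = kernel_smooth(1.0, [0.0, 2.0], [0.0, 2.0], bandwidth=0.5)
```

The bandwidth controls the trade-off between bias and variance: a small value follows the data closely, while a large value averages over a wider range of covariate values.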


Structured additive regression for overdispersed and zero-inflated count data

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 4 2006
Ludwig Fahrmeir
Abstract In count data regression there can be several problems that prevent the use of the standard Poisson log-linear model: overdispersion, caused by unobserved heterogeneity or correlation, excess of zeros, non-linear effects of continuous covariates or of time scales, and spatial effects. We develop Bayesian count data models that can deal with these issues simultaneously and within a unified inferential approach. Models for overdispersed or zero-inflated data are combined with semiparametrically structured additive predictors, resulting in a rich class of count data regression models. Inference is fully Bayesian and is carried out by computationally efficient MCMC techniques. Simulation studies investigate performance, in particular how well different model components can be identified. Applications to patent data and to car insurance data illustrate the potential and, to some extent, limitations of our approach. Copyright © 2006 John Wiley & Sons, Ltd.
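The "excess of zeros" problem named in this abstract is commonly handled with a zero-inflated Poisson (ZIP) distribution, which mixes a point mass at zero with an ordinary Poisson count. The sketch below shows only that textbook probability mass function with fixed parameters; the paper's models additionally place structured additive predictors on the parameters, which is not attempted here. The names `zip_pmf`, `pi_zero` and `lam` are illustrative.

```python
import math


def zip_pmf(k, pi_zero, lam):
    """Probability of count k under a zero-inflated Poisson distribution.

    pi_zero is the probability of a structural (extra) zero and lam is the
    mean of the Poisson component:
        P(Y = 0) = pi_zero + (1 - pi_zero) * exp(-lam)
        P(Y = k) = (1 - pi_zero) * exp(-lam) * lam**k / k!   for k >= 1
    """
    if k < 0 or not 0.0 <= pi_zero <= 1.0 or lam <= 0:
        raise ValueError("need k >= 0, 0 <= pi_zero <= 1 and lam > 0")
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Hypothetical parameters: 30% structural zeros, Poisson mean of 2.
probabilities = [zip_pmf(k, 0.3, 2.0) for k in range(60)]
mean_count = sum(k * p for k, p in enumerate(probabilities))
```

With these parameters the mean is (1 - 0.3) * 2 = 1.4, and the probability of a zero is larger than a plain Poisson distribution with the same mean would give, which is the pattern the standard Poisson log-linear model cannot reproduce.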


Variable Selection and Model Choice in Geoadditive Regression Models

BIOMETRICS, Issue 2 2009
Thomas Kneib
Summary Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.


Functional Mixed Effects Models

BIOMETRICS, Issue 1 2002
Wensheng Guo
Summary. In this article, a new class of functional models in which smoothing splines are used to model fixed effects as well as random effects is introduced. The linear mixed effects models are extended to non-parametric mixed effects models by introducing functional random effects, which are modeled as realizations of zero-mean stochastic processes. The fixed functional effects and the random functional effects are modeled in the same functional space, which guarantees that the population-average and subject-specific curves have the same smoothness property. These models inherit the flexibility of the linear mixed effects models in handling complex designs and correlation structures, can include continuous covariates as well as dummy factors in either the fixed or random design matrices, and include the nested curves models as special cases. Two estimation procedures are proposed. The first estimation procedure exploits the connection between linear mixed effects models and smoothing splines and can be fitted using existing software. The second procedure is a sequential estimation procedure using Kalman filtering. This algorithm avoids inversion of large dimensional matrices and therefore can be applied to large data sets. A generalized maximum likelihood (GML) ratio test is proposed for inference and model selection. An application to comparison of cortisol profiles is used as an illustration.


Self-esteem in a clinical sample of morbidly obese children and adolescents

ACTA PAEDIATRICA, Issue 1 2009
P Nowicka
Abstract Aim: To study self-esteem in a clinical sample of obese children and adolescents. Methods: Obese children and adolescents aged 8–19 years (n = 107, mean age 13.2 years, mean BMI 32.5 [range 22.3–50.6], mean BMI z-score 3.22 [range 2.19–4.79]; 50 boys and 57 girls) were referred for treatment of primary obesity. Self-esteem was measured with a validated psychological test with five subscales: physical characteristics, talents and skills, psychological well-being, relations with the family and relations with others. A linear mixed effect model used the factors gender and adolescence group, and the continuous covariates BMI z-score and BMI of the parents, as fixed effects and subjects as random effects. Results: Age and gender, but neither the child's BMI z-score nor the BMI of the parents, were significant covariates. Self-esteem decreased (p < 0.01) with age on the global scale as well as on the subscales, and was below the normal level at older ages in both genders. Girls had significantly lower self-esteem on the global scale (p = 0.04) and on the two subscales physical characteristics (p < 0.01) and psychological well-being (p < 0.01). Conclusion: Self-esteem is lower in girls and decreases with age. In treatment settings special attention should be paid to adolescent girls.