Covariate Information (covariate + information)

Selected Abstracts


Multipoint affected sibpair linkage methods for localizing susceptibility genes of complex diseases

GENETIC EPIDEMIOLOGY, Issue 2 2003
David V. Glidden
Abstract Recently, Liang et al. ([2001] Hum. Hered. 51:64–78) proposed a general multipoint linkage method for estimating the chromosomal position of a putative susceptibility locus. Their technique is computationally simple and does not require specification of penetrance or a mode of inheritance. In complex genetic diseases, covariate data may be available which reflect etiologic or locus heterogeneity. We developed approaches to incorporating covariates into the method of Liang et al. ([2001] Hum. Hered. 51:64–78), with particular attention to exploiting age-at-onset information. The results of simulation studies, and a worked data example using a family data set ascertained through probands with schizophrenia, suggest that utilizing covariate information can yield substantial efficiency gains in localizing susceptibility genes. Genet Epidemiol 24:107–117, 2003. © 2003 Wiley-Liss, Inc.
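Though the abstract is method-level, the localization idea is easy to sketch: excess identity-by-descent (IBD) sharing among affected sib pairs peaks at the susceptibility locus and decays with map distance, so the position can be estimated by fitting that decay. The sketch below is a toy stand-in for, not a reproduction of, the Liang et al. estimator; the marker positions, sharing values, and Haldane-map decay rate are all hypothetical, and the covariate weighting developed in the paper is omitted.

```python
import numpy as np
from scipy.optimize import curve_fit

def excess_sharing(t, C, tau):
    # Expected excess IBD sharing at map position t (cM); under Haldane's
    # map the excess decays roughly like exp(-0.04 * distance in cM).
    return C * np.exp(-0.04 * np.abs(t - tau))

# Hypothetical multipoint data: marker positions (cM) and mean IBD sharing
# among affected sib pairs (null expectation: 1 of 2 alleles shared IBD).
positions = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0])
mean_ibd = np.array([1.02, 1.08, 1.15, 1.11, 1.05, 1.01])

(C_hat, tau_hat), _ = curve_fit(excess_sharing, positions,
                                mean_ibd - 1.0, p0=[0.1, 25.0])
print(f"estimated locus position: {tau_hat:.1f} cM (peak excess C = {C_hat:.3f})")
```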


Empirical-likelihood-based difference-in-differences estimators

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 2 2008
Jing Qin
Summary. Recently there has been a surge in econometric and epidemiologic work focusing on estimating average treatment effects under various sets of assumptions. Estimation of average treatment effects in observational studies often requires adjustment for differences in pretreatment variables. Rosenbaum and Rubin proposed the propensity score method for estimating the average treatment effect by adjusting for pretreatment variables. In this paper, the empirical likelihood method is used to estimate average treatment effects on the treated under the difference-in-differences framework. The advantage of this approach is that the common marginal covariate information can be incorporated naturally to enhance the estimation of average treatment effects. Compared with other approaches in the literature, the proposed method can provide more efficient estimation. A simulation study and a real economic data analysis are presented.
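For orientation, the plain difference-in-differences estimator of the average treatment effect on the treated (ATT), which the empirical-likelihood method refines by incorporating common marginal covariate information, can be sketched in a few lines; the data below are simulated and purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
treated = rng.integers(0, 2, n).astype(bool)

# Pre- and post-period outcomes with a common time trend (1.5) and a
# true treatment effect of 2.0 on the treated.
y_pre = 1.0 + 0.5 * treated + rng.normal(size=n)
y_post = y_pre + 1.5 + 2.0 * treated + rng.normal(size=n)

# DiD: the treated group's before/after change minus the control group's.
att = ((y_post[treated].mean() - y_pre[treated].mean())
       - (y_post[~treated].mean() - y_pre[~treated].mean()))
print(f"DiD estimate of the ATT: {att:.2f}")  # close to 2.0
```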


Maximum likelihood inference on a mixed conditionally and marginally specified regression model for genetic epidemiologic studies with two-phase sampling

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 2 2007
Nilanjan Chatterjee
Summary. Two-phase stratified sampling designs can reduce the cost of genetic epidemiologic studies by limiting expensive ascertainments of genetic and environmental exposure to an efficiently selected subsample (phase II) of the main study (phase I). Family history and some covariate information, which may be cheaply gathered for all subjects at phase I, can be used for sampling of informative subjects at phase II. We develop alternative maximum likelihood methods for analysis of data from such studies by using a novel regression model that permits the estimation of 'marginal' risk parameters that are associated with the genetic and environmental covariates of interest, while simultaneously characterizing the 'conditional' risk of the disease associated with family history after adjusting for the other covariates. The methods and appropriate asymptotic theories are developed with and without an assumption of gene–environment independence, allowing the distribution of the environmental factors to remain non-parametric. The performance of the alternative methods and of sampling strategies is studied by using simulated data involving rare and common genetic variants. An application of the proposed methods is illustrated by using a case–control study of colorectal adenoma embedded within the Prostate, Lung, Colorectal and Ovarian Cancer Screening Trial.
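To get a rough feel for two-phase designs: phase I variables (disease status, family history) drive the phase II subsampling, and a design-based analysis can weight the subsample by inverse selection probabilities. The sketch below uses simple inverse-probability-weighted logistic regression, which is a simpler alternative and not the authors' maximum likelihood approach; all strata, sampling fractions, and effect sizes are hypothetical.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 5000
fam_hist = rng.integers(0, 2, n)               # phase I: cheap for everyone
gene = rng.binomial(1, 0.2 + 0.2 * fam_hist)   # phase II: expensive genotype
logit = -2.0 + 1.0 * gene + 0.5 * fam_hist
disease = rng.binomial(1, 1 / (1 + np.exp(-logit)))

# Phase II oversamples cases and family-history-positive subjects.
p_select = np.where((disease + fam_hist) > 0, 0.8, 0.1)
phase2 = rng.random(n) < p_select

X = sm.add_constant(np.column_stack([gene, fam_hist])[phase2])
w = 1.0 / p_select[phase2]                     # inverse selection probabilities
fit = sm.GLM(disease[phase2], X,
             family=sm.families.Binomial(), freq_weights=w).fit()
print(fit.params)   # roughly recovers (-2.0, 1.0, 0.5)
```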


Bayesian geoadditive sample selection models

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 3 2010
Manuel Wiesenfarth
Summary. Sample selection models attempt to correct for non-randomly selected data in a two-model hierarchy where, on the first level, a binary selection equation determines whether a particular observation will be available for the second level, i.e. the outcome equation. Ignoring the non-random selection mechanism that is induced by the selection equation may result in biased estimation of the coefficients in the outcome equation. In the application that motivated this research, we analyse relief supply in earthquake-affected communities in Pakistan, where the decision to deliver goods represents the dependent variable in the selection equation whereas factors that determine the amount of goods supplied are analysed in the outcome equation. In this application, the inclusion of spatial effects is necessary since the available covariate information at the community level is rather scarce. Moreover, the high temporal dynamics underlying the immediate delivery of relief supply after a natural disaster call for non-linear, time-varying effects. We propose a geoadditive sample selection model that allows us to address these issues in a general Bayesian framework, with inference based on Markov chain Monte Carlo simulation techniques. The proposed model is studied in simulations and applied to the relief supply data from Pakistan.
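The mechanics of a sample selection correction are easiest to see in the classical (non-Bayesian, non-spatial) Heckman two-step, which the geoadditive model generalizes. The sketch below uses hypothetical stand-ins for the selection (delivery yes/no) and outcome (amount supplied) equations; in this toy version identification rests on the nonlinearity of the inverse Mills ratio.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 2000
z = rng.normal(size=n)                         # covariate in both equations
u, v = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], n).T

selected = (0.5 + 1.0 * z + u) > 0             # selection: deliver goods?
y = 1.0 + 2.0 * z + v                          # outcome: amount supplied

# Step 1: probit for selection, then the inverse Mills ratio (IMR).
Xs = sm.add_constant(z)
probit = sm.Probit(selected.astype(float), Xs).fit(disp=0)
xb = Xs @ probit.params
imr = norm.pdf(xb) / norm.cdf(xb)

# Step 2: outcome regression on the selected sample, augmented with the IMR.
Xo = sm.add_constant(np.column_stack([z[selected], imr[selected]]))
ols = sm.OLS(y[selected], Xo).fit()
print(ols.params)   # slope on z near 2.0 after the correction
```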


Evaluating uses of data mining techniques in propensity score estimation: a simulation study,

PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, Issue 6 2008
Soko Setoguchi MD, DrPH
Abstract Background In propensity score modeling, it is standard practice to optimize the prediction of exposure status based on the covariate information. In a simulation study, we examined in what situations analyses based on various types of exposure propensity score (EPS) models using data mining techniques such as recursive partitioning (RP) and neural networks (NN) produce unbiased and/or efficient results. Method We simulated data for a hypothetical cohort study (n = 2000) with a binary exposure/outcome and 10 binary/continuous covariates, with seven scenarios differing by non-linear and/or non-additive associations between exposure and covariates. EPS models used logistic regression (LR) (all possible main effects), RP1 (without pruning), RP2 (with pruning), and NN. We calculated c-statistics (C), standard errors (SE), and bias of exposure-effect estimates from outcome models for the PS-matched dataset. Results Data mining techniques yielded higher C than LR (mean: NN, 0.86; RP1, 0.79; RP2, 0.72; and LR, 0.76). SE tended to be greater in models with higher C. Overall bias was small for each strategy, although NN estimates tended to be the least biased. C was not correlated with the magnitude of bias (correlation coefficient [COR] = −0.3, p = 0.1) but was correlated with increased SE (COR = 0.7, p < 0.001). Conclusions Effect estimates from EPS models by simple LR were generally robust. NN models generally provided the least numerically biased estimates. C was not associated with the magnitude of bias but was associated with increased SE. Copyright © 2008 John Wiley & Sons, Ltd.
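A minimal sketch of the compared strategies, assuming scikit-learn as the toolkit (the study does not prescribe one): fit the exposure propensity score by logistic regression, a decision tree (a recursive-partitioning analogue), and a small neural network, then 1:1 match exposed to unexposed on the score. The data, model settings, and matching rule here are hypothetical, not those of the simulation study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 10))
# Non-additive exposure model: includes a covariate interaction.
exposure = rng.binomial(1, 1 / (1 + np.exp(-(X[:, 0] + X[:, 1] * X[:, 2]))))

models = {
    "LR": LogisticRegression(max_iter=1000),
    "RP": DecisionTreeClassifier(max_depth=4),        # pruned-tree analogue
    "NN": MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000),
}
for name, model in models.items():
    ps = model.fit(X, exposure).predict_proba(X)[:, 1]
    # 1:1 nearest-neighbor matching of exposed to unexposed on the PS.
    nn = NearestNeighbors(n_neighbors=1).fit(ps[exposure == 0].reshape(-1, 1))
    dist, _ = nn.kneighbors(ps[exposure == 1].reshape(-1, 1))
    print(f"{name}: mean matching distance = {dist.mean():.4f}")
```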


A pharmacodynamic assessment of the impact of antihypertensive non-adherence on blood pressure control

PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, Issue 7 2000
Peter W. Choo MD, DrPH
Abstract Objectives To evaluate whether antihypertensive regimens that conform to present FDA guidelines, by maintaining ≥50% of their peak effect at the end of the dosing interval, protect patients during sporadic lapses in adherence. Methods 169 patients on monotherapy for high blood pressure underwent electronic adherence monitoring for 3 months. Blood pressures were measured during non-study office visits and were retrieved from automated medical records. Questionnaires were used to obtain other covariate information. The ratio of the dosing interval to the half-life of drug activity (I) was used to capture conformity with FDA guidelines. Data analysis focused on the interaction between I and the impact on blood pressure of delayed dosing. Results The average (± standard deviation) blood pressure during the study was 139.0 (± 12.0)/85.0 (± 6.9) mm Hg. Lisinopril, followed by sustained-release verapamil, atenolol, and hydrochlorothiazide, were the most frequently prescribed agents. The majority of regimens (99%) conformed to FDA dosing guidelines. Of the patients, 23% missed a dose before their blood pressure check. Non-adherence, however, did not have a direct impact on blood pressure, and no interaction with I was detected. Conclusions Among patients with relatively mild hypertension on single-drug therapy, regimens that conform to current FDA dosing guidelines may prevent losses of blood pressure control during episodic lapses of adherence. These findings should be replicated in other patient populations with standardized blood pressure measurement to confirm their validity. Copyright © 2000 John Wiley & Sons, Ltd.
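The analysis hinges on an interaction: does the blood-pressure cost of a missed dose grow with the dosing-interval/half-life ratio I? A hedged sketch of such a model follows, with entirely hypothetical variable names and simulated data, not the study's actual analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 169
df = pd.DataFrame({
    "missed_dose": rng.integers(0, 2, n),     # lapse before the BP check
    "I_ratio": rng.uniform(0.5, 2.0, n),      # dosing interval / half-life
})
# Simulated systolic BP: a missed dose hurts more when I_ratio is large
# (short-acting drug relative to the dosing interval).
df["sbp"] = (139 + 3.0 * df["missed_dose"] * (df["I_ratio"] - 1.0)
             + rng.normal(0, 12, n))

# The interaction term tests whether forgiving (low-I) regimens blunt the
# blood-pressure impact of a missed dose.
fit = smf.ols("sbp ~ missed_dose * I_ratio", data=df).fit()
print(fit.params)
```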


A Regression-based Association Test for Case-control Studies that Uses Inferred Ancestral Haplotype Similarity

ANNALS OF HUMAN GENETICS, Issue 5 2009
Youfang Liu
Summary. Association methods based on haplotype similarity (HS) can overcome power and stability issues encountered in standard haplotype analyses. Current HS methods can generally be classified into evolutionary and two-sample approaches. We propose a new regression-based HS association method for case-control studies that incorporates covariate information and combines the advantages of the two classes of approaches by using inferred ancestral haplotypes. We first estimate the ancestral haplotypes of case individuals; then, for each individual, an ancestral-haplotype-based similarity score is computed by comparing that individual's observed genotype with the estimated ancestral haplotypes. Trait values are then regressed on the similarity scores. Covariates can easily be incorporated into this regression framework. To account for the bias in the raw p-values due to the use of case data in constructing ancestral haplotypes, as well as to account for variation in ancestral haplotype estimation, a permutation procedure is adopted to obtain empirical p-values. Compared with the standard haplotype score test and the multilocus T2 test, our method improves power when neither the allele frequency nor the linkage disequilibrium between the disease locus and its neighboring SNPs is too low, and is comparable in other scenarios. We applied our method to the Genetic Analysis Workshop 15 simulated SNP data and successfully pinpointed a stretch of SNPs that covers the fine-scale region where the causal locus is located.
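The score-then-permute logic can be illustrated with a toy example in which a single ancestral haplotype is taken as given; the paper instead infers ancestral haplotypes from case data and folds that estimation into the permutation procedure. Everything below is hypothetical, and the similarity measure (an allele-count proxy) is far cruder than the shared-segment comparisons real HS methods use.

```python
import numpy as np

rng = np.random.default_rng(5)
n, L = 400, 8
ancestral = np.array([1, 1, 0, 1, 0, 0, 1, 1])  # one assumed 'risk' haplotype

status = rng.integers(0, 2, n)                   # 1 = case, 0 = control
geno = rng.integers(0, 3, (n, L))                # genotypes as allele counts
# Make cases lean toward the ancestral haplotype.
geno[status == 1] = np.clip(geno[status == 1] + ancestral, 0, 2)

# Toy similarity score: count of ancestral risk alleles carried.
score = geno @ ancestral

obs = np.corrcoef(score, status)[0, 1]
perm = np.array([np.corrcoef(score, rng.permutation(status))[0, 1]
                 for _ in range(999)])
p_emp = (1 + np.sum(np.abs(perm) >= abs(obs))) / 1000.0
print(f"observed correlation {obs:.3f}, empirical p-value {p_emp:.3f}")
```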


High-Dimensional Cox Models: The Choice of Penalty as Part of the Model Building Process

BIOMETRICAL JOURNAL, Issue 1 2010
Axel Benner
Abstract The Cox proportional hazards regression model is the most popular approach to model covariate information for survival times. In this context, the development of high-dimensional models, where the number of covariates is much larger than the number of observations (p ≫ n), is an ongoing challenge. A practicable approach is to use ridge penalized Cox regression in such situations. Besides focusing on finding the best prediction rule, one is often interested in determining a subset of covariates that are the most important ones for prognosis. This could be a gene set in the biostatistical analysis of microarray data. Covariate selection can then, for example, be done by L1-penalized Cox regression using the lasso (Tibshirani (1997). Statistics in Medicine 16, 385–395). Several approaches beyond the lasso that incorporate covariate selection have been developed in recent years. This includes modifications of the lasso as well as nonconvex variants such as smoothly clipped absolute deviation (SCAD) (Fan and Li (2001). Journal of the American Statistical Association 96, 1348–1360; Fan and Li (2002). The Annals of Statistics 30, 74–99). The purpose of this article is to implement them practically into the model building process when analyzing high-dimensional data with the Cox proportional hazards model. To evaluate penalized regression models beyond the lasso, we included SCAD variants and the adaptive lasso (Zou (2006). Journal of the American Statistical Association 101, 1418–1429). We compare them with "standard" applications such as ridge regression, the lasso, and the elastic net. Predictive accuracy, features of variable selection, and estimation bias are studied to assess the practical use of these methods. We observed that the performance of SCAD and the adaptive lasso is highly dependent on nontrivial preselection procedures; a practical solution to this problem does not yet exist. Since there is a high risk of missing relevant covariates when SCAD or the adaptive lasso is applied after an inappropriate initial selection step, we recommend staying with the lasso or the elastic net in actual data applications. But given the promising results for truly sparse models, we see some advantage for SCAD and the adaptive lasso if better preselection procedures become available. This requires further methodological research.
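A hedged sketch of the penalty family discussed, using the lifelines package (one plausible toolkit; the article does not prescribe it): in lifelines' CoxPHFitter, l1_ratio = 1 gives the lasso, 0 gives ridge, and intermediate values give the elastic net. SCAD and the adaptive lasso require specialized code and are omitted; the gene names, effect sizes, and penalty strength below are hypothetical.

```python
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(6)
n, p = 100, 50                                   # many covariates, few subjects
X = rng.normal(size=(n, p))
hazard = np.exp(0.8 * X[:, 0] - 0.8 * X[:, 1])   # only 2 truly active 'genes'
time = rng.exponential(1.0 / hazard)
event = rng.random(n) < 0.7                      # ~70% observed events

df = pd.DataFrame(X, columns=[f"g{j}" for j in range(p)])
df["time"], df["event"] = time, event

for l1_ratio, label in [(0.0, "ridge"), (1.0, "lasso"), (0.5, "elastic net")]:
    cph = CoxPHFitter(penalizer=0.5, l1_ratio=l1_ratio)
    cph.fit(df, duration_col="time", event_col="event")
    nonzero = int((cph.params_.abs() > 1e-4).sum())
    print(f"{label}: {nonzero} effectively nonzero coefficients")
```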


Hierarchical and Joint Site-Edge Methods for Medicare Hospice Service Region Boundary Analysis

BIOMETRICS, Issue 2 2010
Haijun Ma
Summary. Hospice service offers a convenient and ethically preferable health-care option for terminally ill patients. However, this option is unavailable to patients in remote areas not served by any hospice system. In this article, we seek to determine the service areas of two particular cancer hospice systems in northeastern Minnesota based only on death counts abstracted from Medicare billing records. The problem is one of spatial boundary analysis, a field that appears statistically underdeveloped for irregular areal (lattice) data, even though most publicly available human health data are of this type. We suggest a variety of hierarchical models for areal boundary analysis that hierarchically or jointly parameterize both the areas and the edge segments. This leads to conceptually appealing solutions for our data that remain computationally feasible. While our approaches parallel similar developments in statistical image restoration using Markov random fields, important differences arise due to the irregular nature of our lattices, the sparseness and high variability of our data, the existence of important covariate information, and, most importantly, our desire for full posterior inference on the boundary. Our results successfully delineate service areas for our two Minnesota hospice systems that sometimes conflict with the hospices' self-reported service areas. We also obtain boundaries for the spatial residuals from our fits, separating regions that differ for reasons yet unaccounted for by our model.
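A crude, non-Bayesian stand-in for edge-based boundary analysis is simple areal wombling: rank each shared edge of the lattice by the absolute difference in a region-level rate and flag the largest differences as candidate boundaries. The tiny lattice, rates, and cutoff below are hypothetical; the article's hierarchical and joint site-edge models go far beyond this and deliver full posterior inference on the boundary.

```python
import numpy as np

# Hypothetical death rates per region and the adjacency (edge) list of a
# small irregular lattice of six regions.
rate = np.array([0.10, 0.12, 0.11, 0.45, 0.50, 0.48])
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4), (3, 5), (4, 5)]

# Boundary likelihood value: |difference in rates| across each shared edge.
blv = np.array([abs(rate[i] - rate[j]) for i, j in edges])
cutoff = np.quantile(blv, 0.75)
boundary = [e for e, v in zip(edges, blv) if v >= cutoff]
print("candidate service-area boundary edges:", boundary)
```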


Quantifying the Predictive Performance of Prognostic Models for Censored Survival Data with Time-Dependent Covariates

BIOMETRICS, Issue 2 2008
R. Schoop
Summary Prognostic models in survival analysis typically aim to describe the association between patient covariates and future outcomes. More recently, efforts have been made to include covariate information that is updated over time. However, there exists as yet no standard approach to assess the predictive accuracy of such updated predictions. In this article, proposals from the literature are discussed and a conditional loss function approach is suggested, illustrated by a publicly available data set. [source]