Missing Data

Selected Abstracts


Imputation of SF-12 Health Scores for Respondents with Partially Missing Data

HEALTH SERVICES RESEARCH, Issue 3 2005
Honghu Liu
Objective. To create an efficient imputation algorithm for imputing the SF-12 physical component summary (PCS) and mental component summary (MCS) scores when patients have one to eleven SF-12 items missing. Study Setting. Primary data collection was performed between 1996 and 1998. Study Design. Multi-pattern regression was conducted to impute the scores using only available SF-12 items (simple model), and then supplemented by demographics, smoking status and comorbidity (enhanced model) to increase the accuracy. A cut point of missing SF-12 items was determined for using the simple or the enhanced model. The algorithm was validated through simulation. Data Collection. Thirty thousand three hundred and eight patients from 63 physician groups were surveyed for a quality of care study in 1996, which collected the SF-12 and other information. The patients were classified as "chronic" patients if they reported that they had diabetes, heart disease, asthma/chronic obstructive pulmonary disease, or low back pain. A follow-up survey was conducted in 1998. Principal Findings. Thirty-one percent of the patients missed at least one SF-12 item. Means of variance of prediction and standard errors of the mean imputed scores increased with the number of missing SF-12 items. Correlations between the observed and the imputed scores derived from the enhanced models were consistently higher than those derived from the simple model, and the increments were significant for patients with ≥6 missing SF-12 items (p<.03). Conclusion. Missing SF-12 items are prevalent and lead to reduced analytical power. Regression-based multi-pattern imputation using the available SF-12 items is efficient and can produce good estimates of the scores. The enhancement from the additional patient information can significantly improve the accuracy of the imputed scores for patients with ≥6 items missing, leading to estimated scores that are as accurate as those of patients with <6 missing items.
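
As a rough illustration of the pattern-specific regression idea described in this abstract, the Python sketch below fits, for one respondent, an ordinary least-squares model on complete cases using only the items that respondent answered, and then predicts the summary score. The function names, the data layout (None marks a missing item), and the use of plain OLS are assumptions made for illustration; the published algorithm, its enhanced model, and the SF-12 scoring weights are not reproduced here.

```python
from typing import List, Optional, Sequence


def solve_linear(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting.

    Assumes the system is non-singular (no check is made in this sketch).
    """
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(xs: Sequence[Sequence[float]], ys: Sequence[float]) -> List[float]:
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)


def impute_score(
    complete_items: Sequence[Sequence[float]],
    complete_scores: Sequence[float],
    partial: Sequence[Optional[float]],
) -> float:
    """Predict a summary score for one respondent from the items they answered.

    A separate regression is fitted for this respondent's missingness
    pattern, using only the observed item positions as predictors.
    """
    observed = [i for i, v in enumerate(partial) if v is not None]
    xs = [[row[i] for i in observed] for row in complete_items]
    beta = fit_ols(xs, complete_scores)
    return beta[0] + sum(b * partial[i] for b, i in zip(beta[1:], observed))
```

In practice one would cache a fitted model per missingness pattern rather than refitting for every respondent; the sketch refits each time only to keep the logic visible.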


Estimation of Item Response Theory Parameters in the Presence of Missing Data

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 3 2008
Holmes Finch
Missing data are a common problem in a variety of measurement settings, including responses to items on both cognitive and affective assessments. Researchers have shown that such missing data may create problems in the estimation of item difficulty parameters in the Item Response Theory (IRT) context, particularly if they are ignored. At the same time, a number of data imputation methods have been developed outside of the IRT framework and been shown to be effective tools for dealing with missing data. The current study takes several of these methods that have been found to be useful in other contexts and investigates their performance with IRT data that contain missing values. Through a simulation study, it is shown that these methods exhibit varying degrees of effectiveness in terms of imputing data that in turn produce accurate sample estimates of item difficulty and discrimination parameters.


Meta-Analysis of Studies with Missing Data

BIOMETRICS, Issue 2 2009
Ying Yuan
Summary. Consider a meta-analysis of studies with varying proportions of patient-level missing data, and assume that each primary study has made certain missing data adjustments so that the reported estimates of treatment effect size and variance are valid. These estimates of treatment effects can be combined across studies by standard meta-analytic methods, employing a random-effects model to account for heterogeneity across studies. However, we note that a meta-analysis based on the standard random-effects model will lead to biased estimates when the attrition rates of primary studies depend on the size of the underlying study-level treatment effect. Perhaps ignorable within each study, these types of missing data are in fact not ignorable in a meta-analysis. We propose three methods to correct the bias resulting from such missing data in a meta-analysis: reweighting the DerSimonian-Laird estimate by the completion rate; incorporating the completion rate into a Bayesian random-effects model; and inference based on a Bayesian shared-parameter model that includes the completion rate. We illustrate these methods through a meta-analysis of 16 published randomized trials that examined combined pharmacotherapy and psychological treatment for depression.
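
For readers unfamiliar with the baseline that this abstract starts from, the following Python sketch computes the standard DerSimonian-Laird random-effects estimate from per-study effects and variances. It covers only the uncorrected estimator; the completion-rate reweighting and the Bayesian models proposed in the article are not specified in the abstract and are deliberately not attempted here. The function name and return layout are illustrative assumptions.

```python
from typing import Sequence, Tuple


def dersimonian_laird(
    effects: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (pooled effect, variance of pooled effect, tau^2).

    Standard method-of-moments random-effects pooling; needs >= 2 studies.
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are required")
    w = [1.0 / v for v in variances]  # fixed-effect inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance, truncated at 0
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, 1.0 / sum(w_star), tau2
```

Because each weight is the inverse of the study variance plus tau^2, studies with heavy attrition (and hence larger variances) receive less weight, which is the mechanism through which effect-dependent attrition can bias the pooled estimate.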


Discriminant Analysis for Longitudinal Data with Multiple Continuous Responses and Possibly Missing Data

BIOMETRICS, Issue 1 2009
Guillermo Marshall
Summary. Multiple outcomes are often used to properly characterize an effect of interest. This article discusses model-based statistical methods for the classification of units into one of two or more groups where, for each unit, repeated measurements over time are obtained on each outcome. We relate the observed outcomes using multivariate nonlinear mixed-effects models to describe evolutions in different groups. Due to its flexibility, the random-effects approach for the joint modeling of multiple outcomes can be used to estimate population parameters for a discriminant model that classifies units into distinct predefined groups or populations. Parameter estimation is done via the expectation-maximization algorithm with a linear approximation step. We conduct a simulation study that sheds light on the effect that the linear approximation has on classification results. We present an example using data from a study of 161 pregnant women in Santiago, Chile, where the main interest is to predict normal versus abnormal pregnancy outcomes.


Doubly Robust Estimation in Missing Data and Causal Inference Models

BIOMETRICS, Issue 4 2005
Heejung Bang
Summary. The goal of this article is to construct doubly robust (DR) estimators in ignorable missing data and causal inference models. In a missing data model, an estimator is DR if it remains consistent when either (but not necessarily both) a model for the missingness mechanism or a model for the distribution of the complete data is correctly specified. Because with observational data one can never be sure that either a missingness model or a complete data model is correct, perhaps the best that can be hoped for is to find a DR estimator. DR estimators, in contrast to standard likelihood-based or (nonaugmented) inverse probability-weighted estimators, give the analyst two chances, instead of only one, to make a valid inference. In a causal inference model, an estimator is DR if it remains consistent when either a model for the treatment assignment mechanism or a model for the distribution of the counterfactual data is correctly specified. Because with observational data one can never be sure that a model for the treatment assignment mechanism or a model for the counterfactual data is correct, inference based on DR estimators should improve upon previous approaches. Indeed, we present the results of simulation studies which demonstrate that the finite sample performance of DR estimators is as impressive as theory would predict. The proposed method is applied to a cardiovascular clinical trial.
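
The "two chances" property can be seen in the widely used augmented inverse-probability-weighted (AIPW) estimator of a mean when some outcomes are missing. The Python sketch below implements that textbook form; it is not necessarily the regression-based construction developed in the article. The function name, the convention that None marks a missing outcome, and the assumption that the observation probabilities and outcome predictions have already been estimated elsewhere are all illustrative.

```python
from typing import Optional, Sequence


def aipw_mean(
    y: Sequence[Optional[float]],
    p_obs: Sequence[float],
    m_hat: Sequence[float],
) -> float:
    """Augmented IPW estimate of E[Y] with outcomes missing at random.

    y      -- outcomes, None where missing
    p_obs  -- modelled probability that each outcome is observed (must be > 0)
    m_hat  -- outcome-model prediction of Y for every unit
    """
    total = 0.0
    for yi, pi, mi in zip(y, p_obs, m_hat):
        r = 0.0 if yi is None else 1.0          # response indicator
        ipw_term = yi / pi if yi is not None else 0.0
        # The augmentation term has mean zero when p_obs is correct and
        # removes the weighting error when m_hat is correct.
        total += ipw_term - (r - pi) / pi * mi
    return total / len(y)
```

If the outcome predictions are exactly right, each unit contributes its prediction regardless of the probabilities supplied; if the predictions are all zero, the expression reduces to the plain inverse-probability-weighted mean.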


Advanced Statistics: Missing Data in Clinical Research, Part 1: An Introduction and Conceptual Framework

ACADEMIC EMERGENCY MEDICINE, Issue 7 2007
Jason S. Haukoos MD
Missing data are commonly encountered in clinical research. Unfortunately, they are often neglected or not properly handled during analytic procedures, and this may substantially bias the results of the study, reduce study power, and lead to invalid conclusions. In this two-part series, the authors will introduce key concepts regarding missing data in clinical research, provide a conceptual framework for how to approach missing data in this setting, describe typical mechanisms and patterns of censoring of data and their relationships to specific methods of handling incomplete data, and describe in detail several simple and more complex methods of handling such data. In part 1, the authors will describe relatively simple approaches to handling missing data, including complete-case analysis, available-case analysis, and several forms of single imputation, including mean imputation, regression imputation, hot and cold deck imputation, last observation carried forward, and worst case analysis. In part 2, the authors will describe in detail multiple imputation, a more sophisticated and valid method for handling missing data.
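
Three of the simple approaches named in this abstract can be sketched in a few lines of Python. The helpers below are illustrative only (their names and the use of None for a missing value are assumptions, not taken from the article) and cover complete-case analysis, mean imputation, and last observation carried forward.

```python
from statistics import mean
from typing import List, Optional, Sequence


def complete_cases(rows: Sequence[Sequence[Optional[float]]]) -> List[Sequence[Optional[float]]]:
    """Complete-case analysis: keep only records with no missing values."""
    return [row for row in rows if all(v is not None for v in row)]


def mean_impute(values: Sequence[Optional[float]]) -> List[float]:
    """Mean imputation: replace each missing value with the observed mean."""
    observed_mean = mean(v for v in values if v is not None)
    return [observed_mean if v is None else v for v in values]


def locf(series: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Last observation carried forward along one subject's time series.

    Values missing before the first observation stay missing.
    """
    filled: List[Optional[float]] = []
    last: Optional[float] = None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled
```

Each helper returns a single filled-in data set, so none of them carries the uncertainty about the missing values into later standard errors.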


Advanced Statistics: Missing Data in Clinical Research, Part 2: Multiple Imputation

ACADEMIC EMERGENCY MEDICINE, Issue 7 2007
Craig D. Newgard MD
In part 1 of this series, the authors describe the importance of incomplete data in clinical research, and provide a conceptual framework for handling incomplete data by describing typical mechanisms and patterns of censoring, and detailing a variety of relatively simple methods and their limitations. In part 2, the authors will explore multiple imputation (MI), a more sophisticated and valid method for handling incomplete data in clinical research. This article will provide a detailed conceptual framework for MI, comparative examples of MI versus naive methods for handling incomplete data (and how different methods may impact subsequent study results), plus a practical user's guide to implementing MI, including sample statistical software MI code and a deidentified precoded database for use with the sample code.
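
The final step of any MI analysis is to combine the results from the separately analysed imputed data sets. The Python sketch below applies Rubin's rules to a list of point estimates and their variances, one pair per imputed data set. It is not the sample code distributed with the article; the function name is an assumption, and the imputation and per-data-set analysis steps are assumed to have been done elsewhere.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(
    estimates: Sequence[float], variances: Sequence[float]
) -> Tuple[float, float]:
    """Combine m >= 2 imputed-data analyses using Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # sample variance across imputations (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total
```

The between-imputation term is what single imputation omits: it inflates the variance to reflect disagreement among the imputed data sets about the missing values.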