Data Subject (data + subject)
Selected Abstracts

Analyzing Incomplete Data Subject to a Threshold using Empirical Likelihood Methods: An Application to a Pneumonia Risk Study in an ICU Setting
BIOMETRICS, Issue 1 2010
Jihnhee Yu

Summary
The initial detection of ventilator-associated pneumonia (VAP) for inpatients at an intensive care unit requires composite symptom evaluation using clinical criteria such as the clinical pulmonary infection score (CPIS). When the CPIS is above a threshold value, bronchoalveolar lavage (BAL) is performed to confirm the diagnosis by counting actual bacterial pathogens. Thus, CPIS and BAL results are closely related, and both are important indicators of pneumonia, although the BAL data are incomplete. To compare pneumonia risks among treatment groups with such incomplete data, we derive a method that combines nonparametric empirical likelihood ratio techniques with classical testing for parametric models. This technique augments the study's power by enabling us to use all observed data. The asymptotic properties of the proposed method are investigated theoretically. Monte Carlo simulations confirm both the asymptotic results and the good power properties of the proposed method. The method is applied to actual data obtained in clinical practice settings to compare VAP risks among treatment groups. [source]

A non-linear and non-Gaussian state-space model for censored air pollution data
ENVIRONMETRICS, Issue 2 2005
Craig J. Johns

Abstract
Lidar technology is used to quantify airborne particulate matter less than 10 µm in diameter (PM10). These spatio-temporal lidar data on PM10 are subject to censoring due to detection limits. A non-linear and non-Gaussian state-space model is modified to accommodate data subject to detection limits, and strategies are outlined for Markov chain Monte Carlo estimation and filtering. The methods are applied to spatio-temporal lidar measurements of dust particle concentrations. Copyright © 2004 John Wiley & Sons, Ltd.
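Both abstracts above turn on likelihood contributions for observations censored at a threshold or detection limit. A minimal sketch of that idea (not either paper's actual model; the Gaussian assumption, the detection limit, and all parameter values are illustrative): uncensored values contribute a density term, while values below the limit L contribute the probability mass below the limit, Φ((L − μ)/σ).

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(0)
DETECTION_LIMIT = 1.0  # hypothetical instrument detection limit

# Simulate true concentrations; values below the limit are left-censored.
true_vals = rng.normal(loc=2.0, scale=1.5, size=2000)
censored = true_vals < DETECTION_LIMIT
obs = np.where(censored, DETECTION_LIMIT, true_vals)

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)  # log parameterization keeps sigma > 0
    # Uncensored points: ordinary Gaussian density.
    ll = norm.logpdf(obs[~censored], loc=mu, scale=sigma).sum()
    # Censored points: probability of falling below the detection limit.
    ll += censored.sum() * norm.logcdf((DETECTION_LIMIT - mu) / sigma)
    return -ll

res = minimize(neg_log_lik, x0=np.array([0.0, 0.0]), method="Nelder-Mead")
mu_hat, sigma_hat = res.x[0], float(np.exp(res.x[1]))
```

Discarding the censored points instead of using their CDF contribution would bias the mean upward, which is why both papers model the detection limit explicitly rather than dropping the censored records.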
[source]

Effects of nonrandom parental selection on estimation of variance components
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 4 2000
F.S. Schenkel

Summary
Bayesian estimation via Gibbs sampling, REML, and Method R were compared for their empirical sampling properties in estimating genetic parameters from data subject to parental selection under an infinitesimal animal model. Models with and without contemporary groups, random or nonrandom parental selection, two levels of heritability, and none or 15% randomly missing pedigree information were considered. Nonrandom parental selection caused similar effects on estimates of variance components from all three methods. When pedigree information was complete, REML and Bayesian estimation were not biased by nonrandom parental selection for models with or without contemporary groups. Method R estimates, however, were strongly biased by nonrandom parental selection when contemporary groups were in the model. The bias was empirically shown to be a consequence of not fully accounting for gametic phase disequilibrium in the subsamples. The joint effects of nonrandom parental selection and missing pedigree information caused estimates from all methods to be highly biased. Missing pedigree information did not cause biased estimates in random-mating populations. Method R estimates usually had greater mean square errors than did REML and Bayesian estimates.

Zusammenfassung (translated)
Bayesian estimation via Gibbs sampling, REML, and Method R were compared with respect to their empirical sampling properties for estimating genetic parameters from data under parental selection, assuming an infinitesimal animal model. Models with and without contemporary groups, two levels of heritability, random and nonrandom parental selection, and complete or 15% missing pedigrees were compared. With complete pedigree information, Bayesian and REML estimates were unbiased, but Method R was strongly biased under parental selection when the data had a contemporary-group structure. This was traced empirically to the failure to account for gametic phase disequilibrium. The joint effect of missing pedigrees and parental selection caused strong biases for all methods, except at low heritabilities in a uniform population (no contemporary groups). Incomplete pedigrees caused no bias in randomly mated populations. Method R generally produced larger mean squared errors than REML or Bayesian methods. [source]

Modified weights based generalized quasilikelihood inferences in incomplete longitudinal binary models
THE CANADIAN JOURNAL OF STATISTICS, Issue 2 2010
Brajendra C. Sutradhar

Abstract
In an incomplete longitudinal setup, a small number of repeated responses, subject to an appropriate missingness mechanism, along with a set of covariates, are collected from a large number of independent individuals over a short period of time. In this setup, the regression effects of the covariates are routinely estimated by solving certain inverse-weights-based generalized estimating equations. These inverse weights are introduced to make the estimating equation unbiased so that a consistent estimate of the regression parameter vector may be obtained. In existing studies, these weights are generally formulated conditionally on the past responses. Since the past responses follow a correlation structure, the present study reveals that if the longitudinal data subject to the missingness mechanism are generated by accommodating the longitudinal correlation structure, the conditional weights based on past correlated responses may yield biased, and hence inconsistent, regression estimates. The bias appears to grow as the correlation increases.
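The inverse-weighting device discussed in this abstract can be illustrated with a toy simulation (a generic inverse-probability-weighted mean under MAR dropout, not the authors' modified weights; every probability and sample size below is a made-up illustration): the complete-case mean is biased because dropout depends on the first response, while weighting each observed response by the inverse of its observation probability restores unbiasedness.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000  # subjects

# Two repeated binary responses per subject; the second is correlated with the first.
y1 = rng.binomial(1, 0.5, n)
y2 = rng.binomial(1, np.where(y1 == 1, 0.7, 0.3), n)  # marginal E[y2] = 0.5

# MAR dropout: whether y2 is observed depends on the past response y1.
p_obs = np.where(y1 == 1, 0.9, 0.4)
observed = rng.binomial(1, p_obs, n).astype(bool)

# Complete-case mean over-represents subjects with y1 == 1, so it is biased upward.
naive_mean = y2[observed].mean()

# Inverse-probability weighting: each observed y2 counts for 1 / P(observed),
# making the weighted estimating function unbiased for the true mean.
ipw_mean = (y2[observed] / p_obs[observed]).sum() / n
```

The abstract's point is subtler than this sketch: even such inverse weights can fail when they are conditioned on past responses that are themselves correlated with the outcome process, which motivates the modified weights proposed by the authors.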
As a remedy, in this paper the authors propose a modification of the existing weights so that the weights are not affected, directly or indirectly, by the correlations. They then exploit these modified weights to form a weighted generalized quasi-likelihood estimating equation that yields unbiased, and hence consistent, estimates of the regression effects irrespective of the magnitude of the correlation. The efficiency of the regression estimates follows from the use of the true correlation structure as a separate longitudinal weights matrix in the estimating equation. The Canadian Journal of Statistics © 2010 Statistical Society of Canada

Résumé (translated)
In an incomplete longitudinal setting, we observe a small number of repeated responses, subject to an appropriate missingness mechanism, along with a set of covariates, from a large number of independent individuals observed over a short period of time. In this setting, the regression effects of the covariates are usually estimated by solving certain inverse-weights-based generalized estimating equations. These inverse weights are used to make the estimating equations unbiased and thereby obtain consistent estimators of the vector of regression parameters. In existing studies, these weights are generally formulated conditionally on the past responses. Since the past responses have a correlation structure, this article shows that if the longitudinal data, subject to a missingness mechanism, are generated by accommodating the longitudinal correlation structure, then conditional weights based on the past correlated responses can lead to biased, and consequently inconsistent, estimates of the regression effects. This bias appears to grow as the correlation increases. To remedy this situation, the authors propose a modification of the existing weights so that they are no longer affected, directly or indirectly, by the correlations. They then exploit these modified weights to obtain a weighted generalized quasi-likelihood estimating equation that leads to unbiased, and hence consistent, estimators of the regression effects regardless of the magnitude of the correlation. The efficiency of these estimators is attributable to the use of the true correlation structure as a separate longitudinal weights matrix in the estimating equation. La revue canadienne de statistique © 2010 Société statistique du Canada [source]

MAXIMUM LIKELIHOOD ESTIMATION FOR A POISSON RATE PARAMETER WITH MISCLASSIFIED COUNTS
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 2 2005
James D. Stamey

Summary
This paper proposes a Poisson-based model that uses both error-free data and error-prone data subject to misclassification in the form of false-negative and false-positive counts. It derives maximum likelihood estimators (MLEs) for the Poisson rate parameter and the two misclassification parameters: the false-negative parameter and the false-positive parameter. It also derives expressions for the information matrix and the asymptotic variances of the MLEs for the rate parameter, the false-positive parameter, and the false-negative parameter. Using these expressions, the paper analyses the value of the fallible data. It studies the characteristics of the new double-sampling rate estimator via a simulation experiment and applies the new MLEs and confidence intervals to a real dataset. [source]

A Bayesian Chi-Squared Goodness-of-Fit Test for Censored Data Models
BIOMETRICS, Issue 2 2010
Jing Cao

Summary
We propose a Bayesian chi-squared model diagnostic for the analysis of data subject to censoring.
The test statistic has the form of Pearson's chi-squared test statistic and is easy to calculate from the standard output of Markov chain Monte Carlo algorithms. The key innovation of this diagnostic is that it is based only on observed failure times. Because it does not rely on the imputation of failure times for observations that have been censored, we show that under heavy censoring it can have higher power for detecting model departures than a comparable test based on the complete data. In a simulation study, we show that tests based on this diagnostic exhibit comparable power and better nominal Type I error rates than a commonly used alternative test proposed by Akritas (1988, Journal of the American Statistical Association 83, 222–230). An important advantage of the proposed diagnostic is that it can be applied to a broad class of censored data models, including generalized linear models and other models with nonidentically distributed and nonadditive error structures. We illustrate the proposed model diagnostic by testing the adequacy of two parametric survival models for Space Shuttle main engine failures. [source]

Sensitivity Analysis for Nonrandom Dropout: A Local Influence Approach
BIOMETRICS, Issue 1 2001
Geert Verbeke

Summary
Diggle and Kenward (1994, Applied Statistics 43, 49–93) proposed a selection model for continuous longitudinal data subject to nonrandom dropout. It has provoked a large debate about the role of such models. The original enthusiasm was followed by skepticism about the strong but untestable assumptions on which this type of model invariably rests. Since then, the view has emerged that these models should ideally be made part of a sensitivity analysis. This paper presents a formal and flexible approach to such a sensitivity assessment based on local influence (Cook, 1986, Journal of the Royal Statistical Society, Series B 48, 133–169).
The influence of perturbing a missing-at-random dropout model in the direction of nonrandom dropout is explored. The method is applied to data from a randomized experiment on the inhibition of testosterone production in rats. [source]

Multiple imputation for combining confidential data owned by two agencies
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES A (STATISTICS IN SOCIETY), Issue 2 2009
Christine N. Kohnen

Summary
Statistical agencies that own different databases on overlapping subjects can benefit greatly from combining their data. These benefits are passed on to secondary data analysts when the combined data are disseminated to the public. Sometimes combining data across agencies or sharing these data with the public is not possible: one or both of these actions may break promises of confidentiality that have been given to data subjects. We describe an approach based on two stages of multiple imputation that facilitates data sharing and dissemination under restrictions of confidentiality. We present new inferential methods that properly account for the uncertainty caused by the two stages of imputation. We illustrate the approach using artificial and genuine data. [source]
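The two-stage approach in the abstract above builds on standard multiple-imputation combining rules. As background, here is a sketch of the usual single-stage Rubin's rules, which the paper's new inferential methods generalize to two stages (the function name and the numbers below are made up for illustration): the combined point estimate is the average of the per-imputation estimates, and the total variance adds the average within-imputation variance to an inflated between-imputation variance.

```python
import numpy as np

def rubin_combine(estimates, variances):
    """Combine point estimates and within-imputation variances
    from m completed datasets using Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    q_bar = estimates.mean()          # combined point estimate
    u_bar = variances.mean()          # average within-imputation variance
    b = estimates.var(ddof=1)         # between-imputation variance
    t = u_bar + (1 + 1 / m) * b       # total variance of q_bar
    return q_bar, t

# Hypothetical estimates and variances from m = 3 completed datasets.
q, t = rubin_combine([10.1, 9.8, 10.3], [0.25, 0.30, 0.27])
```

The between-imputation term b is what carries the extra uncertainty created by imputing missing or withheld values; with two stages of imputation, as in the paper, the variance decomposition acquires an additional between-stage component.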