Censored Data


Selected Abstracts


Relationships Among Tests for Censored Data

BIOMETRICAL JOURNAL, Issue 3 2005
Emilio Letón
Abstract In this paper we give the relationships of several score tests and weighted tests for right-censored data with other classical tests. Special care is taken with the case of ties and with the kind of variance estimation used. After the description of ten tests, a comparative study is made among them. Finally, an application to a real example is included. (© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
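As background for the comparisons described above, the classical two-sample log-rank statistic, which most score and weighted tests for censored data generalize, can be sketched in a few lines of Python. This is a generic illustration only, not any of the paper's ten specific tests:

```python
def logrank_statistic(times, events, groups):
    """Two-sample log-rank chi-square statistic.

    times:  observed times; events: 1 = failure, 0 = censored;
    groups: 0/1 group labels.  At each distinct event time, the observed
    group-1 deaths are compared with their hypergeometric expectation.
    """
    event_times = sorted(set(t for t, e in zip(times, events) if e))
    o_minus_e = 0.0
    var = 0.0
    for u in event_times:
        n = sum(1 for t in times if t >= u)                      # at risk
        n1 = sum(1 for t, g in zip(times, groups) if t >= u and g == 1)
        d = sum(1 for t, e in zip(times, events) if e and t == u)
        d1 = sum(1 for t, e, g in zip(times, events, groups)
                 if e and t == u and g == 1)
        if n < 2:
            continue  # no variance contribution from a single subject
        e1 = d * n1 / n                                          # expected
        v = d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)      # variance
        o_minus_e += d1 - e1
        var += v
    return o_minus_e ** 2 / var
```

Under the null, the statistic is approximately chi-square with one degree of freedom; swapping the group labels leaves it unchanged.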


Asymptotic Distribution of Score Statistics for Spatial Cluster Detection with Censored Data

BIOMETRICS, Issue 4 2008
Daniel Commenges
Summary Cook, Gold, and Li (2007, Biometrics 63, 540–549) extended the Kulldorff (1997, Communications in Statistics 26, 1481–1496) scan statistic for spatial cluster detection to survival-type observations. Their approach was based on the score statistic, and they proposed a permutation distribution for the maximum of score tests. The score statistic makes it possible to apply the scan statistic idea to models including explanatory variables. However, we show that the permutation distribution requires strong assumptions of independence between the potential cluster and both the censoring and the explanatory variables. In contrast, we present an approach using the asymptotic distribution of the maximum of score statistics in a manner not requiring these assumptions. [source]


On Estimating Medical Cost and Incremental Cost-Effectiveness Ratios with Censored Data

BIOMETRICS, Issue 4 2001
Hongwei Zhao
Summary. Medical cost estimation is very important to health care organizations and health policy makers. We consider cost-effectiveness analysis for competing treatments in a staggered-entry, survival-analysis-based clinical trial. We propose a method for estimating mean medical cost over patients in such settings. The proposed estimator is shown to be consistent and asymptotically normal, and its asymptotic variance can be obtained. In addition, we propose a method for estimating the incremental cost-effectiveness ratio and for obtaining a confidence interval for it. Simulation experiments are conducted to evaluate our proposed methods. Finally, we apply our methods to a clinical trial comparing the cost effectiveness of implanted cardiac defibrillators with conventional therapy for individuals at high risk for ventricular arrhythmias. [source]


A Semiparametric Estimate of Treatment Effects with Censored Data

BIOMETRICS, Issue 3 2001
Ronghui Xu
Summary. A semiparametric estimate of an average regression effect with right-censored failure time data has recently been proposed under the Cox-type model where the regression effect β(t) is allowed to vary with time. In this article, we derive a simple algebraic relationship between this average regression effect and a measurement of group differences in K-sample transformation models when the random error belongs to the Gρ family of Harrington and Fleming (1982, Biometrika 69, 553–566), the latter being equivalent to the conditional regression effect in a gamma frailty model. The models considered here are suitable for the attenuating hazard ratios that often arise in practice. The results reveal an interesting connection among the above three classes of models as alternatives to the proportional hazards assumption and add to our understanding of the behavior of the partial likelihood estimate under nonproportional hazards. The algebraic relationship provides a simple estimator under the transformation model. We develop a variance estimator based on the empirical influence function that is much easier to compute than the previously suggested resampling methods. When there is truncation in the right tail of the failure times, we propose a method of bias correction to improve the coverage properties of the confidence intervals. The estimate, its estimated variance, and the bias correction term can all be calculated with minor modifications to standard software for proportional hazards regression. [source]


Generalized additive models for location, scale and shape

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 3 2005
R. A. Rigby
Summary. A general class of statistical models for a univariate response variable is presented which we call the generalized additive model for location, scale and shape (GAMLSS). The model assumes independent observations of the response variable y given the parameters, the explanatory variables and the values of the random effects. The distribution for the response variable in the GAMLSS can be selected from a very general family of distributions including highly skew or kurtotic continuous and discrete distributions. The systematic part of the model is expanded to allow modelling not only of the mean (or location) but also of the other parameters of the distribution of y, as parametric and/or additive nonparametric (smooth) functions of explanatory variables and/or random-effects terms. Maximum (penalized) likelihood estimation is used to fit the (non)parametric models. A Newton–Raphson or Fisher scoring algorithm is used to maximize the (penalized) likelihood. The additive terms in the model are fitted by using a backfitting algorithm. Censored data are easily incorporated into the framework. Five data sets from different fields of application are analysed to emphasize the generality of the GAMLSS class of models. [source]
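The backfitting step mentioned above can be illustrated with a deliberately minimal sketch. Here the "smoother" is just a least-squares line, standing in for the penalized smoothers GAMLSS actually uses, and only a single location parameter is fitted; the function names are for illustration:

```python
def smooth_linear(x, r):
    """Minimal stand-in smoother: least-squares line through (x, r)."""
    n = len(x)
    mx = sum(x) / n
    mr = sum(r) / n
    b = (sum((xi - mx) * (ri - mr) for xi, ri in zip(x, r))
         / sum((xi - mx) ** 2 for xi in x))
    return [mr + b * (xi - mx) for xi in x]

def backfit(y, xs, n_iter=50):
    """Backfitting for an additive model y ~ alpha + sum_j f_j(x_j).

    Each pass re-smooths the partial residuals against one covariate at a
    time; components are centered so the intercept stays identifiable.
    """
    n = len(y)
    alpha = sum(y) / n
    fs = [[0.0] * n for _ in xs]
    for _ in range(n_iter):
        for j, xj in enumerate(xs):
            partial = [y[i] - alpha
                       - sum(fs[k][i] for k in range(len(xs)) if k != j)
                       for i in range(n)]
            fj = smooth_linear(xj, partial)
            mean_fj = sum(fj) / n
            fs[j] = [v - mean_fj for v in fj]   # center each component
    return alpha, fs
```

With an exactly additive response the cycle converges geometrically to the exact fit; with real smoothers the same loop applies unchanged.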


Interactions between genetic and reproductive factors in breast cancer risk in a population-based sample of African-American families

GENETIC EPIDEMIOLOGY, Issue 4 2002
Valérie Chaudru
Abstract Incidence of breast cancer (BC) varies among ethnic groups, with higher rates in white than in African-American women. Until now, most epidemiological and genetic studies have been carried out in white women. To investigate whether interactions between genetic and reproductive risk factors may explain part of the ethnic disparity in BC incidence, a genetic epidemiology study was conducted, between 1989 and 1994, at the Howard University Cancer Center (Washington, DC), which led to the recruitment of 245 African-American families. Segregation analysis of BC was performed by use of the class D regressive logistic model that allows for censored data to account for a variable age of onset of disease, as implemented in the REGRESS program. Segregation analysis of BC was consistent with a putative dominant gene effect (P < 0.000001) and residual sister-dependence (P < 0.0001). This putative gene was found to interact significantly with age at menarche (P = 0.048), and an interaction with a history of spontaneous abortions was suggested (P = 0.08). A late age at menarche increased BC risk in gene carriers but had a protective effect in non-gene carriers. A history of spontaneous abortions had a protective effect in gene carriers and increased BC risk in non-gene carriers. Our findings agree partially with a similar analysis of French families showing a significant gene × parity interaction and a suggestive gene × age at menarche interaction. Investigating gene × risk factor interactions in different populations may have important implications for further biological investigations and for BC risk assessment. Genet. Epidemiol. 22:285–297, 2002. © 2002 Wiley-Liss, Inc. [source]


Geostatistical Simulation for the Assessment of Regional Soil Pollution

GEOGRAPHICAL ANALYSIS, Issue 2 2010
Marc Van Meirvenne
Regional scale inventories of heavy metal concentrations in soil increasingly are being done to evaluate their global patterns of variation. Sometimes these global pattern evaluations reveal information that is not identified by more detailed studies. Geostatistical methods, such as stochastic simulation, have not yet been used routinely for this purpose in spite of their potential. To investigate such a use of geostatistical methods, we analyzed a data set of 14,674 copper and 12,441 cadmium observations in the topsoil of Flanders, Belgium, covering 13,522 km2. Outliers were identified and removed, and the distributions were spatially declustered. Copper was analyzed using sequential Gaussian simulation, whereas for cadmium we used sequential indicator simulation because of the large proportion (43%) of censored data. We complemented maps of the estimated values with maps of the probability of exceeding a critical sanitation threshold for agricultural land use. These sets of maps allowed the identification of regional patterns of increased metal concentrations and provided insight into their potential causes. Mostly areas with known industrial activities (such as lead and zinc smelters) could be delineated, but the effects of shells fired during the First World War were also identified. [source]


Time to establishment success for introduced signal crayfish in Sweden – a statistical evaluation when success is partially known

JOURNAL OF APPLIED ECOLOGY, Issue 5 2010
Ullrika Sahlin
Summary 1. The signal crayfish Pacifastacus leniusculus is an invasive species in Sweden, threatening the red-listed noble crayfish Astacus astacus by spreading the crayfish plague. Time-to-event models can handle censored data on introduced populations whose state (successful or not) is only partially known at the last observation, but even though data on introduced populations are most often censored, this type of model is rarely used for likelihood-based inference and predictions of the dynamics of establishing populations. 2. We specified and fitted a probabilistic time-to-event model to predict the time to successful establishment of signal crayfish populations introduced into Sweden. Important covariates of establishment success were found by the methods of 'model averaging' and 'hierarchical partitioning', considering model uncertainty and multicollinearity, respectively. 3. The hazard function that received the highest evidence based on the empirical data showed that the chances of establishment were highest in the time periods immediately following the first introduction. The model predicts establishment success to be <50% within 5 years after first introduction over the current distributional range of signal crayfish in Sweden. 4. Among covariates related to temperature, fish species and physical properties of the habitat, the length of the growing season was the most important and consistent covariate of establishment success. We found that establishment success of signal crayfish is expected to increase with the number of days when growth is possible, and decrease with the number of days with extremely high temperatures, which approximate conditions of stress. 5. Synthesis and applications. The results demonstrate lower establishment success of signal crayfish further north in Sweden, which may decrease the incentive for additional illegal introductions that threaten the red-listed noble crayfish Astacus astacus. We provide a fully probabilistic statistical evaluation that quantifies uncertainty in the duration of the establishment stage, which is useful for management decisions on invasive species. The combination of model averaging and hierarchical partitioning provides a comprehensive way to address the multicollinearity common to retrospective data on establishment success of invasive species. [source]


Maximum likelihood estimation in semiparametric regression models with censored data

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 4 2007
D. Zeng
Summary. Semiparametric regression models play a central role in formulating the effects of covariates on potentially censored failure times and in the joint modelling of incomplete repeated measures and failure times in longitudinal studies. The presence of infinite dimensional parameters poses considerable theoretical and computational challenges in the statistical analysis of such models. We present several classes of semiparametric regression models, which extend the existing models in important directions. We construct appropriate likelihood functions involving both finite dimensional and infinite dimensional parameters. The maximum likelihood estimators are consistent and asymptotically normal with efficient variances. We develop simple and stable numerical techniques to implement the corresponding inference procedures. Extensive simulation experiments demonstrate that the inferential and computational methods proposed perform well in practical settings. Applications to three medical studies yield important new insights. We conclude that there is no reason, theoretical or numerical, not to use maximum likelihood estimation for semiparametric regression models. We discuss several areas that need further research. [source]


Refined Rank Regression Method with Censors

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, Issue 7 2004
Wendai Wang
Abstract Reliability engineers often face failure data with suspensions. The rank regression method with an approach introduced by Johnson has been commonly used to handle data with suspensions in engineering practice and commercial software. However, the Johnson method makes only partial use of suspension information: the positions of suspensions, not the exact times to suspension. A new approach for rank regression with censored data is proposed in this paper, which makes full use of suspension information. Taking advantage of the parametric approach, the refined rank regression obtains the 'exact' mean order number for each failure point in the sample. With the 'exact' mean order number, the proposed method gives the 'best' fit to sample data for an assumed times-to-failure distribution. This refined rank regression is simple to implement and appears to have good statistical and convergence properties. An example is provided to illustrate the proposed method. Copyright © 2004 John Wiley & Sons, Ltd. [source]
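Johnson's mean order numbers, the position-only baseline that the refinement above improves on, follow a standard textbook recursion: each failure's adjusted rank increases by (N + 1 − previous rank) / (1 + number of units at or beyond the failure). A short sketch (the classical method, not the paper's refined version):

```python
def johnson_mean_order(times, events):
    """Johnson's adjusted (mean) order numbers for failures among suspensions.

    times: unit times; events: 1 = failure, 0 = suspension.
    Returns a list of (failure_time, mean_order_number) pairs.
    """
    order = sorted(zip(times, events))
    n = len(order)
    ranks = []
    ar = 0.0                                 # previous adjusted rank
    for i, (t, e) in enumerate(order):
        if e:
            beyond = n - i                   # current unit and all later ones
            ar = ar + (n + 1 - ar) / (1 + beyond)
            ranks.append((t, ar))
    return ranks
```

For the textbook pattern failure–suspension–failure–failure at times 10, 20, 30, 40, this gives mean order numbers 1, 7/3 and 11/3; with no suspensions it reduces to the ordinary ranks 1, 2, 3, ….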


Effective directed tests for models with ordered categorical data

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2003
Arthur Cohen
Summary This paper offers a new method for testing one-sided hypotheses in discrete multivariate data models. One-sided alternatives mean that there are restrictions on the multidimensional parameter space. The focus is on models dealing with ordered categorical data. In particular, applications are concerned with R×C contingency tables. The method has advantages over other general approaches. All tests are exact in the sense that no large sample theory or large sample distribution theory is required. Testing is unconditional although its execution is done conditionally, section by section, where a section is determined by marginal totals. This eliminates any potential nuisance parameter issues. The power of the tests is more robust than the power of the typical linear tests often recommended. Furthermore, computer programs are available to carry out the tests efficiently regardless of the sample sizes or the order of the contingency tables. Both censored data and uncensored data models are discussed. [source]


The Concordance Index C and the Mann–Whitney Parameter Pr(X>Y) with Randomly Censored Data

BIOMETRICAL JOURNAL, Issue 3 2009
James A. Koziol
Abstract Harrell's c-index or concordance C has been widely used as a measure of separation of two survival distributions. In the absence of censored data, the c-index estimates the Mann–Whitney parameter Pr(X>Y), which has been repeatedly utilized in various statistical contexts. In the presence of randomly censored data, the c-index no longer estimates Pr(X>Y); rather, it estimates a parameter that involves the underlying censoring distributions. This is in contrast to Efron's maximum likelihood estimator of the Mann–Whitney parameter, which is recommended in the setting of random censorship. [source]
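A minimal version of Harrell's c for right-censored data can be sketched as follows: a pair is usable only when the earlier observed time is an event, and concordance means the higher risk score belongs to the subject who failed first. With no censoring this reduces to a Mann–Whitney-type proportion; ties in observed times are simply skipped in this sketch:

```python
def harrell_c(times, events, risk):
    """Harrell's concordance index for right-censored data.

    times: observed times; events: 1 = event, 0 = censored;
    risk: risk scores (higher risk should mean earlier failure).
    """
    conc = ties = usable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            # order the pair so `a` has the smaller observed time
            a, b = (i, j) if times[i] < times[j] else (j, i)
            if times[a] == times[b]:
                continue          # tied times: skipped in this sketch
            if not events[a]:
                continue          # earlier time censored: pair unusable
            usable += 1
            if risk[a] > risk[b]:
                conc += 1
            elif risk[a] == risk[b]:
                ties += 1         # tied scores count half
    return (conc + 0.5 * ties) / usable
```

A perfectly ordered risk score gives c = 1; a perfectly reversed one gives c = 0; an uninformative score hovers near 0.5.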


Exact, Distribution Free Confidence Intervals for Late Effects in Censored Matched Pairs

BIOMETRICAL JOURNAL, Issue 1 2009
Shoshana R. Daniel
Abstract When comparing censored survival times for matched treated and control subjects, a late effect on survival is one that does not begin to appear until some time has passed. In a study of provider specialty in the treatment of ovarian cancer, a late divergence in the Kaplan–Meier survival curves hinted at superior survival among patients of gynecological oncologists, who employ chemotherapy less intensively, compared with patients of medical oncologists, who employ chemotherapy more intensively; we ask whether this late divergence should be taken seriously. Specifically, we develop exact permutation tests, and exact confidence intervals formed by inverting the tests, for late effects in matched pairs subject to random but heterogeneous censoring. Unlike other exact confidence intervals with censored data, the proposed intervals do not require knowledge of censoring times for patients who die. Exact distributions are consequences of two results about signs, signed ranks, and their conditional independence properties. One test, the late effects sign test, has the binomial distribution; the other, the late effects signed rank test, uses nonstandard ranks but nonetheless has the same exact distribution as Wilcoxon's signed rank test. A simulation shows that the late effects signed rank test has substantially more power to detect late effects than do conventional tests. The confidence statement provides information about both the timing and magnitude of late effects. (© 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
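The binomial null distribution of the late effects sign test can be illustrated with a generic exact sign test; note that the paper's specific construction of which matched pairs count as informative after the landmark time is not reproduced here, only the exact binomial calculation once the informative pairs are in hand:

```python
from math import comb

def sign_test_pvalue(wins, n):
    """Exact two-sided binomial sign test p-value.

    wins: number of informative pairs favoring treatment;
    n: total informative pairs; H0: each pair is a fair coin (p = 1/2).
    """
    k = min(wins, n - wins)
    # one-tail probability of an outcome at least this extreme, doubled
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For example, 8 wins out of 10 informative pairs gives a two-sided p-value of about 0.109, while an even 5-of-10 split gives p = 1.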


Nonparametric Modeling of the Mean Survival Time in a Multi-factor Design Based on Randomly Right-Censored Data

BIOMETRICAL JOURNAL, Issue 5 2004
M. H. Rahbar
Abstract Statistical procedures for the assessment of interventions or treatments based on medical data often involve complexities due to incompleteness of the available data, as a result of drop-out or the inability to complete follow-up until the endpoint of interest. In this article we propose a nonparametric regression model based on censored data for investigating the simultaneous effects of two or more factors. Specifically, we assess the effect of a treatment (dose) and a covariate (e.g., age categories) on the mean survival time of subjects assigned to combinations of the levels of these factors. The proposed method allows for varying levels of censorship in the outcome among different groups of subjects at different levels of the independent variables (factors). We derive the asymptotic distribution of the estimators of the parameters in our model, which then allows for statistical inference. Finally, through a simulation study we assess the effect of the censoring rates on the standard error of these types of estimators. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
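A common nonparametric handle on mean survival time under right censoring, the quantity being modeled above, is the restricted mean: the area under the Kaplan–Meier curve up to a horizon tau. This sketch illustrates that quantity for a single group, not the authors' multi-factor estimator:

```python
def rmst(times, events, tau):
    """Restricted mean survival time up to tau: the area under the
    Kaplan-Meier curve, integrated as a step function.

    times: observed times; events: 1 = death, 0 = censored.
    """
    s = 1.0       # current KM survival estimate
    area = 0.0
    prev = 0.0
    for u in sorted(set(t for t, e in zip(times, events) if e and t <= tau)):
        area += s * (u - prev)               # flat piece before the drop
        at_risk = sum(1 for t in times if t >= u)
        d = sum(1 for t, e in zip(times, events) if e and t == u)
        s *= 1 - d / at_risk                 # KM step at event time u
        prev = u
    area += s * (tau - prev)                 # final flat piece up to tau
    return area
```

With no censoring and tau beyond the last death, this reduces to the ordinary sample mean of the survival times.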


Regression Analysis of Doubly Censored Failure Time Data Using the Additive Hazards Model

BIOMETRICS, Issue 3 2004
Liuquan Sun
Summary Doubly censored failure time data arise when the survival time of interest is the elapsed time between two related events and observations on occurrences of both events could be censored. Regression analysis of doubly censored data has recently attracted considerable attention, and a few methods have been proposed for it (Kim et al., 1993, Biometrics 49, 13–22; Sun et al., 1999, Biometrics 55, 909–914; Pan, 2001, Biometrics 57, 1245–1250). However, all of these methods are based on the proportional hazards model, and it is well known that the proportional hazards model sometimes may not fit failure time data well. This article investigates regression analysis of such data using the additive hazards model, and an estimating equation approach is proposed for inference about the regression parameters of interest. The proposed method can be easily implemented, and the properties of the proposed estimates of the regression parameters are established. The method is applied to a set of doubly censored data from an AIDS cohort study. [source]


Time-Dependent ROC Curves for Censored Survival Data and a Diagnostic Marker

BIOMETRICS, Issue 2 2000
Patrick J. Heagerty
Summary. ROC curves are a popular method for displaying the sensitivity and specificity of a continuous marker, X, for a binary disease variable, D. However, many disease outcomes are time dependent, D(t), and ROC curves that vary as a function of time may be more appropriate. A common example of a time-dependent variable is vital status, where D(t) = 1 if a patient has died prior to time t and zero otherwise. We propose summarizing the discrimination potential of a marker X, measured at baseline (t = 0), by calculating ROC curves for cumulative disease or death incidence by time t, which we denote as ROC(t). A typical complexity with survival data is that observations may be censored. Two ROC curve estimators are proposed that can accommodate censored data. A simple estimator is based on using the Kaplan–Meier estimator for each possible subset X > c. However, this estimator does not guarantee the necessary condition that sensitivity and specificity are monotone in X. An alternative estimator that does guarantee monotonicity is based on a nearest neighbor estimator for the bivariate distribution function of (X, T), where T represents survival time (Akritas, M. J., 1994, Annals of Statistics 22, 1299–1327). We present an example where ROC(t) is used to compare a standard and a modified flow cytometry measurement for predicting survival after detection of breast cancer, and an example where the ROC(t) curve displays the impact of modifying eligibility criteria for sample size and power in HIV prevention trials. [source]
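The simple Kaplan–Meier-based estimator described above can be sketched by applying Bayes' rule to KM curves for the whole sample and for the subset X > c; the function names here are illustrative, and, as the abstract warns, this naive version need not be monotone in the cutoff:

```python
def kaplan_meier(times, events, t):
    """Kaplan-Meier estimate of S(t) = P(T > t)."""
    s = 1.0
    for u in sorted(set(tt for tt, e in zip(times, events) if e and tt <= t)):
        at_risk = sum(1 for tt in times if tt >= u)
        d = sum(1 for tt, e in zip(times, events) if e and tt == u)
        s *= 1.0 - d / at_risk
    return s

def roc_t(marker, times, events, t, c):
    """Naive KM-based (sensitivity, specificity) at horizon t, cutoff c.

    sens = P(X > c | T <= t) and spec = P(X <= c | T > t), obtained via
    Bayes' rule from the marginal KM curve and the KM curve among X > c.
    """
    n = len(marker)
    p_high = sum(1 for x in marker if x > c) / n            # P(X > c)
    s_all = kaplan_meier(times, events, t)                  # S(t), everyone
    high = [(tt, e) for x, tt, e in zip(marker, times, events) if x > c]
    s_high = kaplan_meier([tt for tt, _ in high],
                          [e for _, e in high], t)          # S(t | X > c)
    sens = (1.0 - s_high) * p_high / (1.0 - s_all)
    spec = 1.0 - s_high * p_high / s_all
    return sens, spec
```

Sweeping c over the observed marker values and plotting (1 − spec, sens) traces ROC(t) for a fixed horizon t.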


A Multiple Imputation Approach to Cox Regression with Interval-Censored Data

BIOMETRICS, Issue 1 2000
Wei Pan
Summary. We propose a general semiparametric method based on multiple imputation for Cox regression with interval-censored data. The method consists of iterating the following two steps. First, from finite-interval-censored (but not right-censored) data, exact failure times are imputed using Tanner and Wei's poor man's or asymptotic normal data augmentation scheme based on the current estimates of the regression coefficient and the baseline survival curve. Second, a standard statistical procedure for right-censored data, such as the Cox partial likelihood method, is applied to imputed data to update the estimates. Through simulation, we demonstrate that the resulting estimate of the regression coefficient and its associated standard error provide a promising alternative to the nonparametric maximum likelihood estimate. Our proposal is easily implemented by taking advantage of existing computer programs for right-censored data. [source]
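The imputation step of the iteration can be sketched as inverse-transform sampling of an exact failure time from a current estimated survival curve, conditioned on the censoring interval. This is a simplified stand-in for Tanner and Wei's augmentation scheme (the function name and discrete-support representation are assumptions of the sketch); the drawn times would then be fed to any standard Cox fitter and the loop repeated:

```python
import random

def impute_failure_time(left, right, surv_times, surv_probs, rng=random):
    """Draw an exact failure time T in (left, right] from a discrete
    survival curve: P(T = surv_times[k]) = S(t_{k-1}) - S(t_k).

    surv_times: increasing support points; surv_probs: S(t) at each point.
    Sampling is inverse-transform, conditioned on left < T <= right.
    """
    masses, points = [], []
    prev = 1.0
    for t, s in zip(surv_times, surv_probs):
        if left < t <= right:
            masses.append(prev - s)      # probability mass at this point
            points.append(t)
        prev = s
    total = sum(masses)
    u = rng.random() * total             # condition on the interval
    acc = 0.0
    for t, m in zip(points, masses):
        acc += m
        if u <= acc:
            return t
    return points[-1]
```

When the censoring interval brackets a single support point the draw is deterministic, which makes the behavior easy to check.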


Improved prediction of recurrence after curative resection of colon carcinoma using tree-based risk stratification

CANCER, Issue 5 2004
Martin Radespiel-Tröger M.D.
Abstract BACKGROUND Patients who are at high risk of recurrence after undergoing curative (R0) resection for colon carcinoma may benefit most from adjuvant treatment and from intensive follow-up for early detection and treatment of recurrence. However, in light of new clinical evidence, there is a need for continuous improvement in the calculation of the risk of recurrence. METHODS Six hundred forty-one patients with R0-resected colon carcinoma who underwent surgery between January 1, 1984 and December 31, 1996 were recruited from the Erlangen Registry of Colorectal Carcinoma. The study end point was time until first locoregional or distant recurrence. The factors analyzed were: age, gender, site in colon, International Union Against Cancer (UICC) pathologic tumor classification (pT), UICC pathologic lymph node classification, histologic tumor type, malignancy grade, lymphatic invasion, venous invasion, number of examined lymph nodes, number of lymph node metastases, emergency presentation, intraoperative tumor cell spillage, surgeon, and time period. The resulting prognostic tree was evaluated by means of an independent sample using a measure of predictive accuracy based on the Brier score for censored data. Predictive accuracy was compared with several proposed stage groupings. RESULTS The prognostic tree contained the following variables: pT, the number of lymph node metastases, venous invasion, and emergency presentation. Predictive accuracy based on the validation sample was 0.230 (95% confidence interval [95% CI], 0.227–0.233) for the prognostic tree and 0.212 (95% CI, 0.209–0.215) for the UICC TNM sixth edition stage grouping. CONCLUSIONS The prognostic tree showed superior predictive accuracy when it was validated using an independent sample. It is interpreted easily and may be applied under clinical circumstances.
Provided that their classification system can be validated successfully in other centers, the authors propose using the prognostic tree as a starting point for studies of adjuvant treatment and follow-up strategies. Cancer 2004;100:958–67. © 2004 American Cancer Society. [source]
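The Brier score for censored data used above reweights squared prediction errors by the Kaplan–Meier estimate of the censoring distribution, in the style of Graf et al. A sketch (ignoring ties between event and censoring times; `pred_surv` are model-predicted probabilities of surviving past the horizon):

```python
def km(times, events, t):
    """Kaplan-Meier estimate of P(> t) for the indicated event type."""
    s = 1.0
    for u in sorted(set(tt for tt, e in zip(times, events) if e and tt <= t)):
        at_risk = sum(1 for tt in times if tt >= u)
        d = sum(1 for tt, e in zip(times, events) if e and tt == u)
        s *= 1 - d / at_risk
    return s

def brier_censored(times, events, pred_surv, t):
    """Inverse-probability-of-censoring-weighted Brier score at horizon t.

    Squared error between the status indicator 1{T > t} and pred_surv,
    weighted by the KM estimate G of the censoring distribution
    (censoring indicator is 1 - delta).
    """
    cens = [1 - e for e in events]
    n = len(times)
    total = 0.0
    for ti, ei, pi in zip(times, events, pred_surv):
        if ti <= t and ei:                   # died before t: true status 0
            total += (0.0 - pi) ** 2 / km(times, cens, ti)
        elif ti > t:                         # still at risk: true status 1
            total += (1.0 - pi) ** 2 / km(times, cens, t)
        # subjects censored before t get weight zero
    return total / n
```

With no censoring the weights are all one and the score reduces to the ordinary mean squared error of the predicted survival probabilities.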