Type I Error Rates

Selected Abstracts


STATISTICAL ANALYSIS OF DIVERSIFICATION WITH SPECIES TRAITS

EVOLUTION, Issue 1 2005
Emmanuel Paradis
Abstract Testing whether some species traits have a significant effect on diversification rates is central in the assessment of macroevolutionary theories. However, we still lack a powerful method to tackle this objective. I present a new method for the statistical analysis of diversification with species traits. The required data are observations of the traits on recent species, the phylogenetic tree of these species, and reconstructions of ancestral values of the traits. Several traits, either continuous or discrete, and in some cases their interactions, can be analyzed simultaneously. The parameters are estimated by the method of maximum likelihood. The statistical significance of the effects in a model can be tested with likelihood ratio tests. A simulation study showed that past random extinction events do not affect the Type I error rate of the tests, whereas statistical power is decreased, though some power is retained if the effect of the simulated trait on speciation is strong. The use of the method is illustrated by the analysis of published data on primates. The analysis of these data showed that the apparent overall positive relationship between body mass and species diversity is actually an artifact due to a clade-specific effect. Within each clade the effect of body mass on speciation rate was in fact negative. The present method allows both effects (clade and body mass) to be taken into account simultaneously. [source]


Hierarchical Logistic Regression: Accounting for Multilevel Data in DIF Detection

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 3 2010
Brian F. French
The purpose of this study was to examine the performance of differential item functioning (DIF) assessment in the presence of a multilevel structure that often underlies data from large-scale testing programs. Analyses were conducted using logistic regression (LR), a popular, flexible, and effective tool for DIF detection. Data were simulated using a hierarchical framework, such as might be seen when examinees are clustered in schools, for example. Both standard and hierarchical LR (HLR, which accounts for the multilevel data) approaches to DIF detection were employed. Results highlight the differences in DIF detection rates when the analytic strategy matches the data structure. Specifically, when the grouping variable was within clusters, LR and HLR performed similarly in terms of Type I error control and power. However, when the grouping variable was between clusters, LR failed to maintain the nominal Type I error rate of .05, whereas HLR was able to maintain this rate. However, power for HLR tended to be low under many conditions in the between-cluster variable case. [source]
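
To make the clustering issue concrete, here is a minimal simulation sketch (my own toy setup, not the study's design): a binary response with a school-level random effect and a grouping variable assigned at the cluster level; a naive test that ignores the clusters rejects a true null far more often than the nominal 5%.

```python
import numpy as np
from scipy import stats

# Hypothetical illustration: clustered binary responses with a school-level random
# effect; the focal/reference grouping is constant within clusters, and the naive
# two-proportion test ignores the cluster structure. All settings are assumptions.
rng = np.random.default_rng(1)

def one_replicate(n_clusters=40, cluster_size=25, icc_sd=0.5):
    group = np.repeat(np.arange(n_clusters) % 2, cluster_size)      # between-cluster grouping
    u = np.repeat(rng.normal(0, icc_sd, n_clusters), cluster_size)  # cluster random effect
    p = 1 / (1 + np.exp(-u))                                        # no true group effect (H0)
    y = rng.binomial(1, p)
    # Naive two-sample z-test on proportions, ignoring clustering
    p1, p0 = y[group == 1].mean(), y[group == 0].mean()
    n1, n0 = (group == 1).sum(), (group == 0).sum()
    pooled = y.mean()
    se = np.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    return 2 * stats.norm.sf(abs(z)) < 0.05

rejections = np.mean([one_replicate() for _ in range(2000)])
print(f"Empirical Type I error of the naive test: {rejections:.3f} (nominal 0.05)")
```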


Statistical Properties of the K-Index for Detecting Answer Copying

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 2 2002
Leonardo S. Sotaridona
We investigated the statistical properties of the K-index (Holland, 1996) that can be used to detect copying behavior on a test. A simulation study was conducted to investigate the applicability of the K-index for small, medium, and large datasets. Furthermore, the Type I error rate and the detection rate of this index were compared with those of the copying index ω (Wollack, 1997). Several approximations were used to calculate the K-index. Results showed that all approximations were able to hold the Type I error rates below the nominal level. Results further showed that using ω resulted in higher detection rates than the K-indices for small and medium sample sizes (100 and 500 simulees). [source]


COMPARISON OF METHODS FOR ANALYZING REPLICATED PREFERENCE TESTS

JOURNAL OF SENSORY STUDIES, Issue 6 2005
CHUN-YEN CHANG COCHRANE
ABSTRACT Preference testing is commonly used in consumer sensory evaluation. Traditionally, it is done without replication, effectively leading to a single 0/1 (binary) measurement on each panelist. However, to understand the nature of the preference, replicated preference tests are a better approach, resulting in binomial counts of preferences on each panelist. Variability among panelists then leads to overdispersion of the counts when the binomial model is used and to an inflated Type I error rate for statistical tests of preference. Overdispersion can be adjusted by Pearson correction or by other models such as correlated binomial or beta-binomial. Several methods are suggested or reviewed in this study for analyzing replicated preference tests and their Type I error rates and power are compared. Simulation studies show that all methods have reasonable Type I error rates and similar power. Among them, the binomial model with Pearson adjustment is probably the safest way to analyze replicated preference tests, while a normal model in which the binomial distribution is not assumed is the easiest. [source]
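
As a rough illustration of the Pearson adjustment mentioned above, here is a sketch (illustrative numbers and a generic quasi-binomial correction, not the paper's exact procedures): replicated preferences are simulated with panelist heterogeneity, a dispersion factor is estimated from per-panelist Pearson residuals, and the binomial test's standard error is inflated accordingly.

```python
import numpy as np
from scipy import stats

# Each of n panelists gives r binary preferences; panelist-to-panelist variability
# overdisperses the counts relative to a simple binomial model. Settings are assumptions.
rng = np.random.default_rng(2)
n_panelists, r = 60, 4
panelist_p = rng.beta(2, 2, n_panelists)     # heterogeneous true preferences (mean 0.5 under H0)
counts = rng.binomial(r, panelist_p)         # preferences for product A per panelist

p_hat = counts.sum() / (n_panelists * r)

# Pearson dispersion factor estimated from per-panelist residuals
chi2 = np.sum((counts - r * p_hat) ** 2 / (r * p_hat * (1 - p_hat)))
dispersion = chi2 / (n_panelists - 1)

# Naive binomial z-test vs. the overdispersion-adjusted test of H0: p = 0.5
se_naive = np.sqrt(0.5 * 0.5 / (n_panelists * r))
se_adj = se_naive * np.sqrt(dispersion)
z_naive = (p_hat - 0.5) / se_naive
z_adj = (p_hat - 0.5) / se_adj
print(f"dispersion = {dispersion:.2f}")
print(f"naive p = {2 * stats.norm.sf(abs(z_naive)):.3f}, adjusted p = {2 * stats.norm.sf(abs(z_adj)):.3f}")
```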


Assessing teratogenicity of antiretroviral drugs: monitoring and analysis plan of the Antiretroviral Pregnancy Registry,,

PHARMACOEPIDEMIOLOGY AND DRUG SAFETY, Issue 8 2004
Deborah L. Covington DrPH
Abstract This paper describes the Antiretroviral Pregnancy Registry's (APR) monitoring and analysis plan. APR is overseen by a committee of experts in obstetrics, pediatrics, teratology, infectious diseases, epidemiology and biostatistics from academia, government and the pharmaceutical industry. APR uses a prospective exposure-registration cohort design. Clinicians voluntarily register pregnant women with prenatal exposures to any antiretroviral therapy and provide fetal/neonatal outcomes. A birth defect is any birth outcome of ≥20 weeks gestation with a structural or chromosomal abnormality as determined by a geneticist. The prevalence is calculated by dividing the number of defects by the total number of live births and is compared to the prevalence in the CDC's population-based surveillance system. Additionally, first trimester exposures, during which organogenesis occurs, are compared with second/third trimester exposures. Statistical inference is based on exact methods for binomial proportions. Overall, a cohort of 200 exposed newborns is required to detect a doubling of risk, with 80% power and a Type I error rate of 5%. APR uses the Rule of Three: immediate review occurs once three specific defects are reported for a specific exposure. The likelihood of finding three specific defects in a cohort of ≤600 by chance alone is less than 5% for all but the most common defects. To enhance the assurance of prompt, responsible, and appropriate action in the event of a potential signal, APR employs the strategy of 'threshold'. The threshold for action is determined by the extent of certainty about the cases, driven by statistical considerations and tempered by the specifics of the cases. Copyright © 2004 John Wiley & Sons, Ltd. [source]
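
A back-of-the-envelope check of the Rule of Three logic, using exact binomial tail probabilities (the prevalences below are illustrative assumptions, not the Registry's actual inputs):

```python
from scipy import stats

# Chance of 3 or more reports of one specific defect in a cohort of 600 exposed births,
# if reports arise purely by chance at the defect's background prevalence.
n = 600
for prevalence in (0.0005, 0.001, 0.005):            # 0.5, 1, and 5 per 1,000 live births (assumed)
    p_three_plus = stats.binom.sf(2, n, prevalence)  # P(X >= 3)
    print(f"prevalence {prevalence:.4f}: P(>=3 by chance) = {p_three_plus:.3f}")
# Only for the most common defects (here roughly 5 per 1,000) does the chance exceed 5%.
```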


Inflation of Type I error rate in multiple regression when independent variables are measured with error

THE CANADIAN JOURNAL OF STATISTICS, Issue 1 2009
Jerry Brunner
MSC 2000: Primary 62J99; secondary 62H15. Abstract When independent variables are measured with error, ordinary least squares regression can yield parameter estimates that are biased and inconsistent. This article documents an inflation of the Type I error rate that can also occur. In addition to analytic results, a large-scale Monte Carlo study shows unacceptably high Type I error rates under circumstances that could easily be encountered in practice. A set of smaller-scale simulations indicates that the problem applies to various types of regression and various types of measurement error. The Canadian Journal of Statistics 37: 33-46; 2009 © 2009 Statistical Society of Canada [source]
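
The mechanism can be reproduced with a few lines of Monte Carlo (a sketch under my own illustrative settings, not the article's design): X1 and X2 are correlated, only X1 affects Y, both are observed with additive measurement error, and the OLS t-test of the X2 coefficient rejects a true null far too often.

```python
import numpy as np
from scipy import stats

# Two correlated predictors, only X1 truly affects Y, both observed with error.
# We test H0: beta2 = 0 by OLS; a valid test would reject near 5% of the time.
rng = np.random.default_rng(3)

def one_test(n=200, rho=0.7, reliability=0.7):
    cov = [[1.0, rho], [rho, 1.0]]
    x = rng.multivariate_normal([0, 0], cov, n)           # true predictors
    y = 1.0 * x[:, 0] + rng.normal(0, 1, n)               # X2 has NO effect on Y
    err_sd = np.sqrt((1 - reliability) / reliability)     # error variance giving this reliability
    w = x + rng.normal(0, err_sd, (n, 2))                 # observed, error-contaminated predictors
    X = np.column_stack([np.ones(n), w])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - 3)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[2, 2])
    t = beta[2] / se
    return 2 * stats.t.sf(abs(t), df=n - 3) < 0.05

print("Empirical Type I error:", np.mean([one_test() for _ in range(2000)]))
```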


Blinded Sample Size Reestimation in Non-Inferiority Trials with Binary Endpoints

BIOMETRICAL JOURNAL, Issue 6 2007
Tim Friede
Abstract Sample size calculations in the planning of clinical trials depend on good estimates of the model parameters involved. When the estimates of these parameters carry a high degree of uncertainty, it is advantageous to reestimate the sample size after an internal pilot study. For non-inferiority trials with a binary outcome we compare the Type I error rate and power of fixed-size designs and designs with sample size reestimation. The latter design proves effective in correcting the sample size and power of the tests when the nuisance parameters are misspecified under the former design. (© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
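
A schematic of the blinded reestimation idea (generic Wald-type sample size formula and made-up rates, not the paper's methods): the planning sample size uses an assumed response rate, and an internal pilot supplies a pooled, blinded estimate from which the sample size is recomputed.

```python
import numpy as np
from scipy import stats

# Illustrative non-inferiority planning for a binary endpoint; all numbers are assumptions.
alpha, beta, margin = 0.025, 0.20, 0.10   # one-sided alpha, 1 - power, NI margin
z_a, z_b = stats.norm.isf(alpha), stats.norm.isf(beta)

def n_per_arm(p_assumed):
    # Wald-type sample size for H0: p_test - p_control <= -margin, assuming equal true rates
    return int(np.ceil(2 * p_assumed * (1 - p_assumed) * (z_a + z_b) ** 2 / margin ** 2))

p_planning = 0.70                          # planning assumption for the response rate
n_initial = n_per_arm(p_planning)

# Internal pilot: pooled response rate computed WITHOUT unblinding treatment arms
rng = np.random.default_rng(4)
pilot = rng.binomial(1, 0.80, size=100)    # true rate differs from the planning value
p_blinded = pilot.mean()
n_reestimated = n_per_arm(p_blinded)

print(f"initial n/arm = {n_initial}, reestimated n/arm = {n_reestimated}")
```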


PHYLOGENETICALLY NESTED COMPARISONS FOR TESTING CORRELATES OF SPECIES RICHNESS: A SIMULATION STUDY OF CONTINUOUS VARIABLES

EVOLUTION, Issue 1 2003
NICK J. B. ISAAC
Abstract Explaining the uneven distribution of species among lineages is one of the oldest questions in evolution. Proposed correlations between biological traits and species diversity are routinely tested by making comparisons between phylogenetic sister clades. Several recent studies have used nested sister-clade comparisons to test hypotheses linking continuously varying traits, such as body size, with diversity. Evaluating the findings of these studies is complicated because they differ in the index of species richness difference used, the way in which trait differences were treated, and the statistical tests employed. In this paper, we use simulations to compare the performance of four species richness indices, two choices about the branch lengths used to estimate trait values for internal nodes and two statistical tests under a range of models of clade growth and character evolution. All four indices returned appropriate Type I error rates when the assumptions of the method were met and when branch lengths were set proportional to time. Only two of the indices were robust to the different evolutionary models and to different choices of branch lengths and statistical tests. These robust indices had comparable power under one nonnull scenario. Regression through the origin was consistently more powerful than the t-test, and the choice of branch lengths exerts a strong effect on both validity and power. In the light of our simulations, we re-evaluate the findings of those who have previously used nested comparisons in the context of species richness. We provide a set of simple guidelines to maximize the performance of phylogenetically nested comparisons in tests of putative correlates of species richness. [source]


A Comparison of Item Fit Statistics for Mixed IRT Models

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 3 2010
Kyong Hee Chon
In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G2, Orlando and Thissen's S-X2 and S-G2, and Stone's χ2* and G2*. To investigate the relative performance of the fit statistics at the item level, we conducted two simulation studies: Type I error and power studies. We evaluated the performance of the item fit indices for various conditions of test length, sample size, and IRT models. Among the competing measures, the summed score-based indices S-X2 and S-G2 were found to be the sensible and efficient choice for assessing model fit for mixed format data. These indices performed well, particularly with short tests. The pseudo-observed score indices, χ2* and G2*, showed inflated Type I error rates in some simulation conditions. Consistent with the findings of current literature, PARSCALE's G2 index was rarely useful, although it provided reasonable results for long tests. [source]


EXTENSIONS OF THE STANDARDIZED CROSS-SECTIONAL APPROACH TO SHORT-HORIZON EVENT STUDIES

THE JOURNAL OF FINANCIAL RESEARCH, Issue 4 2007
Ronald Bremer
Abstract Strong evidence indicates that short-horizon event-induced abnormal returns and volatility vary significantly over event days. Event-study methods that assume constant event-induced abnormal returns and volatility over event days have potentially inflated Type I error rates and poor test power. Our simple extensions of the Boehmer, Musumeci, and Poulsen (1991) approach scale abnormal returns with conditional variance, which is estimated with GARCH(1,1) and an indicator of the event in a two-stage estimation. Our method improves on the Boehmer, Musumeci, and Poulsen approach in model specification and test power, even under challenging event-induced mean and volatility structures, and could standardize short-horizon event studies. [source]


Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates: Focus on the False Discovery Rate and Simulation Study

BIOMETRICAL JOURNAL, Issue 5 2008
Sandrine Dudoit
Abstract This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(Vn, Sn) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(Vn, Sn)], for arbitrary functions g(Vn, Sn) of the numbers of false positives Vn and true positives Sn. Of particular interest are error rates based on the proportion g(Vn, Sn) = Vn/(Vn + Sn) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[Vn/(Vn + Sn)]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
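
For reference, a sketch of the classical Benjamini and Hochberg (1995) linear step-up comparator mentioned above (not the authors' resampling-based empirical Bayes procedure), together with an empirical check that FDR = E[V/(V + S)] stays near the target under independent test statistics; the simulation settings are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

def bh_reject(pvals, q=0.05):
    # Benjamini-Hochberg linear step-up procedure at FDR level q
    m = len(pvals)
    order = np.argsort(pvals)
    thresh = q * np.arange(1, m + 1) / m
    below = pvals[order] <= thresh
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

fdps = []
for _ in range(500):
    m, m1 = 1000, 100
    is_alt = np.arange(m) < m1
    z = rng.normal(0, 1, m) + 3.0 * is_alt     # alternatives shifted by 3 SDs (assumed)
    p = 2 * stats.norm.sf(np.abs(z))
    rej = bh_reject(p, q=0.05)
    v = np.sum(rej & ~is_alt)                  # false positives V
    s = np.sum(rej & is_alt)                   # true positives S
    fdps.append(v / max(v + s, 1))
print(f"Estimated FDR: {np.mean(fdps):.3f} (target 0.05)")
```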


A Bayesian Chi-Squared Goodness-of-Fit Test for Censored Data Models

BIOMETRICS, Issue 2 2010
Jing Cao
Summary We propose a Bayesian chi-squared model diagnostic for analysis of data subject to censoring. The test statistic has the form of Pearson's chi-squared test statistic and is easy to calculate from standard output of Markov chain Monte Carlo algorithms. The key innovation of this diagnostic is that it is based only on observed failure times. Because it does not rely on the imputation of failure times for observations that have been censored, we show that under heavy censoring it can have higher power for detecting model departures than a comparable test based on the complete data. In a simulation study, we show that tests based on this diagnostic exhibit comparable power and Type I error rates closer to the nominal level than a commonly used alternative test proposed by Akritas (1988, Journal of the American Statistical Association, 83, 222-230). An important advantage of the proposed diagnostic is that it can be applied to a broad class of censored data models, including generalized linear models and other models with nonidentically distributed and nonadditive error structures. We illustrate the proposed model diagnostic for testing the adequacy of two parametric survival models for Space Shuttle main engine failures. [source]


WHY DOES A METHOD THAT FAILS CONTINUE TO BE USED? THE ANSWER

EVOLUTION, Issue 4 2009
Alan R. Templeton
It has been claimed that hundreds of researchers use nested clade phylogeographic analysis (NCPA) based on what the method promises rather than requiring objective validation of the method. The supposed failure of NCPA is based upon the argument that validating it by using positive controls ignored type I error, and that computer simulations have shown a high type I error. The first argument is factually incorrect: the previously published validation analysis fully accounted for both type I and type II errors. The simulations that indicate a 75% type I error rate have serious flaws and only evaluate outdated versions of NCPA. These outdated type I error rates fall precipitously when the 2003 version of single-locus NCPA is used or when the 2002 multilocus version of NCPA is used. It is shown that the tree-wise type I errors in single-locus NCPA can be corrected to the desired nominal level by a simple statistical procedure, and that multilocus NCPA reconstructs a simulated scenario used to discredit NCPA with 100% accuracy. Hence, NCPA is not a failed method at all, but rather has been validated both by actual data and by simulated data in a manner that satisfies the published criteria given by its critics. The critics have come to different conclusions because they have focused on the pre-2002 versions of NCPA and have failed to take into account the extensive developments in NCPA since 2002. Hence, researchers can choose to use NCPA based upon objective critical validation that shows that NCPA delivers what it promises. [source]


Influence of population stratification on population-based marker-disease association analysis

ANNALS OF HUMAN GENETICS, Issue 4 2010
Tengfei Li
Summary Population-based genetic association analysis may suffer from the failure to control for confounders such as population stratification (PS). There has been extensive study on the influence of PS on candidate gene-disease association analysis, but much less attention has been paid to its influence on marker-disease association analysis. In this paper, we focus on the Pearson χ2 test and the trend test for marker-disease association analysis. The mean and variance of the test statistics are derived under presence of PS, so that the power and inflated type I error rate can be evaluated. It is shown that the bias and the variance distortion are not zero in the presence of both PS and penetrance heterogeneity (PH). Unlike candidate gene-disease association analysis, when PS is present, the bias is not zero no matter whether PH is present or not. This work generalises the published results, where only the fully recessive penetrance model is considered and only the bias is calculated. It is shown that candidate gene-disease association analysis can be treated as a special case of marker-disease association analysis. Consequently, our results extend previous studies on candidate gene-disease association analysis. A simulation study confirms the theoretical findings. [source]
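
To see the inflation mechanism, here is a toy simulation (my own construction, not the paper's model): a marker unrelated to disease within each subpopulation, but with allele frequency and disease prevalence both differing between subpopulations; the Cochran-Armitage trend test applied to the pooled sample then rejects a true null far more often than 5%. All frequencies and prevalences are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def trend_test_p(genotypes, status):
    # Cochran-Armitage trend test (score-test form) with additive genotype scores 0, 1, 2
    x, y = genotypes.astype(float), status.astype(float)
    u = np.sum((x - x.mean()) * (y - y.mean()))
    var_u = y.mean() * (1 - y.mean()) * np.sum((x - x.mean()) ** 2)
    z = u / np.sqrt(var_u)
    return 2 * stats.norm.sf(abs(z))

def one_replicate(n_per_group=500):
    freqs, prevalences = (0.2, 0.6), (0.05, 0.20)    # differ across subpopulations (assumed)
    g, d = [], []
    for f, k in zip(freqs, prevalences):
        g.append(rng.binomial(2, f, n_per_group))    # marker genotype, independent of disease
        d.append(rng.binomial(1, k, n_per_group))    # disease status within the subpopulation
    return trend_test_p(np.concatenate(g), np.concatenate(d)) < 0.05

print("Empirical Type I error:", np.mean([one_replicate() for _ in range(1000)]))
```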


Trimmed Weighted Simes' Test for Two One-Sided Hypotheses With Arbitrarily Correlated Test Statistics

BIOMETRICAL JOURNAL, Issue 6 2009
Werner Brannath
Abstract The two-sided Simes test is known to control the type I error rate with bivariate normal test statistics. For one-sided hypotheses, control of the type I error rate requires that the correlation between the bivariate normal test statistics is non-negative. In this article, we introduce a trimmed version of the one-sided weighted Simes test for two hypotheses which rejects if (i) the one-sided weighted Simes test rejects and (ii) both p-values are below one minus the respective weighted Bonferroni adjusted level. We show that the trimmed version controls the type I error rate at nominal significance level α if (i) the common distribution of test statistics is point symmetric and (ii) the two-sided weighted Simes test at level 2α controls the level. These assumptions apply, for instance, to bivariate normal test statistics with arbitrary correlation. In a simulation study, we compare the power of the trimmed weighted Simes test with the power of the weighted Bonferroni test and the untrimmed weighted Simes test. An additional result of this article ensures type I error rate control of the usual weighted Simes test under a weak version of the positive regression dependence condition for the case of two hypotheses. This condition is shown to apply to the two-sided p-values of one- or two-sample t-tests for bivariate normal endpoints with arbitrary correlation and to the corresponding one-sided p-values if the correlation is non-negative. The Simes test for such types of bivariate t-tests has not been considered before. According to our main result, the trimmed version of the weighted Simes test then also applies to the one-sided bivariate t-test with arbitrary correlation. [source]
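
A literal transcription of the stated decision rule for two hypotheses (my reading of the abstract only; in particular, "respective" is interpreted as each p-value's own weighted Bonferroni level, which should be checked against the paper before relying on it):

```python
import numpy as np

def weighted_simes_reject(p1, p2, w1, w2, alpha=0.025):
    # Weighted Simes test for the intersection hypothesis, weights w1 + w2 = 1
    return (p1 <= w1 * alpha) or (p2 <= w2 * alpha) or (max(p1, p2) <= alpha)

def trimmed_weighted_simes_reject(p1, p2, w1, w2, alpha=0.025):
    # Trimming: additionally require both p-values to lie below one minus the
    # (assumed) respective weighted Bonferroni levels w1*alpha and w2*alpha.
    simes = weighted_simes_reject(p1, p2, w1, w2, alpha)
    trimmed = (p1 < 1 - w1 * alpha) and (p2 < 1 - w2 * alpha)
    return simes and trimmed

# Example: a tiny p-value paired with a p-value near 1 (as can happen under strong
# negative correlation) is rejected by the untrimmed test but not by the trimmed one.
print(weighted_simes_reject(0.001, 0.999, 0.5, 0.5))          # True
print(trimmed_weighted_simes_reject(0.001, 0.999, 0.5, 0.5))  # False
```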


Challenges and regulatory experiences with non-inferiority trial design without placebo arm

BIOMETRICAL JOURNAL, Issue 2 2009
H. M. James Hung
Abstract For a non-inferiority trial without a placebo arm, the direct comparison between the test treatment and the selected positive control is in principle the only basis for statistical inference. Therefore, evaluating the test treatment relative to the non-existent placebo presents extreme challenges and requires some kind of bridging from the past to the present with no current placebo data. For such inference based partly on an indirect bridging manipulation, the fixed margin method and the synthesis method are the two most widely discussed methods in the recent literature. There are major differences in the statistical inference paradigm between the two methods. The fixed margin method employs the historical data that assess the performance of the active control versus a placebo to guide the selection of the non-inferiority margin. Such guidance is not part of the ultimate statistical inference in the non-inferiority trial. In contrast, the synthesis method connects the historical data to the non-inferiority trial data for making broader inferences relating the test treatment to the non-existent current placebo. On the other hand, the type I error rate associated with the direct comparison between the test treatment and the active control cannot shed any light on the appropriateness of the indirect inference comparing the test treatment against the non-existent placebo. This work explores an approach for assessing the impact of potential bias due to violation of a key statistical assumption to guide determination of the non-inferiority margin. [source]


An Adaptive Hierarchical Test Procedure for Selecting Safe and Efficient Treatments

BIOMETRICAL JOURNAL, Issue 4 2006
Franz König
Abstract We consider the situation where, during a multiple treatment (dose) versus control comparison, high doses are truncated because of lack of safety and low doses are truncated because of lack of efficacy, e.g., by decisions of a data safety monitoring committee at multiple interim looks. We investigate the properties of a hierarchical test procedure for the efficacy outcome in the set of doses carried on until the end of the trial, starting with the highest selected dose group to be compared with the placebo at the full level α. Left truncation, i.e., dropping doses in a sequence starting with the lowest dose, does not inflate the type I error rate. It is shown that right truncation does not inflate the type I error if efficacy and toxicity are positively related and dose selection is based on monotone functions of the safety data. A positive relation is given e.g. in the case where the efficacy and toxicity data are normally distributed with a positive pairwise correlation. A positive relation also applies if the probability for an adverse event is increasing with a normally distributed efficacy outcome. The properties of such truncation procedures are investigated by simulations. There is a conflict between achieving a small number of unsafely treated patients and a high power to detect safe and efficient doses. We also investigated a procedure to increase power where a reallocation of the sample size to the truncated treatments and the control remaining at the following stages is performed. (© 2006 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]


A Partially Linear Tree-based Regression Model for Multivariate Outcomes

BIOMETRICS, Issue 1 2010
Kai Yu
Summary In the genetic study of complex traits, especially behavior related ones, such as smoking and alcoholism, usually several phenotypic measurements are obtained for the description of the complex trait, but no single measurement can quantify fully the complicated characteristics of the symptom because of our lack of understanding of the underlying etiology. If those phenotypes share a common genetic mechanism, rather than studying each individual phenotype separately, it is more advantageous to analyze them jointly as a multivariate trait to enhance the power to identify associated genes. We propose a multilocus association test for the study of multivariate traits. The test is derived from a partially linear tree-based regression model for multiple outcomes. This novel tree-based model provides a formal statistical testing framework for the evaluation of the association between a multivariate outcome and a set of candidate predictors, such as markers within a gene or pathway, while accommodating adjustment for other covariates. Through simulation studies we show that the proposed method has an acceptable type I error rate and improved power over the univariate outcome analysis, which studies each component of the complex trait separately with multiple-comparison adjustment. A candidate gene association study of multiple smoking-related phenotypes is used to demonstrate the application and advantages of this new method. The proposed method is general enough to be used for the assessment of the joint effect of a set of multiple risk factors on a multivariate outcome in other biomedical research settings. [source]


Flexible Designs for Genomewide Association Studies

BIOMETRICS, Issue 3 2009
André Scherag
Summary Genomewide association studies attempting to unravel the genetic etiology of complex traits have recently gained attention. Frequently, these studies employ a sequential genotyping strategy: A large panel of markers is examined in a subsample of subjects, and the most promising markers are genotyped in the remaining subjects. In this article, we introduce a novel method for such designs enabling investigators to, for example, modify marker densities and sample proportions while strongly controlling the family-wise type I error rate. Loss of efficiency is avoided by redistributing conditional type I error rates of discarded markers. Our approach can be combined with cost optimal designs and entails a greater flexibility than all previously suggested designs. Among other features, it allows for marker selections based upon biological criteria instead of statistical criteria alone, or the option to modify the sample size at any time during the course of the project. For practical applicability, we develop a new algorithm, subsequently evaluate it by simulations, and illustrate it using a real data set. [source]


Marginal Analysis of Incomplete Longitudinal Binary Data: A Cautionary Note on LOCF Imputation

BIOMETRICS, Issue 3 2004
Richard J. Cook
Summary In recent years there has been considerable research devoted to the development of methods for the analysis of incomplete data in longitudinal studies. Despite these advances, the methods used in practice have changed relatively little, particularly in the reporting of pharmaceutical trials. In this setting, perhaps the most widely adopted strategy for dealing with incomplete longitudinal data is imputation by the "last observation carried forward" (LOCF) approach, in which values for missing responses are imputed using observations from the most recently completed assessment. We examine the asymptotic and empirical bias, the empirical type I error rate, and the empirical coverage probability associated with estimators and tests of treatment effect based on the LOCF imputation strategy. We consider a setting involving longitudinal binary data with longitudinal analyses based on generalized estimating equations, and an analysis based simply on the response at the end of the scheduled follow-up. We find that for both of these approaches, imputation by LOCF can lead to substantial biases in estimators of treatment effects, the type I error rates of associated tests can be greatly inflated, and the coverage probability can be far from the nominal level. Alternative analyses based on all available data lead to estimators with comparatively small bias, and inverse probability weighted analyses yield consistent estimators subject to correct specification of the missing data process. We illustrate the differences between various methods of dealing with drop-outs using data from a study of smoking behavior. [source]
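
A toy illustration of why LOCF can inflate the Type I error (my own construction, far simpler than the paper's GEE setting): response improves over visits identically in both arms, dropout is more frequent in one arm, and LOCF drags that arm's final values back toward its early, lower responses. All rates are assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
visit_p = [0.3, 0.5, 0.7]                     # same response trajectory in both arms under H0

def one_trial(n_per_arm=150, drop_prob=(0.1, 0.4)):
    final = []
    for arm in (0, 1):
        y = rng.binomial(1, visit_p, size=(n_per_arm, 3))   # complete data over 3 visits
        dropped = rng.random(n_per_arm) < drop_prob[arm]     # drop out after visit 1
        locf_final = np.where(dropped, y[:, 0], y[:, 2])     # carry the visit-1 value forward
        final.append(locf_final)
    table = np.array([[f.sum(), len(f) - f.sum()] for f in final])
    _, p, _, _ = stats.chi2_contingency(table, correction=False)
    return p < 0.05

print("Empirical Type I error with LOCF:", np.mean([one_trial() for _ in range(1000)]))
```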


Incorporating Data Received after a Sequential Trial Has Stopped into the Final Analysis: Implementation and Comparison of Methods

BIOMETRICS, Issue 3 2003
Marina Roshini Sooriyarachchi
Summary. In a sequential clinical trial, accrual of data on patients often continues after the stopping criterion for the study has been met. This is termed "overrunning." Overrunning occurs mainly when the primary response from each patient is measured after some extended observation period. The objective of this article is to compare two methods of allowing for overrunning. In particular, simulation studies are reported that assess the two procedures in terms of how well they maintain the intended type I error rate. The effect on power resulting from the incorporation of "overrunning data" using the two procedures is evaluated. [source]


Methods to account for spatial autocorrelation in the analysis of species distributional data: a review

ECOGRAPHY, Issue 5 2007
Carsten F. Dormann
Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d), is violated. The violation of the assumption of i.i.d. residuals may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species' distributions, for both presence/absence (binary response) and species abundance data (poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method's implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing spatial autocorrelation in the errors. However, we found that for presence/absence data the results and conclusions were very variable between the different methods. This is likely due to the low information content of binary maps. Also, in contrast with previous studies, we found that autocovariate methods consistently underestimated the effects of environmental controls of species distributions. Given their widespread use, in particular for the modelling of species presence/absence data (e.g. climate envelope models), we argue that this warrants further study and caution in their use. To aid other ecologists in making use of the methods described, code to implement them in freely available software is provided in an electronic appendix. [source]


Bivariate combined linkage and association mapping of quantitative trait loci

GENETIC EPIDEMIOLOGY, Issue 5 2008
Jeesun Jung
Abstract In this paper, bivariate/multivariate variance component models are proposed for high-resolution combined linkage and association mapping of quantitative trait loci (QTL), based on combinations of pedigree and population data. Suppose that a quantitative trait locus is located in a chromosome region that exerts pleiotropic effects on multiple quantitative traits. In the region, multiple markers such as single nucleotide polymorphisms are typed. Two regression models, the "genotype effect model" and the "additive effect model", are proposed to model the association between the markers and the trait locus. The linkage information, i.e., recombination fractions between the QTL and the markers, is modeled in the variance and covariance matrix. By analytical formulae, we show that the "genotype effect model" can be used to model the additive and dominant effects simultaneously, whereas the "additive effect model" captures only the additive effect. Based on the two models, F-test statistics are proposed to test association between the QTL and markers. By analytical power analysis, we show that bivariate models can be more powerful than univariate models. For moderate-sized samples, the proposed models lead to correct type I error rates, and so the models are reasonably robust. As a practical example, the method is applied to analyze the genetic inheritance of rheumatoid arthritis for the data of The North American Rheumatoid Arthritis Consortium, Problem 2, Genetic Analysis Workshop 15, which confirms the advantage of the proposed bivariate models. Genet. Epidemiol. 2008. © 2008 Wiley-Liss, Inc. [source]


Global transmission/disequilibrium tests based on haplotype sharing in multiple candidate genes

GENETIC EPIDEMIOLOGY, Issue 4 2005
Kai Yu
Abstract It is well recognized that multiple genes are likely contributing to the susceptibility of most common complex diseases. Studying one gene at a time might reduce our chance to identify disease susceptibility genes with relatively small effect sizes. Therefore, it is crucial to develop statistical methods that can assess the effect of multiple genes collectively. Motivated by the increasingly available high-density markers across the whole human genome, we propose a class of TDT-type methods that can jointly analyze haplotypes from multiple candidate genes (linked or unlinked). Our approach first uses a linear signed rank statistic to compare at an individual gene level the structural similarity among transmitted haplotypes against that among non-transmitted haplotypes. The results of the ranked comparisons from all considered genes are subsequently combined into global statistics, which can simultaneously test the association of the set of genes with the disease. Using simulation studies, we find that the proposed tests yield correct type I error rates in stratified populations. Compared with the gene-by-gene test, the new global tests appear to be more powerful in situations where all candidate genes are associated with the disease. Genet. Epidemiol. 2005. © 2005 Wiley-Liss, Inc. [source]


Candidate-gene association studies with pedigree data: Controlling for environmental covariates

GENETIC EPIDEMIOLOGY, Issue 4 2003
S.L. Slager
Abstract Case-control studies provide an important epidemiological tool to evaluate candidate genes. There are many different study designs available. We focus on a more recently proposed design, which we call a multiplex case-control (MCC) design. This design compares allele frequencies between related cases, each of whom is sampled from a multiplex family, and unrelated controls. Since within-family genotype correlations will exist, statistical methods need to take this into account. Moreover, there is a need to develop methods that simultaneously control for potential confounders in the analysis. Generalized estimating equations (GEE) are one approach to analyze this type of data; however, this approach can have singularity problems when estimating the correlation matrix. To allow for modeling of other covariates, we extend our previously developed method to a more general model-based approach. Our proposed methods use the score statistic, derived from a composite likelihood. We propose three different approaches to estimate the variance of this statistic. Under random ascertainment of pedigrees, score tests have correct type I error rates; however, pedigrees are not randomly ascertained. Thus, through simulations, we test the validity and power of the score tests under different ascertainment schemes, and an illustration of our methods, applied to data from a prostate cancer study, is presented. We find that our robust score statistic has estimated type I error rates within the expected range for all situations we considered, whereas the other two statistics have inflated type I error rates under nonrandom ascertainment schemes. We also find GEE to fail at least 5% of the time for each simulation configuration; at times, the failure rate reaches above 80%. In summary, our robust method may be the only current regression analysis method available for MCC data. Genet Epidemiol 24:273-283, 2003. © 2003 Wiley-Liss, Inc. [source]


A randomisation program to compare species-richness values

INSECT CONSERVATION AND DIVERSITY, Issue 3 2008
JEAN M. L. RICHARDSON
Abstract 1. Comparisons of biodiversity estimates among sites or through time are hampered by a focus on using mean and variance estimates for diversity measures. These estimators depend on both sampling effort and on the abundances of organisms in communities, which makes comparison of communities possible only through the use of rarefaction curves that reduce all samples to the lowest sample size. However, comparing species richness among communities does not demand absolute estimates of species richness, and statistical tests of similarity among communities are potentially more straightforward. 2. This paper presents a program that uses randomisation methods to robustly test for differences in species richness among samples. Simulated data are used to show that the analysis has acceptable type I error rates and sufficient power to detect violations of the null hypothesis. An analysis of published bee data collected in 4 years shows how both sample size and hierarchical structure in sample type are incorporated into the analysis. 3. The randomisation program is shown to be very robust to the presence of a dominant species, many rare species, and decreased sample size, giving quantitatively similar conclusions under all conditions. This method of testing for differences in biodiversity provides an important tool for researchers working on questions in community ecology and conservation biology. [source]
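
In the same spirit as the program described (a bare-bones sketch of a randomisation comparison, not the authors' software): individuals from two samples are pooled and repeatedly reshuffled into groups of the original sizes, and the observed species-richness difference is compared with the resulting randomisation distribution. The example communities below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(8)

def richness_diff_test(sample_a, sample_b, n_perm=5000):
    # sample_a, sample_b: arrays of species labels, one entry per individual
    a, b = np.asarray(sample_a), np.asarray(sample_b)
    observed = abs(len(np.unique(a)) - len(np.unique(b)))
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                              # reshuffle individuals between samples
        ra = len(np.unique(pooled[:len(a)]))
        rb = len(np.unique(pooled[len(a):]))
        count += abs(ra - rb) >= observed
    return observed, (count + 1) / (n_perm + 1)          # observed difference and permutation p-value

# Hypothetical bee samples: one fairly even community, one dominated by a single species
site1 = rng.integers(0, 30, size=200)                    # ~30 roughly even species
site2 = np.concatenate([np.zeros(150, dtype=int), rng.integers(0, 10, size=50)])
print(richness_diff_test(site1, site2))
```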