FDR Control (fdr + control)

Selected Abstracts


FDR Control by the BH Procedure for Two-Sided Correlated Tests with Implications to Gene Expression Data Analysis

BIOMETRICAL JOURNAL, Issue 1 2007
Anat Reiner-Benaim
Abstract The multiple testing problem attributed to gene expression analysis is challenging not only by its size, but also by possible dependence between the expression levels of different genes resulting from co-regulations of the genes. Furthermore, the measurement errors of these expression levels may be dependent as well since they are subjected to several technical factors. Multiple testing of such data faces the challenge of correlated test statistics. In such a case, the control of the False Discovery Rate (FDR) is not straightforward, and thus demands new approaches and solutions that will address multiplicity while accounting for this dependency. This paper investigates the effects of dependency between normal test statistics on FDR control in two-sided testing, using the linear step-up procedure (BH) of Benjamini and Hochberg (1995). The case of two multiple hypotheses is examined first. A simulation study offers primary insight into the behavior of the FDR subjected to different levels of correlation and distance between null and alternative means. A theoretical analysis follows in order to obtain explicit upper bounds to the FDR. These results are then extended to more than two multiple tests, thereby offering a better perspective on the effect of the proportion of false null hypotheses, as well as the structure of the test statistics correlation matrix. An example from gene expression data analysis is presented. (© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
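
For readers unfamiliar with the linear step-up (BH) procedure named in the abstract above, the following is a minimal sketch of it, together with a small Monte Carlo estimate of the FDR for two correlated two-sided z-tests. It is not the paper's code or simulation design: the function names (`two_sided_p`, `benjamini_hochberg`, `simulate_fdr`), the parameters `rho`, `mu`, `q`, `n_rep` and `seed`, and the choice of one true null plus one shifted mean are all assumptions made for illustration.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def benjamini_hochberg(p_values, q=0.05):
    """Linear step-up procedure: return a list of rejection flags."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k with p_(k) <= k * q / m (0 if none).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


def simulate_fdr(rho, mu, q=0.05, n_rep=20000, seed=0):
    """Monte Carlo FDR estimate for two z-tests with correlation rho.

    Illustrative setup (an assumption, not the paper's design):
    test 1 is a true null (mean 0); test 2 has mean mu, so it is a
    true null only when mu == 0.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_rep):
        e1 = rng.gauss(0.0, 1.0)
        e2 = rng.gauss(0.0, 1.0)
        z1 = e1
        z2 = mu + rho * e1 + math.sqrt(1.0 - rho ** 2) * e2
        rej = benjamini_hochberg([two_sided_p(z1), two_sided_p(z2)], q)
        n_rej = sum(rej)
        if n_rej > 0:
            # Count rejections of hypotheses that are actually true nulls.
            false_rej = int(rej[0]) + (int(rej[1]) if mu == 0 else 0)
            total += false_rej / n_rej
    return total / n_rep


if __name__ == "__main__":
    for rho in (0.0, 0.5, 0.9):
        print(rho, simulate_fdr(rho, mu=2.0))
```

The step-up character shows in a case such as `benjamini_hochberg([0.04, 0.045, 0.03, 0.049])`: none of the three smallest p-values meets its own threshold, yet the largest does, so all four hypotheses are rejected.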


Cluster Formation as a Measure of Interpretability in Multiple Testing

BIOMETRICAL JOURNAL, Issue 5 2008
Juliet Popper Shaffer
Abstract Multiple test procedures are usually compared on various aspects of error control and power. Power is measured as some function of the number of false hypotheses correctly identified as false. However, given equal numbers of rejected false hypotheses, the pattern of rejections, i.e. the particular set of false hypotheses identified, may be crucial in interpreting the results for potential application. In an important area of application, comparisons among a set of treatments based on random samples from populations, two different approaches, cluster analysis and model selection, deal implicitly with such patterns, while traditional multiple testing procedures generally focus on the outcomes of subset and pairwise equality hypothesis tests, without considering the overall pattern of results in comparing methods. An important feature involving the pattern of rejections is their relevance for dividing the treatments into distinct subsets based on some parameter of interest, for example their means. This paper introduces some new measures relating to the potential of methods for achieving such divisions. Following Hartley (1955), sets of treatments with equal parameter values will be called clusters. Because it is necessary to distinguish between clusters in the populations and clustering in sample outcomes, the population clusters will be referred to as P-clusters; any related concepts defined in terms of the sample outcome will be referred to with the prefix outcome. Outcomes of multiple comparison procedures will be studied in terms of their probabilities of leading to separation of treatments into outcome clusters, with various measures relating to the number of such outcome clusters and the proportion of true vs. false outcome clusters.
The definitions of true and false outcome clusters and related concepts, and the approach taken here, are in the tradition of hypothesis testing with attention to overall error control and power, but with added consideration of cluster separation potential. The pattern approach will be illustrated by comparing two methods with apparent FDR control but with different ways of ordering outcomes for potential significance: the original Benjamini-Hochberg (1995) procedure (BH), and the Newman-Keuls (Newman, 1939; Keuls, 1952) procedure (NK). (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)


Controlling the False Discovery Rate with Constraints: The Newman-Keuls Test Revisited

BIOMETRICAL JOURNAL, Issue 1 2007
Juliet Popper Shaffer
Abstract The Newman-Keuls (NK) procedure for testing all pairwise comparisons among a set of treatment means, introduced by Newman (1939) and in a slightly different form by Keuls (1952), was proposed as a reasonable way to alleviate the inflation of error rates when a large number of means are compared. It was proposed before the concepts of different types of multiple error rates were introduced by Tukey (1952a, b; 1953). Although it was popular in the 1950s and 1960s, once control of the familywise error rate (FWER) was accepted generally as an appropriate criterion in multiple testing, and it was realized that the NK procedure does not control the FWER at the nominal level at which it is performed, the procedure gradually fell out of favor. Recently, a more liberal criterion, control of the false discovery rate (FDR), has been proposed as more appropriate in some situations than FWER control. This paper notes that the NK procedure and a nonparametric extension control the FWER within any set of homogeneous treatments. It proves that the extended procedure controls the FDR when there are well-separated clusters of homogeneous means and between-cluster test statistics are independent, and extensive simulation provides strong evidence that the original procedure controls the FDR under the same conditions and some dependent conditions when the clusters are not well-separated. Thus, the test has two desirable error-controlling properties, providing a compromise between FDR control with no subgroup FWER control and global FWER control. Yekutieli (2002) developed an FDR-controlling procedure for testing all pairwise differences among means, without any FWER-controlling criteria when there is more than one cluster.
The empirical example in Yekutieli's paper was used to compare the Benjamini-Hochberg (1995) method with apparent FDR control in this context, Yekutieli's proposed method with proven FDR control, the Newman-Keuls method that controls FWER within equal clusters with apparent FDR control, and several methods that control FWER globally. The Newman-Keuls method is shown to be intermediate in number of rejections to the FWER-controlling methods and the FDR-controlling methods in this example, although it is not always more conservative than the other FDR-controlling methods. (© 2007 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)


Linear Mixed Model Selection for False Discovery Rate Control in Microarray Data Analysis

BIOMETRICS, Issue 2 2010
Cumhur Yusuf Demirkale
Summary In a microarray experiment, one experimental design is used to obtain expression measures for all genes. One popular analysis method involves fitting the same linear mixed model for each gene, obtaining gene-specific p-values for tests of interest involving fixed effects, and then choosing a threshold for significance that is intended to control the false discovery rate (FDR) at a desired level. When one or more random factors have zero variance components for some genes, the standard practice of fitting the same full linear mixed model for all genes can result in failure to control FDR. We propose a new method that combines results from the fit of full and selected linear mixed models to identify differentially expressed genes and provide FDR control at target levels when the true underlying random effects structure varies across genes.