Type I Error
Terms modified by Type I Error
Selected Abstracts
A hybrid method for simulation factor screening NAVAL RESEARCH LOGISTICS: AN INTERNATIONAL JOURNAL, Issue 1 2010 Hua Shen Abstract Factor screening is performed to eliminate unimportant factors so that the remaining important factors can be more thoroughly studied in later experiments. Controlled sequential bifurcation (CSB) and controlled sequential factorial design (CSFD) are two new screening methods for discrete-event simulations. Both methods use hypothesis testing procedures to control the Type I error and power of the screening results. The scenarios for which each method is most efficient are complementary. This study proposes a two-stage hybrid approach that combines CSFD and an improved CSB called CSB-X. In Phase 1, a prescreening procedure will estimate each effect and determine whether CSB-X or CSFD will be used for further screening. In Phase 2, CSB-X and CSFD are performed separately based on the assignment of Phase 1. The new method usually has the same error control as CSB-X and CSFD. The efficiency, on the other hand, is usually much better than either component method. © 2009 Wiley Periodicals, Inc. Naval Research Logistics, 2010 [source]
Two-Stage Group Sequential Robust Tests in Family-Based Association Studies: Controlling Type I Error ANNALS OF HUMAN GENETICS, Issue 4 2008 Lihan K. Yan Summary In family-based association studies, an optimal test statistic with asymptotic normal distribution is available when the underlying genetic model is known (e.g., recessive, additive, multiplicative, or dominant). In practice, however, genetic models for many complex diseases are usually unknown. Using a single test statistic optimal for one genetic model may lose substantial power when the model is mis-specified. When a family of genetic models is scientifically plausible, the maximum of several tests, each optimal for a specific genetic model, is robust against the model mis-specification. This robust test is preferred over a single optimal test. Recently, cost-effective group sequential approaches have been introduced to genetic studies. The group sequential approach allows interim analyses and has been applied to many test statistics, but not to the maximum statistic. When the group sequential method is applied, type I error should be controlled. We propose and compare several approaches of controlling type I error rates when group sequential analysis is conducted with the maximum test for family-based candidate-gene association studies. For a two-stage group sequential robust procedure with a single interim analysis, two critical values for the maximum tests are provided based on a given alpha spending function to control the desired overall type I error. [source]
Investigating the incidence of Type I errors for chronic whole effluent toxicity testing using Ceriodaphnia dubia ENVIRONMENTAL TOXICOLOGY & CHEMISTRY, Issue 1 2000 Timothy F. Moore Abstract The risk of Type I error (false positives) is thought to be controlled directly by the selection of a critical p value for conducting statistical analyses. The critical value for whole effluent toxicity (WET) tests is routinely set to 0.05, thereby establishing a 95% confidence level about the statistical inferences. In order to estimate the incidence of Type I errors in chronic WET testing, a method blank-type study was performed. A number of municipal wastewater dischargers contracted 16 laboratories to conduct chronic WET tests using the standard test organism Ceriodaphnia dubia.
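The 0.05 critical value described above fixes the nominal per-test Type I error rate. As a quick illustration, the sketch below (assuming independent tests and that scipy is available) turns that nominal rate into the expected number of false positives among the 14 valid blank tests reported in the remainder of this abstract, which continues below, and the probability of seeing at least the 6 positives actually observed.

```python
from scipy.stats import binom

alpha = 0.05     # nominal per-test Type I error rate
n_valid = 14     # valid method-blank tests reported in the abstract
observed = 6     # blank tests that nevertheless indicated toxicity

expected = n_valid * alpha                           # expected false positives (0.7)
p_at_least = binom.sf(observed - 1, n_valid, alpha)  # P(X >= 6) under the nominal rate

print(f"expected false positives: {expected:.1f}")
print(f"observed rate {observed / n_valid:.0%} vs nominal {alpha:.0%}")
print(f"P(X >= {observed}) = {p_at_least:.2e}")
```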
Unbeknownst to the laboratories, the samples they received from the wastewater dischargers were comprised only of moderately hard water, using the U.S. Environmental Protection Agency's standard dilution water formula. Because there was functionally no difference between the sample water and the laboratory control/dilution water, the test results were expected to be less than or equal to 1 TUc (toxic unit). Of the 16 tests completed by the biomonitoring laboratories, two did not meet control performance criteria. Six of the remaining 14 valid tests (43%) indicated toxicity (TUc > 1) in the sample (i.e., no-observed-effect concentration or IC25 < 100%). This incidence of false positives was six times higher than expected when the critical value was set to 0.05. No plausible causes for this discrepancy were found. Various alternatives for reducing the rate of Type I errors are recommended, including greater reliance on survival endpoints and use of additional test acceptance criteria. [source]
Census error and the detection of density dependence JOURNAL OF ANIMAL ECOLOGY, Issue 4 2006 ROBERT P. FRECKLETON Summary 1. Studies aiming to identify the prevalence and nature of density dependence in ecological populations have often used statistical analysis of ecological time-series of population counts. Such time-series are also being used increasingly to parameterize models that may be used in population management. 2. If time-series contain measurement errors, tests that rely on detecting a negative relationship between log population change and population size are biased and prone to spuriously detecting density dependence (Type I error). This is because the measurement error in density for a given year appears in the corresponding change in population density, with equal magnitude but opposite sign. 3. This effect introduces bias that may invalidate comparisons of ecological data with density-independent time-series. Unless census error can be accounted for, time-series may appear to show strongly density-dependent dynamics, even though the density-dependent signal may in reality be weak or absent. 4. We distinguish two forms of census error, both of which have serious consequences for detecting density dependence. 5. First, estimates of population density are rarely based on exact counts, but on samples. Hence there exists sampling error, with the level of error depending on the method employed and the number of replicates on which the population estimate is based. 6. Secondly, the group of organisms measured is often not a truly self-contained population, but part of a wider ecological population, defined in terms of location or behaviour. Consequently, the subpopulation studied may effectively be a sample of the population and spurious density dependence may be detected in the dynamics of a single subpopulation. In this case, density dependence is detected erroneously, even if numbers within the subpopulation are censused without sampling error. 7. In order to illustrate how process variation and measurement error may be distinguished we review data sets (counts of numbers of birds by single observers) for which both census error and long-term variance in population density can be estimated. 8. Tests for density dependence need to obviate the problem that measured population sizes are typically estimates rather than exact counts.
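Point 2 of the Freckleton et al. summary above (which concludes below) describes how census error alone produces an apparent negative relationship between log population change and observed density. The minimal simulation below, with entirely invented parameter values, reproduces that artefact for a density-independent random walk.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50                              # length of the time series (illustrative)
sigma_proc, sigma_obs = 0.1, 0.3    # process and census (observation) error SDs (assumed)

log_n = np.cumsum(rng.normal(0.0, sigma_proc, T))   # density-independent true dynamics
log_obs = log_n + rng.normal(0.0, sigma_obs, T)     # counts observed with census error

x = log_obs[:-1]                  # observed log density in year t
y = np.diff(log_obs)              # observed log population change, t -> t+1
slope = np.polyfit(x, y, 1)[0]    # OLS slope; negative values mimic density dependence

print(f"estimated slope with census error: {slope:.2f} (true density dependence is absent)")
```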
It is possible that in some cases it may be possible to test for density dependence in the presence of unknown levels of census error, for example by uncovering nonlinearities in the density response. However, it seems likely that these may lack power compared with analyses that are able to explicitly include census error and we review some recently developed methods. [source]
Non-parametric permutation test for the discrimination of float glass samples based on LIBS spectra JOURNAL OF CHEMOMETRICS, Issue 6 2010 Erin McIntee Abstract Laser-induced breakdown spectroscopy (LIBS) coupled with non-parametric permutation based hypothesis testing is demonstrated to have good performance in discriminating float glass samples. This type of pairwise sample comparison is important in manufacturing process quality control, forensic science and other applications where determination of a match probability between two samples is required. Analysis of the pairwise comparisons between multiple LIBS spectra from a single glass sample shows that some assumptions required by parametric methods may not hold in practice, motivating the adoption of a non-parametric permutation test. Without rigid distributional assumptions, the permutation test exhibits excellent discriminating power while holding the actual size of Type I error at the nominal level. Copyright © 2010 John Wiley & Sons, Ltd. [source]
A Comparison of Item Fit Statistics for Mixed IRT Models JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 3 2010 Kyong Hee Chon In this study we examined procedures for assessing model-data fit of item response theory (IRT) models for mixed format data. The model fit indices used in this study include PARSCALE's G2, Orlando and Thissen's S-X2 and S-G2, and Stone's χ2* and G2*. To investigate the relative performance of the fit statistics at the item level, we conducted two simulation studies: Type I error and power studies. We evaluated the performance of the item fit indices for various conditions of test length, sample size, and IRT models. Among the competing measures, the summed score-based indices S-X2 and S-G2 were found to be the sensible and efficient choice for assessing model fit for mixed format data. These indices performed well, particularly with short tests. The pseudo-observed score indices, χ2* and G2*, showed inflated Type I error rates in some simulation conditions. Consistent with the findings of current literature, the PARSCALE's G2 index was rarely useful, although it provided reasonable results for long tests. [source]
Estimating the Accuracy of Jury Verdicts JOURNAL OF EMPIRICAL LEGAL STUDIES, Issue 2 2007 Bruce D. Spencer Average accuracy of jury verdicts for a set of cases can be studied empirically and systematically even when the correct verdict cannot be known. The key is to obtain a second rating of the verdict, for example, the judge's, as in the recent study of criminal cases in the United States by the National Center for State Courts (NCSC). That study, like the famous Kalven-Zeisel study, showed only modest judge-jury agreement. Simple estimates of jury accuracy can be developed from the judge-jury agreement rate; the judge's verdict is not taken as the gold standard. Although the estimates of accuracy are subject to error, under plausible conditions they tend to overestimate the average accuracy of jury verdicts.
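The McIntee et al. abstract above holds the actual size of the Type I error at the nominal level by permuting sample labels rather than relying on distributional assumptions. The sketch below shows the generic label-permutation recipe for comparing two sets of spectra with a mean-difference statistic; the simulated spectra and the choice of statistic are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def permutation_pvalue(a, b, n_perm=9999):
    """Two-sample permutation test on the norm of the mean spectral difference."""
    observed = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
    pooled = np.vstack([a, b])
    n_a = len(a)
    count = 0
    for _ in range(n_perm):
        idx = rng.permutation(len(pooled))
        pa, pb = pooled[idx[:n_a]], pooled[idx[n_a:]]
        if np.linalg.norm(pa.mean(axis=0) - pb.mean(axis=0)) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)   # add-one correction keeps the test size <= nominal

# Illustrative "spectra": 8 replicates per sample, 50 wavelength channels
sample_1 = rng.normal(0.0, 1.0, (8, 50))
sample_2 = rng.normal(0.2, 1.0, (8, 50))    # small shift to discriminate
print(f"permutation p-value: {permutation_pvalue(sample_1, sample_2):.4f}")
```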
The jury verdict was estimated to be accurate in no more than 87 percent of the NCSC cases (which, however, should not be regarded as a representative sample with respect to jury accuracy). More refined estimates, including false conviction and false acquittal rates, are developed with models using stronger assumptions. For example, the conditional probability that the jury incorrectly convicts given that the defendant truly was not guilty (a "Type I error") was estimated at 0.25, with an estimated standard error (s.e.) of 0.07, the conditional probability that a jury incorrectly acquits given that the defendant truly was guilty ("Type II error") was estimated at 0.14 (s.e. 0.03), and the difference was estimated at 0.12 (s.e. 0.08). The estimated number of defendants in the NCSC cases who truly are not guilty but are convicted does seem to be smaller than the number who truly are guilty but are acquitted. The conditional probability of a wrongful conviction, given that the defendant was convicted, is estimated at 0.10 (s.e. 0.03). [source]
Variable selection and oversampling in the use of smooth support vector machines for predicting the default risk of companies JOURNAL OF FORECASTING, Issue 6 2009 Wolfgang Härdle Abstract In the era of Basel II a powerful tool for bankruptcy prognosis is vital for banks. The tool must be precise but also easily adaptable to the bank's objectives regarding the relation of false acceptances (Type I error) and false rejections (Type II error). We explore the suitability of smooth support vector machines (SSVM), and investigate how important factors such as the selection of appropriate accounting ratios (predictors), length of training period and structure of the training sample influence the precision of prediction. Moreover, we show that oversampling can be employed to control the trade-off between error types, and we compare SSVM with both logistic and discriminant analysis. Finally, we illustrate graphically how different models can be used jointly to support the decision-making process of loan officers. Copyright © 2008 John Wiley & Sons, Ltd. [source]
Prenatal PCB exposure and neurobehavioral development in infants and children: Can the Oswego study inform the current debate? PSYCHOLOGY IN THE SCHOOLS, Issue 6 2004 Paul Stewart In the current paper we describe the methodology and results of the Oswego study, in light of D.V. Cicchetti, A.S. Kaufman, and S.S. Sparrow's (this issue) criticisms regarding the validity of the human health/behavioral claims in the PCB literature. The Oswego project began as a replication of the Lake Michigan Maternal Infant Cohort study. Beyond replication of the Michigan findings, the study sought to extend results and conclusions through more comprehensive behavioral assessment, and improved confounder control and analytic methodology. Results over the past 5 years have demonstrated a convincing replication of the Michigan findings. The Michigan cohort reported findings relating Great Lakes fish consumption to performance impairments on the Neonatal Behavioral Assessment Scale (J. Jacobson, S. Jacobson, P. Schwartz, G. Fein, & J. Dowler, 1984). These findings were also found in the Oswego cohort (E. Lonky, J. Reihman, T. Darvill, J. Mather, & H. Daly, 1996), and the Oswego study extended the association to cord blood PCBs (P.W. Stewart, J. Reihman, E. Lonky, and T. Darvill, 2000). The Michigan cohort reported an association between prenatal PCB exposure and poorer performance on the Fagan Test of Infant Intelligence (S.W.
Jacobson, G.G. Fein, J.L. Jacobson, P.M. Schwartz, & J.K. Dowler, 1985). The Oswego cohort found similar results (T. Darvill, E. Lonky, J. Reihman, P. Stewart, & J. Pagano, 2000). The Michigan Cohort reported an association between prenatal PCB exposure and performance impairments on the McCarthy Scales of Children's abilities (J. Jacobson & S. Jacobson, 1997). The Oswego study also found PCB-related impairments on the McCarthy Scales (P.W. Stewart, J. Reihman, E. Lonky, T. Darvill, & J. Pagano, 2003). The Oswego results used the same exposure metric in every paper, employed conservative statistical design and analysis, and controlled for more than 40 potentially confounding variables. Moreover, while PCBs were related to all the behavioral endpoints outlined above, alternative candidates for effect, including lead, HCB, Mirex, DDE, and MeHg were not. Taken together, these results support the hypothesis that prenatal PCB exposure is a statistically significant predictor of small, but measurable, deficits in cognitive development from infancy through early childhood. Cicchetti et al. argue that these results, generated by independent investigators, be dismissed because they reflect a combination of measurement error, Type I error, and residual confounding. The evidence Cicchetti et al. present in support of their position fails to explain the nearly identical pattern of associations observed in the Oswego and Michigan Cohorts. In light of this replication, the extensive assessment of potential confounders, the effective elimination of alternative contaminants, and the conservative statistical approach employed in the Oswego study, we find that Cicchetti et al.'s claims are not substantiated. © 2004 Wiley Periodicals, Inc. Psychol Schs 41: 639–653, 2004. [source]
Off-site monitoring systems for predicting bank underperformance: a comparison of neural networks, discriminant analysis, and professional human judgment INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 3 2001 Philip Swicegood This study compares the ability of discriminant analysis, neural networks, and professional human judgment methodologies in predicting commercial bank underperformance. Experience from the banking crisis of the 1980s and early 1990s suggests that improved prediction models are needed for helping prevent bank failures and promoting economic stability. Our research seeks to address this issue by exploring new prediction model techniques and comparing them to existing approaches. When comparing the predictive ability of all three models, the neural network model shows slightly better predictive ability than that of the regulators. Both the neural network model and regulators significantly outperform the benchmark discriminant analysis model's accuracy. These findings suggest that neural networks show promise as an off-site surveillance methodology. Factoring in the relative costs of the different types of misclassifications from each model also indicates that neural network models are better predictors, particularly when weighting Type I errors more heavily. Further research with neural networks in this field should yield workable models that greatly enhance the ability of regulators and bankers to identify and address weaknesses in banks before they approach failure. Copyright © 2001 John Wiley & Sons, Ltd. [source]
Two New Statistics to Detect Answer Copying JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 1 2003 Leonardo S. Sotaridona Two new indices to detect answer copying on a multiple-choice test, S1 and S2, were proposed. The S1 index is similar to the K index (Holland, 1996) and the K2 index (Sotaridona & Meijer, 2002) but the distribution of the number of matching incorrect answers of the source and the copier is modeled by the Poisson distribution instead of the binomial distribution to improve the detection rate of K and K2. The S2 index was proposed to overcome a limitation of the K and K2 index, namely, their insensitivity to correct-answer copying. The S2 index incorporates the matching correct answers in addition to the matching incorrect answers. A simulation study was conducted to investigate the usefulness of S1 and S2 for 40- and 80-item tests, 100 and 500 sample sizes, and 10%, 20%, 30%, and 40% answer copying. The Type I errors and detection rates of S1 and S2 were compared with those of the K2 and the ω copying index (Wollack, 1997). Results showed that all four indices were able to maintain their Type I errors, with S1 and K2 being slightly conservative compared to S2 and ω. Furthermore, S1 had higher detection rates than K2. The S2 index showed a significant improvement in detection rate compared to K and K2. [source]
Validation of a swallowing disturbance questionnaire for detecting dysphagia in patients with Parkinson's disease MOVEMENT DISORDERS, Issue 13 2007 Yael Manor MA Abstract Underreporting of swallowing disturbances by Parkinson's disease (PD) patients may lead to delay in diagnosis and treatment, alerting the physician to an existing dysphagia only after the first episode of aspiration pneumonia.
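The Sotaridona abstract above models the count of matching incorrect answers between a source and a suspected copier as Poisson rather than binomial. A toy version of that idea is sketched below; the way the Poisson mean is supplied here is a placeholder assumption, not the published S1 formula.

```python
from scipy.stats import poisson

def copy_flag(matching_incorrect, expected_matches, alpha=0.05):
    """Flag a source/copier pair when the observed number of matching incorrect
    answers is improbably large under a Poisson model with the given mean."""
    p_value = poisson.sf(matching_incorrect - 1, expected_matches)  # P(X >= observed)
    return p_value, p_value < alpha

# Illustrative numbers: 3.1 matches expected for comparable non-copying pairs, 9 observed
p, flagged = copy_flag(matching_incorrect=9, expected_matches=3.1)
print(f"p-value = {p:.4f}, flagged = {flagged}")
```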
We developed and validated a swallowing disturbance questionnaire (SDQ) for PD patients and compared its findings to an objective assessment. Fifty-seven PD patients (mean age 69 ± 10 years) participated in this study. Each patient was queried about experiencing swallowing disturbances and asked to complete a self-reported 15-item "yes/no" questionnaire on swallowing disturbances (24 replied "no"). All study patients underwent a physical/clinical swallowing evaluation by a speech pathologist and an otolaryngologist. The 33 patients who complained of swallowing disturbances also underwent fiberoptic endoscopic evaluation of swallowing (FEES). According to the ROC test, the "optimal" score (where the sensitivity and specificity curves cross) is 11 (sensitivity 80.5%, specificity 81.3%). Using the SDQ questionnaire substantially reduced Type I errors (specifically, an existing swallowing problem missed by the selected cutoff point). On the basis of the SDQ assessment alone, 12 of the 24 (50%) noncomplaining patients would have been referred to further evaluation that they otherwise would not have undergone. The SDQ emerged as a validated tool to detect early dysphagia in PD patients. © 2007 Movement Disorder Society [source]
Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates: Focus on the False Discovery Rate and Simulation Study BIOMETRICAL JOURNAL, Issue 5 2008 Sandrine Dudoit Abstract This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of Type I error rates, defined as generalized tail probability (gTP) error rates, gTP(q, g) = Pr(g(Vn, Sn) > q), and generalized expected value (gEV) error rates, gEV(g) = E[g(Vn, Sn)], for arbitrary functions g(Vn, Sn) of the numbers of false positives Vn and true positives Sn. Of particular interest are error rates based on the proportion g(Vn, Sn) = Vn/(Vn + Sn) of Type I errors among the rejected hypotheses, such as the false discovery rate (FDR), FDR = E[Vn/(Vn + Sn)]. The proposed procedures offer several advantages over existing methods. They provide Type I error control for general data generating distributions, with arbitrary dependence structures among variables. Gains in power are achieved by deriving rejection regions based on guessed sets of true null hypotheses and null test statistics randomly sampled from joint distributions that account for the dependence structure of the data. The Type I error and power properties of an FDR-controlling version of the resampling-based empirical Bayes approach are investigated and compared to those of widely-used FDR-controlling linear step-up procedures in a simulation study. The Type I error and power trade-off achieved by the empirical Bayes procedures under a variety of testing scenarios allows this approach to be competitive with or outperform the Storey and Tibshirani (2003) linear step-up procedure, as an alternative to the classical Benjamini and Hochberg (1995) procedure. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
Phylogenetics and Ecology: As Many Characters as Possible Should Be Included in the Cladistic Analysis CLADISTICS, Issue 1 2001 Philippe Grandcolas As many data as possible must be included in any scientific analysis, provided that they follow the logical principles on which this analysis is based. Phylogenetic analysis is based on the basic principle of evolution, i.e., descent with modification.
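The Dudoit et al. abstract above benchmarks its empirical Bayes procedures against the classical Benjamini and Hochberg (1995) linear step-up procedure. A compact sketch of that baseline step-up rule is given below; the p-values are made-up inputs for illustration, and this is not the authors' resampling-based method.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Classical BH linear step-up: reject the k smallest p-values, where k is the
    largest index with p_(k) <= k * q / m."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    thresholds = (np.arange(1, m + 1) / m) * q
    below = p[order] <= thresholds
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

pvals = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]   # illustrative p-values
print(benjamini_hochberg(pvals, q=0.05))
```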
Consequently, ecological characters or any other nontraditional characters must be included in phylogenetic analyses, provided that they can plausibly be postulated to be heritable. The claim of Zrzavý (1997, Oikos 80, 186–192) or Luckow and Bruneau (1997, Cladistics 13, 145–151) that any character of interest should be included in the analysis is thus inaccurate. Many characters, broadly defined or extrinsic (such as distribution areas), cannot be considered as actually heritable. It is argued that we should better care for the precise definition and properties of characters of interest than decide a priori to include them in any case in the analysis. The symmetrical claim of de Queiroz (1996, Am. Nat. 148, 700–708) that some characters of interest should better be excluded from analyses to reconstruct their history is similarly inaccurate. If they match the logical principles of phylogenetic analysis, there is no acceptable reason to exclude them. The different statistical testing strategies of Zrzavý (1997) and de Queiroz (1996) aimed at justifying inclusion versus exclusion of characters are ill-conceived, leading respectively to Type II and Type I errors. It is argued that phylogenetic analyses should not be constrained by testing strategies that are downstream of the logical principles of phylogenetics. Excluding characters and mapping them on an independent phylogeny produces a particular and suboptimal kind of secondary homology, the use of which can be justified only for preliminary studies dealing with broadly defined characters. [source]
WHY DOES A METHOD THAT FAILS CONTINUE TO BE USED? THE ANSWER EVOLUTION, Issue 4 2009 It has been claimed that hundreds of researchers use nested clade phylogeographic analysis (NCPA) based on what the method promises rather than requiring objective validation of the method. The supposed failure of NCPA is based upon the argument that validating it by using positive controls ignored type I error, and that computer simulations have shown a high type I error. The first argument is factually incorrect: the previously published validation analysis fully accounted for both type I and type II errors. The simulations that indicate a 75% type I error rate have serious flaws and only evaluate outdated versions of NCPA. These outdated type I error rates fall precipitously when the 2003 version of single-locus NCPA is used or when the 2002 multilocus version of NCPA is used. It is shown that the tree-wise type I errors in single-locus NCPA can be corrected to the desired nominal level by a simple statistical procedure, and that multilocus NCPA reconstructs a simulated scenario used to discredit NCPA with 100% accuracy. Hence, NCPA is not a failed method at all, but rather has been validated both by actual data and by simulated data in a manner that satisfies the published criteria given by its critics. The critics have come to different conclusions because they have focused on the pre-2002 versions of NCPA and have failed to take into account the extensive developments in NCPA since 2002. Hence, researchers can choose to use NCPA based upon objective critical validation that shows that NCPA delivers what it promises. [source]
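The abstract above states that tree-wise (per-tree) type I error in single-locus NCPA can be brought back to the nominal level; the specific correction is not described there and is not reproduced here. As a purely generic illustration of the underlying idea, the sketch below shows a Šidák-style conversion of a per-tree level into a per-clade testing level when several clades are tested within one tree (an assumption-laden simplification, not the author's procedure).

```python
# Sidak-style adjustment: choose a per-clade level so that the chance of any
# false positive across k clade-level tests within one tree stays at alpha_tree,
# assuming the clade-level tests are independent.
def per_clade_alpha(alpha_tree: float, n_clades: int) -> float:
    return 1.0 - (1.0 - alpha_tree) ** (1.0 / n_clades)

alpha_tree = 0.05
for k in (1, 5, 20):
    print(f"{k:>2} clades tested -> per-clade alpha = {per_clade_alpha(alpha_tree, k):.4f}")
```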
A propensity score approach to correction for bias due to population stratification using genetic and non-genetic factors GENETIC EPIDEMIOLOGY, Issue 8 2009 Huaqing Zhao Abstract Confounding due to population stratification (PS) arises when differences in both allele and disease frequencies exist in a population of mixed racial/ethnic subpopulations. Genomic control, structured association, principal components analysis (PCA), and multidimensional scaling (MDS) approaches have been proposed to address this bias using genetic markers. However, confounding due to PS can also be due to non-genetic factors. Propensity scores are widely used to address confounding in observational studies but have not been adapted to deal with PS in genetic association studies. We propose a genomic propensity score (GPS) approach to correct for bias due to PS that considers both genetic and non-genetic factors. We compare the GPS method with PCA and MDS using simulation studies. Our results show that GPS can adequately adjust and consistently correct for bias due to PS. Under no/mild, moderate, and severe PS, GPS yielded estimates with bias close to 0 (mean = −0.0044, standard error = 0.0087). Under moderate or severe PS, the GPS method consistently outperforms the PCA method in terms of bias, coverage probability (CP), and type I error. Under moderate PS, the GPS method consistently outperforms the MDS method in terms of CP. PCA maintains relatively high power compared to both MDS and GPS methods under the simulated situations. GPS and MDS are comparable in terms of statistical properties such as bias, type I error, and power. The GPS method provides a novel and robust tool for obtaining less-biased estimates of genetic associations that can consider both genetic and non-genetic factors. Genet. Epidemiol. 33:679–690, 2009. © 2009 Wiley-Liss, Inc. [source]
Selection of the most informative individuals from families with multiple siblings for association studies GENETIC EPIDEMIOLOGY, Issue 4 2009 Chunyu Liu Abstract Association analyses may follow an initial linkage analysis for mapping and identifying genes underlying complex quantitative traits and may be conducted on unrelated subsets of individuals where only one member of a family is included. We evaluate two methods to select one sibling per sibship when multiple siblings are available: (1) one sibling with the most extreme trait value; and (2) one sibling using a combination score statistic based on extreme trait values and identity-by-descent sharing information. We compare the type I error and power. Furthermore, we compare these selection strategies with a strategy that randomly selects one sibling per sibship and with an approach that includes all siblings, using both a simulation study and an application to fasting blood glucose in the Framingham Heart Study. When genetic effect is homogeneous, we find that using the combination score can increase power by 30–40% compared to a random selection strategy, and loses only 8–13% of power compared to the full sibship analysis, across all additive models considered, but offers at least 50% genotyping cost saving. In the presence of genetic heterogeneity, the score offers a 50% increase in power over a random selection strategy, but there is substantial loss compared to the full sibship analysis. In the application to fasting blood glucose, two SNPs are found in common for the selection strategies and the full sample among the 10 highest ranked single nucleotide polymorphisms.
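The Liu et al. abstract above (which concludes just below) compares strategies for keeping one sibling per sibship. The sketch below implements the simplest of them, picking the sibling with the most extreme standardized trait value in each family; the data frame layout and column names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical long-format data: one row per sibling
siblings = pd.DataFrame({
    "family_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "subject_id": list(range(8)),
    "trait": [5.1, 7.9, 6.0, 4.2, 4.4, 9.3, 5.8, 6.1],
})

# Extreme-value (EV) strategy: standardize the trait, then keep the sibling
# with the largest absolute z-score within each sibship.
z = (siblings["trait"] - siblings["trait"].mean()) / siblings["trait"].std()
selected = siblings.loc[z.abs().groupby(siblings["family_id"]).idxmax()]
print(selected)
```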
The EV strategy tends to agree with the IBD-EV strategy and the analysis of the full sample. Genet. Epidemiol. 2009. © 2008 Wiley-Liss, Inc. [source]
A multiple splitting approach to linkage analysis in large pedigrees identifies a linkage to asthma on chromosome 12 GENETIC EPIDEMIOLOGY, Issue 3 2009 Céline Bellenguez Abstract Large genealogies are potentially very informative for linkage analysis. However, the software available for exact non-parametric multipoint linkage analysis is limited with respect to the complexity of the families it can handle. A solution is to split the large pedigrees into sub-families meeting complexity constraints. Different methods have been proposed to "best" split large genealogies. Here, we propose a new procedure in which linkage is performed on several carefully chosen sub-pedigree sets from the genealogy instead of using just a single sub-pedigree set. Our multiple splitting procedure capitalizes on the sensitivity of linkage results to family structure and has been designed to control computational feasibility and global type I error. We describe and apply this procedure to the extreme case of the highly complex Hutterite pedigree and use it to perform a genome-wide linkage analysis on asthma. The detection of a genome-wide significant linkage for asthma on chromosome 12q21 illustrates the potential of this multiple splitting approach. Genet. Epidemiol. 2009. © 2008 Wiley-Liss, Inc. [source]
Testing association for markers on the X chromosome GENETIC EPIDEMIOLOGY, Issue 8 2007 Gang Zheng Abstract Test statistics for association between markers on autosomal chromosomes and a disease have been extensively studied. No research has been reported on performance of such test statistics for association on the X chromosome. With 100,000 or more single-nucleotide polymorphisms (SNPs) available for genome-wide association studies, thousands of them come from the X chromosome. The X chromosome contains rich information about population history and linkage disequilibrium. To identify X-linked marker susceptibility to a disease, it is important to study properties of various statistics that can be used to test for association on the X chromosome. In this article, we compare performance of several approaches for testing association on the X chromosome, and examine how departure from Hardy-Weinberg equilibrium would affect type I error and power of these association tests using X-linked SNPs. The results are applied to the X chromosome of Klein et al. [2005], a genome-wide association study with 100K SNPs for age-related macular degeneration. We found that a SNP (rs10521496) covered by DIAPH2, known to cause premature ovarian failure (POF) in females, is associated with age-related macular degeneration. Genet. Epidemiol. 2007. Published 2007 Wiley-Liss, Inc. [source]
Semiparametric variance-component models for linkage and association analyses of censored trait data GENETIC EPIDEMIOLOGY, Issue 7 2006 G. Diao Abstract Variance-component (VC) models are widely used for linkage and association mapping of quantitative trait loci in general human pedigrees. Traditional VC methods assume that the trait values within a family follow a multivariate normal distribution and are fully observed. These assumptions are violated if the trait data contain censored observations. When the trait pertains to age at onset of disease, censoring is inevitable because of loss to follow-up and limited study duration.
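The Diao abstract above (which continues below) builds a likelihood that accommodates censored trait values. As a much-simplified univariate illustration of the same principle, the sketch below fits a normal mean and SD to right-censored data by maximizing a likelihood in which censored observations contribute survival-function terms; the data are simulated and this is not the semiparametric variance-component model itself.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(7)
true_values = rng.normal(50.0, 10.0, 300)   # latent trait values (simulated)
limit = 60.0                                # assay / follow-up ceiling (assumed)
observed = np.minimum(true_values, limit)
censored = true_values > limit              # right-censoring indicator

def neg_log_lik(params):
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    ll_obs = norm.logpdf(observed[~censored], mu, sigma).sum()
    ll_cens = norm.logsf(limit, mu, sigma) * censored.sum()  # P(X > limit) for censored rows
    return -(ll_obs + ll_cens)

fit = minimize(neg_log_lik, x0=[np.mean(observed), np.log(np.std(observed))])
mu_hat, sigma_hat = fit.x[0], np.exp(fit.x[1])
print(f"naive mean of observed values: {observed.mean():.1f}")
print(f"censoring-aware estimates: mu = {mu_hat:.1f}, sigma = {sigma_hat:.1f}")
```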
Censoring also arises when the trait assay cannot detect values below (or above) certain thresholds. The latent trait values tend to have a complex distribution. Applying traditional VC methods to censored trait data would inflate type I error and reduce power. We present valid and powerful methods for the linkage and association analyses of censored trait data. Our methods are based on a novel class of semiparametric VC models, which allows an arbitrary distribution for the latent trait values. We construct appropriate likelihood for the observed data, which may contain left or right censored observations. The maximum likelihood estimators are approximately unbiased, normally distributed, and statistically efficient. We develop stable and efficient numerical algorithms to implement the corresponding inference procedures. Extensive simulation studies demonstrate that the proposed methods outperform the existing ones in practical situations. We provide an application to the age at onset of alcohol dependence data from the Collaborative Study on the Genetics of Alcoholism. A computer program is freely available. Genet. Epidemiol. 2006. © 2006 Wiley-Liss, Inc. [source]
Resampling-based multiple hypothesis testing procedures for genetic case-control association studies GENETIC EPIDEMIOLOGY, Issue 6 2006 Bingshu E. Chen Abstract In case-control studies of unrelated subjects, gene-based hypothesis tests consider whether any tested feature in a candidate gene – single nucleotide polymorphisms (SNPs), haplotypes, or both – is associated with disease. Standard statistical tests are available that control the false-positive rate at the nominal level over all polymorphisms considered. However, more powerful tests can be constructed that use permutation resampling to account for correlations between polymorphisms and test statistics. A key question is whether the gain in power is large enough to justify the computational burden. We compared the computationally simple Simes Global Test to the min P test, which considers the permutation distribution of the minimum p-value from marginal tests of each SNP. In simulation studies incorporating empirical haplotype structures in 15 genes, the min P test controlled the type I error, and was modestly more powerful than the Simes test, by 2.1 percentage points on average. When disease susceptibility was conferred by a haplotype, the min P test sometimes, but not always, under-performed haplotype analysis. A resampling-based omnibus test combining the min P and haplotype frequency test controlled the type I error, and closely tracked the more powerful of the two component tests. This test achieved consistent gains in power (5.7 percentage points on average), compared to a simple Bonferroni test of Simes and haplotype analysis. Using data from the Shanghai Biliary Tract Cancer Study, the advantages of the newly proposed omnibus test were apparent in a population-based study of bile duct cancer and polymorphisms in the prostaglandin-endoperoxide synthase 2 (PTGS2) gene. Genet. Epidemiol. 2006. Published 2006 Wiley-Liss, Inc. [source]
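The Chen et al. abstract above uses the permutation distribution of the minimum p-value across SNPs as a gene-based test. The sketch below shows that idea on a toy genotype matrix, permuting case/control labels and recording the minimum per-SNP p-value each time; the data, the number of SNPs, and the per-SNP t-test on genotype scores are illustrative choices, not the authors' exact marginal tests.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)
n, n_snps = 200, 10
genotypes = rng.integers(0, 3, size=(n, n_snps)).astype(float)   # 0/1/2 allele counts (toy)
status = rng.integers(0, 2, size=n)                              # 1 = case, 0 = control

def min_p(geno, labels):
    """Minimum marginal p-value across SNPs (here: t-test on genotype scores)."""
    return min(ttest_ind(geno[labels == 1, j], geno[labels == 0, j]).pvalue
               for j in range(geno.shape[1]))

observed = min_p(genotypes, status)
perm = [min_p(genotypes, rng.permutation(status)) for _ in range(500)]
p_gene = (1 + sum(m <= observed for m in perm)) / (1 + len(perm))
print(f"gene-based min P permutation p-value: {p_gene:.3f}")
```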
Comparison of single-nucleotide polymorphisms and microsatellite markers for linkage analysis in the COGA and simulated data sets for Genetic Analysis Workshop 14: Presentation Groups 1, 2, and 3 GENETIC EPIDEMIOLOGY, Issue S1 2005 Marsha A. Wilcox Abstract The papers in presentation groups 1–3 of Genetic Analysis Workshop 14 (GAW14) compared microsatellite (MS) markers and single-nucleotide polymorphism (SNP) markers for a variety of factors, using multiple methods in both data sets provided to GAW participants. Group 1 focused on data provided from the Collaborative Study on the Genetics of Alcoholism (COGA). Group 2 focused on data simulated for the workshop. Group 3 contained analyses of both data sets. Issues examined included: information content, signal strength, localization of the signal, use of haplotype blocks, population structure, power, type I error, control of type I error, the effect of linkage disequilibrium, and computational challenges. There were several broad resulting observations. 1) Information content was higher for dense SNP marker panels than for MS panels, and dense SNP marker sets appeared to provide slightly higher linkage scores and slightly higher power to detect linkage than MS markers. 2) Dense SNP panels also gave higher type I errors, suggesting that increased test thresholds may be needed to maintain the correct error rate. 3) Dense SNP panels provided better trait localization, but only in the COGA data, in which the MS markers were relatively loosely spaced. 4) The strength of linkage signals did not vary with the density of SNP panels, once the marker density was ≥1 SNP/cM. 5) Analyses with SNPs were computationally challenging, and identified areas where improvements in analysis tools will be necessary to make analysis practical for widespread use. Genet. Epidemiol. 29 (Suppl. 1): S7–S28, 2005. © 2005 Wiley-Liss, Inc. [source]
Modelling small-business credit scoring by using logistic regression, neural networks and decision trees INTELLIGENT SYSTEMS IN ACCOUNTING, FINANCE & MANAGEMENT, Issue 3 2005 Mirta Bensic Previous research on credit scoring that used statistical and intelligent methods was mostly focused on commercial and consumer lending. The main purpose of this paper is to extract important features for credit scoring in small-business lending on a relatively small dataset collected under specific transitional economic conditions. To do this, we compare the accuracy of the best models extracted by different methodologies, such as logistic regression, neural networks (NNs), and CART decision trees. Four different NN algorithms are tested, including backpropagation, radial basis function network, probabilistic and learning vector quantization, by using the forward nonlinear variable selection strategy. Although the test of differences in proportion and McNemar's test do not show a statistically significant difference in the models tested, the probabilistic NN model produces the highest hit rate and the lowest type I error. According to the measures of association, the best NN model also shows the highest degree of association with the data, and it yields the lowest total relative cost of misclassification for all scenarios examined. The best model extracts a set of important features for small-business credit scoring for the observed sample, emphasizing credit programme characteristics, as well as entrepreneur's personal and business characteristics as the most important ones. Copyright © 2005 John Wiley & Sons, Ltd. [source]
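Both the Härdle et al. and Bensic et al. abstracts above weigh Type I against Type II errors through a relative misclassification cost. The sketch below computes such a weighted cost from a confusion matrix for a few candidate score thresholds; the scores, labels, and the 5:1 cost ratio are invented for illustration and are not taken from either study.

```python
import numpy as np

rng = np.random.default_rng(2)
labels = rng.integers(0, 2, 1000)                   # 1 = later defaulted, 0 = repaid (toy data)
scores = labels * 0.3 + rng.uniform(0, 1, 1000)     # higher score = riskier looking

def weighted_cost(threshold, cost_type_i=5.0, cost_type_ii=1.0):
    predicted_bad = scores >= threshold
    type_i = np.sum((labels == 1) & ~predicted_bad)   # false acceptances: bad client accepted
    type_ii = np.sum((labels == 0) & predicted_bad)   # false rejections: good client rejected
    return cost_type_i * type_i + cost_type_ii * type_ii

for t in (0.4, 0.6, 0.8):
    print(f"threshold {t:.1f}: weighted misclassification cost = {weighted_cost(t):.0f}")
```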
Use of resampling to select among alternative error structure specifications for GLMM analyses of repeated measurements INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, Issue 1 2004 Scott Tonidandel Abstract Autocorrelated error and missing data due to dropouts have fostered interest in the flexible general linear mixed model (GLMM) procedures for analysis of data from controlled clinical trials. The user of these adaptable statistical tools must, however, choose among alternative structural models to represent the correlated repeated measurements. The fit of the error structure model specification is important for validity of tests for differences in patterns of treatment effects across time, particularly when maximum likelihood procedures are relied upon. Results can be affected significantly by the error specification that is selected, so a principled basis for selecting the specification is important. As no theoretical grounds are usually available to guide this decision, empirical criteria have been developed that focus on model fit. The current report proposes alternative empirical criteria that focus on bootstrap estimates of actual type I error and power of tests for treatment effects. Results for model selection before and after the blind is broken are compared. Goodness-of-fit statistics also compare favourably for models fitted to the blinded or unblinded data, although the correspondence to actual type I error and power depends on the particular fit statistic that is considered. Copyright © 2004 Whurr Publishers Ltd. [source]
OLS ESTIMATION AND THE t TEST REVISITED IN RANK-SIZE RULE REGRESSION JOURNAL OF REGIONAL SCIENCE, Issue 4 2008 Yoshihiko Nishiyama ABSTRACT The rank-size rule and Zipf's law for city sizes have been traditionally examined by means of OLS estimation and the t test. This paper studies the accurate and approximate properties of the OLS estimator and obtains the distribution of the t statistic under the assumption of Zipf's law (i.e., Pareto distribution). Indeed, we show that the t statistic explodes asymptotically even under the null, indicating that a mechanical application of the t test yields a serious type I error. To overcome this problem, critical regions of the t test are constructed to test Zipf's law. Using these corrected critical regions, we can conclude that our results are in favor of Zipf's law for many more countries than in previous research such as Rosen and Resnick (1980) or Soo (2005). By using the same database as that used in Soo (2005), we demonstrate that Zipf's law is rejected for only one of 24 countries under our test whereas it is rejected for 23 of 24 countries under the usual t test. We also propose a more efficient estimation procedure and provide empirical applications of the theory for some countries. [source]
Estimating the variance of estimated trends in proportions when there is no unique subject identifier JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES A (STATISTICS IN SOCIETY), Issue 1 2007 William K. Mountford Summary. Longitudinal population-based surveys are widely used in the health sciences to study patterns of change over time. In many of these data sets unique patient identifiers are not publicly available, making it impossible to link the repeated measures from the same individual directly.
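The Nishiyama et al. abstract above argues that the usual t test in rank-size regressions rejects a true Zipf law far too often. The short simulation below reproduces that style of check, fitting log rank on log size for Pareto-distributed city sizes and applying the naive t test of slope = -1; the sample size and replication count are arbitrary choices, and the rejection rate is typically far above the nominal 5%.

```python
import numpy as np

rng = np.random.default_rng(5)

def naive_t_reject(n_cities=100):
    """Simulate city sizes that satisfy Zipf's law exactly (Pareto, exponent 1),
    run the rank-size OLS regression, and apply the usual t test of slope = -1."""
    sizes = rng.pareto(1.0, n_cities) + 1.0
    x = np.log(np.sort(sizes)[::-1])            # log size, largest first
    y = np.log(np.arange(1, n_cities + 1))      # log rank
    b, a = np.polyfit(x, y, 1)
    resid = y - (a + b * x)
    se_b = np.sqrt(resid.var(ddof=2) / ((x - x.mean()) ** 2).sum())
    t = (b + 1.0) / se_b                        # H0: slope = -1 (Zipf's law)
    return abs(t) > 1.96

rejection_rate = np.mean([naive_t_reject() for _ in range(500)])
print(f"naive t test rejection rate under a true Zipf law: {rejection_rate:.2f} (nominal 0.05)")
```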
This poses a statistical challenge for making inferences about time trends because repeated measures from the same individual are likely to be positively correlated, i.e., although the time trend that is estimated under the naïve assumption of independence is unbiased, an unbiased estimate of the variance cannot be obtained without knowledge of the subject identifiers linking repeated measures over time. We propose a simple method for obtaining a conservative estimate of variability for making inferences about trends in proportions over time, ensuring that the type I error is no greater than the specified level. The method proposed is illustrated by using longitudinal data on diabetes hospitalization proportions in South Carolina. [source]
Early stopping by using stochastic curtailment in a three-arm sequential trial JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 2 2003 Denis Heng-Yan Leung Summary. Interim analysis is important in a large clinical trial for ethical and cost considerations. Sometimes, an interim analysis needs to be performed at an earlier than planned time point. In that case, methods using stochastic curtailment are useful in examining the data for early stopping while controlling the inflation of type I and type II errors. We consider a three-arm randomized study of treatments to reduce perioperative blood loss following major surgery. Owing to slow accrual, an unplanned interim analysis was required by the study team to determine whether the study should be continued. We distinguish two different cases: when all treatments are under direct comparison and when one of the treatments is a control. We used simulations to study the operating characteristics of five different stochastic curtailment methods. We also considered the influence of timing of the interim analyses on the type I error and power of the test. We found that the type I error and power between the different methods can be quite different. The analysis for the perioperative blood loss trial was carried out at approximately a quarter of the planned sample size. We found that there is little evidence that the active treatments are better than a placebo and recommended closure of the trial. [source]
Distribution and abundance of West Greenland humpback whales (Megaptera novaeangliae) JOURNAL OF ZOOLOGY, Issue 4 2004 Finn Larsen Abstract Photo-identification surveys of humpback whales Megaptera novaeangliae were conducted at West Greenland during 1988–93, the last 2 years of which were part of the internationally coordinated humpback whale research programme YoNAH, with the primary aim of estimating abundance for the West Greenland feeding aggregation. The area studied stretched from the coast out to the offshore margin of the banks, determined approximately by the 200 m depth contours, between c. 61°70′N and c. 66°N. The surveys were conducted between early July and mid-August and 993 h were expended on searching effort. A total of 670 groups of humpback whales was encountered, leading to the identification of 348 individual animals. Three areas of concentration were identified: an area off Nuuk; an area at c. 63°30′N; and an area off Frederikshåb. Sequential Petersen capture–recapture estimates of abundance were calculated for five pairs of years at 357 (1988–89), 355 (1989–90), 566 (1990–91), 376 (1991–92), and 348 (1992–93).
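The sequential Petersen estimates listed above come from photo-identified individuals seen in pairs of consecutive years. The sketch below shows the Chapman-modified Petersen estimator and its usual variance for one hypothetical pair of years; the counts used here are invented, not the paper's data.

```python
import math

def chapman_petersen(n1, n2, m):
    """Chapman's modification of the two-sample Petersen capture-recapture estimator.
    n1, n2: individuals identified in year 1 and year 2; m: individuals seen in both years."""
    n_hat = (n1 + 1) * (n2 + 1) / (m + 1) - 1
    var = ((n1 + 1) * (n2 + 1) * (n1 - m) * (n2 - m)) / ((m + 1) ** 2 * (m + 2))
    return n_hat, math.sqrt(var)

# Hypothetical year pair: 80 whales photo-identified in year 1, 75 in year 2, 17 matches
estimate, se = chapman_petersen(80, 75, 17)
print(f"abundance estimate = {estimate:.0f}, se = {se:.0f}, CV = {se / estimate:.2f}")
```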
Excluding the anomalously high estimate in 1990–91, the simple mean is 359 (se = 27.3, CV = 0.076) and the inverse CV-squared weighted mean is 356 animals (se = 24.9, CV = 0.070). These calculations lead us to conclude that between 1988 and 1993 there were 360 humpbacks (CV = 0.07) in the West Greenland feeding aggregation. Using the Cormack–Jolly–Seber model framework, non-calf survival rate was estimated at 0.957 (se = 0.028). Our data have low power (P < 0.3) to detect a trend of 3.1%, assuming the probability of a type I error was 0.05. [source]
Impact of baseline ECG collection on the planning, analysis and interpretation of 'thorough' QT trials PHARMACEUTICAL STATISTICS: THE JOURNAL OF APPLIED STATISTICS IN THE PHARMACEUTICAL INDUSTRY, Issue 2 2009 Venkat Sethuraman Abstract The current guidelines, ICH E14, for the evaluation of non-antiarrhythmic compounds require a 'thorough' QT study (TQT) conducted during clinical development (ICH Guidance for Industry E14, 2005). Owing to the regulatory choice of margin (10 ms), the TQT studies must be conducted to rigorous standards to ensure that variability is minimized. Some of the key sources of variation can be controlled by use of randomization, crossover design, standardization of electrocardiogram (ECG) recording conditions and collection of replicate ECGs at each time point. However, one of the key factors in these studies is the baseline measurement, which if not controlled and consistent across studies could lead to significant misinterpretation. In this article, we examine three types of baseline methods widely used in TQT studies to derive a change from baseline in QTc (time-matched, time-averaged and pre-dose-averaged baseline). We discuss the impact of the baseline values on the guidance-recommended 'largest time-matched' analyses. Using simulation we have shown the impact of these baseline approaches on the type I error and power for both crossover and parallel group designs. In this article, we show that the power of the study decreases as the number of time points tested in a TQT study increases. A time-matched baseline method is recommended by several authors (Drug Saf. 2005; 28(2):115–125, Health Canada guidance document: guide for the analysis and review of QT/QTc interval data, 2006) due to the existence of the circadian rhythm in QT. However, the impact of the time-matched baseline method on statistical inference and sample size should be considered carefully during the design of a TQT study. The time-averaged baseline had the highest power in comparison with other baseline approaches. Copyright © 2008 John Wiley & Sons, Ltd. [source]
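The Sethuraman abstract above contrasts time-matched, time-averaged, and pre-dose-averaged baselines for QTc change. The sketch below computes the three baseline-corrected changes for one subject's made-up QTc series simply to make the definitions concrete; all values and time points are invented, and the choice of baseline definitions is a plain reading of the abstract rather than the authors' simulation setup.

```python
import numpy as np

# Hypothetical QTc (ms) for one subject at matched clock times (invented numbers)
times = ["1h", "2h", "4h", "8h"]
baseline_day = np.array([405.0, 410.0, 402.0, 398.0])   # drug-free baseline day
treatment_day = np.array([412.0, 425.0, 418.0, 404.0])  # on-treatment day
predose = np.array([403.0, 407.0, 406.0])               # pre-dose replicates on treatment day

time_matched = treatment_day - baseline_day              # per-time-point baseline
time_averaged = treatment_day - baseline_day.mean()      # single averaged baseline
predose_averaged = treatment_day - predose.mean()        # mean of pre-dose replicates

for label, change in [("time-matched", time_matched),
                      ("time-averaged", time_averaged),
                      ("pre-dose-averaged", predose_averaged)]:
    print(f"{label:>18}: largest change = {change.max():.1f} ms at {times[int(change.argmax())]}")
```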