Used Computer Simulations (used + computer_simulation)


Selected Abstracts


Estimation of allele frequencies with data on sibships

GENETIC EPIDEMIOLOGY, Issue 3 2001
Karl W. Broman
Abstract Allele frequencies are generally estimated with data on a set of unrelated individuals. In genetic studies of late-onset diseases, the founding individuals in pedigrees are often not available, and so one is confronted with the problem of estimating allele frequencies with data on related individuals. We focus on sibpairs and sibships, and compare the efficiency of four methods for estimating allele frequencies in this situation: (1) use the data for one individual from each sibship; (2) use the data for all individuals, ignoring their relationships; (3) use the data for all individuals, taking proper account of their relationships, considering a single marker at a time; and (4) use the data for all individuals, taking proper account of their relationships, considering a set of linked markers simultaneously. We derived the variance of estimator 2, and showed that the estimator is unbiased and provides substantial improvement over method 1. We used computer simulation to study the performance of methods 3 and 4, and showed that method 3 provides some improvement over method 2, while method 4 improves little on method 3. Genet. Epidemiol. 20:307–315, 2001. © 2001 Wiley-Liss, Inc.
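The comparison between methods 1 and 2 can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a single biallelic marker, parents drawn under Hardy-Weinberg proportions, and illustrative parameter values (allele frequency 0.3, 50 sibships of 3 sibs). All function names are hypothetical.

```python
import random
from statistics import mean, variance


def simulate_sibship(p, n_sibs, rng):
    """Return the number of copies (0, 1 or 2) of allele A carried by each sib."""
    # Assumption: each parental allele is A with probability p, independently.
    father = [rng.random() < p, rng.random() < p]
    mother = [rng.random() < p, rng.random() < p]
    # Each sib inherits one randomly chosen allele from each parent.
    return [int(rng.choice(father)) + int(rng.choice(mother)) for _ in range(n_sibs)]


def compare_estimators(p=0.3, n_families=50, n_sibs=3, n_reps=2000, seed=1):
    """Simulate replicate data sets and summarize estimators 1 and 2."""
    rng = random.Random(seed)
    est1, est2 = [], []
    for _ in range(n_reps):
        families = [simulate_sibship(p, n_sibs, rng) for _ in range(n_families)]
        # Method 1: use one individual (here the first sib) from each sibship.
        est1.append(sum(f[0] for f in families) / (2 * n_families))
        # Method 2: use all sibs and ignore their relationships.
        est2.append(sum(sum(f) for f in families) / (2 * n_families * n_sibs))
    return {
        "mean1": mean(est1), "var1": variance(est1),
        "mean2": mean(est2), "var2": variance(est2),
    }


if __name__ == "__main__":
    for name, value in compare_estimators().items():
        print(f"{name}: {value:.5f}")
```

Under these assumptions both estimators average close to the true frequency, and the variance of method 2 is smaller than that of method 1, in line with the abstract. Methods 3 and 4 require a likelihood over sibship genotypes and are not sketched here.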


Optimal designs for estimating penetrance of rare mutations of a disease-susceptibility gene

GENETIC EPIDEMIOLOGY, Issue 3 2003
Gail Gong
Abstract Many clinical decisions require accurate estimates of disease risks associated with mutations of known disease-susceptibility genes. Such risk estimation is difficult when the mutations are rare. We used computer simulations to compare the performance of estimates obtained from two types of designs based on family data. In the first (clinic-based designs), families are ascertained because they meet certain criteria concerning multiple disease occurrences among family members. In the second (population-based designs), families are sampled through a population-based registry of affected individuals called probands, with oversampling of probands whose families are more likely to segregate mutations. We generated family structures, genotypes, and phenotypes using models that reflect the frequencies and penetrances of mutations of the BRCA1/2 genes. We studied the effects of risk heterogeneity due to unmeasured, shared risk factors by including risk variation due to unmeasured genotypes of another gene. The simulations were chosen to mimic the ascertainment and selection processes commonly used in the two types of designs. We found that penetrance estimates from both designs are nearly unbiased in the absence of unmeasured shared risk factors, but are biased upward in the presence of such factors. The bias increases with increasing variation in risks across genotypes of the second gene. However, it is small compared to the standard error of the estimates. Standard errors from population-based designs are roughly twice those from clinic-based designs with the same number of families. Using the root-mean-square error as a measure of performance, we found that in all instances, the clinic-based designs gave more accurate estimates than did the population-based designs with the same numbers of families. Rough variance calculations suggest that clinic-based designs give more accurate estimates because they include more identified mutation carriers. 
Genet Epidemiol 24:173–180, 2003. © 2003 Wiley-Liss, Inc.
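As a reminder of why the standard error dominates the comparison above, the root-mean-square error of an estimator decomposes into bias and variance. The notation below (penetrance estimate $\hat\theta$, true value $\theta$) is assumed for illustration and does not come from the paper.

```latex
\mathrm{RMSE}(\hat\theta)
  = \sqrt{\mathbb{E}\big[(\hat\theta - \theta)^2\big]}
  = \sqrt{\mathrm{Bias}(\hat\theta)^2 + \mathrm{Var}(\hat\theta)},
\qquad
\mathrm{Bias}(\hat\theta) = \mathbb{E}[\hat\theta] - \theta .
```

When the bias is small relative to the standard error, the RMSE is approximately the standard error, so a design whose standard errors are roughly twice as large will also have roughly twice the RMSE.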


Power for detecting genetic divergence: differences between statistical methods and marker loci

MOLECULAR ECOLOGY, Issue 8 2006
NILS RYMAN
Abstract Information on statistical power is critical when planning investigations and evaluating empirical data, but actual power estimates are rarely presented in population genetic studies. We used computer simulations to assess and evaluate power when testing for genetic differentiation at multiple loci through combining test statistics or P values obtained by four different statistical approaches, viz. Pearson's chi-square, the log-likelihood ratio G-test, Fisher's exact test, and an FST-based permutation test. Factors considered in the comparisons include the number of samples, their size, and the number and type of genetic marker loci. It is shown that power for detecting divergence may be substantial for frequently used sample sizes and sets of markers, also at quite low levels of differentiation. The choice of statistical method may be critical, though. For multi-allelic loci such as microsatellites, combining exact P values using Fisher's method is robust and generally provides a high resolving power. In contrast, for few-allele loci (e.g. allozymes and single nucleotide polymorphisms) and when making pairwise sample comparisons, this approach may yield a remarkably low power. In such situations chi-square typically represents a better alternative. The G-test without Williams's correction frequently tends to provide an unduly high proportion of false significances, and results from this test should be interpreted with great care. Our results are not confined to population genetic analyses but applicable to contingency testing in general.
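Fisher's method for combining P values across loci, mentioned above, can be sketched as follows. This is a minimal illustration using only the standard library, not the authors' implementation. The function name `fisher_combined_p` is hypothetical, and the input P values are assumed to be independent and strictly positive.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent P values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i) is
    referred to a chi-square distribution with 2k degrees of freedom.
    """
    k = len(p_values)
    if k == 0 or any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1] and the list must be non-empty")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # For an even number of degrees of freedom (2k) the chi-square upper tail
    # has a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical single-locus P values, for illustration only.
    stat, p = fisher_combined_p([0.08, 0.12, 0.20, 0.05])
    print(f"statistic = {stat:.3f}, combined P = {p:.4f}")
```

With a single P value the combined result equals the input, which gives a quick sanity check of the tail formula.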


Statistical power when testing for genetic differentiation

MOLECULAR ECOLOGY, Issue 10 2001
N. Ryman
Abstract A variety of statistical procedures are commonly employed when testing for genetic differentiation. In a typical situation two or more samples of individuals have been genotyped at several gene loci by molecular or biochemical means, and in a first step a statistical test for allele frequency homogeneity is performed at each locus separately, using, e.g. the contingency chi-square test, Fisher's exact test, or some modification thereof. In a second step the results from the separate tests are combined for evaluation of the joint null hypothesis that there is no allele frequency difference at any locus, corresponding to the important case where the samples would be regarded as drawn from the same statistical and, hence, biological population. Presently, there are two conceptually different strategies in use for testing the joint null hypothesis of no difference at any locus. One approach is based on the summation of chi-square statistics over loci. Another method is employed by investigators applying the Bonferroni technique (adjusting the P-value required for rejection to account for the elevated alpha errors when performing multiple tests simultaneously) to test if the heterogeneity observed at any particular locus can be regarded as significant when considered separately. Under this approach the joint null hypothesis is rejected if one or more of the component single locus tests is considered significant under the Bonferroni criterion. We used computer simulations to evaluate the statistical power and realized alpha errors of these strategies when evaluating the joint hypothesis after scoring multiple loci. We find that the 'extended' Bonferroni approach generally is associated with low statistical power and should not be applied in the current setting. Further, and contrary to what might be expected, we find that 'exact' tests typically behave poorly when combined in existing procedures for joint hypothesis testing. Thus, while exact tests are generally to be preferred over approximate ones when testing each particular locus, approximate tests such as the traditional chi-square seem preferable when addressing the joint hypothesis.
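The two strategies for the joint null hypothesis can be contrasted in a few lines. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: the per-locus chi-square statistics are hypothetical, every locus is assumed to have one degree of freedom, and the helper names (`chi2_sf`, `summed_chi2_test`, `bonferroni_test`) are invented for this example.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Start from a closed-form tail: Q(1, h) for even df, Q(1/2, h) for odd df.
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def summed_chi2_test(stats, dfs, alpha=0.05):
    """Strategy 1: sum chi-square statistics and degrees of freedom over loci."""
    p = chi2_sf(sum(stats), sum(dfs))
    return p, p < alpha


def bonferroni_test(stats, dfs, alpha=0.05):
    """Strategy 2: reject if any single-locus P value is below alpha / n_loci."""
    p_values = [chi2_sf(s, d) for s, d in zip(stats, dfs)]
    return min(p_values), min(p_values) < alpha / len(p_values)


if __name__ == "__main__":
    # Hypothetical statistics: five loci, each with a modest signal.
    stats = [3.0, 3.2, 2.8, 3.5, 3.1]
    dfs = [1] * len(stats)
    print("summed chi-square:", summed_chi2_test(stats, dfs))
    print("Bonferroni:", bonferroni_test(stats, dfs))
```

In this constructed example the summed statistic rejects the joint null hypothesis while no single locus passes the Bonferroni threshold, which is one way the lower power reported for the Bonferroni approach can arise.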