Candidate Gene Association Studies
Selected Abstracts

Two-sample Comparison Based on Prediction Error, with Applications to Candidate Gene Association Studies
ANNALS OF HUMAN GENETICS, Issue 1 2007
K. Yu

Summary: To take advantage of the increasingly available high-density SNP maps across the genome, various tests that compare multilocus genotypes or estimated haplotypes between cases and controls have been developed for candidate gene association studies. Here we view this two-sample testing problem from the perspective of supervised machine learning and propose a new association test. The approach adopts the flexible and easy-to-understand classification tree model as the learning machine, and uses the estimated prediction error of the resulting prediction rule as the test statistic. This procedure not only provides an association test but also generates a prediction rule that can be useful in understanding the mechanisms underlying complex disease. Under the set-up of a haplotype-based transmission/disequilibrium test (TDT) type of analysis, we find through simulation studies that the proposed procedure has the correct type I error rates and is robust to population stratification. The power of the proposed procedure is sensitive to the chosen prediction error estimator. Among commonly used prediction error estimators, the .632+ estimator results in a test that has the best overall performance. We also find that the test using the .632+ estimator is more powerful than the standard single-point TDT analysis, the Pearson's goodness-of-fit test based on estimated haplotype frequencies, and two haplotype-based global tests implemented in the genetic analysis package FBAT. To illustrate the application of the proposed method in population-based association studies, we use the procedure to study the association between non-Hodgkin lymphoma and the IL10 gene.
[source]

Candidate genes for cannabis use disorders: findings, challenges and directions
ADDICTION, Issue 4 2009
Arpana Agrawal

ABSTRACT
Aim: Twin studies have shown that cannabis use disorders (abuse/dependence) are highly heritable. This review aims to (i) review existing linkage studies of cannabis use disorders, (ii) review gene association studies to identify potential candidate genes, including those that have been tested for composite substance use disorders, and (iii) highlight challenges in the genomic study of cannabis use disorders.
Methods: Peer-reviewed linkage and candidate gene association studies are reviewed.
Results: Four linkage studies are reviewed: results from these have homed in on regions on chromosomes 1, 3, 4, 9, 14, 17 and 18, which harbor candidates of predicted biological relevance, such as monoglyceride lipase (MGLL) on chromosome 3, but also novel genes, including ELTD1 [epidermal growth factor (EGF), latrophilin and seven transmembrane domain containing 1] on chromosome 1. Gene association studies are presented for (a) genes posited to have specific influences on cannabis use disorders: CNR1, CB2, FAAH, MGLL, TRPV1 and GPR55, and (b) genes from various neurotransmitter systems that are likely to exert a non-specific influence on risk of cannabis use disorders, e.g. GABRA2, DRD2 and OPRM1.
Conclusions: There are challenges associated with (i) understanding biological complexity underlying cannabis use disorders (including the need to study gene-gene and gene-environment interactions), (ii) using diagnostic versus quantitative phenotypes, (iii) delineating which stage of cannabis involvement (e.g. use versus misuse) genes influence and (iv) problems of sample ascertainment.
[source]

A principal components regression approach to multilocus genetic association studies
GENETIC EPIDEMIOLOGY, Issue 2 2008
Kai Wang

Abstract: With the rapid development of modern genotyping technology, it is becoming commonplace to genotype densely spaced genetic markers such as single nucleotide polymorphisms (SNPs) along the genome. This development has inspired a strong interest in using multiple markers located in the target region for the detection of association. We introduce a principal components (PCs) regression method for candidate gene association studies where multiple SNPs from the candidate region tend to be correlated. In this approach, the total variance in the original genotype scores is decomposed into parts that correspond to uncorrelated PCs. The PCs with the largest variances are then used as regressors in a multiple regression. Simulation studies suggest that this approach can have higher power than some popular methods. An application to CHI3L2 gene expression data confirms a significant association between CHI3L2 gene expression level and SNPs from this gene that has been previously reported by others. Genet. Epidemiol. 2008. © 2007 Wiley-Liss, Inc.
[source]

Quantifying bias due to allele misclassification in case-control studies of haplotypes
GENETIC EPIDEMIOLOGY, Issue 7 2006
Usha S. Govindarajulu

Abstract
Objectives: Genotyping errors can induce biases in frequency estimates for haplotypes of single nucleotide polymorphisms (SNPs). Here, we considered the impact of SNP allele misclassification on haplotype odds ratio estimates from case-control studies of unrelated individuals.
Methods: We calculated bias analytically, using the haplotype counts expected in cases and controls under genotype misclassification. We evaluated the bias due to allele misclassification across a range of haplotype distributions using empirical haplotype frequencies within blocks of limited haplotype diversity.
We also considered simple two- and three-locus haplotype distributions to understand the impact of haplotype frequency and number of SNPs on misclassification bias.
Results: We found that for common haplotypes (>5% frequency), realistic genotyping error rates (0.1–1% chance of miscalling an allele), and moderate relative risks (2–4), the bias was always towards the null and increased in magnitude with increasing error rate and increasing odds ratio. For common haplotypes, bias generally increased with increasing haplotype frequency, while for rare haplotypes, bias generally increased with decreasing frequency. When the chance of miscalling an allele is 0.5%, the median bias in haplotype-specific odds ratios for common haplotypes was generally small (<4% on the log odds ratio scale), but the bias for some individual haplotypes was larger (10–20%). Bias towards the null leads to a loss in power; the relative efficiency using a test statistic based upon misclassified haplotype data compared to a test based on the unobserved true haplotypes ranged from roughly 60% to 80%, and worsened with increasing haplotype frequency.
Conclusions: The cumulative effect of small allele-calling errors across multiple loci can induce noticeable bias and reduce power in realistic scenarios. This has implications for the design of candidate gene association studies that utilize multi-marker haplotypes. Genet. Epidemiol. 2006. © 2006 Wiley-Liss, Inc.
[source]
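
The bias towards the null described in the Govindarajulu abstract on allele misclassification can be illustrated with a small calculation. What follows is a minimal sketch, not the authors' analytic method. It assumes two biallelic SNPs, haplotypes that are observed directly (no phase estimation), a symmetric per-allele error rate that acts independently at each locus and equally in cases and controls, and made-up control haplotype frequencies. All function names are hypothetical.

```python
import math
from itertools import product


def misclassification_matrix(n_loci, error_rate):
    """Return M with M[i][j] = P(observed haplotype j | true haplotype i).

    Assumption: each allele is miscalled independently with probability
    error_rate, so the haplotype-level probability is a product over loci.
    """
    haplotypes = list(product((0, 1), repeat=n_loci))
    return [
        [
            math.prod(
                (1 - error_rate) if t == o else error_rate
                for t, o in zip(true_hap, obs_hap)
            )
            for obs_hap in haplotypes
        ]
        for true_hap in haplotypes
    ]


def observed_frequencies(true_freqs, matrix):
    """Expected haplotype frequencies after misclassification."""
    n = len(true_freqs)
    return [sum(true_freqs[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def case_frequencies(control_freqs, risk_index, odds_ratio):
    """Case frequencies when one haplotype carries the given odds ratio."""
    weights = [
        f * (odds_ratio if i == risk_index else 1.0)
        for i, f in enumerate(control_freqs)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def haplotype_odds_ratio(case_freqs, control_freqs, index):
    """Odds ratio for one haplotype versus all other haplotypes combined."""
    def odds(p):
        return p / (1 - p)
    return odds(case_freqs[index]) / odds(control_freqs[index])


def misclassified_odds_ratio(control_freqs, risk_index, true_or, error_rate, n_loci=2):
    """Return (observed odds ratio, relative bias on the log odds ratio scale)."""
    matrix = misclassification_matrix(n_loci, error_rate)
    cases = case_frequencies(control_freqs, risk_index, true_or)
    obs_cases = observed_frequencies(cases, matrix)
    obs_controls = observed_frequencies(control_freqs, matrix)
    observed_or = haplotype_odds_ratio(obs_cases, obs_controls, risk_index)
    relative_bias = math.log(observed_or) / math.log(true_or) - 1
    return observed_or, relative_bias


if __name__ == "__main__":
    # Illustrative (made-up) control frequencies for haplotypes 00, 01, 10, 11.
    controls = [0.4, 0.3, 0.2, 0.1]
    obs_or, bias = misclassified_odds_ratio(controls, 0, true_or=2.0, error_rate=0.005)
    print(f"observed OR = {obs_or:.4f}, relative bias on log scale = {bias:.2%}")
```

In this toy setting, a 0.5% per-allele error rate and a true odds ratio of 2 give an observed odds ratio slightly below 2, that is, shifted towards the null. Setting `error_rate` to 0 recovers the true odds ratio.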