Posterior Probability (posterior + probability)

Terms modified by Posterior Probability

  • posterior probability distribution

Selected Abstracts


    Calculation of Posterior Probabilities for Bayesian Model Class Assessment and Averaging from Posterior Samples Based on Dynamic System Data

    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, Issue 5 2010
    Sai Hung Cheung
    Because of modeling uncertainty, a set of competing candidate model classes may be available to represent a system and it is then desirable to assess the plausibility of each model class based on system data. Bayesian model class assessment may then be used, which is based on the posterior probability of the different candidates for representing the system. If more than one model class has significant posterior probability, then Bayesian model class averaging provides a coherent mechanism to incorporate all of these model classes in making probabilistic predictions for the system response. This Bayesian model assessment and averaging requires calculation of the evidence for each model class based on the system data, which requires the evaluation of a multi-dimensional integral involving the product of the likelihood and prior defined by the model class. In this article, a general method for calculating the evidence is proposed based on using posterior samples from any Markov Chain Monte Carlo algorithm. The effectiveness of the proposed method is illustrated by Bayesian model updating and assessment using simulated earthquake data from a ten-story nonclassically damped building responding linearly and a four-story building responding inelastically. [source]
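
    Once the evidence for each model class is available, the posterior probabilities of the model classes follow from Bayes' theorem. The sketch below is a minimal illustration of that last step only, not the evidence-estimation method of the article. The function name, the uniform default prior, and the log-evidence values are assumptions made for the example.

```python
import math

def posterior_model_probabilities(log_evidences, priors=None):
    """Turn log-evidences ln p(D | M_j) into posterior probabilities P(M_j | D).

    Assumption: priors default to uniform over the candidate model classes.
    """
    n = len(log_evidences)
    if priors is None:
        priors = [1.0 / n] * n
    # Work in log space and subtract the maximum to avoid underflow.
    log_joint = [le + math.log(p) for le, p in zip(log_evidences, priors)]
    peak = max(log_joint)
    weights = [math.exp(v - peak) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical log-evidences for three candidate model classes.
probs = posterior_model_probabilities([-1203.4, -1201.1, -1210.0])
```

    Model class averaging would then weight each class's prediction by the corresponding entry of `probs`.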


    Using Biowin™, Bayes, and batteries to predict ready biodegradability

    ENVIRONMENTAL TOXICOLOGY & CHEMISTRY, Issue 4 2004
    Robert S. Boethling
    Abstract Whether or not a given chemical substance is readily biodegradable is an important piece of information in risk screening for both new and existing chemicals. Despite the relatively low cost of Organization for Economic Cooperation and Development tests, data are often unavailable and biodegradability must be estimated. In this paper, we focus on the predictive value of selected Biowin™ models and model batteries using Bayesian analysis. Posterior probabilities, calculated based on performance with the model training sets using Bayes' theorem, were closely matched by actual performance with an expanded set of 374 premanufacture notice (PMN) substances. Further analysis suggested that a simple battery consisting of Biowin3 (survey ultimate biodegradation model) and Biowin5 (Ministry of International Trade and Industry [MITI] linear model) would have enhanced predictive power in comparison to individual models. Application of the battery to PMN substances showed that performance matched expectation. This approach significantly reduced both false positives for ready biodegradability and the overall misclassification rate. Similar results were obtained for a set of 63 pharmaceuticals using a battery consisting of Biowin3 and Biowin6 (MITI nonlinear model). Biodegradation data for PMNs tested in multiple ready tests or both inherent and ready biodegradation tests yielded additional insights that may be useful in risk screening. [source]
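
    The posterior probability that a substance is readily biodegradable, given a positive model prediction, can be illustrated with Bayes' theorem applied to a model's training-set performance. The sketch below assumes a single binary model summarized by sensitivity and specificity. The function name and all numbers are hypothetical and are not taken from the paper.

```python
def posterior_given_positive(prior, sensitivity, specificity):
    """P(readily biodegradable | model predicts 'ready') via Bayes' theorem.

    prior       -- assumed fraction of readily biodegradable substances
    sensitivity -- P(predicts 'ready' | truly ready), from training data
    specificity -- P(predicts 'not ready' | truly not ready), from training data
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)

# Hypothetical values: half the substances are ready, model is 80% accurate
# on both classes.
p_ready = posterior_given_positive(prior=0.5, sensitivity=0.8, specificity=0.8)
```

    A lower prior (fewer readily biodegradable substances in the population being screened) reduces the posterior for the same model performance, which is why false positives matter in screening.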


    A Geometric Approach to Comparing Treatments for Rapidly Fatal Diseases

    BIOMETRICS, Issue 1 2006
    Peter F. Thall
    Summary In therapy of rapidly fatal diseases, early treatment efficacy often is characterized by an event, "response," which is observed relatively quickly. Since the risk of death decreases at the time of response, it is desirable not only to achieve a response, but to do so as rapidly as possible. We propose a Bayesian method for comparing treatments in this setting based on a competing risks model for response and death without response. Treatment effect is characterized by a two-dimensional parameter consisting of the probability of response within a specified time and the mean time to response. Several target parameter pairs are elicited from the physician so that, for a reference covariate vector, all elicited pairs embody the same improvement in treatment efficacy compared to a fixed standard. A curve is fit to the elicited pairs and used to determine a two-dimensional parameter set in which a new treatment is considered superior to the standard. Posterior probabilities of this set are used to construct rules for the treatment comparison and safety monitoring. The method is illustrated by a randomized trial comparing two cord blood transplantation methods. [source]


    A dictionary model for haplotyping, genotype calling, and association testing

    GENETIC EPIDEMIOLOGY, Issue 7 2007
    Kristin L. Ayers
    Abstract We propose a new method for haplotyping, genotype calling, and association testing based on a dictionary model for haplotypes. In this framework, a haplotype arises as a concatenation of conserved haplotype segments, drawn from a predefined dictionary according to segment-specific probabilities. The observed data consist of unphased multimarker genotypes gathered on a random sample of unrelated individuals. These genotypes are subject to mutation, genotyping errors, and missing data. The true pair of haplotypes corresponding to a person's multimarker genotype is reconstructed using a Markov chain that visits haplotype pairs according to their posterior probabilities. Our implementation of the chain alternates Gibbs steps, which rearrange the phase of a single marker, and Metropolis steps, which swap maternal and paternal haplotypes from a given marker onward. Output of the chain includes the most likely haplotype pairs, the most likely genotypes at each marker, and the expected number of occurrences of each haplotype segment. Reconstruction accuracy is comparable to that achieved by the best existing algorithms. More importantly, the dictionary model yields expected counts of conserved haplotype segments. These imputed counts can serve as genetic predictors in association studies, as we illustrate by examples on cystic fibrosis, Friedreich's ataxia, and angiotensin-I converting enzyme levels. Genet. Epidemiol. © 2007 Wiley-Liss, Inc. [source]


    Novel strategies to approximate probability trees in penniless propagation

    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, Issue 2 2003
    Andrés Cano
    In this article we introduce some modifications over the Penniless propagation algorithm. When a message through the join tree is approximated, the corresponding error is quantified in terms of an improved information measure, which leads to a new way of pruning several values in a probability tree (representing a message) by a single one, computed from the value stored in the tree being pruned but taking into account the message stored in the opposite direction. Also, we have considered the possibility of replacing small probability values by zero. Locally, this is not an optimal approximation strategy, but in Penniless propagation many different local approximations are carried out in order to estimate the posterior probabilities and, as we show in some experiments, replacing by zeros can improve the quality of the final approximations. © 2003 Wiley Periodicals, Inc. [source]


    Genetic structure and differentiation of 12 African Bos indicus and Bos taurus cattle breeds, inferred from protein and microsatellite polymorphisms

    JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 1 2005
    E.M. Ibeagha-Awemu
    Summary Level of genetic differentiation, gene flow and genetic structuring of nine Bos indicus and three Bos taurus cattle breeds in Cameroon and Nigeria were estimated using the genetic information from 16 microsatellite, five blood protein and seven milk protein markers. The global heterozygote deficit across all populations (Fit) amounted to 11.7% (p < 0.001). The overall significant (p < 0.001) deficit of heterozygotes because of inbreeding within breeds (Fis) amounted to 6.1%. The breeds were moderately differentiated (Fst = 6%, p < 0.001) with all loci except CSN1S2 contributing significantly to the Fst value. The 12 populations belong to two genetic clusters, a zebu and a taurine cluster. While inferred sub-clusters within the taurine group corresponded extremely well to predefined breed categorizations, no real sub-clusters, corresponding to predefined breeds, existed within the zebu cluster. With the application of prior population information, cluster analysis achieved posterior probabilities from 0.962 to 0.994 of correctly assigning individuals to their rightful populations. High gene flow was evident between the zebu populations. Positive and negative implications of the observed genetic structure of the breeds on their development, improvement and conservation are discussed. The study shows that the breeds are threatened by uncontrolled breeding and therefore are at risk of becoming genetically uniform in the future. This situation can be avoided by putting in place effective breeding and management measures aimed at limiting uncontrolled mating between the breeds and at preserving special characteristics, genetic as well as breed biodiversity. The first step towards realizing these goals might be to geographically demarcate the breeds. [source]


    Assessing Goodness of Fit of Item Response Theory Models: A Comparison of Traditional and Alternative Procedures

    JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 4 2003
    Clement A. Stone
    Testing the goodness of fit of item response theory (IRT) models is relevant to validating IRT models, and new procedures have been proposed. These alternatives compare observed and expected response frequencies conditional on observed total scores, and use posterior probabilities for responses across θ levels rather than cross-classifying examinees using point estimates of θ and score responses. This research compared these alternatives with regard to their methods, properties (Type 1 error rates and empirical power), available research, and practical issues (computational demands, treatment of missing data, effects of sample size and sparse data, and available computer programs). Different advantages and disadvantages related to these characteristics are discussed. A simulation study provided additional information about empirical power and Type 1 error rates. [source]


    Phylogeny and speciation of the eastern Asian cyprinid genus Sarcocheilichthys

    JOURNAL OF FISH BIOLOGY, Issue 5 2008
    L. Zhang
    The genus Sarcocheilichthys is a group of small cyprinid fishes comprising 10 species/sub-species widely distributed in East Asia, which represents a valuable model for understanding the speciation of freshwater fishes in East Asia. In the present study, the molecular phylogenetic relationship of the genus Sarcocheilichthys was investigated using a 1140 bp section of the mitochondrial cytochrome b gene. Two different tree-building methods, maximum parsimony (MP) and Bayesian methods, yielded trees with almost the same topology, yielding high bootstrap values or posterior probabilities. The results showed that the genus Sarcocheilichthys consists of two large clades, clades I and II. Clade I contains Sarcocheilichthys lacustris, Sarcocheilichthys sinensis and Sarcocheilichthys parvus, with S. parvus at a basal position. In clade II, Sarcocheilichthys variegatus microoculus is at a basal position; samples of the widespread species, Sarcocheilichthys nigripinnis, form a large subclade containing another valid species Sarcocheilichthys czerskii. Sarcocheilichthys kiangsiensis is retained at an intermediate position. Since S. czerskii is a valid species in the S. nigripinnis clade, remaining samples of S. nigripinnis form a paraphyly. This speciation process is attributed to geographical isolation and special environmental conditions experienced by S. czerskii and stable environments experienced by the other S. nigripinnis populations. This type of speciation process was suggested to be very common. Samples of Sarcocheilichthys sinensis sinensis and Sarcocheilichthys sinensis fukiensis that did not form their own monophyletic groups suggest an early stage of speciation and support their sub-species status. Molecular clock analysis indicates that the two major lineages of the genus Sarcocheilichthys, clades I and II diverged c. 8·89 million years ago (mya). Sarcocheilichthys v. microoculus from Japan probably diverged 4·78 mya from the Chinese group. 
The northern-southern clades of S. nigripinnis began to diverge c. 2·12 mya, while one lineage of S. nigripinnis evolved into a new species, S. czerskii, c. 0·34 mya. [source]


    A Bayesian threshold nonlinearity test for financial time series

    JOURNAL OF FORECASTING, Issue 1 2005
    Mike K. P. So
    Abstract We propose in this paper a threshold nonlinearity test for financial time series. Our approach adopts reversible-jump Markov chain Monte Carlo methods to calculate the posterior probabilities of two competitive models, namely GARCH and threshold GARCH models. Posterior evidence favouring the threshold GARCH model indicates threshold nonlinearity or volatility asymmetry. Simulation experiments demonstrate that our method works very well in distinguishing GARCH and threshold GARCH models. Sensitivity analysis shows that our method is robust to misspecification in error distribution. In the application to 10 market indexes, clear evidence of threshold nonlinearity is discovered, thus supporting volatility asymmetry. Copyright © 2005 John Wiley & Sons, Ltd. [source]
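
    In reversible-jump MCMC, the posterior probability of each candidate model is commonly estimated as the fraction of retained iterations that the chain spends in that model. The sketch below shows only this counting step. The sampler itself is not implemented, and the function name, the burn-in handling, and the example trace are assumptions made for illustration.

```python
from collections import Counter

def model_probabilities(model_trace, burn_in=0):
    """Estimate posterior model probabilities from a trace of model labels.

    model_trace -- sequence of model labels, one per MCMC iteration
    burn_in     -- number of initial iterations to discard
    """
    kept = list(model_trace)[burn_in:]
    if not kept:
        raise ValueError("no iterations left after burn-in")
    counts = Counter(kept)
    return {model: n / len(kept) for model, n in counts.items()}

# Hypothetical trace of visited models from a reversible-jump chain.
trace = ["GARCH", "TGARCH", "TGARCH", "TGARCH",
         "GARCH", "TGARCH", "TGARCH", "TGARCH"]
estimates = model_probabilities(trace, burn_in=2)
```

    A posterior probability for the threshold model well above one half would be read as evidence of threshold nonlinearity. A real chain would need far more iterations than this toy trace.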


    PHYLOGENETIC RELATIONSHIPS WITHIN THE GENUS HYPNEA (GIGARTINALES, RHODOPHYTA), WITH A DESCRIPTION OF H. CAESPITOSA SP.

    JOURNAL OF PHYCOLOGY, Issue 2 2010

    Species discrimination within the gigartinalean red algal genus Hypnea has been controversial. To help resolve the controversy and explore phylogeny within the genus, we determined rbcL sequences from 30 specimens of 23 species within the genus, cox1 from 22 specimens of 10 species, and psaA from 16 species. We describe H. caespitosa as a new species characterized by a relatively slender main axis; a pulvinate growth habit with entangled, anastomosing, and subulate uppermost branches; and unilaterally borne tetrasporangial sori. The new species occurs in the warm waters of Malaysia, the Philippines, and Singapore. The phylogenetic trees of rbcL, psaA, and cox1 sequences showed a distant relationship of H. caespitosa to H. pannosa J. Agardh from Baja California and the marked differentiation from other similar species. The rbcL + psaA tree supported monophyly of the genus with high bootstrap values and posterior probabilities. The analysis revealed three clades within the genus, corresponding to three sections, namely, Virgatae, Spinuligerae, and Pulvinatae first recognized by J. G. Agardh. Exceptions were H. japonica T. Tanaka in Pulvinatae and H. spinella (C. Agardh) Kütz. in Spinuligerae. [source]


    ULTRASTRUCTURE AND LSU rDNA-BASED REVISION OF PERIDINIUM GROUP PALATINUM (DINOPHYCEAE) WITH THE DESCRIPTION OF PALATINUS GEN.

    JOURNAL OF PHYCOLOGY, Issue 5 2009

    The name Peridinium palatinum Lauterborn currently designates a freshwater peridinioid with 13 epithecal and six cingular plates, and no apical pore complex. Freshwater dinoflagellate floras classify it in Peridinium group palatinum together with P. pseudolaeve M. Lefèvre. General ultrastructure, flagellar apparatus, and pusular components of P. palatinum were examined by serial section TEM and compared to P. cinctum (O. F. Müll.) Ehrenb. and Peridiniopsis borgei Lemmerm., respectively, types of Peridinium and Peridiniopsis. Partial LSU rDNA sequences from P. palatinum, P. pseudolaeve and several peridinioids, woloszynskioids, gymnodinioids, and other dinoflagellates were used for a phylogenetic analysis. General morphology and tabulation of taxa in group palatinum were characterized by SEM. Differences in plate numbers, affecting both the epitheca and the cingulum, combine with differences in plate ornamentation and a suite of internal cell features to suggest a generic-level distinction between Peridinium group palatinum and typical Peridinium. The branching pattern of the phylogenetic tree is compatible with this conclusion, although with low support from bootstrap values and posterior probabilities, as are sequence divergences estimated between species in group palatinum, and typical Peridinium and Peridiniopsis. Palatinus nov. gen. is proposed with the new combinations Palatinus apiculatus nov. comb. (type species; syn. Peridinium palatinum), P. apiculatus var. laevis nov. comb., and P. pseudolaevis nov. comb. Distinctive characters for Palatinus include a smooth or slightly granulate, but not areolate, plate surface, a large central pyrenoid penetrated by cytoplasmic channels and radiating into chloroplast lobes, and the presence of a peduncle-homologous microtubular strand. Palatinus cells exit the theca through the antapical-postcingular area. [source]


    A Bayesian nonlinearity test for threshold moving average models

    JOURNAL OF TIME SERIES ANALYSIS, Issue 5 2010
    Qiang Xia
    We propose a Bayesian test for nonlinearity of threshold moving average (TMA) models. First, we obtain the marginal posterior densities of all parameters, including the threshold and delay, of the TMA model using the Gibbs sampler with the Metropolis-Hastings algorithm. We then adopt reversible-jump Markov chain Monte Carlo methods to calculate the posterior probabilities for MA and TMA models. Posterior evidence in favour of the TMA model indicates threshold nonlinearity. Simulation experiments and a real example show that our method works very well in distinguishing MA and TMA models. [source]


    New molecular data for tardigrade phylogeny, with the erection of Paramacrobiotus gen. nov.

    JOURNAL OF ZOOLOGICAL SYSTEMATICS AND EVOLUTIONARY RESEARCH, Issue 4 2009
    R. Guidetti
    Abstract Up to a few years ago, the phylogenies of tardigrade taxa were investigated using morphological data, but relationships within and between many taxa are still unresolved. Our aim has been to verify those relationships by adding molecular analysis to morphological analysis, using nearly complete 18S ribosomal DNA gene sequences (five new) of 19 species, as well as cytochrome oxidase subunit 1 (COI) mitochondrial DNA gene sequences (15 new) from 20 species, from a total of seven families. The 18S rDNA tree was calculated by minimum evolution, maximum parsimony (MP) and maximum likelihood (ML) analyses. DNA sequences coding for COI were translated to amino acid sequences and a tree was also calculated by neighbour-joining, MP and ML analyses. For both trees (18S rDNA and COI) posterior probabilities were calculated by MrBayes. Prominent findings are as follows: the molecular data on Echiniscidae (Heterotardigrada) are in line with the phylogenetic relationships identifiable by morphological analysis. Among Eutardigrada, orders Apochela and Parachela are confirmed as sister groups. Ramazzottius (Hypsibiidae) appears more closely related to Macrobiotidae than to the genera of Hypsibiidae considered here. Macrobiotidae and Macrobiotus appear not to be monophyletic, which confirms morphological data on the presence of at least two large groups within Macrobiotus. Using 18S rDNA and COI mtDNA genes, a new phylogenetic line has been identified within Macrobiotus, corresponding to the 'richtersi-areolatus group'. Moreover, cryptic species have been identified within the Macrobiotus 'richtersi group' and within Richtersius. Some evolutionary lines of tardigrades are confirmed, but others suggest taxonomic revision. In particular, the new genus Paramacrobiotus gen. n. has been identified, corresponding to the phylogenetic line represented by the 'richtersi-areolatus group'.
Summary The number of species in the phylum Tardigrada has risen over the last 25 years from 500 to nearly 1000. The group currently comprises two classes (Heterotardigrada and Eutardigrada), four orders, 21 families, and 104 genera. Despite the abundance of tardigrades, they have received little attention since their discovery in 1773. Until a few years ago, only morphological characters were used to investigate tardigrade phylogeny. Nevertheless, the relationships between and within many species are still not clearly resolved. The aim of the present work was to support the already known morphological relationships with molecular results. For this purpose, only complete sequences of the ribosomal 18S rDNA from 19 species were used, five of them newly added. In addition, new mitochondrial COI sequences from 15 species were used, which together with five known COI sequences belong to a total of seven families. The 18S rDNA tree was calculated by ME, maximum parsimony (MP) and ML analyses. The sequences coding for COI were translated into amino acids and the tree was calculated by NJ, MP and ML analyses. For both trees (18S rDNA and COI) the probabilities were determined by MrBayes. The molecular data agree with the morphological studies for Echiniscidae (Heterotardigrada). Within Eutardigrada, the orders Apochela and Parachela were confirmed as sister groups. Ramazzottius (Hypsibiidae) belongs with the family Macrobiotidae rather than with Hypsibiidae, in which the genus is currently placed. The molecular and morphological data indicate that there are at least two large groups within Macrobiotus. Using the 18S rDNA and COI mtDNA sequences, a new phylogenetic line within Macrobiotus, belonging to the 'richtersi-areolatus group', was identified. Furthermore, cryptic species were found within the Macrobiotus 'richtersi group' and within Richtersius. The present work verifies the tardigrade phylogeny worked out in previous studies. Some evolutionary lines within the tardigrades were confirmed, while others point to future taxonomic revisions. Thus the new genus Paramacrobiotus was introduced, corresponding to the phylogenetic line previously represented by the 'richtersi-areolatus group'. [source]


    Join tree propagation with prioritized messages

    NETWORKS: AN INTERNATIONAL JOURNAL, Issue 4 2010
    C. J. Butz
    Abstract Current join tree propagation algorithms treat all propagated messages as being of equal importance. On the contrary, it is often the case in real-world Bayesian networks that only some of the messages propagated from one join tree node to another are relevant to subsequent message construction at the receiving node. In this article, we propose the first join tree propagation algorithm that identifies and constructs the relevant messages first. Our approach assigns lower priority to the irrelevant messages as they only need to be constructed so that posterior probabilities can be computed when propagation terminates. Experimental results, involving the processing of evidence in four real-world Bayesian networks, empirically demonstrate an improvement over the state-of-the-art method for exact inference in discrete Bayesian networks. © 2009 Wiley Periodicals, Inc. NETWORKS, 2010 [source]


    Wild grapevine: silvestris, hybrids or cultivars that escaped from vineyards? Molecular evidence in Sardinia

    PLANT BIOLOGY, Issue 3 2010

    Abstract Vitis vinifera ssp. silvestris, the spontaneous subspecies of V. vinifera L., is believed to be the ancestor of present grapevine cultivars. In this work, polymorphism at 13 SSR loci was investigated to answer the following key question: are wild plants (i) true silvestris, (ii) hybrids between wild and cultivated plants or (iii) 'escapes' from vineyards? In particular, the objective of the present study was to identify truly wild individuals and to search for possible hybridization events. The study was performed in Sardinia, the second largest island in the Mediterranean Sea, which is characterized by a large and well-described number of both grape cultivars and wild populations. This region was ideal for the study because of its spatial isolation and, consequently, limited contamination from outside material. The results of this study show that domesticated and wild grapevine germplasms are genetically divergent and thus are real silvestris. Pure lineages (both domesticated and wild) show very high average posterior probabilities of assignment to their own clusters, with a low level of introgression. [source]


    Semiparametric Bayes Multiple Testing: Applications to Tumor Data

    BIOMETRICS, Issue 2 2010
    Lianming Wang
    Summary In National Toxicology Program (NTP) studies, investigators want to assess whether a test agent is carcinogenic overall and specific to certain tumor types, while estimating the dose-response profiles. Because there are potentially correlations among the tumors, a joint inference is preferred to separate univariate analyses for each tumor type. In this regard, we propose a random effect logistic model with a matrix of coefficients representing log-odds ratios for the adjacent dose groups for tumors at different sites. We propose appropriate nonparametric priors for these coefficients to characterize the correlations and to allow borrowing of information across different dose groups and tumor types. Global and local hypotheses can be easily evaluated by summarizing the output of a single Monte Carlo Markov chain (MCMC). Two multiple testing procedures are applied for testing local hypotheses based on the posterior probabilities of local alternatives. Simulation studies are conducted and an NTP tumor data set is analyzed illustrating the proposed approach. [source]


    Adaptive Randomization for Multiarm Comparative Clinical Trials Based on Joint Efficacy/Toxicity Outcomes

    BIOMETRICS, Issue 3 2009
    Yuan Ji
    Summary We present an outcome-adaptive randomization (AR) scheme for comparative clinical trials in which the primary endpoint is a joint efficacy/toxicity outcome. Under the proposed scheme, the randomization probabilities are unbalanced adaptively in favor of treatments with superior joint outcomes characterized by higher efficacy and lower toxicity. This type of scheme is advantageous from the patients' perspective because on average, more patients are randomized to superior treatments. We extend the approximate Bayesian time-to-event model in Cheung and Thall (2002, Biometrics 58, 89-97) to model the joint efficacy/toxicity outcomes and perform posterior computation based on a latent variable approach. Consequently, this allows us to incorporate essential information about patients with incomplete follow-up. Based on the computed posterior probabilities, we propose an AR scheme that favors the treatments with larger joint probabilities of efficacy and no toxicity. We illustrate our methodology with a leukemia trial that compares three treatments in terms of their 52-week molecular remission rates and 52-week toxicity rates. [source]


    A General Class of Pattern Mixture Models for Nonignorable Dropout with Many Possible Dropout Times

    BIOMETRICS, Issue 2 2008
    Jason Roy
    Summary In this article we consider the problem of fitting pattern mixture models to longitudinal data when there are many unique dropout times. We propose a marginally specified latent class pattern mixture model. The marginal mean is assumed to follow a generalized linear model, whereas the mean conditional on the latent class and random effects is specified separately. Because the dimension of the parameter vector of interest (the marginal regression coefficients) does not depend on the assumed number of latent classes, we propose to treat the number of latent classes as a random variable. We specify a prior distribution for the number of classes, and calculate (approximate) posterior model probabilities. In order to avoid the complications with implementing a fully Bayesian model, we propose a simple approximation to these posterior probabilities. The ideas are illustrated using data from a longitudinal study of depression in HIV-infected women. [source]


    The DNA Database Search Controversy Revisited: Bridging the Bayesian-Frequentist Gap

    BIOMETRICS, Issue 3 2007
    Geir Storvik
    Summary Two different quantities have been suggested for quantification of evidence in cases where a suspect is found by a search through a database of DNA profiles. The likelihood ratio, typically motivated from a Bayesian setting, is preferred by most experts in the field. The so-called np rule has been motivated through frequentist arguments and has been suggested by the American National Research Council and Stockmarr (1999, Biometrics 55, 671-677). The two quantities differ substantially and have given rise to the DNA database search controversy. Although several authors have criticized the different approaches, a full explanation of why these differences appear is still lacking. In this article we show that a P-value in a frequentist hypothesis setting is approximately equal to the result of the np rule. We argue, however, that a more reasonable procedure in this case is to use conditional testing, in which case a P-value directly related to posterior probabilities and the likelihood ratio is obtained. This way of viewing the problem bridges the gap between the Bayesian and frequentist approaches. At the same time it indicates that the np rule should not be used to quantify evidence. [source]


    A genetic model for determining MSH2 and MLH1 carrier probabilities based on family history and tumor microsatellite instability

    CLINICAL GENETICS, Issue 3 2006
    F Marroni
    Mutation-predicting models can be useful when deciding on the genetic testing of individuals at risk and in determining the cost effectiveness of screening strategies at the population level. The aim of this study was to evaluate the performance of a newly developed genetic model that incorporates tumor microsatellite instability (MSI) information, called the AIFEG model, in predicting the presence of mutations in MSH2 and MLH1 in probands with suspected hereditary non-polyposis colorectal cancer. The AIFEG model is based on published estimates of mutation frequencies and cancer penetrances in carriers and non-carriers and employs the program MLINK of the FASTLINK package to calculate the proband's carrier probability. Model performance is evaluated in a series of 219 families screened for mutations in both MSH2 and MLH1, in which 68 disease-causing mutations were identified. Predictions are first obtained using family history only and then converted into posterior probabilities using information on MSI. This improves predictions substantially. Using a probability threshold of 10% for mutation analysis, the AIFEG model applied to our series has 100% sensitivity and 71% specificity. [source]
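
    Converting a family-history prediction into a posterior probability with a test result can be illustrated with Bayes' theorem in odds form: posterior odds equal prior odds times the likelihood ratio of the observed result. The sketch below is a generic illustration and is not the AIFEG model or the MLINK computation. The function name and the MSI frequencies among carriers and non-carriers are hypothetical.

```python
def update_carrier_probability(prior, p_msi_carrier, p_msi_noncarrier, msi_high):
    """Update a carrier probability with a binary MSI result (odds form of Bayes).

    prior            -- carrier probability from family history alone
    p_msi_carrier    -- assumed P(MSI-high | carrier)
    p_msi_noncarrier -- assumed P(MSI-high | non-carrier)
    msi_high         -- True if the tumor tested MSI-high
    """
    if msi_high:
        likelihood_ratio = p_msi_carrier / p_msi_noncarrier
    else:
        likelihood_ratio = (1.0 - p_msi_carrier) / (1.0 - p_msi_noncarrier)
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical inputs: 20% prior, MSI-high in 90% of carriers and 15% of others.
after_positive = update_carrier_probability(0.20, 0.90, 0.15, msi_high=True)
after_negative = update_carrier_probability(0.20, 0.90, 0.15, msi_high=False)
# The 10% threshold is the one quoted in the abstract.
refer_for_testing = after_positive >= 0.10
```

    With these assumed inputs an MSI-high result raises the carrier probability and an MSI-stable result lowers it, which is the mechanism by which such information can sharpen predictions.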


    Probabilistically Valid Inference of Covariation From a Single x,y Observation When Univariate Characteristics Are Known

    COGNITIVE SCIENCE - A MULTIDISCIPLINARY JOURNAL, Issue 2 2009
    Michael E. Doherty
    Abstract Participants were asked to draw inferences about correlation from single x,y observations. In Experiment 1 statistically sophisticated participants were given the univariate characteristics of distributions of x and y and asked to infer whether a single x, y observation came from a correlated or an uncorrelated population. In Experiment 2, students with a variety of statistical backgrounds assigned posterior probabilities to five possible populations based on single x, y observations, again given knowledge of the univariate statistics. In Experiment 3, statistically naïve participants were given a problem analogous to that given in Experiment 1, framed verbally. Experiment 4 replicated Experiment 3 but added an "impossible to determine" response option. Models that rely on computing sample correlations make no predictions about these investigations. From a Bayesian perspective, participants in all four experiments tended to make probabilistically valid inferences as long as the single datum was directional. The results are discussed in light of the Brunswikian notion of vicarious functioning. [source]


    Examining the statistical properties of fine-scale mapping in large-scale association studies

    GENETIC EPIDEMIOLOGY, Issue 3 2008
    Steven Wiltshire
    Abstract Interpretation of dense single nucleotide polymorphism (SNP) follow-up of genome-wide association or linkage scan signals can be facilitated by establishing expectations for the behaviour of primary mapping signals upon fine-mapping, under both null and alternative hypotheses. We examined the inferences that can be made regarding the posterior probability of a real genetic effect and considered different disease-mapping strategies and prior probabilities of association. We investigated the impact of the extent of linkage disequilibrium between the disease SNP and the primary analysis signal and the extent to which the disease gene can be physically localised under these scenarios. We found that large increases in significance (>2 orders of magnitude) appear in the exclusive domain of genuine genetic effects, especially in the follow-up of genome-wide association scans or consensus regions from multiple linkage scans. Fine-mapping significant association signals that reside directly under linkage peaks yields little improvement in an already high posterior probability of a real effect. Following fine-mapping, those signals that increase in significance also demonstrate improved localisation. We found local linkage disequilibrium patterns around the primary analysis signal(s) and tagging efficacy of typed markers to play an important role in determining a suitable interval for fine-mapping. Our findings help inform the interpretation and design of dense SNP-mapping follow-up studies, thus facilitating discrimination between a genuine genetic effect and chance fluctuation (false positive). Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc. [source]


    A score for Bayesian genome screening

    GENETIC EPIDEMIOLOGY, Issue 3 2003
    E. Warwick Daw
    Abstract Bayesian Monte Carlo Markov chain (MCMC) techniques have shown promise in dissecting complex genetic traits. The methods introduced by Heath ([1997], Am. J. Hum. Genet. 61:748–760), and implemented in the program Loki, have been able to localize genes for complex traits in both real and simulated data sets. Loki estimates the posterior probability of quantitative trait loci (QTL) at locations on a chromosome in an iterative MCMC process. Unfortunately, interpretation of the results and assessment of their significance have been difficult. Here, we introduce a score, the log of the posterior placement probability ratio (LOP), for assessing oligogenic QTL detection and localization. The LOP is the log of the posterior probability of linkage to the real chromosome divided by the posterior probability of linkage to an unlinked pseudochromosome, with marker informativeness similar to the marker data on the real chromosome. Since the LOP cannot be calculated exactly, we estimate it in simultaneous MCMC on both real and pseudochromosomes. We investigate empirically the distributional properties of the LOP in the presence and absence of trait genes. The LOP is not subject to trait model misspecification in the way a lod score may be, and we show that the LOP can detect linkage for loci of small effect when the lod score cannot. We show how, in the absence of linkage, an empirical distribution of the LOP may be estimated by simulation and used to provide an assessment of linkage detection significance. Genet Epidemiol 24:181–190, 2003. © 2003 Wiley-Liss, Inc. [source]
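
The LOP definition above (log of the ratio of two posterior linkage probabilities, each estimated from MCMC output) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Loki implementation: the function name is hypothetical, the posterior probabilities are approximated as the fraction of MCMC iterations that place a QTL on each chromosome, and base 10 is assumed for the logarithm because the abstract does not state the base.

```python
import math


def lop_score(real_hits, pseudo_hits, n_iterations):
    """Estimate a LOP-style score from MCMC placement counts.

    real_hits / pseudo_hits: number of iterations in which a QTL was placed
    on the real chromosome / on the unlinked pseudochromosome (assumed inputs).
    """
    if n_iterations <= 0:
        raise ValueError("n_iterations must be positive")
    if real_hits <= 0 or pseudo_hits <= 0:
        raise ValueError("both chromosomes need at least one placement to form a ratio")
    p_real = real_hits / n_iterations      # estimated posterior probability, real chromosome
    p_pseudo = pseudo_hits / n_iterations  # estimated posterior probability, pseudochromosome
    return math.log10(p_real / p_pseudo)


# Hypothetical run: 1000 iterations, 500 placements on the real chromosome, 50 on the pseudo one.
example_lop = lop_score(real_hits=500, pseudo_hits=50, n_iterations=1000)
```

A score near zero means the real chromosome attracts QTL placements no more often than the unlinked pseudochromosome; positive values favour linkage to the real chromosome.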


    Linkage between a new splicing site mutation in the MDR3 alias ABCB4 gene and intrahepatic cholestasis of pregnancy

    HEPATOLOGY, Issue 1 2007
    Gudrun Schneider
    Intrahepatic cholestasis of pregnancy (ICP) is defined as pruritus and elevated bile acid serum concentrations in late pregnancy. Splicing mutations have been described in the multidrug resistance p-glycoprotein 3 (MDR3, ABCB4) gene in up to 20% of ICP women. Pedigrees studied were not large enough for linkage analysis. Ninety-seven family members of a woman with proven ICP were asked about pruritus in earlier pregnancies, birth complications and symptomatic gallstone disease. The familial cholestasis type 1 (FIC1, ATP8B1) gene, bile salt export pump (BSEP, ABCB11) and MDR3 gene were analyzed in 55 relatives. We identified a dominant mode of inheritance with female restricted expression and a new intronic MDR3 mutation c.3486+5G>A resulting in a 54 bp (3465–3518) inframe deletion via cryptic splicing site activation. Linkage analysis of the ICP trait versus this intragenic MDR3 variant yielded a LOD score of 2.48. A Bayesian analysis involving MDR3, BSEP, FIC1 and an unknown locus gave a posterior probability of >0.9966 in favor of MDR3 as causative ICP locus. During the episode of ICP the median γ-glutamyl transpeptidase (γ-GT) activity was 10 U/l (95% CI, 6.9 to 14.7 U/l) in the index woman. Four stillbirths were reported in seven heterozygous women (22 pregnancies) and none in five women (14 pregnancies) without MDR3 mutation. Symptomatic gallstone disease was more prevalent in heterozygous relatives (7/21) than in relatives without the mutation (1/34) (P = 0.00341). Conclusion: This study demonstrates that splicing mutations in the MDR3 gene can cause ICP with normal γ-GT and may be associated with stillbirths and gallstone disease. (HEPATOLOGY 2007;45:150–158.) [source]


    Model uncertainty in cross-country growth regressions

    JOURNAL OF APPLIED ECONOMETRICS, Issue 5 2001
    Carmen Fernández
    We investigate the issue of model uncertainty in cross-country growth regressions using Bayesian Model Averaging (BMA). We find that the posterior probability is spread widely among many models, suggesting the superiority of BMA over choosing any single model. Out-of-sample predictive results support this claim. In contrast to Levine and Renelt (1992), our results broadly support the more 'optimistic' conclusion of Sala-i-Martin (1997b), namely that some variables are important regressors for explaining cross-country growth patterns. However, care should be taken in the methodology employed. The approach proposed here is firmly grounded in statistical theory and immediately leads to posterior and predictive inference. Copyright © 2001 John Wiley & Sons, Ltd. [source]


    PHYLOGENY OF THE DASYCLADALES (CHLOROPHYTA, ULVOPHYCEAE) BASED ON ANALYSES OF RUBISCO LARGE SUBUNIT (rbcL) GENE SEQUENCES

    JOURNAL OF PHYCOLOGY, Issue 4 2003
    Frederick W. Zechman
    The phylogeny of the green algal Order Dasycladales was inferred by maximum parsimony and Bayesian analyses of chloroplast-encoded rbcL sequence data. Bayesian analysis suggested that the tribe Acetabularieae is monophyletic but that some genera within the tribe, such as Acetabularia Lamouroux and Polyphysa Lamouroux, are not. Bayesian analysis placed Halicoryne Harvey as the sister group of the Acetabularieae, a result consistent with limited fossil evidence and monophyly of the family Acetabulariaceae but was not supported by significant posterior probability. Bayesian analysis further suggested that the family Dasycladaceae is a paraphyletic assemblage at the base of the Dasycladales radiation, casting doubt on the current family-level classification. The genus Cymopolia Lamouroux was inferred to be the basal-most dasycladalean genus, which is also consistent with limited fossil evidence. Unweighted parsimony analyses provided similar results but primarily differed by the sister relationship between Halicoryne Lamouroux and Bornetella Munier-Chalmas, thus supporting the monophyly of neither the families Acetabulariaceae nor Dasycladaceae. This result, however, was supported by low bootstrap values. Low transition-to-transversion ratios, potential loss of phylogenetic signal in third codon positions, and the 550 million year old Dasycladalean lineage suggest that dasyclad rbcL sequences may be saturated due to deep time divergences. Such factors may have contributed to inaccurate reconstruction of phylogeny, particularly with respect to potential inconsistency of parsimony analyses. Regardless, strongly negative g1 values were obtained in analyses including all codon positions, indicating the presence of considerable phylogenetic signal in dasyclad rbcL sequence data. 
    Morphological features relevant to the separation of taxa within the Dasycladales and the possible effects of extinction on phylogeny reconstruction are discussed relative to the inferred phylogenies. [source]


    Estimating ancestral distributions of lineages with uncertain sister groups: a statistical approach to Dispersal–Vicariance Analysis and a case using Aesculus L. (Sapindaceae) including fossils

    JOURNAL OF SYSTEMATICS EVOLUTION, Issue 5 2009
    A.J. HARRIS
    Abstract We propose a simple statistical approach for using Dispersal–Vicariance Analysis (DIVA) software to infer biogeographic histories without fully bifurcating trees. In this approach, ancestral ranges are first optimized for a sample of Bayesian trees. The probability P of an ancestral range r at a node is then calculated as P(rY) = Σ F(rY)t × Pt, summed over t = 1 to n, where Y is a node, and F(rY) is the frequency of range r among all the optimal solutions resulting from DIVA optimization at node Y, t is one of n topologies optimized, and Pt is the probability of topology t. Node Y is a hypothesized ancestor shared by a specific crown lineage and the sister of that lineage "x", where x may vary due to phylogenetic uncertainty (polytomies and nodes with posterior probability <100%). Using this method, the ancestral distribution at Y can be estimated to provide inference of the geographic origins of the specific crown group of interest. This approach takes into account phylogenetic uncertainty as well as uncertainty from DIVA optimization. It is an extension of the previously described method called Bayes-DIVA, which pairs Bayesian phylogenetic analysis with biogeographic analysis using DIVA. Further, we show that the probability P of an ancestral range at Y calculated using this method does not equate to pp*F(rY) on the Bayesian consensus tree when both variables are <100%, where pp is the posterior probability and F(rY) is the frequency of range r for the node containing the specific crown group. We tested our DIVA-Bayes approach using Aesculus L., which has major lineages unresolved as a polytomy. We inferred the most probable geographic origins of the five traditional sections of Aesculus and of Aesculus californica Nutt. and examined range subdivisions at parental nodes of these lineages. Additionally, we used the DIVA-Bayes data from Aesculus to quantify the effects on biogeographic inference of including two wildcard fossil taxa in phylogenetic analysis.
    Our analysis resolved the geographic ranges of the parental nodes of the lineages of Aesculus with moderate to high probabilities. The probabilities were greater than those estimated using the simple calculation of pp*F(rY) at a statistically significant level for two of the six lineages. We also found that adding fossil wildcard taxa in phylogenetic analysis generally increased P for ancestral ranges including the fossil's distribution area. The ΔP was more dramatic for ranges that include the area of a wildcard fossil with a distribution area underrepresented among extant taxa. This indicates the importance of including fossils in biogeographic analysis. Examination of range subdivision at the parental nodes revealed potential range evolution (extinction and dispersal events) along the stems of A. californica and sect. Parryana. [source]
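
The probability P of an ancestral range, as defined in this abstract, combines a per-topology DIVA range frequency with the probability of each topology. A minimal Python sketch follows, assuming P is the topology-weighted sum of those frequencies (an interpretation of the definitions of F(rY), t and Pt given above). The function name and all numbers are hypothetical, and no DIVA or phylogenetics software is called.

```python
def ancestral_range_probability(range_freq_by_topology, topology_probs):
    """Sum F(rY) for each topology t, weighted by that topology's probability Pt.

    range_freq_by_topology[t]: frequency of range r among the optimal DIVA
    solutions at node Y on topology t (assumed to be precomputed).
    topology_probs[t]: probability Pt of topology t.
    """
    if len(range_freq_by_topology) != len(topology_probs):
        raise ValueError("need one range frequency per topology")
    for value in list(range_freq_by_topology) + list(topology_probs):
        if not 0.0 <= value <= 1.0:
            raise ValueError("frequencies and probabilities must lie in [0, 1]")
    return sum(f * p for f, p in zip(range_freq_by_topology, topology_probs))


# Hypothetical example with three sampled topologies.
freqs = [1.0, 0.5, 0.0]   # F(rY) on each topology
probs = [0.5, 0.3, 0.2]   # Pt for each topology
p_range = ancestral_range_probability(freqs, probs)
```

In this toy example the range is fully supported on the most probable topology, half supported on the second and absent on the third, so the weighted sum is 0.65.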


    A Bayesian discovery procedure

    JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 5 2009
    Michele Guindani
    Summary. We discuss a Bayesian discovery procedure for multiple-comparison problems. We show that, under a coherent decision theoretic framework, a loss function combining true positive and false positive counts leads to a decision rule that is based on a threshold of the posterior probability of the alternative. Under a semiparametric model for the data, we show that the Bayes rule can be approximated by the optimal discovery procedure, which was recently introduced by Storey. Improving the approximation leads us to a Bayesian discovery procedure, which exploits the multiple shrinkage in clusters that are implied by the assumed non-parametric model. We compare the Bayesian discovery procedure and the optimal discovery procedure estimates in a simple simulation study and in an assessment of differential gene expression based on microarray data from tumour samples. We extend the setting of the optimal discovery procedure by discussing modifications of the loss function that lead to different single-thresholding statistics. Finally, we provide an application of the previous arguments to dependent (spatial) data. [source]
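
The link between a loss on true and false positive counts and a posterior-probability threshold can be illustrated with a small Python sketch. It assumes one particular linear utility, a gain of 1 per true positive and a cost of fp_cost per false positive, which is an illustrative choice and not necessarily the loss used in the paper. Under that assumption, flagging comparison i has expected utility v - fp_cost * (1 - v), where v is its posterior probability of the alternative, and this is positive exactly when v exceeds fp_cost / (1 + fp_cost).

```python
def flag_discoveries(posterior_alt, fp_cost):
    """Flag comparisons whose posterior probability of the alternative exceeds
    the threshold implied by an assumed linear true/false positive utility."""
    if fp_cost < 0:
        raise ValueError("fp_cost must be non-negative")
    # Expected utility of flagging is v - fp_cost * (1 - v) > 0  <=>  v > threshold.
    threshold = fp_cost / (1.0 + fp_cost)
    return [v > threshold for v in posterior_alt]


# Hypothetical posterior probabilities for three comparisons; a false positive
# is assumed to cost four times the gain of a true positive (threshold 0.8).
decisions = flag_discoveries([0.95, 0.60, 0.81], fp_cost=4.0)
```

Raising fp_cost moves the threshold towards 1, so fewer comparisons are flagged.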


    The utility of pretest probability assessment in patients with clinically suspected venous thromboembolism

    JOURNAL OF THROMBOSIS AND HAEMOSTASIS, Issue 9 2003
    J. Kelly
    Summary. The assessment of pretest probability (PTP), with stratification into low-, intermediate- and high-risk groups is an essential initial step in the current diagnostic management of patients with suspected venous thromboembolism (VTE). In combination with additional information, it reduces the need for initial and supplementary imaging, and allows considerable refinement of the posterior probability of VTE following non-invasive imaging. PTP may be assessed either empirically or by using various decision rules or scoring systems, the best known of which are the simplified Wells scores for suspected deep vein thrombosis (DVT) and pulmonary embolism (PE), and the Geneva score for suspected PE. Each of these approaches shows similar directional and categorical accuracy, and has been validated as facilitating clinically useful classification of the PTP, although an overview of data suggests that fewer patients tend to be classified as low PTP when assessed empirically. This group is the most important to identify, as several outcome studies have shown that imaging and treatment are safely obviated in outpatients with suspected DVT or PE who have a low PTP in combination with negative D-dimer testing, a subgroup accounting for up to half of all patients studied. Hence, while probably not of critical importance, the explicit approach offered by scoring systems might be preferred over empirical assessment, particularly when used by more junior staff. [source]
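
How a pretest probability is refined into a posterior (post-test) probability by a test result can be shown with the standard odds form of Bayes' theorem. The Python sketch below is illustrative only: the function name is hypothetical, and the pretest probability and likelihood ratio are made-up numbers, not values from the review or from any specific Wells, Geneva or D-dimer study.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds.

    likelihood_ratio: P(result | disease) / P(result | no disease) for the
    observed test result (an assumed input).
    """
    if not 0.0 < pretest_prob < 1.0:
        raise ValueError("pretest_prob must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical low-PTP patient (10%) with a negative test whose negative
# likelihood ratio is assumed to be 0.1.
low_ptp_after_negative_test = post_test_probability(0.10, 0.1)
```

With these assumed inputs the posterior probability falls to about 1%, which illustrates why a low PTP combined with a negative test can make further imaging unnecessary.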