Bayesian Analysis (bayesian + analysis)
Kinds of Bayesian Analysis

Selected Abstracts

VARIATIONAL BAYESIAN ANALYSIS FOR HIDDEN MARKOV MODELS
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 2 2009
C. A. McGrory

Summary: The variational approach to Bayesian inference enables simultaneous estimation of model parameters and model complexity. An interesting feature of this approach is that it also leads to an automatic choice of model complexity. Empirical results from the analysis of hidden Markov models with Gaussian observation densities illustrate this. If the variational algorithm is initialized with a large number of hidden states, redundant states are eliminated as the method converges to a solution, thereby leading to a selection of the number of hidden states. In addition, through the use of a variational approximation, the deviance information criterion for Bayesian model selection can be extended to the hidden Markov model framework. Calculation of the deviance information criterion provides a further tool for model selection, which can be used in conjunction with the variational approach. [source]

Bayesian comparison of spatially regularised general linear models
HUMAN BRAIN MAPPING, Issue 4 2007
Will Penny

Abstract: In previous work (Penny et al., [2005]: Neuroimage 24:350-362) we have developed a spatially regularised General Linear Model for the analysis of functional magnetic resonance imaging data that allows for the characterisation of regionally specific effects using Posterior Probability Maps (PPMs). In this paper we show how it also provides an approximation to the model evidence. This is important as it is the basis of Bayesian model comparison and provides a unified framework for Bayesian Analysis of Variance, Cluster of Interest analyses and the principled selection of signal and noise models. We also provide extensions that implement spatial and anatomical regularisation of noise process parameters. Hum Brain Mapp 2007. © 2006 Wiley-Liss, Inc.
[source]

Hierarchical Bayesian Analysis of Correlated Zero-inflated Count Data
BIOMETRICAL JOURNAL, Issue 6 2004
Getachew A. Dagne

Abstract: This article presents two-component hierarchical Bayesian models which incorporate both overdispersion and excess zeros. The components may be resultants of some intervention (treatment) that changes the rare event generating process. The models are also expanded to take into account any heterogeneity that may exist in the data. Details of the model fitting, checking and selecting alternative models from a Bayesian perspective are also presented. The proposed methods are applied to count data on the assessment of the efficacy of pesticides in controlling the reproduction of whitefly. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]

Semiparametric Bayesian Analysis of Nutritional Epidemiology Data in the Presence of Measurement Error
BIOMETRICS, Issue 2 2010
Samiran Sinha

Summary: We propose a semiparametric Bayesian method for handling measurement error in nutritional epidemiological data. Our goal is to estimate nonparametrically the form of association between a disease and exposure variable while the true values of the exposure are never observed. Motivated by nutritional epidemiological data, we consider the setting where a surrogate covariate is recorded in the primary data, and a calibration data set contains information on the surrogate variable and repeated measurements of an unbiased instrumental variable of the true exposure. We develop a flexible Bayesian method where not only is the relationship between the disease and exposure variable treated semiparametrically, but also the relationship between the surrogate and the true exposure is modeled semiparametrically. The two nonparametric functions are modeled simultaneously via B-splines.
In addition, we model the distribution of the exposure variable as a Dirichlet process mixture of normal distributions, thus making its modeling essentially nonparametric and placing this work into the context of functional measurement error modeling. We apply our method to the NIH-AARP Diet and Health Study and examine its performance in a simulation study. [source]

Bayesian Analysis for Generalized Linear Models with Nonignorably Missing Covariates
BIOMETRICS, Issue 3 2005
Lan Huang

Summary: We propose Bayesian methods for estimating parameters in generalized linear models (GLMs) with nonignorably missing covariate data. We show that when improper uniform priors are used for the regression coefficients of the multinomial selection model for the missing data mechanism, the resulting joint posterior will always be improper if (i) all missing covariates are discrete and an intercept is included in the selection model for the missing data mechanism, or (ii) at least one of the covariates is continuous and unbounded. This impropriety will result regardless of whether proper or improper priors are specified for the regression parameters of the GLM or the parameters of the covariate distribution. To overcome this problem, we propose a novel class of proper priors for the regression coefficients in the selection model for the missing data mechanism. These priors are robust and computationally attractive in the sense that inferences are not sensitive to the choice of the hyperparameters of the prior, and they facilitate a Gibbs sampling scheme that leads to accelerated convergence. In addition, we extend the model assessment criterion of Chen, Dey, and Ibrahim (2004a, Biometrika 91, 45-63), called the weighted L measure, to GLMs and missing data problems as well as extend the deviance information criterion (DIC) of Spiegelhalter et al.
(2002, Journal of the Royal Statistical Society B 64, 583-639) for assessing whether the missing data mechanism is ignorable or nonignorable. A novel Markov chain Monte Carlo sampling algorithm is also developed for carrying out posterior computation. Several simulations are given to investigate the performance of the proposed Bayesian criteria as well as the sensitivity of the prior specification. Real datasets from a melanoma cancer clinical trial and a liver cancer study are presented to further illustrate the proposed methods. [source]

Bayesian Analysis of Serial Dilution Assays
BIOMETRICS, Issue 2 2004
Andrew Gelman

Summary: In a serial dilution assay, the concentration of a compound is estimated by combining measurements of several different dilutions of an unknown sample. The relation between concentration and measurement is nonlinear and heteroscedastic, and so it is not appropriate to weight these measurements equally. In the standard existing approach for analysis of these data, a large proportion of the measurements are discarded as being above or below detection limits. We present a Bayesian method for jointly estimating the calibration curve and the unknown concentrations using all the data. Compared to the existing method, our estimates have much lower standard errors and give estimates even when all the measurements are outside the "detection limits." We evaluate our method empirically using laboratory data on cockroach allergens measured in house dust samples. Our estimates are much more accurate than those obtained using the usual approach. In addition, we develop a method for determining the "effective weight" attached to each measurement, based on a local linearization of the estimated model. The effective weight can give insight into the information conveyed by each data point and suggests potential improvements in design of serial dilution experiments.
[source]

RAPID SPECIATION FOLLOWING RECENT HOST SHIFTS IN THE PLANT PATHOGENIC FUNGUS RHYNCHOSPORIUM
EVOLUTION, Issue 6 2008
Pascal L. Zaffarano

Agriculture played a significant role in increasing the number of pathogen species and in expanding their geographic range during the last 10,000 years. We tested the hypothesis that a fungal pathogen of cereals and grasses emerged at the time of domestication of cereals in the Fertile Crescent and subsequently speciated after adaptation to its hosts. Rhynchosporium secalis, originally described from rye, causes an important disease on barley called scald, although it also infects other species of Hordeum and Agropyron. Phylogenetic analyses based on four DNA sequence loci identified three host-associated lineages that were confirmed by cross-pathogenicity tests. Bayesian analyses of divergence time suggested that the three lineages emerged between ~1200 and 3600 years before present (B.P.) with a 95% highest posterior density ranging from 100 to 12,000 years B.P. depending on the implemented clock models. The coalescent inference of demographic history revealed a very recent population expansion for all three pathogens. We propose that Rhynchosporium on barley, rye, and Agropyron host species represent three cryptic pathogen species that underwent independent evolution and ecological divergence by host-specialization. We postulate that the recent emergence of these pathogens followed host shifts. The subsequent population expansions followed the expansion of the cultivated host populations and accompanying expansion of the weedy Agropyron spp. found in fields of cultivated cereals. Hence, agriculture played a major role in the emergence of the scald diseases, the adaptation of the pathogens to new hosts and their worldwide dissemination.
[source]

Comparison of repeatability and multiple trait threshold models for litter size in sheep using observed and simulated data in Bayesian analyses
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 4 2010
W. Mekkawy

Summary: Bayesian analyses were used to estimate genetic parameters on 5580 records of litter size in the first four parities from 1758 Mule ewes. To examine the appropriateness of fitting repeatability (RM) or multiple trait threshold models (MTM) to litter size of different parities, both models were used to estimate genetic parameters on the observed data and were thereafter compared in a simulation study. Posterior means of the heritabilities of litter size in different parities using a MTM ranged from 0.12 to 0.18 and were higher than the heritability based on the RM (0.08). Posterior means of the genetic correlations between litter sizes of different parities were positive and ranged from 0.24 to 0.71. Data sets were simulated based on the same pedigree structure and genetic parameters of the Mule ewe population obtained from both models. The simulation showed that the relative loss in accuracy and increase in mean squared error (MSE) was substantially higher when using the RM, given that the parameters estimated from the observed data using the opposite model are the true parameters. In contrast, the Bayesian information criterion (BIC) selected the RM as the most appropriate model given the data because of the substantial penalty for the higher number of parameters to be estimated in the MTM. In conclusion, when the relative change in accuracy and MSE is of main interest for estimation of breeding values of litter size of different parities, the MTM is recommended for the given population. When reduction in risk of using the wrong model is the main aim, the BIC suggests that the RM is the most appropriate model.
[source]

Phylogeography of the northern hogsucker, Hypentelium nigricans (Teleostei: Cypriniformes): genetic evidence for the existence of the ancient Teays River
JOURNAL OF BIOGEOGRAPHY, Issue 8 2003
Peter B. Berendzen

Abstract

Aim: To assess the roles of dispersal and vicariance in shaping the present distribution and diversity within Hypentelium nigricans, the northern hogsucker (Teleostei: Cypriniformes).

Location: Eastern United States.

Methods: Parsimony analyses, Bayesian analyses, pairwise genetic divergence and mismatch plots are used to examine patterns of genetic variation across H. nigricans.

Results: Species relationships within the genus Hypentelium were consistent with previous hypotheses. However, relationships between haplotypes within H. nigricans revealed two deeply divergent groups, a clade containing haplotypes from the New and Roanoke rivers (Atlantic Slope) plus Interior Highlands and upper Mississippi River and a clade containing haplotypes from the Eastern Highlands, previously glaciated regions of the Ohio and Wabash rivers, and the Amite and Homochitto rivers of south-western Mississippi.

Main conclusions: The phylogenetic history of Hypentelium was shaped by old vicariant events associated with erosion of the Blue Ridge and separation of the Mobile and Mississippi river basins. Within H. nigricans two clades existed prior to the Pleistocene; a widespread clade in the pre-glacial Teays-Mississippi River system and a clade in Cumberland and Tennessee rivers. Pleistocene events fragmented the Teays-Mississippi fauna. Following the retreat of the glaciers H. nigricans dispersed northward into previously glaciated regions. These patterns are replicated in other clades of fishes and are consistent with some of the predictions of Mayden's (Systematic Zoology, 37, 329, 1988) pre-Pleistocene vicariance hypothesis.
[source]

Fluctuating asymmetry as a bio-indicator in isolated populations of the Taita thrush: a Bayesian perspective
JOURNAL OF BIOGEOGRAPHY, Issue 5-6 2002
Luc Lens

Aim: We examined whether developmental instability can be used as a bio-monitoring tool in the endangered Taita thrush (Turdus helleri L.) through the measurement of individual levels of fluctuating asymmetry in tarsus length. Because estimates of the association between developmental instability, stress and fitness derived from traditional regression are biased, we compared parameter estimates obtained from likelihood based analysis with those obtained from a Bayesian latent variable model.

Location: Taita thrushes were captured and measured in three isolated cloud forest fragments located in the Taita Hills of south-east Kenya.

Methods: We applied mixed-effects regression with Restricted Maximum Likelihood parameter estimation (performed with SAS version 8.0) and Bayesian latent variable modelling (performed with WINBUGS version 1.3 and CODA version 0.4) to estimate unbiased levels of developmental instability and to model relationships between developmental instability and body condition in 312 Taita thrushes.

Results: Likelihood and Bayesian analyses yielded highly comparable results. Individual levels of developmental instability were strongly inversely related to body condition in the subpopulation with the lowest average condition. In contrast, both variables were unrelated in two other subpopulations with higher average condition. Such heterogeneity in association was in the direction expected by developmental theory, given that higher condition suggests more benign ambient conditions. The estimated levels of body condition in the three subpopulations did not support their presumed ranking in relation to environmental stress. Developmental instability and body condition are therefore believed to reflect different aspects of individual fitness.
Main conclusions: Variation in developmental homeostasis, modelled either as an observable variable (fluctuating asymmetry) or as a latent variable (developmental instability), appears to be a useful indicator of stress effects in the Taita thrush. Because relationships between environmental stress and developmental instability may depend on the extent to which stress-mediated changes in other components of phenotypic variation are correlated, the study of trait asymmetry should preferably be combined with that of other measures of trait variability, such as trait size or organismal condition. [source]

PHYLOGENY OF THE DASYCLADALES (CHLOROPHYTA, ULVOPHYCEAE) BASED ON ANALYSES OF RUBISCO LARGE SUBUNIT (rbcL) GENE SEQUENCES
JOURNAL OF PHYCOLOGY, Issue 4 2003
Frederick W. Zechman

The phylogeny of the green algal Order Dasycladales was inferred by maximum parsimony and Bayesian analyses of chloroplast-encoded rbcL sequence data. Bayesian analysis suggested that the tribe Acetabularieae is monophyletic but that some genera within the tribe, such as Acetabularia Lamouroux and Polyphysa Lamouroux, are not. Bayesian analysis placed Halicoryne Harvey as the sister group of the Acetabularieae, a result that is consistent with limited fossil evidence and monophyly of the family Acetabulariaceae but was not supported by significant posterior probability. Bayesian analysis further suggested that the family Dasycladaceae is a paraphyletic assemblage at the base of the Dasycladales radiation, casting doubt on the current family-level classification. The genus Cymopolia Lamouroux was inferred to be the basal-most dasycladalean genus, which is also consistent with limited fossil evidence. Unweighted parsimony analyses provided similar results but primarily differed by the sister relationship between Halicoryne Lamouroux and Bornetella Munier-Chalmas, thus supporting the monophyly of neither the family Acetabulariaceae nor the family Dasycladaceae. This result, however, was supported by low bootstrap values.
Low transition-to-transversion ratios, potential loss of phylogenetic signal in third codon positions, and the 550-million-year-old Dasycladalean lineage suggest that dasyclad rbcL sequences may be saturated due to deep time divergences. Such factors may have contributed to inaccurate reconstruction of phylogeny, particularly with respect to potential inconsistency of parsimony analyses. Regardless, strongly negative g1 values were obtained in analyses including all codon positions, indicating the presence of considerable phylogenetic signal in dasyclad rbcL sequence data. Morphological features relevant to the separation of taxa within the Dasycladales and the possible effects of extinction on phylogeny reconstruction are discussed relative to the inferred phylogenies. [source]

Systematic positions of Lamiophlomis and Paraphlomis (Lamiaceae) based on nuclear and chloroplast sequences
JOURNAL OF SYSTEMATICS AND EVOLUTION, Issue 6 2009
Yue-Zhi PAN

Abstract: Genera Lamiophlomis and Paraphlomis were originally separated from genus Phlomis s.l. on the basis of particular morphological characteristics. However, their relationship was highly contentious, as evidenced by the literature. In the present paper, the systematic positions of Lamiophlomis, Paraphlomis, and their related genera were assessed based on nuclear internal transcribed spacer (ITS) and chloroplast rpl16 and trnL-F sequence data using maximum parsimony (MP) and Bayesian methods. In total, 24 species representing six genera of the ingroup and outgroup were sampled. Analyses of both separate and combined sequence data were conducted to resolve the systematic relationships of these genera. The results reveal that Lamiophlomis is nested within Phlomis sect. Phlomoides and its generic status is not supported. With the inclusion of Lamiophlomis rotata in sect. Phlomoides, sections Phlomis and Phlomoides of Phlomis were resolved as monophyletic. Paraphlomis was supported as an independent genus.
However, the resolution of its monophyly conflicted between MP and Bayesian analyses, suggesting the need for expanded sampling and further evidence. [source]

A phylogeny of anisopterous dragonflies (Insecta, Odonata) using mtRNA genes and mixed nucleotide/doublet models
JOURNAL OF ZOOLOGICAL SYSTEMATICS AND EVOLUTIONARY RESEARCH, Issue 4 2008
G. Fleck

Abstract: The application of mixed nucleotide/doublet substitution models has recently received attention in RNA-based phylogenetics. Within a Bayesian approach, it was shown that mixed models outperformed analyses relying on simple nucleotide models. We analysed an mt RNA data set of dragonflies representing all major lineages of Anisoptera plus outgroups, using a mixed model in a Bayesian and parsimony (MP) approach. We used a published mt 16S rRNA secondary consensus structure model and inferred consensus models for the mt 12S rRNA and tRNA valine. Secondary structure information was used to set data partitions for paired and unpaired sites on which doublet or nucleotide models were applied, respectively. Several different doublet models are currently available of which we chose the most appropriate one by a Bayes factor test. The MP reconstructions relied on recoded data for paired sites in order to account for character covariance and an application of the ratchet strategy to find most parsimonious trees. Bayesian and parsimony reconstructions are partly differently resolved, indicating sensitivity of the reconstructions to model specification. Our analyses depict a tree in which the damselfly family Lestidae is sister group to a monophyletic clade Epiophlebia + Anisoptera, contradicting recent morphological and molecular work. In Bayesian analyses, we found a deep split between Libelluloidea and a clade 'Aeshnoidea' within Anisoptera largely congruent with Tillyard's early ideas of anisopteran evolution, which had been based on evidently plesiomorphic character states.
However, parsimony analysis did not support a clade 'Aeshnoidea', but instead, placed Gomphidae as sister taxon to Libelluloidea. Monophyly of Libelluloidea is only modestly supported, and many inter-family relationships within Libelluloidea do not receive substantial support in Bayesian and parsimony analyses. We checked whether high Bayesian node support was inflated owing to either: (i) wrong secondary consensus structures; (ii) under-sampling of the MCMC process, thereby missing other local maxima; or (iii) unrealistic prior assumptions on topologies or branch lengths. We found that different consensus structure models exert strong influence on the reconstruction, which demonstrates the importance of taxon-specific realistic secondary structure models in RNA phylogenetics. [source]

Influence of habitat discontinuity, geographical distance, and oceanography on fine-scale population genetic structure of copper rockfish (Sebastes caurinus)
MOLECULAR ECOLOGY, Issue 13 2008
M. L. JOHANSSON

Abstract: The copper rockfish is a benthic, nonmigratory, temperate rocky reef marine species with pelagic larvae and juveniles. A previous range-wide study of the population-genetic structure of copper rockfish revealed a pattern consistent with isolation-by-distance. This could arise from an intrinsically limited dispersal capability in the species or from regularly spaced extrinsic barriers that restrict gene flow (offshore jets that advect larvae offshore and/or habitat patchiness). Tissue samples were collected along the West Coast of the contiguous USA between Neah Bay, WA and San Diego, CA, with dense sampling along Oregon. At the whole-coast scale (~2200 km), significant population subdivision (FST = 0.0042), and a significant correlation between genetic and geographical distance were observed based on 11 microsatellite DNA loci. Population divergence was also significant among Oregon collections (~450 km, FST = 0.001).
Hierarchical AMOVA identified a weak but significant 130-km habitat break as a possible barrier to gene flow within Oregon, across which we estimated that dispersal (Nem) is half that of the coast-wide average. However, individual-based Bayesian analyses failed to identify more than a single population along the Oregon coast. In addition, no correlation between pairwise population genetic and geographical distances was detected at this scale. The offshore jet at Cape Blanco was not a significant barrier to gene flow in this species. These findings are consistent with low larval dispersal distances calculated in previous studies on this species, support a mesoscale dispersal model, and highlight the importance of continuity of habitat and adult population size in maintaining gene flow. [source]

Bayesian analyses of admixture in wild and domestic cats (Felis silvestris) using linked microsatellite loci
MOLECULAR ECOLOGY, Issue 1 2006
R. LECIS

Abstract: Methods recently developed to infer population structure and admixture mostly use individual genotypes described by unlinked neutral markers. However, Hardy-Weinberg and linkage disequilibria among independent markers decline rapidly with admixture time, and the admixture signals could be lost in a few generations. In this study, we aimed to describe genetic admixture in 182 European wild and domestic cats (Felis silvestris), which hybridize sporadically in Italy and extensively in Hungary. Cats were genotyped at 27 microsatellites, including 21 linked loci mapping on five distinct feline linkage groups. Genotypes were analysed with STRUCTURE 2.1, a Bayesian procedure designed to model admixture linkage disequilibrium, which promises to assess efficiently older admixture events using tightly linked markers. Results showed that domestic and wild cats sampled in Italy were split into two distinct clusters with average proportions of membership Q > 0.90, congruent with prior morphological identifications.
In contrast, free-living cats sampled in Hungary were assigned partly to the domestic and the wild cat clusters, with Q < 0.50. Admixture analyses of individual genotypes identified, respectively, 5/61 (8%), and 16-20/65 (25-31%) hybrids among the Italian wildcats and Hungarian free-living cats. Similar results were obtained in the past using unlinked loci, although the new linked markers identified additional admixed wildcats in Italy. Linkage analyses confirm that hybridization is limited in Italian, but widespread in Hungarian wildcats, a population that is threatened by cross-breeding with free-ranging domestic cats. The total panel of 27 loci performed better than the linked loci alone in the identification of domestic and known hybrid cats, suggesting that a large number of linked plus unlinked markers can improve the results of admixture analyses. Inferred recombination events made it possible to identify the population of origin of chromosomal segments, suggesting that admixture mapping experiments can be designed also in wild populations. [source]

Reevaluation of the Phylogenetic Relationship between Mobilid and Sessilid Peritrichs (Ciliophora, Oligohymenophorea) Based on Small Subunit rRNA Genes Sequences
THE JOURNAL OF EUKARYOTIC MICROBIOLOGY, Issue 5 2006
YING-CHUN GONG

ABSTRACT: Based on morphological characters, peritrich ciliates (Class Oligohymenophorea, Subclass Peritrichia) have been subdivided into the Orders Sessilida and Mobilida. Molecular phylogenetic studies on peritrichs have been restricted to members of the Order Sessilida. In order to shed more light on the evolutionary relationships within peritrichs, the complete small subunit rRNA (SSU rRNA) sequences of four mobilid species, Trichodina nobilis, Trichodina heterodentata, Trichodina reticulata, and Trichodinella myakkae were used to construct phylogenetic trees using maximum parsimony, neighbor joining, and Bayesian analyses.
Whichever phylogenetic method was used, the peritrichs did not constitute a monophyletic group: mobilid and sessilid species did not cluster together. Similarity in morphology but difference in molecular data led us to suggest that the oral structures of peritrichs are the result of evolutionary convergence. In addition, Trichodina reticulata, a Trichodina species with granules in the center of the adhesive disc, branched separately from its congeners, Trichodina nobilis and Trichodina heterodentata, trichodinids without such granules. This indicates that granules in the adhesive disc might be a phylogenetic character of high importance within the Family Trichodinidae. [source]

The phylogenetic position of toadfishes (order Batrachoidiformes) in the higher ray-finned fish as inferred from partitioned Bayesian analysis of 102 whole mitochondrial genome sequences
BIOLOGICAL JOURNAL OF THE LINNEAN SOCIETY, Issue 3 2005
MASAKI MIYA

In a previous study based on 100 whole mitochondrial genome (mitogenome) sequences, we sought to provide a new perspective on the ordinal relationships of higher ray-finned fish (Actinopterygii). The study left unexplored the phylogenetic position of toadfishes (order Batrachoidiformes), as data were unavailable owing to technical difficulties. In the present study, we successfully determined mitogenomic sequences for two toadfish species (Batrachomoeus trispinosus and Porichthys myriaster) and found that the difficulties resulted from unusual gene arrangements and associated repetitive non-coding sequences. Unambiguously aligned, concatenated mitogenomic sequences (13 461 bp) from 102 higher actinopterygians (excluding the ND6 gene and control region) were divided into five partitions (1st, 2nd and 3rd codon positions of the protein-coding genes, tRNA genes and rRNA genes) and partitioned Bayesian analyses were conducted.
The resultant phylogenies strongly suggest that the toadfishes are not members of relatively primitive higher actinopterygians (Paracanthopterygii), but belong to a crown group of actinopterygians (Percomorpha), as was demonstrated for ophidiiform eels (Ophidiiformes) and anglerfishes (Lophiiformes) in the previous study. We propose revised limits of major unranked categories for higher actinopterygians and a new name (Berycomorpha) for a clade comprising two reciprocally paraphyletic orders (Beryciformes and Stephanoberyciformes) based on the present mitogenomic phylogenies. © 2005 The Linnean Society of London, Biological Journal of the Linnean Society, 2005, 85, 289-306. [source]

Bayesian Hierarchical Functional Data Analysis Via Contaminated Informative Priors
BIOMETRICS, Issue 3 2009
Bruno Scarpa

Summary: A variety of flexible approaches have been proposed for functional data analysis, allowing both the mean curve and the distribution about the mean to be unknown. Such methods are most useful when there is limited prior information. Motivated by applications to modeling of temperature curves in the menstrual cycle, this article proposes a flexible approach for incorporating prior information in semiparametric Bayesian analyses of hierarchical functional data. The proposed approach is based on specifying the distribution of functions as a mixture of a parametric hierarchical model and a nonparametric contamination. The parametric component is chosen based on prior knowledge, while the contamination is characterized as a functional Dirichlet process. In the motivating application, the contamination component allows unanticipated curve shapes in unhealthy menstrual cycles. Methods are developed for posterior computation, and the approach is applied to data from a European fecundability study. [source]

Bayesian Multivariate Logistic Regression
BIOMETRICS, Issue 3 2004
Sean M. O'Brien

Summary: Bayesian analyses of multivariate binary or categorical outcomes typically rely on probit or mixed effects logistic regression models that do not have a marginal logistic structure for the individual outcomes. In addition, difficulties arise when simple noninformative priors are chosen for the covariance parameters. Motivated by these problems, we propose a new type of multivariate logistic distribution that can be used to construct a likelihood for multivariate logistic regression analysis of binary and categorical data. The model for individual outcomes has a marginal logistic structure, simplifying interpretation. We follow a Bayesian approach to estimation and inference, developing an efficient data augmentation algorithm for posterior computation. The method is illustrated with application to a neurotoxicology study. [source]

Model Selection for Integrated Recovery/Recapture Data
BIOMETRICS, Issue 4 2002
R. King

Summary: Catchpole et al. (1998, Biometrics 54, 33-46) provide a novel scheme for integrating both recovery and recapture data analyses and derive sufficient statistics that facilitate likelihood computations. In this article, we demonstrate how their efficient likelihood expression can facilitate Bayesian analyses of these kinds of data and extend their methodology to provide a formal framework for model determination. We consider in detail the issue of model selection with respect to a set of recapture/recovery histories of shags (Phalacrocorax aristotelis) and determine, from the enormous range of biologically plausible models available, which best describe the data. By using reversible jump Markov chain Monte Carlo methodology, we demonstrate how this enormous model space can be efficiently and effectively explored without having to resort to performing an infeasibly large number of pairwise comparisons or some ad hoc stepwise procedure. We find that the model used by Catchpole et al.
(1998) has essentially zero posterior probability and that, of the 477,144 possible models considered, over 60% of the posterior mass is placed on three neighboring models with biologically interesting interpretations.

Effects of data incompleteness on the relative performance of parsimony and Bayesian approaches in a supermatrix phylogenetic reconstruction of Mustelidae and Procyonidae (Carnivora)
CLADISTICS, Issue 2 2010
Mieczyslaw Wolsan
Missing data are commonly thought to impede a resolved or accurate reconstruction of phylogenetic relationships, and probabilistic analysis techniques are increasingly viewed as less vulnerable to the negative effects of data incompleteness than parsimony analyses. We test both assumptions empirically by conducting parsimony and Bayesian analyses on an approximately 1.5 × 10^6-cell (27 965 characters × 52 species) mustelid–procyonid molecular supermatrix with 62.7% missing entries. Contrary to the first assumption, phylogenetic relationships inferred from our analyses are fully (Bayesian) or almost fully (parsimony) resolved topologically with mostly strong support and also largely in accord with prior molecular estimations of mustelid and procyonid phylogeny derived with parsimony, Bayesian, and other probabilistic analysis techniques from smaller but complete or nearly complete data sets. Contrary to the second assumption, we found no compelling evidence in support of a relationship between the inferior performance of parsimony and taxon incompleteness (i.e. the proportion of missing character data for a taxon), although we found evidence for a connection between the inferior performance of parsimony and character incompleteness (i.e. no overlap in character data between some taxa).
The relatively good performance of our analyses may be related to the large number of sampled characters, so that most taxa (even highly incomplete ones) are represented by a sufficient number of characters allowing both approaches to resolve their relationships. © The Willi Hennig Society 2009.

Multilocus ribosomal RNA phylogeny of the leaf beetles (Chrysomelidae)
CLADISTICS, Issue 1 2008
Jesús Gómez-Zurita
Basal relationships in the Chrysomelidae (leaf beetles) were investigated using two nuclear (small and partial large subunits) and mitochondrial (partial large subunit) rRNA (~3000 bp total) for 167 taxa covering most major lineages and relevant outgroups. Separate and combined data analyses were performed under parsimony and model-based tree building algorithms from dynamic (direct optimization) and static (Clustal and BLAST) sequence alignments. The performance of methods differed widely and recovery of well established nodes was erratic, in particular when using single gene partitions, but showed a slight advantage for Bayesian inferences and one of the fast likelihood algorithms (PHYML) over others. Direct optimization greatly gained from simultaneous analysis and provided a valuable hypothesis of chrysomelid relationships. The BLAST-based alignment, which removes poorly aligned sequence segments, in combination with likelihood and Bayesian analyses, resulted in highly defensible trees obtained in much shorter time than direct optimization, and hence is a viable alternative when data sets grow. The main taxonomic findings include the recognition of three major lineages of Chrysomelidae, including a basal "sagrine" clade (Criocerinae, Donaciinae, Bruchinae), which was sister to the "eumolpine" (Spilopyrinae, Eumolpinae, Cryptocephalinae, Cassidinae) plus "chrysomeline" (Chrysomelinae, Galerucinae) clades.
The analyses support a broad definition of subfamilies (i.e., merging previously separated subfamilies) in the case of Cassidinae (cassidines + hispines) and Cryptocephalinae (chlamisines + cryptocephalines + clytrines), whereas two subfamilies, Chrysomelinae and Eumolpinae, were paraphyletic. The surprising separation of monocot-feeding Cassidinae (associated with the eumolpine clade) from the other major monocot-feeding groups in the sagrine clade was well supported. The study highlights the need for thorough taxon sampling, and reveals that morphological data affected by convergence had a great impact when combined with molecular data in previous phylogenetic analyses of Chrysomelidae. © The Willi Hennig Society 2007.

A Survey of Model Evaluation Approaches With a Tutorial on Hierarchical Bayesian Methods
COGNITIVE SCIENCE - A MULTIDISCIPLINARY JOURNAL, Issue 8 2008
Richard M. Shiffrin
Abstract. This article reviews current methods for evaluating models in the cognitive sciences, including theoretically based approaches, such as Bayes factors and minimum description length measures; simulation approaches, including model mimicry evaluations; and practical approaches, such as validation and generalization measures. This article argues that, although often useful in specific settings, most of these approaches are limited in their ability to give a general assessment of models. This article argues that hierarchical methods, generally, and hierarchical Bayesian methods, specifically, can provide a more thorough evaluation of models in the cognitive sciences. This article presents two worked examples of hierarchical Bayesian analyses to demonstrate how the approach addresses key questions of descriptive adequacy, parameter interference, prediction, and generalization in principled and coherent ways.

Sensitivity to sampling in Bayesian word learning
DEVELOPMENTAL SCIENCE, Issue 3 2007
Fei Xu
We report a new study testing our proposal that word learning may be best explained as an approximate form of Bayesian inference (Xu & Tenenbaum, in press). Children are capable of learning word meanings across a wide range of communicative contexts. In different contexts, learners may encounter different sampling processes generating the examples of word–object pairings they observe. An ideal Bayesian word learner could take into account these differences in the sampling process and adjust his/her inferences about word meaning accordingly. We tested how children and adults learned words for novel object kinds in two sampling contexts, in which the objects to be labeled were sampled either by a knowledgeable teacher or by the learners themselves. Both adults and children generalized more conservatively in the former context; that is, they restricted the label to just those objects most similar to the labeled examples when the exemplars were chosen by a knowledgeable teacher, but not when chosen by the learners themselves. We discuss how this result follows naturally from a Bayesian analysis, but not from other statistical approaches such as associative word-learning models.

Using Biowin™, Bayes, and batteries to predict ready biodegradability
ENVIRONMENTAL TOXICOLOGY & CHEMISTRY, Issue 4 2004
Robert S. Boethling
Abstract. Whether or not a given chemical substance is readily biodegradable is an important piece of information in risk screening for both new and existing chemicals. Despite the relatively low cost of Organization for Economic Cooperation and Development tests, data are often unavailable and biodegradability must be estimated. In this paper, we focus on the predictive value of selected Biowin™ models and model batteries using Bayesian analysis.
Posterior probabilities, calculated based on performance with the model training sets using Bayes' theorem, were closely matched by actual performance with an expanded set of 374 premanufacture notice (PMN) substances. Further analysis suggested that a simple battery consisting of Biowin3 (survey ultimate biodegradation model) and Biowin5 (Ministry of International Trade and Industry [MITI] linear model) would have enhanced predictive power in comparison to individual models. Application of the battery to PMN substances showed that performance matched expectation. This approach significantly reduced both false positives for ready biodegradability and the overall misclassification rate. Similar results were obtained for a set of 63 pharmaceuticals using a battery consisting of Biowin3 and Biowin6 (MITI nonlinear model). Biodegradation data for PMNs tested in multiple ready tests or both inherent and ready biodegradation tests yielded additional insights that may be useful in risk screening.

Geographic Variation of Pediatric Burn Injuries in a Metropolitan Area
ACADEMIC EMERGENCY MEDICINE, Issue 7 2003
Kristine G. Williams MD
Objectives: To use a geographic information system (GIS) and spatial statistics to describe the geographic variation of burn injuries in children 0–14 years of age in a major metropolitan area.
Methods: The authors reviewed patient records for burn injuries treated during 1995 at the two children's hospitals in St. Louis. Patient addresses were matched to block groups using a GIS, and block group burn injury rates were calculated. Mapping software and Bayesian analysis were used to create maps of burn injury rates and risks in the city of St. Louis.
Results: Three hundred eleven children from the city of St. Louis were treated for burn injuries in 1995. The authors identified an area of high incidence for burn injuries in North St. Louis.
The filtered rate contour was 6 per 1,000 children at risk, with block group rates within the area of 0 to 58.8 per 1,000 children at risk. Hierarchical Bayesian analysis of North St. Louis burn data revealed a relative risk range of 0.8771 to 1.182 for census tracts within North St. Louis, suggesting that there may be pockets of high risk within an already identified high-risk area.
Conclusions: This study shows the utility of geographic mapping in providing information about injury patterns within a defined area. The combination of mapping injury rates and spatial statistical analysis provides a detailed level of injury surveillance, allowing for identification of small geographic areas with elevated rates of specific injuries.

Bayesian analysis of dynamic factor models: an application to air pollution and mortality in São Paulo, Brazil
ENVIRONMETRICS, Issue 6 2008
T. Sáfadi
Abstract. The Bayesian estimation of a dynamic factor model where the factors follow a multivariate autoregressive model is presented. We derive the posterior distributions for the parameters and the factors and use Monte Carlo methods to compute them. The model is applied to study the association between air pollution and mortality in the city of São Paulo, Brazil. Statistical analysis was performed through a Bayesian analysis of a dynamic factor model. The series considered were minimum temperature, relative humidity, the air pollutants PM10 and CO, circulatory disease mortality and respiratory disease mortality. We found a strong association between the air pollutant PM10, humidity and respiratory disease mortality for the city of São Paulo. Copyright © 2007 John Wiley & Sons, Ltd.
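
For orientation, a dynamic factor model of the kind the São Paulo abstract describes can be written in a generic state-space form. This is only a sketch under assumed notation: the symbols $y_t$, $\Lambda$, $f_t$, $\Phi$, $\Sigma$ and $Q$, the Gaussian errors, and the choice of a first-order autoregression are illustrative assumptions and are not taken from the paper.

```latex
% Generic dynamic factor model with autoregressive factors.
% All symbols and the AR(1) order are assumptions made for illustration.
\begin{align}
  y_t &= \Lambda f_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \Sigma), \\
  f_t &= \Phi f_{t-1} + \eta_t,       & \eta_t        &\sim N(0, Q),
\end{align}
% Bayesian estimation targets the joint posterior of parameters and factors:
\begin{equation}
  p(\Lambda, \Phi, \Sigma, Q, f_{1:T} \mid y_{1:T})
  \;\propto\;
  p(y_{1:T} \mid f_{1:T}, \Lambda, \Sigma)\,
  p(f_{1:T} \mid \Phi, Q)\,
  p(\Lambda, \Phi, \Sigma, Q).
\end{equation}
```

Under these assumptions, $y_t$ would stack the observed series at time $t$ (temperature, humidity, pollutant and mortality series), $f_t$ would hold a smaller number of latent factors, and the posterior would be explored by Monte Carlo sampling, as the abstract states.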

Bayesian analysis of changes in Radiosonde Atmospheric Temperature
INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 5 2009
Christoph Schleip
Abstract. This paper describes long-term changes of global atmospheric temperature, using a strict Bayesian approach which considers three different models to describe the time series: the constant model, the linear model and a change point model. The change point model allows the description of nonlinear annual rates of change with associated confidence intervals. We calculate the probabilities of each of the three models and finally average over these models to obtain the expected functional behaviour and rate of change in temperature with annual resolution. We apply this procedure to a new homogenized Radiosonde Atmospheric Temperature Products for Assessing Climate (RATPAC-A) data set. Annual mean temperature for 13 pressure levels from the surface to 30 hPa is examined. Residual sums of squares reveal that Bayesian-model-averaged function descriptions and rates of change are especially useful and informative for the surface, troposphere and tropopause and less appropriate for the stratosphere. From the surface up to the tropopause (200–100 hPa), the results reveal that the change point model provides the best data fit. Despite the occurrence of two volcanic eruptions, El Chichón (1982) and Mt. Pinatubo (1991), the stratosphere (70–30 hPa) shows a preference for the linear model (60%). The near-surface changes exhibit comparatively high change point probability around 1985 and 1995, whereas those at the tropopause level are highest between 1995 and 2000. For the surface and troposphere the model-averaged functional behaviour increases quite constantly, whereas the model-averaged functional behaviour for the tropopause decreases until the end of the 1990s and increases from 2000 onwards. The limitations of the currently used radiosonde data render interpretation of the observed changes difficult.
Additionally, undetected change points may result from our limited model space. In future it should be tested whether a multiple change point model provides a better data description for the stratosphere. Copyright © 2008 Royal Meteorological Society

Frequency distribution of a Cys430Ser polymorphism in peroxisome proliferator-activated receptor-gamma coactivator-1 (PPARGC1) gene sequence in Chinese and Western pig breeds
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 1 2005
T. Kunej
Summary. Identification of major genes that genetically impact fat tissue formation is important for successful selection of lean animals with good meat quality. Because of its central role in fat cell differentiation and muscle fibre type determination, PPARGC1 is a potential candidate gene affecting fattening traits and pig meat quality. In this study, a T/A substitution at position 1378 (GenBank accession no. AY346131) in the porcine PPARGC1 gene causing a Cys430Ser amino acid substitution at position 430 was genotyped on a total of 239 animals, including 101 from seven Chinese and 138 from six Western pig breeds. Bayesian analysis revealed that the mean frequency of allele T (Cys) was 92.64 ± 4.82% in Chinese pigs, and 45.99 ± 4.13% in Western pigs. The 95% interval of the posterior mean frequency of allele T was 0.82–1.00 in Chinese pigs and 0.38–0.54 in Western pigs, indicating these two groups of pigs diverged at this locus during genetic evolution of the breed. Because marked differences in fat and lean tissue deposition exist between Western and Chinese pig breeds, this Cys430Ser exchange in the PPARGC1 gene deserves further evaluation to determine its phenotypic effect on fattening and carcass traits in commercial pig populations.

Reference and probability-matching priors in Bayesian analysis of mixed linear models
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 5 2002
A. L. Pretorius
Summary. Determination of reasonable non-informative priors in multiparameter problems is not easy; common non-informative priors, such as Jeffreys' prior, can have features that have an unexpectedly dramatic effect on the posterior. In recognition of this problem Berger and Bernardo (Bayesian Statistics IV. Oxford University Press, Oxford, UK, pp. 35–70, 1992) proposed the Reference Prior approach to the development of non-informative priors. In the present paper the reference priors of Berger and Bernardo (1992) are derived for the mixed linear model. In spite of these difficulties, there is growing evidence, mainly through examples, that reference priors provide 'sensible' answers from a Bayesian point of view. We also examine whether the reference priors satisfy the probability-matching criterion. The theory and results are applied to a real problem consisting of 879 weaning weight records, from the progeny of 17 sires. These important aspects are explored via Monte Carlo simulations.
Summary (translated from the German Zusammenfassung). Reference and probability-matching priors in Bayesian analyses of mixed linear models. Determining reasonable non-informative priors in multiparameter analyses is not easy; common non-informative priors, such as Jeffreys' prior, can have unexpectedly dramatic effects on the solutions. To address this problem, Berger and Bernardo (Bayesian Statistics IV. Oxford University Press, Oxford, UK, pp. 35–70, 1992) proposed the reference prior approach to the development of non-informative priors. In the present study, the reference priors proposed by Berger and Bernardo (1992) are applied to linear mixed models. Despite the known difficulties, there is growing evidence, mainly from examples, that reference priors lead to usable answers from a Bayesian point of view. It was also examined whether these reference priors satisfy the probability-matching criteria. The theory and the results were verified on an example with 879 records of weaning weights of sheep descended from 17 rams. The important aspects were examined by means of Monte Carlo simulation.