Complex Models (complex + models)
Selected Abstracts

Data Preparation for Real-time High Quality Rendering of Complex Models
COMPUTER GRAPHICS FORUM, Issue 3 2006. Reinhard Klein
The capability of current 3D acquisition systems to digitize the geometry and reflection behaviour of objects, as well as the sophisticated application of CAD techniques, leads to rapidly growing digital models which pose new challenges for interaction and visualization. Due to the sheer size of the geometry as well as the texture and reflection data, which are often in the range of several gigabytes, efficient techniques for analyzing, compressing and rendering are needed. In this talk I will present some of the research we did in our graphics group over the past years, motivated by industrial partners, in order to automate the data preparation step and allow for real-time high quality rendering, e.g. in the context of VR applications. Strengths and limitations of the different techniques will be discussed and future challenges will be identified. The presentation will go along with live demonstrations. [source]

Score Tests for Exploring Complex Models: Application to HIV Dynamics Models
BIOMETRICAL JOURNAL, Issue 1 2010. Julia Drylewicz
Abstract In biostatistics, more and more complex models are being developed. This is particularly the case in systems biology. Fitting complex models can be very time-consuming, since many models often have to be explored. Among the possibilities are the introduction of explanatory variables and the determination of random effects. Score tests make it possible to assess such extensions without fitting each more complex model. The particularity of this use of the score test is that the null hypothesis is not itself very simple; typically, some random effects may be present under the null hypothesis. Moreover, the information matrix cannot be computed exactly, but only approximated based on the score. This article examines this situation with the specific example of HIV dynamics models. We examine score test statistics for testing the effect of explanatory variables and the variance of a random effect in this complex situation. We study the type I error rates and statistical power of these score test statistics, and we apply the score test approach to a real data set of HIV-infected patients. [source]
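The appeal of the score test in this setting is that only the null model has to be fitted. As a minimal, hedged illustration (not the authors' HIV dynamics implementation, which involves nonlinear mixed-effects models and an approximated information matrix), the sketch below computes a Rao score test for adding a single explanatory variable to an ordinary linear model; the function name and arguments are ours.

```python
import numpy as np
from scipy import stats

def score_test_add_covariate(y, X0, z):
    """Rao score test of H0: 'the coefficient of z is zero', fitting only the null model y ~ X0."""
    n = len(y)
    H0 = X0 @ np.linalg.solve(X0.T @ X0, X0.T)   # hat matrix of the null model
    r = y - H0 @ y                               # null-model residuals
    sigma2 = r @ r / n                           # ML residual variance under H0
    Mz = z - H0 @ z                              # part of z orthogonal to the null design
    U = z @ r / sigma2                           # score for the new coefficient, evaluated at 0
    info = Mz @ Mz / sigma2                      # efficient information for that coefficient
    stat = U**2 / info                           # asymptotically chi-square(1) under H0
    return stat, stats.chi2.sf(stat, df=1)

# toy usage: intercept-only null model versus adding one covariate z
rng = np.random.default_rng(0)
X0 = np.ones((200, 1))
z = rng.normal(size=200)
y = 1.0 + 0.2 * z + rng.normal(size=200)
print(score_test_add_covariate(y, X0, z))
```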
Deformation Transfer to Multi-Component Objects
COMPUTER GRAPHICS FORUM, Issue 2 2010. Kun Zhou
Abstract We present a simple and effective algorithm to transfer deformation between surface meshes with multiple components. The algorithm automatically computes spatial relationships between components of the target object, builds correspondences between source and target, and finally transfers deformation of the source onto the target while preserving cohesion between the target's components. We demonstrate the versatility of our approach on various complex models. [source]

Dynamic Sampling and Rendering of Algebraic Point Set Surfaces
COMPUTER GRAPHICS FORUM, Issue 2 2008. Gaël Guennebaud
Abstract Algebraic Point Set Surfaces (APSS) define a smooth surface from a set of points using local moving least-squares (MLS) fitting of algebraic spheres. In this paper we first revisit the spherical fitting problem and provide a new, more generic solution that includes intuitive parameters for curvature control of the fitted spheres. As a second contribution we present a novel real-time rendering system for such surfaces using a dynamic up-sampling strategy combined with a conventional splatting algorithm for high quality rendering. Our approach also includes a new view-dependent geometric error tailored to efficient and adaptive up-sampling of the surface. One of the key features of our system is its high degree of flexibility, which enables us to achieve high performance even for highly dynamic data or complex models by exploiting temporal coherence at the primitive level. We also address the issue of efficient spatial search data structures with respect to construction, access and GPU friendliness. Finally, we present an efficient parallel GPU implementation of the algorithms and search structures. [source]
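For readers unfamiliar with algebraic sphere fitting, the sketch below shows the core least-squares step in a deliberately simplified form: one algebraic sphere s(x) = u0 + u.x + u4|x|^2 is fitted so that it passes near the points while its gradient matches their normals. The actual APSS method uses spatially varying MLS weights and a particular normalization, both omitted here; the function name and the beta parameter balancing the two sets of constraints are ours.

```python
import numpy as np

def fit_algebraic_sphere(points, normals, beta=1.0):
    """Least-squares fit of s(x) = u0 + u123.x + u4*|x|^2 with s(p_i) ~ 0 and grad s(p_i) ~ n_i."""
    n, d = points.shape
    sq = np.sum(points**2, axis=1)
    # value constraints: [1, p, |p|^2] . u = 0 for every point
    A_val = np.hstack([np.ones((n, 1)), points, sq[:, None]])
    b_val = np.zeros(n)
    # gradient constraints: u123 + 2*u4*p = n, one row per point and coordinate
    A_grad = np.zeros((n * d, d + 2))
    A_grad[:, 1:1 + d] = np.tile(np.eye(d), (n, 1))
    A_grad[:, -1] = 2.0 * points.reshape(-1)
    b_grad = normals.reshape(-1)
    A = np.vstack([A_val, beta * A_grad])
    b = np.concatenate([b_val, beta * b_grad])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    u0, u123, u4 = u[0], u[1:1 + d], u[-1]
    center = -u123 / (2.0 * u4)                  # valid when u4 != 0 (otherwise the fit is a plane)
    radius = np.sqrt(max(center @ center - u0 / u4, 0.0))
    return center, radius, u
```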
Multiresolution Surface Representation Based on Displacement Volumes
COMPUTER GRAPHICS FORUM, Issue 3 2003. Mario Botsch
We propose a new representation for multiresolution models which uses volume elements enclosed between the different resolution levels to encode the detail information. Keeping these displacement volumes locally constant during a deformation of the base surface leads to a natural behaviour of the detail features. The corresponding reconstruction operator can be implemented efficiently by a hierarchical iterative relaxation scheme, providing close to interactive response times for moderately complex models. Based on this representation we implement a multiresolution editing tool for irregular polygon meshes that allows the designer to freely edit the base surface of a multiresolution model without having to care about self-intersections in the respective detailed surface. We demonstrate the effectiveness and robustness of the reconstruction by several examples with real-world data. [source]

Public policy and corporate environmental behaviour: a broader view
CORPORATE SOCIAL RESPONSIBILITY AND ENVIRONMENTAL MANAGEMENT, Issue 5 2008. Runa Sarkar
Abstract Corporate strategies to manage the business–ecological environment interface have evolved against the backdrop of regulatory pressures and stakeholder activism. Despite its relevance with respect to sustainable development, a well-developed theory encompassing all aspects of corporate environmental behaviour, especially incorporating incentive-compatible public policy measures, is yet to be developed. This paper is a step in this direction, aiming to assimilate contributions related to different aspects of corporate environmental behaviour, capturing the transition from environmental management to environmental strategy. In the process we identify areas where there is a need for further research. We find that there is plenty of scope in developing more complex models to explain a manager's rationale for adopting sustainable strategies in the backdrop of the policy regime, and in conducting more empirical (both descriptive and quantitative) work to obtain clearer insights into managerial decisions. Copyright © 2007 John Wiley & Sons, Ltd and ERP Environment. [source]

Comparison of soil moisture and meteorological controls on pine and spruce transpiration
ECOHYDROLOGY, Issue 3 2008. Eric E. Small
Abstract Transpiration is an important component of the water balance in the high elevation headwaters of semi-arid drainage basins. We compare the importance of soil moisture and meteorological controls on transpiration and quantify how these controls are different at a ponderosa pine site and a spruce site in the Jemez river drainage basin of northern New Mexico, a sub-basin of the Rio Grande. If only soil moisture controls fluctuations in transpiration, then simple hydrologic models focussed only on soil moisture limitations are reasonable for water balance studies. If meteorological controls are also critical, then more complex models are required. We measured volumetric water content in the soil and sap velocity, and assumed that transpiration is proportional to sap velocity. Ponderosa sap velocity varies with root zone soil moisture. Nearly all of the scatter in the ponderosa sap velocity–soil moisture relationship can be predicted using a simple model of potential evapotranspiration (ET), which depends only on measured incident radiation and air temperature. Therefore, simple hydrologic models of ponderosa pine transpiration are warranted. In contrast, spruce sap velocity does not clearly covary with soil moisture. Including variations in potential evapotranspiration does not clarify the relationship between sap velocity and soil moisture. Likewise, variations in radiation, air temperature, and vapour pressure do not explain the observed fluctuations in sap velocity, at least according to the standard models and parameters for meteorological restrictions on transpiration. Both the simple and more complex models commonly used to predict transpiration are not adequate to model the water balance in the spruce forest studied here. Copyright © 2008 John Wiley & Sons, Ltd. [source]
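The abstract specifies only the inputs of the potential-ET model (incident radiation and air temperature), not its exact form. As a hedged sketch, the Priestley–Taylor equation below is one common formulation that needs just those inputs; the constants are standard approximations, the ground heat flux is ignored, and the paper may well have used a different expression.

```python
import numpy as np

def priestley_taylor_pet(net_radiation_wm2, air_temp_c, alpha=1.26):
    """Potential evapotranspiration (mm/day) from net radiation and air temperature."""
    T = np.asarray(air_temp_c, dtype=float)
    Rn = np.asarray(net_radiation_wm2, dtype=float)
    es = 0.6108 * np.exp(17.27 * T / (T + 237.3))   # saturation vapour pressure (kPa)
    delta = 4098.0 * es / (T + 237.3) ** 2          # slope of the saturation curve (kPa/degC)
    gamma = 0.066                                   # psychrometric constant near sea level (kPa/degC)
    lam = 2.45e6                                    # latent heat of vaporization (J/kg)
    Rn_daily = Rn * 86400.0                         # W/m2 -> J/m2 per day
    return alpha * (delta / (delta + gamma)) * Rn_daily / lam   # kg/m2/day, i.e. mm/day
```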
Empirical Bayes estimators and non-parametric mixture models for space and time–space disease mapping and surveillance
ENVIRONMETRICS, Issue 5 2003. Dankmar Böhning
Abstract The analysis of the geographic variation of disease and its representation on a map is an important topic in epidemiological research and in public health in general. Identification of spatial heterogeneity of relative risk using morbidity and mortality data is required. Frequently, interest is also in the analysis of spatial data with respect to time, where typically the data are aggregated into time windows of 5 or 10 years. The occurrence measure of interest is usually the standardized mortality (morbidity) ratio (SMR). It is well known that disease maps in space, or in space and time, should not be based solely on the crude SMR but rather on some smoothed version of it. This fact has led to a tremendous amount of theoretical development in spatial methodology, in particular in the area of hierarchical modeling in connection with fully Bayesian estimation techniques such as Markov chain Monte Carlo. It seems, however, that while these theoretical developments took place, only very few of them have found their way into the daily practice of epidemiological work and surveillance routines. In this article we focus on developments that avoid the pitfalls of the crude SMR and simultaneously retain simplicity and, at least approximately, the validity of more complex models. After an illustration of the typical pitfalls of the crude SMR, the article is centered around three issues: (a) the separation of spatial random variation from spatial structural variation; (b) a simple mixture model for capturing spatial heterogeneity; (c) an extension of this model for capturing temporal information. The techniques are illustrated by numerous examples. Public domain software such as Dismap, which enables easy mixture modeling in the context of disease mapping, is mentioned. Copyright © 2003 John Wiley & Sons, Ltd. [source]
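To make the contrast between crude and smoothed SMRs concrete, the sketch below computes crude SMRs and a simple Poisson–Gamma empirical Bayes shrinkage estimate, with the prior fitted by a method-of-moments step. This is only the most elementary parametric analogue of the smoothing the article discusses; it is not the non-parametric mixture approach implemented in software such as Dismap, and the function names are ours.

```python
import numpy as np

def crude_smr(observed, expected):
    return np.asarray(observed, float) / np.asarray(expected, float)

def eb_smoothed_smr(observed, expected):
    """Poisson-Gamma empirical Bayes shrinkage of area-level SMRs (method-of-moments prior)."""
    O = np.asarray(observed, float)
    E = np.asarray(expected, float)
    smr = O / E
    m = np.average(smr, weights=E)                 # prior mean relative risk (= sum(O)/sum(E))
    s2 = np.average((smr - m) ** 2, weights=E)     # total variability of the crude SMRs
    v = max(s2 - m / np.mean(E), 1e-12)            # prior variance after removing Poisson noise
    alpha, beta = m**2 / v, m / v                  # Gamma(shape=alpha, rate=beta) prior
    return (O + alpha) / (E + beta)                # posterior mean relative risk for each area
```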
Contemporary Models of Youth Development and Problem Prevention: Toward an Integration of Terms, Concepts, and Models
FAMILY RELATIONS, Issue 1 2004. Stephen Small
Over the past several years, increased interest in preventing youth problems and promoting healthy youth development has led youth and family practitioners, policy makers, and researchers to develop a wide range of approaches based on various theoretical frameworks. Although the growth in guiding frameworks has led to more complex models and a greater diversity in the options available to scholars and practitioners, the lack of an integrative conceptual scheme and consistent terminology has led to some confusion in the field. Here, we provide an overview of three approaches to youth development and problem prevention, critically examine their strengths and weaknesses, and offer some elaborations to help clarify, extend, and integrate the models. We conclude by discussing some general implications for researchers, practitioners, and policy makers. [source]

Model complexity versus scatter in fatigue
FATIGUE & FRACTURE OF ENGINEERING MATERIALS AND STRUCTURES, Issue 11 2004. T. SVENSSON
ABSTRACT Fatigue assessment in industry is often based on simple empirical models, such as the Wöhler curve or the Paris law. In contrast, fatigue research to a great extent works with very complex models, far from engineering practice. One explanation for this discrepancy is that the scatter in service fatigue obscures many of the subtle phenomena that can be studied in a laboratory. Here we use a statistical theory for stepwise regression to investigate the role of scatter in the choice of model complexity in fatigue. The results suggest that the amount of complexity used in different design concepts reflects the appreciated knowledge about input parameters. The analysis also points out that even qualitative knowledge about the neglected complexity may be important in order to avoid systematic errors. [source]

MCMC-based linkage analysis for complex traits on general pedigrees: multipoint analysis with a two-locus model and a polygenic component
GENETIC EPIDEMIOLOGY, Issue 2 2007. Yun Ju Sung
Abstract We describe a new program, lm_twoqtl, part of the MORGAN package, for parametric linkage analysis with a quantitative trait locus (QTL) model having one or two QTLs and a polygenic component, which models additional familial correlation from other unlinked QTLs. The program has no restriction on the number of markers or the complexity of pedigrees, facilitating use of more complex models with general pedigrees. This is the first available program that can handle a model with both two QTLs and a polygenic component. Competing programs use only simpler models: one QTL, one QTL plus a polygenic component, or variance components (VC). Use of simple models when they are incorrect, as for complex traits that are influenced by multiple genes, can bias estimates of QTL location or reduce power to detect linkage. We compute the likelihood with Markov chain Monte Carlo (MCMC) realization of segregation indicators at the hypothesized QTL locations conditional on marker data, summation over phased multilocus genotypes of founders, and peeling of the polygenic component. Simulated examples, with various sized pedigrees, show that two-QTL analysis correctly identifies the location of both QTLs, even when they are closely linked, whereas other analyses, including the VC approach, fail to identify the location of QTLs with modest contribution. Our examples illustrate the advantage of parametric linkage analysis with two QTLs, which provides higher power for linkage detection and better localization than simpler models. Genet. Epidemiol. © 2006 Wiley-Liss, Inc. [source]

Analysis of multilocus models of association
GENETIC EPIDEMIOLOGY, Issue 1 2003. B. Devlin
Abstract It is increasingly recognized that multiple genetic variants, within the same or different genes, combine to affect liability for many common diseases. Indeed, the variants may interact among themselves and with environmental factors. Thus realistic genetic/statistical models can include an extremely large number of parameters, and it is by no means obvious how to find the variants contributing to liability. For models of multiple candidate genes and their interactions, we prove that statistical inference can be based on controlling the false discovery rate (FDR), which is defined as the expected number of false rejections divided by the number of rejections. Controlling the FDR automatically controls the overall error rate in the special case that all the null hypotheses are true. So do more standard methods such as Bonferroni correction. However, when some null hypotheses are false, the goals of Bonferroni and FDR differ, and FDR will have better power. Model selection procedures, such as forward stepwise regression, are often used to choose important predictors for complex models. By analysis of simulations of such models, we compare a computationally efficient form of forward stepwise regression against the FDR methods. We show that model selection includes numerous genetic variants having no impact on the trait, whereas FDR maintains a false-positive rate very close to the nominal rate. With good control over false positives and better power than Bonferroni, the FDR-based methods we introduce present a viable means of evaluating complex, multivariate genetic models. Naturally, as for any method seeking to explore complex genetic models, the power of the methods is limited by sample size and model complexity. Genet Epidemiol 25:36–47, 2003. © 2003 Wiley-Liss, Inc. [source]
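FDR control of the kind referred to here is usually implemented with the Benjamini–Hochberg step-up rule; the sketch below shows that generic procedure next to the Bonferroni correction for contrast. It is not the specific FDR-based model-search machinery introduced in the paper.

```python
import numpy as np

def bonferroni_reject(pvals, alpha=0.05):
    p = np.asarray(pvals, float)
    return p <= alpha / p.size                   # controls the family-wise error rate

def benjamini_hochberg_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate at level alpha."""
    p = np.asarray(pvals, float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below))            # largest rank i with p_(i) <= alpha * i / m
        reject[order[: k + 1]] = True            # reject every hypothesis up to that rank
    return reject
```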
The GEOTOP snow module
HYDROLOGICAL PROCESSES, Issue 18 2004. Fabrizio Zanotti
Abstract A snow accumulation and melt module implemented in the GEOTOP model is presented and tested. GEOTOP, a distributed model of the hydrological cycle based on digital elevation models (DEMs), calculates the discharge at the basin outlet and estimates the local and distributed values of several hydro-meteorological quantities. It solves the energy and the mass balance jointly and deals accurately with the effects of topography on the interactions among radiation physics, energy balance and the hydrological cycle. Soil properties are considered to depend on soil temperature and moisture, and the heat and water transfer in the soil is modelled using a multilayer approach. The snow module solves for the soil–snow energy and mass exchanges and, together with a runoff production module, is embedded in a more general energy balance model that provides all the boundary conditions required. The snowpack is schematized as a single snow layer where a limited number of physical processes are described. The module can be seen essentially as a parameter-free model. The application to an alpine catchment (Rio Valbiolo, Trentino, Italy), monitored by an in situ snow-depth sensor, is discussed and shown to give results comparable to those of more complex models. Copyright © 2004 John Wiley & Sons, Ltd. [source]

A modelling strategy for the analysis of clinical trials with partly missing longitudinal data
INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, Issue 3 2003. Ian R. White
Abstract Standard statistical analyses of randomized controlled trials with partially missing outcome data often exclude valuable information from individuals with incomplete follow-up. This may lead to biased estimates of the intervention effect and loss of precision. We consider a randomized trial with a repeatedly measured outcome, in which the value of the outcome on the final occasion is of primary interest. We propose a modelling strategy in which the model is successively extended to include baseline values of the outcome, then intermediate values of the outcome, and finally values of other outcome variables. Likelihood-based estimation of random effects models is used, allowing the incorporation of data from individuals with some missing outcomes. Each estimated intervention effect is free of non-response bias under a different missing-at-random assumption. These assumptions become more plausible as the more complex models are fitted, so we propose using the trend in estimated intervention effects to assess the nature of any non-response bias. The methods are applied to data from a trial comparing intensive case management with standard case management for severely psychotic patients. All models give similar estimates of the intervention effect and we conclude that non-response bias is likely to be small. Copyright © 2003 Whurr Publishers Ltd. [source]
Identifiability of parameters and behaviour of MCMC chains: a case study using the reaction norm model
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 2 2009. M.M. Shariati
Summary Markov chain Monte Carlo (MCMC) enables fitting complex hierarchical models that may adequately reflect the process of data generation. Some of these models may contain more parameters than can be uniquely inferred from the distribution of the data, causing non-identifiability. The reaction norm model with unknown covariates (RNUC) is a model in which unknown environmental effects can be inferred jointly with the remaining parameters. The problem of identifiability of parameters at the level of the likelihood and the associated behaviour of MCMC chains were discussed using the RNUC as an example. It was shown theoretically that when environmental effects (covariates) are considered as random effects, estimable functions of the fixed effects, (co)variance components and genetic effects are identifiable, as are the environmental effects. When the environmental effects are treated as fixed and there are other fixed factors in the model, the contrasts involving environmental effects, the variance of environmental sensitivities (genetic slopes) and the residual variance are the only identifiable parameters. These different identifiability scenarios were generated by changing the formulation of the model and the structure of the data, and the models were then implemented via MCMC. The output of the MCMC sampling schemes was interpreted in the light of the theoretical findings. The erratic behaviour of the MCMC chains was shown to be associated with identifiability problems in the likelihood, despite propriety of the posterior distributions, achieved by arbitrarily chosen uniform (bounded) priors. In some cases, very long chains were needed before the behaviour of the chain signalled the existence of problems. The paper serves as a warning concerning the implementation of complex models where identifiability problems can be difficult to detect a priori. We conclude that it would be good practice to experiment with a proposed model and to understand its features before embarking on a full MCMC implementation. [source]

Reliable computing in estimation of variance components
JOURNAL OF ANIMAL BREEDING AND GENETICS, Issue 6 2008. I. Misztal
Summary The purpose of this study is to present guidelines for the selection of statistical and computing algorithms for variance components estimation when computing involves software packages. For this purpose two major methods are considered: residual maximum likelihood (REML) and Bayesian estimation via Gibbs sampling. Expectation-Maximization (EM) REML is regarded as a very stable algorithm that is able to converge when covariance matrices are close to singular; however, it is slow. Convergence problems can also occur with random regression models, especially if the starting values are much lower than those at convergence. Average Information (AI) REML is much faster for common problems, but it relies on heuristics for convergence and may be very slow or even diverge for complex models. REML algorithms for general models become unstable with a larger number of traits. REML by canonical transformation is stable in such cases but can support only a limited class of models. In general, REML algorithms are difficult to program. Bayesian methods via Gibbs sampling are much easier to program than REML, especially for complex models, and they can support much larger datasets; however, the termination criterion can be hard to determine, and the quality of estimates depends on a number of details. Computing speed varies with computing optimizations, with which some large data sets and complex models can be supported in a reasonable time; however, optimizations increase the complexity of programming and restrict the types of models applicable. Several examples from past research are discussed to illustrate the fact that different problems require different methods. [source]
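As a hedged illustration of why Gibbs sampling is comparatively easy to program, the sketch below samples the two variance components of the simplest possible mixed model, y_ij = mu + a_i + e_ij, under weak inverse-gamma priors. Real animal-breeding analyses involve far larger mixed models and specialized software; the function name, prior values and data layout here are ours.

```python
import numpy as np

def gibbs_variance_components(y, group, n_iter=5000, seed=1):
    """Gibbs sampler for y_ij = mu + a_i + e_ij, with a_i ~ N(0, s2_a) and e_ij ~ N(0, s2_e)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, float)
    levels, idx = np.unique(np.asarray(group), return_inverse=True)
    q, N = levels.size, y.size
    n_i = np.bincount(idx)
    mu, a = y.mean(), np.zeros(q)
    s2_a = s2_e = y.var()
    keep = []
    for it in range(n_iter):
        # group effects a_i | rest: normal with precision n_i/s2_e + 1/s2_a
        resid_sum = np.bincount(idx, weights=y - mu)
        prec = n_i / s2_e + 1.0 / s2_a
        a = rng.normal((resid_sum / s2_e) / prec, np.sqrt(1.0 / prec))
        # overall mean mu | rest (flat prior)
        r = y - a[idx]
        mu = rng.normal(r.mean(), np.sqrt(s2_e / N))
        # variances | rest: inverse-gamma draws via 1/Gamma, with weak IG(0.001, 0.001) priors
        s2_a = 1.0 / rng.gamma(0.001 + q / 2.0, 1.0 / (0.001 + (a @ a) / 2.0))
        e = y - mu - a[idx]
        s2_e = 1.0 / rng.gamma(0.001 + N / 2.0, 1.0 / (0.001 + (e @ e) / 2.0))
        if it >= n_iter // 2:                    # discard the first half as burn-in
            keep.append((s2_a, s2_e))
    return np.array(keep)                        # posterior draws of (s2_a, s2_e)
```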
Description of growth by simple versus complex models for Baltic Sea spring spawning herring
JOURNAL OF APPLIED ICHTHYOLOGY, Issue 1 2001. J. Gröger
The objective was to find a length–growth model to help differentiate between herring stocks (Clupea harengus L.) when their length–growth shows systematically different patterns. The most essential model restriction was that it should react robustly against variations in the underlying age range, which varies not only over time but also between the different herring stocks. Because of the limited age range, significance tests as well as confidence intervals of the model parameters should allow for the small-sample situation. Thus, parameter estimation should be of an analytical rather than asymptotic nature and the model should contain a minimum set of parameters. The article studies the comparative characteristics of a simple non-asymptotic two-parameter growth model (the allometric length–growth function, abbreviated as the ALG model) in contrast to higher-parametric and more complex growth models (the logistic and von Bertalanffy growth functions, abbreviated as the LGF and VBG models). An advantage of the ALG model is that it can be easily linearized and the growth coefficients can be directly derived as regression parameters. The intrinsic linearity of the ALG model makes it easy to test restrictions (normality, homoscedasticity and serial uncorrelation of the error term) and to formulate analytic confidence intervals. The ALG model features were exemplified and validated using a 1995 Baltic spring spawning herring (BSSH) data set that included a 12-year age range. The model performance was compared with that of the logistic and the von Bertalanffy length–growth curves for different age ranges and by means of various parameter estimation techniques. In all cases the ALG model performed better, and all ALG model restrictions (no autocorrelation, homoscedasticity, and normality of the error term) were fulfilled. Furthermore, all findings seemed to indicate a pseudo-asymptotic growth for BSSH. The proposed model was explicitly derived for herring length–growth; the results thus should not be generalized interspecifically without additional proof. [source]

Non-parametric statistical methods for multivariate calibration model selection and comparison
JOURNAL OF CHEMOMETRICS, Issue 12 2003. Edward V. Thomas
Abstract Model selection is an important issue when constructing multivariate calibration models using methods based on latent variables (e.g. partial least squares regression and principal component regression). It is important to select an appropriate number of latent variables to build an accurate and precise calibration model. Inclusion of too few latent variables can result in a model that is inaccurate over the complete space of interest. Inclusion of too many latent variables can result in a model that produces noisy predictions through incorporation of low-order latent variables that have little or no predictive value. Commonly used metrics for selecting the number of latent variables are based on the predicted error sum of squares (PRESS) obtained via cross-validation. In this paper a new approach for selecting the number of latent variables is proposed. In this new approach the prediction errors of individual observations (obtained from cross-validation) are compared across models incorporating varying numbers of latent variables. Based on these comparisons, non-parametric statistical methods are used to select the simplest model (with the least number of latent variables) that provides prediction quality indistinguishable from that provided by more complex models. Unlike methods based on PRESS, this new approach is robust to the effects of anomalous observations. More generally, the same approach can be used to compare the performance of any models that are applied to the same data set where reference values are available. The proposed methodology is illustrated with an industrial example involving the prediction of gasoline octane numbers from near-infrared spectra. Published in 2004 by John Wiley & Sons, Ltd. [source]
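A minimal sketch of this selection idea, assuming a PLS calibration and a Wilcoxon signed-rank test on the per-sample cross-validated errors (the paper does not prescribe these exact choices, and the function and parameter names are ours): starting from the simplest candidate, keep the first number of latent variables whose errors are not significantly worse than those of the lowest-PRESS model.

```python
import numpy as np
from scipy.stats import wilcoxon
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_predict

def select_n_latent_variables(X, y, max_lv=15, alpha=0.05, cv=10):
    """Smallest number of PLS latent variables not significantly worse than the lowest-PRESS model."""
    abs_err = {}
    for k in range(1, max_lv + 1):
        pred = cross_val_predict(PLSRegression(n_components=k), X, y, cv=cv).ravel()
        abs_err[k] = np.abs(np.asarray(y).ravel() - pred)    # per-observation CV errors
    press = {k: np.sum(e ** 2) for k, e in abs_err.items()}
    best = min(press, key=press.get)                         # reference: lowest-PRESS model
    for k in range(1, best + 1):                             # walk upward from the simplest model
        if k == best:
            return best
        # one-sided test: are model k's errors systematically larger than the reference model's?
        _, p = wilcoxon(abs_err[k], abs_err[best], alternative="greater")
        if p > alpha:                                        # not significantly worse: keep the simpler model
            return k
    return best
```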
Using phylochronology to reveal cryptic population histories: review and synthesis of 29 ancient DNA studies
MOLECULAR ECOLOGY, Issue 7 2009. UMA RAMAKRISHNAN
Abstract The evolutionary history of a population involves changes in size, movements and selection pressures through time. Reconstruction of population history based on modern genetic data tends to be averaged over time or to be biased by generally reflecting only recent or extreme events, leaving many population historic processes undetected. Temporal genetic data present opportunities to reveal more complex population histories and provide important insights into what processes have influenced modern genetic diversity. Here we provide a synopsis of methods available for the analysis of ancient genetic data. We review 29 ancient DNA studies, summarizing the analytical methods and general conclusions for each study. Using the serial coalescent and a model-testing approach, we then re-analyse data from two species represented by these data sets in a common interpretive framework. Our analyses show that phylochronologic data can reveal more about population history than modern data alone, thus revealing 'cryptic' population processes, and enable us to determine whether simple or complex models best explain the data. Our re-analyses point to the need for novel methods that consider gene flow, multiple populations and population size in reconstruction of population history. We conclude that population genetic samples over large temporal and geographical scales, when analysed using more complex models and the serial coalescent, are critical to understand past population dynamics and provide important tools for reconstructing the evolutionary process. [source]

Statistical hypothesis testing in intraspecific phylogeography: nested clade phylogeographical analysis vs. approximate Bayesian computation
MOLECULAR ECOLOGY, Issue 2 2009. ALAN R. TEMPLETON
Abstract Nested clade phylogeographical analysis (NCPA) and approximate Bayesian computation (ABC) have been used to test phylogeographical hypotheses. Multilocus NCPA tests null hypotheses, whereas ABC discriminates among a finite set of alternatives. The interpretive criteria of NCPA are explicit and allow complex models to be built from simple components. The interpretive criteria of ABC are ad hoc and require the specification of a complete phylogeographical model. The conclusions from ABC are often influenced by implicit assumptions arising from the many parameters needed to specify a complex model. These complex models confound many assumptions so that biological interpretations are difficult. Sampling error is accounted for in NCPA, but ABC ignores important sources of sampling error that create pseudo-statistical power. NCPA generates the full sampling distribution of its statistics, but ABC only yields local probabilities, which in turn make it impossible to distinguish between a good-fitting model, a non-informative model, and an over-determined model. Both NCPA and ABC use approximations, but convergences of the approximations used in NCPA are well defined whereas those in ABC are not. NCPA can analyse a large number of locations, but ABC cannot. Finally, the dimensionality of the tested hypothesis is known in NCPA, but not for ABC. As a consequence, the 'probabilities' generated by ABC are not true probabilities and are statistically non-interpretable. Accordingly, ABC should not be used for hypothesis testing, but simulation approaches are valuable when used in conjunction with NCPA or other methods that do not rely on highly parameterized models. [source]
SOFTWARE ENGINEERING CONSIDERATIONS FOR INDIVIDUAL-BASED MODELS
NATURAL RESOURCE MODELING, Issue 1 2002. GLEN E. ROPELLA
ABSTRACT. Software design is much more important for individual-based models (IBMs) than it is for conventional models, for three reasons. First, the results of an IBM are the emergent properties of a system of interacting agents that exist only in the software; unlike analytical model results, an IBM's outcomes can be reproduced only by exactly reproducing its software implementation. Second, outcomes of an IBM are expected to be complex and novel, making software errors difficult to identify. Third, an IBM needs 'systems software' that manages populations of multiple kinds of agents, often has nonlinear and multi-threaded process control and simulates a wide range of physical and biological processes. General software guidelines for complex models are especially important for IBMs. (1) Have code critically reviewed by several people. (2) Follow prudent release management practices, keeping careful control over the software as changes are implemented. (3) Develop multiple representations of the model and its software; diagrams and written descriptions of code aid design and understanding. (4) Use appropriate and widespread software tools, which provide numerous major benefits; coding 'from scratch' is rarely appropriate. (5) Test the software continually, following a planned, multi-level, experimental strategy. (6) Provide tools for thorough, pervasive validation and verification. (7) Pay attention to how pseudorandom numbers are generated and used. Additional guidelines for IBMs include: (a) design the model's organization before starting to write code; (b) provide the ability to observe all parts of the model from the beginning; (c) make an extensive effort to understand how the model executes (how often different pieces of code are called by which objects); and (d) design the software to resemble the system being modeled, which helps maintain an understanding of the software. Strategies for meeting these guidelines include planning adequate resources for software development, using software professionals to implement models, and using tools like Swarm that are designed specifically for IBMs. [source]

Artificial neural networks as statistical tools in epidemiological studies: analysis of risk factors for early infant wheeze
PAEDIATRIC & PERINATAL EPIDEMIOLOGY, Issue 6 2004. Andrea Sherriff
Summary Artificial neural networks (ANNs) are being used increasingly for the prediction of clinical outcomes and classification of disease phenotypes. A lack of understanding of the statistical principles underlying ANNs has led to widespread misuse of these tools in the biomedical arena. In this paper, the authors compare the performance of ANNs with that of conventional linear logistic regression models in an epidemiological study of infant wheeze. Data on the putative risk factors for infant wheeze have been obtained from a sample of 7318 infants taking part in the Avon Longitudinal Study of Parents and Children (ALSPAC). The data were analysed using logistic regression models and ANNs, and performance was compared on the basis of misclassification rates in a validation data set. Misclassification rates in the training data set decreased as the complexity of the ANN increased: h = 0: 17.9%; h = 2: 16.2%; h = 5: 14.9%; and h = 10: 9.2%. However, the more complex models did not generalise well to new data sets drawn from the same population; validation data set misclassification rates were: h = 0: 17.9%; h = 2: 19.6%; h = 5: 20.2%; and h = 10: 22.9%. There is no evidence from this study that ANNs outperform conventional methods of analysing epidemiological data. Increasing the complexity of the models serves only to overfit the model to the data. It is important that a validation or test data set is used to assess the performance of highly complex ANNs to avoid overfitting. [source]
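The ALSPAC data are not available here, so the sketch below only reproduces the shape of the comparison on synthetic data: a logistic regression (the h = 0 case) against small multilayer perceptrons, with h taken to be the number of hidden units, and misclassification rates reported on both the training set and a held-out validation set. scikit-learn's MLPClassifier stands in for whatever ANN software the study used, and the simulated risk factors are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(7318, 10))                      # placeholder risk-factor matrix
logit = X[:, 0] - 0.5 * X[:, 1] + 0.3 * X[:, 2]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))    # synthetic binary wheeze outcome

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

models = {"h=0 (logistic)": LogisticRegression(max_iter=1000)}
for h in (2, 5, 10):
    models[f"h={h}"] = MLPClassifier(hidden_layer_sizes=(h,), max_iter=2000, random_state=0)

for name, model in models.items():
    model.fit(X_tr, y_tr)
    train_err = 1.0 - model.score(X_tr, y_tr)        # misclassification = 1 - accuracy
    valid_err = 1.0 - model.score(X_va, y_va)
    print(f"{name}: train {train_err:.3f}, validation {valid_err:.3f}")
```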
Octahedral tilting in cation-ordered Jahn–Teller distorted perovskites – a group-theoretical analysis
ACTA CRYSTALLOGRAPHICA SECTION B, Issue 1 2010. Christopher J. Howard
Computer-based group-theoretical methods are used to enumerate structures arising in A2BB′X6 perovskites, with either rock-salt or checkerboard ordering of the B and B′ cations, under the additional assumption that one of these two cations is Jahn–Teller active and thereby induces a distortion of the BX6 (or B′X6) octahedron. The requirement to match the pattern of Jahn–Teller distortions to the cation ordering implies that the corresponding irreducible representations should be associated with the same point in the Brillouin zone. Effects of BX6 (and B′X6) octahedral tilting are included in the usual way. Finally, an analysis is presented of more complex models of ordering and distortion as might lead to the doubling of the long axis of the common Pnma perovskite, observed in systems such as Pr1−xCaxMnO3 (x ≈ 0.5). The structural hierarchies derived in this work should prove useful in interpreting experimental results. [source]

Sexual selection research on spiders: progress and biases
BIOLOGICAL REVIEWS, Issue 3 2005. Bernhard A. Huber
ABSTRACT The renaissance of interest in sexual selection during the last decades has fuelled an extraordinary increase of scientific papers on the subject in spiders. Research has focused both on the process of sexual selection itself, for example on the signals and various modalities involved, and on the patterns, that is the outcome of mate choice and competition depending on certain parameters. Sexual selection has most clearly been demonstrated in cases involving visual and acoustical signals, but most spiders are myopic and mute, relying rather on vibrations, chemical and tactile stimuli. This review argues that research has been biased towards modalities that are relatively easily accessible to the human observer. Circumstantial and comparative evidence indicates that sexual selection working via substrate-borne vibrations and tactile as well as chemical stimuli may be common and widespread in spiders. Pattern-oriented research has focused on several phenomena for which spiders offer excellent model objects, like sexual size dimorphism, nuptial feeding, sexual cannibalism, and sperm competition. The accumulating evidence argues for a highly complex set of explanations for seemingly uniform patterns like size dimorphism and sexual cannibalism. Sexual selection appears to be involved, as well as natural selection and mechanisms that are adaptive in other contexts only. Sperm competition has resulted in a plethora of morphological and behavioural adaptations, and simplistic models like those linking reproductive morphology with behaviour and sperm priority patterns in a straightforward way are being replaced by complex models involving an array of parameters. Male mating costs are increasingly being documented in spiders, and sexual selection by male mate choice is discussed as a potential result. Research on sexual selection in spiders has come a long way since Darwin, whose spider examples are reanalysed in the context of contemporary knowledge, but the same biases and methodological constraints have persisted almost unchanged through the current boom of research. [source]
Models for Estimating Bayes Factors with Applications to Phylogeny and Tests of Monophyly
BIOMETRICS, Issue 3 2005. Marc A. Suchard
Summary Bayes factors comparing two or more competing hypotheses are often estimated by constructing a Markov chain Monte Carlo (MCMC) sampler to explore the joint space of the hypotheses. To obtain efficient Bayes factor estimates, Carlin and Chib (1995, Journal of the Royal Statistical Society, Series B 57, 473–484) suggest adjusting the prior odds of the competing hypotheses so that the posterior odds are approximately one, then estimating the Bayes factor by simple division. A byproduct is that one often produces several independent MCMC chains, only one of which is actually used for estimation. We extend this approach to incorporate output from multiple chains by proposing three statistical models. The first assumes independent sampler draws and models the hypothesis indicator function using logistic regression for various choices of the prior odds. The two more complex models relax the independence assumption by allowing for higher-lag dependence within the MCMC output. These models allow us to estimate the uncertainty in our Bayes factor calculation and to fully use several different MCMC chains even when the prior odds of the hypotheses vary from chain to chain. We apply these methods to calculate Bayes factors for tests of monophyly in two phylogenetic examples. The first example explores the relationship of an unknown pathogen to a set of known pathogens. Identification of the unknown's monophyletic relationship may affect antibiotic choice in a clinical setting. The second example focuses on HIV recombination detection. For potential clinical application, these types of analyses must be completed as efficiently as possible. [source]
Phylogeny, biogeography and classification of the snake superfamily Elapoidea: a rapid radiation in the late Eocene
CLADISTICS, Issue 1 2009. Christopher M. R. Kelly
The snake superfamily Elapoidea presents one of the most intransigent problems in systematics of the Caenophidia. Its monophyly is undisputed and several cohesive constituent lineages have been identified (including the diverse and clinically important family Elapidae), but its basal phylogenetic structure is obscure. We investigate phylogenetic relationships and the spatial and temporal history of the Elapoidea using 94 caenophidian species and approximately 2300–4300 bases of DNA sequence from one nuclear and four mitochondrial genes. Phylogenetic reconstruction was conducted in a parametric framework using complex models of sequence evolution. We employed Bayesian relaxed clocks and Penalized Likelihood with rate smoothing to date the phylogeny, in conjunction with seven fossil calibration constraints. Elapoid biogeography was investigated using maximum likelihood and maximum parsimony methods. Resolution was poor for early relationships in the Elapoidea and in Elapidae, and our results imply rapid basal diversification in both clades, in the late Eocene of Africa (Elapoidea) and the mid-Oligocene of the Oriental region (Elapidae). We identify the major elapoid and elapid lineages, present a phylogenetic classification system for the superfamily (excluding Elapidae), and combine our phylogenetic, temporal and biogeographic results to provide an account of elapoid evolution in light of current palaeontological data and palaeogeographic models. © The Willi Hennig Society 2009. [source]

Patients' explanations for depression: a factor analytic study
CLINICAL PSYCHOLOGY AND PSYCHOTHERAPY (AN INTERNATIONAL JOURNAL OF THEORY & PRACTICE), Issue 1 2008. Rick Budd
Objectives: Previous questionnaire studies have attempted to explore the factor structure of lay beliefs about the causes of depression. These studies have tended either to fail to sample the full range of possible causal explanations or to extract too many factors, thereby producing complex solutions. The main objective of the present study was to obtain a more complete and robust factor structure of lay theories of depression while more adequately sampling from the full range of hypothesized causes of depression. A second objective of the study was to explore the relationship between respondents' explanations for depression and their perceptions of the helpfulness of different treatments received. Method and design: A 77-item questionnaire comprising possible reasons for 'why a person might get depressed' was mailed out to members of a large self-help organization. Also included was a short questionnaire inviting respondents to note treatments received and their perceptions of the helpfulness of these treatments. Data from the 77-item questionnaire were subjected to a principal components analysis. Results: The reasons rated as most important causes of depression related to recent bereavement, imbalance in brain chemistry and having suffered sexual assault/abuse. The data were best described by a two-factor solution, with the first factor clearly representing stress and the second factor depressogenic beliefs, the latter corresponding to a cognitive–behavioural formulation of depression aetiology. The two scales thus derived did not, however, correspond substantially with rated helpfulness for different treatments received. Conclusions: The factor structure obtained, comprising two factors, was in contrast to the more complex models from previous studies. It is likely to be more robust and meaningful. It accords with previous research on lay theories of depression, which highlights 'stress' as a key cause of depression. Possible limitations of the study are discussed, and it is suggested that using the questionnaire with more recently depressed people might yield clearer findings in relation to perceptions of treatment helpfulness. Copyright © 2008 John Wiley & Sons, Ltd. [source]
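For readers who want to see what a two-component solution of this kind looks like in code, the sketch below standardizes Likert-type item responses and extracts two principal components, mirroring the principal components analysis reported (the rotation and scoring details of the original study are not reproduced). The respondent-by-item matrix here is a random placeholder, not the study's data.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
items = rng.integers(1, 6, size=(500, 77)).astype(float)   # placeholder: 500 respondents x 77 items

Z = (items - items.mean(axis=0)) / items.std(axis=0)       # standardize each item
pca = PCA(n_components=2)                                   # two components, cf. the two-factor solution
scores = pca.fit_transform(Z)                               # respondent scores on the two components
loadings = pca.components_.T                                # item loadings, to be inspected and labelled
print(pca.explained_variance_ratio_)
```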