Statistical Properties



Selected Abstracts


Statistical Properties of the K-Index for Detecting Answer Copying

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 2 2002
Leonardo S. Sotaridona
We investigated the statistical properties of the K-index (Holland, 1996) that can be used to detect copying behavior on a test. A simulation study was conducted to investigate the applicability of the K-index for small, medium, and large datasets. Furthermore, the Type I error rate and the detection rate of this index were compared with those of the copying index ω (Wollack, 1997). Several approximations were used to calculate the K-index. Results showed that all approximations were able to hold the Type I error rates below the nominal level. Results further showed that using ω resulted in higher detection rates than the K-indices for small and medium sample sizes (100 and 500 simulees). [source]


A Comparison of the Statistical Properties of Financial Variables in the USA, UK and Germany over the Business Cycle

THE MANCHESTER SCHOOL, Issue 4 2000
Elena Andreou
This paper presents business cycle stylized facts for the US, UK and German economies. We examine whether financial variables (interest rates, stock market price indices, dividend yields and monetary aggregates) predict economic activity over the business cycle, and we investigate the nature of any non-linearities in these variables. Leading indicator properties are examined using cross-correlations for both the values of the variables and their volatilities. Our results imply that the most reliable leading indicator across the three countries is the interest rate term structure, although other variables also appear to be useful for specific countries. The volatilities of financial variables may also contain predictive information for production growth as well as production volatility. Non-linearities are uncovered for all financial series, especially in terms of autoregressive conditional heteroscedasticity effects. Strong evidence of mean non-linearity is also found for many financial series and this can be associated with business cycle asymmetries in the mean. This is the case for a number of American and British financial variables, especially interest rates, but the corresponding evidence for Germany is confined largely to the real long-term rate of interest. [source]
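
The leading-indicator analysis described above reduces to correlating a financial variable observed k periods earlier with current activity. The following Python sketch (synthetic data and hypothetical variable names, not the authors' code or data) shows how such lead-lag cross-correlations can be computed:

```python
import numpy as np

def lead_lag_correlations(indicator, activity, max_lead=12):
    """Cross-correlations between the indicator at time t-k and activity at time t.

    A large correlation at lead k suggests the indicator leads economic
    activity by k periods, in the informal sense used in the abstract."""
    corrs = {}
    for k in range(max_lead + 1):
        x = indicator[: len(indicator) - k] if k else indicator
        y = activity[k:]
        corrs[k] = np.corrcoef(x, y)[0, 1]
    return corrs

# Illustrative synthetic data: the "term spread" leads "growth" by 6 periods.
rng = np.random.default_rng(0)
spread = rng.normal(size=300)
growth = 0.6 * np.roll(spread, 6) + rng.normal(scale=0.5, size=300)
growth[:6] = rng.normal(scale=0.5, size=6)   # discard the wrapped-around values
print(lead_lag_correlations(spread, growth, max_lead=8))
```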


Statistical properties of population differentiation estimators under stepwise mutation in a finite island model

MOLECULAR ECOLOGY, Issue 4 2002
F. Balloux
Abstract Microsatellite loci mutate at an extremely high rate and are generally thought to evolve through a stepwise mutation model. Several differentiation statistics taking into account the particular mutation scheme of microsatellites have been proposed. The most commonly used is RST, which is independent of the mutation rate under a generalized stepwise mutation model. FST and RST are commonly reported in the literature, but often differ widely. Here we compare their statistical performances using individual-based simulations of a finite island model. The simulations were run under different levels of gene flow, mutation rates, population number and sizes. In addition to the per locus statistical properties, we compare two ways of combining estimates over loci. Our simulations show that even under a strict stepwise mutation model, no statistic is best overall. All estimators suffer to different extents from large bias and variance. While RST better reflects population differentiation in populations characterized by very low gene exchange, FST gives better estimates in cases of high levels of gene flow. The number of loci sampled (12, 24, or 96) has only a minor effect on the relative performance of the estimators under study. For all estimators there is a striking effect of the number of samples, with the differentiation estimates showing very odd distributions for two samples. [source]


Statistical properties of the Cooper pair operators

PHYSICA STATUS SOLIDI (B) BASIC SOLID STATE PHYSICS, Issue 9 2005
I. G. Kaplan
Abstract The Cooper pair has total spin S = 0. So, in accordance with the Pauli principle, the wave functions describing a system of Cooper pairs should have boson permutation symmetry, but the pairon operators (Cooper pair operators) do not obey the boson commutation relations. The pairon operators can be considered neither as Bose operators nor as Fermi operators. In this work, we analyze the statistical properties and the commutation relations of the pairon operators and reveal that they correspond to the modified parafermi statistics of rank p = 1. Two different expressions for the Cooper pair number operator are presented. We demonstrate that calculations with a Hamiltonian expressed via pairon operators are more conveniently performed using the commutation properties of these operators directly, without representing them as products of fermion operators. This allows one to study problems in which interactions between Cooper pairs are also included. The problem of two interacting Cooper pairs is discussed. (© 2005 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim) [source]
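
For readers wondering why pairon operators fall outside both Bose and Fermi statistics, the textbook commutation relation makes the point. The LaTeX block below is a sketch of that standard relation in conventional BCS notation (the notation is assumed here, not quoted from the paper):

```latex
% Pairon (Cooper-pair) operators built from fermion operators c_{k\sigma}:
%   b_k = c_{-k\downarrow} c_{k\uparrow}, \qquad
%   b_k^{\dagger} = c_{k\uparrow}^{\dagger} c_{-k\downarrow}^{\dagger}
% Their commutator deviates from the bosonic value by the occupation numbers:
\begin{align}
  [\,b_k,\, b_{k'}^{\dagger}\,] &= \delta_{kk'}
      \bigl(1 - n_{k\uparrow} - n_{-k\downarrow}\bigr), \\
  [\,b_k,\, b_{k'}\,] &= 0, \qquad (b_k^{\dagger})^2 = 0 ,
\end{align}
% so the operators satisfy a hard-core constraint that true bosons do not.
```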


Controlling jumps in correlated processes of Poisson counts

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 5 2009
Christian H. Weiß
Abstract Processes of autocorrelated Poisson counts can often be modelled by a Poisson INAR(1) model, which proved to apply well to typical tasks of SPC. Statistical properties of this model are briefly reviewed. Based on these properties, we propose a new control chart: the combined jumps chart. It monitors the counts and jumps of a Poisson INAR(1) process simultaneously. As the bivariate process of counts and jumps is a homogeneous Markov chain, average run lengths (ARLs) can be computed exactly with the well-known Markov chain approach. Based on an investigation of such ARLs, we derive design recommendations and show that a properly designed chart can be applied nearly universally. This is also demonstrated by a real-data example from the insurance field. Copyright © 2008 John Wiley & Sons, Ltd. [source]
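
As a minimal sketch of the underlying process (not the combined jumps chart itself, whose ARL computation via the Markov chain approach is omitted), the following Python simulates a Poisson INAR(1) series by binomial thinning and forms the count and jump series that the chart monitors jointly; the alarm limits shown are purely hypothetical:

```python
import numpy as np

def simulate_poisson_inar1(alpha, lam, n, rng):
    """Poisson INAR(1): X_t = alpha o X_{t-1} + eps_t, eps_t ~ Poisson(lam),
    where 'o' denotes binomial thinning.  The stationary marginal is Poisson(lam/(1-alpha))."""
    x = np.empty(n, dtype=int)
    x[0] = rng.poisson(lam / (1.0 - alpha))          # start in the stationary regime
    for t in range(1, n):
        survivors = rng.binomial(x[t - 1], alpha)    # thinning of the previous count
        x[t] = survivors + rng.poisson(lam)          # plus new Poisson arrivals
    return x

rng = np.random.default_rng(42)
counts = simulate_poisson_inar1(alpha=0.5, lam=2.0, n=1000, rng=rng)
jumps = np.diff(counts)                              # the 'jump' series monitored jointly
# Illustrative (hypothetical) alarm rule combining both statistics:
alarm = (counts[1:] > 10) | (np.abs(jumps) > 6)
print(counts.mean(), jumps.var(), alarm.mean())
```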


Statistical properties and performance of pairwise relatedness estimators using turbot (Scophthalmus maximus L.) family data

AQUACULTURE RESEARCH, Issue 4 2010
Ania Pino-Querido
Abstract The statistical properties and performance of four estimators of pairwise relatedness were evaluated in several scenarios using microsatellite genotype data from a set of large known full-sibships of turbot. All estimators showed a significant negative bias for the four kinships commonly used in these studies (unrelated: UR, half-sibs, full-sibs and parent–offspring) when allele frequencies of the reference population were estimated from the individuals analysed. When these frequencies were obtained from the base population from which all families proceeded, the bias was mostly corrected. The Wang (W) and Li (L) estimators were the least sensitive to this factor, while the Lynch and Ritland (L&R) estimator was the most sensitive. The error (mean around 0.130) was very similar in all scenarios for the W, L and Queller and Goodnight (QG) estimators, while L&R was the most error-prone estimator. Parent–offspring kinship resulted in the lowest error when using the W, L and QG estimators, while UR resulted in the lowest error with the L&R estimator. Globally, W was the best-performing estimator, although L&R could perform better in specific sampling scenarios. In summary, pairwise estimators represent useful tools for kinship classification in aquaculture broodstock management by applying appropriate thresholds depending on the goals of the analysis. [source]


Roughness Characterization through 3D Textured Image Analysis: Contribution to the Study of Road Wear Level

COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, Issue 2 2004
M. Khoudeir
The microtexture is defined as surface irregularities whose height ranges from 0.001 mm to 0.5 mm and whose width is less than 0.5 mm (Alvarez and Morel, 1994). The deterioration due to road traffic, especially the polishing effect, involves a change in the microtexture. We therefore suggest a method to characterize, through image analysis, the wear level or micro-roughness of road surfaces. We then propose, on the one hand, a photometric model for the road surface and, on the other hand, a geometrical model for the road surface profile. These two models allow us to develop roughness criteria based on the study of the statistical properties of: the distribution of the gray levels in the image, the distribution of the absolute value of its gradient, the form of its autocorrelation function, and the distribution of its curvature map. Experiments have been done with images of laboratory-made road specimens at different wear levels. The results obtained are similar to those obtained by a direct method using road profiles. [source]


Physical foundations, models, and methods of diffusion magnetic resonance imaging of the brain: A review

CONCEPTS IN MAGNETIC RESONANCE, Issue 5 2007
Ludovico Minati
Abstract The foundations and characteristics of models and methods used in diffusion magnetic resonance imaging, with particular reference to in vivo brain imaging, are reviewed. The first section introduces Fick's laws, propagators, and the relationship between tissue microstructure and the statistical properties of diffusion of water molecules. The second section introduces the diffusion-weighted signal in terms of diffusion of magnetization (Bloch–Torrey equation) and of spin-bearing particles (cumulant expansion). The third section is dedicated to the rank-2 tensor model, the b-matrix, and the derivation of indexes of anisotropy and shape. The fourth section introduces diffusion in multiple compartments: Gaussian mixture models, relationship between fiber layout, displacement probability and diffusivity, and effect of the b-value. The fifth section is devoted to higher-order generalizations of the tensor model: singular value decompositions (SVD), representation of angular diffusivity patterns and derivation of generalized anisotropy (GA) and scaled entropy (SE), and modeling of non-Gaussian diffusion by means of series expansion of Fick's laws. The sixth section covers spherical harmonic decomposition (SHD) and determination of fiber orientation by means of spherical deconvolution. The seventh section presents the Fourier relationship between signal and displacement probability (Q-space imaging, QSI, or diffusion-spectrum imaging, DSI), and reconstruction of orientation-distribution functions (ODF) by means of the Funk–Radon transform (Q-ball imaging, QBI). © 2007 Wiley Periodicals, Inc. Concepts Magn Reson Part A 30A: 278–307, 2007. [source]
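
As an illustration of the rank-2 tensor model mentioned in the third section, the Python sketch below fits the tensor by log-linear least squares from diffusion-weighted signals and computes fractional anisotropy. It is a simplified single-voxel sketch (scalar b-values and unit gradient directions assumed, full b-matrix treatment omitted), not code from the review:

```python
import numpy as np

def fit_diffusion_tensor(signals, s0, bvals, bvecs):
    """Rank-2 tensor fit: ln(S/S0) = -b g^T D g, solved by linear least squares.
    Returns the 3x3 symmetric tensor D."""
    g = np.asarray(bvecs, dtype=float)
    # Design matrix for the six unique elements Dxx, Dyy, Dzz, Dxy, Dxz, Dyz
    A = -np.asarray(bvals)[:, None] * np.column_stack([
        g[:, 0]**2, g[:, 1]**2, g[:, 2]**2,
        2 * g[:, 0] * g[:, 1], 2 * g[:, 0] * g[:, 2], 2 * g[:, 1] * g[:, 2]])
    y = np.log(np.asarray(signals) / s0)
    d, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.array([[d[0], d[3], d[4]],
                     [d[3], d[1], d[5]],
                     [d[4], d[5], d[2]]])

def fractional_anisotropy(D):
    lam = np.linalg.eigvalsh(D)
    md = lam.mean()
    return np.sqrt(1.5 * np.sum((lam - md)**2) / np.sum(lam**2))

# Tiny illustration: signals generated from a hypothetical anisotropic tensor.
D_true = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
bvecs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                  [0.707, 0.707, 0], [0.707, 0, 0.707], [0, 0.707, 0.707]])
bvals = np.full(6, 1000.0)
signals = 100.0 * np.exp(-bvals * np.einsum('ij,jk,ik->i', bvecs, D_true, bvecs))
print(fractional_anisotropy(fit_diffusion_tensor(signals, 100.0, bvals, bvecs)))
```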


Analysis of historical landslide time series in the Emilia-Romagna region, northern Italy

EARTH SURFACE PROCESSES AND LANDFORMS, Issue 10 2010
Mauro Rossi
Abstract A catalogue of historical landslides, 1951–2002, for three provinces in the Emilia-Romagna region of northern Italy is presented and its statistical properties studied. The catalogue consists of 2255 reported landslides and is based on historical archives and chronicles. We use two measures for the intensity of landsliding over time: (i) the number of reported landslides in a day (DL) and (ii) the number of reported landslides in an event (Sevent), where an event is one or more consecutive days with landsliding. From 1951 to 2002 in our study area there were 1057 days with 1 ≤ DL ≤ 45 landslides per day, and 596 events with 1 ≤ Sevent ≤ 129 landslides per event. In the first set of analyses, we find that the probability densities of landslide intensities in the time series are power-law distributed over at least two orders of magnitude, with an exponent of about −2·0. Although our data are a proxy for landsliding built from newspaper reports, this is the first tentative evidence that the frequency–size distribution of triggered landslide events over time (not just the landslides in a given triggered event), like that of earthquakes, scales as a power law or other heavy-tailed distribution. If confirmed, this could have important implications for risk assessment and erosion modelling in a given area. In our second set of analyses, we find that for short antecedent rainfall periods, the minimum amount of rainfall necessary to trigger landslides varies considerably with the intensity of the landsliding (DL and Sevent); whereas for long antecedent periods the magnitude is largely independent of the cumulative amount of rainfall, and the largest values of landslide intensity are always preceded by abundant rainfall. Further, the analysis of the rainfall trend suggests that the triggering of landslides in the study area is related to seasonal rainfall. Copyright © 2010 John Wiley & Sons, Ltd. [source]
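
One standard way to estimate a power-law exponent such as the quoted −2·0 is the maximum-likelihood (Hill-type) estimator above a threshold. The Python below is an illustrative sketch on synthetic data, not the authors' procedure:

```python
import numpy as np

def powerlaw_exponent_mle(x, xmin):
    """Maximum-likelihood exponent for p(x) ~ x^(-alpha), x >= xmin (continuous form):
    alpha_hat = 1 + n / sum(ln(x_i / xmin)).  Returns (alpha_hat, approximate std. error)."""
    x = np.asarray(x, dtype=float)
    x = x[x >= xmin]
    n = x.size
    alpha = 1.0 + n / np.sum(np.log(x / xmin))
    return alpha, (alpha - 1.0) / np.sqrt(n)

# Illustrative use on synthetic Pareto-distributed "landslides per event":
rng = np.random.default_rng(1)
s_event = rng.pareto(1.0, size=600) + 1.0     # density exponent 2.0 for this choice
print(powerlaw_exponent_mle(s_event, xmin=1.0))
```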


Landslide inventories and their statistical properties

EARTH SURFACE PROCESSES AND LANDFORMS, Issue 6 2004
Bruce D. Malamud
Abstract Landslides are generally associated with a trigger, such as an earthquake, a rapid snowmelt or a large storm. The landslide event can include a single landslide or many thousands. The frequency–area (or volume) distribution of a landslide event quantifies the number of landslides that occur at different sizes. We examine three well-documented landslide events, from Italy, Guatemala and the USA, each with a different triggering mechanism, and find that the landslide areas for all three are well approximated by the same three-parameter inverse-gamma distribution. For small landslide areas this distribution has an exponential 'roll-over' and for medium and large landslide areas decays as a power-law with exponent -2·40. One implication of this landslide distribution is that the mean area of landslides in the distribution is independent of the size of the event. We also introduce a landslide-event magnitude scale mL = log(NLT), with NLT the total number of landslides associated with a trigger. If a landslide-event inventory is incomplete (i.e. smaller landslides are not included), the partial inventory can be compared with our landslide probability distribution, and the corresponding landslide-event magnitude inferred. This technique can be applied to inventories of historical landslides, inferring the total number of landslides that occurred over geologic time, and how many of these have been erased by erosion, vegetation, and human activity. We have also considered three rockfall-dominated inventories, and find that the frequency–size distributions differ substantially from those associated with other landslide types. We suggest that our proposed frequency–size distribution for landslides (excluding rockfalls) will be useful in quantifying the severity of landslide events and the contribution of landslides to erosion. Copyright © 2004 John Wiley & Sons, Ltd. [source]
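
For reference, the three-parameter inverse-gamma density and the magnitude scale can be written as follows; the parameter names are generic, the base-10 logarithm in mL is an assumption, and no fitted parameter values beyond the quoted tail exponent are implied:

```latex
% Three-parameter inverse-gamma probability density for landslide area A_L
% (rho controls the power-law tail, a the position of the roll-over, s a small offset):
\begin{equation}
  p(A_L;\rho,a,s) \;=\; \frac{1}{a\,\Gamma(\rho)}
      \left(\frac{a}{A_L - s}\right)^{\rho+1}
      \exp\!\left(-\frac{a}{A_L - s}\right), \qquad A_L > s,
\end{equation}
% which decays as A_L^{-(\rho+1)} for large areas, so the quoted tail exponent of
% -2.40 corresponds to rho of about 1.40.  The landslide-event magnitude scale is
\begin{equation}
  m_L \;=\; \log_{10} N_{LT},
\end{equation}
% with N_{LT} the total number of landslides associated with the trigger.
```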


Semi-empirical model for site effects on acceleration time histories at soft-soil sites. Part 2: calibration

EARTHQUAKE ENGINEERING AND STRUCTURAL DYNAMICS, Issue 13 2004
Abstract A previously developed simplified model of ground motion amplification is applied to the simulation of acceleration time histories at several soft-soil sites in the Valley of Mexico, on the basis of the corresponding records on firm ground. The main objective is to assess the ability of the model to reproduce characteristics such as effective duration, frequency content and instantaneous intensity. The model is based on the identification of a number of parameters that characterize the complex firm-ground to soft-soil transfer function, and on the adjustment of these parameters in order to account for non-linear soil behavior. Once the adjusted model parameters are introduced, the statistical properties of the simulated and the recorded ground motions agree reasonably well. For the sites and for the seismic events considered in this study, it is concluded that non-linear soil behavior may have a significant effect on the amplification of ground motion. The non-linear soil behavior significantly affects the effective ground motion duration for the components with the higher intensities, but it does not have any noticeable influence on the lengthening of the dominant ground period. Copyright © 2004 John Wiley & Sons, Ltd. [source]


Representing genetic variation as continuous surfaces: an approach for identifying spatial dependency in landscape genetic studies

ECOGRAPHY, Issue 6 2008
Melanie A. Murphy
Landscape genetics, an emerging field integrating landscape ecology and population genetics, has great potential to influence our understanding of habitat connectivity and the distribution of organisms. Whereas typical population genetics studies summarize gene flow as pairwise measures between sampling localities, landscape characteristics that influence population genetic connectivity are often continuously distributed in space. Thus, there are currently gaps both in the ability to analyze genotypic data in a continuous spatial context and in our knowledge of expected landscape genetic structure under varying conditions. We present a framework for generating continuous "genetic surfaces", evaluate their statistical properties, and quantify statistical behavior of landscape genetic structure in a simple landscape. We simulated microsatellite genotypes under varying parameters (time since vicariance, migration, effective population size) and used ancestry (q) values from STRUCTURE to interpolate a genetic surface. Using a spatially adjusted Pearson's correlation coefficient to test the significance of landscape variable(s) on genetic structure, we were able to detect landscape genetic structure on a contemporary time scale (,5 generations post vicariance, migration probability ,0.10) even when population differentiation was minimal (FST,0.00015). We show that genetic variation can be significantly correlated with geographic distance even when genetic structure is due to landscape variable(s), demonstrating the importance of testing landscape influence on genetic structure. Finally, we apply genetic surfacing to analyze an empirical dataset of black bears from northern Idaho, USA. We find black bear genetic variation is a function of distance (autocorrelation) and habitat patch (spatial dependency), consistent with previous results indicating genetic variation was influenced by landscape resistance. These results suggest genetic surfaces can be used to test competing hypotheses of the influence of landscape characteristics on genetic structure without delineation of categorical groups. [source]


The quest for a null model for macroecological patterns: geometry of species distributions at multiple spatial scales

ECOLOGY LETTERS, Issue 8 2008
David Storch
Abstract There have been several attempts to build a unified framework for macroecological patterns. However, these have mostly been based either on questionable assumptions or have had to be parameterized to obtain realistic predictions. Here, we propose a new model explicitly considering patterns of aggregated species distributions on multiple spatial scales, the property which lies behind all spatial macroecological patterns, using the idea we term 'generalized fractals'. Species' spatial distributions were modelled by a random hierarchical process in which the original 'habitat' patches were randomly replaced by sets of smaller patches nested within them, and the statistical properties of modelled species assemblages were compared with macroecological patterns in observed bird data. Without parameterization based on observed patterns, this simple model predicts realistic patterns of species abundance, distribution and diversity, including fractal-like spatial distributions, the frequency distribution of species occupancies/abundances and the species–area relationship. Although observed macroecological patterns may differ in some quantitative properties, our concept of random hierarchical aggregation can be considered as an appropriate null model of fundamental macroecological patterns which can potentially be modified to accommodate ecologically important variables. [source]


Sample Splitting and Threshold Estimation

ECONOMETRICA, Issue 3 2000
Bruce E. Hansen
Threshold models have a wide variety of applications in economics. Direct applications include models of separating and multiple equilibria. Other applications include empirical sample splitting when the sample split is based on a continuously-distributed variable such as firm size. In addition, threshold models may be used as a parsimonious strategy for nonparametric function estimation. For example, the threshold autoregressive model (TAR) is popular in the nonlinear time series literature. Threshold models also emerge as special cases of more complex statistical frameworks, such as mixture models, switching models, Markov switching models, and smooth transition threshold models. It may be important to understand the statistical properties of threshold models as a preliminary step in the development of statistical tools to handle these more complicated structures. Despite the large number of potential applications, the statistical theory of threshold estimation is undeveloped. It is known that threshold estimates are super-consistent, but a distribution theory useful for testing and inference has yet to be provided. This paper develops a statistical theory for threshold estimation in the regression context. We allow for either cross-section or time series observations. Least squares estimation of the regression parameters is considered. An asymptotic distribution theory for the regression estimates (the threshold and the regression slopes) is developed. It is found that the distribution of the threshold estimate is nonstandard. A method to construct asymptotic confidence intervals is developed by inverting the likelihood ratio statistic. It is shown that this yields asymptotically conservative confidence regions. Monte Carlo simulations are presented to assess the accuracy of the asymptotic approximations. The empirical relevance of the theory is illustrated through an application to the multiple equilibria growth model of Durlauf and Johnson (1995). [source]
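
The least-squares threshold estimator discussed here can be computed by profiling the sum of squared residuals over candidate thresholds. The Python below is a minimal sketch of that grid search on simulated data (the likelihood-ratio confidence intervals developed in the paper are not reproduced):

```python
import numpy as np

def estimate_threshold(y, x, q, trim=0.15):
    """Least-squares threshold estimate for y = x'b1 + e if q <= gamma, x'b2 + e if q > gamma.
    Grid search over the middle quantiles of the threshold variable q."""
    y, x, q = np.asarray(y), np.asarray(x), np.asarray(q)
    candidates = np.quantile(q, np.linspace(trim, 1 - trim, 200))
    best_gamma, best_ssr = None, np.inf
    for gamma in candidates:
        ssr = 0.0
        for mask in (q <= gamma, q > gamma):
            Xr, yr = x[mask], y[mask]
            beta, *_ = np.linalg.lstsq(Xr, yr, rcond=None)
            resid = yr - Xr @ beta
            ssr += resid @ resid
        if ssr < best_ssr:
            best_gamma, best_ssr = gamma, ssr
    return best_gamma, best_ssr

# Illustrative data with a true threshold at q = 0.5:
rng = np.random.default_rng(3)
n = 500
q = rng.uniform(0, 1, n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = np.where(q <= 0.5, X @ [1.0, 0.5], X @ [2.0, -0.5]) + rng.normal(scale=0.3, size=n)
print(estimate_threshold(y, X, q))
```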


Applications and statistical properties of minimum significant difference-based criterion testing in a toxicity testing program

ENVIRONMENTAL TOXICOLOGY & CHEMISTRY, Issue 1 2000
Qin Wang
Abstract As a follow-up to the recommendations of the September 1995 SETAC Pellston Workshop on Whole Effluent Toxicity (WET) regarding test methods and appropriate endpoints, this paper discusses the applications and statistical properties of using a statistical criterion of minimum significant difference (MSD). We examined the upper limits of acceptable MSDs as an acceptance criterion in the case of normally distributed data. The implications of this approach are examined in terms of the false negative rate as well as the false positive rate. Results indicated that the proposed approach has reasonable statistical properties. Reproductive data from short-term chronic WET tests with Ceriodaphnia dubia were used to demonstrate the applications of the proposed approach. The data were collected by the North Carolina Department of Environment, Health, and Natural Resources (Raleigh, NC, USA) as part of their National Pollutant Discharge Elimination System program. [source]
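
A minimal sketch of how an MSD can be computed for a control-versus-treatment comparison is given below, assuming an equal-variance two-sample t-test; the formula choice and the reproduction counts are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from scipy import stats

def minimum_significant_difference(control, treatment, alpha=0.05):
    """MSD for a one-sided two-sample t-test: the smallest difference in means
    that would be declared significant given the observed pooled variance."""
    n1, n2 = len(control), len(treatment)
    s2 = ((n1 - 1) * np.var(control, ddof=1) +
          (n2 - 1) * np.var(treatment, ddof=1)) / (n1 + n2 - 2)
    t_crit = stats.t.ppf(1 - alpha, df=n1 + n2 - 2)
    return t_crit * np.sqrt(s2 * (1.0 / n1 + 1.0 / n2))

# Hypothetical Ceriodaphnia-style reproduction counts for illustration only:
control = np.array([28, 30, 25, 27, 31, 29, 26, 30, 28, 27])
effluent = np.array([24, 26, 22, 25, 23, 27, 24, 22, 25, 23])
msd = minimum_significant_difference(control, effluent)
print(f"MSD = {msd:.2f}; observed difference = {control.mean() - effluent.mean():.2f}")
```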


Hourly surface wind monitor consistency checking over an extended observation period

ENVIRONMETRICS, Issue 4 2009
Scott Beaver
Abstract A consistency checking methodology is presented to aid in identifying biased values in extended historical records of hourly surface wind measurements obtained from a single station. The method is intended for screening extended observation periods for values which do not fail physical consistency checks (i.e., standard or complex quality assurance methods), yet nonetheless exhibit statistical properties which differ from the bulk of the record. Several specific types of inconsistencies common in surface wind monitoring datasets are considered: annual biases, unexpected values, and discontinuities. The purely empirical method checks for self-consistency in the temporal distribution of the wind measurements by explicitly modeling the diurnal variability. Each year of data is modeled using principal component analysis (PCA) (or empirical orthogonal functions, EOF), then hierarchical clustering with nearest neighbor linkage is used to visualize any annual biases existing in the measurements. The diurnal distributions for wind speed and direction are additionally estimated and visualized to determine any periods of time which are inconsistent with the typical diurnal cycle for a given monitor. The robust consistency checking method is applied to a set of 44 monitors operating in the San Joaquin Valley (SJV) of Central California over a 9-year period. Monitors from the SLAMS, CIMIS, and RAWS networks are considered. Similar inconsistencies are detected in all three networks; however, network-specific types of inconsistencies are found as well. Copyright © 2008 John Wiley & Sons, Ltd. [source]
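
A minimal sketch of the screening idea (not the authors' implementation): summarize each year's mean diurnal wind-speed profile with a few principal components and apply hierarchical clustering with nearest-neighbour (single) linkage to expose an annually biased year. The data shapes and the injected bias are hypothetical:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def annual_diurnal_features(speeds_by_year, n_components=3):
    """speeds_by_year: dict year -> (days, 24) array of hourly wind speeds.
    Returns per-year feature vectors: leading PCs of the mean diurnal profile."""
    profiles = np.vstack([v.mean(axis=0) for v in speeds_by_year.values()])
    profiles -= profiles.mean(axis=0)
    _, _, vt = np.linalg.svd(profiles, full_matrices=False)   # PCA via SVD
    return profiles @ vt[:n_components].T

rng = np.random.default_rng(7)
years = {y: 2.0 + np.sin(np.linspace(0, 2 * np.pi, 24)) + rng.normal(0, 0.2, (365, 24))
         for y in range(2000, 2009)}
years[2004] += 1.5                                  # inject an annual bias to be detected
feats = annual_diurnal_features(years)
z = linkage(feats, method="single")                 # nearest-neighbour linkage
print(fcluster(z, t=2, criterion="maxclust"))       # the biased year typically separates out
```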


POWER AND POTENTIAL BIAS IN FIELD STUDIES OF NATURAL SELECTION

EVOLUTION, Issue 3 2004
Erika I. Hersch
Abstract The advent of multiple regression analyses of natural selection has facilitated estimates of both the direct and indirect effects of selection on many traits in numerous organisms. However, low power in selection studies has possibly led to a bias in our assessment of the levels of selection shaping natural populations. Using calculations and simulations based on the statistical properties of selection coefficients, we find that power to detect total selection (the selection differential) depends on sample size and the strength of selection relative to the opportunity of selection. The power of detecting direct selection (selection gradients) is more complicated and depends on the relationship between the correlation of each trait and fitness and the pattern of correlation among traits. In a review of 298 previously published selection differentials, we find that most studies have had insufficient power to detect reported levels of selection acting on traits and that, in general, the power of detecting weak levels of selection is low given current study designs. We also find that potential publication bias could explain the trend that reported levels of direct selection tend to decrease as study sizes increase, suggesting that current views of the strength of selection may be inaccurate and biased upward. We suggest that studies should be designed so that selection is analyzed on at least several hundred individuals, the total opportunity of selection be considered along with the pattern of selection on individual traits, and nonsignificant results be actively reported combined with an estimate of power. [source]


Sequential methods and group sequential designs for comparative clinical trials

FUNDAMENTAL & CLINICAL PHARMACOLOGY, Issue 5 2003
Véronique Sébille
Abstract Comparative clinical trials are performed to assess whether a new treatment has superior efficacy to a placebo or a standard treatment (one-sided formulation) or whether two active treatments have different efficacies (two-sided formulation) in a given population. The reference approach is the single-stage design, in which the statistical test is performed after inclusion and evaluation of a predetermined sample size. In practice, the single-stage design is sometimes difficult to implement because of ethical concerns and/or economic reasons. Thus, specific early termination procedures have been developed to allow repeated statistical analyses to be performed on accumulating data and to stop the trial as soon as the information is sufficient to conclude. Two main approaches can be used. The first is derived from strictly sequential methods and includes the sequential probability ratio test and the triangular test. The second is derived from group sequential designs and includes the Peto, Pocock, and O'Brien and Fleming methods, α and β spending functions, and one-parameter boundaries. We review all these methods and describe the bases on which they rely as well as their statistical properties. We also compare these methods and comment on their advantages and drawbacks. We present software packages which are available for the planning, monitoring and analysis of comparative clinical trials with these methods and discuss the practical problems encountered when using them. The latest versions of all these methods can offer substantial sample size reductions when compared with the single-stage design, not only in the case of clear efficacy but also in the case of complete lack of efficacy of the new treatment. The software packages make their use quite simple. However, it has to be stressed that using these methods requires efficient logistics with real-time data monitoring and, apart from survival studies or long-term clinical trials with censored endpoints, is most appropriate when the endpoint is obtained quickly relative to the recruitment rate. [source]
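
The motivation for adjusted group sequential boundaries can be seen with a small Monte Carlo check of the overall type I error under repeated looks. The sketch below uses a two-look design with a constant (Pocock-style) boundary; the 2.178 value is, to my understanding, the standard two-look two-sided 5% Pocock constant, and the whole example is illustrative rather than any of the reviewed software:

```python
import numpy as np

def simulate_type1_error(boundary, looks=(50, 100), n_sim=20000, seed=0):
    """Probability of crossing |Z| >= boundary at either the interim or the final look,
    under the null of no treatment difference (z-tests on cumulative data)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        diff = rng.normal(0.0, 1.0, looks[-1])      # per-subject difference, mean 0 under H0
        rejected = False
        for n in looks:
            z = diff[:n].mean() / (diff[:n].std(ddof=1) / np.sqrt(n))
            if abs(z) >= boundary:
                rejected = True
                break
        rejections += rejected
    return rejections / n_sim

print("naive 1.96 boundary:", simulate_type1_error(1.96))    # inflated above 0.05
print("Pocock-style 2.178:", simulate_type1_error(2.178))    # close to 0.05 for two looks
```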


A propensity score approach to correction for bias due to population stratification using genetic and non-genetic factors

GENETIC EPIDEMIOLOGY, Issue 8 2009
Huaqing Zhao
Abstract Confounding due to population stratification (PS) arises when differences in both allele and disease frequencies exist in a population of mixed racial/ethnic subpopulations. Genomic control, structured association, principal components analysis (PCA), and multidimensional scaling (MDS) approaches have been proposed to address this bias using genetic markers. However, confounding due to PS can also be due to non-genetic factors. Propensity scores are widely used to address confounding in observational studies but have not been adapted to deal with PS in genetic association studies. We propose a genomic propensity score (GPS) approach to correct for bias due to PS that considers both genetic and non-genetic factors. We compare the GPS method with PCA and MDS using simulation studies. Our results show that GPS can adequately adjust and consistently correct for bias due to PS. Under no/mild, moderate, and severe PS, GPS yielded estimates with bias close to 0 (mean = −0.0044, standard error = 0.0087). Under moderate or severe PS, the GPS method consistently outperforms the PCA method in terms of bias, coverage probability (CP), and type I error. Under moderate PS, the GPS method consistently outperforms the MDS method in terms of CP. PCA maintains relatively high power compared to both the MDS and GPS methods under the simulated situations. GPS and MDS are comparable in terms of statistical properties such as bias, type I error, and power. The GPS method provides a novel and robust tool for obtaining less-biased estimates of genetic associations that can consider both genetic and non-genetic factors. Genet. Epidemiol. 33:679–690, 2009. © 2009 Wiley-Liss, Inc. [source]


Examining the statistical properties of fine-scale mapping in large-scale association studies

GENETIC EPIDEMIOLOGY, Issue 3 2008
Steven Wiltshire
Abstract Interpretation of dense single nucleotide polymorphism (SNP) follow-up of genome-wide association or linkage scan signals can be facilitated by establishing expectation for the behaviour of primary mapping signals upon fine-mapping, under both null and alternative hypotheses. We examined the inferences that can be made regarding the posterior probability of a real genetic effect and considered different disease-mapping strategies and prior probabilities of association. We investigated the impact of the extent of linkage disequilibrium between the disease SNP and the primary analysis signal and the extent to which the disease gene can be physically localised under these scenarios. We found that large increases in significance (>2 orders of magnitude) appear in the exclusive domain of genuine genetic effects, especially in the follow-up of genome-wide association scans or consensus regions from multiple linkage scans. Fine-mapping significant association signals that reside directly under linkage peaks yields little improvement in an already high posterior probability of a real effect. Following fine-mapping, those signals that increase in significance also demonstrate improved localisation. We found local linkage disequilibrium patterns around the primary analysis signal(s) and tagging efficacy of typed markers to play an important role in determining a suitable interval for fine-mapping. Our findings help inform the interpretation and design of dense SNP-mapping follow-up studies, thus facilitating discrimination between a genuine genetic effect and chance fluctuation (false positive). Genet. Epidemiol. 2007. © 2007 Wiley-Liss, Inc. [source]


Gene-dropping vs. empirical variance estimation for allele-sharing linkage statistics

GENETIC EPIDEMIOLOGY, Issue 8 2006
Jeesun Jung
Abstract In this study, we compare the statistical properties of a number of methods for estimating P-values for allele-sharing statistics in non-parametric linkage analysis. Some of the methods are based on the normality assumption, using different variance estimation methods, and others use simulation (gene-dropping) to find empirical distributions of the test statistics. For variance estimation methods, we consider the perfect variance approximation and two empirical variance estimates. The simulation-based methods are gene-dropping with and without conditioning on the observed founder alleles. We also consider the Kong and Cox linear and exponential models and a Monte Carlo method modified from a method for finding genome-wide significance levels. We discuss the analytical properties of these various P-value estimation methods and then present simulation results comparing them. Assuming that the sample sizes are large enough to justify a normality assumption for the linkage statistic, the best P-value estimation method depends to some extent on the (unknown) genetic model and on the types of pedigrees in the sample. If the sample sizes are not large enough to justify a normality assumption, then gene-dropping is the best choice. We discuss the differences between conditional and unconditional gene-dropping. Genet. Epidemiol. 2006. © 2006 Wiley-Liss, Inc. [source]


Properties of the transmission-disequilibrium test in the presence of inbreeding

GENETIC EPIDEMIOLOGY, Issue 2 2002
Emmanuelle Génin
Abstract Family-based association tests such as the transmission-disequilibrium test (TDT), which compare alleles transmitted and non-transmitted from parents to affected offspring, are widely used to detect the role of genetic risk factors in diseases. These methods have the advantage of being robust to population stratification and are thus believed to be valid whatever the population context. In different studies of the statistical properties of the TDT, parents of affected offspring are typically assumed to be neither inbred nor related. In many human populations, however, this assumption is false and parental alleles are then no longer independent. It is thus of interest to determine whether the TDT is a valid test of linkage and association in the presence of inbreeding. We present a method to derive the expected value of the TDT statistic under different disease models and for any relationship between the parents of affected offspring. Using this method, we show that in the presence of inbreeding, the TDT is still a valid test for linkage but not for association. The power of the test to detect linkage may, however, be increased in the presence of inbreeding under different modes of inheritance. Genet. Epidemiol. 22:116–127, 2002. © 2002 Wiley-Liss, Inc. [source]
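
For reference, the basic TDT statistic discussed here is the McNemar-type chi-square built from transmitted and non-transmitted allele counts in heterozygous parents; the sketch below uses hypothetical counts:

```python
from scipy import stats

def tdt(b, c):
    """Transmission-disequilibrium test.
    b: heterozygous parents transmitting allele A to the affected child,
    c: heterozygous parents transmitting the other allele.
    Returns the McNemar-type chi-square statistic and its p-value (1 df)."""
    chi2 = (b - c) ** 2 / (b + c)
    return chi2, stats.chi2.sf(chi2, df=1)

# Hypothetical counts from 200 informative (heterozygous) parents:
print(tdt(b=124, c=76))
```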


A covariance-adaptive approach for regularized inversion in linear models

GEOPHYSICAL JOURNAL INTERNATIONAL, Issue 2 2007
Christopher Kotsakis
SUMMARY The optimal inversion of a linear model under the presence of additive random noise in the input data is a typical problem in many geodetic and geophysical applications. Various methods have been developed and applied for the solution of this problem, ranging from the classic principle of least-squares (LS) estimation to other more complex inversion techniques such as Tikhonov–Phillips regularization, truncated singular value decomposition, generalized ridge regression, numerical iterative methods (Landweber, conjugate gradient) and others. In this paper, a new type of optimal parameter estimator for the inversion of a linear model is presented. The proposed methodology is based on a linear transformation of the classic LS estimator and it satisfies two basic criteria. First, it provides a solution for the model parameters that is optimally fitted (in an average quadratic sense) to the classic LS parameter solution. Second, it complies with an external user-dependent constraint that specifies a priori the error covariance (CV) matrix of the estimated model parameters. The formulation of this constrained estimator offers a unified framework for the description of many regularization techniques that are systematically used in geodetic inverse problems, particularly those methods that correspond to an eigenvalue filtering of the ill-conditioned normal matrix in the underlying linear model. The merit of our study lies in the fact that it adds an alternative perspective on the statistical properties and the regularization mechanism of many inversion techniques commonly used in geodesy and geophysics, by interpreting them as a family of 'CV-adaptive' parameter estimators that obey a common optimal criterion and differ only in the pre-selected form of their error CV matrix under a fixed model design. [source]
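
Eigenvalue filtering of the normal matrix, the family of estimators the paper reinterprets, is easiest to see in the Tikhonov/ridge case. The Python below contrasts the classic LS solution with a Tikhonov-regularized one on an ill-conditioned toy problem; it is a generic sketch, not the authors' CV-adaptive estimator:

```python
import numpy as np

def least_squares(A, y):
    """Classic LS solution of the normal equations (A^T A) x = A^T y."""
    return np.linalg.solve(A.T @ A, A.T @ y)

def tikhonov(A, y, mu):
    """Tikhonov/ridge solution (A^T A + mu I)^(-1) A^T y: eigenvalue filtering of the
    normal matrix that trades a small bias for a smaller parameter error covariance."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ y)

# Ill-conditioned toy problem: nearly collinear columns plus noise.
rng = np.random.default_rng(5)
t = np.linspace(0, 1, 50)
A = np.column_stack([t, t + 1e-3 * rng.normal(size=50), np.ones(50)])
x_true = np.array([2.0, -1.0, 0.5])
y = A @ x_true + 0.01 * rng.normal(size=50)
print("LS:      ", least_squares(A, y))
print("Tikhonov:", tikhonov(A, y, mu=1e-3))
```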


Impact of time-scale of the calibration objective function on the performance of watershed models

HYDROLOGICAL PROCESSES, Issue 25 2007
K. P. Sudheer
Abstract Many continuous watershed models perform all their computations on a daily time step, yet they are often calibrated at an annual or monthly time-scale that may not guarantee good simulation performance on a daily time step. The major objective of this paper is to evaluate the impact of the calibration time-scale on model predictive ability. This study used the Soil and Water Assessment Tool for the analyses, and the model was calibrated at two time-scales, viz. monthly and daily, for the War Eagle Creek watershed in the USA. The results demonstrate that the model's performance at the smaller time-scale (such as daily) cannot be ensured by calibrating it at a larger time-scale (such as monthly). It is observed that, even though the calibrated model possesses satisfactory 'goodness of fit' statistics, the simulation residuals failed to confirm the assumption of their homoscedasticity and independence. The results imply that evaluation of models should be conducted considering their behavior in various aspects of simulation, such as predictive uncertainty, hydrograph characteristics, ability to preserve statistical properties of the historic flow series, etc. The study highlights the scope for improving/developing effective autocalibration procedures at the daily time step for watershed models. Copyright © 2007 John Wiley & Sons, Ltd. [source]


Topographic parameterization in continental hydrology: a study in scale

HYDROLOGICAL PROCESSES, Issue 18 2003
Robert N. Armstrong
Abstract Digital elevation models (DEMs) are useful and popular tools from which topographic parameters can be quickly and efficiently extracted for various hydrologic applications. DEMs coupled with automated methods for extracting topographic information provide a powerful means of parameterizing hydrologic models over a wide range of scales. However, choosing appropriate DEM scales for particular hydrologic modelling applications is limited by a lack of understanding of the effects of scale and grid resolution on land-surface representation. The scale effects of aggregation on square-grid DEMs of two continental-scale basins are examined. Base DEMs of the Mackenzie and Missouri River basins are extracted from the HYDRO1k DEM of North America. Successively coarser grids of 2, 4, 8, …, 64 km were generated from the 'base' DEMs using simple linear averaging. TOPAZ (Topographic Parameterization) was applied to the base and aggregated DEMs using constant critical source area and minimum source channel length values to extract topographic variables at varying scales or resolutions. The effects of changing DEM resolution are examined by considering changes in the spatial distribution and statistical properties of selected topographic variables of hydrological importance. The effects of increasing grid size on basin and drainage network delineation, and on derived topographic variables, tend to be non-linear. In particular, changes in overall basin extent and drainage network configuration make it impractical to apply a simple scaling function to estimate variable values for fine-resolution DEMs from those derived from coarse-resolution DEMs. Results also suggest the resolution to which a DEM can be reduced by aggregation and still provide useful topographic information for continental-scale hydrologic modelling is that at which the mean hydraulic slope falls to approximately 1%. In this study, that generally occurred at a resolution of about 10 km. Copyright © 2003 John Wiley & Sons, Ltd. [source]
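
A minimal sketch of the aggregation step and the mean-slope diagnostic is given below; the terrain is synthetic (not HYDRO1k data) and slope is defined from simple finite differences, both assumptions made only for illustration:

```python
import numpy as np

def aggregate(dem, factor):
    """Block-average a square-grid DEM by an integer factor (simple linear averaging)."""
    n = (dem.shape[0] // factor) * factor
    d = dem[:n, :n]
    return d.reshape(n // factor, factor, n // factor, factor).mean(axis=(1, 3))

def mean_slope(dem, cell_size):
    """Mean slope magnitude (rise over run) from finite differences."""
    gy, gx = np.gradient(dem, cell_size)
    return np.mean(np.hypot(gx, gy))

# Synthetic rough terrain on a hypothetical 1 km base grid:
rng = np.random.default_rng(11)
base = np.cumsum(np.cumsum(rng.normal(size=(512, 512)), axis=0), axis=1) * 0.5
for f in (1, 2, 4, 8, 16, 32, 64):
    dem = aggregate(base, f)
    print(f"{f:>2} km grid: mean slope = {100 * mean_slope(dem, 1000.0 * f):.3f} %")
```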


A data-driven algorithm for constructing artificial neural network rainfall-runoff models

HYDROLOGICAL PROCESSES, Issue 6 2002
K. P. Sudheer
Abstract A new approach for designing the network structure in an artificial neural network (ANN)-based rainfall-runoff model is presented. The method utilizes statistical properties such as the cross-, auto- and partial-auto-correlation of the data series to identify a unique input vector that best represents the process for the basin, together with a standard algorithm for training. The methodology has been validated using data for a river basin in India. The results of the study are highly promising and indicate that the approach could significantly reduce the effort and computational time required in developing an ANN model. Copyright © 2002 John Wiley & Sons, Ltd. [source]
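
A sketch of the input-identification step is shown below: cross-correlations between rainfall and runoff (and autocorrelations of runoff) at increasing lags, with lags exceeding an approximate significance band retained as candidate ANN inputs. The series and the 2/sqrt(n) band are illustrative assumptions; ANN training itself is omitted:

```python
import numpy as np

def lagged_correlation(x, y, max_lag):
    """corr(x[t-k], y[t]) for k = 0..max_lag; used to pick which lagged inputs
    (rainfall and past runoff) carry information about current runoff."""
    out = []
    for k in range(max_lag + 1):
        xs = x[: len(x) - k] if k else x
        out.append(np.corrcoef(xs, y[k:])[0, 1])
    return np.array(out)

# Hypothetical daily series: runoff responds to rainfall with a lag of about 2 days.
rng = np.random.default_rng(2)
rain = rng.gamma(2.0, 2.0, 1000)
runoff = 0.5 * np.roll(rain, 2) + 0.3 * np.roll(rain, 3) + rng.normal(0, 0.5, 1000)
cc = lagged_correlation(rain, runoff, max_lag=6)
ac = lagged_correlation(runoff, runoff, max_lag=6)
selected_lags = [k for k, r in enumerate(cc) if abs(r) > 2 / np.sqrt(len(rain))]
print("rain->runoff cross-correlations:", np.round(cc, 2), "selected lags:", selected_lags)
```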


Reproduction of temporal scaling by a rectangular pulses rainfall model

HYDROLOGICAL PROCESSES, Issue 3 2002
Jonas Olsson
Abstract The presence of scaling statistical properties in temporal rainfall has been well established in many empirical investigations during the last decade. These properties have more and more come to be regarded as a fundamental feature of the rainfall process. How best to use the scaling properties for applied modelling remains to be assessed, however, particularly in the case of continuous rainfall time-series. One is therefore forced to use conventional time-series modelling, e.g. based on point process theory, which does not explicitly take scaling into account. In light of this, there is a need to investigate the degree to which point-process models are able to 'unintentionally' reproduce the empirical scaling properties. In the present study, four 25-year series of 20-min rainfall intensities observed in the Arno River basin, Italy, were investigated. A Neyman–Scott rectangular pulses (NSRP) model was fitted to these series, so enabling the generation of synthetic time-series suitable for investigation. A multifractal scaling behaviour was found to characterize the raw data within a range of time-scales between approximately 20 min and 1 week. The main features of this behaviour were surprisingly well reproduced in the simulated data, although some differences were observed, particularly at small scales below the typical duration of a rain cell. This suggests the possibility of a combined use of the NSRP model and a scaling approach, in order to extend the NSRP range of applicability for simulation purposes. Copyright © 2002 John Wiley & Sons, Ltd. [source]


Weather regimes and their connection to the winter rainfall in Portugal

INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 1 2005
J.A. Santos
Abstract Wintertime rainfall over Portugal is strongly coupled with the large-scale atmospheric flow in the Euro-Atlantic sector. A K-means cluster analysis, on the space spanned by a subset of the empirical orthogonal functions of the daily mean sea-level pressure fields, is performed aiming to isolate the weather regimes responsible for the interannual variability of the winter precipitation. Each daily circulation pattern is keyed to a set of five weather regimes (C, W, NAO−, NAO+ and E). The dynamical structure of each regime substantiates the statistical properties of the respective rainfall distribution and validates the clustering technique. The C regime is related to low-pressure systems over the North Atlantic that induce southwesterly and westerly moist winds over the country. The W regime is characterized by westerly disturbed weather associated with low-pressure systems mainly located over northern Europe. The NAO− regime is manifested by weak low-pressure systems near Portugal. The NAO+ regime corresponds to a well-developed Azores high with generally settled and dry weather conditions. Finally, the E regime is related to anomalously strong easterly winds and rather dry conditions. Although the variability in the frequencies of occurrence of the C and NAO− regimes is largely dominant in the interannual variability of the winter rainfall throughout Portugal, the C regime is particularly meaningful over northern Portugal and the NAO− regime acquires higher relevance over southern Portugal. The inclusion of the W regime improves the description of the variability over northern and central Portugal. Dry weather conditions prevail in both the NAO+ and E regimes, with hardly any exceptions. The occurrence of the NAO+ and NAO− regimes is also strongly coupled with the North Atlantic Oscillation. Copyright © 2005 Royal Meteorological Society [source]
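
A minimal sketch of the classification pipeline (EOFs via SVD of daily pressure anomalies, then K-means into five clusters) is given below; the toy fields, grid size and number of retained EOFs are assumptions, and scikit-learn supplies the clustering:

```python
import numpy as np
from sklearn.cluster import KMeans

def weather_regimes(mslp_daily, n_eofs=10, n_regimes=5, seed=0):
    """Cluster daily mean-sea-level-pressure fields into weather regimes.
    mslp_daily: (days, gridpoints) matrix.  EOFs via SVD of the anomalies, then
    K-means on the leading principal components (regime labels are arbitrary integers)."""
    anomalies = mslp_daily - mslp_daily.mean(axis=0)
    u, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    pcs = u[:, :n_eofs] * s[:n_eofs]                 # daily principal components
    return KMeans(n_clusters=n_regimes, n_init=10, random_state=seed).fit_predict(pcs)

# Hypothetical toy fields: 900 winter days on a coarse 200-point grid.
rng = np.random.default_rng(4)
fields = rng.normal(size=(900, 200)) + rng.normal(size=(900, 1)) * np.linspace(-1, 1, 200)
labels = weather_regimes(fields)
print(np.bincount(labels))    # frequency of occurrence of each regime
```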


Does compression affect image retrieval performance?

INTERNATIONAL JOURNAL OF IMAGING SYSTEMS AND TECHNOLOGY, Issue 2-3 2008
Gerald Schaefer
Abstract Image retrieval and image compression are both fields of intensive research. As lossy image compression degrades the visual quality of images and hence changes the actual pixel values of an image, low-level image retrieval descriptors that are based on statistical properties of pixel values will change too. In this article we investigate how image compression affects the performance of low-level colour descriptors. Several image retrieval algorithms are evaluated on a speciated image database compressed at different image quality levels. Extensive experiments reveal that while distribution-based colour descriptors are fairly stable with respect to image compression, a drop in retrieval performance can nevertheless be observed for JPEG compressed images. On the other hand, after application of JPEG2000 compression only a negligible performance drop is observed, even at high compression ratios. © 2008 Wiley Periodicals, Inc. Int J Imaging Syst Technol, 18, 101–112, 2008 [source]


Computed molecular surface electrostatic potentials of two groups of reverse transcriptase inhibitors: Relationships to anti-HIV-1 activities

INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY, Issue 3-4 2001
Oscar Galvez Gonzalez
Abstract We have used the GIPF approach (general interaction properties function) to develop analytical representations for the anti-HIV-1 potencies of two groups of reverse transcriptase inhibitors. Their activities are expressed in terms of certain statistical properties of their molecular surface electrostatic potentials, computed at the HF/STO-5G*//HF/STO-3G* level. The results provide insight into some of the factors that promote inhibition. © 2001 John Wiley & Sons, Inc. Int J Quant Chem 83: 115,121, 2001 [source]