Distribution Theory



Selected Abstracts


Assessing accuracy of a continuous screening test in the presence of verification bias

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 1 2005
Todd A. Alonzo
Summary. In studies to assess the accuracy of a screening test, definitive disease assessment is often too invasive or expensive to be ascertained on all the study subjects. Although it may be more ethical or cost effective to ascertain the true disease status at a higher rate in study subjects where the screening test or additional information is suggestive of disease, estimates of accuracy can be biased in a study with such a design. This bias is known as verification bias. Verification bias correction methods that accommodate screening tests with binary or ordinal responses have been developed; however, no verification bias correction methods exist for tests with continuous results. We propose and compare imputation and reweighting bias-corrected estimators of true and false positive rates, receiver operating characteristic curves and area under the receiver operating characteristic curve for continuous tests. Distribution theory and simulation studies are used to compare the proposed estimators with respect to bias, relative efficiency and robustness to model misspecification. The bias correction estimators proposed are applied to data from a study of screening tests for neonatal hearing loss. [source]
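The reweighting idea sketched in the abstract can be illustrated with a generic inverse-probability-weighting (IPW) correction. This is not the authors' estimator: it assumes verification depends only on observed quantities (missing at random), and the function and variable names are hypothetical.

```python
import numpy as np

def ipw_rates(test_pos, disease, verified, p_verify):
    """IPW-corrected true/false positive rates under MAR verification.

    test_pos : 0/1 array, screening test exceeds the threshold
    disease  : 0/1 array, true status (set to 0 where unverified; the
               zero weight below makes those entries irrelevant)
    verified : 0/1 array, disease status was ascertained
    p_verify : array, probability of verification given observed data
    """
    w = verified / p_verify                      # weight 0 if unverified
    tpr = np.sum(w * test_pos * disease) / np.sum(w * disease)
    fpr = np.sum(w * test_pos * (1 - disease)) / np.sum(w * (1 - disease))
    return tpr, fpr
```

A useful sanity check: when every subject is verified with probability one, all weights equal one and the estimator reduces to the naive empirical rates.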


Ecological relevance of temporal stability in regional fish catches

JOURNAL OF FISH BIOLOGY, Issue 5 2003
H. Hinz
The concept of habitat selection based on 'Ideal Free Distribution' theory suggests that areas of high suitability may attract larger quantities of fishes than less suitable or unsuitable areas. Catch data from groundfish surveys were used to identify areas of consistently high densities of whiting Merlangius merlangus, cod Gadus morhua and haddock Melanogrammus aeglefinus in the Irish Sea, and of plaice Pleuronectes platessa, sole Solea solea and lemon sole Microstomus kitt in the English Channel, over periods of 10 and 9 years respectively. A method was introduced to objectively delineate, from large datasets, areas of the seabed that held consistently high numbers of fishes. These areas may possess important habitat characteristics that merit further scientific investigation with respect to 'Essential Fish Habitats' (EFH). In addition, the number of stations with consistently high abundances of fishes and the number of stations where no fishes were caught gave an indication of the site specificity of the fish species analysed. Among the gadoids, whiting was found to be less site specific than cod and haddock, while among the flatfishes, plaice and sole were less site specific than lemon sole. The findings are discussed in the context of previously published studies on dietary specialism. The site specificity of demersal fishes has implications for the siting of marine protected areas, as fish species with a strong habitat affinity can be expected to benefit more from such management schemes. [source]
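The idea of objectively flagging stations with consistently high catches can be sketched as a simple persistence rule: a station counts as consistently high if its catch falls in the upper quantile in at least a given number of survey years. The quantile cut-off and the name `consistently_high` are illustrative assumptions, not the authors' actual delineation method.

```python
import numpy as np

def consistently_high(catch, q=0.75, min_years=None):
    """Flag stations whose catch is in the upper quantile in most years.

    catch : 2-D array, rows = stations, columns = survey years
    """
    n_years = catch.shape[1]
    if min_years is None:
        min_years = n_years                 # require every year by default
    thresh = np.quantile(catch, q, axis=0)  # yearly quantile cut-off
    high = catch >= thresh                  # station in top band that year?
    return high.sum(axis=1) >= min_years
```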


US state alcohol sales compared to survey data, 1993–2006

ADDICTION, Issue 9 2010
David E. Nelson
ABSTRACT Aims Assess long-term trends in the correlation between alcohol sales data and survey data. Design Analyses of state alcohol consumption data from the US Alcohol Epidemiologic Data System based on sales, tax receipts or alcohol shipments. Cross-sectional, state annual estimates of alcohol-related measures for adults from the US Behavioral Risk Factor Surveillance System using telephone surveys. Setting United States. Participants State alcohol tax authorities, alcohol vendors, alcohol industry (sales data) and randomly selected adults aged ≥ 18 years, 1993–2006 (survey data). Measurements State-level per capita annual alcohol consumption estimates from sales data. Self-reported alcohol consumption, current drinking, heavy drinking, binge drinking and alcohol-impaired driving from surveys. Correlation coefficients were calculated using linear regression models. Findings State survey estimates of consumption accounted for a median of 22% to 32% of state sales data across years. Nevertheless, state consumption estimates from both sources were strongly correlated, with annual r-values ranging from 0.55 to 0.71. State sales data had moderate-to-strong correlations with survey estimates of current drinking, heavy drinking and binge drinking (ranges of r-values across years: 0.57–0.65, 0.33–0.70 and 0.45–0.61, respectively), but a weaker correlation with alcohol-impaired driving (range of r-values: 0.24–0.56). There were no trends in the magnitude of the correlation coefficients. Conclusions Although state surveys substantially underestimated alcohol consumption, the consistency of the strength of the association between sales and survey data for most alcohol measures suggests that both data sources continue to provide valuable information. These findings support and extend the distribution-of-consumption model and single distribution theory, suggesting that both sales and survey data are useful for monitoring population changes in alcohol use. [source]


Instrumental Variables Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings

ECONOMETRICA, Issue 1 2002
Alberto Abadie
This paper reports estimates of the effects of JTPA training programs on the distribution of earnings. The estimation uses a new instrumental variable (IV) method that measures program impacts on quantiles. The quantile treatment effects (QTE) estimator reduces to quantile regression when selection for treatment is exogenously determined. QTE can be computed as the solution to a convex linear programming problem, although this requires first-step estimation of a nuisance function. We develop distribution theory for the case where the first step is estimated nonparametrically. For women, the empirical results show that the JTPA program had the largest proportional impact at low quantiles. Perhaps surprisingly, however, JTPA training raised the quantiles of earnings for men only in the upper half of the trainee earnings distribution. [source]
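As the abstract notes, the QTE estimator reduces to quantile regression when selection for treatment is exogenous; with no covariates that special case is simply a difference of marginal quantiles. A minimal sketch of that degenerate case only (the paper's IV estimator for endogenous selection is substantially more involved):

```python
import numpy as np

def qte(y_treat, y_control, taus=(0.15, 0.5, 0.85)):
    """Quantile 'treatment effects' as differences of marginal quantiles.

    Valid only under exogenous assignment and no covariates; returns a
    dict mapping each quantile level to the treated-minus-control gap.
    """
    return {t: np.quantile(y_treat, t) - np.quantile(y_control, t)
            for t in taus}
```

With a pure location shift, every quantile effect equals the shift, which gives a quick sanity check.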


GMM with Weak Identification

ECONOMETRICA, Issue 5 2000
James H. Stock
This paper develops asymptotic distribution theory for GMM estimators and test statistics when some or all of the parameters are weakly identified. General results are obtained and are specialized to two important cases: linear instrumental variables regression and Euler equations estimation of the CCAPM. Numerical results for the CCAPM demonstrate that weak-identification asymptotics explains the breakdown of conventional GMM procedures documented in previous Monte Carlo studies. Confidence sets immune to weak identification are proposed. We use these results to inform an empirical investigation of various CCAPM specifications; the substantive conclusions reached differ from those obtained using conventional methods. [source]
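For the linear instrumental variables case mentioned in the abstract, the GMM estimator with weight matrix (Z'Z)^{-1} coincides with two-stage least squares. A minimal sketch of that strong-instrument point estimate only; it does not implement the paper's weak-identification-robust confidence sets.

```python
import numpy as np

def linear_iv_gmm(y, x, Z):
    """2SLS / one-step linear GMM for y = x @ beta + u with instruments Z.

    Under weak identification this point estimate is unreliable, which is
    precisely the breakdown the paper documents.
    """
    Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)   # projection onto instruments
    return np.linalg.solve(x.T @ Pz @ x, x.T @ Pz @ y)
```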


Sample Splitting and Threshold Estimation

ECONOMETRICA, Issue 3 2000
Bruce E. Hansen
Threshold models have a wide variety of applications in economics. Direct applications include models of separating and multiple equilibria. Other applications include empirical sample splitting when the sample split is based on a continuously-distributed variable such as firm size. In addition, threshold models may be used as a parsimonious strategy for nonparametric function estimation. For example, the threshold autoregressive model (TAR) is popular in the nonlinear time series literature. Threshold models also emerge as special cases of more complex statistical frameworks, such as mixture models, switching models, Markov switching models, and smooth transition threshold models. It may be important to understand the statistical properties of threshold models as a preliminary step in the development of statistical tools to handle these more complicated structures. Despite the large number of potential applications, the statistical theory of threshold estimation is undeveloped. It is known that threshold estimates are super-consistent, but a distribution theory useful for testing and inference has yet to be provided. This paper develops a statistical theory for threshold estimation in the regression context. We allow for either cross-section or time series observations. Least squares estimation of the regression parameters is considered. An asymptotic distribution theory for the regression estimates (the threshold and the regression slopes) is developed. It is found that the distribution of the threshold estimate is nonstandard. A method to construct asymptotic confidence intervals is developed by inverting the likelihood ratio statistic. It is shown that this yields asymptotically conservative confidence regions. Monte Carlo simulations are presented to assess the accuracy of the asymptotic approximations. The empirical relevance of the theory is illustrated through an application to the multiple equilibria growth model of Durlauf and Johnson (1995). [source]
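The least-squares threshold estimate described above can be computed by profiling: for each candidate threshold, fit the two regimes by OLS and keep the threshold that minimizes the pooled sum of squared residuals. A minimal sketch under illustrative names; the paper's inference procedure (inverting the likelihood ratio statistic) is not shown.

```python
import numpy as np

def threshold_ls(y, x, q):
    """Least-squares threshold estimate by grid search over observed q.

    Fits y = a1 + b1*x for q <= gamma and y = a2 + b2*x for q > gamma,
    choosing gamma to minimize the pooled sum of squared residuals.
    """
    best_ssr, best_gamma = np.inf, None
    for gamma in np.unique(q)[1:-1]:         # keep both regimes non-empty
        ssr = 0.0
        for mask in (q <= gamma, q > gamma):
            X = np.column_stack([np.ones(mask.sum()), x[mask]])
            beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
            ssr += np.sum((y[mask] - X @ beta) ** 2)
        if ssr < best_ssr:
            best_ssr, best_gamma = ssr, gamma
    return best_gamma
```

On noiseless data with a genuine break, the profiled SSR is zero exactly at the last candidate before the break, which illustrates why the threshold estimate has the nonstandard distribution the paper derives.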


Environmental power analysis , a new perspective

ENVIRONMETRICS, Issue 5 2001
David R. Fox
Abstract Power analysis and sample-size determination are related tools that have recently gained popularity in the environmental sciences. Their indiscriminate application, however, can lead to wildly misleading results. This is particularly true in environmental monitoring and assessment, where the quality and nature of the data are such that the implicit assumptions underpinning power and sample-size calculations are difficult to justify. When the assumptions are reasonably met, these statistical techniques provide researchers with an important capability for allocating scarce and expensive resources to detect putative impact or change. Conventional analyses are predicated on a general linear model and normal distribution theory, with statistical tests of environmental impact couched in terms of changes in a population mean. While these are 'optimal' statistical tests (uniformly most powerful), they nevertheless pose considerable practical difficulties for the researcher. Compounding this difficulty is the subsequent analysis of the data and the imposition of a decision framework that commences with an assumption of 'no effect'. This assumption is only discarded when the sample data provide demonstrable evidence to the contrary. The alternative ('green') view is that any anthropogenic activity has an impact on the environment, and therefore a more realistic initial position is to assume that the environment is already impacted. In this article we examine these issues and provide a re-formulation of conventional mean-based hypotheses in terms of population percentiles. Prior information or belief concerning the probability of exceeding a criterion is incorporated into the power analysis using a Bayesian approach. Finally, a new statistic is introduced which attempts to balance the overall power regardless of the decision framework adopted. Copyright © 2001 John Wiley & Sons, Ltd. [source]
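For reference, the conventional normal-theory calculation the article critiques: the power of a one-sided two-sample z-test for a mean shift, assuming a known common standard deviation and equal group sizes (assumptions the article argues are often hard to justify for monitoring data).

```python
from scipy.stats import norm

def power_two_sample(delta, sigma, n, alpha=0.05):
    """Power of a one-sided two-sample z-test for a mean shift delta.

    Textbook normal-theory setting: known common sigma, equal group
    sizes n. Power = Phi(delta/se - z_{1-alpha}).
    """
    se = sigma * (2.0 / n) ** 0.5
    return float(norm.sf(norm.isf(alpha) - delta / se))
```

At delta = 0 the "power" collapses to the significance level alpha, the familiar boundary case.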


Seasonal and long-term changes in fishing depth of Lake Constance whitefish

FISHERIES MANAGEMENT & ECOLOGY, Issue 5 2010
G. THOMAS
Abstract The ecosystem of Lake Constance in central Europe has undergone profound modifications over the last six decades. Seasonal and inter-annual changes in the vertical distribution patterns of whitefish were examined and related to changes in biotic and abiotic gradients. Between 1958 and 2007, the average fishing depth in late summer and autumn was related to two factors influencing the food supply of whitefish: lake productivity and standing stock biomass. In years with low food supply, whitefish were harvested from greater depths, where temperatures were up to 4 °C lower. The whitefish's distribution towards colder water might be a bioenergetic optimisation behaviour whereby fish reduce metabolic losses at lower temperatures, or it may result from a reassessment of habitat preference under conditions of limited food supply, in accordance with ideal free distribution theory. [source]


Measurements of rain splash on bench terraces in a humid tropical steepland environment

HYDROLOGICAL PROCESSES, Issue 3 2003
A. I. J. M. Van Dijk
Abstract Soil loss continues to threaten Java's predominantly bench-terraced volcanic uplands. Sediment transport processes on back-sloping terraces with well-aggregated clay-rich oxisols in West Java were studied using two different techniques. Splash on bare, cropped or mulched sub-horizontal (2–3°) terrace beds was studied using splash cups of different sizes, whereas transport of sediment on the predominantly bare and steep (30–40°) terrace risers was measured using a novel device combining a Gerlach-type trough with a splash box to enable the separate measurement of transport by wash and splash processes. Measurements were made during two consecutive rainy seasons. The results were interpreted using a recently developed splash distribution theory and related to effective rainfall erosive energy. Splash transportability (i.e. transport per unit contour length and unit erosive energy) on the terrace risers was more than an order of magnitude greater than on bare terrace beds (0·39–0·57 versus 0·013–0·016 g m⁻¹ J⁻¹). This was caused primarily by a greater average splash distance on the short, steep risers (>11 cm versus c. 1 cm on the beds). Splashed amounts were reduced by the gradual formation of a protective 'pavement' of coarser aggregates, in particular on the terrace beds. Soil aggregate size exhibited an inverse relationship with detachability (i.e. detachment per unit area and unit erosive energy) and average splash length, and therefore also with transportability, as did the degree of canopy and mulch cover. On the terrace risers, splash-creep and gravitational processes transported an additional 6–50% of measured rain splash, whereas transport by wash played a marginal role. Copyright © 2002 John Wiley & Sons, Ltd. [source]


Crystallization of Silicate Magmas Deciphered Using Crystal Size Distributions

JOURNAL OF THE AMERICAN CERAMIC SOCIETY, Issue 3 2007
Bruce D. Marsh
The remoteness and inhospitable nature of natural silicate magma make it exceedingly difficult to study in its natural setting deep beneath volcanoes. Although laboratory experiments involving molten rock are routinely performed, it is the style and nature of crystallization under natural conditions that is important to understand. This is where the crystal size distributions (CSD) method becomes fundamentally valuable. Just as chemical thermodynamics offers a quantitative macroscopic means of investigating chemical processes that occur at the atomic level, crystal size distribution theory quantitatively relates the overall observed spectrum of crystal sizes to both the kinetics of crystallization and the physical processes affecting the population of crystals themselves. Petrography, which is the qualitative study of rock textures, is the oldest, most comprehensively developed, and perhaps most beautiful aspect of studying magmatic rocks. It is the ultimate link to the kinetics of crystallization and the integrated space–time history of evolution of every magma. CSD analysis offers a quantitative inroad to unlocking and quantifying the observed textures of magmatic rocks. Perhaps the most stunning feature of crystal-rich magmatic rocks is that the constituent crystal populations show smooth and often quasi-linear log-normal distributions of negative slope when plotted as population density against crystal size. These patterns are decipherable using CSD theory, and this method has proven uniquely valuable in deciphering the kinetics of crystallization of magma. The CSD method has been largely developed in chemical engineering by Randolph and Larson [1, 2], among many others, for use in understanding industrial crystallization processes, and its introduction to natural magmatic systems began in 1988. The CSD approach is particularly valuable in its ease of application to complex systems.
It is an aid to classical kinetic theory by being, in its purest form, free of any atomistic assumptions regarding crystal nucleation and growth. Yet the CSD method provides kinetic information valuable to understanding the connection between crystal nucleation and growth and the overall cooling and dynamics of magma. It offers a means of investigating crystallization in dynamic systems, involving both physical and chemical processes, independent of an exact kinetic theory. The CSD method applied to rocks shows a systematic and detailed history of crystal nucleation and growth that forms the foundation of a comprehensive and general model of magma solidification. [source]
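A quasi-linear CSD of the kind described implies the standard steady-state form n(L) = n0·exp(−L/(Gτ)), so a straight-line fit of ln n against crystal size L recovers the nucleation density n0 from the intercept and the characteristic length Gτ (growth rate × residence time) from the slope. A minimal sketch under that assumption, with illustrative names:

```python
import numpy as np

def csd_fit(size, pop_density):
    """Fit ln(n) = ln(n0) - L/(G*tau) to a crystal size distribution.

    Returns (n0, G*tau): the nucleation density from the intercept and
    the characteristic crystal length from the (negative) slope.
    """
    slope, intercept = np.polyfit(size, np.log(pop_density), 1)
    return np.exp(intercept), -1.0 / slope
```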


Signal-to-interference-plus-noise ratio estimation for wireless communication systems: Methods and analysis

NAVAL RESEARCH LOGISTICS: AN INTERNATIONAL JOURNAL, Issue 5 2004
Daniel R. Jeske
Abstract The Signal-to-Interference-plus-Noise Ratio (SINR) is an important metric of wireless communication link quality. SINR estimates have several important applications. These include optimizing the transmit power level for a target quality of service, assisting with handoff decisions and dynamically adapting the data rate for wireless Internet applications. Accurate SINR estimation provides for both a more efficient system and a higher user-perceived quality of service. In this paper, we develop new SINR estimators and compare their mean squared error (MSE) performance. We show that our new estimators dominate estimators that have previously appeared in the literature with respect to MSE. The sequence of transmitted bits in wireless communication systems consists of both pilot bits (which are known both to the transmitter and receiver) and user bits (which are known only by the transmitter). The SINR estimators we consider alternatively depend exclusively on pilot bits, exclusively on user bits, or simultaneously use both pilot and user bits. In addition, we consider estimators that utilize smoothing and feedback mechanisms. Smoothed estimators are motivated by the fact that the interference component of the SINR changes relatively slowly with time, typically with the addition or departure of a user to the system. Feedback estimators are motivated by the fact that receivers typically decode bits correctly with a very high probability, and therefore user bits can be thought of as quasi-pilot bits. For each estimator discussed, we derive an exact or approximate formula for its MSE. Satterthwaite approximations, noncentral F distributions (singly and doubly) and distribution theory of quadratic forms are the key statistical tools used in developing the MSE formulas. In the case of approximate MSE formulas, we validate their accuracy using simulation techniques.
The approximate MSE formulas, of interest in their own right for comparing the quality of the estimators, are also used for optimally combining estimators. In particular, we derive optimal weights for linearly combining an estimator based on pilot bits with an estimator based on user bits. The optimal weights depend on the MSE of the two estimators being combined, and thus the accurate approximate MSE formulas can conveniently be used. The optimal weights also depend on the unknown SINR, and therefore need to be estimated in order to construct a useable combined estimator. The impact on the MSE of the combined estimator due to estimating the weights is examined. © 2004 Wiley Periodicals, Inc. Naval Research Logistics, 2004 [source]
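For two independent unbiased estimators, the minimum-MSE linear combination weights each estimator inversely to its MSE. The sketch below shows only that textbook special case; the paper's actual weights additionally depend on the unknown SINR and must themselves be estimated, as the abstract explains.

```python
def combine(est_pilot, mse_pilot, est_user, mse_user):
    """Minimum-MSE linear combination of two independent unbiased estimators.

    Optimal weight on the pilot-based estimator:
        w* = mse_user / (mse_pilot + mse_user)
    Combined MSE: mse_pilot * mse_user / (mse_pilot + mse_user),
    which is never larger than either individual MSE.
    """
    w = mse_user / (mse_pilot + mse_user)
    est = w * est_pilot + (1.0 - w) * est_user
    mse = mse_pilot * mse_user / (mse_pilot + mse_user)
    return est, mse
```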


Assessing the magnitude of the concentration parameter in a simultaneous equations model

THE ECONOMETRICS JOURNAL, Issue 1 2009
D. S. Poskitt
Summary. This paper provides the practitioner with a method of ascertaining when the concentration parameter in a simultaneous equations model is small. We provide some exact distribution theory for a proposed statistic and show that the statistic possesses the minimal desirable characteristics of a test statistic when used to test that the concentration parameter is zero. The discussion is then extended to consider how to test for weak instruments using this statistic as a basis for inference. We also discuss the statistic's relationship to various other procedures that have appeared in the literature. [source]
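For a single endogenous regressor with first stage x = Zπ + v, the concentration parameter is μ² = π′Z′Zπ/σ²_v. A sample analogue is easy to compute and conveys the scale the paper is concerned with (illustrative only; the paper develops exact distribution theory for its own statistic, not this plug-in):

```python
import numpy as np

def concentration_parameter(x, Z):
    """Plug-in concentration parameter from the first stage x = Z @ pi + v.

    Large values indicate strong instruments; values near zero signal
    the weak-instrument setting the proposed test targets.
    """
    pi, *_ = np.linalg.lstsq(Z, x, rcond=None)
    fitted = Z @ pi
    v = x - fitted
    sigma2 = v @ v / (len(x) - Z.shape[1])   # first-stage error variance
    return float(fitted @ fitted / sigma2)   # pi'Z'Z pi / sigma_v^2
```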


Effective directed tests for models with ordered categorical data

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 3 2003
Arthur Cohen
Summary This paper offers a new method for testing one-sided hypotheses in discrete multivariate data models. One-sided alternatives mean that there are restrictions on the multidimensional parameter space. The focus is on models dealing with ordered categorical data. In particular, applications are concerned with R×C contingency tables. The method has advantages over other general approaches. All tests are exact in the sense that no large sample theory or large sample distribution theory is required. Testing is unconditional although its execution is done conditionally, section by section, where a section is determined by marginal totals. This eliminates any potential nuisance parameter issues. The power of the tests is more robust than the power of the typical linear tests often recommended. Furthermore, computer programs are available to carry out the tests efficiently regardless of the sample sizes or the order of the contingency tables. Both censored data and uncensored data models are discussed. [source]
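The simplest instance of an exact conditional test of the kind described, conditioning on the table margins, is the one-sided Fisher exact test for a 2×2 table. The paper's method handles general R×C tables with ordered categories, which this sketch does not attempt.

```python
from scipy.stats import fisher_exact

def one_sided_exact(table):
    """One-sided exact p-value for positive association in a 2x2 table.

    Conditions on the margins (hypergeometric null), so no large-sample
    distribution theory and no nuisance parameters are involved.
    """
    _, p = fisher_exact(table, alternative="greater")
    return p
```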


Semiparametric Models of Time-Dependent Predictive Values of Prognostic Biomarkers

BIOMETRICS, Issue 1 2010
Yingye Zheng
Summary Rigorous statistical evaluation of the predictive values of novel biomarkers is critical prior to applying novel biomarkers into routine standard care. It is important to identify factors that influence the performance of a biomarker in order to determine the optimal conditions for test performance. We propose a covariate-specific time-dependent positive predictive values curve to quantify the predictive accuracy of a prognostic marker measured on a continuous scale and with censored failure time outcome. The covariate effect is accommodated with a semiparametric regression model framework. In particular, we adopt a smoothed survival time regression technique (Dabrowska, 1997, The Annals of Statistics, 25, 1510–1540) to account for the situation where risk for the disease occurrence and progression is likely to change over time. In addition, we provide asymptotic distribution theory and resampling-based procedures for making statistical inference on the covariate-specific positive predictive values. We illustrate our approach with numerical studies and a dataset from a prostate cancer study. [source]
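A crude empirical analogue of the time-dependent positive predictive value, PPV(t; c) = P(T ≤ t | Y > c), can be obtained from a Kaplan–Meier estimate within the marker-positive subgroup. This sketch ignores the covariate adjustment, smoothing, and inference procedures that are central to the proposed method; all names are illustrative.

```python
import numpy as np

def ppv_t(time, event, marker, c, t):
    """Empirical PPV(t) = P(T <= t | marker > c) via subgroup Kaplan-Meier.

    time   : follow-up times
    event  : bool array, True if failure observed (False = censored)
    marker : continuous biomarker values
    """
    sel = marker > c
    tt, ev = time[sel], event[sel]
    surv = 1.0
    for u in np.unique(tt[ev & (tt <= t)]):  # distinct event times <= t
        at_risk = np.sum(tt >= u)
        d = np.sum((tt == u) & ev)
        surv *= 1.0 - d / at_risk            # Kaplan-Meier step
    return 1.0 - surv
```

Without censoring this reduces to the empirical proportion of marker-positive subjects failing by time t.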