Discrete Data


Selected Abstracts


The multi-clump finite mixture distribution and model selection

ENVIRONMETRICS, Issue 2 2010
Sudhir R. Paul
Abstract In practical data analysis, an important problem is often to determine the number of clumps in discrete data in the form of proportions. This can be done through model selection in a multi-clump finite mixture model. In this paper, we propose bootstrap likelihood ratio tests to test the fit of a multinomial model against the single-clump finite mixture distribution and to determine the number of clumps in the data, that is, to select a model with an appropriate number of clumps. Shortcomings of some traditional large-sample procedures are also shown. Three datasets are analyzed. Copyright © 2009 John Wiley & Sons, Ltd. [source]
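The bootstrap likelihood-ratio procedure described in the abstract can be sketched generically. The sketch below is not Paul's multinomial formulation: it substitutes an illustrative two-clump Poisson mixture, and the log/logistic parameterisations, starting values, and bootstrap size are all choices made here for simplicity. The key point survives, though: under the null the mixture weight sits on the boundary of the parameter space, so the usual chi-squared reference distribution is unreliable and the null distribution of the likelihood-ratio statistic is approximated by a parametric bootstrap instead.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)

def nll_single(params, x):
    # Null model: one clump, Poisson(lambda).
    lam = np.exp(params[0])                      # log link keeps lambda > 0
    return -stats.poisson.logpmf(x, lam).sum()

def nll_mixture(params, x):
    # Alternative: two-clump Poisson mixture.
    lam1, lam2 = np.exp(params[0]), np.exp(params[1])
    w = 1.0 / (1.0 + np.exp(-params[2]))         # logistic link keeps weight in (0, 1)
    pmf = w * stats.poisson.pmf(x, lam1) + (1.0 - w) * stats.poisson.pmf(x, lam2)
    return -np.log(pmf + 1e-300).sum()

def lrt_stat(x):
    m = np.log(x.mean() + 0.5)
    fit0 = optimize.minimize(nll_single, [m], args=(x,))
    fit1 = optimize.minimize(nll_mixture, [m - 0.5, m + 0.5, 0.0], args=(x,))
    return 2.0 * (fit0.fun - fit1.fun), np.exp(fit0.x[0])

def bootstrap_lrt(x, n_boot=99):
    # Parametric bootstrap: simulate from the fitted null, refit both models,
    # and compare the observed statistic with its bootstrap distribution.
    t_obs, lam_hat = lrt_stat(x)
    t_boot = [lrt_stat(rng.poisson(lam_hat, size=x.size))[0] for _ in range(n_boot)]
    p = (1 + sum(t >= t_obs for t in t_boot)) / (n_boot + 1)
    return t_obs, p

x = rng.poisson(3.0, size=100)    # synthetic data truly from the one-clump null
t_obs, p = bootstrap_lrt(x)
print(t_obs, p)
```

A second clump would be added only if the bootstrap P-value were small; repeating the test of k against k + 1 components selects the number of clumps.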


A score test for non-nested hypotheses with applications to discrete data models

JOURNAL OF APPLIED ECONOMETRICS, Issue 5 2001
J. M. C. Santos Silva
In this paper it is shown that a convenient score test against non-nested alternatives can be constructed from the linear combination of the likelihood functions of the competing models. This is essentially a test for the correct specification of the conditional distribution of the variable of interest. Given its characteristics, the proposed test is particularly attractive to check the distributional assumptions in models for discrete data. The usefulness of the test is illustrated with an application to models for recreational boating trips. Copyright © 2001 John Wiley & Sons, Ltd. [source]


Estimation of the hindered settling function R(φ) from batch-settling tests

AICHE JOURNAL, Issue 4 2005
Daniel R. Lester
Abstract The hindered settling function R(φ) is a material function that quantifies the interphase drag of colloidal suspensions for all solids volume fractions φ. A method is presented to estimate R(φ) from batch-settling tests for solids volume fractions between the initial solids volume fraction, φ0, and the solids volume fraction at which the suspension forms a continuously networked structure, φg, known as the gel point. The method is based on an analytic solution of the associated inverse problem. Techniques are presented to address initialization mechanics observed in such tests as well as experimental noise and discrete data. Analysis of synthetic and experimental data suggests that accurate estimates of R(φ) are possible in most cases. These results provide scope for characterization of suspension dewaterability from batch-settling tests alone. © 2005 American Institute of Chemical Engineers AIChE J, 2005 [source]


Geostatistical inference under preferential sampling

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES C (APPLIED STATISTICS), Issue 2 2010
Peter J. Diggle
Summary. Geostatistics involves the fitting of spatially continuous models to spatially discrete data. Preferential sampling arises when the process that determines the data locations and the process being modelled are stochastically dependent. Conventional geostatistical methods assume, if only implicitly, that sampling is non-preferential. However, these methods are often used in situations where sampling is likely to be preferential. For example, in mineral exploration, samples may be concentrated in areas that are thought likely to yield high grade ore. We give a general expression for the likelihood function of preferentially sampled geostatistical data and describe how this can be evaluated approximately by using Monte Carlo methods. We present a model for preferential sampling and demonstrate through simulated examples that ignoring preferential sampling can lead to misleading inferences. We describe an application of the model to a set of biomonitoring data from Galicia, northern Spain, in which making allowance for preferential sampling materially changes the results of the analysis. [source]
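The effect the authors warn about is easy to reproduce in a toy simulation. The surface, sampling scheme, and sample sizes below are invented for illustration (this is not the Galicia analysis): sampling locations are drawn with probability proportional to the surface itself, mimicking surveyors who concentrate samples where high values are expected, and the naive spatial average is then biased upwards.

```python
import numpy as np

rng = np.random.default_rng(1)

# A smooth synthetic "spatial" surface on a grid: high values cluster
# near one corner of the unit square.
n = 200
xg, yg = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
field = np.exp(-((xg - 0.8) ** 2 + (yg - 0.8) ** 2) / 0.1)

# Non-preferential design: uniformly random sample locations.
idx_unif = rng.integers(0, n, size=(500, 2))
mean_unif = field[idx_unif[:, 0], idx_unif[:, 1]].mean()

# Preferential design: locations chosen with probability proportional to
# the field itself (samples concentrated where high values are expected).
p = field.ravel() / field.sum()
idx_pref = rng.choice(field.size, size=500, p=p)
mean_pref = field.ravel()[idx_pref].mean()

# The preferential sample over-represents high-valued regions, so its
# naive average overshoots both the uniform-design average and the truth.
print(field.mean(), mean_unif, mean_pref)
```

Correcting this bias requires modelling the sampling process jointly with the field, which is what the paper's likelihood-based approach does.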


Application of GIS for processing and establishing the correlation between weather radar reflectivity and precipitation data

METEOROLOGICAL APPLICATIONS, Issue 1 2005
Y. Gorokhovich
Correlation between weather radar reflectivity and precipitation data collected by rain gauges allows empirical formulae to be obtained that can be used to create continuous rainfall surfaces from discrete data. Such surfaces are useful in distributed hydrologic modelling and early warning systems in flood management. Because of the spatial relationship between rain gauge locations and the radar coverage area, GIS provides the basis for data analysis and manipulation. A database of 82 radar stations and more than 1500 rain gauges in the continental USA was compiled and used for the continuous downloading of radar images and rain data. Image sequences corresponding to rain events were extracted for two randomly selected radar stations in South and North Carolina. Rainfall data from multiple gauges within the radar zone of 124 nautical miles (nm) (~230 km) were extracted and combined with corresponding reflectivity values for each time interval of the selected rain event. Data were normalised to one-hour intervals, and statistical analysis was then applied to study the potential correlation. Results of regression analysis showed a significant correlation between rain gauge data and radar reflectivity values and allowed the derivation of empirical formulae. Copyright © 2005 Royal Meteorological Society. [source]
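The abstract does not reproduce the fitted formulae, but radar reflectivity Z and rain rate R are conventionally related by a power law Z = aR^b (the Marshall-Palmer form). The sketch below is an illustration only, with coefficients, noise model, and sample size all invented: the point is that the power law is linear on log scales, so ordinary least squares recovers the empirical coefficients from gauge/radar pairs.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic gauge/radar pairs following Z = a * R^b with lognormal noise
# (a_true, b_true are illustrative, not the paper's fitted values).
a_true, b_true = 200.0, 1.6
rain = rng.uniform(0.5, 50.0, size=300)                  # rain rate, mm/h
z = a_true * rain ** b_true * rng.lognormal(0.0, 0.2, size=rain.size)

# On log scales the power law is linear: log Z = log a + b log R,
# so a plain least-squares fit recovers the empirical coefficients.
X = np.column_stack([np.ones_like(rain), np.log(rain)])
coef, *_ = np.linalg.lstsq(X, np.log(z), rcond=None)
a_hat, b_hat = np.exp(coef[0]), coef[1]
print(a_hat, b_hat)
```

In practice the gauge data would first be normalised to a common accumulation interval, as the abstract describes, before pairing with reflectivity values.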


EXACT P-VALUES FOR DISCRETE MODELS OBTAINED BY ESTIMATION AND MAXIMIZATION

AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 4 2008
Chris J. Lloyd
Summary In constructing exact tests from discrete data, one must deal with the possible dependence of the P-value on nuisance parameter(s) ψ as well as the discreteness of the sample space. A classical but heavy-handed approach is to maximize over ψ. We prove what has previously been understood informally, namely that maximization produces the unique and smallest possible P-value subject to the ordering induced by the underlying test statistic and test validity. On the other hand, allowing for the worst case will be more attractive when the P-value is less dependent on ψ. We investigate the extent to which estimating ψ under the null reduces this dependence. An approach somewhere between full maximization and estimation is partial maximization, with an appropriate penalty, as introduced by Berger & Boos (1994, P values maximized over a confidence set for the nuisance parameter. J. Amer. Statist. Assoc., 89, 1012-1016). It is argued that estimation followed by maximization is an attractive, but computationally more demanding, alternative to partial maximization. We illustrate the ideas on a range of low-dimensional but important examples for which the alternative methods can be investigated completely numerically. [source]
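The contrast between estimated and maximized P-values can be computed exhaustively in a small example of the same flavour as those in the paper, though not taken from it: two independent binomials with a common success probability as the nuisance parameter, a one-sided difference-in-proportions ordering, and a grid search for the maximization. The sample sizes and observed counts below are invented.

```python
import numpy as np
from scipy import stats

# Two independent binomials Bin(n1, p1) and Bin(n2, p2); under H0: p1 = p2
# the common success probability p is the nuisance parameter.
n1, n2 = 8, 10
x_obs, y_obs = 6, 2

x = np.arange(n1 + 1)[:, None]            # every outcome for sample 1
y = np.arange(n2 + 1)[None, :]            # every outcome for sample 2
t = x / n1 - y / n2                       # ordering: difference in proportions
t_obs = x_obs / n1 - y_obs / n2
extreme = t >= t_obs - 1e-12              # one-sided "at least as extreme" region

def pvalue(p):
    # P(T >= t_obs) when both samples share success probability p.
    prob = stats.binom.pmf(x, n1, p) * stats.binom.pmf(y, n2, p)
    return prob[extreme].sum()

# Plug-in estimate of the nuisance parameter; it is included in the search
# grid so the maximised P-value dominates the estimated one by construction.
p_hat = (x_obs + y_obs) / (n1 + n2)
candidates = np.append(np.linspace(1e-6, 1 - 1e-6, 2001), p_hat)
p_est = pvalue(p_hat)                         # estimated ("plug-in") P-value
p_max = max(pvalue(p) for p in candidates)    # fully maximised P-value
print(p_est, p_max)
```

By construction p_max ≥ p_est here: maximization guarantees validity at the price of conservatism, which is why the paper studies estimation, partial maximization, and estimation followed by maximization as less heavy-handed alternatives.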