Simple Random Sample

Selected Abstracts


A cost analysis of ranked set sampling to estimate a population mean

ENVIRONMETRICS, Issue 3 2005
Rebecca A. Buchanan
Abstract Ranked set sampling (RSS) can be a useful environmental sampling method when measurement costs are high but ranking costs are low. RSS estimates of the population mean can have higher precision than estimates from a simple random sample (SRS) of the same size, leading to potentially lower sampling costs from RSS than from SRS for a given precision. However, RSS introduces ranking costs not present in SRS; these costs must be considered in determining whether RSS is cost effective. We use a simple cost model to determine the minimum ratio of measurement to ranking costs (cost ratio) necessary in order for RSS to be as cost effective as SRS for data from the normal, exponential, and lognormal distributions. We consider both equal and unequal RSS allocations and two types of estimators of the mean: the typical distribution-free (DF) estimator and the best linear unbiased estimator (BLUE). The minimum cost ratio necessary for RSS to be as cost effective as SRS depends on the underlying distribution of the data, as well as the allocation and type of estimator used. Most minimum necessary cost ratios are in the range of 1–6, and are lower for BLUEs than for DF estimators. The higher the prior knowledge of the distribution underlying the data, the lower the minimum necessary cost ratio and the more attractive RSS is over SRS. Copyright © 2005 John Wiley & Sons, Ltd. [source]
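
The trade-off described here can be made concrete with a small simulation. The sketch below is only illustrative and is not the authors' cost model: it assumes perfect ranking, a balanced RSS, and a toy cost model in which every measured RSS unit requires ranking a full set of m candidate units; the function names and parameter values are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

def rss_mean(draw, m, r):
    """Mean of a balanced ranked set sample: set size m, r cycles (m * r measured units).
    For each measured unit, m candidates are drawn and ranked (here by their true
    values, i.e. perfect ranking) and only the unit with the target rank is measured."""
    vals = [np.sort(draw(m))[rank] for _ in range(r) for rank in range(m)]
    return np.mean(vals)

# Monte Carlo relative efficiency of RSS vs. SRS for a lognormal population.
draw = lambda k: rng.lognormal(mean=0.0, sigma=1.0, size=k)
m, r, reps = 3, 10, 4000
n = m * r
var_srs = np.var([draw(n).mean() for _ in range(reps)])
var_rss = np.var([rss_mean(draw, m, r) for _ in range(reps)])
re = var_srs / var_rss          # relative efficiency at equal measured sample size

# Under this toy cost model, matching SRS precision needs n / re measurements, so RSS
# is no more expensive than SRS once the measurement-to-ranking cost ratio
# exceeds m / (re - 1).
print(f"relative efficiency: {re:.2f}")
print(f"minimum cost ratio under this toy model: {m / (re - 1):.1f}")
```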


Sampling for a longitudinal study of the careers of nurses qualifying from the English pre-registration Project 2000 diploma course

JOURNAL OF ADVANCED NURSING, Issue 4 2000
Louise Marsland PhD BSc RMN
This paper describes the processes involved in selecting a sample, from the eight English regional health authorities, of nurse qualifiers from all four branches of the Project 2000 pre-registration diploma course, for a longitudinal study of nurses' careers. A simple random sample was not feasible since accurate information about the population could not be obtained and the study design involved recruiting participants by personal visit. A multi-stage approach was therefore adopted in which 'college of nursing' was taken as the primary sampling unit. Sampling was further complicated by the fact that adult branch students could generally only be visited in larger groups than was ideal. Information obtained during pilot work about the accuracy of data about the population, course completion rates and the proportion of students who were likely to agree to participate was used to calculate required sampling fractions. The final sample was therefore a function of this information and the practicalities of recruiting nurses into the study. [source]
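
Backing a required recruitment number out of pilot estimates of completion and consent rates is simple arithmetic; a minimal sketch follows. The function name and all rates are hypothetical and are not figures from the study.

```python
def required_recruits(target_completers, completion_rate, consent_rate):
    """Number of students to approach so that, after course attrition and refusals,
    roughly `target_completers` remain in the longitudinal sample."""
    return int(round(target_completers / (completion_rate * consent_rate)))

# Hypothetical pilot estimates: 85% complete the course, 70% agree to take part.
print(required_recruits(target_completers=600, completion_rate=0.85, consent_rate=0.70))  # 1008
```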


A new ranked set sample estimator of variance

JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 2 2002
Steven N. MacEachern
Summary. We develop an unbiased estimator of the variance of a population based on a ranked set sample. We show that this new estimator is better than estimating the variance based on a simple random sample and more efficient than the estimator based on a ranked set sample proposed by Stokes. Also, a test to determine the effectiveness of the judgment ordering process is proposed. [source]
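
The paper's unbiased estimator is not reproduced here. As a point of reference, the sketch below checks by simulation how the naive pooled sample variance of a balanced ranked set sample (a plug-in benchmark roughly in the spirit of the Stokes estimator mentioned above) compares with the ordinary SRS sample variance for a standard normal population; the set size and replicate counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
m, r, reps = 3, 4, 20000          # set size, cycles, Monte Carlo replicates
n = m * r                          # number of measured units in each sample

def rss(m, r):
    # Balanced RSS under perfect ranking: keep the i-th smallest of each
    # fresh set of m draws, for i = 0..m-1, repeated over r cycles.
    return np.array([np.sort(rng.normal(size=m))[i] for _ in range(r) for i in range(m)])

mean_rss_var = np.mean([rss(m, r).var(ddof=1) for _ in range(reps)])
mean_srs_var = np.mean([rng.normal(size=n).var(ddof=1) for _ in range(reps)])
print("average pooled sample variance, RSS:", round(mean_rss_var, 4))  # true variance is 1
print("average sample variance, SRS:       ", round(mean_srs_var, 4))
```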


Statistical and methodological issues in the analysis of complex sample survey data: Practical guidance for trauma researchers

JOURNAL OF TRAUMATIC STRESS, Issue 5 2008
Brady T. West
Standard methods for the analysis of survey data assume that the data arise from a simple random sample of the target population. In practice, analysts of survey data sets collected from nationally representative probability samples often pay little attention to important properties of the survey data. Standard statistical software procedures do not allow analysts to take these properties of survey data into account. A failure to use more specialized procedures designed for survey data analysis can impact both simple descriptive statistics and estimation of parameters in multivariate models. In this article, the author provides trauma researchers with a practical introduction to specialized methods that have been developed for the analysis of complex sample survey data. [source]
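
A bare-bones illustration of the point: the sketch below contrasts an unweighted mean (implicitly treating the data as a simple random sample) with a design-weighted mean and a with-replacement linearization standard error. The data and weights are simulated, the code ignores stratification and clustering, and a real analysis would use dedicated survey procedures.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical respondent-level data: outcome y and final survey weights w.
n = 500
y = rng.normal(loc=2.0, scale=1.0, size=n) + rng.binomial(1, 0.3, size=n)  # toy outcome
w = rng.uniform(0.5, 4.0, size=n)                                          # toy design weights

# Unweighted estimate, as if the data were a simple random sample.
unweighted_mean = y.mean()
unweighted_se = y.std(ddof=1) / np.sqrt(n)

# Design-weighted (Hajek) estimate with a with-replacement linearization SE.
w_mean = np.sum(w * y) / np.sum(w)
resid = w * (y - w_mean)
w_se = np.sqrt(n / (n - 1) * np.sum(resid**2)) / np.sum(w)

print(f"unweighted: {unweighted_mean:.3f} (SE {unweighted_se:.3f})")
print(f"weighted:   {w_mean:.3f} (SE {w_se:.3f})")
```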


Application in stochastic volatility models of nonlinear regression with stochastic design

APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 2 2010
Ping Chen
Abstract In regression models with stochastic design, the observations have primarily been treated as a simple random sample from a bivariate distribution. It is of considerable practical significance to generalize this setting to stochastic processes. In this paper, estimation and hypothesis testing problems in a stochastic volatility model are considered, where the volatility depends on a nonlinear function of the state variable of another stochastic process and the correlation coefficient ρ satisfies |ρ| ≠ 1. The methods are applied to estimate the volatility of stock returns from the Shanghai stock exchange. Copyright © 2009 John Wiley & Sons, Ltd. [source]
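
The estimation and testing procedures themselves are beyond a short snippet; the sketch below only simulates a basic discrete-time stochastic volatility model in which the return and log-variance shocks have correlation ρ with |ρ| ≠ 1, i.e. the setting the abstract refers to. Parameter names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def simulate_sv(T, mu=-1.0, phi=0.95, sigma_eta=0.2, rho=0.4):
    """Simulate a basic discrete-time stochastic volatility model.

    h_{t+1} = mu + phi * (h_t - mu) + sigma_eta * eta_t   (log-variance)
    r_t     = exp(h_t / 2) * eps_t                         (return)
    with corr(eps_t, eta_t) = rho and |rho| != 1.
    """
    cov = np.array([[1.0, rho], [rho, 1.0]])
    shocks = rng.multivariate_normal(np.zeros(2), cov, size=T)  # columns: (eps_t, eta_t)
    h = np.empty(T)
    h[0] = mu
    for t in range(T - 1):
        h[t + 1] = mu + phi * (h[t] - mu) + sigma_eta * shocks[t, 1]
    r = np.exp(h / 2) * shocks[:, 0]
    return r, h

returns, log_var = simulate_sv(T=1000, rho=0.4)
print("sample std of simulated returns:", returns.std())
```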


Estimation of the ROC Curve under Verification Bias

BIOMETRICAL JOURNAL, Issue 3 2009
Ronen Fluss
Abstract The ROC (receiver operating characteristic) curve is the most commonly used statistical tool for describing the discriminatory accuracy of a diagnostic test. Classical estimation of the ROC curve relies on data from a simple random sample from the target population. In practice, estimation is often complicated because not all subjects undergo a definitive assessment of disease status (verification). Estimation of the ROC curve based only on data from subjects with verified disease status may be badly biased. In this work we investigate the properties of the doubly robust (DR) method for estimating the ROC curve under verification bias, originally developed by Rotnitzky, Faraggi and Schisterman (2006) for estimating the area under the ROC curve. The DR method can be applied to continuous-scale tests and allows for a non-ignorable process of selection to verification. We develop the estimator's asymptotic distribution and examine its finite sample properties via a simulation study. We exemplify the DR procedure for estimation of ROC curves with data collected on patients undergoing electron beam computed tomography, a diagnostic test for calcification of the arteries. [source]
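
The doubly robust estimator itself is not reproduced below; the sketch shows the simpler inverse-probability-weighting idea that underlies one half of it, reweighting verified subjects by the inverse of their (here, known) verification probabilities. The function name, the toy verification model, and the data are all invented for illustration.

```python
import numpy as np

def ipw_roc(score, disease, verified, pi):
    """ROC points from verified subjects only, reweighted by 1 / P(verification).

    score:    continuous test result, available for everyone
    disease:  true status (only meaningful where verified == 1)
    verified: 1 if disease status was ascertained, else 0
    pi:       probability of verification for each subject (known or modeled)
    """
    w = verified / pi                          # unverified subjects get weight 0
    d = np.where(verified == 1, disease, 0)    # ignore placeholder statuses
    pos, neg = np.sum(w * d), np.sum(w * (1 - d))
    fpr, tpr = [], []
    for c in np.unique(score)[::-1]:           # sweep thresholds from high to low
        call = (score >= c).astype(float)
        tpr.append(np.sum(w * d * call) / pos)
        fpr.append(np.sum(w * (1 - d) * call) / neg)
    return np.array(fpr), np.array(tpr)

# Toy demo: verification is more likely for high scores (selection depends on the test).
rng = np.random.default_rng(4)
n = 2000
d_true = rng.binomial(1, 0.3, n)
score = rng.normal(loc=d_true, scale=1.0)
p_verify = 1.0 / (1.0 + np.exp(-(score - 0.5)))
verified = rng.binomial(1, p_verify)
fpr, tpr = ipw_roc(score, np.where(verified == 1, d_true, 0), verified, p_verify)
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)   # trapezoidal AUC
print(f"IPW-estimated AUC: {auc:.3f}")
```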


Design and Inference for Cancer Biomarker Study with an Outcome and Auxiliary-Dependent Subsampling

BIOMETRICS, Issue 2 2010
Xiaofei Wang
Summary In cancer research, it is important to evaluate the performance of a biomarker (e.g., molecular, genetic, or imaging) that correlates with patients' prognosis or predicts patients' response to treatment in a large prospective study. Due to overall budget constraints and the high cost associated with bioassays, investigators often have to select a subset from all registered patients for biomarker assessment. To detect a potentially moderate association between the biomarker and the outcome, investigators need to decide how to select the subset of a fixed size such that the study efficiency can be enhanced. We show that, instead of drawing a simple random sample from the study cohort, greater efficiency can be achieved by allowing the selection probability to depend on the outcome and an auxiliary variable; we refer to such a sampling scheme as outcome- and auxiliary-dependent subsampling (OADS). This article is motivated by the need to analyze data from a lung cancer biomarker study that adopts the OADS design to assess epidermal growth factor receptor (EGFR) mutations as a predictive biomarker for whether a subject responds better to EGFR inhibitor drugs. We propose an estimated maximum-likelihood method that accommodates the OADS design and utilizes all observed information, especially that contained in the likelihood score of EGFR mutations (an auxiliary variable for EGFR mutation status) that is available to all patients. We derive the asymptotic properties of the proposed estimator and evaluate its finite sample properties via simulation. We illustrate the proposed method with a data example. [source]
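
The estimated maximum-likelihood machinery is beyond a short snippet; the sketch below only illustrates the sampling side of an OADS-style design, letting the probability of selecting a patient for the expensive assay depend on the response outcome and an auxiliary score available for everyone. All variable names, strata, and rates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical cohort: response outcome (1 = responder) and an auxiliary likelihood
# score available on everyone; the expensive biomarker is assayed only on a subsample.
n = 5000
response = rng.binomial(1, 0.2, n)
aux_score = np.clip(rng.normal(0.3 + 0.4 * response, 0.2), 0, 1)

# Outcome- and auxiliary-dependent selection probabilities (illustrative values):
# oversample responders and subjects with extreme auxiliary scores.
pi = np.where(response == 1, 0.80, 0.15)
pi = np.where((aux_score < 0.2) | (aux_score > 0.8), np.minimum(1.0, pi * 2), pi)

selected = rng.binomial(1, pi).astype(bool)
print("assays spent:", selected.sum(), "of", n)

# Downstream analyses must account for pi (e.g. via weighting or an estimated-likelihood
# approach); a plain analysis of the subsample alone would be biased.
```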


Estimating the Encounter Rate Variance in Distance Sampling

BIOMETRICS, Issue 1 2009
Rachel M. Fewster
Summary The dominant source of variance in line transect sampling is usually the encounter rate variance. Systematic survey designs are often used to reduce the true variability among different realizations of the design, but estimating the variance is difficult and estimators typically approximate the variance by treating the design as a simple random sample of lines. We explore the properties of different encounter rate variance estimators under random and systematic designs. We show that a design-based variance estimator improves upon the model-based estimator of Buckland et al. (2001, Introduction to Distance Sampling. Oxford: Oxford University Press, p. 79) when transects are positioned at random. However, if populations exhibit strong spatial trends, both estimators can have substantial positive bias under systematic designs. We show that poststratification is effective in reducing this bias. [source]
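
For reference, the sketch below implements one standard form of the encounter rate variance estimator that treats the K transect lines as a simple random sample of lines; notation and small-sample conventions differ slightly across references, so treat it as a sketch rather than the exact estimator examined in the paper. The transect lengths and counts are made up.

```python
import numpy as np

def encounter_rate_var(counts, lengths):
    """Encounter rate n/L and its variance estimated as if the K transect lines
    were a simple random sample of lines (one standard length-weighted form)."""
    counts = np.asarray(counts, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    K = len(counts)
    L = lengths.sum()
    rate = counts.sum() / L
    var = K / (L**2 * (K - 1)) * np.sum(lengths**2 * (counts / lengths - rate) ** 2)
    return rate, var

# Hypothetical survey: 10 transects with lengths (km) and detection counts.
lengths = [4.2, 5.0, 3.8, 4.9, 5.1, 4.0, 4.4, 5.3, 3.9, 4.6]
counts = [3, 7, 2, 6, 9, 1, 4, 8, 2, 5]
rate, var = encounter_rate_var(counts, lengths)
print(f"encounter rate = {rate:.3f} per km, SE = {var**0.5:.3f}")
```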