Kernel Smoothing (kernel + smoothing)


Selected Abstracts


Comparison of missing value imputation methods for crop yield data

ENVIRONMETRICS, Issue 4 2006
Ravindra S. Lokupitiya
Abstract Most ecological data sets contain missing values, a fact which can cause problems in the analysis and limit the utility of resulting inference. However, ecological data also tend to be spatially correlated, which can aid in estimating and imputing missing values. We compared four existing methods of estimating missing values: regression, kernel smoothing, universal kriging, and multiple imputation. Data on crop yields from the National Agricultural Statistical Survey (NASS) and the Census of Agriculture (Ag Census) were the basis for our analysis. Our goal was to find the best method to impute missing values in the NASS datasets. For this comparison, we selected the NASS data for barley crop yield in 1997 as our reference dataset. We found in this case that multiple imputation and regression were superior to methods based on spatial correlation. Universal kriging was found to be the third best method. Kernel smoothing seemed to perform very poorly. Copyright © 2005 John Wiley & Sons, Ltd.
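
As a rough illustration of what kernel-smoothing imputation can look like, the sketch below fills a missing value at one location with a Nadaraya-Watson weighted average of values observed nearby. It is not the procedure used in the paper: the Gaussian kernel, the bandwidth, the function names and the yield figures are all assumptions made for the example.

```python
import math

def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def kernel_impute(target, observed, bandwidth):
    """Nadaraya-Watson estimate at `target` from (x, y, value) records.

    `observed` holds only sites whose value is known; `bandwidth` is in the
    same units as the coordinates. Both are hypothetical inputs.
    """
    num = 0.0
    den = 0.0
    for x, y, value in observed:
        dist = math.hypot(x - target[0], y - target[1])
        w = gaussian_kernel(dist / bandwidth)
        num += w * value
        den += w
    if den == 0.0:
        raise ValueError("no observed site carries weight at this bandwidth")
    return num / den

# Made-up yields at four site centroids (illustrative numbers only).
sites = [(0.0, 0.0, 50.0), (1.0, 0.0, 54.0), (0.0, 1.0, 58.0), (1.0, 1.0, 62.0)]

# The target is equidistant from all four sites, so the weights are equal
# and the estimate is the plain average of the four values.
estimate = kernel_impute((0.5, 0.5), sites, bandwidth=1.0)
```

The estimate is always a weighted average of observed values, so it cannot fall outside their range. That may be one reason such a smoother struggles where yields change sharply over short distances.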


Inference Based on Kernel Estimates of the Relative Risk Function in Geographical Epidemiology

BIOMETRICAL JOURNAL, Issue 1 2009
Martin L. Hazelton
Abstract Kernel smoothing is a popular approach to estimating relative risk surfaces from data on the locations of cases and controls in geographical epidemiology. The interpretation of such surfaces is facilitated by plotting of tolerance contours which highlight areas where the risk is sufficiently high to reject the null hypothesis of unit relative risk. Previously it has been recommended that these tolerance intervals be calculated using Monte Carlo randomization tests. We examine a computationally cheap alternative whereby the tolerance intervals are derived from asymptotic theory. We also examine the performance of global tests of heterogeneous risk employing statistics based on kernel risk surfaces, paying particular attention to the choice of smoothing parameters on test power. (© 2009 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
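
A kernel relative risk surface of the kind described above is commonly built as the ratio of two kernel density estimates, one for cases and one for controls. The sketch below evaluates the log of that ratio at a point, so that unit relative risk corresponds to zero. The common bandwidth `h`, the Gaussian kernel, the function names and the coordinates are assumptions for illustration. Tolerance contours, whether by Monte Carlo or by asymptotic theory, are not computed here.

```python
import math

def kde2d(point, sample, h):
    """Bivariate Gaussian kernel density estimate at `point`, bandwidth h."""
    total = 0.0
    for sx, sy in sample:
        d2 = (point[0] - sx) ** 2 + (point[1] - sy) ** 2
        total += math.exp(-0.5 * d2 / (h * h))
    return total / (len(sample) * 2.0 * math.pi * h * h)

def log_relative_risk(point, cases, controls, h):
    """Log ratio of the case density to the control density at `point`."""
    return math.log(kde2d(point, cases, h)) - math.log(kde2d(point, controls, h))

# Made-up locations in the unit square (illustrative only): cases cluster
# near the lower-left corner, controls are spread more evenly.
cases = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8)]
controls = [(0.1, 0.1), (0.5, 0.5), (0.9, 0.9), (0.8, 0.2)]

rho_near = log_relative_risk((0.15, 0.15), cases, controls, h=0.2)  # inside the cluster
rho_far = log_relative_risk((0.8, 0.2), cases, controls, h=0.2)     # away from it
```

With these inputs `rho_near` is positive and `rho_far` is negative, which is the kind of contrast a tolerance contour is meant to assess formally.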


Non-parametric regression with a latent time series

THE ECONOMETRICS JOURNAL, Issue 2 2009
Oliver Linton
Summary In this paper we investigate a class of semi-parametric models for panel data sets where the cross-section and time dimensions are large. Our model contains a latent time series that is to be estimated and perhaps forecasted along with a non-parametric covariate effect. Our model is motivated by the need to be flexible with regard to the functional form of covariate effects but also the need to be practical with regard to forecasting of time series effects. We propose estimation procedures based on local linear kernel smoothing; our estimators are all explicitly given. We establish the pointwise consistency and asymptotic normality of our estimators. We also show that the effects of estimating the latent time series can be ignored in certain cases.
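
For readers unfamiliar with local linear kernel smoothing, the sketch below shows the basic building block: a kernel-weighted least-squares line fitted around each evaluation point, whose intercept is the estimate. It covers only a plain one-covariate regression. The panel structure and the latent time series of the paper are not modelled, and the Gaussian kernel, the bandwidth `h` and the function name are assumptions.

```python
import math

def local_linear(x0, xs, ys, h):
    """Local linear estimate of E[y | x = x0] with a Gaussian kernel."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w          # sum of weights
        s1 += w * d      # weighted first moment of (x - x0)
        s2 += w * d * d  # weighted second moment of (x - x0)
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det == 0.0:
        raise ValueError("design is degenerate at this point and bandwidth")
    # Intercept of the weighted least-squares line, solved in closed form.
    return (s2 * t0 - s1 * t1) / det

# Exactly linear data, which a local linear fit reproduces up to rounding.
xs = [i / 10 for i in range(11)]
ys = [2.0 * x + 1.0 for x in xs]
fit = local_linear(0.37, xs, ys, h=0.3)
```

Because the estimate is a closed-form expression in weighted sums, it is "explicitly given" in the same loose sense as the abstract's estimators, although the paper's actual expressions may differ.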


Estimating the Intensity of a Spatial Point Process from Locations Coarsened by Incomplete Geocoding

BIOMETRICS, Issue 1 2008
Dale L. Zimmerman
Summary The estimation of spatial intensity is an important inference problem in spatial epidemiologic studies. A standard data assimilation component of these studies is the assignment of a geocode, that is, point-level spatial coordinates, to the address of each subject in the study population. Unfortunately, when geocoding is performed by the standard automated method of street-segment matching to a georeferenced road file and subsequent interpolation, it is rarely completely successful. Typically, 10-30% of the addresses in the study population, and even higher percentages in particular subgroups, fail to geocode, potentially leading to a selection bias, called geographic bias, and an inefficient analysis. Missing-data methods could be considered for analyzing such data; however, because there is almost always some geographic information coarser than a point (e.g., a Zip code) observed for the addresses that fail to geocode, a coarsened-data analysis is more appropriate. This article develops methodology for estimating spatial intensity from coarsened geocoded data. Both nonparametric (kernel smoothing) and likelihood-based estimation procedures are considered. Substantial improvements in the estimation quality of coarsened-data analyses relative to analyses of only the observations that geocode are demonstrated via simulation and an example from a rural health study in Iowa.


Nonparametric Inference for Local Extrema with Application to Oligonucleotide Microarray Data in Yeast Genome

BIOMETRICS, Issue 2 2006
Peter X.-K.
Summary Identifying local extrema of expression profiles is one primary objective in some cDNA microarray experiments. To study the replication dynamics of the yeast genome, for example, local peaks of hybridization intensity profiles correspond to putative replication origins. We propose a nonparametric kernel smoothing (NKS) technique to detect local hybridization intensity extrema across chromosomes. The novelty of our approach is that we base our inference procedures on equilibrium points, namely those locations at which the first derivative of the intensity curve is zero. The proposed smoothing technique provides both point and interval estimation for the location of local extrema. Also, this technique can be used to test for the hypothesis of either one or multiple suspected locations being the true equilibrium points. We illustrate the proposed method on a microarray data set from an experiment designed to study the replication origins in the yeast genome, in that the locations of autonomous replication sequence (ARS) elements are identified through the equilibrium points of the smoothed intensity profile curve. Our method found a few ARS elements that were not detected by the current smoothing methods such as the Fourier convolution smoothing.
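
The idea of locating extrema through equilibrium points can be sketched as follows: estimate the first derivative of the smoothed curve on a grid and report where it crosses zero from positive to negative. The derivative estimator used here, the slope of a Gaussian-weighted local linear fit, is a stand-in chosen for brevity and is not claimed to be the NKS estimator of the paper. The bandwidth, the grid and the synthetic profile are likewise assumptions, and no interval estimates or tests are produced.

```python
import math

def local_slope(x0, xs, ys, h):
    """Slope of a Gaussian-weighted local linear fit at x0 (derivative estimate)."""
    s0 = s1 = s2 = t0 = t1 = 0.0
    for x, y in zip(xs, ys):
        d = x - x0
        w = math.exp(-0.5 * (d / h) ** 2)
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * y
        t1 += w * d * y
    det = s0 * s2 - s1 * s1
    if det == 0.0:
        raise ValueError("design is degenerate at this point and bandwidth")
    return (s0 * t1 - s1 * t0) / det

def find_peaks(xs, ys, h, grid):
    """Locations where the estimated derivative crosses from + to - on `grid`."""
    peaks = []
    prev_x = grid[0]
    prev_d = local_slope(prev_x, xs, ys, h)
    for g in grid[1:]:
        d = local_slope(g, xs, ys, h)
        if prev_d > 0.0 and d <= 0.0:
            # Linear interpolation of the zero crossing between grid points.
            peaks.append(prev_x + (g - prev_x) * prev_d / (prev_d - d))
        prev_x, prev_d = g, d
    return peaks

# Synthetic, noise-free intensity profile with a single peak at x = 5.
xs = [i * 0.1 for i in range(101)]
ys = [math.exp(-0.5 * (x - 5.0) ** 2) for x in xs]
grid = [1.0 + i * 0.3 for i in range(27)]  # evaluation points from 1.0 to 8.8
peaks = find_peaks(xs, ys, h=0.3, grid=grid)
```

On real, noisy profiles the bandwidth controls how many spurious zero crossings appear, which is where the interval estimates and tests described in the abstract would come in.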