Nonparametric Methods (nonparametric + methods)

Selected Abstracts


The Wilcoxon Signed Rank Test for Paired Comparisons of Clustered Data

BIOMETRICS, Issue 1 2006
Bernard Rosner
Summary The Wilcoxon signed rank test is a frequently used nonparametric test for paired data (e.g., consisting of pre- and posttreatment measurements) based on independent units of analysis. This test cannot be used for paired comparisons arising from clustered data (e.g., if paired comparisons are available for each of two eyes of an individual). To incorporate clustering, a generalization of the randomization test formulation for the signed rank test is proposed, where the unit of randomization is at the cluster level (e.g., person), while the individual paired units of analysis are at the subunit within cluster level (e.g., eye within person). An adjusted variance estimate of the signed rank test statistic is then derived, which can be used for either balanced (same number of subunits per cluster) or unbalanced (different number of subunits per cluster) data, with an exchangeable correlation structure, with or without tied values. The resulting test statistic is shown to be asymptotically normal as the number of clusters becomes large, if the cluster size is bounded. Simulation studies are performed based on simulating correlated ranked data from a signed log-normal distribution. These studies indicate appropriate type I error for data sets with ≥20 clusters and a superior power profile compared with either the ordinary signed rank test based on the average cluster difference score or the multivariate signed rank test of Puri and Sen (1971, Nonparametric Methods in Multivariate Analysis, New York: John Wiley).
Finally, the methods are illustrated with two data sets, (i) an ophthalmologic data set involving a comparison of electroretinogram (ERG) data in retinitis pigmentosa (RP) patients before and after undergoing an experimental surgical procedure, and (ii) a nutritional data set based on a randomized prospective study of nutritional supplements in RP patients where vitamin E intake outside of study capsules is compared before and after randomization to monitor compliance with nutritional protocols. [source]
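
As a rough illustration of one of the comparators named in the abstract (the ordinary signed rank test applied to the average difference score of each cluster), the following Python sketch may help. It is not the adjusted-variance cluster test proposed in the paper; the function names and the example data are hypothetical, zero differences are dropped by convention, and the z-score uses the usual large-sample normal approximation with midranks for ties.

```python
import math

def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_on_cluster_means(clusters):
    """clusters: one list of paired differences per cluster (hypothetical layout)."""
    means = [sum(c) / len(c) for c in clusters]
    means = [m for m in means if m != 0]  # drop zero differences
    n = len(means)
    ranks = midranks([abs(m) for m in means])
    w_plus = sum(r for r, m in zip(ranks, means) if m > 0)
    expected = n * (n + 1) / 4
    variance = sum(r * r for r in ranks) / 4  # null variance, valid with ties
    z = (w_plus - expected) / math.sqrt(variance)
    return w_plus, z

# Hypothetical data: four persons with 1 to 3 eyes' worth of differences each.
w_plus, z = signed_rank_on_cluster_means(
    [[1.0, 3.0], [-1.0], [4.0, 2.0, 3.0], [-0.5, -1.5]]
)
```

Averaging within clusters restores independence between the ranked units, at the cost of ignoring how many subunits each cluster contributed; the abstract reports that the proposed cluster-level test has a better power profile than this baseline.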


Are parametric models suitable for estimating avian growth rates?

JOURNAL OF AVIAN BIOLOGY, Issue 4 2007
William P. Brown
For many bird species, growth is negative or equivocal during development. Traditional, parametric growth curves assume growth follows a sigmoidal form with prescribed inflection points and is positive until asymptotic size. Accordingly, these curves will not accurately capture the variable, sometimes considerable, fluctuations in avian growth over the course of the trajectory. We evaluated the fit of three traditional growth curves (logistic, Gompertz, and von Bertalanffy) and a nonparametric spline estimator to simulated growth data of six different specified forms over a range of sample sizes. For all sample sizes, the spline best fit the simulated model that exhibited negative growth during a portion of the trajectory. The Gompertz curve was the most flexible for fitting simulated models that were strictly sigmoidal in form, yet the fit of the spline was comparable to that of the Gompertz curve as sample size increased. Importantly, confidence intervals for all of the fitted, traditional growth curves were wholly inaccurate, negating the apparent robustness of the Gompertz curve, while confidence intervals of the spline were acceptable. We further evaluated the fit of traditional growth curves and the spline to a large data set of wood thrush Hylocichla mustelina mass and wing chord observations. The spline fit the wood thrush data better than the traditional growth curves, produced estimates that did not differ from known observations, and described negative growth rates at relevant life history stages that were not detected by the growth curves. The common rationale for using parametric growth curves, which compress growth information into a few parameters, is to predict an expected size or growth rate at some age or to compare estimated growth with other published estimates. The suitability of these traditional growth curves may be compromised by several factors, however, including variability in the true growth trajectory. 
Nonparametric methods, such as the spline, provide a precise description of empirical growth yet do not produce such parameter estimates. Selection of a growth descriptor is best determined by the question being asked but may be constrained by inherent patterns in the growth data. [source]


Statistical models of shape for the analysis of protein spots in two-dimensional electrophoresis gel images

PROTEINS: STRUCTURE, FUNCTION AND BIOINFORMATICS, Issue 6 2003
Mike Rogers
Abstract In image analysis of two-dimensional electrophoresis gels, individual spots need to be identified and quantified. Two classes of algorithms are commonly applied to this task. Parametric methods rely on a model, making strong assumptions about spot appearance, but are often insufficiently flexible to adequately represent all spots that may be present in a gel. Nonparametric methods make no assumptions about spot appearance and consequently impose few constraints on spot detection, allowing more flexibility but reducing robustness when image data is complex. We describe a parametric representation of spot shape that is both general enough to represent unusual spots, and specific enough to introduce constraints on the interpretation of complex images. Our method uses a model of shape based on the statistics of an annotated training set. The model allows new spot shapes, belonging to the same statistical distribution as the training set, to be generated. To represent spot appearance we use the statistically derived shape convolved with a Gaussian kernel, simulating the diffusion process in spot formation. We show that the statistical model of spot appearance and shape is able to fit to image data more closely than the commonly used spot parameterizations based solely on Gaussian and diffusion models. We show that improvements in model fitting are gained without degrading the specificity of the representation. [source]


European population substructure is associated with mucocutaneous manifestations and autoantibody production in systemic lupus erythematosus

ARTHRITIS & RHEUMATISM, Issue 8 2009
Sharon A. Chung
Objective To determine whether genetic substructure in European-derived populations is associated with specific manifestations of systemic lupus erythematosus (SLE), including mucocutaneous phenotypes, autoantibody production, and renal disease. Methods SLE patients of European descent (n = 1,754) from 8 case collections were genotyped for >1,400 ancestry informative markers that define a north–south gradient of European substructure. Using the Structure program, each SLE patient was characterized in terms of percent Northern (versus percent Southern) European ancestry based on these genetic markers. Nonparametric methods, including tests for trend, were used to identify associations between Northern European ancestry and specific SLE manifestations. Results In multivariate analyses, increasing levels of Northern European ancestry were significantly associated with photosensitivity (Ptrend = 0.0021, odds ratio for highest quartile of Northern European ancestry versus lowest quartile [ORhigh–low] 1.64, 95% confidence interval [95% CI] 1.13–2.35) and discoid rash (Ptrend = 0.014, ORhigh–low 1.93, 95% CI 0.98–3.83). In contrast, increasing levels of Northern European ancestry had a protective effect against the production of anticardiolipin autoantibodies (Ptrend = 1.6 × 10⁻⁴, ORhigh–low 0.46, 95% CI 0.30–0.69) and anti–double-stranded DNA autoantibodies (Ptrend = 0.017, ORhigh–low 0.67, 95% CI 0.46–0.96). Conclusion This study demonstrates that specific SLE manifestations vary according to Northern versus Southern European ancestry. Thus, genetic ancestry may contribute to the clinical heterogeneity and variation in disease outcomes among SLE patients of European descent. Moreover, these results suggest that genetic studies of SLE subphenotypes will need to carefully address issues of population substructure based on genetic ancestry. [source]


Assessing the joint effects of chlorinated dioxins, some pesticides and polychlorinated biphenyls on thyroid hormone status in Japanese breast-fed infants

ENVIRONMETRICS, Issue 2 2003
Takashi Yanagawa
Abstract Joint effects of dioxin related chemicals (DXNs), hexachlorocyclohexanes (HCHs), DDT, dieldrin, heptachlor-epoxide (HCE), chlordane and polychlorinated biphenyls (PCB) on the levels of triiodothyronine (T3), thyroxine (T4), thyroid stimulating hormones (TSH) and thyroid binding globulin (TBG) in the peripheral blood of 101 breast-fed infants are studied. The statistical issue involved is how to estimate the effects based on data from volunteer subjects with possible measurement errors. A chain independent graph is applied for modeling the associations among factors, and dichotomizations of selected factors are performed for estimating the effects. Use of nonparametric methods with careful consideration of over-adjustment is suggested. It is shown that the estimated odds ratios of DXNs–DDT, the first principal component of DXNs and DDT, relative to TSH are 3.02 (p-value = 0.03) and 7.15 (p-value = 0.02) when PCB is not adjusted for and adjusted for, respectively. Copyright © 2003 John Wiley & Sons, Ltd. [source]


Lag screw fixation of dorsal cortical stress fractures of the third metacarpal bone in 116 racehorses

EQUINE VETERINARY JOURNAL, Issue 7 2010
S. L. JALIM
Summary Reasons for performing study: The effectiveness and best method to manage dorsal cortical stress fractures is not clear. This study was performed to evaluate the success of lag screw fixation of such fractures in a population of Thoroughbred racehorses. Hypothesis: Lag screw fixation of dorsal cortical stress fractures is an effective surgical procedure allowing racehorses to return to their preoperative level of performance. Methods: The records of 116 racehorses (103 Thoroughbreds) admitted to Equine Medical Centre, California between 1986 and 2008 were assessed. Information obtained from medical records included subject details, limb(s) affected, fracture configuration, length of screw used in repair and presence of concurrent surgical procedures performed. Racing performance was evaluated relative to these factors using Fisher's exact test and nonparametric methods with a level of significance of P<0.05. Results: Of 92 Thoroughbred horses, 83% raced preoperatively and 83% raced postoperatively, with 63% having ≥5 starts. There was no statistically significant association between age, gender, limb affected, fracture configuration or presence of concurrent surgery and likelihood of racing postoperatively or of having 5 or more starts. The mean earnings per start and the performance index for the 3 races following surgery were lower compared to the 3 races prior to surgery; however, 29 and 45% of horses either improved or did not change their earnings per start and performance index, respectively. Conclusions and potential relevance: Data show that lag screw fixation is successful at restoring ability to race in horses suffering from dorsal cortical stress fractures. [source]


NONPARAMETRIC SURVEY RESPONSE ERRORS

INTERNATIONAL ECONOMIC REVIEW, Issue 4 2007
Rosa L. Matzkin
I present nonparametric methods to identify and estimate the biases associated with response errors. When applied to survey data, these methods can be used to analyze how observable and unobservable characteristics of the respondent, and characteristics of the design of the survey, affect errors in the responses. This provides a method to correct the biases that those errors generate, by using the estimated response errors to "undo" those biases. The results are useful also to design better surveys, since they point at characteristics of the design and of subpopulations of respondents that can provide identification of response errors. Several models are considered. [source]


Short Rate Dynamics and Regime Shifts

INTERNATIONAL REVIEW OF FINANCE, Issue 3 2009
HAITAO LI
ABSTRACT We characterize the dynamics of the US short-term interest rate using a Markov regime-switching model. Using a test developed by Garcia, we show that there are two regimes in the data: In one regime, the short rate behaves like a random walk with low volatility; in another regime, it exhibits strong mean reversion and high volatility. In our model, the sensitivity of interest rate volatility to the level of interest rate is much lower than what is commonly found in the literature. We also show that the findings of nonlinear drift in Aït-Sahalia and Stanton, using nonparametric methods, are consistent with our regime-switching model. [source]


Exact Small-Sample Differential Item Functioning Methods for Polytomous Items With Illustration Based on an Attitude Survey

JOURNAL OF EDUCATIONAL MEASUREMENT, Issue 4 2004
J. Patrick Meyer
Exact nonparametric procedures have been used to identify the level of differential item functioning (DIF) in binary items. This study explored the use of exact DIF procedures with items scored on a Likert scale. The results from an attitude survey suggest that the large-sample Cochran-Mantel-Haenszel (CMH) procedure identifies more items as statistically significant than two comparable exact nonparametric methods. This finding is consistent with previous findings; however, when items are classified in National Assessment of Educational Progress DIF categories, the results show that the CMH and its exact nonparametric counterparts produce almost identical classifications. Since DIF is often evaluated in terms of statistical and practical significance, this study provides evidence that the large-sample CMH procedure may be safely used even when the focal group has as few as 76 cases. [source]


Bivariate flood frequency analysis: Part 1. Determination of marginals by parametric and nonparametric techniques

JOURNAL OF FLOOD RISK MANAGEMENT, Issue 4 2008
Abstract In flood frequency analysis, a flood event is mainly characterized by peak flow, volume and duration. These three variables or characteristics of floods are random in nature and mutually correlated. In this article, an effort is made to find out appropriate marginal distribution of the flood characteristics considering a set of parametric and nonparametric distributions, and further mathematically model the correlated nature among them. A set of parametric distribution functions and nonparametric methods based on kernel density estimation and orthonormal series are used to determine the marginal distribution functions for peak flow, volume and duration. In conventional methods of flood frequency analysis, the marginal distribution functions of peak flow, volume and duration are assumed to follow some specific parametric distribution function. The present work performs a better selection of marginal distribution functions for flood characteristics as both parametric and nonparametric estimation procedures are extensively followed. The methodology is demonstrated with 70-year stream flow data of Red River at Grand Forks of North Dakota, USA. [source]
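
To make the kernel density idea concrete, here is a minimal Python sketch of a Gaussian kernel density estimate for a single marginal such as peak flow. The function name, the fixed bandwidth, and the sample values are assumptions made for illustration; the abstract does not state which kernel or bandwidth rule the authors used, and the orthonormal series estimator is not shown.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function x -> estimated density, using a Gaussian kernel."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump centred on each observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Hypothetical annual peak flows (arbitrary units) and a hand-picked bandwidth.
peak_flow_density = gaussian_kde([1.0, 2.0, 4.0], bandwidth=0.5)
value_at_two = peak_flow_density(2.0)
```

Unlike a fitted parametric family, the estimate takes its shape from the data, which is the flexibility the abstract is after; the price is a bandwidth that has to be chosen.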


Central limit theorems for nonparametric estimators with real-time random variables

JOURNAL OF TIME SERIES ANALYSIS, Issue 5 2010
Tae Yoon Kim
Primary 62G07; 62F12; Secondary 62M05. C13; C14. In this article, asymptotic theories for nonparametric methods are studied when they are applied to real-time data. In particular, we derive central limit theorems for nonparametric density and regression estimators. For this we formally introduce a sequence of real-time random variables indexed by a parameter related to fine gridding of the time domain (or fine discretization). Our results show that the impact of fine gridding is greater in the density estimation case, in the sense that strong dependence due to fine gridding severely affects the major strength of the nonparametric density estimator (its data-adaptive property). In addition, we discuss some issues about the nonparametric regression model with fine gridding of the time domain. [source]


Nonparametric confidence intervals for Tmax in sequence-stratified crossover studies

PHARMACEUTICAL STATISTICS: THE JOURNAL OF APPLIED STATISTICS IN THE PHARMACEUTICAL INDUSTRY, Issue 1 2008
Susan A. Willavize
Abstract Tmax is the time associated with the maximum serum or plasma drug concentration achieved following a dose. While Tmax is continuous in theory, it is usually discrete in practice because it is equated to a nominal sampling time in the noncompartmental pharmacokinetics approach. For a 2-treatment crossover design, a Hodges–Lehmann method exists for a confidence interval on treatment differences. For appropriately designed crossover studies with more than two treatments, a new median-scaling method is proposed to obtain estimates and confidence intervals for treatment effects. A simulation study was done comparing this new method with two previously described rank-based nonparametric methods, a stratified ranks method and a signed ranks method due to Ohrvik. The Normal theory, a nonparametric confidence interval approach without adjustment for periods, and a nonparametric bootstrap method were also compared. Results show that less dense sampling and period effects cause increases in confidence interval length. The Normal theory method can be liberal (i.e. less than nominal coverage) if there is a true treatment effect. The nonparametric methods tend to be conservative with regard to coverage probability and among them the median-scaling method is least conservative and has shortest confidence intervals. The stratified ranks method was the most conservative and had very long confidence intervals. The bootstrap method was generally less conservative than the median-scaling method, but it tended to have longer confidence intervals. Overall, the median-scaling method had the best combination of coverage and confidence interval length. All methods performed adequately with respect to bias. Copyright © 2007 John Wiley & Sons, Ltd. [source]
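
For orientation, the Hodges–Lehmann point estimate mentioned for the 2-treatment case can be sketched in Python as the median of all Walsh averages of within-subject Tmax differences. This is a simplified one-sample version that assumes no period effect; it is not the median-scaling method proposed in the paper, and the function name and data are hypothetical.

```python
from statistics import median

def hodges_lehmann(differences):
    """Median of all Walsh averages (d_i + d_j) / 2 taken over i <= j."""
    n = len(differences)
    walsh = [
        (differences[i] + differences[j]) / 2
        for i in range(n)
        for j in range(i, n)
    ]
    return median(walsh)

# Hypothetical within-subject Tmax differences (hours), test minus reference.
estimate = hodges_lehmann([0.5, 1.0, -0.5, 2.0])
```

Because Tmax is recorded at nominal sampling times, the differences are heavily tied; averaging pairs of differences yields a finer grid of candidate values than the raw differences alone.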


Nonparametric covariate adjustment for receiver operating characteristic curves

THE CANADIAN JOURNAL OF STATISTICS, Issue 1 2010
Fang Yao
Abstract The accuracy of a diagnostic test is typically characterized using the receiver operating characteristic (ROC) curve. Summarizing indexes such as the area under the ROC curve (AUC) are used to compare different tests as well as to measure the difference between two populations. Often additional information is available on some of the covariates which are known to influence the accuracy of such measures. The authors propose nonparametric methods for covariate adjustment of the AUC. Models with normal errors and possibly non-normal errors are discussed and analyzed separately. Nonparametric regression is used for estimating mean and variance functions in both scenarios. In the model that relaxes the assumption of normality, the authors propose a covariate-adjusted Mann–Whitney estimator for AUC estimation which effectively uses available data to construct working samples at any covariate value of interest and is computationally efficient for implementation. This provides a generalization of the Mann–Whitney approach for comparing two populations by taking covariate effects into account. The authors derive asymptotic properties for the AUC estimators in both settings, including asymptotic normality, optimal strong uniform convergence rates and mean squared error (MSE) consistency. The MSE of the AUC estimators was also assessed in smaller samples by simulation. Data from an agricultural study were used to illustrate the methods of analysis. The Canadian Journal of Statistics 38: 27–46; 2010 © 2009 Statistical Society of Canada [source]
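
The unadjusted Mann–Whitney estimator that the paper generalizes can be sketched in a few lines of Python: the AUC is estimated as the proportion of (diseased, healthy) pairs in which the diseased subject has the higher test value, with ties counted as one half. The function name and the data are hypothetical, and no covariate adjustment is attempted here.

```python
def mann_whitney_auc(diseased, healthy):
    """Estimate P(X > Y) + 0.5 * P(X = Y) over all diseased/healthy pairs."""
    total = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                total += 1.0
            elif x == y:
                total += 0.5  # a tie counts as half a concordant pair
    return total / (len(diseased) * len(healthy))

# Hypothetical test values for three diseased and three healthy subjects.
auc = mann_whitney_auc([3, 5, 4], [1, 4, 2])
```

A covariate-adjusted version would, per the abstract, build working samples at a chosen covariate value before applying this same pairwise comparison.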


Misprescription and misuse of one-tailed tests

AUSTRAL ECOLOGY, Issue 4 2009
CELIA M. LOMBARDI
Abstract One-tailed statistical tests are often used in ecology, animal behaviour and in most other fields in the biological and social sciences. Here we review the frequency of their use in the 1989 and 2005 volumes of two journals (Animal Behaviour and Oecologia), their advantages and disadvantages, the extensive erroneous advice on them in both older and modern statistics texts and their utility in certain narrow areas of applied research. Of those articles with data sets susceptible to one-tailed tests, at least 24% in Animal Behaviour and at least 13% in Oecologia used one-tailed tests at least once. They were used 35% more frequently with nonparametric methods than with parametric ones and about twice as often in 1989 as in 2005. Debate in the psychological literature of the 1950s established the logical criterion that one-tailed tests should be restricted to situations where there is interest only in results in one direction. 'Interest' should be defined, however, in terms of collective or societal interest and not by the individual investigator. By this 'collective interest' criterion, all uses of one-tailed tests in the journals surveyed seem invalid. In his book Nonparametric Statistics, S. Siegel unrelentingly suggested the use of one-tailed tests whenever the investigator predicts the direction of a result. That work has been a major proximate source of confusion on this issue, but so are most recent statistics textbooks. The utility of one-tailed tests in research aimed at obtaining regulatory approval of new drugs and new pesticides is briefly described, to exemplify the narrow range of research situations where such tests can be appropriate. These situations are characterized by null hypotheses stating that the difference or effect size does not exceed, or is at least as great as, some 'amount of practical interest'. One-tailed tests rarely should be used for basic or applied research in ecology, animal behaviour or any other science. [source]


Bayesian nonparametric hierarchical modeling

BIOMETRICAL JOURNAL, Issue 2 2009
David B. Dunson
Abstract In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods. [source]


Semiparametric Models for Cumulative Incidence Functions

BIOMETRICS, Issue 1 2004
John Bryant
Summary. In analyses of time-to-failure data with competing risks, cumulative incidence functions may be used to estimate the time-dependent cumulative probability of failure due to specific causes. These functions are commonly estimated using nonparametric methods, but in cases where events due to the cause of primary interest are infrequent relative to other modes of failure, nonparametric methods may result in rather imprecise estimates for the corresponding subdistribution. In such cases, it may be possible to model the cause-specific hazard of primary interest parametrically, while accounting for the other modes of failure using nonparametric estimators. The cumulative incidence estimators so obtained are simple to compute and are considerably more efficient than the usual nonparametric estimator, particularly with regard to interpolation of cumulative incidence at early or intermediate time points within the range of data used to fit the function. More surprisingly, they are often nearly as efficient as fully parametric estimators. We illustrate the utility of this approach in the analysis of patients treated for early stage breast cancer. [source]


Preliminary testing for normality: some statistical aspects of a common concept

CLINICAL & EXPERIMENTAL DERMATOLOGY, Issue 6 2006
V. Schoder
Summary Background. Statistical methodology has become an increasingly important topic in dermatological research. Adequacy of the statistical procedure depends, among other things, on distributional assumptions. In dermatological articles, the choice between parametric and nonparametric methods is often based on preliminary goodness-of-fit tests. Aim. For the special case of the assumption of normally distributed data, the Kolmogorov–Smirnov test is the most popular choice. We investigated the performance of this test on four types of non-normal data, representing the majority of real data in dermatological research. Methods. Simulations were run to assess the performance of the Kolmogorov–Smirnov test, depending on sample size and severity of violations of normality. Results. The Kolmogorov–Smirnov test performs badly on data with single outliers, 10% outliers and skewed data at sample sizes < 100, whereas normality is rejected to an acceptable degree for Likert-type data. Conclusion. Preliminary testing for normality is not recommended for small-to-moderate sample sizes. [source]
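
For readers unfamiliar with the test under discussion, the Kolmogorov–Smirnov statistic against a normal distribution can be sketched in Python as the largest gap between the empirical distribution function and a normal CDF. The sketch assumes the mean and standard deviation are estimated from the same sample, which is common in practice but means the standard Kolmogorov–Smirnov critical values no longer apply exactly; no p-value is computed. The function names and data are hypothetical.

```python
import math
from statistics import mean, stdev

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def ks_statistic_normal(sample):
    """Largest distance between the empirical CDF and the fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    mu, sigma = mean(xs), stdev(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = normal_cdf(x, mu, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical measurements, small on purpose to mirror the abstract's concern.
d_stat = ks_statistic_normal([-1.0, 0.0, 1.0])
```

With only a handful of observations the statistic can take few distinct values, which gives some intuition for why the abstract finds the test unreliable at small-to-moderate sample sizes.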