Error Distribution
Selected Abstracts

BOOTSTRAP TESTS FOR THE ERROR DISTRIBUTION IN LINEAR AND NONPARAMETRIC REGRESSION MODELS
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 2, 2006. Natalie Neumeyer.
In this paper we investigate several tests for the hypothesis of a parametric form of the error distribution in the common linear and non-parametric regression model, which are based on empirical processes of residuals. It is well known that tests in this context are not asymptotically distribution-free, and the parametric bootstrap is applied to deal with this problem. The performance of the resulting bootstrap test is investigated from an asymptotic point of view and by means of a simulation study. The results demonstrate that even for moderate sample sizes the parametric bootstrap provides a reliable and easily accessible solution to the problem of goodness-of-fit testing of assumptions regarding the error distribution in linear and non-parametric regression models. [source]

Range Unit-Root (RUR) Tests: Robust against Nonlinearities, Error Distributions, Structural Breaks and Outliers
JOURNAL OF TIME SERIES ANALYSIS, Issue 4, 2006. Felipe Aparicio.
Since the seminal paper by Dickey and Fuller in 1979, unit-root tests have conditioned the standard approaches to analysing time series with strong serial dependence in mean behaviour, the focus being placed on the detection of eventual unit roots in an autoregressive model fitted to the series. In this paper, we propose a completely different method to test for the type of long-wave patterns observed not only in unit-root time series but also in series following more complex data-generating mechanisms. To this end, our testing device analyses the unit-root persistence exhibited by the data while imposing very few constraints on the generating mechanism. We call our device the range unit-root (RUR) test since it is constructed from the running ranges of the series, from which we derive its limit distribution. These nonparametric statistics endow the test with a number of desirable properties: invariance to monotonic transformations of the series and robustness to the presence of important parameter shifts. Moreover, the RUR test outperforms the power of standard unit-root tests on near-unit-root stationary time series; it is invariant with respect to the innovations distribution and asymptotically immune to noise. An extension of the RUR test, called the forward-backward range unit-root (FB-RUR) test, improves the check in the presence of additive outliers. Finally, we illustrate the performance of both range tests and their discrepancies with the Dickey-Fuller unit-root test on exchange rate series. [source]
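The running-range construction behind the RUR test is easy to illustrate. The sketch below is a simplified stand-in, not the authors' implementation: the sqrt(n) scaling is a placeholder and no decision rule is given, since the critical values come from the limit distribution derived in the paper. It only shows how often the running range of a series expands and the invariance to monotonic transformations.

```python
import numpy as np

def running_range(y):
    """Range of y[0..t] (running max minus running min) at each time t."""
    return np.maximum.accumulate(y) - np.minimum.accumulate(y)

def range_expansions(y):
    """Count how often the running range grows, scaled by sqrt(n) (placeholder scaling)."""
    r = running_range(y)
    return np.sum(np.diff(r) > 0) / np.sqrt(len(y))

rng = np.random.default_rng(0)
rw = np.cumsum(rng.normal(size=1000))          # unit-root series (random walk)
ar = np.zeros(1000)
for t in range(1, 1000):                       # stationary AR(1) for comparison
    ar[t] = 0.5 * ar[t - 1] + rng.normal()

print("random walk:", range_expansions(rw), " AR(1):", range_expansions(ar))
# Invariance to monotonic transformations: new extremes occur at the same times.
print(range_expansions(np.exp(rw / rw.std())) == range_expansions(rw))   # True
```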
A framework for quad/triangle subdivision surface fitting: Application to mechanical objects
COMPUTER GRAPHICS FORUM, Issue 1, 2007. Guillaume Lavoué.
In this paper we present a new framework for subdivision surface approximation of three-dimensional models represented by polygonal meshes. Our approach, particularly suited for mechanical or Computer Aided Design (CAD) parts, produces a mixed quadrangle-triangle control mesh, optimized in terms of face and vertex numbers while remaining independent of the connectivity of the input mesh. Our algorithm begins with a decomposition of the object into surface patches. The main idea is to approximate the region boundaries first and then the interior data. Thus, for each patch, a first step approximates the boundaries with subdivision curves (associated with control polygons) and creates an initial subdivision surface by linking the boundary control points with respect to the lines of curvature of the target surface. Then, a second step optimizes the initial subdivision surface by iteratively moving control points and enriching regions according to the error distribution. The final control mesh defining the whole model is then created by assembling all the local subdivision control meshes. This control polyhedron is much more compact than the original mesh and visually represents the same shape after several subdivision steps; hence it is particularly suitable for compression and visualization tasks. Experiments conducted on several mechanical models have proven the coherency and the efficiency of our algorithm, compared with existing methods. [source]

Gamma regression improves Haseman-Elston and variance components linkage analysis for sib-pairs
GENETIC EPIDEMIOLOGY, Issue 2, 2004. Mathew J. Barber.
Existing standard methods of linkage analysis for quantitative phenotypes rest on the assumptions of either ordinary least squares (Haseman and Elston [1972] Behav. Genet. 2:3-19; Sham and Purcell [2001] Am. J. Hum. Genet. 68:1527-1532) or phenotypic normality (Almasy and Blangero [1998] Am. J. Hum. Genet. 68:1198-1199; Kruglyak and Lander [1995] Am. J. Hum. Genet. 57:439-454). The limitations of both these methods lie in the specification of the error distribution in the respective regression analyses. In ordinary least squares regression, the residual distribution is misspecified as being independent of the mean level. Using variance components and assuming phenotypic normality, the dependency on the mean level is correctly specified, but the remaining residual coefficient of variation is constrained a priori. Here it is shown that these limitations can be addressed (for a sample of unselected sib-pairs) using a generalized linear model based on the gamma distribution, which can be readily implemented in any standard statistical software package. The generalized linear model approach can emulate variance components when phenotypic multivariate normality is assumed (Almasy and Blangero [1998] Am. J. Hum. Genet. 68:1198-1211) and is therefore more powerful than ordinary least squares, but has the added advantage of being robust to deviations from multivariate normality and provides (often overlooked) model-fit diagnostics for linkage analysis. Genet Epidemiol 26:97-107, 2004. © 2004 Wiley-Liss, Inc. [source]
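The gamma GLM the abstract advocates can indeed be fitted in standard software. Below is a minimal sketch on simulated sib-pair data; the variable names (ibd, sqdiff), the simulated effect sizes and the log link are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 500
ibd = rng.choice([0.0, 0.5, 1.0], size=n)             # proportion of alleles shared IBD
true_mean = np.exp(1.0 - 0.8 * ibd)                   # mean squared trait difference (toy linkage effect)
sqdiff = rng.gamma(shape=2.0, scale=true_mean / 2.0)  # gamma response: variance scales with mean^2

X = sm.add_constant(ibd)
fit = sm.GLM(sqdiff, X, family=sm.families.Gamma(link=sm.families.links.Log())).fit()
print(fit.summary())   # a negative IBD coefficient is the linkage signal

# Unlike OLS (constant residual variance) or a normal model, the gamma family
# lets the residual variance grow with the squared mean, as the abstract argues.
```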
How much confidence should we place in efficiency estimates?
HEALTH ECONOMICS, Issue 11, 2003. Andrew Street.
Ordinary least squares (OLS) and stochastic frontier (SF) analyses are commonly used to estimate industry-level and firm-specific efficiency. Using cross-sectional data for English public hospitals, a total cost function based on a specification developed by the English Department of Health is estimated. Confidence intervals are calculated around the OLS residuals and around the inefficiency component of the SF residuals. Sensitivity analysis is conducted to assess whether conclusions about relative performance are robust to choices of error distribution, functional form and model specification. It is concluded that estimates of relative hospital efficiency are sensitive to estimation decisions and that little confidence can be placed in the point estimates for individual hospitals. The use of these techniques to set annual performance targets should be avoided. Copyright © 2002 John Wiley & Sons, Ltd. [source]

Forecasting financial volatility of the Athens stock exchange daily returns: an application of the asymmetric normal mixture GARCH model
INTERNATIONAL JOURNAL OF FINANCE & ECONOMICS, Issue 4, 2010. Anastassios A. Drakos.
In this paper we model the return volatility of stocks traded in the Athens Stock Exchange using alternative GARCH models. We employ daily data for the period January 1998 to November 2008, allowing us to capture possible positive and negative effects that may be due to either contagion or idiosyncratic sources. The econometric analysis is based on the estimation of a class of five GARCH models under alternative assumptions with respect to the error distribution. The main findings of our analysis are as follows. First, based on a battery of diagnostic tests, it is shown that the normal mixture asymmetric GARCH (NM-AGARCH) models perform better in modeling the volatility of stock returns. Second, it is shown with the use of Kupiec's tests for in-sample and out-of-sample forecasting performance that the evidence is mixed, as the choice of the appropriate volatility model depends on the trading position under consideration. Third, at the 99% confidence level the NM-AGARCH model with a skewed Student distribution outperforms all other competing models for both in-sample and out-of-sample forecasting performance. This increase in predictive performance at higher confidence levels makes the NM-AGARCH model with a skewed Student distribution consistent with the requirements of the Basel II agreement. Copyright © 2010 John Wiley & Sons, Ltd. [source]

Political Cost Incentives for Managing the Property-Liability Insurer Loss Reserve
JOURNAL OF ACCOUNTING RESEARCH, Issue 1, 2010. Martin F. Grace.
This paper examines the effect of rate regulation on the management of the property-liability insurer loss reserve. The political cost hypothesis predicts that managers make accounting choices to reduce wealth transfers resulting from the regulatory process. Managers may understate reserves to justify lower rates to regulators. Alternatively, managers may have an incentive to report loss-inflating discretionary reserves to reduce the cost of regulatory rate suppression. We find insurers overstate reserves in the presence of stringent rate regulation. Investigating the impact along the conditional reserve error distribution, we discover that a majority of the response occurs from under-reserving firms under-reserving less because of stringent rate regulation. [source]

Using the t-distribution to improve the absolute structure assignment with likelihood calculations
JOURNAL OF APPLIED CRYSTALLOGRAPHY, Issue 4, 2010. Rob W. W. Hooft.
The previously described method for absolute structure determination [Hooft, Straver & Spek (2008). J. Appl. Cryst. 41, 96-103] assumes a Gaussian error distribution. The method is now extended to make it robust against poor data with large systematic errors by introducing the Student t-distribution. It is shown that this modification makes very little difference for good data but dramatically improves results for data with a non-Gaussian error distribution. [source]
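The robustness argument for swapping a Gaussian error model for a Student t model can be seen in a generic location fit. The sketch below is not the crystallographic likelihood of Hooft et al., just an illustration of how a heavy-tailed likelihood resists a few large systematic errors; the df=3 choice and the simulated data are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm, t

rng = np.random.default_rng(2)
# 95 well-behaved observations plus 5 with large systematic errors
data = np.concatenate([rng.normal(0.0, 1.0, 95), rng.normal(8.0, 1.0, 5)])

def fit_location(logpdf):
    """Maximum-likelihood location estimate under the given error log-density."""
    res = minimize_scalar(lambda mu: -logpdf(data - mu).sum(),
                          bounds=(-5.0, 10.0), method="bounded")
    return res.x

mu_gauss = fit_location(lambda r: norm.logpdf(r, scale=1.0))
mu_t = fit_location(lambda r: t.logpdf(r, df=3, scale=1.0))
print(f"Gaussian estimate {mu_gauss:.2f}, Student-t estimate {mu_t:.2f}")
# The Gaussian fit is dragged towards the outliers; the heavy-tailed t fit is not.
```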
Autoindexing with outlier rejection and identification of superimposed lattices
JOURNAL OF APPLIED CRYSTALLOGRAPHY, Issue 3, 2010. Nicholas K. Sauter.
Constructing a model lattice to fit the observed Bragg diffraction pattern is straightforward for perfect samples, but indexing can be challenging when artifacts are present, such as poorly shaped spots, split crystals giving multiple closely aligned lattices and outright superposition of patterns from aggregated microcrystals. To optimize the lattice model against marginal data, refinement can be performed using a subset of the observations from which the poorly fitting spots have been discarded. Outliers are identified by assuming a Gaussian error distribution for the best-fitting spots; points diverging from this distribution are culled. The set of remaining observations produces a superior lattice model, while the rejected observations can be used to identify a second crystal lattice, if one is present. The prevalence of outliers provides a potentially useful measure of sample quality. The described procedures are implemented for macromolecular crystallography within the autoindexing program labelit.index (http://cci.lbl.gov/labelit). [source]
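A schematic version of the rejection step described above: treat the best-fitting residuals as the Gaussian core and cull the points that diverge from it. The core fraction and the 3-sigma cut are placeholders, the refit against the retained spots is omitted, and labelit.index's actual criteria may differ.

```python
import numpy as np

def cull_outliers(residuals, n_sigma=3.0, core_fraction=0.5):
    """Fit a Gaussian scale to the best-fitting half of the residuals and
    flag everything beyond n_sigma of that core as an outlier."""
    r = np.asarray(residuals, dtype=float)
    order = np.argsort(np.abs(r))
    core = r[order[: int(core_fraction * len(r))]]   # best-fitting spots
    sigma = max(core.std(ddof=1), 1e-9)
    return np.abs(r) <= n_sigma * sigma

rng = np.random.default_rng(3)
res = np.concatenate([rng.normal(0.0, 0.05, 300),    # spots on the dominant lattice
                      rng.normal(0.6, 0.10, 30)])    # spots from a superimposed lattice
keep = cull_outliers(res)
print(f"kept {keep.sum()} of {len(res)} spots; "
      f"the {np.sum(~keep)} rejected spots could seed a second lattice search")
```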
Predicting habitat distribution and frequency from plant species co-occurrence data
JOURNAL OF BIOGEOGRAPHY, Issue 6, 2007. Christine Römermann.
Aim: Species frequency data have been widely used in nature conservation to aid management decisions. To determine species frequencies, information on habitat occurrence is important: a species with a low frequency is not necessarily rare if it occupies all suitable habitats. Often, information on habitat distribution is available for small geographic areas only. We aim to predict grid-based habitat occurrence from grid-based plant species distribution data in a meso-scale analysis. Location: The study was carried out over two spatial extents: Germany and Bavaria. Methods: Two simple models were set up to examine the number of characteristic plant species needed per grid cell to predict the occurrence of four selected habitats (species data from FlorKart, http://www.floraweb.de). Both models were calibrated in Bavaria using available information on habitat distribution, validated for other federal states, and applied to Germany. First, a spatially explicit regression model (a generalized linear model (GLM) with an assumed binomial error distribution of the response variable) was obtained. Second, a spatially independent optimization model was derived that estimated species numbers without using spatial information on habitat distribution. Finally, an additional uncalibrated model was derived that calculated the frequencies of 24 habitats. It was validated using NATURA2000 habitat maps. Results: Using the Bavarian models it was possible to predict habitat distribution and frequency from the co-occurrence of habitat-specific species per grid cell. As the model validations for other German federal states were successful, the models were applied to all of Germany, and habitat distribution and frequencies could be retrieved at the national scale on the basis of habitat-specific species co-occurrences per grid cell. Using the third, uncalibrated model, which includes species distribution data only, it was possible to predict the frequencies of 24 habitats based on the co-occurrence of 24% of formation-specific species per grid cell. Predicted habitat frequencies deduced from this third model were strongly related to frequencies of NATURA2000 habitat maps. Main conclusions: It was concluded that it is possible to deduce habitat distributions and frequencies from the co-occurrence of habitat-specific species. For areas partly covered by habitat mappings, calibrated models can be developed and extrapolated to larger areas. If information on habitat distribution is completely lacking, uncalibrated models can still be applied, providing coarse information on habitat frequencies. Predicted habitat distributions and frequencies can be used as a tool in nature conservation, for example as correction factors for species frequencies, as long as the species of interest is not included in the model set-up. [source]

Volatility forecasting with double Markov switching GARCH models
JOURNAL OF FORECASTING, Issue 8, 2009. Cathy W. S. Chen.
This paper investigates inference and volatility forecasting using a Markov switching heteroscedastic model with a fat-tailed error distribution to analyze asymmetric effects on both the conditional mean and conditional volatility of financial time series. The motivation for extending the Markov switching GARCH model, previously developed to capture mean asymmetry, is that the switching variable, assumed to be a first-order Markov process, is unobserved. The proposed model extends this work to incorporate Markov switching in the mean and variance simultaneously. Parameter estimation and inference are performed in a Bayesian framework via a Markov chain Monte Carlo scheme. We compare competing models using Bayesian forecasting in a comparative value-at-risk study. The proposed methods are illustrated using both simulations and eight international stock market return series. The results generally favor the proposed double Markov switching GARCH model with an exogenous variable. Copyright © 2008 John Wiley & Sons, Ltd. [source]

Optimal sampling frequency for volatility forecast models for the Indian stock markets
JOURNAL OF FORECASTING, Issue 1, 2009. Malay Bhattacharyya.
This paper evaluates the performance of conditional variance models using high-frequency data of the National Stock Index (S&P CNX NIFTY) and attempts to determine the optimal sampling frequency for the best daily volatility forecast. A linear combination of the realized volatilities calculated at two different frequencies is used as a benchmark to evaluate the volatility forecasting ability of the conditional variance models (GARCH(1,1)) at different sampling frequencies. From the analysis, it is found that sampling at 30 minutes gives the best forecast for daily volatility. The forecasting ability of these models is deteriorated, however, by the non-normality of mean-adjusted returns, which violates an assumption of conditional variance models. Nevertheless, the optimum frequency remained the same even in the case of different models (EGARCH and PARCH) and a different error distribution (the generalized error distribution, GED), where the error is reduced to a certain extent by incorporating the asymmetric effect on volatility. Our analysis also suggests that GARCH models with GED innovations, or EGARCH and PARCH models, would give better estimates of volatility with lower forecast error estimates. Copyright © 2008 John Wiley & Sons, Ltd. [source]
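The realized-volatility benchmark is simple to compute once intraday returns are available: aggregate the returns to a chosen bar length and sum their squares within the day. The sketch below uses simulated one-minute returns; the 30-minute optimum reported above is an empirical finding for that market, not something this toy example reproduces.

```python
import numpy as np

rng = np.random.default_rng(4)
minute_returns = rng.normal(0.0, 4e-4, size=390)    # one simulated trading day of 1-minute returns

def realized_variance(returns, minutes_per_bar):
    """Sum of squared intraday returns aggregated to the given bar length."""
    bars = len(returns) // minutes_per_bar
    r = returns[: bars * minutes_per_bar].reshape(bars, minutes_per_bar).sum(axis=1)
    return np.sum(r ** 2)

for m in (5, 30, 65):
    print(f"{m:>3}-minute sampling: realized variance {realized_variance(minute_returns, m):.3e}")
```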
A Bayesian threshold nonlinearity test for financial time series
JOURNAL OF FORECASTING, Issue 1, 2005. Mike K. P. So.
We propose in this paper a threshold nonlinearity test for financial time series. Our approach adopts reversible-jump Markov chain Monte Carlo methods to calculate the posterior probabilities of two competing models, namely GARCH and threshold GARCH models. Posterior evidence favouring the threshold GARCH model indicates threshold nonlinearity or volatility asymmetry. Simulation experiments demonstrate that our method works very well in distinguishing GARCH and threshold GARCH models. Sensitivity analysis shows that our method is robust to misspecification of the error distribution. In the application to 10 market indexes, clear evidence of threshold nonlinearity is discovered, thus supporting volatility asymmetry. Copyright © 2005 John Wiley & Sons, Ltd. [source]

Exact mass measurement on an electrospray ionization time-of-flight mass spectrometer: error distribution and selective averaging
JOURNAL OF MASS SPECTROMETRY (INCORP BIOLOGICAL MASS SPECTROMETRY), Issue 10, 2003. Jiejun Wu.
An automated, accurate and reliable way of acquiring and processing flow injection data for exact mass measurement using a bench-top electrospray ionization time-of-flight (ESI-TOF) mass spectrometer is described. Using Visual Basic programs, individual scans were selected objectively with restrictions on ion counts per second for both the compound of interest and the mass reference peaks. The selected 'good scans' were then subjected to two different data-processing schemes ('combine-then-center' and 'center-then-average'), and the results were compared at various ion count limit settings. It was found that, in general, the average of mass values from individual scans is more accurate than the centroid mass value of the combined (same) scans. In order to acquire a large number of good scans in one injection (to increase the sampling size for statistically valid averaging), an on-line dilution chamber was added to slow down the typically rapid mass chromatographic peak decay in flow-injection analysis. This simple addition worked well in automation without the need for manual sample dilution. In addition, by dissolving the reference compound directly into the mobile phase, manual syringe filling can be eliminated. Twenty-seven samples were analyzed with the new acquisition and processing routines in positive electrospray ionization mode. For the best method found, the percentage of samples with RMS error less than 5 ppm was 100% with repetitive-injection data (6 injections per sample), and 95% with single-injection data. Afterwards, 31 other test samples were run (with MW ranging from 310 to 3493 Da, 21 samples in ESI+ and 10 in ESI− mode), processed with similar parameters, and 100% of them were also mass-calculated to RMS error less than 5 ppm. Copyright © 2003 John Wiley & Sons, Ltd. [source]
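The ppm error metric and the scan-selection idea can be sketched on made-up data. Note that true 'combine-then-center' operates on the summed spectra themselves; the intensity-weighted average below is only a rough proxy, and the compound mass, threshold and noise levels are invented.

```python
import numpy as np

theoretical_mass = 500.2000                                     # Da, hypothetical compound
rng = np.random.default_rng(5)
scan_mass = theoretical_mass * (1 + rng.normal(0.0, 3e-6, 40))  # per-scan centroid masses (~3 ppm noise)
scan_counts = rng.integers(50, 2000, size=40)                   # ion counts per scan

good = scan_counts > 300                                        # 'good scan' selection by ion count
center_then_average = scan_mass[good].mean()
combine_then_center = np.average(scan_mass[good], weights=scan_counts[good])  # rough proxy only

def ppm(mass):
    return (mass - theoretical_mass) / theoretical_mass * 1e6

print(f"center-then-average error: {ppm(center_then_average):+.2f} ppm")
print(f"combine-then-center proxy error: {ppm(combine_then_center):+.2f} ppm")
```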
Wavelet-based adaptive robust M-estimator for nonlinear system identification
AICHE JOURNAL, Issue 8, 2000. D. Wang.
A wavelet-based robust M-estimation method for the identification of nonlinear systems is proposed. Because it is not based on an assumed class of error distributions, it takes a flexible, nonparametric approach and has the advantage of directly estimating the error distribution from the data. This M-estimator is optimal over any error distribution in the sense of maximum likelihood estimation. A Monte Carlo study on a nonlinear chemical engineering example was used to compare the results with various previously utilized methods. [source]

High Moment Partial Sum Processes of Residuals in ARMA Models and their Applications
JOURNAL OF TIME SERIES ANALYSIS, Issue 1, 2007. Hao Yu.
In this article, we study high moment partial sum processes based on residuals of a stationary autoregressive moving average (ARMA) model with known or unknown mean parameter. We show that they can be approximated in probability by the analogous processes obtained from the i.i.d. errors of the ARMA model. However, if an unknown mean parameter is used, there will be an additional term that depends on the model parameters and a mean estimator. When properly normalized, this additional term will vanish. Thus the processes converge weakly to the same Gaussian processes as if the residuals were i.i.d. Applications to change-point problems and goodness-of-fit are considered, in particular cumulative sum statistics for testing ARMA model structure changes and the Jarque-Bera omnibus statistic for testing normality of the unobservable error distribution of an ARMA model. [source]

Prediction and nonparametric estimation for time series with heavy tails
JOURNAL OF TIME SERIES ANALYSIS, Issue 3, 2002. Peter Hall.
Motivated by prediction problems for time series with heavy-tailed marginal distributions, we consider methods based on 'local least absolute deviations' for estimating a regression median from dependent data. Unlike more conventional 'local median' methods, which are in effect based on locally fitting a polynomial of degree 0, techniques founded on local least absolute deviations have quadratic bias right up to the boundary of the design interval. Also, in contrast to local least-squares methods based on linear fits, the order of magnitude of the variance does not depend on the tail-weight of the error distribution. To make these points clear, we develop theory describing local applications to time series of both least-squares and least-absolute-deviations methods, showing for example that, in the case of heavy-tailed data, the conventional local-linear least-squares estimator suffers from an additional bias term as well as increased variance. [source]

Adaptive finite element procedures for elastoplastic problems at finite strains
PROCEEDINGS IN APPLIED MATHEMATICS & MECHANICS, Issue 1, 2003. A. Koch.
A major difficulty in the context of adaptive analysis of geometrically nonlinear problems is to provide a robust remeshing procedure that accounts both for the error caused by the spatial discretization and for the error due to the time discretization. For stability problems, such as strain localization and necking, it is essential to provide step-size control in order to obtain a robust algorithm for the solution of the boundary value problem. For this purpose we developed an easy-to-implement step-size control algorithm. In addition, we consider possible a posteriori error indicators for the spatial error distribution of elastoplastic problems at finite strains. This indicator is adopted for a density-function-based adaptive remeshing procedure. Both error indicators are combined for the adaptive analysis in time and space. The performance of the proposed method is documented by means of representative numerical examples. [source]
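A generic load-step control loop of the kind the abstract mentions might look like the following sketch; solve_increment is a dummy stand-in for a Newton solve that returns an error indicator, and all tolerances and growth factors are placeholders rather than the authors' algorithm.

```python
def solve_increment(load_start, load_end):
    """Dummy Newton solve: returns (converged, error_indicator) for one increment."""
    step = load_end - load_start
    return step <= 0.30, step          # pretend that large increments fail to converge

def march(load_total=1.0, dt0=0.5, dt_min=1e-3, err_tol=0.25):
    load, dt, history = 0.0, dt0, []
    while load < load_total and dt > dt_min:
        converged, err = solve_increment(load, min(load + dt, load_total))
        if converged and err <= err_tol:
            load = min(load + dt, load_total)   # accept the increment
            history.append((load, dt))
            dt *= 1.5                           # grow the step after an easy increment
        else:
            dt *= 0.5                           # cut the step and retry
    return history

for load, dt in march():
    print(f"accepted increment up to load {load:.3f} with step size {dt:.3f}")
```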
Goodness-of-fit tests for parametric models in censored regression
THE CANADIAN JOURNAL OF STATISTICS, Issue 2, 2007. Juan Carlos Pardo-Fernández.
The authors propose a goodness-of-fit test for parametric regression models when the response variable is right-censored. Their test compares an estimation of the error distribution based on parametric residuals to another estimation relying on nonparametric residuals. They call on a bootstrap mechanism in order to approximate the critical values of tests based on Kolmogorov-Smirnov and Cramér-von Mises type statistics. They also present the results of Monte Carlo simulations and use data from a study about quasars to illustrate their work. [source]

On describing multivariate skewed distributions: A directional approach
THE CANADIAN JOURNAL OF STATISTICS, Issue 3, 2006. José T. A. S. Ferreira.
Most multivariate measures of skewness in the literature measure the overall skewness of a distribution. These measures were designed for testing the hypothesis of distributional symmetry; their relevance for describing skewed distributions is less obvious. In this article, the authors consider the problem of characterizing the skewness of multivariate distributions. They define directional skewness as the skewness along a direction and analyze two parametric classes of skewed distributions using measures based on directional skewness. The analysis brings further insight into the classes, allowing for a more informed selection of classes of distributions for particular applications. The authors use the concept of directional skewness twice in the context of Bayesian linear regression under skewed error: first in the elicitation of a prior on the parameters of the error distribution, and then in the analysis of the skewness of the posterior distribution of the regression residuals. [source]
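Directional skewness has a very concrete reading: project the data onto a unit vector and compute the univariate skewness of the projection. The sketch below scans random directions on simulated data; it is a crude stand-in for the measures analysed in the paper, not their definition.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(6)
x = np.column_stack([rng.gamma(2.0, 1.0, 2000),      # strongly skewed coordinate
                     rng.normal(0.0, 1.0, 2000)])    # symmetric coordinate

def directional_skewness(data, direction):
    """Sample skewness of the data projected onto a unit vector."""
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    return skew(data @ u)

directions = rng.normal(size=(500, 2))
skews = np.array([directional_skewness(x, d) for d in directions])
worst = directions[np.argmax(np.abs(skews))]
print(f"max |directional skewness| = {np.abs(skews).max():.2f} "
      f"along direction {worst / np.linalg.norm(worst)}")
```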
Probabilistic forecasting from ensemble prediction systems: Improving upon the best-member method by using a different weight and dressing kernel for each member
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 617, 2006. Vincent Fortin.
Ensembles of meteorological forecasts can both provide more accurate long-term forecasts and help assess the uncertainty of these forecasts. No single method has however emerged to obtain large numbers of equiprobable scenarios from such ensembles. A simple resampling scheme, the 'best member' method, has recently been proposed to this effect: individual members of an ensemble are 'dressed' with error patterns drawn from a database of past errors made by the 'best' member of the ensemble at each time step. It has been shown that the best-member method can lead to both underdispersive and overdispersive ensembles. The error patterns can be rescaled so as to obtain ensembles which display the desired variance. However, this approach fails in cases where the undressed ensemble members are already overdispersive. Furthermore, we show in this paper that it can also lead to an overestimation of the probability of extreme events. We propose to overcome both difficulties by dressing and weighting each member differently, using a different error distribution for each order statistic of the ensemble. We show on a synthetic example and using an operational ensemble prediction system that this new method leads to improved probabilistic forecasts, whether the undressed ensemble members are underdispersive or overdispersive. Copyright © 2006 Royal Meteorological Society. [source]

Falling and explosive, dormant, and rising markets via multiple-regime financial time series models
APPLIED STOCHASTIC MODELS IN BUSINESS AND INDUSTRY, Issue 1, 2010. Cathy W. S. Chen.
A multiple-regime threshold nonlinear financial time series model, with a fat-tailed error distribution, is discussed and Bayesian estimation and inference are considered. Furthermore, approximate Bayesian posterior model comparison among competing models with different numbers of regimes is considered, which is effectively a test for the number of required regimes. An adaptive Markov chain Monte Carlo (MCMC) sampling scheme is designed, while importance sampling is employed to estimate Bayesian residuals for model diagnostic testing. Our modeling framework provides a parsimonious representation of well-known stylized features of financial time series and facilitates statistical inference in the presence of high or explosive persistence and dynamic conditional volatility. We focus on the three-regime case, where the main feature of the model is the capturing of mean and volatility asymmetries in financial markets, while allowing an explosive volatility regime. A simulation study highlights the properties of our MCMC estimators and the accuracy and favourable performance as a model selection tool, compared with a deviance criterion, of the posterior model probability approximation method. An empirical study of eight international oil and gas markets provides strong support for the three-regime model over its competitors, in most markets, in terms of model posterior probability and in showing three distinct regime behaviours: falling/explosive, dormant and rising markets. Copyright © 2009 John Wiley & Sons, Ltd. [source]
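A toy simulation helps to picture a three-regime threshold model with regime-specific mean and volatility. The sketch below uses invented parameters, a fixed delay and thresholds, and Student-t errors, and it keeps volatility constant within each regime rather than following the full GARCH recursion and Bayesian estimation of the paper.

```python
import numpy as np

rng = np.random.default_rng(7)
n, d = 1500, 1                                   # sample size and threshold delay
r1, r2 = -0.25, 0.25                             # thresholds on the lagged value y[t-d]
mu  = (-0.10, 0.00, 0.08)                        # falling, dormant, rising regime means
var = (0.20, 0.02, 0.05)                         # regime volatility levels (largest when falling)

def regime(v):
    return 0 if v < r1 else (2 if v > r2 else 1)

y = np.zeros(n)
for t in range(d, n):
    s = regime(y[t - d])
    y[t] = mu[s] + np.sqrt(var[s]) * rng.standard_t(df=5)   # heavy-tailed errors

states = np.array([regime(v) for v in y[:-d]])
for k, name in enumerate(("falling/explosive", "dormant", "rising")):
    sel = states == k
    print(f"{name:>18}: share {sel.mean():.2f}, sd of next-step value {y[d:][sel].std():.3f}")
```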
EXPONENTIAL SMOOTHING AND NON-NEGATIVE DATA
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 4, 2009. Muhammad Akram.
The most common forecasting methods in business are based on exponential smoothing, and the most common time series in business are inherently non-negative. Therefore it is of interest to consider the properties of the potential stochastic models underlying exponential smoothing when applied to non-negative data. We explore exponential smoothing state space models for non-negative data under various assumptions about the innovations, or error, process. We first demonstrate that prediction distributions from some commonly used state space models may have an infinite variance beyond a certain forecasting horizon. For multiplicative error models that do not have this flaw, we show that sample paths will converge almost surely to zero even when the error distribution is non-Gaussian. We propose a new model with similar properties to exponential smoothing, but which does not have these problems, and we develop some distributional properties for our new model. We then explore the implications of our results for inference, and compare the short-term forecasting performance of the various models using data on the weekly sales of over 300 items of costume jewelry. The main findings of the research are that the Gaussian approximation is adequate for estimation and one-step-ahead forecasting. However, as the forecasting horizon increases, the approximate prediction intervals become increasingly problematic. When the model is to be used for simulation purposes, a suitably specified scheme must be employed. [source]

HEAVY-TAILED-DISTRIBUTED THRESHOLD STOCHASTIC VOLATILITY MODELS IN FINANCIAL TIME SERIES
AUSTRALIAN & NEW ZEALAND JOURNAL OF STATISTICS, Issue 1, 2008. Cathy W. S. Chen.
To capture mean and variance asymmetries and time-varying volatility in financial time series, we generalize the threshold stochastic volatility (THSV) model and incorporate a heavy-tailed error distribution. Unlike existing stochastic volatility models, this model simultaneously accounts for uncertainty in the unobserved threshold value and in the time-delay parameter. Self-exciting and exogenous threshold variables are considered to investigate the impact of a number of market news variables on volatility changes. Adopting a Bayesian approach, we use Markov chain Monte Carlo methods to estimate all unknown parameters and latent variables. A simulation experiment demonstrates good estimation performance for reasonable sample sizes. In a study of two international financial market indices, we consider two variants of the generalized THSV model, with US market news as the threshold variable. Finally, we compare models using Bayesian forecasting in a value-at-risk (VaR) study. The results show that our proposed model can generate more accurate VaR forecasts than can standard models. [source]
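The structure of a threshold stochastic volatility model, latent log-volatility whose dynamics switch on a lagged return, can likewise be simulated in a few lines. The two regimes, the zero threshold, the unit delay and all parameter values below are assumptions made for this sketch, not the specification or estimation procedure of the paper.

```python
import numpy as np

rng = np.random.default_rng(8)
n = 2000
alpha = (-0.10, -0.30)          # regime-specific log-volatility intercepts (higher after falls)
phi   = (0.95, 0.90)            # regime-specific persistence
sigma_eta, df = 0.20, 7         # volatility-of-volatility and t degrees of freedom

y, h = np.zeros(n), np.zeros(n)
for t in range(1, n):
    s = 0 if y[t - 1] < 0 else 1                          # regime set by the lagged return
    h[t] = alpha[s] + phi[s] * h[t - 1] + sigma_eta * rng.normal()
    y[t] = np.exp(h[t] / 2) * rng.standard_t(df)          # heavy-tailed observation error

excess_kurt = ((y - y.mean()) ** 4).mean() / y.var() ** 2 - 3
print(f"sample excess kurtosis {excess_kurt:.2f}; "
      f"mean |return| after falls {np.abs(y[1:])[y[:-1] < 0].mean():.3f} "
      f"vs after rises {np.abs(y[1:])[y[:-1] >= 0].mean():.3f}")
```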
Bayesian Quantile Regression for Longitudinal Studies with Nonignorable Missing Data
BIOMETRICS, Issue 1, 2010. Ying Yuan.
We study quantile regression (QR) for longitudinal measurements with nonignorable intermittent missing data and dropout. Compared to conventional mean regression, quantile regression can characterize the entire conditional distribution of the outcome variable, and is more robust to outliers and misspecification of the error distribution. We account for the within-subject correlation by introducing an ℓ2 penalty in the usual QR check function to shrink the subject-specific intercepts and slopes toward the common population values. The informative missing data are assumed to be related to the longitudinal outcome process through the shared latent random effects. We assess the performance of the proposed method using simulation studies, and illustrate it with data from a pediatric AIDS clinical trial. [source]

Modeling Human Fertility in the Presence of Measurement Error
BIOMETRICS, Issue 1, 2000. David B. Dunson.
The probability of conception in a given menstrual cycle is closely related to the timing of intercourse relative to ovulation. Although commonly used markers of the time of ovulation are known to be error prone, most fertility models assume the day of ovulation is measured without error. We develop a mixture model that allows the day to be misspecified. We assume that the measurement errors are i.i.d. across menstrual cycles. Heterogeneity among couples in the per-cycle likelihood of conception is accounted for using a beta mixture model. Bayesian estimation is straightforward using Markov chain Monte Carlo techniques. The methods are applied to a prospective study of couples at risk of pregnancy. In the absence of validation data or multiple independent markers of ovulation, the identifiability of the measurement error distribution depends on the assumed model. Thus, the results of studies relating the timing of intercourse to the probability of conception should be interpreted cautiously. [source]

Neural network ensembles: combining multiple models for enhanced performance using a multistage approach
EXPERT SYSTEMS, Issue 5, 2004. Shuang Yang.
Neural network ensembles (sometimes referred to as committees or classifier ensembles) are effective techniques to improve the generalization of a neural network system. Combining a set of neural network classifiers whose error distributions are diverse can generate better results than any single classifier. In this paper, some methods for creating ensembles are reviewed, including the following approaches: selecting diverse training data from the original source data set, constructing different neural network models, selecting ensemble nets from ensemble candidates, and combining ensemble members' results. In addition, new results on ensemble combination methods are reported. [source]
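One of the simplest ways to create the diversity the survey discusses is to train each network on a different bootstrap resample of the data and combine the members by majority vote, as in the sketch below (scikit-learn MLPs on synthetic data; this is a generic bagging-style ensemble, not the multistage approach of the paper).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=0)

rng = np.random.default_rng(9)
members = []
for _ in range(7):                                             # odd number avoids vote ties
    idx = rng.integers(0, len(X_tr), size=len(X_tr))           # bootstrap resample of the training data
    net = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                        random_state=int(rng.integers(1_000_000)))
    members.append(net.fit(X_tr[idx], y_tr[idx]))

votes = np.stack([m.predict(X_te) for m in members])           # shape: (n_members, n_test)
ensemble_pred = (votes.mean(axis=0) > 0.5).astype(int)         # majority vote
single_acc = np.mean([m.score(X_te, y_te) for m in members])
print(f"mean single-net accuracy {single_acc:.3f}, "
      f"ensemble accuracy {(ensemble_pred == y_te).mean():.3f}")
```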
An adaptive clinical Type 1 diabetes control protocol to optimize conventional self-monitoring blood glucose and multiple daily-injection therapy
INTERNATIONAL JOURNAL OF ADAPTIVE CONTROL AND SIGNAL PROCESSING, Issue 5, 2009. Xing-Wei Wong.
The objective of this study was to develop a safe, robust and effective protocol for the clinical control of Type 1 diabetes using conventional self-monitoring blood glucose (SMBG) measurements and multiple daily injection (MDI) with insulin analogues. A virtual patient method is used to develop an in silico simulation tool for Type 1 diabetes using data from a Type 1 diabetes patient cohort (n=40). The tool is used to test two prandial insulin protocols, an adaptive protocol (AC) and a conventional intensive insulin therapy (IIT) protocol (CC), against results from a representative control cohort as a function of SMBG frequency. With the prandial protocols, optimal and suboptimal basal insulin replacement using a clinically validated, forced-titration regimen is also evaluated. A Monte Carlo (MC) analysis using variability and error distributions derived from the clinical and physiological literature is used to test efficacy and robustness. MC analysis is performed for over 1 400 000 simulated patient hours. All results are compared with control data from which the virtual patients were derived. In conditions of suboptimal basal insulin replacement, the AC protocol significantly decreases HbA1c for SMBG frequencies ≥6/day compared with controls and the CC protocol. With optimal basal insulin, mild and severe hypoglycaemia is reduced by 86-100% over controls for all SMBG frequencies. Control with the CC protocol and suboptimal basal insulin replacement saturates at an SMBG frequency of 6/day. The forced-titration regimen requires a minimum SMBG frequency of 6/day to prevent increased hypoglycaemia. Overaggressive basal dose titration with the CC protocol at lower SMBG frequencies is likely caused by uncorrected postprandial hyperglycaemia from the previous night. From the MC analysis, a defined peak in control is achieved at an SMBG frequency of 8/day. However, 90% of the cohort meets the American Diabetes Association recommended HbA1c with just 2 measurements a day; a further 7.5% require 4 measurements a day, and only 2.5% (1 patient) required 6 measurements a day. In terms of safety, the AC protocol is the most robust to applied MC error. Over all SMBG frequencies, the median for severe hypoglycaemia increases from 0 to 0.12% and for mild hypoglycaemia by 0-5.19% compared with the unrealistic no-error simulation. While statistically significant, these figures are still very low and the distributions are well below those of the control group. An adaptive control protocol for Type 1 diabetes is tested in silico under conditions of realistic variability and error. The adaptive (AC) protocol is effective and safe compared with conventional IIT (CC) and controls. As the fear of hypoglycaemia is a large psychological barrier to appropriate glycaemic control, adaptive model-based protocols may represent the next evolution of IIT to deliver increased glycaemic control with increased safety over conventional methods, while still utilizing the most commonly used forms of intervention (SMBG and MDI). The use of MC methods to evaluate them provides a relevant robustness test that is not considered in the no-error analyses of most other studies. Copyright © 2008 John Wiley & Sons, Ltd. [source]
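The role of measurement-error distributions in such a Monte Carlo analysis can be illustrated schematically: perturb simulated glucose readings with an assumed meter error model and count threshold crossings. The 7% coefficient of variation, the 4.0 mmol/L cut-off and the lognormal traces below are assumptions made for this sketch, not the error models or cohort used in the study.

```python
import numpy as np

rng = np.random.default_rng(10)
# Simulated 'true' glucose values: 10,000 virtual days with 6 SMBG readings each (mmol/L)
true_glucose = rng.lognormal(mean=np.log(7.0), sigma=0.25, size=(10_000, 6))

cv = 0.07                                                  # assumed meter error, 7% coefficient of variation
measured = true_glucose * (1 + rng.normal(0.0, cv, size=true_glucose.shape))

threshold = 4.0                                            # assumed mild-hypoglycaemia cut-off (mmol/L)
false_hypo = np.mean((measured < threshold) & (true_glucose >= threshold))
missed_hypo = np.mean((measured >= threshold) & (true_glucose < threshold))
print(f"false hypoglycaemia flags per reading: {false_hypo:.4f}, missed flags: {missed_hypo:.4f}")
```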
The estimation of utility-consistent labor supply models by means of simulated scores
JOURNAL OF APPLIED ECONOMETRICS, Issue 4, 2008. Hans G. Bloemen.
We consider a utility-consistent static labor supply model with flexible preferences and a nonlinear and possibly non-convex budget set. Stochastic error terms are introduced to represent optimization and reporting errors, stochastic preferences, and heterogeneity in wages. Coherency conditions on the parameters and the support of the error distributions are imposed for all observations. The complexity of the model makes it impossible to write down the probability of participation. Hence we use simulation techniques in the estimation. We compare our approach with various simpler alternatives proposed in the literature. Both in Monte Carlo experiments and for real data the various estimation methods yield very different results. Copyright © 2008 John Wiley & Sons, Ltd. [source]
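When a participation probability has no closed form, it can be approximated by simulation, which is the idea behind estimation by simulated scores or simulated likelihood. The sketch below uses a toy labour-supply rule and invented data, not the utility-consistent model of the paper, and it skips refinements such as common random numbers and smoothing that a practical estimator would need.

```python
import numpy as np

rng = np.random.default_rng(11)

def participation_prob(wage, beta, n_draws=500):
    """P(desired hours > 0) under a toy rule hours* = beta0 + beta1*log(wage) + error,
    approximated by the share of simulated error draws giving positive hours."""
    draws = rng.normal(0.0, 1.0, n_draws)
    hours = beta[0] + beta[1] * np.log(wage) + draws
    return np.clip(np.mean(hours > 0), 1e-6, 1 - 1e-6)    # keep the log-likelihood finite

def simulated_loglik(beta, wages, works):
    p = np.array([participation_prob(w, beta) for w in wages])
    return np.sum(works * np.log(p) + (1 - works) * np.log(1 - p))

wages = rng.lognormal(2.5, 0.4, size=200)                  # toy data
works = (0.5 * np.log(wages) - 1.4 + rng.normal(size=200) > 0).astype(int)
print("simulated log-likelihood at a trial parameter:",
      round(simulated_loglik((-1.4, 0.5), wages, works), 2))
```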