Prediction Problems (prediction + problem)
Selected Abstracts

Nonparametric prediction intervals for the future rainfall records
ENVIRONMETRICS, Issue 5 2006
Mohammad Z. Raqab
Abstract: Prediction of records plays an important role in environmental applications, especially the prediction of rainfall extremes, highest water levels, sea surface, and air record temperatures. In this paper, based on the observed records drawn from a sequence sample of independent and identically distributed random variables, we develop prediction intervals as well as prediction upper and lower bounds for records from another independent sequence. We extend the prediction problem to include prediction regions for joint upper records from a future sequence sample. Bonferroni's inequality is used to choose appropriate prediction coefficients for the joint prediction. A real data set representing the records of the annual (January 1–December 31) rainfall at Los Angeles Civic Center is used to illustrate the proposed prediction procedures in environmental applications. Copyright © 2005 John Wiley & Sons, Ltd.

Intra-seasonal rainfall characteristics and their importance to the seasonal prediction problem
INTERNATIONAL JOURNAL OF CLIMATOLOGY, Issue 9 2002
Warren J. Tennant
Abstract: Daily station rainfall data in South Africa from 1936 to 1999 are combined into homogeneous rainfall regions using Ward's clustering method. Various rainfall characteristics are calculated for the summer season, defined as December to February. These include seasonal rainfall total, region-average number of station rain days exceeding 1 and 20 mm, region-average of periods between rain days at stations >1 and >20 mm, region-average of wet spell length (sequential days of station rainfall >1 and >20 mm), correlation of daily station rainfall within a region and correlation of seasonal station rainfall anomalies within a region.
Rank-ordered rainfall characteristic data generally form an s-shaped curve, and significance testing of discontinuities in these curves suggests that normal rainfall conditions in South Africa consist of a combined middle three quintiles separated from the outer quintiles, rather than the traditional middle tercile. The relationships between the various rainfall characteristics show that seasons with a high total rainfall generally have a higher number of heavy rain days (>20 mm) and not necessarily an increase in light rain days. The length of the period between rain days has a low correlation to season totals, demonstrating that seasons with a high total rainfall may still contain prolonged dry periods. These additional rainfall characteristics are important to end-users, and the analysis undertaken here offers a valuable starting point for seeking physical relationships between rainfall characteristics and the general circulation. Preliminary studies show that the vertical mean wind is related to rainfall characteristics in South Africa. Given that general circulation models capture this part of the circulation adequately, seasonal forecasts of rainfall characteristics become plausible. Copyright © 2002 Royal Meteorological Society.

Clinical versus statistical prediction: The contribution of Paul E. Meehl
JOURNAL OF CLINICAL PSYCHOLOGY, Issue 10 2005
William M. Grove
Article first published online: 22 JUL 200
The background of Paul E. Meehl's work on clinical versus statistical prediction is reviewed, with detailed analyses of his arguments.
Meehl's four main contributions were the following: (a) he put the question, of whether clinical or statistical combinations of psychological data yielded better predictions, at center stage in applied psychology; (b) he convincingly argued, against an array of objections, that clinical versus statistical prediction was a real (not concocted) problem needing thorough study; (c) he meticulously and even-handedly dissected the logic of clinical inference from theoretical and probabilistic standpoints; and (d) he reviewed the studies available in 1954 and thereafter, which tested the validity of clinical versus statistical predictions. His early conclusion that the literature strongly favors statistical prediction has stood up extremely well, and his conceptual analyses of the prediction problem (especially his defense of applying aggregate-based probability statements to individual cases) have not been significantly improved since 1954. © 2005 Wiley Periodicals, Inc. J Clin Psychol 61: 1233–1243, 2005.

Inductive Inference by Using Information Compression
COMPUTATIONAL INTELLIGENCE, Issue 2 2003
Ben Choi
Inductive inference is of central importance to all scientific inquiries. Automating the process of inductive inference is the major concern of machine learning researchers. This article proposes inductive inference techniques to address three inductive problems: (1) how to automatically construct a general description, a model, or a theory to describe a sequence of observations or experimental data, (2) how to modify an existing model to account for new observations, and (3) how to handle the situation where the new observations are not consistent with the existing models. The techniques proposed in this article implement the inductive principle called the minimum descriptive length principle and relate to Kolmogorov complexity and Occam's razor.
They employ finite state machines as models to describe sequences of observations and measure the descriptive complexity by measuring the number of states. They can be used to draw inference from sequences of observations where one observation may depend on previous observations. Thus, they can be applied to time series prediction problems and to one-to-one mapping problems. They are implemented to form an automated inductive machine.

Memetic evolutionary training for recurrent neural networks: an application to time-series prediction
EXPERT SYSTEMS, Issue 2 2006
M. Delgado
Abstract: Artificial neural networks are bio-inspired mathematical models that have been widely used to solve complex problems. The training of a neural network is an important issue to deal with, since traditional gradient-based algorithms become easily trapped in local optimal solutions, therefore increasing the time taken in the experimental step. This problem is greater in recurrent neural networks, where the gradient propagation across the recurrence makes the training difficult for long-term dependences. On the other hand, evolutionary algorithms are search and optimization techniques which have been proved to solve many problems effectively. In the case of recurrent neural networks, the training using evolutionary algorithms has provided promising results. In this work, we propose two hybrid evolutionary algorithms as an alternative to improve the training of dynamic recurrent neural networks. The experimental section makes a comparative study of the algorithms proposed, to train Elman recurrent neural networks in time-series prediction problems.

Robustness of alternative non-linearity tests for SETAR models
JOURNAL OF FORECASTING, Issue 3 2004
Wai-Sum Chan
Abstract: In recent years there has been a growing interest in exploiting potential forecast gains from the non-linear structure of self-exciting threshold autoregressive (SETAR) models.
Statistical tests have been proposed in the literature to help analysts check for the presence of SETAR-type non-linearities in an observed time series. It is important to study the power and robustness properties of these tests since erroneous test results might lead to misspecified prediction problems. In this paper we investigate the robustness properties of several commonly used non-linearity tests. Both the robustness with respect to outlying observations and the robustness with respect to model specification are considered. The power comparison of these testing procedures is carried out using Monte Carlo simulation. The results indicate that all of the existing tests are not robust to outliers and model misspecification. Finally, an empirical application applies the statistical tests to stock market returns of the four little dragons (Hong Kong, South Korea, Singapore and Taiwan) in East Asia. The non-linearity tests fail to provide consistent conclusions most of the time. The results in this article stress the need for a more robust test for SETAR-type non-linearity in time series analysis and forecasting. Copyright © 2004 John Wiley & Sons, Ltd.

Tilting methods for assessing the influence of components in a classifier
JOURNAL OF THE ROYAL STATISTICAL SOCIETY: SERIES B (STATISTICAL METHODOLOGY), Issue 4 2009
Peter Hall
Summary. Many contemporary classifiers are constructed to provide good performance for very high dimensional data. However, an issue that is at least as important as good classification is determining which of the many potential variables provide key information for good decisions. Responding to this issue can help us to determine which aspects of the data-generating mechanism (e.g. which genes in a genomic study) are of greatest importance in terms of distinguishing between populations. We introduce tilting methods for addressing this problem.
We apply weights to the components of data vectors, rather than to the data vectors themselves (as is commonly the case in related work). In addition we tilt in a way that is governed by L2-distance between weight vectors, rather than by the more commonly used Kullback–Leibler distance. It is shown that this approach, together with the added constraint that the weights should be non-negative, produces an algorithm which eliminates vector components that have little influence on the classification decision. In particular, use of the L2-distance in this problem produces properties that are reminiscent of those that arise when L1-penalties are employed to eliminate explanatory variables in very high dimensional prediction problems, e.g. those involving the lasso. We introduce techniques that can be implemented very rapidly, and we show how to use bootstrap methods to assess the accuracy of our variable ranking and variable elimination procedures.

Prediction and nonparametric estimation for time series with heavy tails
JOURNAL OF TIME SERIES ANALYSIS, Issue 3 2002
PETER HALL
Motivated by prediction problems for time series with heavy-tailed marginal distributions, we consider methods based on 'local least absolute deviations' for estimating a regression median from dependent data. Unlike more conventional 'local median' methods, which are in effect based on locally fitting a polynomial of degree 0, techniques founded on local least absolute deviations have quadratic bias right up to the boundary of the design interval. Also in contrast to local least-squares methods based on linear fits, the order of magnitude of variance does not depend on tail-weight of the error distribution.
To make these points clear, we develop theory describing local applications to time series of both least-squares and least-absolute-deviations methods, showing for example that, in the case of heavy-tailed data, the conventional local-linear least-squares estimator suffers from an additional bias term as well as increased variance.

Competing Risks and Time-Dependent Covariates
BIOMETRICAL JOURNAL, Issue 1 2010
Giuliana Cortese
Abstract: Time-dependent covariates are frequently encountered in regression analysis for event history data and competing risks. They are often essential predictors, which cannot be substituted by time-fixed covariates. This study briefly recalls the different types of time-dependent covariates, as classified by Kalbfleisch and Prentice [The Statistical Analysis of Failure Time Data, Wiley, New York, 2002] with the intent of clarifying their role and emphasizing the limitations in standard survival models and in the competing risks setting. If random (internal) time-dependent covariates are to be included in the modeling process, then it is still possible to estimate cause-specific hazards but prediction of the cumulative incidences and survival probabilities based on these is no longer feasible. This article aims at providing some possible strategies for dealing with these prediction problems. In a multi-state framework, a first approach uses internal covariates to define additional (intermediate) transient states in the competing risks model. Another approach is to apply the landmark analysis as described by van Houwelingen [Scandinavian Journal of Statistics 2007, 34, 70–85] in order to study cumulative incidences at different subintervals of the entire study period. The final strategy is to extend the competing risks model by considering all the possible combinations between internal covariate levels and cause-specific events as final states.
In all of those proposals, it is possible to estimate the changes/differences of the cumulative risks associated with simple internal covariates. An illustrative example based on bone marrow transplant data is presented in order to compare the different methods.
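
As an illustration of the "cumulative incidence" quantity that the last abstract refers to, here is a minimal Python sketch. It is not the method of the article: it ignores covariates entirely and only shows the usual nonparametric (Aalen-Johansen type) estimate for one cause in the presence of competing causes. The function name `cumulative_incidence`, the coding of censoring as cause 0, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def cumulative_incidence(times, causes, cause):
    """Nonparametric cumulative incidence for one cause (illustrative sketch).

    times  : observed follow-up times
    causes : 0 for a censored observation, a positive integer for the
             cause of the observed event (hypothetical coding)
    cause  : the cause whose cumulative incidence is wanted
    Returns a list of (time, estimate) pairs, one per distinct event time.
    """
    # Count how many subjects leave the risk set at each time, by cause.
    by_time = {}
    for t, c in zip(times, causes):
        by_time.setdefault(t, Counter())[c] += 1

    n_at_risk = len(times)
    surv = 1.0  # probability of no event of any cause before the current time
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        counts = by_time[t]
        d_all = sum(n for c, n in counts.items() if c != 0)  # events, any cause
        d_k = counts.get(cause, 0)                           # events, this cause
        if d_all > 0:
            # Increment: event-free just before t, then fail from `cause` at t.
            cif += surv * d_k / n_at_risk
            surv *= 1.0 - d_all / n_at_risk
            curve.append((t, cif))
        n_at_risk -= sum(counts.values())  # events and censorings leave the risk set
    return curve


# Toy data (invented): five subjects, one censored at time 3.
toy_times = [1, 2, 3, 4, 5]
toy_causes = [1, 2, 0, 1, 2]
for t, value in cumulative_incidence(toy_times, toy_causes, cause=1):
    print(t, round(value, 3))
```

The estimate for a cause only increases at times where that cause is observed, and the estimates for all causes add up to one minus the overall event-free survival, which is why the competing causes cannot be analysed in isolation.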