Brier Skill Score (brier + skill_score)
Selected Abstracts

Probabilistic temperature forecast by using ground station measurements and ECMWF ensemble prediction system
METEOROLOGICAL APPLICATIONS, Issue 4 2004
P. Boi

The ECMWF Ensemble Prediction System 2-metre temperature forecasts are affected by systematic errors due mainly to resolution inadequacies. Moreover, other error sources are present: differences in height above sea level between the station and the corresponding grid point, boundary layer parameterisation, and description of the land surface. These errors are more marked in regions of complex orography. A recursive statistical procedure to adapt ECMWF EPS 2-metre temperature fields to 58 meteorological stations on the Mediterranean island of Sardinia is presented. The correction has been made in three steps: (1) bias correction of systematic errors; (2) calibration to adapt the EPS temperature distribution to the station temperature distribution; and (3) doubling the ensemble size with the aim of taking into account the analysis errors. Two years of probabilistic forecasts of freezing are tested by Brier Score, reliability diagram, rank histogram and Brier Skill Score with respect to the climatological forecast. The score analysis shows much better performance in comparison with the climatological forecast and direct model output, for all forecast times, even after the first step (bias correction). Further gains in skill are obtained by calibration and by doubling the ensemble size. Copyright © 2004 Royal Meteorological Society.

Limited-area ensemble predictions at the Norwegian Meteorological Institute
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 621 2006
Inger-Lise Frogner

This study aims at improving 0–3 day probabilistic forecasts of precipitation events in Norway. For this purpose a limited-area ensemble prediction system (LAMEPS) is tested. The horizontal resolution of LAMEPS is 28 km, and there are 31 levels in the vertical.
The state variables provided as initial and lateral boundary conditions for the limited-area forecasts are perturbed using a dedicated version of the European Centre for Medium-Range Weather Forecasts (ECMWF) global ensemble prediction system, TEPS. These are constructed by combining initial and evolved singular vectors that at final time (48 h) are targeted to maximize the total energy in a domain containing northern Europe and adjacent sea areas. The resolution of TEPS is T255 with 40 levels. The test period includes 45 cases with 21 ensemble members in each case. We focus on 24 h accumulated precipitation rates with special emphasis on intense events. We also investigate a combination of TEPS and LAMEPS resulting in a system (NORLAMEPS) with 42 ensemble members. NORLAMEPS is compared with the 21-member LAMEPS and TEPS as well as the regular 51-member EPS run at ECMWF. The benefit of using targeted singular vectors is seen by comparing the 21-member TEPS with the 51-member operational EPS, as TEPS has considerably larger spread between ensemble members. For other measures, such as Brier Skill Score (BSS) and Relative Operating Characteristic (ROC) curves, the scores of the two systems are for most cases comparable, despite the difference in ensemble size. NORLAMEPS has the largest ensemble spread of all four ensemble systems studied in this paper, while EPS has the smallest spread. Nevertheless, EPS has higher BSS with NORLAMEPS approaching for the highest precipitation thresholds. For the area under the ROC curve, NORLAMEPS is comparable with or better than EPS for medium to large thresholds. 
Copyright © 2006 Royal Meteorological Society

High-resolution limited-area ensemble predictions based on low-resolution targeted singular vectors
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 582 2002
Inger-Lise Frogner

The operational limited-area model, HIRLAM, at the Norwegian Meteorological Institute is used at 0.25° latitude/longitude resolution for ensemble weather prediction over Northern Europe and adjacent parts of the North Atlantic Ocean; this system is called LAMEPS. Initial and lateral boundary perturbations are taken from coarse-resolution European Centre for Medium-Range Weather Forecasts global ensemble members based on targeted singular vectors (TEPS). Five winter and five summer cases in 1997 consisting of 20 ensemble members plus one control forecast are integrated. Two sets of ensembles are generated, one for which both initial and lateral boundary conditions are perturbed, and another with only the initial fields perturbed. The LAMEPS results are compared to those of TEPS using the following measures: r.m.s. ensemble spread of 500 hPa geopotential height; r.m.s. ensemble spread of mean-sea-level pressure; Brier Skill Scores (BSS); Relative Operating Characteristic (ROC) curves; and cost/loss analyses. For forecasts longer than 12 hours, all measures show that perturbing the boundary fields is crucial for the performance of LAMEPS. For the winter cases TEPS has slightly larger ensemble spread than LAMEPS, but this is reversed for the summer cases. Results from BSS, ROC and cost/loss analyses show that LAMEPS performed considerably better than TEPS for precipitation, a result that is promising for forecasting extreme precipitation amounts. We believe this result to be linked to the high predictability of mesoscale flows controlled by complex topography. For two-metre temperature, however, TEPS frequently performed better than LAMEPS.
Copyright © 2002 Royal Meteorological Society

How much does simplification of probability forecasts reduce forecast quality?
METEOROLOGICAL APPLICATIONS, Issue 1 2008
F. J. Doblas-Reyes

Probability forecasts from an ensemble are often discretized into a small set of categories before being distributed to the users. This study investigates how such simplification can affect the forecast quality of probabilistic predictions as measured by the Brier score (BS). An example from the European Centre for Medium-Range Weather Forecasts (ECMWF) operational seasonal ensemble forecast system is used to show that the simplification of the forecast probabilities reduces the Brier skill score (BSS) by as much as 57% with respect to the skill score obtained with the full set of probabilities issued from the ensemble. This is more obvious for a small number of probability categories and is mainly due to a decrease in forecast resolution of up to 36%. The impact of the simplification as a function of the ensemble size is also discussed. The results suggest that forecast quality should be made available for the set of probabilities that the forecast user has access to as well as for the complete set of probabilities issued by the ensemble forecasting system. Copyright © 2008 Royal Meteorological Society

Measuring forecast skill: is it real skill or is it the varying climatology?
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 621C 2006
Thomas M. Hamill

It is common practice to summarize the skill of weather forecasts from an accumulation of samples spanning many locations and dates. In calculating many of these scores, there is an implicit assumption that the climatological frequency of event occurrence is approximately invariant over all samples. If the event frequency actually varies among the samples, the metrics may report a skill that is different from that expected.
Many common deterministic verification metrics, such as threat scores, are prone to mis-reporting skill, and probabilistic forecast metrics such as the Brier skill score and relative operating characteristic skill score can also be affected. Three examples are provided that demonstrate unexpected skill, two from synthetic data and one with actual forecast data. In the first example, positive skill was reported in a situation where metrics were calculated from a composite of forecasts that were composed of random draws from the climatology of two distinct locations. As the difference in climatological event frequency between the two locations was increased, the reported skill also increased. A second example demonstrates that when the climatological event frequency varies among samples, the metrics may excessively weight samples with the greatest observational uncertainty. A final example demonstrates unexpectedly large skill in the equitable threat score of deterministic precipitation forecasts. Guidelines are suggested for how to adjust skill computations to minimize these effects. Copyright © 2006 Royal Meteorological Society

Measures of skill and value of ensemble prediction systems, their interrelationship and the effect of ensemble size
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 577 2001
David S. Richardson

Ensemble forecasts provide probabilistic predictions for the future state of the atmosphere. Usually the probability of a given event E is determined from the fraction of ensemble members which predict the event. Hence there is a degree of sampling error inherent in the predictions. In this paper a theoretical study is made of the effect of ensemble size on forecast performance as measured by a reliability diagram and Brier (skill) score, and on users by using a simple cost-loss decision model.
The relationship between skill and value, and a generalized skill score, dependent on the distribution of users, are discussed. The Brier skill score is reduced from its potential level for all finite-sized ensembles. The impact is most significant for small ensembles, especially when the variance of forecast probabilities is also small. The Brier score for a set of deterministic forecasts is a measure of potential predictability, assuming the forecasts are representative selections from a reliable ensemble prediction system (EPS). There is a consistent effect of finite ensemble size on the reliability diagram. Even if the underlying distribution is perfectly reliable, sampling this using only a small number of ensemble members introduces considerable unreliability. There is a consistent over-forecasting which appears as a clockwise tilt of the reliability diagram. It is important to be aware of the expected effect of ensemble size to avoid misinterpreting results. An ensemble of ten or so members should not be expected to provide reliable probability forecasts. Equally, when comparing the performance of different ensemble systems, any difference in ensemble size should be considered before attributing performance differences to other differences between the systems. The usefulness of an EPS to individual users cannot be deduced from the Brier skill score (nor even directly from the reliability diagram). An EPS with minimal Brier skill may nevertheless be of substantial value to some users, while small differences in skill may hide substantial variation in value. Using a simple cost-loss decision model, the sensitivity of users to differences in ensemble size is shown to depend on the predictability and frequency of the event and on the cost-loss ratio of the user. 
For an extreme event with low predictability, users with low cost-loss ratio will gain significant benefits from increasing ensemble size from 50 to 100 members, with potential for substantial additional value from further increases in number of members. This sensitivity to large ensemble size is not evident in the Brier skill score. A generalized skill score, dependent on the distribution of users, allows a summary performance measure to be tuned to a particular aspect of EPS performance.

A probability and decision-model analysis of PROVOST seasonal multi-model ensemble integrations
THE QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, Issue 567 2000
T. N. Palmer

A probabilistic analysis is made of seasonal ensemble integrations from the PROVOST project (PRediction Of climate Variations On Seasonal to interannual Time-scales), with emphasis on the Brier score and related Murphy decomposition, and the relative operating characteristic. To illustrate the significance of these results to potential users, results from the analysis of the relative operating characteristic are input to a simple decision model. The decision-model analysis is used to define a user-specific objective measure of the economic value of seasonal forecasts. The analysis is made for two simple meteorological forecast conditions or 'events', E, based on 850 hPa temperature. The ensemble integrations result from integrating four different models over the period 1979–93. For each model a set of 9-member ensembles is generated by running from consecutive analyses. Results from the Brier skill score analysis taken over all northern hemisphere grid points indicate that, whilst the skill of individual-model ensembles is only marginally higher than a probabilistic forecast of climatological frequencies, the multi-model ensemble is substantially more skilful than climatology.
Both reliability and resolution are better for the multi-model ensemble than for the individual-model ensembles. This improvement arises both from the use of different models in the ensemble, and from the enhanced ensemble size obtained by combining individual-model ensembles; the latter reason was found to be the more important. Brier skill scores are higher for years in which there were moderate or strong El Niño Southern Oscillation (ENSO) events. Over Europe, only the multi-model ensembles showed skill over climatology. Similar conclusions are reached from an analysis of the relative operating characteristic. Results from the decision-model analysis show that the economic value of seasonal forecasts is strongly dependent on the cost, C, to the user of taking precautionary action against E, in relation to the potential loss, L, if precautionary action is not taken and E occurs. However, based on the multi-model ensemble data, the economic value can be as much as 50% of the value of a hypothetical perfect deterministic forecast. For the hemisphere as a whole, value is enhanced by restriction to ENSO years. It is shown that there is potential economic value in seasonal forecasts for European users. However, the impact of ENSO on economic value over Europe is mixed; value is enhanced by El Niño only for some potential users with specific C/L. The techniques developed are applicable to complex E for arbitrary regions. Hence these techniques are proposed as the basis of an objective probabilistic and decision-model evaluation of operational seasonal ensemble forecasts.
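
Several of the abstracts above rely on the Brier skill score relative to a climatological forecast and on a simple cost-loss decision model with cost C and loss L. As a rough illustration only, the following Python sketch computes these quantities for a small made-up sample. The function names, the sample data and the choice of the sample event frequency as the climatological reference are assumptions made for this example; the value formula is the commonly used static cost-loss form and is not necessarily the exact formulation used in any of the papers listed.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS / BS_clim.

    Assumption: the reference forecast always issues the sample event
    frequency (undefined if the event always or never occurs).
    """
    base_rate = sum(outcomes) / len(outcomes)
    bs_clim = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(probs, outcomes) / bs_clim


def hit_and_false_alarm_rates(probs, outcomes, threshold):
    """Turn probabilities into yes/no warnings and score them against outcomes."""
    warned = [p >= threshold for p in probs]
    events = sum(outcomes)
    non_events = len(outcomes) - events
    hits = sum(1 for w, o in zip(warned, outcomes) if w and o == 1)
    false_alarms = sum(1 for w, o in zip(warned, outcomes) if w and o == 0)
    return hits / events, false_alarms / non_events


def relative_value(hit_rate, false_alarm_rate, base_rate, cost_loss_ratio):
    """Relative economic value in the static cost-loss model.

    1 corresponds to a perfect forecast, 0 to the better of "always protect"
    and "never protect". Undefined when base_rate or cost_loss_ratio is 0 or 1.
    """
    a, s = cost_loss_ratio, base_rate
    expense_climate = min(a, s)      # cheaper of always / never protecting
    expense_perfect = s * a          # protect only when the event occurs
    expense_forecast = (
        false_alarm_rate * (1 - s) * a   # cost paid on false alarms
        + hit_rate * s * a               # cost paid on hits
        + (1 - hit_rate) * s             # loss suffered on misses
    )
    return (expense_climate - expense_forecast) / (expense_climate - expense_perfect)


# Hypothetical sample: four probability forecasts and what actually happened.
probs = [0.9, 0.1, 0.8, 0.2]
outcomes = [1, 0, 1, 0]

bss = brier_skill_score(probs, outcomes)
hit_rate, false_alarm_rate = hit_and_false_alarm_rates(probs, outcomes, threshold=0.5)
value = relative_value(hit_rate, false_alarm_rate, base_rate=0.5, cost_loss_ratio=0.2)
print(f"BSS = {bss:.2f}, H = {hit_rate:.2f}, F = {false_alarm_rate:.2f}, V = {value:.2f}")
```

In this sketch the skill score is a single number for the whole sample, whereas the value depends on the user's C/L, which is one way to see why the abstracts stress that value cannot be deduced from the Brier skill score alone.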