Wiley InterScience

Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Volume 69 Issue 2, Pages 243 - 268

Published Online: 5 Mar 2007

© 2010 The Royal Statistical Society and Blackwell Publishing Ltd

Probabilistic forecasts, calibration and sharpness
Tilmann Gneiting 1, Fadoua Balabdaoui 2 and Adrian E. Raftery 3
  1 University of Washington, Seattle, USA
  2 Georg-August-Universität Göttingen, Germany
  3 University of Washington, Seattle, USA
Correspondence to Tilmann Gneiting, Department of Statistics, University of Washington, Seattle, WA 98195-4322, USA.
E-mail: tilmann@stat.washington.edu
Copyright 2007 Royal Statistical Society
KEYWORDS
Cross-validation • Density forecast • Ensemble prediction system • Ex post evaluation • Forecast verification • Model diagnostics • Posterior predictive assessment • Predictive distribution • Prequential principle • Probability integral transform • Proper scoring rule

ABSTRACT

Summary. Probabilistic forecasts of continuous variables take the form of predictive densities or predictive cumulative distribution functions. We propose a diagnostic approach to the evaluation of predictive performance that is based on the paradigm of maximizing the sharpness of the predictive distributions subject to calibration. Calibration refers to the statistical consistency between the distributional forecasts and the observations and is a joint property of the predictions and the events that materialize. Sharpness refers to the concentration of the predictive distributions and is a property of the forecasts only. A simple theoretical framework allows us to distinguish between probabilistic calibration, exceedance calibration and marginal calibration. We propose and study tools for checking calibration and sharpness, among them the probability integral transform histogram, marginal calibration plots, the sharpness diagram and proper scoring rules. The diagnostic approach is illustrated by an assessment and ranking of probabilistic forecasts of wind speed at the Stateline wind energy centre in the US Pacific Northwest. In combination with cross-validation or in the time series context, our proposal provides very general, nonparametric alternatives to the use of information criteria for model diagnostics and model selection.
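
As a rough illustration of the tools named in the summary, the following sketch computes probability integral transform (PIT) values, a PIT histogram, a simple sharpness measure and a proper scoring rule for Gaussian predictive distributions. Everything in it is an assumption made for illustration, not the authors' code: the function names are hypothetical, the toy simulation (a random mean plus unit-variance noise) stands in for real data, sharpness is summarised by the average width of central 90% prediction intervals, and the scoring rule is the continuous ranked probability score (CRPS) in its closed form for normal distributions.

```python
import math
import random
from statistics import NormalDist

def pit_values(forecasts, observations):
    """PIT value F_t(x_t): each predictive CDF evaluated at its observation."""
    return [f.cdf(x) for f, x in zip(forecasts, observations)]

def pit_histogram(pits, bins=10):
    """Relative frequency of PIT values in equal-width bins on [0, 1]."""
    counts = [0] * bins
    for p in pits:
        counts[min(int(p * bins), bins - 1)] += 1
    return [c / len(pits) for c in counts]

def mean_interval_width(forecasts, level=0.9):
    """Sharpness summary: average width of central prediction intervals."""
    lo, hi = (1 - level) / 2, (1 + level) / 2
    widths = [f.inv_cdf(hi) - f.inv_cdf(lo) for f in forecasts]
    return sum(widths) / len(widths)

def crps_normal(f, x):
    """Closed-form CRPS of a normal predictive distribution (lower is better)."""
    std = NormalDist()
    z = (x - f.mean) / f.stdev
    return f.stdev * (z * (2 * std.cdf(z) - 1) + 2 * std.pdf(z)
                      - 1 / math.sqrt(math.pi))

def simulate(n=5000, seed=1):
    """Toy setup (an assumption): x_t ~ N(mu_t, 1) with mu_t ~ N(0, 1).

    The 'informed' forecaster knows mu_t and issues N(mu_t, 1); the
    'unconditional' forecaster always issues the marginal law N(0, sqrt(2)).
    """
    rng = random.Random(seed)
    mus = [rng.gauss(0, 1) for _ in range(n)]
    obs = [rng.gauss(mu, 1) for mu in mus]
    informed = [NormalDist(mu, 1) for mu in mus]
    unconditional = [NormalDist(0, math.sqrt(2)) for _ in range(n)]
    return obs, informed, unconditional

def summarize(forecasts, obs):
    """Collect the calibration, sharpness and score diagnostics in one dict."""
    scores = [crps_normal(f, x) for f, x in zip(forecasts, obs)]
    return {
        "pit_hist": pit_histogram(pit_values(forecasts, obs)),
        "width90": mean_interval_width(forecasts),
        "mean_crps": sum(scores) / len(scores),
    }

if __name__ == "__main__":
    obs, informed, unconditional = simulate()
    for name, fc in [("informed", informed), ("unconditional", unconditional)]:
        print(name, summarize(fc, obs))
```

In this toy setup both forecasters should produce roughly flat PIT histograms, so the histogram alone cannot separate them. The narrower intervals and lower mean CRPS of the informed forecaster are what distinguish it, which is the sense in which sharpness is maximized subject to calibration.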


[Received May 2005. Final revision October 2006]

DIGITAL OBJECT IDENTIFIER (DOI)
10.1111/j.1467-9868.2007.00587.x
