Summary Verification Measures and Their Interpretation for Ensemble Forecasts

Date

2011-09-01

Citation of Original Publication

A. Allen Bradley and Stuart S. Schwartz, "Summary Verification Measures and Their Interpretation for Ensemble Forecasts," Monthly Weather Review, September 2011, https://doi.org/10.1175/2010MWR3305.1

Rights

This item is likely protected under Title 17 of the U.S. Copyright Law. Unless on a Creative Commons license, for uses protected by Copyright Law, contact the copyright holder or the author.
© Copyright 2011 American Meteorological Society (AMS). Permission to use figures, tables, and brief excerpts from this work in scientific and educational works is hereby granted provided that the source is acknowledged. Any use of material in this work that is determined to be “fair use” under Section 107 of the U.S. Copyright Act or that satisfies the conditions specified in Section 108 of the U.S. Copyright Act (17 USC §108) does not require the AMS’s permission. Republication, systematic reproduction, posting in electronic form, such as on a website or in a searchable database, or other uses of this material, except as exempted by the above statement, requires written permission or a license from the AMS. All AMS journals and monograph publications are registered with the Copyright Clearance Center (http://www.copyright.com). Questions about permission to use materials for which AMS holds the copyright can also be directed to permissions@ametsoc.org. Additional details are provided in the AMS Copyright Policy statement, available on the AMS website (http://www.ametsoc.org/CopyrightInformation).

Abstract

Ensemble prediction systems produce forecasts that represent the probability distribution of a continuous forecast variable. Most often, the verification problem is simplified by transforming the ensemble forecast into probability forecasts for discrete events, where the events are defined by one or more threshold values. Then, skill is evaluated using the mean-square error (MSE; i.e., Brier) skill score for binary events, or the ranked probability skill score (RPSS) for multicategory events. A framework is introduced that generalizes this approach, by describing the forecast quality of ensemble forecasts as a continuous function of the threshold value. Viewing ensemble forecast quality this way leads to the interpretation of the RPSS and the continuous ranked probability skill score (CRPSS) as measures of the weighted-average skill over the threshold values. It also motivates additional measures, derived to summarize other features of a continuous forecast quality function, which can be interpreted as descriptions of the function’s geometric shape. The measures can be computed not only for skill, but also for skill score decompositions, which characterize the resolution, reliability, discrimination, and other aspects of forecast quality. Collectively, they provide convenient metrics for comparing the performance of an ensemble prediction system at different locations, lead times, or issuance times, or for comparing alternative forecasting systems.
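The threshold view summarized in the abstract, in which the Brier (MSE) score is evaluated as a continuous function of the threshold value and the CRPS corresponds to its integral over all thresholds, can be illustrated with a short numerical sketch. The example below is not taken from the paper; it uses synthetic data and assumed variable names, and simply checks that integrating the mean Brier score over a threshold grid approximately reproduces the ensemble CRPS.

```python
# Hedged illustration (not the authors' code): Brier score as a continuous
# function of the threshold, and its integral over thresholds compared with
# the CRPS of the same ensemble forecasts. All names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_forecasts, n_members = 500, 20

# Synthetic forecast situations: ensembles centered on a hidden "truth",
# observations drawn from the same truth with comparable noise.
truth = rng.normal(size=n_forecasts)
ensembles = truth[:, None] + rng.normal(scale=0.8, size=(n_forecasts, n_members))
obs = truth + rng.normal(scale=0.8, size=n_forecasts)

# Brier score at each threshold t: mean over cases of (F(t) - 1{obs <= t})^2,
# where F(t) is the fraction of ensemble members not exceeding t.
thresholds = np.linspace(-4.0, 4.0, 401)
forecast_prob = (ensembles[:, :, None] <= thresholds).mean(axis=1)  # (cases, thresholds)
event_occurred = (obs[:, None] <= thresholds).astype(float)
brier_by_threshold = ((forecast_prob - event_occurred) ** 2).mean(axis=0)

# CRPS of the empirical ensemble CDF, via the standard identity
#   CRPS = E|X - y| - 0.5 * E|X - X'|, averaged over forecast cases.
term1 = np.abs(ensembles - obs[:, None]).mean(axis=1)
term2 = np.abs(ensembles[:, :, None] - ensembles[:, None, :]).mean(axis=(1, 2))
crps = (term1 - 0.5 * term2).mean()

# Integrating the Brier score over the threshold grid approximates the CRPS
# (small discrepancies come from the finite grid and truncated tails).
dx = thresholds[1] - thresholds[0]
print("integrated Brier score:", brier_by_threshold.sum() * dx)
print("mean CRPS             :", crps)
```

The same threshold-by-threshold bookkeeping extends, in the spirit of the paper, to skill scores and their decompositions: replacing the Brier score curve with a Brier skill score curve (relative to a climatological reference) gives the continuous skill function whose weighted average the RPSS and CRPSS summarize.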