Ensemble Forecast Verification for Large Data Sets

Set of tools to simplify the application of atomic forecast verification metrics for (comparative) verification of ensemble forecasts to large data sets. The forecast metrics are imported from the 'SpecsVerification' package, and additional forecast metrics are provided with this package. Alternatively, new user-defined forecast scores can be implemented using the example scores provided and applied using the functionality of this package.

This package provides functions to simplify the application of forecast verification metrics to large data sets of ensemble forecasts. The design goals of easyVerification are:

  • Flexibility: a variety of data structures are supported
  • Ease of use: absolute forecasts and observations are converted to category and probability forecasts based on the thresholds or probabilities (e.g. terciles) provided, and outputs are reformatted to match the input
  • Convenience and flexibility over speed: R's built-in vectorisation is used where possible, but more importantly, new metrics should be easy to implement
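The conversion from continuous to category forecasts mentioned above is handled by convert2prob, with count2prob turning the resulting category counts into probabilities. A minimal sketch (the argument values are illustrative; see help(convert2prob) for details):

```r
library(easyVerification)

## 10 forecasts with 5 ensemble members each
## (forecasts in rows, members in columns)
fcst <- matrix(rnorm(50), nrow = 10, ncol = 5)

## count ensemble members per tercile category
## (prob = 1:2/3 sets the two tercile boundaries)
counts <- convert2prob(fcst, prob = 1:2/3)

## convert category counts to probability forecasts
probs <- count2prob(counts)
```

Each row of counts sums to the ensemble size, since every member falls into exactly one category.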

The forecast metrics are imported from the SpecsVerification package. Additional verification metrics not available through SpecsVerification are implemented directly. At the time of publication, the package offers functionality to compute the following deterministic and probabilistic scores and skill scores:

  1. Mean error (EnsMe), mean absolute error (EnsMae), mean squared error (EnsMse), and root mean squared error (EnsRmse) of the ensemble mean and their skill scores (e.g. EnsRmsess)
  2. Correlation with the ensemble mean (EnsCorr)
  3. Spread to error ratio (EnsSprErr and FairSprErr)
  4. Area under the ROC curve (EnsRoca) and its skill score (EnsRocss)
  5. Fair (FairRps) and standard (EnsRps) rank probability scores and skill scores (e.g. FairRpss)
  6. Fair (FairCrps) and standard (EnsCrps) continuous ranked probability scores and skill scores (e.g. FairCrpss)
  7. Dressed scores (DressIgn, DressCrps) and their skill scores (DressIgnSs, DressCrpss) with default ensemble dressing method ("silverman")
  8. The generalized discrimination score for ensembles (Ens2AFC)

Additional forecast verification metrics can be added by the user following the examples above.
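As a sketch of what such a user-defined metric might look like: veriApply dispatches to any visible function that takes an ensemble matrix (forecasts in rows, members in columns) and a vector of observations. The name EnsBias below is hypothetical, not part of the package:

```r
library(easyVerification)

## hypothetical user-defined score: mean bias of the ensemble mean.
## veriApply expects a function of an ensemble matrix `ens`
## (forecasts in rows, members in columns) and a vector `obs`.
EnsBias <- function(ens, obs) {
  mean(rowMeans(ens) - obs)
}

## apply the custom score to a small toy data set
tm <- toyarray(c(4, 4))
bias <- veriApply("EnsBias", fcst = tm$fcst, obs = tm$obs)
```

The custom function only has to be visible in the calling environment; veriApply takes care of looping over the spatial instances.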


You can get the latest version from CRAN:
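From the R prompt:

```r
install.packages("easyVerification")
```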


You can get the latest development version using:
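A typical way to do this (assuming the devtools package is installed and that the development repository lives at MeteoSwiss/easyVerification on GitHub):

```r
## install the development version from GitHub
devtools::install_github("MeteoSwiss/easyVerification")
```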


Getting started

You can find out more about the package and its functionality in the vignette.


The following example illustrates how to compute the continuous ranked probability skill score of an ensemble forecast:

## check out what is included in easyVerification
library(easyVerification)
ls(pos = "package:easyVerification")
#>  [1] "convert2prob" "count2prob"   "Ens2AFC"      "EnsCorr"     
#>  [5] "EnsError"     "EnsErrorss"   "EnsMae"       "EnsMaess"    
#>  [9] "EnsMe"        "EnsMess"      "EnsMse"       "EnsMsess"    
#> [13] "EnsRmse"      "EnsRmsess"    "EnsRoca"      "EnsRocss"    
#> [17] "EnsSprErr"    "FairSprErr"   "toyarray"     "toymodel"    
#> [21] "veriApply"
## set up the forecast and observation data structures
## assumption: we have 13 x 5 spatial instances, 15 forecast 
## times and 51 ensemble members
tm <- toyarray(c(13,5), N=15, nens=51)
fo.crpss <- veriApply("EnsCrpss", fcst=tm$fcst, obs=tm$obs)
## if the data are organized differently such that forecast
## instance and ensemble members are NOT the last two array
## dimensions, this has to be indicated
## alternative setup:
## forecast instance, ensemble members, all forecast locations
## collated in one dimension
fcst2 <- array(aperm(tm$fcst, c(3,4,1,2)), c(15, 51, 13*5))
obs2 <- array(aperm(tm$obs, c(3,1,2)), c(15, 13*5))
fo2.crpss <- veriApply("EnsCrpss", fcst=fcst2, obs=obs2, 
                       ensdim=2, tdim=1)
## The forecast evaluation metrics are the same, but the 
## data structure is different in the two cases
dim(fo.crpss$crpss)
#> [1] 13  5
dim(fo2.crpss$crpss)
#> [1] 65
range(fo.crpss$crpss - c(fo2.crpss$crpss))
#> [1] 0 0

Parallel processing

In recent versions of easyVerification, parallel processing is supported under *NIX systems. The following minimal example illustrates how to use the parallel processing capabilities of easyVerification.

## generate a toy-model forecast observation set of 
## 10 x 10 forecast locations (e.g. lon x lat)
tm <- toyarray(c(10,10))
## run and time the ROC skill score for tercile forecasts without parallelization
system.time(tm.rocss <- veriApply("EnsRocss", tm$fcst, tm$obs, prob=1:2/3))
#>    user  system elapsed 
#>   1.984   0.008   2.000
## run and time the ROC skill score with parallelization
system.time(tm.rocss.par <- veriApply("EnsRocss", tm$fcst, tm$obs, prob=1:2/3, parallel=TRUE))
#> Loading required namespace: parallel
#> [1] "Number of CPUs 3"
#>    user  system elapsed 
#>   0.088   0.040   0.824

To get additional help and examples, please see the vignette (vignette('easyVerification')) or the help pages of the functions in easyVerification (e.g. help(veriApply)).


Changes in version 0.4.4

  • Added min-fcst threshold for missing value masking (fraction and absolute number of forecasts)
  • fixed Rcpp dependency problems
  • fixed NA bug with reference forecasts

Changes in version 0.4.3

  • improved missing value handling
  • fixed bug in "EnsRocss"

Changes in version 0.4.2

  • Fixed bug with missing values and na.rm=T in translating to categorical forecasts in convert2prob
  • Fixed issue with limited amount of unique values in convert2prob

Changes in version 0.4.1

  • Fixed CRAN NOTE on registration of native routines
  • Fixed bug in setting up reference forecasts with missing observations
  • Improved error handling with missing values in forecast
  • Improve performance of convert2prob with reduced baseline
  • Added missing value support for count2prob
  • Fixed unit test causing episodic errors

Changes in version 0.4.0

  • Adapted to the new version of SpecsVerification (0.5.0)
  • Fully implemented out-of-sample computation of percentile thresholds
  • Changed to new implementation of AUC in 'SpecsVerification' with standard errors

Changes in version 0.3.0 (2016-09-28)

  • added support for out-of-sample reference forecasts (user-defined or by keyword for a few standard approaches, issues #1 and #2)
  • added support for difference in scores functions (related to skill scores)
  • removed onload function to check package version. Updating is now dealt with by update.packages as for other CRAN packages.
  • removed nonsensical mean error skill score EnsMess.
  • deprecated ill-defined ROC area skill score. In future versions, only the ROC area score EnsRoca will be implemented.
  • Added reliability categorization following Weisheimer et al. (2014).
  • fixed bug in convert2prob for climatological forecasts (reduced set of values to compute percentile boundaries for consistency).
  • fix for FairRpss against climatological reference forecast with category boundaries defined on distribution of verifying observations.

Changes in version 0.2.0 (2016-01-25)

  • Added ignorance score for probability forecasts EnsIgn and skill score EnsIgnss
  • replaced EnsRoca and rank.ensembles in Ens2AFC with C++ equivalents that are slightly faster
  • added documentation for veriApply (issue #4)
  • added multi-model option for relative thresholds to convert2prob
  • added support for named vector output in functions such as Corr and CorrDiff from SpecsVerification
  • added bug fix from Henrik Bengtsson (pull request #3)
  • updated documentation also to reflect that package is now available on CRAN
  • minor bugfixes

Changes in version 0.1.8 (2015-10-25)

  • prepared package for release on CRAN

Changes in version

  • added support for ensembles of size 1

Changes in version

  • fixed documentation
  • added new function to convert counts (from convert2prob) to probabilities (count2prob)


  • additional arguments for parallel processing courtesy of Matteo De Felice

Changes in version

  • added parallelization of veriApply using the parallel package
  • parallelization is based on FORK nodes, and thus won't work under Windows
  • under Windows and if parallel is not available, the original, unparallelized fallback is used
  • parallelization will use up to 16 nodes, but will leave one node free for other tasks

Changes in version

  • added toymodel to produce forecast-observation pairs
  • added toyarray to produce multiple independent forecast-observation pairs, for example at different spatial locations

Changes in version

  • Fixed bug in veriApply for reformatting output of scores (not affecting skill scores)

Changes in version

  • Added ECOMS-UDG / easyVerification vignette

Changes in version

  • updated documentation for elementary skill functions

Changes in version 0.1.5

  • probability and absolute thresholds for conversion of continuous forecasts to category forecasts can now be supplied to be forecast specific (e.g. different thresholds for different lead times and spatial locations)
  • bug fix in veriApply with minimal forecast, observation examples

Changes in version

  • Fixed bug in missing value treatment with convert2prob. This does not yet affect functions called using veriApply, as scores in veriApply are only computed for complete forecast-observation pairs

Changes in version

  • Bug fix (scaling) of standard error provided in EnsRocss

Changes in version 0.1.4

  • Support for dressed metrics from SpecsVerification. Only the standard dressing method ("silverman") is supported so far.
  • Significance for EnsRocss
  • FairSprErr Fair spread error ratio
  • EnsRocss now allows for arbitrary reference forecasts in the ROC area skill score (no significance for reference forecasts with ROC area != 0.5)

Changes in version 0.1.3

  • Ens2AFC Added the generalized discrimination score for ensembles.

Changes in version 0.1.2

  • Moved repository to new location on github.com.
  • Added vignette documenting the basic functionality of the package.
  • Bugfix for ensembles of size 1.
  • Removed redundant checks on ensemble size.



0.4.4 by Jonas Bhend, 4 months ago


Browse source code at https://github.com/cran/easyVerification

Authors: MeteoSwiss [aut, cph], Jonas Bhend [cre], Jacopo Ripoldi [ctb], Claudia Mignani [ctb], Irina Mahlstein [ctb], Rebecca Hiller [ctb], Christoph Spirig [ctb], Mark Liniger [ctb], Andreas Weigel [ctb], Joaquín Bedia Jimenez [ctb], Matteo De Felice [ctb], Stefan Siegert [ctb], Katrin Sedlmeier [ctb]

Documentation:   PDF Manual  

GPL-3 license

Imports pbapply, Rcpp

Depends on SpecsVerification, stats, utils

Suggests testthat, knitr, rmarkdown, parallel, R.rsp, verification

Linking to Rcpp

Suggested by s2dverification.

See at CRAN