Probability of Backtest Overfitting

Following the method of Bailey et al., this package computes, for a collection of candidate models, the probability of backtest overfitting, the performance degradation and probability of loss, and the stochastic dominance.


Implements in R some of the ideas found in the Bailey et al. paper identified below. In particular, we use combinatorially symmetric cross-validation (CSCV) to implement strategy performance tests evaluated by the Omega ratio. We compute the probability of backtest overfitting, performance degradation, probability of loss, and stochastic dominance, and plot visual representations of these using the lattice package.
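To make the CSCV idea concrete, here is a minimal base R sketch. It is not the package's implementation: the function name `cscv_pbo_sketch`, the default mean/sd performance measure (used instead of the Omega ratio to avoid extra dependencies), and the rank/(N+1) convention for the relative rank are all assumptions made for illustration.

```r
# Hypothetical helper for illustration only; not part of the pbo package.
cscv_pbo_sketch <- function(m, s,
                            perf = function(x) colMeans(x) / apply(x, 2, sd)) {
  m <- as.matrix(m)
  n_obs <- nrow(m)
  n_studies <- ncol(m)
  stopifnot(s %% 2 == 0, n_obs %% s == 0)

  block <- rep(seq_len(s), each = n_obs / s)  # partition label for each row
  splits <- utils::combn(s, s / 2)            # each column: in-sample partitions

  logits <- apply(splits, 2, function(is_blocks) {
    in_sample <- block %in% is_blocks
    perf_is  <- perf(m[in_sample, , drop = FALSE])
    perf_oos <- perf(m[!in_sample, , drop = FALSE])
    best <- which.max(perf_is)                # study selected in-sample
    # relative out-of-sample rank of the selected study, kept inside (0, 1)
    w <- rank(perf_oos)[best] / (n_studies + 1)
    log(w / (1 - w))
  })

  # share of splits where the in-sample winner ranks at or below the
  # out-of-sample median
  list(logits = logits, pbo = mean(logits <= 0))
}

set.seed(42)
m <- matrix(rnorm(400 * 20), nrow = 400, ncol = 20)  # pure-noise studies
res <- cscv_pbo_sketch(m, s = 8)
length(res$logits)  # one logit per combination of 4 out of 8 partitions
res$pbo
```

Because every study here is pure noise, the in-sample winner has no real edge, so the estimate should be far from zero.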

The reference authors used the Sharpe ratio as the performance measure. Other measures are suitable according to the assumptions laid out in the paper.
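As an illustration of swapping in another measure, the sketch below defines a column-wise Sharpe ratio. It assumes the performance function receives a block of returns with one column per study and returns one value per column; that calling convention, the name `sharpe_by_column`, and the `rf` parameter are assumptions, so check the reference manual before passing such a function to the package.

```r
# Hypothetical performance function; `rf` is an assumed per-period risk-free rate.
sharpe_by_column <- function(x, rf = 0) {
  apply(as.matrix(x), 2, function(col) {
    excess <- col - rf
    mean(excess) / sd(excess)
  })
}

# two toy return series, one per column
returns <- cbind(a = c(0.01, 0.02, 0.03),
                 b = c(-0.01, 0.00, 0.01))
sharpe_by_column(returns)
```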

Example plots are shown below. The first four illustrate a test with low overfitting (T-distribution, N=100, T=1600, S=8). The second four illustrate a test from the reference paper with high overfitting (normal distribution, N=100, T=1000, S=8). The third batch illustrates some study selection performance plots for both cases.

Example test case, low overfitting:

[Figures: plot1, plot2, plot3]

Reference test case 1, high overfitting:

[Figures: plot1, plot2, plot3]

Example study selection performance for the low and high cases:

[Figures: low5, low6, low7]

[Figures: high4, high5, high6, high7]

More examples with a larger number of combinations on the same high- and low-overfitting test cases. There are 12,870 CSCV combinations with these tests (normal distribution, N=200, T=2000, S=16, Omega ratio performance).
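The number of CSCV combinations is the number of ways to choose S/2 of the S partitions as the in-sample set. A quick base R check (the variable names are illustrative only):

```r
# number of CSCV splits for a given even partition count S
cscv_combinations <- function(S) choose(S, S / 2)

n_small <- cscv_combinations(8)    # 70
n_large <- cscv_combinations(16)   # 12870
c(S8 = n_small, S16 = n_large)
```

The count grows quickly with S, which is why the parallel option becomes useful for larger partition counts.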

[Figures: lh1, lh2, lh3, lh4, lh5, lh6, lh7]

Installation

library(devtools)
install_github('mrbcuda/pbo')   # current devtools expects 'user/repo'

Example

require(pbo)
require(lattice) # for plots
require(PerformanceAnalytics) # for Omega ratio
 
N <- 200                 # studies, alternative configurations
T <- 3200                # sample returns
S <- 8                   # partition count
M <- data.frame(matrix(NA,T,N,byrow=TRUE,dimnames=list(1:T,1:N)),check.names=FALSE)
for ( i in 1:N ) M[,i] <- rt(T,10) / 100
 
# compute and plot
my_pbo <- pbo(M,S,f=Omega,threshold=1)  # performance function argument is lowercase f
summary(my_pbo)
histogram(my_pbo)
dotplot(my_pbo,pch=15,col=2,cex=1.5)
xyplot(my_pbo,plotType="cscv",cex=0.8,show_rug=FALSE,osr_threshold=100)
xyplot(my_pbo,plotType="degradation")
xyplot(my_pbo,plotType="dominance",lwd=2)
xyplot(my_pbo,plotType="pairs",cex=1.1,osr_threshold=75)
xyplot(my_pbo,plotType="ranks",pch=16,cex=1.2)
xyplot(my_pbo,plotType="selection",sel_threshold=100,cex=1.2)

Example with Parallel Processing

require(pbo)
require(lattice)
require(PerformanceAnalytics)
require(doParallel)      # for parallel processing
 
N <- 200                 # studies, alternative configurations
T <- 2000                # sample returns
S <- 16                  # partition count

# create some phony trial data
sr_base <- 0
mu_base <- sr_base/260
sigma_base <- 1/sqrt(260)

# draw independent returns for every study
M <- data.frame(matrix(rnorm(T*N,mean=0,sd=1),T,N,dimnames=list(1:T,1:N)),
                check.names=FALSE)

# re-scale and re-center each study in place
# (assignments inside sapply would only change a local copy of M)
for ( i in 1:N ) {
  M[,i] <- M[,i] * sigma_base / sd(M[,i])
  M[,i] <- M[,i] + mu_base - mean(M[,i])
}

# tweak one trial to exhibit low overfit
sr_case <- 1
mu_case <- sr_case/260
sigma_case <- sigma_base

i <- N
M[,i] <- rnorm(T,mean=0,sd=1)
M[,i] <- M[,i] * sigma_case / sd(M[,i]) # re-scale
M[,i] <- M[,i] + mu_case - mean(M[,i]) # re-center
 
cluster <- makeCluster(detectCores())
registerDoParallel(cluster)
pp_pbo <- pbo(M,S,f=Omega,threshold=1,allow_parallel=TRUE)
stopCluster(cluster)
histogram(pp_pbo)

Packages

  • utils for the combinations
  • lattice for plots
  • latticeExtra for plot overlays, used only for the SD2 measure
  • grid for plot labeling
  • foreach for parallel computation of the backtest folds

Reference

Bailey, David H. and Borwein, Jonathan M. and Lopez de Prado, Marcos and Zhu, Qiji Jim, The Probability of Back-Test Overfitting (September 1, 2013). Available at SSRN: http://ssrn.com/abstract=2326253 or http://dx.doi.org/10.2139/ssrn.2326253

install.packages("pbo")

1.3.4 by Matt Barry, 6 years ago


https://github.com/mrbcuda/pbo


Report a bug at https://github.com/mrbcuda/pbo/issues


Browse source code at https://github.com/cran/pbo


Authors: Matt Barry


Task views: Empirical Finance


MIT + file LICENSE license


Depends on utils, lattice

Suggests PerformanceAnalytics, foreach, grid, latticeExtra, testthat, doParallel, knitr

