Test for a Neutral Evolutionary Model in Cancer Sequencing Data

The package takes the frequencies of mutations reported by high-throughput sequencing of cancer samples and fits a theoretical neutral model of tumour evolution. It outputs summary statistics and includes code for plotting the data and model fits. See Williams et al. 2016 and Williams et al. 2017 for further details of the method.

This is an R package to analyse Variant Allele Frequency (VAF) distributions as reported from high-throughput cancer sequencing. It reports 4 summary statistics and associated p-values based on a neutral model of tumour evolution, and provides functions to plot the VAF histogram and model fits.

Getting Started

You can download the package from CRAN in the usual way.

install.packages("neutralitytestr")
To download the latest development version you'll need to use the devtools package:

devtools::install_github("marcjwilliams1/neutralitytestr")

The package comes with some preloaded test data, generated from a simulation of tumour growth. These test data sets are called VAFselection and VAFneutral.

The basic functionality of the neutralitytestr package is achieved by creating a neutralitytest object. The neutralitytest object contains a range of metrics to test for neutrality, and makes plotting histograms and cumulative distributions to visualize the output easy. The neutralitytest function takes a vector of VAFs and an upper and lower limit for the frequency range over which we wish to test whether the data is consistent with a neutral model, and then calculates all 4 metrics.

out <- neutralitytest(VAFneutral, fmin = 0.1, fmax = 0.25)

The neutralitytest object can be summarised using the summary(out) command.

Summary of neutrality metrics:

Area:
  value =  0.03067203 , p-value =  0.679
Kolmogorov Distance:
  value =  0.09036603 , p-value =  0.641
Mean distance:
  value =  0.04131414 , p-value =  0.595
R^2:
  value =  0.98993 , p-value =  0.404

Effective mutation rate =  216.7985
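
To give some intuition for the effective mutation rate and the R^2 value, here is a minimal base-R sketch of the relationship from Williams et al. 2016, in which the cumulative number of mutations with frequency at least f grows as (mu/beta) * (1/f - 1/fmax) under neutrality. It does not use the package and is not its implementation: the toy data, the frequency grid and all variable names are assumptions made for illustration, and the package may compute its metrics differently.

```r
# Sketch only: base-R illustration of the neutral 1/f relationship,
# not the neutralitytestr implementation.
set.seed(1)
fmin <- 0.1
fmax <- 0.25

# Hypothetical toy data: draw VAFs with density proportional to 1/f^2 on
# [fmin, fmax] by inverse-transform sampling.
u <- runif(2000)
vaf <- 1 / (1 / fmin - u * (1 / fmin - 1 / fmax))

# Cumulative number of mutations with VAF >= f on a grid of frequencies.
f <- seq(fmax, fmin, by = -0.005)
M <- sapply(f, function(x) sum(vaf >= x))

# Under neutrality M(f) = (mu/beta) * (1/f - 1/fmax): a line through the origin.
x <- 1 / f - 1 / fmax
fit <- lm(M ~ x + 0)

slope <- unname(coef(fit))  # least squares estimate of mu/beta
rsq <- 1 - sum(residuals(fit)^2) / sum((M - mean(M))^2)
```

For data generated this way the points fall close to a straight line, so `rsq` is near 1; a subclone under selection would bend the curve away from the line and lower it.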

The following commands will plot a VAF histogram, a least squares model fit and the normalized distributions. For more information see the vignette.
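
A sketch of those calls, assuming `out` is the object created above. `lsq_plot` is named in the release notes of this package; `vaf_histogram` and `normalized_plot` are function names recalled from the package documentation and should be checked against the reference manual before use.

```r
library(neutralitytestr)

out <- neutralitytest(VAFneutral, fmin = 0.1, fmax = 0.25)

vaf_histogram(out)    # assumed name: histogram of the VAFs
lsq_plot(out)         # least squares model fit, annotated with mu/beta
normalized_plot(out)  # assumed name: normalized cumulative distributions
```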


We can also input the read depth, cellularity, overdispersion rho and ploidy and let the package calculate an appropriate upper integration limit by considering the expected standard deviation of the clonal peak. Using this on the VAFselection data we would do the following:

out <- neutralitytest(VAFselection, read_depth = 100.0, cellularity = 0.8, rho = 0.0, ploidy = 2)
plot_all(out) #this will plot all 3 of the above plots and combine into 1 figure.
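
As a rough illustration of how such an upper limit could follow from the clonal peak, the sketch below places the limit a fixed number of standard deviations below the expected clonal VAF, using the beta-binomial variance for the overdispersion. The helper name `upper_limit_sketch` and the multiplier `k = 2` are hypothetical; the package's own rule may differ, so treat this as a sketch under stated assumptions rather than its implementation.

```r
# Sketch only: one plausible way to derive an upper limit from the clonal peak.
# The helper name and the multiplier k are hypothetical, not the package's API.
upper_limit_sketch <- function(read_depth, cellularity, rho = 0, ploidy = 2, k = 2) {
  # Expected VAF of clonal mutations present on one copy.
  peak <- cellularity / ploidy
  # Beta-binomial variance of the observed proportion at this depth;
  # reduces to the binomial variance when rho = 0.
  v <- peak * (1 - peak) / read_depth * (1 + (read_depth - 1) * rho)
  # Stay k standard deviations below the clonal peak.
  peak - k * sqrt(v)
}

fmax_sketch <- upper_limit_sketch(read_depth = 100, cellularity = 0.8,
                                  rho = 0, ploidy = 2)
```

With the inputs used above, the clonal peak sits at 0.4, and a larger `rho` or a lower read depth pushes the limit further down.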


Note that the p-values should be interpreted with care; they are an approximation meant to guide the interpretation of the test statistics. They were generated empirically from a simulated cohort of cancers with known ground truth, and are derived from the same data that generated the ROC curves in supplementary figure 3 of the paper. This cohort of simulated tumours was "sequenced" to 100X, so if the sample you are analysing was sequenced to a much higher or lower depth the p-values may no longer be valid. We have also developed a Bayesian alternative for identifying neutral and non-neutral tumours, which is available separately. That method is much more computationally expensive and can take upwards of 10 hours per sample.


neutralitytestr 0.0.2

  • Added option so that the maximum of the integration range can be calculated from read depth
  • Added mu/beta = annotation to lsq_plot rather than y =

neutralitytestr 0.0.1

First release of package.


0.0.3 by Marc Williams, 8 months ago


Report a bug at https://github.com/marcjwilliams1/neutralitytestr/issues

Browse source code at https://github.com/cran/neutralitytestr

Authors: Marc Williams [aut, cre]

MIT + file LICENSE license

Imports dplyr, ggplot2, scales, pracma, ggpmisc, cowplot

Suggests knitr, rmarkdown, testthat