Methods for Adaptive Shrinkage, using Empirical Bayes

The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", . These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. The ash() and ash.workhorse() also provides a flexible modeling interface that can accomodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).


CRAN_Status_Badge Build Status AppVeyor Build Status Coverage Status Coverage Status

This repository contains an R package for performing "Adaptive Shrinkage."

To install the ashr package first you need to install devtools:

install.packages("devtools")
library(devtools)
install_github("stephens999/ashr")

To use the interior-point solver (optmethod = "mixIP"), you need to install MOSEK and the Rmosek package. We have provided some Mac-specific and Linux-specific instructions for installing MOSEK.

The main function in the ashr package is ash. To get minimal help:

library(ashr)
?ash

More background

The ashr ("Adaptive SHrinkage") package aims to provide simple, generic, and flexible methods to derive "shrinkage-based" estimates and credible intervals for unknown quantities $\beta=(\beta_1,\dots,\beta_J)$, given only estimates of those quantities ($\hat\beta=(\hat\beta_1,\dots, \hat\beta_J)$) and their corresponding estimated standard errors ($s=(s_1,\dots,s_J)$).

The "adaptive" nature of the shrinkage is two-fold. First, the appropriate amount of shrinkage is determined from the data, rather than being pre-specified. Second, the amount of shrinkage undergone by each $\hat\beta_j$ will depend on the standard error $s_j$: measurements with high standard error will undergo more shrinkage than measurements with low standard error.

Methods Outline

The methods are based on treating the vectors $\hat\beta$ and $s$ as "observed data", and then performing inference for $\beta$ from these observed data, using a standard hierarchical modelling framework to combine information across $j=1,\dots,J$.

Specifically, we assume that the true $\beta_j$ values are independent and identically distributed from some unimodal distribution $g$. By default we assume $g$ is unimodal about zero and symmetric. You can specify or estimate a different mode using the mode parameter. You can allow for asymmetric $g$ by specifying mixcompdist="halfuniform".

Then, we assume that the observations $\hat\beta_j \sim N(\beta_j,s_j)$, or alternatively the normal assumption can be replaced by a $t$ distribution by specifying df, the number of degrees of freedom used to estimate $s_j$. Actually this is important: do be sure to specify df if you can.

News

ashr v2.0

Major Changes

There are a few major changes in output and input that will likely break existing dependencies. Here are the highlights

  • the main output (lfsr, lfdr, etc) is rearranged into a dataframe, called result.

    • so, for example, the lfsr is now a$result$lfsr instead of a$lfsr

    • or, better, use the accessor function get_lfsr(a) to extract the lfsr etc

  • I added accessor functions get_lfsr, get_lfdr, get_pm, get_psd etc to access the lfsr, lfdr, posterior mean and posterior standard deviation. Using these functions to access results will help ensure your code remains valid if I happen to change the internal structure of the results again (although not anticipated...)

  • output fitted.g is renamed fitted_g, and flash.data becomes flash_data to make the whole package convention more consistent. Also fit becomes fit_details.

  • function prefixes comppost and compdens replaced with comp_post and comp_dens, again for consistency.

  • nonzeromode option is replaced with the option mode to specify mode. Or use mode="estimate" to estimate the mode.

  • more flexible control of output. For example, you can say you want only the logLR output by specifying outputlevel = c("lfsr","logLR"), or only posterior mean by outputlevel = c("PosteriorMean").

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("ashr")

2.2-7 by Peter Carbonetto, a year ago


https://github.com/stephens999/ashr


Report a bug at https://github.com/stephens999/ashr/issues


Browse source code at https://github.com/cran/ashr


Authors: Matthew Stephens [aut] , Peter Carbonetto [aut, cre] , Chaoxing Dai [ctb] , David Gerard [aut] , Mengyin Lu [aut] , Lei Sun [aut] , Jason Willwerscheid [aut] , Nan Xiao [aut] , Mazon Zeng [ctb]


Documentation:   PDF Manual  


GPL (>= 3) license


Imports Matrix, stats, graphics, assertthat, truncnorm, SQUAREM, doParallel, pscl, Rcpp, foreach, etrunct

Suggests testthat, roxygen2, covr, knitr, rmarkdown, ggplot2, REBayes, Rmosek

Linking to Rcpp


Imported by CorShrink.


See at CRAN