Methods for Adaptive Shrinkage, using Empirical Bayes

The R package 'ashr' implements an Empirical Bayes approach for large-scale hypothesis testing and false discovery rate (FDR) estimation based on the methods proposed in M. Stephens, 2016, "False discovery rates: a new deal", . These methods can be applied whenever two sets of summary statistics---estimated effects and standard errors---are available, just as 'qvalue' can be applied to previously computed p-values. Two main interfaces are provided: ash(), which is more user-friendly; and ash.workhorse(), which has more options and is geared toward advanced users. The ash() and ash.workhorse() also provides a flexible modeling interface that can accomodate a variety of likelihoods (e.g., normal, Poisson) and mixture priors (e.g., uniform, normal).


CRAN_Status_Badge Build Status AppVeyor Build Status Coverage Status Coverage Status

This repository contains an R package for performing "Adaptive Shrinkage."

To install the ashr package first you need to install devtools:

install.packages("devtools")
library(devtools)
install_github("stephens999/ashr")

To use the interior-point solver (optmethod = "mixIP"), you need to install MOSEK and the Rmosek package. We have provided some Mac-specific and Linux-specific instructions for installing MOSEK.

The main function in the ashr package is ash. To get minimal help:

library(ashr)
?ash

More background

The ashr ("Adaptive SHrinkage") package aims to provide simple, generic, and flexible methods to derive "shrinkage-based" estimates and credible intervals for unknown quantities $\beta=(\beta_1,\dots,\beta_J)$, given only estimates of those quantities ($\hat\beta=(\hat\beta_1,\dots, \hat\beta_J)$) and their corresponding estimated standard errors ($s=(s_1,\dots,s_J)$).

The "adaptive" nature of the shrinkage is two-fold. First, the appropriate amount of shrinkage is determined from the data, rather than being pre-specified. Second, the amount of shrinkage undergone by each $\hat\beta_j$ will depend on the standard error $s_j$: measurements with high standard error will undergo more shrinkage than measurements with low standard error.

Methods Outline

The methods are based on treating the vectors $\hat\beta$ and $s$ as "observed data", and then performing inference for $\beta$ from these observed data, using a standard hierarchical modelling framework to combine information across $j=1,\dots,J$.

Specifically, we assume that the true $\beta_j$ values are independent and identically distributed from some unimodal distribution $g$. By default we assume $g$ is unimodal about zero and symmetric. You can specify or estimate a different mode using the mode parameter. You can allow for asymmetric $g$ by specifying mixcompdist="halfuniform".

Then, we assume that the observations $\hat\beta_j \sim N(\beta_j,s_j)$, or alternatively the normal assumption can be replaced by a $t$ distribution by specifying df, the number of degrees of freedom used to estimate $s_j$. Actually this is important: do be sure to specify df if you can.

News

ashr v2.0

Major Changes

There are a few major changes in output and input that will likely break existing dependencies. Here are the highlights

  • the main output (lfsr, lfdr, etc) is rearranged into a dataframe, called result.

    • so, for example, the lfsr is now a$result$lfsr instead of a$lfsr

    • or, better, use the accessor function get_lfsr(a) to extract the lfsr etc

  • I added accessor functions get_lfsr, get_lfdr, get_pm, get_psd etc to access the lfsr, lfdr, posterior mean and posterior standard deviation. Using these functions to access results will help ensure your code remains valid if I happen to change the internal structure of the results again (although not anticipated...)

  • output fitted.g is renamed fitted_g, and flash.data becomes flash_data to make the whole package convention more consistent. Also fit becomes fit_details.

  • function prefixes comppost and compdens replaced with comp_post and comp_dens, again for consistency.

  • nonzeromode option is replaced with the option mode to specify mode. Or use mode="estimate" to estimate the mode.

  • more flexible control of output. For example, you can say you want only the logLR output by specifying outputlevel = c("lfsr","logLR"), or only posterior mean by outputlevel = c("PosteriorMean").

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("ashr")

2.2-32 by Peter Carbonetto, 7 months ago


https://github.com/stephens999/ashr


Report a bug at https://github.com/stephens999/ashr/issues


Browse source code at https://github.com/cran/ashr


Authors: Matthew Stephens [aut] , Peter Carbonetto [aut, cre] , Chaoxing Dai [ctb] , David Gerard [aut] , Mengyin Lu [aut] , Lei Sun [aut] , Jason Willwerscheid [aut] , Nan Xiao [aut] , Mazon Zeng [ctb]


Documentation:   PDF Manual  


GPL (>= 3) license


Imports Matrix, stats, graphics, assertthat, truncnorm, mixsqp, SQUAREM, doParallel, pscl, Rcpp, foreach, etrunct

Suggests testthat, knitr, rmarkdown, ggplot2, REBayes

Linking to Rcpp


Enhanced by palasso.


See at CRAN