A Program for Missing Data

Amelia II "multiply imputes" missing data in a single cross-section (such as a survey), in a time series (such as variables collected for each year in a country), or in a time-series-cross-sectional data set (such as variables collected by year for each of several countries). Amelia II implements our bootstrapping-based algorithm, which gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches, and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find otherwise!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that let experts incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.

Amelia II is an R package for the multiple imputation of multivariate incomplete data. It uses an algorithm that combines bootstrapping and the EM algorithm to take draws from the posterior of the missing data. The Amelia package includes normalizing transformations, cell-level priors, and methods for handling time-series cross-sectional data.
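As a minimal sketch of the workflow described above, the following runs `amelia()` on the `freetrade` example data that ships with the package, declaring the time and unit identifiers so that the data are treated as time-series cross-sectional. The value `m = 5` and the object name `a.out` are illustrative choices, not recommendations.

```r
library(Amelia)

# Example time-series cross-sectional data bundled with Amelia:
# yearly observations on several countries, with missing cells.
data(freetrade)

# Create m = 5 imputed data sets. `ts` names the time variable and
# `cs` names the cross-sectional unit; p2s = 0 silences console output.
a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country", p2s = 0)

# Each element of a.out$imputations is a completed copy of the data.
summary(a.out)
```

Analyses are then run separately on each completed data set and the results combined.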

How to install

Installation requirements

Amelia requires R version 2.14.0 or higher.

Manual installation
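The released version can be installed from CRAN. The sketch below first checks the R version requirement stated above and installs the package only if it is missing; the mirror URL is an assumption and any CRAN mirror can be substituted.

```r
# Amelia requires R >= 2.14.0; stop early on older installations.
if (getRversion() < "2.14.0") {
  stop("Amelia requires R 2.14.0 or higher")
}

# Install the released version from CRAN only if it is not already present.
# The mirror below is an assumed default; substitute a preferred one.
if (!requireNamespace("Amelia", quietly = TRUE)) {
  install.packages("Amelia", repos = "https://cloud.r-project.org")
}
```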


Installing the unstable developer version (requires the devtools package):

devtools::install_github("IQSS/Amelia", ref = "develop")


Amelia II - User visible changes

== 1.7.5 (07 May 2018) ==

  • Fixed bug with factor names under perfect collinearity
  • Added "draws" argument to overimp to control number of overimputation draws
  • Fix issue with tibbles and tscsPlot()
  • Fix issues with missmap()
  • Fixed issue with iterHist indicators being reversed

== 1.7.4 (21 Nov 2015) ==

  • Fixed issue with log axes in overimpute
  • Allow for vector in 'main' argument in tscsPlot()
  • Moved a collinearity check from error to warning.
  • Handle subsets better in moPrep()
  • tscsPlot() won't throw an error when cs is unspecified and plotall=TRUE
  • Fixed other small bugs and issues

== 1.7.3 (14 Nov 2014) ==

  • Fixed bug with overimp not being respected
  • Added an argument boot.type='none' to amelia() to allow it to run on the original, non-bootstrapped data
  • Fixed bug in plot.amelia() with matrix inputs
  • Fixed bug with lower bounds not being respected
  • Made compatible with most recent versions of Rcpp and RcppArmadillo

== 1.7.2 (08 Jun 2013) ==

  • Bug fixes to priors (especially important for multiple overimputation).
  • Fixed issue with names of imputations for integration with Zelig.

== 1.7.1 (24 Mar 2013) ==

  • Speed improvement (thanks to Paul Johnson).
  • Amelia now requires R>=2.13.5
  • missmap() now displays correctly when data is completely observed.
  • An error is now raised when users try to use overimpute() on a variable marked as nominal.
  • Fixed a bug when all imputations resulted in uninvertible covariance matrices.
  • Fixed a bug where incorrectly setting the emburn argument could cause a segfault.
  • Various package cleanups for CRAN compatibility.

== 1.7 (10 Feb 2013) ==

  • Ported core EM algorithm to C++. Speed should increase.
  • Plots in AmeliaView should now use Quartz on Mac OS X instead of X11.
  • Amelia now requires R >=2.14.0.
  • Amelia now can run its imputations in parallel using infrastructure from R's parallel package. Note that R < 2.15.3 will crash if parallel is used while tcltk is loaded (or has been loaded and then unloaded). This will be fixed in R 2.15.3 (the patched version of 2.15.2) and we will require R>=2.15.3 when that version is released.
  • Fixed bug with priors not working correctly.
  • Fixed bug with character variables set to nominal.
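The parallel option introduced in this release can be exercised roughly as follows. This is a sketch that assumes the `parallel` and `ncpus` arguments of `amelia()` and uses two workers purely for illustration; the `backend` variable is a name introduced here.

```r
library(Amelia)
data(freetrade)

# "multicore" forks worker processes and is unavailable on Windows;
# "snow" starts socket workers and works on every platform.
backend <- if (.Platform$OS.type == "windows") "snow" else "multicore"

# Run the m imputations across two workers (illustrative value).
a.out <- amelia(freetrade, m = 4, ts = "year", cs = "country",
                p2s = 0, parallel = backend, ncpus = 2)
```

Because each imputation is computed independently, the work divides cleanly across workers; gains depend on the number of imputations and available cores.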

== 1.6 (22 Feb 2012) ==

  • Added a transform function to create transformed variables in the imputed datasets.
  • Added a mi.meld() function that can combine quantities of interest using the Rubin rules.
  • Added a subset argument to overimpute.
  • write.amelia() can now create a stacked/long imputed dataset (also updated in AmeliaView)
  • Fixed a bug in moPrep (Thanks to Jeff Arnold for the patch)
  • missmap() has an argument to not re-order the variables.
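To make the Rubin rules mentioned for `mi.meld()` concrete, here is a base R sketch of the combination step. The helper name `rubin_combine` and the toy numbers are hypothetical and not part of Amelia; in practice `mi.meld()` takes matrices of estimates and standard errors in a similar layout.

```r
# Hypothetical helper, not part of Amelia: combine estimates from m
# imputed data sets using Rubin's rules.
# q and se are m x k matrices: one row per imputation, one column per parameter.
rubin_combine <- function(q, se) {
  q  <- as.matrix(q)
  se <- as.matrix(se)
  m  <- nrow(q)
  stopifnot(m >= 2, all(dim(q) == dim(se)))

  q_bar   <- colMeans(q)                      # pooled point estimate
  within  <- colMeans(se^2)                   # average within-imputation variance
  between <- apply(q, 2, var)                 # between-imputation variance
  total   <- within + (1 + 1 / m) * between   # total variance

  list(q.mi = q_bar, se.mi = sqrt(total))
}

# Toy input: one parameter estimated on three imputed data sets.
q   <- matrix(c(1.0, 1.2, 0.8), ncol = 1)
se  <- matrix(c(0.5, 0.5, 0.5), ncol = 1)
res <- rubin_combine(q, se)
```

The pooled standard error is larger than any single-imputation standard error whenever the estimates differ across imputations, which is how the uncertainty due to missingness enters the final result.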

== 1.5-4 ==

  • Fixed a bug with error messages.

== 1.5-3 ==

  • Fixed a bug with completely missing rows in the tscsPlot().

== 1.5-2 (26 Apr 2011) ==

  • Fixed a bug in the handling of priors.

== 1.5-1 (23 Nov 2010) ==

  • Fixed a bug in the new GUI where it didn't respect the "intercs" option.

== 1.5-0 (23 Nov 2010) ==

  • Major changes to the AmeliaView GUI.

== 1.2-18 (4 Nov 2010) ==

  • Fixed a bug when all variables are set to nominal or ordinal.

== 1.2-17 (10 May 2010) ==

  • Fixed a bug with the 'ask' argument when using "plot" on an 'amelia' object.

== 1.2-16 (20 Mar 2010) ==

  • Fixed a bug that occurred when priors were specified.
  • When priors are used, Amelia now tries to use starting values with the prior-filled data.

== 1.2-15 (20 Feb 2010) ==

  • Fixed a bug when only 1 variable is not an ID variable or a nominal/ordinal variable.
  • Fixed a bug with the naming of columns in the imputation process.

== 1.2-14 (16 Nov 2009) ==

  • Fixed a bug where "ords" variables would return multiple copies of the same level.

== 1.2-13 (09 Aug 2009) ==

  • Fixed a small bug in the error checking routines that handled nominal variables.

== 1.2-12 (11 Jul 2009) ==

  • Fixed a bug in AmeliaView that caused it to crash.

== 1.2-11 (10 Jul 2009) ==

  • Minor bugfixes in removing test code from AmeliaView() and handling of the priors.

== 1.2-10 (07 Jul 2009) ==

  • Fixed a bug in the error checking routine that occurred when users put all of their variables into one of (idvars, noms, ords, ts, cs).

== 1.2-9 (02 Jul 2009) ==

  • Fixed typos in the manual with regard to ridge priors and clarified the advice about them.

== 1.2-8 (01 Jul 2009) ==

  • Major update to the Amelia manual (now compiled as a vignette using Sweave).
  • Fixed a typo that labeled values as "percent missing" when they are in fact the "fraction missing."

== 1.2-7 (29 Jul 2009) ==

  • In the amelia output, mu and covMatrices now have relevant dimension names so that it is clear which column is which.
  • Fixed a bug in the handling of priors that may have affected answers, but not significantly.
  • The missmap() function can now accept any matrix or data.frame, not just Amelia output. This allows for drawing a missingness map before running amelia().
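A short sketch of inspecting missingness before imputation: the toy data frame `df` is made up for illustration, the per-variable fraction missing is computed in base R, and `missmap()` is called on the raw data frame only if Amelia is installed.

```r
# Toy data with missing cells (illustrative values only).
df <- data.frame(a = c(1, NA, 3, NA),
                 b = c(2.5, 1.0, NA, 4.0),
                 c = c(10, 20, 30, 40))

# Logical matrix that is TRUE wherever a cell is missing.
miss_pattern <- is.na(df)

# Fraction (not percent) of missing values in each variable.
frac_missing <- colMeans(miss_pattern)

# missmap() accepts a plain data.frame, so the map can be drawn
# before any call to amelia().
if (requireNamespace("Amelia", quietly = TRUE)) {
  Amelia::missmap(df)
}
```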

== 1.2-0 (09 Apr 2009) ==

  • Amelia output is now an instance of the S3 class 'amelia'.
  • Imputations are now stored in a list of length 'm' (the number of imputations) in output$imputations, which is of the class 'mi', making it simple to pass to Zelig.
  • Amelia output contains a matrix of means (one column for each imputation) and an array of covariance matrices. These are the posterior modes found by the EM algorithm in each imputation.
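The output structure described in these notes can be inspected as in the sketch below, which uses the bundled `freetrade` data. The names `m` and `a.out` are illustrative, and the layout shown reflects the description above rather than a guarantee for every package version.

```r
library(Amelia)
data(freetrade)

m <- 3
a.out <- amelia(freetrade, m = m, ts = "year", cs = "country", p2s = 0)

class(a.out)                    # S3 class of the output object
length(a.out$imputations)       # one completed data set per imputation
head(a.out$imputations[[1]])    # first completed data set

# EM results per imputation: means stored as columns of a matrix,
# covariance matrices stored as slices of a three-dimensional array.
dim(a.out$mu)
dim(a.out$covMatrices)
```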



1.8.0 by Matthew Blackwell, 8 months ago


Browse source code at https://github.com/cran/Amelia

Authors: James Honaker [aut] , Gary King [aut] , Matthew Blackwell [aut, cre]


Task views: Official Statistics & Survey Methodology, Statistics for the Social Sciences, Missing Data, Official Statistics & Survey Statistics

GPL (>= 2) license

Imports foreign, utils, grDevices, graphics, methods, stats

Depends on Rcpp

Suggests tcltk, Zelig, rmarkdown, knitr

Linking to Rcpp, RcppArmadillo

Imported by COINr, NADIA, OVtool, Zelig, missCompare.

Depended on by TestDataImputation, bmem.

Suggested by MKinfer, MKmisc, MatchThem, bucky, cem, merTools, mitml, semTools, simsem.

Enhanced by miceadds.
