A Program for Missing Data
A tool that "multiply imputes" missing data in a single cross-section
(such as a survey), from a time series (like variables collected for
each year in a country), or from a time-series-cross-sectional data
set (such as collected by years for each of several countries).
Amelia II implements our bootstrapping-based algorithm that gives
essentially the same answers as the standard IP or EMis approaches,
is usually considerably faster than existing approaches and can
handle many more variables. Unlike Amelia I and other statistically
rigorous imputation software, it virtually never crashes (but please
let us know if you find to the contrary!). The program also
generalizes existing approaches by allowing for trends in time series
across observations within a cross-sectional unit, as well as priors
that allow experts to incorporate beliefs they have about the values
of missing cells in their data. Amelia II also includes useful
diagnostics of the fit of multiple imputation models. The program
works from the R command line or via a graphical user interface that
does not require users to know R.
Amelia II is an R package for the multiple imputation of multivariate incomplete data. It uses an algorithm that combines bootstrapping and the EM algorithm to take draws from the posterior of the missing data. The Amelia package includes normalizing transformations, cell-level priors, and methods for handling time-series cross-sectional data.
How to install
Amelia requires R version 2.14.0 or higher.
Installing unstable developer version:
install_github("IQSS/Amelia", ref = "develop")
// Amelia II - User visible changes
== 1.7.5 (07 May 2018) ==
- Fixed bug with factor names under perfect collinearity
- Added "draws" argument to overimp to control number of overimputation draws
- Fix issue with tibbles and tscsPlot()
- Fix issues with missmap()
- Fixed issue with iterHist indicators being reversed
== 1.7.4 (21 Nov 2015) ==
- Fixed issue with log axes in overimpute
- Allow for vector in 'main' argument in tscsPlot()
- Moved a collinearity check from error to warning.
- Handle subsets better in moPrep()
- tscsPlot() won't throw an error when cs is unspecified and plotall=TRUE
- Fixed other small bugs and issues
== 1.7.3 (14 Nov 2014) ==
- Fixed bug with overimp not being respected
- Added an argument boot.type='none' to amelia() to allow it to run on the original, non-bootstrapped data
- Fixed bug in plot.amelia() with matrix inputs
- Fixed bug with lower bounds not being respected
- Made compatible with most recent versions of Rcpp and RcppArmadillo
== 1.7.2 (08 Jun 2013) ==
- Bug fixes to priors (especially important for multiple overimputation).
- Fixed issue with names of imputations for integration with Zelig.
== 1.7.1 (24 Mar 2013) ==
- Speed improvement (thanks to Paul Johnson).
- Amelia now requires R>=2.13.5
- missmap() now displays correctly when data is completely observed.
- An error is now called when users try to use overimpute() on a variable marked as nominal.
- Fixed a bug when all imputations resulted in uninvertible covariance matrices.
- Fixed a bug where incorrectly setting the emburn argument could cause a segfault.
- Various package cleanups for CRAN compatibility.
== 1.7 (10 Feb 2013) ==
- Ported core EM algorithm to C++. Speed should increase.
- Plots in AmeliaView should now use Quartz on Mac OS X instead of X11.
- Amelia now requires R >=2.14.0.
- Amelia now can run its imputations in parallel using infrastructure from R's parallel package. Note that R < 2.15.3 will crash if parallel is used while tcltk is loaded (or has been loaded and then unloaded). This will be fixed in R 2.15.3 (the patched version of 2.15.2) and we will require R>=2.15.3 when that version is released.
- Fixed bug with priors not working correctly.
- Fixed bug with character variables set to nominal.
== 1.6 (22 Feb 2012) ==
- Added a transform function to create transformed variables
in the imputed datasets.
- Added a mi.meld() function that can combine quantities of
interest using the Rubin rules.
- Added a subset arugment to overimpute.
- write.amelia() can now create a stacked/long imputed
datatset (also updated to AmeliaView)
- Fixed a bug in moPrep (Thanks to Jeff Arnold for the patch)
- missmap() has an arugment to not re-order the variables.
== 1.5-4 ==
- Fixed a bug with error messages.
== 1.5-3 ==
- Fixed a bug with completely missing rows in the tscsPlot().
== 1.5-2 (26 Apr 2011) ==
- Fixed a bug in the handling of priors.
== 1.5-1 (23 Nov 2010) ==
- Fixed a bug in the new GUI where it didn't respect the "intercs" option.
== 1.5-0 (23 Nov 2010) ==
- Major changes to the AmeliaView GUI.
== 1.2-18 (4 Nov 2010) ==
- Fixed a bug when all variables are set to nominal or ordinal.
== 1.2-17 (10 May 2010) ==
- Fixed a bug with the 'ask' argument when using "plot" on an
== 1.2-16 (20 Mar 2010) ==
- Fixed a bug when priors specified.
- When priors are used, Amelia now tries to use starting values
with the prior-filled data.
== 1.2-15 (20 Feb 2010) ==
- Fixed a bug when only 1 variable is not an ID variable or a
- Fixed a bug with the naming of columns in the imputation
== 1.2-14 (16 Nov 2009) ==
- Fixed a bug that "ords" variables would return multiple copies
of the same level.
== 1.2-13 (09 Aug 2009) ==
- Fixed a small bug in the error checking routines that handled
== 1.2-12 (11 Jul 2009) ==
- Fixed a bug in AmeliaView that caused it to crash.
== 1.2-11 (10 Jul 2009) ==
- Minor bugfixes in removing test code from AmeliaView() and
handling of the priors.
== 1.2-10 (07 Jul 2009) ==
- Fixed a bug in the error checking routine that occurred when
users put all of their variables into one of (idvars, noms,
ords, ts, cs).
== 1.2-9 (02 Jul 2009) ==
- Fixed typos in the manual with regard to ridge priors and
clarified the advice about them.
=== 1.2-8 (01 Jul 2009) ==
* Major update to the Amelia manual (now compiled as a vignette
* Changed a typo that stated values were the "percent missing"
when they should have been "fraction missing." This is fixed.
=== 1.2-7 (29 Jul 2009) ==
* In the amelia output, mu and covMatrices now have relevant
dimension names to be able to tell which column which.
* Fixed a bug in the handling of priors that may have affected
answers, but not significantly.
* The missmap() function can now accept any matrix or data.frame,
not just Amelia output. This allows for drawing a missingness
map before running amelia().
== 1.2-0 (09 Apr 2009) ==
* Amelia output is now an instance of the S3 class 'amelia'.
* Imputations are now stored in a list of length 'm' (the number
of imputations) in output$imputations, which is of the class
'mi', making it simple to pass to Zelig.
* Amelia output contains a matrix of means (one column for each
imputation) and an array of covariance matrices. These are the
posterior modes found by the EM algorithm in each imputation.