An easy way to import census, survey and geographic data provided by 'IPUMS' into R, plus tools to help use the associated metadata to make analysis easier. 'IPUMS' data describing 1.4 billion individuals drawn from over 750 censuses and surveys is available free of charge from our website <https://ipums.org>.
The ipumsr package helps import IPUMS extracts from the IPUMS website into R. IPUMS provides census and survey data from around the world integrated across time and space. IPUMS integration and documentation make it easy to study change, conduct comparative research, merge information across data types, and analyze individuals within family and community context. Data and services are available free of charge.
The ipumsr package can be installed by running the following command:
install.packages("ipumsr")
Or, you can install the development version using the following commands:
if (!require(devtools)) install.packages("devtools")
devtools::install_github("mnpopcenter/ipumsr")
The vignettes are a great place to learn more about ipumsr and IPUMS data:
For a general introduction see the ipums vignette.
For a more detailed look at some of the features, see these vignettes:
Or to see examples of how to work through data from particular projects, see these vignettes:
You can access them with the vignette() command.
If you are installing from github and want the vignettes, you’ll need to run the following commands first:
devtools::install_github("mnpopcenter/ipumsr/ipumsexamples")
devtools::install_github("mnpopcenter/ipumsr", build_vignettes = TRUE)
We greatly appreciate bug reports, suggestions or pull requests. They can be submitted via github, or by email to [email protected]
Lots of improvements for users who wish to use "big data" sized IPUMS extracts. See the vignette using the command vignette("ipums-bigdata", package = "ipumsr") for the full details.
There are now chunked versions of the microdata reading functions, which let you apply functions to subsets of the data as you read it in.
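The idea behind chunked reading can be sketched in base R. This is only an illustration of the pattern, not ipumsr's own interface: the helper process_in_chunks(), the temporary file, and the "age" column are all hypothetical. Each chunk is passed to a callback, and only the per-chunk results are kept in memory.

```r
# Hypothetical base R illustration (not ipumsr's API): read a text file in
# fixed-size chunks and keep only a small summary of each chunk.

# Write a small example file so the sketch is self-contained.
example_file <- tempfile(fileext = ".csv")
writeLines(c("age", as.character(1:25)), example_file)

# Apply `callback` to each chunk of up to `chunk_size` data lines.
process_in_chunks <- function(path, chunk_size, callback) {
  con <- file(path, open = "r")
  on.exit(close(con))
  readLines(con, n = 1)  # skip the header line
  results <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # no more data
    results[[length(results) + 1]] <- callback(as.numeric(lines))
  }
  results
}

# Sum the values chunk by chunk, then combine the per-chunk sums.
chunk_sums <- process_in_chunks(example_file, chunk_size = 10, callback = sum)
total <- sum(unlist(chunk_sums))
```

Because only one chunk is held at a time, the same approach scales to files that do not fit in memory, provided the per-chunk results stay small.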
There is a new function
ipums_collect() which combined
set_ipums_attributes() to add value and variable labels to data collected from
When reading gzipped files, ipumsr no longer has to store the full text in memory.
Added pillar printing for labelled classes in tibbles. This means that the labels are printed alongside the values when a labelled column is printed in a tibble (in a subtle grey color when the terminal supports it). To turn this feature off, use the command options("ipumsr.show_pillar_labels" = FALSE).
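As a small sketch of how the option mentioned above might be toggled for a single session, the following uses only base R's options() and getOption(). The variable name old is just for illustration.

```r
# Turn off label printing in tibbles for this session, remembering the
# previous setting so it can be restored later.
old <- options(ipumsr.show_pillar_labels = FALSE)

# Check the current value of the option.
getOption("ipumsr.show_pillar_labels")

# Restore the previous setting (the option is unset again if it was unset before).
options(old)
```

Placing the options() call in a project's .Rprofile would presumably make the setting persistent, at the cost of hiding it from people reading the analysis script.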
The approach to reading hierarchical data files is much faster.
read_ipums_sp() are now in the same order as
read_ipums_sp() gains 2 new arguments: one that allows you to pick a subset of variables, and add_layer_var, which lets you add a variable indicating which layer it came from.
You can now use your inside voice for variable names with a new argument to the read_ipums_micro() family of functions that makes the variable names lower case.
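For data that has already been loaded, a comparable result can be had in base R. This is a minimal sketch on a made-up data frame (the object extract and its values are hypothetical), not a demonstration of the new argument itself.

```r
# Hypothetical data frame with upper-case names, as IPUMS variables usually appear.
extract <- data.frame(YEAR = c(2000, 2010), AGE = c(34, 51))

# Convert all variable names to lower case.
names(extract) <- tolower(names(extract))
names(extract)
```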
ipumsr is compatible with versions of haven newer than 2.0 (while maintaining compatibility with earlier versions). (#31)
IPUMS Terra is now officially supported! Read raster, area or microdata extracts
Add support for keyvar in DDI, which will (eventually) help link data across record types in hierarchical extracts. To be effective, this requires more support on the ipums.org website, which is hopefully coming soon (#25 - thanks @mpadge!)
Improved main vignette instructions for Safari users (#27)
Fix for selecting columns from csv extracts (#26 - thanks forum user JCambon_OIS!)
Fixes for the ipums_list_*() family of functions.
Fixed a bug in ipums_shape_*_join functions when using integer ID columns. (#16)
Allow for unzipped folders because Safari on macOS unzips folders by default (#17)
lbl_relabel behavior is improved so that labels aren't assigned sequentially (#21)