Tools for Obtaining and Cleaning Medicare Public Use Files

Publicly available Medicare data frequently requires extensive initial effort to extract and merge the desired variables; this package formalizes the techniques I've found work best. More information on the Medicare program, as well as guidance on the public data this package targets, can be found on the CMS website.

The medicare package is a collection of functions and methods I've used to manipulate Medicare data and get it ready for analysis. This includes efficiently subsetting messy Cost Report data to pull desired variables, renaming variables in data that doesn't come with headers, and finding more useful names for Provider of Services files from the early 2000s, which name variables sequentially starting from "PROV0001".

Publicly available Medicare data often requires extensive preparation and cleaning before any analysis can take place. Files are often raw dumps of database tables, which the researcher is expected to subset and merge to make a workable dataset. This package contains methods to extract data from such datasets (e.g. Cost Reports), provide useful names for variables (Cost Reports and Provider of Services File), and even parse data dictionary / layout files to extract variable names for older datasets, where names in the raw data are essentially Var1, Var2, Var3... (Provider of Services File).
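The renaming step described above can be sketched in base R. This is only an illustration of the general idea, not the package's own functions: the `raw` data frame, the `layout` lookup table, and the cleaned names are all hypothetical stand-ins for a real Provider of Services extract and its parsed layout file.

```r
# Hypothetical raw extract: columns named sequentially, as in older files.
raw <- data.frame(
  PROV0001 = c("010001", "010005"),
  PROV0002 = c("AL", "AL"),
  PROV0003 = c(420L, 204L),
  stringsAsFactors = FALSE
)

# Hypothetical lookup table, as might be parsed from a layout file.
# The clean names here are invented for illustration.
layout <- data.frame(
  raw_name   = c("PROV0001", "PROV0002", "PROV0003"),
  clean_name = c("provider_id", "state", "bed_count"),
  stringsAsFactors = FALSE
)

# Match each raw column name against the layout;
# keep the raw name whenever the layout has no entry for it.
idx <- match(names(raw), layout$raw_name)
names(raw) <- ifelse(is.na(idx), names(raw), layout$clean_name[idx])

names(raw)
```

Falling back to the raw name for unmatched columns means an incomplete layout file never silently drops or blanks a variable.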

Installation and Documentation

medicare is under active development and available on CRAN, where you can get the latest release version of the package.
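Because the package is published on CRAN under the name medicare, the standard base R installation call should apply. A minimal sketch, assuming a working internet connection and a configured CRAN mirror:

```r
# Install the released version from CRAN (standard base R call).
install.packages("medicare")

# Load the package for the current session.
library(medicare)
```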


You can install the development version of the medicare package using devtools:


Please let me know about any problems by opening an issue.

For detailed examples on how to use some of the functionality, check out the Vignettes, which show examples similar to what I've done in my own work.



Medicare 0.2.1

  • Updates:
    • price_deflate() updated with newer government published figures.
    • price_deflate() restricted to years 2007-2014.

Medicare 0.2.0

  • New features:
    • price_deflate() function: deflates spending from specific Medicare sectors (i.e. inpatient, physician, hospice) within the range 2002:2014.
  • Bug fixes:
    • Fixed row / column subsetting in cr_extract(), which wasn't working correctly for tibbles.
    • Fixed typos in vignettes.

Medicare 0.1.0

  • Initial release: functions for manipulating cost reports and provider of services data.



0.2.1 by Robert Gambrel, 5 years ago


Authors: Robert Gambrel [aut, cre]


License: MIT + file LICENSE

Suggests knitr, rmarkdown, dplyr, ggplot2, maps, magrittr, testthat
