Import Professional Baseball Data from 'Retrosheet'

A collection of tools to import and structure the (currently) single-season event, game-log, roster, and schedule data available from Retrosheet. The event (a.k.a. play-by-play) files can be especially difficult to parse. This package parses those files and returns the requested data in a practical R structure for sabermetric or other analyses.

Import Retrosheet data as a structured R object

retrosheet is an R package that downloads the single-season event, game-log, roster, and schedule files from Retrosheet and parses them into structured R objects for further analysis.

Currently, the main functions are

  • getRetrosheet() - This workhorse function returns the full season data for the arguments supplied by the user.
  • getPartialGamelog() - An alternative to returning the full game-log files. This function lets the user choose the columns and the date. Column names are available in the global object gamelogFields.
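
The two main functions might be combined as in the sketch below. It is not taken from the package documentation: the wrapper name fetch_season is hypothetical, and the type strings "game" and "play", the positional argument order, and the glFields argument name are assumptions that should be checked against the reference manual for the installed version.

```r
# Hypothetical wrapper (not part of the package).
# Nothing is downloaded until the function is called.
fetch_season <- function(year, team, fields) {
  # Fail early with a clear message if the package is not installed.
  if (!requireNamespace("retrosheet", quietly = TRUE)) {
    stop("Package 'retrosheet' is required for this sketch.")
  }
  list(
    # Full game log for the season (assumed type string "game").
    games = retrosheet::getRetrosheet("game", year),
    # Play-by-play events for one team (assumed type string "play").
    plays = retrosheet::getRetrosheet("play", year, team),
    # Game log restricted to the requested columns (assumed argument name glFields).
    partial = retrosheet::getPartialGamelog(year, glFields = fields)
  )
}
```

A call would look like fetch_season(2012, "SFN", fields), where the year and team ID are illustrative values and fields is a character vector of column names chosen from gamelogFields.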

Also included are convenience functions

  • getFileNames() - for obtaining a list of all zip files currently available for use by this package
  • getTeamIDs() - for providing the team ID value to be used in the team argument of getRetrosheet()
  • getParkIDs() - for ballpark ID and name information
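
The convenience functions are mostly useful for finding valid argument values before calling getRetrosheet(). A minimal sketch, assuming getTeamIDs() takes a single year and that getParkIDs() and getFileNames() take no arguments; the helper name lookup_ids is hypothetical.

```r
# Hypothetical helper (not part of the package).
# Collects the lookup values in one list; runs only when called.
lookup_ids <- function(year) {
  if (!requireNamespace("retrosheet", quietly = TRUE)) {
    stop("Package 'retrosheet' is required for this sketch.")
  }
  list(
    # Team IDs for the season, for use in the team argument of getRetrosheet().
    teams = retrosheet::getTeamIDs(year),
    # Ballpark IDs and names (assumed to take no arguments).
    parks = retrosheet::getParkIDs(),
    # Zip files currently available to the package (assumed to take no arguments).
    files = retrosheet::getFileNames()
  )
}
```

For example, ids <- lookup_ids(2012) would return the team IDs in ids$teams, one of which can then be passed as the team argument of getRetrosheet().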

retrosheet version 1.0.1 is now available on CRAN and can be installed with install.packages("retrosheet").

The development version can be installed from the package's source repository.





1.1.4 by Colin Douglas, a month ago


Authors: Colin Douglas [aut, cre, cph], Richard Scriven [aut, cph]


GPL (>= 2) license

Imports xml2, stringi, httr, stringr, rvest

Suggests testthat, rmarkdown
