Interact with Google Sheets from R.
Access and manage Google spreadsheets from R with
googlesheets is inspired by gspread, a Google Spreadsheets Python API
The exuberant prose in this README is inspired by Tabletop.js: If you've ever wanted to get data in or out of a Google Spreadsheet from R without jumping through a thousand hoops, welcome home!
The released version is available on CRAN
Or you can get the development version from GitHub:
googlesheets is designed for use with the
%>% pipe operator and, to a lesser extent, the data-wrangling mentality of
dplyr. This README uses both, but the examples in the help files emphasize usage with plain vanilla R, if that's how you roll.
dplyr internally but does not require the user to do so. You can make the
%>% pipe operator available in your own work by loading
To play nicely with tab completion, we use consistent prefixes:
gs_= all functions in the package.
gs_ws_= all functions that operate on worksheets or tabs within a spreadsheet.
gd_= something to do with Google Drive, usually has a
gs_synonym, might one day migrate to a Drive client.
Here's how to get a copy of a Gapminder-based Sheet we publish for practicing and follow along. You'll be sent to the browser to authenticate yourself with Google at this point.
gs_gap() %>%gs_copy(to = "Gapminder")## or, if you don't use pipesgs_copy(gs_gap(), to = "Gapminder")
Register a Sheet (in this case, by title):
gap <- gs_title("Gapminder")#> Sheet successfully identified: "Gapminder"
Here's a registered
gap#> Spreadsheet title: Gapminder#> Spreadsheet author: gspreadr#> Date of googlesheets registration: 2017-05-06 19:37:34 GMT#> Date of last spreadsheet update: 2015-03-23 20:34:08 GMT#> visibility: private#> permissions: rw#> version: new#>#> Contains 5 worksheets:#> (Title): (Nominal worksheet extent as rows x columns)#> Africa: 625 x 6#> Americas: 301 x 6#> Asia: 397 x 6#> Europe: 361 x 6#> Oceania: 25 x 6#>#> Key: 1HT5B8SgkKqHdqHJmn5xiuaC04Ngb7dG9Tv94004vezA#> Browser URL:
Visit a registered
googlesheet in the browser:
gap %>% gs_browse()gap %>% gs_browse(ws = "Europe")
Read all the data in a worksheet:
africa <- gs_read(gap)#> Accessing worksheet titled 'Africa'.#> Parsed with column specification:#> cols(#> country = col_character(),#> continent = col_character(),#> year = col_integer(),#> lifeExp = col_double(),#> pop = col_integer(),#> gdpPercap = col_double()#> )glimpse(africa)#> Observations: 624#> Variables: 6#> $ country <chr> "Algeria", "Algeria", "Algeria", "Algeria", "Algeria...#> $ continent <chr> "Africa", "Africa", "Africa", "Africa", "Africa", "A...#> $ year <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992...#> $ lifeExp <dbl> 43.077, 45.685, 48.303, 51.407, 54.518, 58.014, 61.3...#> $ pop <int> 9279525, 10270856, 11000948, 12760499, 14760787, 171...#> $ gdpPercap <dbl> 2449.008, 3013.976, 2550.817, 3246.992, 4182.664, 49...africa#> # A tibble: 624 × 6#> country continent year lifeExp pop gdpPercap#> <chr> <chr> <int> <dbl> <int> <dbl>#> 1 Algeria Africa 1952 43.077 9279525 2449.008#> 2 Algeria Africa 1957 45.685 10270856 3013.976#> 3 Algeria Africa 1962 48.303 11000948 2550.817#> 4 Algeria Africa 1967 51.407 12760499 3246.992#> 5 Algeria Africa 1972 54.518 14760787 4182.664#> 6 Algeria Africa 1977 58.014 17152804 4910.417#> 7 Algeria Africa 1982 61.368 20033753 5745.160#> 8 Algeria Africa 1987 65.799 23254956 5681.359#> 9 Algeria Africa 1992 67.744 26298373 5023.217#> 10 Algeria Africa 1997 69.152 29072015 4797.295#> # ... with 614 more rows
Some of the many ways to target specific cells:
gap %>% gs_read(ws = 2, range = "A1:D8")gap %>% gs_read(ws = "Europe", range = cell_rows(1:4))gap %>% gs_read(ws = "Africa", range = cell_cols(1:4))
readr-style control of data ingest -- highly artificial example!
gap %>%gs_read(ws = "Oceania", col_names = paste0("Z", 1:6),na = c("1962", "1977"), col_types = "cccccc", skip = 1, n_max = 7)#> Accessing worksheet titled 'Oceania'.#> # A tibble: 7 × 6#> Z1 Z2 Z3 Z4 Z5 Z6#> <chr> <chr> <chr> <chr> <chr> <chr>#> 1 Australia Oceania 1952 69.12 8691212 10039.6#> 2 Australia Oceania 1957 70.33 9712569 10949.65#> 3 Australia Oceania <NA> 70.93 10794968 12217.23#> 4 Australia Oceania 1967 71.1 11872264 14526.12#> 5 Australia Oceania 1972 71.93 13177000 16788.63#> 6 Australia Oceania <NA> 73.49 14074100 18334.2#> 7 Australia Oceania 1982 74.74 15184200 19477.01
Create a new Sheet from an R object:
iris_ss <- gs_new("iris", input = head(iris, 3), trim = TRUE)#> Warning: At least one sheet matching "iris" already exists, so you may#> need to identify by key, not title, in future.#> Sheet "iris" created in Google Drive.#> Range affected by the update: "R1C1:R4C5"#> Worksheet "Sheet1" successfully updated with 20 new value(s).#> Accessing worksheet titled 'Sheet1'.#> Sheet successfully identified: "iris"#> Accessing worksheet titled 'Sheet1'.#> Worksheet "Sheet1" dimensions changed to 4 x 5.#> Worksheet dimensions: 4 x 5.
Edit some arbitrary cells and append a row:
iris_ss <- iris_ss %>%gs_edit_cells(input = c("what", "is", "a", "sepal", "anyway?"),anchor = "A2", byrow = TRUE)#> Range affected by the update: "R2C1:R2C5"#> Worksheet "Sheet1" successfully updated with 5 new value(s).iris_ss <- iris_ss %>%gs_add_row(input = c("sepals", "support", "the", "petals", "!!"))#> Row successfully appended.
Look at what we have wrought:
iris_ss %>%gs_read()#> Accessing worksheet titled 'Sheet1'.#> Parsed with column specification:#> cols(#> Sepal.Length = col_character(),#> Sepal.Width = col_character(),#> Petal.Length = col_character(),#> Petal.Width = col_character(),#> Species = col_character()#> )#> # A tibble: 4 × 5#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species#> <chr> <chr> <chr> <chr> <chr>#> 1 what is a sepal anyway?#> 2 4.9 3 1.4 0.2 setosa#> 3 4.7 3.2 1.3 0.2 setosa#> 4 sepals support the petals !!
Download this precious thing as csv (other formats are possible):
iris_ss %>%gs_download(to = "iris-ish-stuff.csv", overwrite = TRUE)#> Sheet successfully downloaded:#> /Users/jenny/rrr/googlesheets/iris-ish-stuff.csv
Download this precious thing as an Excel workbook (other formats are possible):
iris_ss %>%gs_download(to = "iris-ish-stuff.xlsx", overwrite = TRUE)#> Sheet successfully downloaded:#> /Users/jenny/rrr/googlesheets/iris-ish-stuff.xlsx
Upload a Excel workbook into a new Sheet:
gap_xlsx <- gs_upload(system.file("mini-gap", "mini-gap.xlsx",package = "googlesheets"))#> File uploaded to Google Drive:#> /Users/jenny/resources/R/library/googlesheets/mini-gap/mini-gap.xlsx#> As the Google Sheet named:#> mini-gap
Clean up our mess locally and on Google Drive:
gs_vecdel(c("iris", "Gapminder"))file.remove(c("iris-ish-stuff.csv", "iris-ish-stuff.xlsx"))
Remember, the vignette shows a lot more usage.
|gs_title()||Register a Sheet by title|
|gs_key()||Register a Sheet by key|
|gs_url()||Register a Sheet by URL|
|gs_browse()||Visit a registered
|gs_read()||Read data and let
|gs_read_csv()||Read explicitly via the fast exportcsv link|
|gs_read_listfeed()||Read explicitly via the list feed|
|gs_read_cellfeed()||Read explicitly via the cell feed|
|gs_reshape_cellfeed()||Reshape cell feed data into a 2D thing|
|gs_simplify_cellfeed()||Simplify cell feed data into a 1D thing|
|gs_edit_cells()||Edit specific cells|
|gs_add_row()||Append a row to pre-existing data table|
|gs_new()||Create a new Sheet and optionally populate|
|gs_copy()||Copy a Sheet into a new Sheet|
|gs_rename()||Rename an existing Sheet|
|gs_ws_ls()||List the worksheets in a Sheet|
|gs_ws_new()||Create a new worksheet and optionally populate|
|gs_ws_rename()||Rename a worksheet|
|gs_ws_delete()||Delete a worksheet|
|gs_delete()||Delete a Sheet|
|gs_grepdel()||Delete Sheets with matching titles|
|gs_vecdel()||Delete the named Sheets|
|gs_upload()||Upload local file into a new Sheet|
|gs_download()||Download a Sheet into a local file|
|gs_auth()||Authorize the package|
|gs_deauth()||De-authorize the package|
|gs_user()||Get info about current user and auth status|
|gs_webapp_auth_url()||Facilitates auth by user of a Shiny app|
|gs_webapp_get_token()||Facilitates auth by user of a Shiny app|
|gs_gap()||Registers a public Gapminder-based Sheet (for practicing)|
|gs_gap_key()||Key of the Gapminder practice Sheet|
|gs_gap_url()||Browser URL for the Gapminder practice Sheet|
googlesheets as a read/write CMS that you (or your less R-obsessed friends) can edit through Google Docs, as well via R. It's like Christmas up in here.
Use a Google Form to conduct a survey, which populates a Google Sheet.
googleformrpackage provides an R API for Google Forms, allowing useRs to POST data securely to Google Forms without authentication. On CRAN and GitHub (README has lots of info and links to blog posts).
Gather data while you're in the field in a Google Sheet, maybe with an iPhone or an Android device. Take advantage of data validation to limit the crazy on the way in. You do not have to be online to edit a Google Sheet! Work offline via the Chrome browser, the Sheets app for Android, or the Sheets app for iOS.
There are various ways to harvest web data directly into a Google Sheet. For example:
IMPORTXML(), IMPORTHTML(), IMPORTFEED(): Google Sheets offer functions to populate Sheets based on web data.
googlesheets to get all that data into R.
Use it in a Shiny app! Several example apps come with the package.
What other ideas do you have?
The motivation for this release is for compatibility with the about-to-be released purrr 0.2.2.1.
purrr::dmap()and friends, which have been removed from purrr.
gs_upload()now has an
overwriteargument. (#285 @omgjens)
XMLis no longer a dependency.
Internal Server Error (HTTP 500). On or around 2016-03-11, there was a huge increase in the frequency of this error on Google Drive API calls.
GETcalls in the package are automatically retried up to 5 times, with exponential backoff, for statuses 500 and higher.
gd_refer to Google Drive and might eventually migrate into a separate Google Drive package. Generally there is a synonym with the
gd_token()is a new function to expose information about the current Google token. Some of this was migrated out of
gd_token(). New information includes scopes and cache path.
gd_user()now returns an S3 object of class
drive_user, but it's really just a list with a nice print method. It exposes information about the current Google user. New information includes user's Drive
NEWS.mdfile to track changes to the package.
httr v1.1.0: to become compatible with this version, we now require it.
plyris no longer required (#204)
purrris a new dependency
gs_browse()is a new function to visit a Google Sheet in the browser.
gs_rename()is a new function to rename an existing Sheet (#145).
gs_add_row()now works for two-dimensional
input, by calling itself once per row of
gs_read_listfeed()now supports parameters to manipulate data in the API call itself:
reverseinverts row order,
orderbyselects a column to sort on,
sqaccepts a structured query to filter rows. (#17)
gs_read_listfeed()doesn't return API-mangled column names anymore. They should now be the same as those from the other read functions and what you see in the browser.
readr-style data ingest: We explicitly try to match the interface of
readr::read_csv(). The read functions
gs_read_listfeed()and the reshaper
gs_reshape_cellfeed()should all return the same data frame when operating on the same worksheet. And this should match what
readr::read_csv()would return on a
.csvfile exported from that worksheet. The type conversion arguments for
gs_simplify_cellfeed()have also changed accordingly.
headerargument is no longer accepted. Use
gs_read*or reshape/simplify functions. Specify
trim_ws, etc. here.
readrexception #1: variables that consist entirely of missing values will be
NAof the logical type, not
googlesheetswill never return a data frame with
NAas a variable name. Instead, it will create a dummy variable name, like
readrexception #3: All read/reshape functions accept
check.names, in the spirit of
utils::read.table(), which defaults to
TRUE, variable names will be run through
make.names(..., unique = TRUE). (#208)
gs_read_cellfeed()now returns all possible definitions of cell contents:
value: The variable previously known as
cell_text. What you see in the browser and what Sheets API returns by default.
input_value: What you would have typed into the cell. Will give unevaluated formulas. (#18, #19, #152)
numeric_value: An actual number, if such exists, unmangled by rounding or other numeric formatting. (#152, #178)
literal = FALSEavailable in reading/reshaping functions that call the cell feed. Tries to be clever about using different definitions of cell contents.
gs_deauth()is a newly exported function that allows you to suspend the current token and, optionally, disable the
.httr-oauthtoken cache file by renaming it to