A method to download Department of Education College Scorecard data using the public API <https://collegescorecard.ed.gov/data/documentation/>. It is based on the 'dplyr' model of piped commands to select and filter data in a single chained function call. An API key from the U.S. Department of Education is required.
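As a rough illustration of the piped style described above, the sketch below chains sc_init(), sc_filter(), sc_select(), sc_year(), and sc_get(). The filter value and the selected variables are illustrative choices only. The example also assumes that an API key has already been stored in an environment variable named DATAGOV_API_KEY (check the package documentation for the exact name), and it skips the request when the package or the key is missing.

```r
# Sketch only: the request runs when rscorecard is installed and a key is set.
df <- NULL
have_pkg <- requireNamespace("rscorecard", quietly = TRUE) &&
  requireNamespace("magrittr", quietly = TRUE)
have_key <- nzchar(Sys.getenv("DATAGOV_API_KEY"))  # assumed variable name

if (have_pkg && have_key) {
  library(magrittr)    # provides the %>% pipe
  library(rscorecard)

  # Each step adds to the request; nothing is downloaded until sc_get().
  df <- sc_init() %>%
    sc_filter(region == 2) %>%             # illustrative filter
    sc_select(unitid, instnm, stabbr) %>%  # illustrative variables
    sc_year("latest") %>%
    sc_get()
}
```

If the call runs, df should hold one row per institution that matches the filter; otherwise it stays NULL.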
- sc_get (h/t @nguyentr17)
- utility_functions::dev_to_var
- The default for sc_year() is now 'latest' rather than 2013. With continued data updates, this makes more sense than keeping an old year. Existing scripts that relied on the default for data from 2013 will need to be updated.
- The year column will be a character column with latest as the value when the most recent data are chosen. The College Scorecard doesn't clearly note which data are the latest, so I have left the string. When building a panel dataset across multiple years, it will be best to use numeric year values for all years so that the resulting tibbles can be bound together cleanly.
- sc_select(): starts_with(), ends_with(), contains(), and matches() should now be available.
- sc_select_() and sc_filter_() allow users to select and filter variables using strings stored in environment variables.
- sc_zip() now takes zip codes that start with zero (h/t @nateaff), either as a string value or by restoring the leading zeros that R drops from numeric values.
- httr is used to make the call (rather than jsonlite directly) in order to improve parsing on bad lines.
- sc_get() has a debug option so that the API URL string can be returned when debugging a call.
- sysdata.rda in ./data-raw/make_dict_hash.R
- sc_get() now uses floor() instead of ceiling() so that it doesn't make an unnecessary API request/pull (h/t @jjchern).
- sc_filter() can now use subset object vectors.
- sc_filter() can now use vectors stored in objects.
- sc_filter() can now use the %in% operator.
- sc_dict() now searches all columns by default.
- Fixed sc_dict() bug that wouldn't allow for search by developer-friendly names.
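The leading-zero handling noted above for sc_zip() can be illustrated in base R. This is a minimal sketch, not the package's implementation: pad_zip() is a hypothetical helper introduced only for this example, and it assumes five-digit U.S. zip codes.

```r
# Hypothetical helper (not part of rscorecard): restore the leading zeros
# that R drops when a zip code is entered as a number (02138 becomes 2138).
pad_zip <- function(zip) {
  if (is.numeric(zip)) {
    # Assumes five-digit zip codes; pads on the left with zeros.
    sprintf("%05d", as.integer(zip))
  } else {
    # Strings already keep their leading zeros.
    as.character(zip)
  }
}

pad_zip(2138)     # "02138"
pad_zip("02138")  # "02138"
```

Passing zip codes as strings avoids the problem entirely, which is why both input types are handled.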