Provides functions to access historical and real-time national 'hydrometric' data from Water Survey of Canada data sources (<http://dd.weather.gc.ca/hydrometric/csv/> and <http://collaboration.cmc.ec.gc.ca/cmc/hydrometrics/www/>) and then applies tidy data principles.
This package is maintained by the Knowledge Management Branch of the British Columbia Ministry of Environment and Climate Change Strategy.
What does tidyhydat do?
* Provides functions (hy_*) that access hydrometric data from the HYDAT database, a national archive of Canadian hydrometric data, and return tidy data.
* Provides functions (realtime_*) that access Environment and Climate Change Canada’s real-time hydrometric data source.
* Provides functions (search_*) that can search through the approximately 7000 stations in the database and aid in generating station vectors.
* The hy_daily_flows() function, for example, queries the database, tidies the data and returns a tibble of daily flows.
You can install tidyhydat from CRAN:
install.packages("tidyhydat")
To install the development version of the tidyhydat package, you need to install the devtools package and then the tidyhydat package:
if(!requireNamespace("devtools")) install.packages("devtools")
devtools::install_github("ropensci/tidyhydat")
A more thorough vignette can be found on the tidyhydat CRAN page.
When you install tidyhydat, several other packages will be installed as well. One of those packages, dplyr, is useful for data manipulations and is used regularly here. To use dplyr, you must load it separately. A helpful dplyr tutorial can be found here.
library(tidyhydat)
library(dplyr)
To use many of the functions in the tidyhydat package you will need to download a version of the HYDAT database, Environment and Climate Change Canada’s database of historical hydrometric data, and then tell R where to find the database. Conveniently, tidyhydat does all this for you via:
download_hydat()
This downloads (with your permission) the most recent version of HYDAT and then saves it in a location on your computer where tidyhydat’s functions will look for it. Do be patient though as this takes a long time! To see where HYDAT was saved you can run hy_dir(). Now that you have HYDAT downloaded and ready to go, you are all set to begin looking at Canadian hydrometric data.
Most functions in tidyhydat follow a common argument structure. We will use the hy_daily_flows() function for the following examples, though the same approach applies to most functions in the package (see help(package = "tidyhydat") for a list of exported objects). Much of the functionality of tidyhydat originates with the choice of hydrometric stations that you are interested in. A user will often find themselves creating vectors of station numbers. There are several ways to do this.
The simplest case is if you would like to extract only one station. You can supply this directly to the station_number argument:
hy_daily_flows(station_number = "08LA001")
#> No start and end dates specified. All dates available will be returned.
#> All station successfully retrieved
#> # A tibble: 29,159 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr>
#>  1 08LA001        1914-01-01 Flow        144 <NA>
#>  2 08LA001        1914-01-02 Flow        144 <NA>
#>  3 08LA001        1914-01-03 Flow        144 <NA>
#>  4 08LA001        1914-01-04 Flow        140 <NA>
#>  5 08LA001        1914-01-05 Flow        140 <NA>
#>  6 08LA001        1914-01-06 Flow        136 <NA>
#>  7 08LA001        1914-01-07 Flow        136 <NA>
#>  8 08LA001        1914-01-08 Flow        140 <NA>
#>  9 08LA001        1914-01-09 Flow        140 <NA>
#> 10 08LA001        1914-01-10 Flow        140 <NA>
#> # ... with 29,149 more rows
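Because the result is an ordinary tidy table with one row per station and day, it can be subset and summarised with standard tools. The following is a minimal sketch rather than tidyhydat output: it rebuilds the ten rows printed above as a base R data frame by hand (the names flows, in_window and mean_flow are illustrative, not part of the package), then keeps a date window and computes a mean, so it runs without HYDAT installed.

```r
# Hand-built copy of the first ten rows printed above for station 08LA001.
# This is NOT produced by tidyhydat; it only mimics the column layout.
flows <- data.frame(
  STATION_NUMBER = "08LA001",
  Date = seq(as.Date("1914-01-01"), by = "day", length.out = 10),
  Parameter = "Flow",
  Value = c(144, 144, 144, 140, 140, 136, 136, 140, 140, 140),
  Symbol = NA_character_,
  stringsAsFactors = FALSE
)

# Keep only the rows inside a date window (inclusive on both ends).
in_window <- flows[flows$Date >= as.Date("1914-01-04") &
                     flows$Date <= as.Date("1914-01-06"), ]

# Mean daily flow over the ten days.
mean_flow <- mean(flows$Value)

print(in_window)
print(mean_flow)
```

With real data, the same steps should apply unchanged to the tibble returned by hy_daily_flows().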
Another method is to use hy_stations() to generate your vector, which is then given to the station_number argument. For example, we could take a subset of only those active stations within Prince Edward Island (Province code: PE) and then create a vector which is passed to the multi-parameter function hy_daily(). This function queries the flow, level, sediment load and suspended sediment concentration tables and combines them (if present) into one dataframe:
PEI_stns <- hy_stations() %>%
  filter(HYD_STATUS == "ACTIVE") %>%
  filter(PROV_TERR_STATE_LOC == "PE") %>%
  pull_station_number()
#> All station successfully retrieved
PEI_stns
#> [1] "01CA003" "01CB002" "01CB004" "01CC002" "01CC005" "01CC010" "01CD005"
hy_daily(station_number = PEI_stns)
#> # A tibble: 123,225 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr>
#>  1 01CA003        1961-08-01 Flow         NA <NA>
#>  2 01CA003        1961-08-02 Flow         NA <NA>
#>  3 01CA003        1961-08-03 Flow         NA <NA>
#>  4 01CA003        1961-08-04 Flow         NA <NA>
#>  5 01CA003        1961-08-05 Flow         NA <NA>
#>  6 01CA003        1961-08-06 Flow         NA <NA>
#>  7 01CA003        1961-08-07 Flow         NA <NA>
#>  8 01CA003        1961-08-08 Flow         NA <NA>
#>  9 01CA003        1961-08-09 Flow         NA <NA>
#> 10 01CA003        1961-08-10 Flow         NA <NA>
#> # ... with 123,215 more rows
We can also merge station selection and data extraction into a single pipe. For example, if for some reason we wanted daily flows for all the stations in Canada that had “Canada” in their name, we could write:
search_stn_name("canada") %>%
  pull_station_number() %>%
  hy_daily_flows()
#> No start and end dates specified. All dates available will be returned.
#> The following station(s) were not retrieved: 07DB006
#> Check station number typos or if it is a valid station in the network
#> # A tibble: 77,044 x 5
#>    STATION_NUMBER Date       Parameter Value Symbol
#>    <chr>          <date>     <chr>     <dbl> <chr>
#>  1 01AK001        1918-08-01 Flow      NA    <NA>
#>  2 01AK001        1918-08-02 Flow      NA    <NA>
#>  3 01AK001        1918-08-03 Flow      NA    <NA>
#>  4 01AK001        1918-08-04 Flow      NA    <NA>
#>  5 01AK001        1918-08-05 Flow      NA    <NA>
#>  6 01AK001        1918-08-06 Flow      NA    <NA>
#>  7 01AK001        1918-08-07 Flow       1.78 <NA>
#>  8 01AK001        1918-08-08 Flow       1.78 <NA>
#>  9 01AK001        1918-08-09 Flow       1.5  <NA>
#> 10 01AK001        1918-08-10 Flow       1.78 <NA>
#> # ... with 77,034 more rows
These examples illustrate a few ways that a vector can be generated and supplied to functions within tidyhydat.
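What the filter-and-pull pipelines above do can be sketched without a database. The example below is an illustration under stated assumptions: the stations table is a small mock of what hy_stations() returns, keeping only three of its columns, and the station 99ZZ999 and the status assigned to each row are invented for the example. It uses base R subsetting in place of dplyr's filter() and pull_station_number().

```r
# Mock station table. Rows and status values are invented for illustration;
# a real table would come from hy_stations().
stations <- data.frame(
  STATION_NUMBER = c("01CA003", "01CB002", "08LA001", "99ZZ999"),
  PROV_TERR_STATE_LOC = c("PE", "PE", "BC", "PE"),
  HYD_STATUS = c("ACTIVE", "ACTIVE", "ACTIVE", "DISCONTINUED"),
  stringsAsFactors = FALSE
)

# Logical mask: active stations located in Prince Edward Island.
keep <- stations$HYD_STATUS == "ACTIVE" &
  stations$PROV_TERR_STATE_LOC == "PE"

# Character vector of station numbers, the shape that the
# station_number argument expects.
pei_stns <- stations$STATION_NUMBER[keep]
print(pei_stns)
```

The point of the sketch is only that the end product is a plain character vector; how it is built (by hand, by filtering, or by searching) is up to the user.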
To download real-time data using the datamart we can use approximately the same conventions discussed above. Using realtime_dd() we can easily select specific stations by supplying a station of interest:
realtime_dd(station_number = "08LG006")
Another option is to provide simply the province as an argument and download all stations from that province:
realtime_dd(prov_terr_state_loc = "PE")
A simple plotting tool is also provided to quickly visualize realtime data:
realtime_plot("08LG006")
To report bugs/issues/feature requests, please file an issue.
These are very welcome!
If you would like to contribute to the package, please see our CONTRIBUTING guidelines.
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
Get citation information for tidyhydat in R by running:
citation("tidyhydat")
Copyright 2017 Province of British Columbia
Licensed under the Apache License, Version 2.0 (the “License”); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
* realtime_add_local_datetime() adds a local datetime column to realtime_dd() tibble (#64)
* pull_station_number() wraps pull(STATION_NUMBER) for convenience
* start_date and end_date actually work with said argument (#98)
* hy_annual_instant_peaks() now parses the date correctly into UTC and includes a datetime and time zone column. (#64)
* hy_stn_data_range() now returns actual NA's rather than string NA's (#97)
* download_hydat() now returns an informative error if the download fails due to proxy-related connection issues (@rywhale, #101).
* realtime_dd by eliminating loop (#91)
* hy_monthly_flows and hy_monthly_levels date issue (#24)
* tidyhydat:::station_choice and added more unit testing
* station_number = "ALL".
* %>%)
* hy_src() for advanced functionality (PR#77).
* hy_src() (PR#77)
* download_hydat() now uses httr::GET()
* download_hydat choice wasn't respected
* .onAttach() now checks 115 days after last HYDAT release to prevent slow package load times if HYDAT is longer than 3 months between RELEASES.
* hy_plot()
* realtime_plot() that prevented a lake level station from being called
* hy_daily() that threw an error when only a level station was called
* hy_daily() and realtime_plot()
* HYD_STATUS and REAL_TIME columns to allstations.
* hy_daily() function which combines all daily data into one dataframe.
* realtime_daily_mean function that quickly converts higher resolution data into daily means.
* download_hydat() that created a path that wasn't OS-independent.
* download_hydat() whereby sometimes R had trouble overwriting an existing version of the database. Now the old database is simply deleted before the new one is downloaded.
* hy_annual_instant_peaks() now returns a date object with HOUR, MINUTE and TIME_ZONE returned as separate columns. (#10)
* hy_data_types. (#60)
* station_number to first argument to facilitate piped analysis (#54)
* search_stn_name and search_stn_number now query both realtime and historical data sources and have tests for a more complete list (#56)
* ws_token can successfully be called by ws_token().
* .onAttach() checks if HYDAT is downloaded on package load.
* rappdirs to imports and using to generate download path for download_hydat() (#44)
* rappdirs so that all the hy_* functions access hydat from rappdirs::user_data_dir() via hy_dir() (#44)
* FULL MONTH evaluate to a logical (#51)
* download_realtime_ws() with some documentation on actual limits. (3234c22)
* .onload (#47)
* SED_MONTHLY_LOADS (#51)
* output_symbol has been added as an argument so code can be produced if desired (#33)
* download_realtime_ws (#27)
* STN_* functions
* STN_DATUM_RELATED
* STN_DATA_RANGE bug (#26)
* styler package to format code to tidyverse style guide
* PROV_TERR_STATE_LOC to allstations
* search_number function
* MONTHLY functions
* on.exit() to internal code; a better way to disconnect

* Renamed real-time function as download_realtime and download_realtime2
* Added more unit tests
* Wrote vignette for package utilization
* Brought all data closer to a "tidy" state
* Added ability for STATIONS to retrieve ALL stations in the HYDAT database
* Standardize documentation; remove hydat_path default
* Better error handling for download_realtime
* Update documentation
* Adding param_id data, data-raw and documentation
* Dates filter to ANNUAL_STATISTICS and DLY_FLOWS; func and docs
* DLY_LEVELS function and docs
* download_ws and get_ws_token function and docs
* UPDATE README
* Fixed db connection problem; more clear documentation
* Better error handling; more complete realtime documentation
* Harmonized README with standardized arguments

* Added example analysis to README
* Added devex badge; license to all header; import whole readr package
* Able to take other provinces than BC now
* Update documentation; README

* Initial package commit
* Add license and include bcgov files in RBuildIgnore
* Two base working functions; package level R file and associated documentation
* Only importing functions used in the function
* Update README with example
* Added download_ functions
* Added ANNUAL_STATISTICS query/table and docs
* Updated docs and made DLY_FLOWS more rigorous