Tools to Conduct Meteorological Normalisation on Air Quality Data

An integrated set of tools that allows data users to conduct meteorological normalisation on air quality data. The technique uses predictive random forest models to remove the variation in pollutant concentrations that is caused by changes in weather, so trends and interventions can be explored in a robust way. For examples, see Grange et al. (2018) and Grange and Carslaw (2019).



Introduction

rmweather is an R package to conduct meteorological/weather normalisation on air quality data so trends and interventions can be investigated in a robust way. For those who are aware of my previous research, rmweather is the "Mk.II" successor to the normalweatherr package. rmweather does less than normalweatherr, but it is much faster and easier to use.
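The idea behind the technique can be illustrated without rmweather itself. The block below is a minimal sketch on synthetic data, assuming only that the ranger package is installed. It is not rmweather's internal code: the data, the variable choices, and the number of samples are all made up for illustration. A random forest is grown with a time variable and two weather variables, the weather variables are then shuffled across rows many times while time is held fixed, and the predictions are averaged.

```r
# Conceptual sketch of meteorological normalisation on synthetic data
# NOT rmweather's implementation; all names and values are illustrative
library(ranger)

set.seed(1)
n <- 730

# Synthetic explanatory variables
df <- data.frame(
  date_unix = seq_len(n),          # stand-in for the time/trend variable
  ws = runif(n, 0.5, 10),          # wind speed
  air_temp = rnorm(n, 12, 6)       # air temperature
)

# Synthetic pollutant: slow downward trend, diluted by wind, plus noise
df$value <- 60 - 0.02 * df$date_unix - 3 * df$ws + 0.3 * df$air_temp +
  rnorm(n, sd = 2)

# 1. Grow a random forest explaining concentrations with time and weather
model <- ranger(value ~ date_unix + ws + air_temp, data = df, num.trees = 200)

# 2. Predict repeatedly, each time shuffling the weather variables across
#    rows while leaving the time variable untouched
n_samples <- 50
predictions <- replicate(n_samples, {
  df_sampled <- df
  index <- sample(n)
  df_sampled[, c("ws", "air_temp")] <- df[index, c("ws", "air_temp")]
  predict(model, data = df_sampled)$predictions
})

# 3. Average the predictions for each time step
df$value_normalised <- rowMeans(predictions)
```

Because each time step is predicted under many different weather conditions, the averaged series varies far less than the raw observations and mostly reflects the underlying trend.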

Installation

rmweather is available from CRAN and can be installed in the normal way:

# Install rmweather from CRAN
install.packages("rmweather")

Development version

To install the development version of rmweather, the devtools package will need to be installed first. Then:

# Load helper package
library(devtools)

# Install rmweather
install_github("skgrange/rmweather")

Example usage

rmweather contains example data from London which can be used to show the meteorological normalisation procedure. The example data are daily means of NO2 and NOx observations at London Marylebone Road. The accompanying surface meteorological data are from London Heathrow, a major airport located 23 km west of Central London.

Most of rmweather's functions begin with rmw_ so they are easy to track and find help for. In this example, we have used dplyr and the pipe (%>%, pronounced "then") for clarity. The example takes a couple of minutes to run on my laptop and the model has an R² value of 77 %.

# Load packages
library(dplyr)
library(rmweather)
library(ranger)

# Have a look at rmweather's example data, from London
head(data_london)

# Prepare data for modelling
# Only use data with valid wind speeds, no2 will become the dependent variable
data_london_prepared <- data_london %>% 
  filter(!is.na(ws)) %>% 
  rename(value = no2) %>% 
  rmw_prepare_data(na.rm = TRUE)

# Grow/train a random forest model and then create a meteorological normalised trend 
list_normalised <- rmw_do_all(
  data_london_prepared,
  variables = c(
    "date_unix", "day_julian", "weekday", "air_temp", "rh", "wd", "ws",
    "atmospheric_pressure"
  ),
  n_trees = 300,
  n_samples = 300,
  verbose = TRUE
)

# What elements are in the list?
names(list_normalised)

# Check model object's performance
rmw_model_statistics(list_normalised$model)

# Plot variable importances
list_normalised$model %>% 
  rmw_model_importance() %>% 
  rmw_plot_importance()

# Check if model has suffered from overfitting
rmw_predict_the_test_set(
  model = list_normalised$model,
  df = list_normalised$observations
) %>% 
  rmw_plot_test_prediction()

# How long did the process take? 
list_normalised$elapsed_times

# Plot normalised trend
rmw_plot_normalised(list_normalised$normalised)

# Investigate partial dependencies, if variable is NA, predict all
data_pd <- rmw_partial_dependencies(
  model = list_normalised$model, 
  df = list_normalised$observations,
  variable = NA
)

# Plot partial dependencies
data_pd %>% 
  filter(variable != "date_unix") %>% 
  rmw_plot_partial_dependencies()

The rmw_plot_normalised call in the example above plots the meteorologically normalised trend.
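The overfitting check in the example compares predictions against observations that were held back from training. The general idea can be sketched on synthetic data as follows, assuming only that ranger is installed. The 80/20 split, the variable names, and the r_squared helper are choices made for this sketch and do not describe rmweather's internals.

```r
# Sketch of a held-out test-set check for a random forest model
# Synthetic data; the split and the helper function are illustrative only
library(ranger)

set.seed(42)
n <- 1000

df <- data.frame(
  ws = runif(n, 0.5, 10),
  air_temp = rnorm(n, 12, 6)
)
df$value <- 50 - 3 * df$ws + 0.3 * df$air_temp + rnorm(n, sd = 3)

# Hold back 20 % of the rows for testing
index_training <- sample(n, size = 0.8 * n)
df_training <- df[index_training, ]
df_testing <- df[-index_training, ]

# Train on the training set only
model <- ranger(value ~ ws + air_temp, data = df_training, num.trees = 200)

# Coefficient of determination from observed and predicted values
r_squared <- function(observed, predicted) {
  1 - sum((observed - predicted) ^ 2) / sum((observed - mean(observed)) ^ 2)
}

r2_training <- r_squared(
  df_training$value,
  predict(model, data = df_training)$predictions
)

r2_testing <- r_squared(
  df_testing$value,
  predict(model, data = df_testing)$predictions
)

# Print both values for comparison
c(training = r2_training, testing = r2_testing)
```

A random forest almost always scores higher on the data it was trained on. A test-set value that is far below the training value, or a scatter of predicted against observed test values that departs strongly from the 1:1 line, suggests overfitting.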

Examples and citations

For usage examples see:

Grange, S. K., Carslaw, D. C., Lewis, A. C., Boleti, E., and Hueglin, C. (2018). Random forest meteorological normalisation models for Swiss PM10 trend analysis. Atmospheric Chemistry and Physics 18.9, pp. 6223–6239.

Grange, S. K. and Carslaw, D. C. (2019). Using meteorological normalisation to detect interventions in air quality time series. Science of The Total Environment 653, pp. 578–588.

News

rmweather 0.1.3

  • Example data are now tibbles, not data frames

  • Add citation file with two publications

  • Enhance description file to contain new publication

rmweather 0.1.2

  • Add tolerance to an R^2 unit test for some flavours of Linux used on the CRAN networks

  • Enhance a number of functions to allow estimation of the uncertainty/errors of predictions

  • Convenient plotting functions now have colour arguments

  • Normalised predictions can now be returned without being aggregated

  • Add na.rm argument to data preparing function to avoid imputing of the dependent variable

  • Add replace argument to data preparing function so generated variables replace existing variables of the same name if they exist

  • Add variables_sample argument to rmw_do_all so the user can choose which variables are sampled in the normalisation step

rmweather 0.1.1

  • Resubmission after failure to pass CRAN's manual checks

    • Expanded the package's description and added a reference which uses the method to conduct an example analysis (https://doi.org/10.5194/acp-18-6223-2018)

    • Added two new data objects

    • Replaced \dontrun{} with \donttest{} for examples

rmweather 0.1.0

  • First CRAN submission


0.1.3 by Stuart K. Grange, a year ago


https://github.com/skgrange/rmweather


Report a bug at https://github.com/skgrange/rmweather/issues


Browse source code at https://github.com/cran/rmweather


Authors: Stuart K. Grange [cre, aut]




GPL-3 | file LICENSE license


Imports dplyr, ggplot2, lubridate, magrittr, pdp, purrr, ranger, stringr, strucchange, tibble, viridis

Suggests testthat, openair

