Check the Properties of Common R Objects

Expressive, assertive, pipe-friendly functions to check the properties of common R objects. In the case of failure the functions issue informative error messages.



checkr is a light-weight R package of expressive, assertive, pipe-friendly functions to check the properties of common R objects.

The functions are designed to be used in scripts and packages. When a check fails, they issue informative error messages.

For an overview of the functions see the checkr-naming vignette and for a comparison with similar packages see the assertive-programming vignette.

Demonstration

The following code demonstrates the check_data() function:

library(checkr)
 
# the starwars data frame in the dplyr package fails many of these checks
check_data(dplyr::starwars, values = list(
  height = c(66L, 264L),
  name = "",
  mass = c(20, 1358, NA),
  hair_color = c("blond", "brown", "black", NA),
  gender = c("male", "female", "hermaphrodite", "none", NA)),
  order = TRUE, nrow = c(81, 84), key = "hair_color", error = FALSE)
#> Warning: dplyr::starwars column names must include 'height', 'name',
#> 'mass', 'hair_color' and 'gender' in that order
#> Warning: column height of dplyr::starwars must not include missing values
#> Warning: the values in column mass of dplyr::starwars must lie between 20
#> and 1358
#> Warning: column hair_color of dplyr::starwars can only include values
#> 'black', 'blond' or 'brown'
#> Warning: dplyr::starwars must not have more than 84 rows
#> Warning: column 'hair_color' in dplyr::starwars must be a unique key

Syntax

checkr checks the values of one object against a second object, which gives an elegant and expressive syntax.

Class

To check the class simply pass an object of the desired class.

y <- c(2, 1, 0, 1, NA)
check_vector(y, values = numeric(0))
check_vector(y, values = integer(0))
#> Error: y must be class integer

Missing Values

To check that a vector does not include missing values pass a single non-missing value (of the correct class).

check_vector(y, 1)
#> Error: y must not include missing values

To allow missing values, include a missing value.

check_vector(y, c(1, NA))

To check that it includes only missing values, pass only a missing value (of the correct class).

check_vector(y, NA_real_)
#> Error: y must only include missing values

Range

To check the range of a vector pass two non-missing values (as well as the missing value if required).

check_vector(y, c(0, 2, NA))
check_vector(y, c(-1, -10, NA))
#> Error: the values in y must lie between -10 and -1

Specific Values

To check that the vector only includes specific values, pass three or more non-missing values or set only = TRUE.

check_vector(y, c(0, 1, 2, NA))
check_vector(y, c(1, 1, 2, NA))
#> Error: y can only include values 1 or 2
check_vector(y, c(1, 2, NA), only = TRUE)
#> Error: y can only include values 1 or 2
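
The rules from the Class, Missing Values, Range and Specific Values sections can be summarised in a few lines of base R. The following is only a sketch under stated assumptions: sketch_check_values() is a hypothetical helper written for illustration, it is not part of checkr and is not how checkr is implemented, it handles atomic vectors only, and its messages always refer to x rather than the name of the object.

```r
# Hypothetical helper for illustration only; not part of checkr.
sketch_check_values <- function(x, values, only = FALSE) {
  # Class: x must have the same class as values.
  if (!identical(class(x), class(values))) {
    stop("x must be class ", class(values)[1], call. = FALSE)
  }
  # A zero-length values checks the class and nothing else.
  if (length(values) == 0L) return(invisible(x))

  # Missing values are permitted only if values includes one.
  if (!anyNA(values) && anyNA(x)) {
    stop("x must not include missing values", call. = FALSE)
  }

  permitted <- values[!is.na(values)]
  observed <- x[!is.na(x)]

  # Only missing values supplied: x must be entirely missing.
  if (length(permitted) == 0L) {
    if (length(observed) > 0L) {
      stop("x must only include missing values", call. = FALSE)
    }
    return(invisible(x))
  }

  if (only || length(permitted) >= 3L) {
    # Specific values: every non-missing element must be one of them.
    permitted <- sort(unique(permitted))
    if (!all(observed %in% permitted)) {
      stop("x can only include values ",
           paste(permitted, collapse = ", "), call. = FALSE)
    }
  } else if (length(permitted) == 2L) {
    # Range: every non-missing element must lie between the two values.
    bounds <- range(permitted)
    if (any(observed < bounds[1] | observed > bounds[2])) {
      stop("the values in x must lie between ", bounds[1], " and ",
           bounds[2], call. = FALSE)
    }
  }
  invisible(x)
}

y <- c(2, 1, 0, 1, NA)
sketch_check_values(y, c(0, 2, NA))     # passes silently
sketch_check_values(y, c(0, 1, 2, NA))  # passes silently
```

The number of non-missing elements in values selects the check: none means x must be entirely missing, one checks only the class and missingness, two define a range, and three or more (or only = TRUE) define a set of permitted values.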

Naming Objects

By default, the name of an object is determined from the function call.

check_vector(list(x = 1))
#> Error: list(x = 1) must be an atomic vector

This simplifies things but results in less informative error messages when used in a pipe.

library(magrittr)
y %>% check_list()
#> Error: . must be a list

The argument x_name can be used to override the name.

y %>% check_list(x_name = "y")
#> Error: y must be a list
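
The default name comes from the expression supplied by the caller. A minimal base-R sketch of the idea follows, assuming a hypothetical helper name_of(); it illustrates the mechanism and is not checkr's own code.

```r
# Hypothetical helper for illustration only; not part of checkr.
name_of <- function(x, x_name = substitute(x)) {
  # substitute(x) captures the unevaluated expression passed as x;
  # a character x_name supplied by the caller is used as-is.
  if (!is.character(x_name)) x_name <- deparse(x_name)
  x_name
}

y <- c(2, 1, 0, 1, NA)
name_of(y)
#> [1] "y"
name_of(list(x = 1))
#> [1] "list(x = 1)"
name_of(y, x_name = "my vector")
#> [1] "my vector"
```

With magrittr's %>% the expression that reaches the function is the placeholder ., which is why the piped call above reports ". must be a list" unless x_name is supplied.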

Scalars

The four wrapper functions check_lgl(), check_int(), check_dbl() and check_str() check whether an object is an attribute-less non-missing scalar logical (flag), integer, double (number) or character (string). They are useful for checking the types of arguments in functions.

fun <- function(x) { check_lgl(x) }
fun(x = NA)
#> Error: x must not include missing values
fun(x = TRUE)
fun(x = 1)
#> Error: x must be class logical

Additional scalar wrappers are check_date() and check_dttm() for scalar Date and POSIXct objects. Alternatively, you can roll your own using the more general check_scalar() function.

Installation

To install the latest official release from CRAN:

install.packages("checkr")

To install the latest development version from GitHub:

install.packages("devtools")
devtools::install_github("poissonconsulting/err")
devtools::install_github("poissonconsulting/checkr")

To install the latest development version from the Poisson drat repository:

install.packages("drat")
drat::addRepo("poissonconsulting")
install.packages("checkr")

Citation


To cite checkr in publications use:

  Joe Thorley (2018). checkr: An R package for Assertive
  Programming. Journal of Open Source Software, 3(23), 624. URL
  https://doi.org/10.21105/joss.00624

A BibTeX entry for LaTeX users is

  @Article{,
    title = {checkr: {An} {R} package for {Assertive} {Programming}},
    author = {Joe Thorley},
    journal = {Journal of Open Source Software},
    year = {2018},
    volume = {3},
    number = {23},
    pages = {624},
    url = {http://joss.theoj.org/papers/10.21105/joss.00624},
  }

Contribution

Please report any issues.

Pull requests are always welcome.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Inspiration

datacheckr.

News

checkr 0.4.0

Major Changes

  • added err as dependency for message generation
  • check_data argument values now NULL by default (as opposed to missing)
  • lengths including nrows and ncols can now be checked by a vector of possible values
  • coerce = TRUE now also strips attributes for flag, int, dbl, string and logical, integer, double, character.

Exported

  • exported chk_deparse(), which deparses while dealing with NAs, for packages which extend checkr
  • exported chk_fail(), which issues conditional error or warning messages, for packages which extend checkr
  • exported chk_max_int(), chk_min_int(), chk_min_dbl(), chk_max_dbl() and chk_tiny_dbl() to get the integer and numeric ranges for the system

New Functions

  • added check_intersection() to check the intersection between two atomic vectors
  • added check_integer(), check_numeric(), check_double(), check_logical() and check_character()
  • added check_int() and check_dbl() both of which do coercion
  • added check_prob() to check a probability
  • added check_pos_dbl(), check_neg_dbl() and check_noneg_dbl()
  • added check_pos_int(), check_neg_int() and check_noneg_int()
  • added check_attributes() to check an object's attributes and check_no_attributes()
  • added check_lgl(), check_chr(), check_day(), check_dttm()
  • added check_grepl()

New Arguments

  • added attributes argument to check_vector() and check_scalar() which now only accept a flag for named
  • added complete = TRUE argument to check_names()
  • added exclusive = FALSE and order = FALSE to check_list()

Deprecated

  • deprecated unique = FALSE, length = NA and named = NA from check_list() as checked through values argument or with specific functions
  • deprecated check_regex() and check_pattern() (and added check_grepl()) and deprecated regex argument for pattern argument
  • deprecated check_flag_na()

checkr 0.3.0

  • redefined check_scalar (following previous deprecation)
  • added only = FALSE argument to check_vector() to check whether only the actual values are permitted.
  • added check_rbind() to check two data frames can be smoothly rbinded

checkr 0.2.0

  • deprecated check_tz() for check_tzone()
  • added check_unused() to check ... is unused within a function
  • added check_homogenous() to check object's elements are the same class
  • added check_flag_na() to check that an object is a scalar logical

checkr 0.1.0

  • added check_nchar() function
  • check_vector() and check_list() now allow named argument to be a regular expression or count range
  • added nchar = c(0L, .Machine$integer.max) and regex = ".*" arguments to check_named()
  • added check_regex() function
  • added all_y = TRUE argument to check_join() to check all rows in y in join
  • changed check_join() error message to
    "...join in x and y must include all the rows in x" as opposed to "...join in x and y violates referential integrity"
  • added check_number() to check that object is a scalar real
  • added assertive-programming vignette
  • vector lengths are now checked before values
  • lengths can now be specified using TRUE, FALSE or NA (#2)

checkr 0.0.2

  • added check_inherits() and check_classes() functions
  • check_named() now only checks unique when unique = TRUE
  • check_names() (and check_colnames()) can now check names are unique and also accept names = character(0) (and colnames = character(0))

checkr 0.0.1

  • first official release


0.4.0 by Joe Thorley, 5 months ago


https://github.com/poissonconsulting/checkr


Report a bug at https://github.com/poissonconsulting/checkr/issues


Browse source code at https://github.com/cran/checkr


Authors: Joe Thorley [aut, cre]




MIT + file LICENSE license


Imports err

Suggests assertthat, checkmate, covr, datasets, magrittr, dplyr, testthat, knitr, rmarkdown


Imported by flobr, mcmcr, rpdo, rtide, ssdtools, tinter, ypr.

