Check the Properties of Common R Objects

Expressive, assertive, pipe-friendly functions to check the properties of common R objects. In the case of failure the functions issue informative error messages.



checkr is a light-weight R package of expressive, assertive, pipe-friendly functions to check the properties of common R objects.

In the case of failure the functions, which are designed to be used in scripts and packages, issue informative error messages.

For an overview of the functions see the checkr-naming vignette and for a comparison with similar packages see the assertive-programming vignette.

Demonstration

The following code demonstrates the check_data() function.

library(checkr)
 
# the starwars data frame in the dplyr package fails many of these checks
check_data(dplyr::starwars, values = list(
  height = c(66L, 264L),
  name = "",
  mass = c(20, 1358, NA),
  hair_color = c("blond", "brown", "black", NA),
  gender = c("male", "female", "hermaphrodite", "none", NA)),
  order = TRUE, nrow = c(81, 84), key = "hair_color", error = FALSE)
# error = FALSE: failed checks are reported as warnings rather than errors
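The demonstration sets error = FALSE, and the NEWS entries for check_named() and chk_fail() indicate that failures are then reported as warnings rather than errors. The base-R sketch below illustrates that pattern under this assumption; report_failure() and check_positive() are hypothetical helpers written for illustration, not checkr functions.

```r
# Hypothetical helper (not part of checkr): stop on failure, or only warn.
report_failure <- function(msg, error = TRUE) {
  if (error) stop(msg, call. = FALSE)
  warning(msg, call. = FALSE)
  invisible(FALSE)
}

# Hypothetical check that can fail in two separate ways.
check_positive <- function(x, x_name = deparse(substitute(x)), error = TRUE) {
  if (any(x <= 0, na.rm = TRUE)) {
    report_failure(paste(x_name, "must be positive"), error)
  }
  if (anyNA(x)) {
    report_failure(paste(x_name, "must not include missing values"), error)
  }
  invisible(x)
}

z <- c(-1, NA, 3)

# With error = FALSE every failed check is reported, each as a warning.
msgs <- character(0)
withCallingHandlers(
  check_positive(z, error = FALSE),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
msgs

# With the default error = TRUE the first failure stops execution.
first_error <- tryCatch(check_positive(z), error = function(e) conditionMessage(e))
first_error
```

Reporting warnings instead of stopping is convenient when exploring a data set, because every problem is listed in one pass.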

Syntax

checkr uses objects to check the values of other objects: the class, missing values and range or set of values of the object passed to a check function define what the checked object may contain.

Class

To check the class simply pass an object of the desired class.

y <- c(2,1,0,1,NA)
check_vector(y, values = numeric(0))
check_vector(y, values = integer(0))
#> Error: y must be class integer

Missing Values

To check that a vector does not include missing values pass a single non-missing value (of the correct class).

check_vector(y, 1)
#> Error: y must not include missing values

To allow it to include missing values include a missing value.

check_vector(y, c(1, NA))

And to check that it includes only missing values, pass only a missing value (of the correct class).

check_vector(y, NA_real_)
#> Error: y must only include missing values

Range

To check the range of a vector pass two non-missing values (as well as the missing value if required).

check_vector(y, c(0, 2, NA))
check_vector(y, c(-1, -10, NA))
# fails: the values of y do not lie between -10 and -1

Specific Values

To check that the vector only includes specific values, pass three or more non-missing values or set only = TRUE.

check_vector(y, c(0, 1, 2, NA))
check_vector(y, c(1, 1, 2, NA))
# fails: y includes 0, which is not one of the permitted values 1 and 2
check_vector(y, c(1, 2, NA), only = TRUE)
# fails for the same reason
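Taken together, the Class, Missing Values, Range and Specific Values rules determine how a values object is read. The base-R sketch below restates those rules as a small function. describe_values() is a hypothetical helper written for illustration, not part of checkr, and it reflects only the behaviour described in this README.

```r
# Hypothetical helper (not part of checkr): summarise the constraint that a
# `values` object encodes, following the rules described in this README.
describe_values <- function(values, only = FALSE) {
  non_missing <- values[!is.na(values)]
  n <- length(non_missing)
  permitted <- if (length(values) == 0L) {
    "any value of this class"          # zero-length object: class check only
  } else if (n == 0L) {
    "missing values only"              # e.g. NA_real_
  } else if (only || n >= 3L) {
    paste("one of:", paste(sort(unique(non_missing)), collapse = ", "))
  } else if (n == 2L) {
    paste("between", min(non_missing), "and", max(non_missing))
  } else {
    "any value of this class"          # a single non-missing value
  }
  list(
    class = class(values),
    # missing values are allowed if values is empty or itself includes NA
    missing_allowed = length(values) == 0L || anyNA(values),
    permitted = permitted
  )
}

describe_values(integer(0))$class           # "integer"
describe_values(1)$missing_allowed          # FALSE
describe_values(c(-1, -10, NA))$permitted   # "between -10 and -1"
describe_values(c(1, 1, 2, NA))$permitted   # "one of: 1, 2"
```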

Naming Objects

By default, the name of an object is determined from the function call.

check_vector(list(x = 1))
#> Error: list(x = 1) must be an atomic vector

This simplifies things but results in less informative error messages when used in a pipe.

library(magrittr)
y %>% check_list()
#> Error: . must be a list

The argument x_name can be used to override the name.

y %>% check_list(x_name = "y")
#> Error: y must be a list
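The reason the name is lost in a pipe can be reproduced in base R. The sketch below assumes the name is captured with the common deparse(substitute(x)) idiom, which may differ from checkr's internals; check_is_list() and piped() are hypothetical helpers. magrittr evaluates the right-hand side with the piped value bound to ".", which piped() imitates.

```r
# Hypothetical check (not part of checkr): the default name is taken from the
# expression supplied by the caller.
check_is_list <- function(x, x_name = deparse(substitute(x))) {
  if (!is.list(x)) stop(x_name, " must be a list", call. = FALSE)
  invisible(x)
}

# Return the error message produced by an expression, or "ok" if none.
msg <- function(expr) tryCatch({ expr; "ok" }, error = conditionMessage)

y <- c(2, 1, 0, 1, NA)
msg(check_is_list(y))             # "y must be a list"

# Imitates a magrittr pipe: the value arrives in a variable called `.`.
piped <- function(.) check_is_list(.)
msg(piped(y))                     # ". must be a list"

# Supplying x_name explicitly restores the informative message.
piped_named <- function(.) check_is_list(., x_name = "y")
msg(piped_named(y))               # "y must be a list"
```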

Scalars

The four wrapper functions check_lgl(), check_int(), check_dbl() and check_str() check whether an object is an attribute-less non-missing scalar logical (flag), integer, double (number) or character (string). They are particularly useful for checking the types of arguments in functions.

fun <- function(x) { check_lgl(x)}
fun(x = NA)
#> Error: x must not include missing values
fun(x = TRUE)
fun(x = 1)
#> Error: x must be class logical

Additional scalar wrappers are check_date() and check_dttm() for scalar Date and POSIXct objects. Alternatively you can roll your own using the more general check_scalar() function.
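To make "attribute-less non-missing scalar logical" concrete, the conditions can be spelled out in base R. This is a minimal sketch of the definition given above, not checkr's implementation; is_flag() and mean_values() are hypothetical names.

```r
# Hypothetical predicate (not part of checkr): TRUE only for a single,
# non-missing logical value that carries no attributes such as names.
is_flag <- function(x) {
  is.logical(x) && length(x) == 1L && !is.na(x) && is.null(attributes(x))
}

# Example of validating an argument before using it.
mean_values <- function(x, na_rm = FALSE) {
  if (!is_flag(na_rm)) stop("na_rm must be a flag (TRUE or FALSE)", call. = FALSE)
  mean(x, na.rm = na_rm)
}

is_flag(TRUE)                              # TRUE
is_flag(NA)                                # FALSE: missing
is_flag(c(TRUE, FALSE))                    # FALSE: not a scalar
is_flag(c(a = TRUE))                       # FALSE: has a names attribute
is_flag(1)                                 # FALSE: not logical
mean_values(c(1, NA, 3), na_rm = TRUE)     # 2
```

The length test comes before the missing-value test so that vectors of other lengths never reach the scalar comparison.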

Installation

To install the latest official release from CRAN:

install.packages("checkr")

To install the latest development version from GitHub:

install.packages("devtools")
devtools::install_github("poissonconsulting/err")
devtools::install_github("poissonconsulting/checkr")

To install the latest development version from the Poisson drat repository:

install.packages("drat")
drat::addRepo("poissonconsulting")
install.packages("checkr")

Citation


To cite checkr in publications use:

  Joe Thorley (2018). checkr: An R package for Assertive
  Programming. Journal of Open Source Software, 3(23), 624. URL
  https://doi.org/10.21105/joss.00624

A BibTeX entry for LaTeX users is

  @Article{,
    title = {checkr: {An} {R} package for {Assertive} {Programming}},
    author = {Joe Thorley},
    journal = {Journal of Open Source Software},
    year = {2018},
    volume = {3},
    number = {23},
    pages = {624},
    url = {http://joss.theoj.org/papers/10.21105/joss.00624},
  }

Contribution

Please report any issues.

Pull requests are always welcome.

Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Inspiration

datacheckr

News

checkr 0.5.0

  • fixed check_key with no columns
  • added units testing
  • added names = TRUE and class = TRUE arguments to check_attributes(), check_no_attributes() and check_vector()
  • added na_distinct = FALSE argument to check_key
  • replaced internal deparse_x_name() with exported chk_deparse()
  • added check_name() to check if the elements of a character vector are each a syntactically valid name
  • check_named() now only gives 1 warning if error = FALSE and not named
  • removed check_tz()

checkr 0.4.0

Major Changes

  • added err as dependency for message generation
  • check_data argument values now NULL by default (as opposed to missing)
  • lengths including nrows and ncols can now be checked by a vector of possible values
  • coerce = TRUE now also strips attributes for flag, int, dbl, string and logical, integer, double, character.

Exported

  • exported chk_deparse() to deparse while handling NAs, for packages which extend checkr
  • exported chk_fail() to issue conditional error or warning messages, for packages which extend checkr
  • exported chk_max_int(), chk_min_int(), chk_min_dbl(), chk_max_dbl() and chk_tiny_dbl() to get the integer and numeric ranges for the system

New Functions

  • added check_intersection() to check the intersection between two atomic vectors
  • added check_integer(), check_numeric(), check_double(), check_logical() and check_character()
  • added check_int() and check_dbl() both of which do coercion
  • added check_prob() to check a probability
  • added check_pos_dbl(), check_neg_dbl() and check_noneg_dbl()
  • added check_pos_int(), check_neg_int() and check_noneg_int()
  • added check_attributes() to check an object's attributes and check_no_attributes()
  • added check_lgl(), check_chr(), check_day(), check_dttm()
  • added check_grepl()

New Arguments

  • added attributes argument to check_vector() and check_scalar() which now only accept a flag for named
  • added complete = TRUE argument to check_names()
  • added exclusive = FALSE and order = FALSE to check_list()

Deprecated

  • deprecated unique = FALSE, length = NA and named = NA from check_list() as checked through values argument or with specific functions
  • deprecated check_regex() and check_pattern() (and added check_grepl()) and deprecated regex argument for pattern argument
  • deprecated check_flag_na()

checkr 0.3.0

  • redefined check_scalar (following previous deprecation)
  • added only = FALSE argument to check_vector() to check whether only the actual values are permitted.
  • added check_rbind() to check two data frames can be smoothly rbinded

checkr 0.2.0

  • deprecated check_tz() for check_tzone()
  • added check_unused() to check ... is unused within a function
  • added check_homogenous() to check object's elements are the same class
  • added check_flag_na() to check that an object is a scalar logical

checkr 0.1.0

  • added check_nchar() function
  • check_vector() and check_list() now allow named argument to be a regular expression or count range
  • added nchar = c(0L, .Machine$integer.max) and regex = ".*" arguments to check_named()
  • added check_regex() function
  • added all_y = TRUE argument to check_join() to check all rows in y in join
  • changed check_join() error message to "...join in x and y must include all the rows in x" as opposed to "...join in x and y violates referential integrity"
  • added check_number() to check that object is a scalar real
  • added assertive-programming vignette
  • vector lengths are now checked before values
  • lengths can now be specified using TRUE, FALSE or NA (#2)

checkr 0.0.2

  • added check_inherits() and check_classes() functions
  • check_named() now only checks unique when unique = TRUE
  • check_names() (and check_colnames()) can now check names are unique and also accept names = character(0) (and colnames = character(0))

checkr 0.0.1

  • first official release


0.5.0 by Joe Thorley, a month ago


https://github.com/poissonconsulting/checkr


Report a bug at https://github.com/poissonconsulting/checkr/issues


Browse source code at https://github.com/cran/checkr


Authors: Joe Thorley [aut, cre]




MIT + file LICENSE license


Imports err

Suggests assertthat, checkmate, covr, datasets, magrittr, hms, dplyr, units, testthat, knitr, rmarkdown


Imported by flobr, mcmcr, rpdo, rtide, ssdtools, tinter, ypr.

