Basic Pattern Analysis

Run basic pattern analyses on character sets, digits, or combined input containing both characters and numeric digits. Useful for data cleaning and for identifying columns containing multiple or nonstandard formats.

Basic pattern analysis, as implemented in the R package bpa, is a data pre-processing tool designed to reduce the time spent on routine pre-processing tasks. It takes inspiration from some of the functionality of SAS/DataFlux Data Management Studio. More specifically, the functions in bpa help standardize the data so that formatting inconsistencies, typos, and other unexpected values are easier to identify in large or unfamiliar data sets. For more information and example usage, see the introductory vignette included with the package.
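To make the idea concrete, here is a minimal base R sketch of pattern standardization. It is not the bpa implementation: the helper name to_pattern and the symbols "9", "a", and "A" are assumptions chosen for illustration, and the package's own functions may behave differently.

```r
# Hypothetical helper (not part of bpa): replace every character with a
# symbol for its class so that values with the same shape share one pattern.
to_pattern <- function(x) {
  # perl = TRUE keeps the character ranges ASCII-based, independent of locale
  x <- gsub("[0-9]", "9", x, perl = TRUE)  # any digit            -> "9"
  x <- gsub("[a-z]", "a", x, perl = TRUE)  # any lowercase letter -> "a"
  x <- gsub("[A-Z]", "A", x, perl = TRUE)  # any uppercase letter -> "A"
  x
}

# A column that mixes several date formats
dates <- c("2015-01-12", "01/12/2015", "Jan 12, 2015", "2015-02-03")

# Tabulate the patterns: rare patterns point at nonstandard entries
patterns <- to_pattern(dates)
counts <- table(patterns)
print(counts)
```

Tabulating the patterns rather than the raw values is what makes the approach useful on large columns: thousands of distinct values usually collapse into a handful of shapes, and the infrequent shapes are the ones worth inspecting.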

The current stable release of the bpa package is available from CRAN and can be installed using install.packages:

# Install current stable release from CRAN
install.packages("bpa")

The development version is hosted on GitHub and can be installed using install_github from the devtools package:

# Assuming devtools is already installed

NEWS for bpa package

  • Initial release.
  • Added NEWS file.
  • Added introductory vignette.
  • Fixed trim_ws to remove dependency on R version 3.2.0.

