Detect Data Containing Personally Identifiable Information

Allows users to quickly and easily detect data containing Personally Identifiable Information (PII) through convenience functions.


detector

Project Status: Active - The project has reached a stable, usable state and is being actively developed.

detector makes detecting data containing Personally Identifiable Information (PII) quick, easy, and scalable. It provides high-level functions that can take vectors and data.frames and return important summary statistics in a convenient data.frame. Once complete, detector will be able to detect the following types of PII:

  • Full name
  • Home address
  • E-mail address
  • National identification number
  • Passport number
  • Social Security number
  • IP address
  • Vehicle registration plate number
  • Driver's license number
  • Credit card number
  • Date of birth
  • Birthplace
  • Telephone number
  • Latitude and longitude

State of the Union

  • E-mail address
  • Telephone number
  • National identification number

Needs more work...

  • Credit card number
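
One common way to cut down false positives when looking for credit card numbers is the Luhn checksum. The sketch below is only an illustration in base R, not detector's implementation; the function name `passes_luhn` is hypothetical.

```r
# Hypothetical helper: TRUE when the digits in a single string pass the Luhn check.
passes_luhn <- function(x) {
  # Keep digits only, then split into single integers
  digits <- as.integer(strsplit(gsub("[^0-9]", "", x), "")[[1]])
  if (length(digits) < 2) return(FALSE)
  digits <- rev(digits)
  # Double every second digit, counting from the right
  idx <- seq_along(digits) %% 2 == 0
  doubled <- digits[idx] * 2
  # A doubled value above 9 is replaced by the sum of its digits
  doubled[doubled > 9] <- doubled[doubled > 9] - 9
  digits[idx] <- doubled
  sum(digits) %% 10 == 0
}

valid_example <- passes_luhn("4111 1111 1111 1111")    # well-known test number
invalid_example <- passes_luhn("4111 1111 1111 1112")  # last digit altered
```

A checksum like this only says that a digit string is plausible as a card number; it would still need to be combined with length and prefix checks.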

Haven't even started :(

  • Full name
  • Date of birth
  • Home address
  • IP address
  • Vehicle registration plate number
  • Driver's license number
  • Birthplace
  • Latitude and longitude

Installation

You can install:

  • the latest released version from CRAN with

    install.packages("detector")

  • the latest development version from GitHub with

    # Install devtools first if it is missing or older than 1.6
    if (!requireNamespace("devtools", quietly = TRUE) ||
        packageVersion("devtools") < "1.6") {
      install.packages("devtools")
    }
    devtools::install_github("paulhendricks/detector")

If you encounter a clear bug, please file a minimal reproducible example on GitHub.

API

Generate data containing fake PII

library(dplyr, warn.conflicts = FALSE)
library(generator)
n <- 6
ashley_madison <- 
  data.frame(name = r_full_names(n), 
             email = r_email_addresses(n), 
             phone_number = r_phone_numbers(n, use_hyphens = TRUE, 
                                            use_spaces = TRUE), 
             stringsAsFactors = FALSE)
ashley_madison %>% 
  knitr::kable(format = "markdown")

| name                 | email             | phone_number   |
|:---------------------|:------------------|:---------------|
| Leonardo Rodriguez   | [email protected] | 254- 851- 6814 |
| Dee Rice             | [email protected] | 597- 978- 5193 |
| Conception Marquardt | [email protected] | 184- 962- 8153 |
| Collette Nitzsche    | [email protected] | 475- 723- 2947 |
| Norman Pfannerstill  | [email protected] | 153- 674- 4219 |
| Katelin Gislason     | [email protected] | 831- 847- 1568 |

Detect data containing PII

library(detector)
ashley_madison %>% 
  detect %>% 
  knitr::kable(format = "markdown")
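
| column_name  | has_email_addresses | has_phone_numbers | has_national_identification_numbers |
|:-------------|:--------------------|:------------------|:------------------------------------|
| name         | FALSE               | FALSE             | FALSE                               |
| email        | TRUE                | FALSE             | FALSE                               |
| phone_number | FALSE               | TRUE              | FALSE                               |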
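
To give a feel for how a column-level summary like this can be produced, here is a minimal base R sketch. It is not detector's implementation: the regular expressions are deliberately simplified assumptions, and `has_match` and `sketch_detect` are hypothetical names used only for illustration.

```r
# Simplified, illustrative patterns (assumptions, not detector's own regexes)
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"
phone_pattern <- "[0-9]{3}[- .]*[0-9]{3}[- .]*[0-9]{4}"

# TRUE if any element of x matches the pattern (missing values never match)
has_match <- function(x, pattern) {
  any(grepl(pattern, as.character(x)))
}

# One row per column of df, one logical flag per PII type
sketch_detect <- function(df) {
  data.frame(
    column_name = names(df),
    has_email_addresses = unname(vapply(df, has_match, logical(1),
                                        pattern = email_pattern)),
    has_phone_numbers = unname(vapply(df, has_match, logical(1),
                                      pattern = phone_pattern)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up toy data, formatted like the generated example above
toy <- data.frame(
  name = c("Ada Example", "Bob Sample"),
  email = c("ada@example.com", "bob@example.org"),
  phone_number = c("555- 010- 1234", "555- 010- 5678"),
  stringsAsFactors = FALSE
)
result <- sketch_detect(toy)
print(result)
```

Flagging a column as soon as one value matches keeps the summary simple, at the cost of treating a column with a single stray match the same as a column full of PII.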


0.1.0 by Paul Hendricks, 4 years ago


https://github.com/paulhendricks/detector


Report a bug at https://github.com/paulhendricks/detector/issues


Browse source code at https://github.com/cran/detector


Authors: Paul Hendricks [aut, cre]




MIT + file LICENSE license


Imports stringr

Suggests testthat, generator

