Allows users to quickly and easily detect data containing Personally Identifiable Information (PII) through convenience functions.
detector makes detecting data containing Personally Identifiable Information (PII) quick, easy, and scalable. It provides high-level functions that take vectors and data.frames and return summary statistics in a convenient data.frame. Once complete, detector will be able to detect several types of PII, including email addresses, phone numbers, and national identification numbers.
You can install:

- the latest released version from CRAN with

  ```r
  install.packages("detector")
  ```

- the latest development version from GitHub with

  ```r
  if (packageVersion("devtools") < 1.6) {
    install.packages("devtools")
  }
  devtools::install_github("paulhendricks/detector")
  ```

If you encounter a clear bug, please file a minimal reproducible example on GitHub.

```r
library(dplyr, warn.conflicts = FALSE)
library(generator)

n <- 6
ashley_madison <-
  data.frame(name = r_full_names(n),
             email = r_email_addresses(n),
             phone_number = r_phone_numbers(n, use_hyphens = TRUE, use_spaces = TRUE),
             stringsAsFactors = FALSE)
ashley_madison %>%
  knitr::kable(format = "markdown")
```

name | email | phone_number |
---|---|---|
Leonardo Rodriguez | [email protected] | 254- 851- 6814 |
Dee Rice | [email protected] | 597- 978- 5193 |
Conception Marquardt | [email protected] | 184- 962- 8153 |
Collette Nitzsche | [email protected] | 475- 723- 2947 |
Norman Pfannerstill | [email protected] | 153- 674- 4219 |
Katelin Gislason | [email protected] | 831- 847- 1568 |

```r
library(detector)

ashley_madison %>%
  detect %>%
  knitr::kable(format = "markdown")
```

column_name | has_email_addresses | has_phone_numbers | has_national_identification_numbers |
---|---|---|---|
name | FALSE | FALSE | FALSE |
email | TRUE | FALSE | FALSE |
phone_number | FALSE | TRUE | FALSE |
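
To make the shape of that summary concrete, here is a minimal base R sketch of column-wise detection. It is not detector's implementation: the helper names `has_match` and `sketch_detect` are hypothetical, and the two regular expressions are deliberately simplified assumptions that will miss many real-world formats.

```r
# Hypothetical, simplified patterns (assumptions, not the ones detector uses).
email_pattern <- "[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}"
phone_pattern <- "[0-9]{3}[- ]*[0-9]{3}[- ]*[0-9]{4}"

# TRUE if at least one element of x matches the pattern.
has_match <- function(x, pattern) {
  any(grepl(pattern, as.character(x)))
}

# Hypothetical helper: one row per column, one logical flag per pattern.
sketch_detect <- function(df) {
  data.frame(
    column_name = names(df),
    has_email_addresses = vapply(df, has_match, logical(1),
                                 pattern = email_pattern, USE.NAMES = FALSE),
    has_phone_numbers = vapply(df, has_match, logical(1),
                               pattern = phone_pattern, USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Made-up toy data with the same column layout as the example above.
toy <- data.frame(
  name = c("Ann Lee", "Bob Ray"),
  email = c("ann@example.com", "bob@example.org"),
  phone_number = c("254- 851- 6814", "597- 978- 5193"),
  stringsAsFactors = FALSE
)

result <- sketch_detect(toy)
print(result)

# Names of the columns flagged by either pattern.
pii_columns <- result$column_name[result$has_email_addresses | result$has_phone_numbers]
print(pii_columns)
```

Returning one row per input column keeps the result easy to filter, which is convenient when deciding which columns to drop or mask before sharing a data set.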