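Matches column names against a broad, configurable library of regular expressions to flag columns likely to contain personally identifiable information (PII). You can extend the defaults with extra_patterns (they are ORed in) or replace them entirely with a single override_regex.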

Usage

detect_sensitive_columns(x_names, extra_patterns = NULL, override_regex = NULL)

Arguments

x_names

Character vector of column names to check.

extra_patterns

Character vector of additional regexes to OR in. Examples: c("MRN", "NHS", "Aadhaar", "passport")

override_regex

Optional single regex string that fully replaces the defaults; it is matched case-insensitively. When supplied, extra_patterns is ignored.

Value

Character vector of names from x_names that matched.

Examples

detect_sensitive_columns(c("id","email","home_phone","zip","notes"))
#> [1] "id"         "email"      "home_phone" "zip"       
detect_sensitive_columns(names(mtcars), extra_patterns = c("^vin$", "passport"))
#> character(0)
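
The interaction between the default patterns, extra_patterns, and override_regex described above can be illustrated with a minimal, self-contained sketch. This is not the package's implementation: sketch_detect() and sketch_defaults are hypothetical names, and the four default patterns are a deliberately tiny stand-in for the real, broader library. It assumes the patterns are joined into one alternation and matched case-insensitively with base R's grepl().

```r
# Hypothetical stand-in for the package's default pattern library.
# The real defaults are broader and are not reproduced here.
sketch_defaults <- c("id", "email", "phone", "zip")

sketch_detect <- function(x_names, extra_patterns = NULL, override_regex = NULL) {
  if (!is.null(override_regex)) {
    # override_regex replaces the defaults; extra_patterns is ignored.
    rx <- override_regex
  } else {
    # Wrap each pattern in a non-capturing group so anchors such as ^vin$
    # stay local to their own alternative, then OR everything together.
    pats <- c(sketch_defaults, extra_patterns)
    rx <- paste0("(?:", pats, ")", collapse = "|")
  }
  # Return the names that match, ignoring case.
  x_names[grepl(rx, x_names, ignore.case = TRUE, perl = TRUE)]
}

res_default  <- sketch_detect(c("id", "email", "home_phone", "zip", "notes"))
res_extra    <- sketch_detect(c("patient_MRN", "notes"), extra_patterns = "MRN")
res_override <- sketch_detect(c("ID", "email"), override_regex = "^id$")
```

In this sketch, res_override keeps only "ID" because the override discards the default "email" pattern, which mirrors the documented rule that override_regex fully replaces the defaults.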