Generate fake data with privacy controls
Source:R/generate_fake_with_privacy.R
generate_fake_with_privacy.Rd
Generates a synthetic copy of data
, then optionally detects/handles
sensitive columns by name. Detection uses the ORIGINAL column names and
maps to output via attr(fake, "name_map")
if present.
Arguments
- data
A data.frame (or coercible) to mirror.
- n
Rows to generate (default same as input if NULL).
- level
One of "low","medium","high".
- seed
Optional RNG seed.
- sensitive
Character vector of original column names to treat as sensitive.
- sensitive_detect
Logical; auto-detect common sensitive columns by name.
- sensitive_strategy
One of "fake" or "drop".
- normalize
Logical; lightly normalize inputs.
- sensitive_patterns
Optional named list of patterns to treat as sensitive (e.g., list(id = "...", email = "...", phone = "...")). Overrides defaults.
- sensitive_regex
Optional fully-combined regex (single string) to detect sensitive columns by name. If supplied, it is used instead of defaults.