Anonymization
Before going into detail, here is an overview of the anonymization process.
To initialize, set up, and build a project (i.e. a group of data you would like to anonymize), follow these steps. See evl anon command for details about the ‘evl anon’ commands.
- Create a new project:
evl anon project new <project_dir>
See Project for details about projects.
- Add a source, i.e. a folder with files or a database with tables to be anonymized:
evl anon source new <source_name> \
  --guess-from-csv <path_to_folder_with_such_CSVs>
See Source Settings for details about settings for a source.
- Edit the config (CSV) file according to your preferences. (The Excel file checks validity immediately and provides drop-down options.)
- Check the config file for mistakes:
evl anon check <config_file>
- Generate the anonymization jobs and workflow:
evl anon build <config_file>
See Build and Run for details about job and workflow generation, and see Config File for details about the config file.
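The check and build steps above can be chained so that a build is attempted only when the check passes. The following is a minimal sketch, not part of the tool itself: the function name anon_check_and_build and the EVL override variable are hypothetical, and it assumes that ‘evl anon check’ exits with a non-zero status when it finds mistakes.

```shell
# Hypothetical helper: run the check step, then build only if the check succeeded.
# Assumption: 'evl anon check' returns a non-zero exit status on mistakes.
# EVL may be overridden (e.g. EVL=echo) to preview the commands without running them.
anon_check_and_build() {
    config_file=$1
    if [ -z "$config_file" ]; then
        echo "usage: anon_check_and_build <config_file>" >&2
        return 2
    fi
    # Stop here if the config file does not pass the check.
    "${EVL:-evl}" anon check "$config_file" || {
        echo "check failed, not building: $config_file" >&2
        return 1
    }
    "${EVL:-evl}" anon build "$config_file"
}
```

Calling it as anon_check_and_build <config_file> runs both steps in order; setting EVL=echo beforehand prints the arguments that would be passed instead of executing them.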
Then, to anonymize data (regularly), run the anonymization jobs:
evl run/anon/<table_1>.evl
evl run/anon/<file_1>.evl
...
Each job represents one file or table to be anonymized. See Build and Run for details.
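Running the jobs one by one can be wrapped in a small loop. This is a sketch under stated assumptions: it treats every *.evl file in run/anon/ as a generated job (as the commands above suggest), assumes that evl exits with a non-zero status when a job fails, and uses a hypothetical function name and EVL override variable.

```shell
# Hypothetical helper: run each generated job in a directory, one after another.
# Assumptions: jobs are the *.evl files in run/anon/, and 'evl <job>' returns
# a non-zero exit status on failure.
# EVL may be overridden (e.g. EVL=echo) to preview without running anything.
run_anon_jobs() {
    job_dir=${1:-run/anon}
    for job in "$job_dir"/*.evl; do
        [ -e "$job" ] || continue   # the glob matched no files
        echo "running $job" >&2
        "${EVL:-evl}" "$job" || {
            echo "job failed: $job" >&2
            return 1                # stop at the first failing job
        }
    done
}
```

Stopping at the first failure is a design choice of this sketch; it avoids continuing a batch whose earlier outputs may be incomplete.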
Note: Be careful when running anonymization jobs more than once. By default, data in the target are overwritten unless
export EVL_ANON_APPEND=1
is specified in a configs/anon/*.sh settings file or in project.sh.
See Environment variables for details about all EVL_ANON_* configuration variables.
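Before re-running jobs, it can help to confirm which mode is in effect. The sketch below only inspects the current shell environment, so it assumes the relevant settings file has already been sourced or the variable exported. It also assumes that the value 1 is what enables append mode; the function name is hypothetical.

```shell
# Hypothetical helper: report whether a re-run would append or overwrite.
# Assumption: only EVL_ANON_APPEND=1 enables append mode; anything else
# (including an unset variable) means the default overwrite behaviour.
anon_target_mode() {
    if [ "${EVL_ANON_APPEND:-}" = "1" ]; then
        echo "append"
    else
        echo "overwrite"
    fi
}

# Example guard: print a warning when the target would be overwritten.
if [ "$(anon_target_mode)" = "overwrite" ]; then
    echo "warning: target data will be overwritten on re-run" >&2
fi
```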
If you have many files or tables to anonymize in one batch, you don’t need to run the anonymization jobs one after another; you can run them all through the generated workflow:
evl run workflow/anon/<source_name>.ewf
See Salt for details about handling the salt.