EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Anonymization Functions

All anonymization functions follow the same rules as string functions, i.e.:

  • when the argument is ‘nullptr’, the function returns ‘nullptr’;
  • when the (first) argument is a pointer, the function returns a pointer.
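
The two rules can be pictured with a small standalone C++ sketch. It is not EVL code: ‘mask’ is a hypothetical stand-in for any anonymization function, and returning ‘std::unique_ptr’ in the pointer case is an assumption made only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for an anonymization function (NOT an EVL function):
// it replaces every character with 'x'.
std::string mask(const std::string& s)
{
    return std::string(s.size(), 'x');
}

// Pointer variant: a null input yields a null result, any other input
// yields a pointer to the processed value.
std::unique_ptr<std::string> mask(const std::string* s)
{
    if (s == nullptr) {
        return nullptr;
    }
    return std::make_unique<std::string>(mask(*s));
}

// Small self-check of both rules.
inline void check_null_rules()
{
    const std::string name = "Kafka";
    const std::string* missing = nullptr;

    assert(mask(missing) == nullptr);  // nullptr in, nullptr out
    assert(*mask(&name) == "xxxxx");   // pointer in, pointer out
    assert(mask(name) == "xxxxx");     // value in, value out
}
```

Calling ‘check_null_rules()’ passes silently when both rules hold for the stand-in function.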

anonymize

anonymize(str, keep_chars, keep_char_class = false)
anonymize(str, min_length, max_length)

(since EVL 2.1)

The first argument ‘str’ is mandatory and is of data type ‘string’ or ‘ustring’. The function returns the same data type.

Parameter ‘keep_chars’ is a string of characters which are kept as they are, i.e. such characters are not anonymized. In most cases it makes sense to use a space here, but to anonymize an email, for example, you can specify "@.". For ‘ustring’ input it must be a ‘ustring’ as well, so u"@." in the email example.

When parameter ‘keep_char_class’ is ‘true’, capital letters are replaced by capital letters, lowercase letters by lowercase letters, and digits by digits.

Arguments ‘min_length, max_length’ specify the minimal and maximal length of the result. When no ‘min_length, max_length’ arguments are used, the function returns a string or ustring of the same length as the input.
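
The parameter semantics described above can be approximated by a standalone C++ sketch. This is not EVL's implementation: the namespace ‘sketch’ and the functions ‘keep_variant’ and ‘length_variant’ are hypothetical names, only plain ASCII ‘std::string’ is handled, and replacing every non-kept character by a random ASCII letter is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <random>
#include <string>

// Illustration only, NOT EVL's implementation. All names are hypothetical.
namespace sketch {

const std::string upper  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
const std::string lower  = "abcdefghijklmnopqrstuvwxyz";
const std::string digits = "0123456789";

// Pick one random character from a non-empty pool.
inline char pick(const std::string& pool, std::mt19937& gen)
{
    std::uniform_int_distribution<std::size_t> index(0, pool.size() - 1);
    return pool[index(gen)];
}

// Same-length variant: characters listed in keep_chars are copied,
// everything else is replaced by a random character.
inline std::string keep_variant(const std::string& str,
                                const std::string& keep_chars,
                                bool keep_char_class,
                                std::mt19937& gen)
{
    std::string out;
    for (char c : str) {
        const unsigned char uc = static_cast<unsigned char>(c);
        if (keep_chars.find(c) != std::string::npos) {
            out += c;                         // kept as is
        } else if (keep_char_class && std::isupper(uc)) {
            out += pick(upper, gen);          // capital stays capital
        } else if (keep_char_class && std::islower(uc)) {
            out += pick(lower, gen);          // lowercase stays lowercase
        } else if (keep_char_class && std::isdigit(uc)) {
            out += pick(digits, gen);         // digit stays digit
        } else {
            out += pick(upper + lower, gen);  // assumption: any ASCII letter
        }
    }
    assert(out.size() == str.size());         // length is preserved
    return out;
}

// Length variant: random letters, with the length drawn uniformly
// from the closed range [min_length, max_length].
inline std::string length_variant(std::size_t min_length,
                                  std::size_t max_length,
                                  std::mt19937& gen)
{
    assert(min_length <= max_length);
    std::uniform_int_distribution<std::size_t> length(min_length, max_length);
    std::string out(length(gen), ' ');
    for (char& c : out) {
        c = pick(upper + lower, gen);
    }
    return out;
}

}  // namespace sketch
```

For instance, with a generator ‘std::mt19937 gen(42)’, the call ‘sketch::keep_variant("joe@example.com", "@.", false, gen)’ returns a 15-character string that still has ‘@’ and ‘.’ at their original positions.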

Mapping examples:

out->anonymized_name = anonymize(in->name);
 // "Mircea Eliade" -> "icDoudVhaXYll" (same length)

out->anonymized_name = anonymize(in->name, " ");
 // "Mircea Eliade" -> "kJsqzt ZhGFts" (keep space)

out->anonymized_name = anonymize(in->name, " Maeiou");
 // "Mircea Eliade" -> "Misqea Jhiade" (keep also letters M,a,e,i,o,u)

out->anonymized_name = anonymize(in->name, " ", true);
 // "5 Mircea Eliade" -> "9 Piosdf Kiudpp" (keep space and char class)

out->anonymized_name = anonymize(in->name, 2, 10);
 // "Mircea Eliade" -> "jTro"       (length between 2 and 10)
 // "Franz Kafka"   -> "ksgTzDhoQf" (length between 2 and 10)

out->anonymized_name = anonymize(in->name, 0, length(in->name));
 // "Mircea Eliade" -> "lkdUuZytSd"
 // "Franz Kafka"   -> ""  // might be a NULL if 'name' is nullable

anonymize(ustr, locale, keep_chars, keep_char_class = false)
anonymize(ustr, locale, min_length, max_length)

(since EVL 2.5)

The first argument ‘ustr’ is mandatory and is of data type ‘ustring’. The function returns the same data type.

Arguments ‘keep_chars’, ‘keep_char_class’ and ‘min_length, max_length’ are the same as for the previous variant of the function, except that ‘keep_chars’ must be of ustring data type here.

Parameter ‘locale’ is an instance of class ulocale defined in the mapping. For example, the following mapping produces anonymized (ustring) output consisting of Spanish letters.

static ulocale my_locale("es_ES");
out->text_field =
    anonymize(u"Some text in Spanish.", my_locale, 1, 10);

Mapping examples with name and anonymized_name as ustring data type:

out->anonymized_name = anonymize(in->name);
 // "Leoš Janáček" -> "fQlKUHlduGus" (same length)

out->anonymized_name = anonymize(in->name, u" ");
 // "Leoš Janáček" -> "hGrT iUjSFeQ" (keep space)

out->anonymized_name = anonymize(in->name, u" š");
 // "Leoš Janáček" -> "jTDš oIZqqWv" (keep also letter š)

out->anonymized_name = anonymize(in->name, u" aeiou", true);
 // "8 Leoš Janáček" -> "3 Peoi Kařawec" (keep vowels and char class)

out->anonymized_name = anonymize(in->name, 2, 10);
 // "Bedřich Smetana" -> "SwpAq" (length between 2 and 10)
 // "Antonín Dvořák"  -> "Qs"    (length between 2 and 10)

out->anonymized_name = anonymize(in->name, 0, length(in->name));
 // "Bedřich Smetana" -> "HsgIusTFErq"
 // "Antonín Dvořák"  -> "" // might be a NULL if 'name' is nullable

anonymize(number, min, max)

(since EVL 2.1)

To be used for ‘number’ of any integral data type, for decimals and for floats. The function returns the same data type as ‘number’. Arguments ‘min’ and ‘max’ are offsets relative to ‘number’, as the example shows:

anonymize((int)100, -5, 10);
 // return integer between 95 and 110 (incl.)
anonymize(  100.00, -5, 10);
 // return float   between 95 and 110 (incl.)
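
The offset behaviour visible in this example can be sketched in standalone C++. This is an illustration, not EVL's implementation: the function names ‘offset_int’ and ‘offset_float’ are hypothetical, and a uniform distribution of the offset is an assumption.

```cpp
#include <cassert>
#include <random>

// Illustration only, NOT EVL's implementation; names are hypothetical.

// Integer case: the result is number + offset, with the offset drawn
// uniformly from the closed range [min, max].
inline long offset_int(long number, long min, long max, std::mt19937& gen)
{
    assert(min <= max);
    std::uniform_int_distribution<long> offset(min, max);
    return number + offset(gen);
}

// Floating-point case. Note that std::uniform_real_distribution is specified
// for the half-open range [min, max), so this sketch differs slightly from
// the inclusive upper bound stated in the example.
inline double offset_float(double number, double min, double max,
                           std::mt19937& gen)
{
    assert(min < max);
    std::uniform_real_distribution<double> offset(min, max);
    return number + offset(gen);
}
```

With ‘std::mt19937 gen(7)’, a call such as ‘offset_int(100, -5, 10, gen)’ always yields a value between 95 and 110.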

anonymize_uniq

anonymize_uniq()

(since EVL 2.1)

Example:

out->anonymized_username = anonymize_uniq(in->id);

anonymize_iban

anonymize_iban()

(since EVL 2.4)

Example:

string iban  = "NL91 ABNA 0417 1643 00";
string iban2 = "NL91ABNA0417164300";

anonymize_iban(iban);
              // return .... .... .... .... ..
anonymize_iban(iban2);
              // return ..................
anonymize_iban(iban, iban_anon::keep_country);
              // return NL.. .... .... .... ..
anonymize_iban(iban, iban_anon::keep_country_and_bank);
              // return NL.. ABNA .... .... ..
anonymize_iban(iban, iban_anon::whole, iban_form::grouped);
              // return .... .... .... .... ..
anonymize_iban(iban, iban_anon::whole, iban_form::compact);
              // return ..................
anonymize_iban(iban, iban_anon::keep_country, iban_form::compact);
              // return NL................
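
The output shapes suggested by these examples can be modelled with a standalone C++ sketch. It is not EVL's implementation and all names (‘anon_mode’, ‘form_mode’, ‘iban_shape’) are hypothetical. It writes ‘.’ for every anonymized position, as the comments above do, and assumes that the default form follows the input and that the bank code occupies positions 5 to 8, which holds for the Dutch example but not for every country.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustration only, NOT EVL's implementation; all names are hypothetical.
enum class anon_mode { whole, keep_country, keep_country_and_bank };
enum class form_mode { as_input, grouped, compact };

// Returns the "shape" of an anonymized IBAN: kept characters are copied,
// every anonymized position is shown as '.'.
inline std::string iban_shape(const std::string& iban,
                              anon_mode anon = anon_mode::whole,
                              form_mode form = form_mode::as_input)
{
    // Strip spaces and remember whether the input was grouped.
    std::string compact;
    bool had_spaces = false;
    for (char c : iban) {
        if (c == ' ') {
            had_spaces = true;
        } else {
            compact += c;
        }
    }
    // Room for country code, check digits and a 4-character bank code.
    assert(compact.size() >= 8);

    for (std::size_t i = 0; i < compact.size(); ++i) {
        const bool is_country = i < 2;
        const bool is_bank = i >= 4 && i < 8;  // assumption: NL-style layout
        const bool keep =
            (anon != anon_mode::whole && is_country) ||
            (anon == anon_mode::keep_country_and_bank && is_bank);
        if (!keep) {
            compact[i] = '.';
        }
    }

    const bool grouped = form == form_mode::grouped ||
                         (form == form_mode::as_input && had_spaces);
    if (!grouped) {
        return compact;
    }

    // Insert a space after every block of four characters.
    std::string out;
    for (std::size_t i = 0; i < compact.size(); ++i) {
        if (i > 0 && i % 4 == 0) {
            out += ' ';
        }
        out += compact[i];
    }
    return out;
}
```

For the grouped input "NL91 ABNA 0417 1643 00", ‘iban_shape(iban, anon_mode::keep_country_and_bank)’ yields "NL.. ABNA .... .... ..".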