String Functions
All string manipulation functions can be used in two ways:
- with pointers (preferred)
- without pointers (i.e. on dereferenced values, “with a star”)
The pointer form is preferred because it can handle NULL values (‘nullptr’, in fact). So these two examples:
out->field = str_function(in->field);
*out->field = str_function(*in->field);
are basically the same, but the second one fails when ‘in->field’ is NULL (i.e. ‘nullptr’).
Unless stated otherwise, these two rules apply to all string manipulation functions described in this section:
- When the first argument is a pointer, the function also returns a pointer.
- When the first argument is ‘nullptr’, the function returns ‘nullptr’ as well.
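The two rules above can be sketched with a pair of overloads. The function ‘shout’ below is hypothetical (it is not part of the library), and the ownership of the returned pointer is an assumption made only for this sketch; it merely illustrates the pointer-in/pointer-out and nullptr-propagation pattern.

```cpp
#include <cassert>
#include <string>

// Hypothetical value overload: takes a reference, returns a value.
std::string shout(const std::string& str) {
    return str + "!";
}

// Hypothetical pointer overload: takes a pointer, returns a pointer.
// A nullptr input is passed through instead of being dereferenced.
// Assumption: the caller owns (and deletes) the returned string.
std::string* shout(const std::string* const str) {
    if (str == nullptr) {
        return nullptr;
    }
    return new std::string(shout(*str));
}

// Small self-check of both rules.
void check_pointer_rules() {
    const std::string name = "Jan";
    std::string* result = shout(&name);      // pointer in, pointer out
    assert(result != nullptr && *result == "Jan!");
    delete result;

    const std::string* missing = nullptr;
    assert(shout(missing) == nullptr);       // nullptr in, nullptr out
}
```

The value overload has no way to represent a missing input, which is why the pointer form is the safer choice for nullable fields.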
Checksum Functions
Standard checksum functions – ‘md5sum()’, ‘sha224sum()’, ‘sha256sum()’, ‘sha384sum()’, ‘sha512sum()’ – can be used in a mapping like this, for example:
*out->anonymized_username = sha256sum(*in->username);
When the argument is ‘nullptr’, the function returns ‘nullptr’. To benefit from this, the mapping has to use the pointer form, so the example would (preferably) look like:
out->anonymized_username = sha256sum(in->username);
Function headers:
std::string  md5sum(const char* const str);
std::string  md5sum(const std::string& str);
std::string* md5sum(const std::string* const str);
std::string  sha224sum(const char* const str);
std::string  sha224sum(const std::string& str);
std::string* sha224sum(const std::string* const str);
std::string  sha256sum(const char* const str);
std::string  sha256sum(const std::string& str);
std::string* sha256sum(const std::string* const str);
std::string  sha384sum(const char* const str);
std::string  sha384sum(const std::string& str);
std::string* sha384sum(const std::string* const str);
std::string  sha512sum(const char* const str);
std::string  sha512sum(const std::string& str);
std::string* sha512sum(const std::string* const str);
length
Returns the length of the given string.
For ‘nullptr’ it returns ‘nullptr’ again.
Example:
length(std::string("Some text")) // returns 9
length(nullptr) // returns nullptr
In a mapping it might look like this (using pointers):
out->str_len = length(in->first_name);
split
Splits the string into a vector of substrings at each occurrence of the delimiter.
Example:
split("Some text, another text.", ' ') // returns vector ["Some", "text,", "another", "text."]
When the first argument is ‘nullptr’, it returns ‘nullptr’.
In a mapping it might look like this (without pointers):
static std::vector<std::string> name_vec;
name_vec = split(*in->full_name, ' ');
*out->first_name = name_vec[0];
*out->last_name = name_vec[1];
or (preferably) using pointers:
static std::vector<std::string*>* name_vec;
name_vec = split(in->full_name, ' ');
// name_vec points to a vector, so it is dereferenced before indexing
out->first_name = (*name_vec)[0];
out->last_name = (*name_vec)[1];
Function headers:
std::vector<std::string> split(const std::string& str, const char delimiter);
std::vector<std::string*>* split(const std::string* const str, const char delimiter);
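To make the behaviour of the value overload concrete, here is a minimal sketch of how such a split could work. The names ‘sketch_split’ and ‘piece_or_empty’ are hypothetical, and keeping empty pieces between adjacent delimiters is an assumption; the library's actual handling of that case is not described in this section.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of a value-returning split: cut `str` at every `delimiter`.
// Assumption: empty pieces between adjacent delimiters are kept.
std::vector<std::string> sketch_split(const std::string& str, const char delimiter) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = str.find(delimiter, start);
        if (pos == std::string::npos) {
            parts.push_back(str.substr(start));          // last piece
            break;
        }
        parts.push_back(str.substr(start, pos - start)); // piece before delimiter
        start = pos + 1;
    }
    return parts;
}

// Guarded access: the piece at `index`, or an empty string if it is absent.
std::string piece_or_empty(const std::vector<std::string>& parts, const std::size_t index) {
    return index < parts.size() ? parts[index] : std::string();
}

void check_split() {
    const std::vector<std::string> parts = sketch_split("Some text, another text.", ' ');
    assert(parts.size() == 4);
    assert(parts[0] == "Some" && parts[3] == "text.");
    assert(piece_or_empty(parts, 7).empty());            // out of range, no crash
}
```

A guarded accessor like ‘piece_or_empty’ may be worth considering in a mapping, since indexing element 1 of the result is only valid when the input really contains a delimiter.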
starts_with, ends_with
Return true if the string starts or ends with the given substring.
When the first argument is ‘nullptr’, the functions return false.
Example:
starts_with("Some text", "Some") // returns true
starts_with("Some text", "x") // returns false
starts_with(nullptr, "x") // returns false
ends_with("Some text", "ext") // returns true
ends_with("Some text", "x") // returns false
In a mapping it might look like this:
// "Some" is only an example prefix
*out->test_field = starts_with(in->test_field, "Some") ? "OK" : "NOK";
Function headers:
bool starts_with(const std::string& str, const char* const prefix);
bool starts_with(const std::string* const str, const char* const prefix);
bool starts_with(const std::string& str, const std::string& prefix);
bool starts_with(const std::string* const str, const std::string& prefix);

bool ends_with(const std::string& str, const char* const suffix);
bool ends_with(const std::string* const str, const char* const suffix);
bool ends_with(const std::string& str, const std::string& suffix);
bool ends_with(const std::string* const str, const std::string& suffix);
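The prefix and suffix tests, including the rule that ‘nullptr’ yields false, could be sketched as follows. The ‘sketch_’ names are hypothetical and this is not the library's implementation, only an illustration of the documented behaviour.

```cpp
#include <cassert>
#include <string>

// Sketch: true if `str` begins with `prefix`.
bool sketch_starts_with(const std::string& str, const std::string& prefix) {
    return str.size() >= prefix.size() &&
           str.compare(0, prefix.size(), prefix) == 0;
}

// Sketch: true if `str` ends with `suffix`.
bool sketch_ends_with(const std::string& str, const std::string& suffix) {
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pointer overload: a nullptr string never matches, so the result is false.
bool sketch_starts_with(const std::string* const str, const std::string& prefix) {
    return str != nullptr && sketch_starts_with(*str, prefix);
}

void check_starts_ends() {
    const std::string text = "Some text";
    assert(sketch_starts_with(text, "Some"));
    assert(!sketch_starts_with(text, "x"));
    assert(sketch_ends_with(text, "ext"));
    assert(!sketch_ends_with(text, "x"));
    const std::string* missing = nullptr;
    assert(!sketch_starts_with(missing, "x"));
}
```

Because the result is a plain bool, the pointer overload can be used directly in a condition without a separate null check.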
str_compress, str_uncompress
Compress/uncompress the given string. Examples which return pointers:
str_compress(in->string_field_to_compress) // snappy by default
str_compress(in->string_field_to_compress, compression::gzip)
str_uncompress(in->snappy_field) // snappy by default
str_uncompress(in->gzipped_field, compression::gzip)
Examples which return string values:
str_compress(*in->string_field_to_compress) // snappy by default
str_compress(*in->string_field_to_compress, compression::gzip)
str_uncompress(*in->snappy_field) // snappy by default
str_uncompress(*in->gzipped_field, compression::gzip)
When the first argument is ‘nullptr’, it returns ‘nullptr’.
In a mapping it might look like this:
out->gzipped_field = str_compress(in->string_field, compression::gzip);
Function headers:
std::string str_compress(const std::string& str, const compression method = compression::snappy);
std::string* str_compress(const std::string* const str, const compression method = compression::snappy);

std::string str_uncompress(const std::string& str, const compression method = compression::snappy);
std::string* str_uncompress(const std::string* const str, const compression method = compression::snappy);
str_count
Counts the number of occurrences of the given string or character. Example:
str_count("Some text, another text.", ' ') // returns 3
str_count("Some text, another text.", "text") // returns 2
When the first argument is ‘nullptr’, it returns ‘nullptr’.
In a mapping it might look like this (using pointers):
out->jan_cnt = str_count(in->first_name, "Jan");
or without pointers:
*out->jan_cnt = str_count(*in->first_name, "Jan");
Function headers:
std::size_t str_count(const std::string& str, const char ch);
std::size_t* str_count(const std::string* const str, const char ch);
std::size_t str_count(const std::string& str, const char* const substr);
std::size_t* str_count(const std::string* const str, const char* const substr);
std::size_t str_count(const std::string& str, const std::string& substr);
std::size_t* str_count(const std::string* const str, const std::string& substr);
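As an illustration of the counting, the sketch below reproduces the two example results. The name ‘sketch_str_count’ is hypothetical, and two details are assumptions because this section does not specify them: matches are counted without overlap, and an empty search string counts as 0.

```cpp
#include <cassert>
#include <string>

// Sketch: count occurrences of `substr` in `str`.
// Assumptions: matches do not overlap; an empty `substr` yields 0.
std::size_t sketch_str_count(const std::string& str, const std::string& substr) {
    if (substr.empty()) {
        return 0;
    }
    std::size_t count = 0;
    std::size_t pos = str.find(substr);
    while (pos != std::string::npos) {
        ++count;
        pos = str.find(substr, pos + substr.size()); // continue after the match
    }
    return count;
}

void check_str_count() {
    const std::string text = "Some text, another text.";
    assert(sketch_str_count(text, " ") == 3);
    assert(sketch_str_count(text, "text") == 2);
    assert(sketch_str_count(text, "xyz") == 0);
}
```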
str_index, str_rindex
str_index(str, substr) returns the index (counted from 0) of the first occurrence of the given substring.
str_rindex(str, substr) returns the index (counted from 0) of the last occurrence of the given substring.
When there is no match, ‘-1’ is returned.
When the string is ‘nullptr’, it returns ‘nullptr’.
Examples:
str_index("Some text text", "text") // returns 5
str_index("Some text text", "xyz") // returns -1
str_index(nullptr, "x") // returns nullptr
str_rindex("Some text text", "text") // returns 10
Function headers:
std::int64_t str_index(const std::string& str, const char* const substr);
std::int64_t* str_index(const std::string* const str, const char* const substr);
std::int64_t str_index(const std::string& str, const std::string& substr);
std::int64_t* str_index(const std::string* const str, const std::string& substr);

std::int64_t str_rindex(const std::string& str, const char* const substr);
std::int64_t* str_rindex(const std::string* const str, const char* const substr);
std::int64_t str_rindex(const std::string& str, const std::string& substr);
std::int64_t* str_rindex(const std::string* const str, const std::string& substr);
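The index lookup with ‘-1’ for "no match" maps naturally onto the standard ‘find’ and ‘rfind’ members of std::string. The following is a minimal sketch under that assumption; the ‘sketch_’ names are hypothetical and the library may be implemented differently.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Sketch: 0-based index of the first occurrence, or -1 when there is none.
std::int64_t sketch_str_index(const std::string& str, const std::string& substr) {
    const std::size_t pos = str.find(substr);
    return pos == std::string::npos ? -1 : static_cast<std::int64_t>(pos);
}

// Sketch: 0-based index of the last occurrence, or -1 when there is none.
std::int64_t sketch_str_rindex(const std::string& str, const std::string& substr) {
    const std::size_t pos = str.rfind(substr);
    return pos == std::string::npos ? -1 : static_cast<std::int64_t>(pos);
}

void check_str_index() {
    assert(sketch_str_index("Some text text", "text") == 5);
    assert(sketch_str_index("Some text text", "xyz") == -1);
    assert(sketch_str_rindex("Some text text", "text") == 10);
}
```

Returning a signed 64-bit value is what makes the ‘-1’ sentinel possible, which matches the ‘std::int64_t’ return type in the headers above.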
str_mask_left, str_mask_right
These functions return the string with its visible characters replaced by the given character, starting from the given direction, while keeping the specified number of characters unchanged.
Example:
str_mask_left("abcd text efgh", 6) // returns "abcd tex* ****"
str_mask_right("1234567890", 3, '-') // returns "---4567890"
Without the third argument, an asterisk ‘*’ is assumed.
When the first argument is ‘nullptr’, these functions return ‘nullptr’.
Function headers:
std::string str_mask_left(const std::string& str, const std::size_t keep, const char ch = '*');
std::string* str_mask_left(const std::string* const str, const std::size_t keep, const char ch = '*');
std::string str_mask_right(const std::string& str, const std::size_t keep, const char ch = '*');
std::string* str_mask_right(const std::string* const str, const std::size_t keep, const char ch = '*');
str_pad_left, str_pad_right
Pad the string from the left/right with the specified character (space by default) up to the given length. The length is counted in bytes, not characters, so be careful with multibyte encodings.
Example:
str_pad_left("123",7,'0') // returns "0000123"
str_pad_right("text",7) // returns "text   "
str_pad_right("text",2) // returns "text"
str_pad_left("Groß",6,'*') // returns "*Groß" as "ß" has 2 bytes
When the first argument is ‘nullptr’, these functions return ‘nullptr’.
Function headers:
std::string str_pad_left(const std::string& str, const std::size_t length, const char ch = ' ');
std::string* str_pad_left(const std::string* const str, const std::size_t length, const char ch = ' ');
std::string str_pad_right(const std::string& str, const std::size_t length, const char ch = ' ');
std::string* str_pad_right(const std::string* const str, const std::size_t length, const char ch = ' ');
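The byte-counting behaviour can be illustrated with a short sketch. The ‘sketch_’ names are hypothetical, and the UTF-8 bytes of "ß" are written as explicit escapes so that the example does not depend on the source file encoding.

```cpp
#include <cassert>
#include <string>

// Sketch: pad on the left with `ch` until the string is `length` bytes long.
// Strings that are already long enough are returned unchanged.
std::string sketch_str_pad_left(const std::string& str, const std::size_t length,
                                const char ch = ' ') {
    if (str.size() >= length) {
        return str;
    }
    return std::string(length - str.size(), ch) + str;
}

// Sketch: the same, padding on the right.
std::string sketch_str_pad_right(const std::string& str, const std::size_t length,
                                 const char ch = ' ') {
    if (str.size() >= length) {
        return str;
    }
    return str + std::string(length - str.size(), ch);
}

void check_str_pad() {
    assert(sketch_str_pad_left("123", 7, '0') == "0000123");
    assert(sketch_str_pad_right("text", 7) == "text   ");
    assert(sketch_str_pad_right("text", 2) == "text");
    // "Gro\xC3\x9F" is "Groß" in UTF-8: 4 characters but 5 bytes,
    // so padding to 6 bytes adds only a single '*'.
    assert(sketch_str_pad_left("Gro\xC3\x9F", 6, '*') == "*Gro\xC3\x9F");
}
```

Since std::string::size() reports bytes, a fixed-width output column filled with multibyte text will look shorter than expected when displayed.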
str_replace
Examples:
str_replace("Some text", ' ', '-') // returns "Some-text"
str_replace("Some text", "Some", "Any") // returns "Any text"
str_replace("Some text", ' ', "SPACE") // returns "SomeSPACEtext"
When the first argument is ‘nullptr’, it returns ‘nullptr’.
In a mapping it might look like this:
out->name = str_replace(in->name, ' ', '-');
Function headers:
std::string str_replace(const std::string& str, const char old_ch, const char new_ch);
std::string* str_replace(const std::string* const str, const char old_ch, const char new_ch);
std::string str_replace(const std::string& str, const char* const old_substr, const char* const new_substr);
std::string* str_replace(const std::string* const str, const char* const old_substr, const char* const new_substr);
std::string str_replace(const std::string& str, const std::string& old_substr, const std::string& new_substr);
std::string* str_replace(const std::string* const str, const std::string& old_substr, const std::string& new_substr);
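A substring replacement consistent with the examples could be sketched like this. The name ‘sketch_str_replace’ is hypothetical. That every occurrence is replaced (not just the first) and that an empty search string leaves the input unchanged are assumptions; the examples in this section contain only one occurrence each.

```cpp
#include <cassert>
#include <string>

// Sketch: replace occurrences of `old_substr` with `new_substr`.
// Assumptions: all occurrences are replaced; an empty `old_substr` is a no-op.
std::string sketch_str_replace(const std::string& str,
                               const std::string& old_substr,
                               const std::string& new_substr) {
    if (old_substr.empty()) {
        return str;
    }
    std::string result;
    std::size_t start = 0;
    std::size_t pos = str.find(old_substr);
    while (pos != std::string::npos) {
        result.append(str, start, pos - start);   // text before the match
        result += new_substr;
        start = pos + old_substr.size();
        pos = str.find(old_substr, start);
    }
    result.append(str, start, std::string::npos); // remaining tail
    return result;
}

void check_str_replace() {
    assert(sketch_str_replace("Some text", " ", "-") == "Some-text");
    assert(sketch_str_replace("Some text", "Some", "Any") == "Any text");
    assert(sketch_str_replace("Some text", " ", "SPACE") == "SomeSPACEtext");
}
```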
str_to_hex, hex_to_str
Convert an ordinary string to its hexadecimal representation and vice versa.
When the first argument is ‘nullptr’, the functions also return ‘nullptr’.
Examples:
str_to_hex("Some text") // returns "536f6d652074657874"
hex_to_str("536f6d652074657874") // returns "Some text"
Function headers:
std::string str_to_hex(const std::string& str);
std::string* str_to_hex(const std::string* const str);
std::string hex_to_str(const std::string& str);
std::string* hex_to_str(const std::string* const str);
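The conversion works byte by byte: each byte becomes two lowercase hexadecimal digits. A minimal sketch follows; the ‘sketch_’ names are hypothetical, and the handling of malformed input in the reverse direction (here: std::stoi throws on non-hex digits, and a trailing odd digit is ignored) is an assumption, since this section does not describe it.

```cpp
#include <cassert>
#include <string>

// Sketch: encode every byte of `str` as two lowercase hex digits.
std::string sketch_str_to_hex(const std::string& str) {
    static const char digits[] = "0123456789abcdef";
    std::string hex;
    hex.reserve(str.size() * 2);
    for (const unsigned char byte : str) {
        hex.push_back(digits[byte >> 4]);     // high nibble
        hex.push_back(digits[byte & 0x0F]);   // low nibble
    }
    return hex;
}

// Sketch: decode pairs of hex digits back into bytes.
// Assumption: input is well formed; std::stoi throws otherwise.
std::string sketch_hex_to_str(const std::string& hex) {
    std::string out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out.push_back(static_cast<char>(std::stoi(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

void check_hex() {
    assert(sketch_str_to_hex("Some text") == "536f6d652074657874");
    assert(sketch_hex_to_str("536f6d652074657874") == "Some text");
}
```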
substr
Returns a substring of the specified length, starting at the given position (counted from 0).
Example:
substr("123456789",0,2) // returns "12"
substr("123456789",6) // returns "789"
Without the third argument, it returns the rest of the string.
When the first argument is ‘nullptr’, the function returns ‘nullptr’.
Function headers:
std::string substr(const std::string& str, const std::size_t pos = 0, const std::int64_t count = std::numeric_limits<std::int64_t>::max());
std::string* substr(const std::string* const str, const std::size_t pos = 0, const std::int64_t count = std::numeric_limits<std::int64_t>::max());
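The value overload behaves much like std::string::substr, which the sketch below wraps. The name ‘sketch_substr’ is hypothetical. Returning an empty string for an out-of-range position or a non-positive count is an assumption; the signed ‘count’ parameter suggests the library may treat negative values specially, but this section does not say how.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>

// Sketch: substring of `count` bytes starting at 0-based position `pos`.
// Assumption: out-of-range `pos` or non-positive `count` yields "".
std::string sketch_substr(const std::string& str, const std::size_t pos = 0,
                          const std::int64_t count =
                              std::numeric_limits<std::int64_t>::max()) {
    if (pos >= str.size() || count <= 0) {
        return std::string();
    }
    // std::string::substr clamps the count to the end of the string.
    return str.substr(pos, static_cast<std::size_t>(count));
}

void check_substr() {
    assert(sketch_substr("123456789", 0, 2) == "12");
    assert(sketch_substr("123456789", 6) == "789");
    assert(sketch_substr("123456789", 20).empty());
}
```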
trim, trim_left, trim_right
Example:
trim(" text ") // returns "text"
trim_left(" text ") // returns "text "
trim_right("--text---", '-') // returns "--text"
Trim the character ‘ch’ from both sides, from the left, or from the right, respectively. Without the second argument, a space is assumed.
When the first argument is ‘nullptr’, these functions return ‘nullptr’.
Function headers:
std::string trim(const std::string& str, const char ch = ' ');
std::string* trim(const std::string* const str, const char ch = ' ');
std::string trim_left(const std::string& str, const char ch = ' ');
std::string* trim_left(const std::string* const str, const char ch = ' ');
std::string trim_right(const std::string& str, const char ch = ' ');
std::string* trim_right(const std::string* const str, const char ch = ' ');
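Trimming a single character from either end can be sketched with ‘find_first_not_of’ and ‘find_last_not_of’. The ‘sketch_’ names are hypothetical; this is an illustration of the documented behaviour, not the library's code.

```cpp
#include <cassert>
#include <string>

// Sketch: remove leading occurrences of `ch`.
std::string sketch_trim_left(const std::string& str, const char ch = ' ') {
    const std::size_t first = str.find_first_not_of(ch);
    return first == std::string::npos ? std::string() : str.substr(first);
}

// Sketch: remove trailing occurrences of `ch`.
std::string sketch_trim_right(const std::string& str, const char ch = ' ') {
    const std::size_t last = str.find_last_not_of(ch);
    return last == std::string::npos ? std::string() : str.substr(0, last + 1);
}

// Sketch: remove `ch` from both ends.
std::string sketch_trim(const std::string& str, const char ch = ' ') {
    return sketch_trim_left(sketch_trim_right(str, ch), ch);
}

void check_trim() {
    assert(sketch_trim(" text ") == "text");
    assert(sketch_trim_left(" text ") == "text ");
    assert(sketch_trim_right("--text---", '-') == "--text");
    assert(sketch_trim("    ").empty());   // only trim characters
}
```

Note that only the one given character is removed; tabs or newlines stay in place when the default space is used.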
uppercase, lowercase
Examples:
uppercase("AbCd") // returns "ABCD"
lowercase("AbCd") // returns "abcd"
When the argument is ‘nullptr’, these functions return ‘nullptr’.
Without the second parameter, the functions act only on ‘A-Z’ and ‘a-z’.
When national letters (with diacritics, for example) also need to be converted, the locale can be specified as the second parameter:
static std::locale de_locale("de_DE.utf8");
*out->field_upcase = uppercase(*in->field, de_locale);
It is possible to specify the locale in the function call as a string, but a static locale object is recommended for performance reasons.
Function headers:
std::string uppercase(const std::string& str);
std::string* uppercase(const std::string* const str);
std::string uppercase(const std::string& str, const std::locale& locale);
std::string* uppercase(const std::string* const str, const std::locale& locale);
std::string lowercase(const std::string& str);
std::string* lowercase(const std::string* const str);
std::string lowercase(const std::string& str, const std::locale& locale);
std::string* lowercase(const std::string* const str, const std::locale& locale);