EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

String Functions

All string manipulation functions can be used in two ways:

  • with pointers (preferred)
  • without pointers (i.e. as referenced values, “with star”)

The pointer variant is preferred because it can handle NULL values (‘nullptr’ in fact). So these two examples:

out->field  = str_function(in->field);
*out->field = str_function(*in->field);

are basically the same, but the second one fails when ‘in->field’ is NULL (i.e. ‘nullptr’).


These two rules apply to all string manipulation functions described in this section:

  • When the first argument is a pointer, the function also returns a pointer.
  • When the first argument is ‘nullptr’, the function returns ‘nullptr’ as well.
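
The two rules can be illustrated with a small standalone sketch. It is not EVL code: ‘shout’ is a hypothetical function invented for this illustration, and ‘std::unique_ptr’ stands in for the returned pointer, because this section does not describe how EVL manages the memory behind the pointers it returns.

```cpp
#include <memory>
#include <string>

// Value variant: the argument is a reference, so it can never be NULL.
std::string shout(const std::string& s) { return s + "!"; }

// Pointer variant: a null input short-circuits to a null result,
// otherwise the work is delegated to the value variant.
// (Hypothetical helper; unique_ptr is an assumption of this sketch.)
std::unique_ptr<std::string> shout(const std::string* s) {
    if (s == nullptr) return nullptr;
    return std::unique_ptr<std::string>(new std::string(shout(*s)));
}
```

Dereferencing a null pointer before the call (the “with star” form) never reaches the null check, which is why the value form cannot handle NULL.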

length

(since EVL 2.0)

Returns the length of the given string.

For ‘nullptr’ it returns ‘nullptr’ again.

Example:

length((string)"Some text")     // return 9
length(nullptr)                 // return nullptr

In mapping it might look like this (using pointers):

out->str_len = length(in->first_name);

split

(since EVL 1.3)

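Splits the given string at each occurrence of the delimiter character and returns a vector of the parts.

Example:

split("Some text, another text.", ' ')
     // returns vector ["Some", "text,", "another", "text."]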

When the first argument is ‘nullptr’, it returns ‘nullptr’.

In mapping it might look like this (without pointers):

static std::vector<std::string> name_vec;

name_vec = split(*in->full_name, ' ');
*out->first_name = name_vec[0];
*out->last_name  = name_vec[1];

or (preferably) using pointers:

static std::vector<std::string*>* name_vec;

name_vec = split(in->full_name, ' ');
out->first_name = (*name_vec)[0];
out->last_name  = (*name_vec)[1];

Function headers:

std::vector<std::string>   split(const std::string& str, \
                                 const char delimiter);
std::vector<std::string*>* split(const std::string* const str, \
                                 const char delimiter);
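
For readers who want to see the splitting logic spelled out, here is a minimal standalone sketch using only the standard library. ‘split_sketch’ is a hypothetical name, not the EVL implementation. Keeping empty parts between consecutive delimiters is an assumption of this sketch; the section does not say how EVL treats them.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical re-implementation of the value variant, for illustration only.
std::vector<std::string> split_sketch(const std::string& str, char delimiter) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (true) {
        const std::size_t pos = str.find(delimiter, start);
        if (pos == std::string::npos) {
            parts.push_back(str.substr(start));  // last (or only) part
            break;
        }
        parts.push_back(str.substr(start, pos - start));
        start = pos + 1;  // continue right after the delimiter
    }
    return parts;
}
```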

starts_with, ends_with

(since EVL 2.0)

True if a string starts or ends with the given substring.

When the first argument is ‘nullptr’, it returns False.

Example:

starts_with("Some text", "Some")   // return True
starts_with("Some text", "x")      // return False
starts_with(nullptr, "x")          // return False
ends_with("Some text", "ext")      // return True
ends_with("Some text", "x")        // return False

In mapping it might look like this:

*out->test_field = starts_with(in->test_field, "Some") ? "OK" : "NOK";

Function headers:

bool starts_with(const std::string& str, const char* const prefix);
bool starts_with(const std::string* const str, const char* const prefix);
bool starts_with(const std::string& str, const std::string& prefix);
bool starts_with(const std::string* const str, const std::string& prefix);
bool ends_with(const std::string& str, const char* const suffix);
bool ends_with(const std::string* const str, const char* const suffix);
bool ends_with(const std::string& str, const std::string& suffix);
bool ends_with(const std::string* const str, const std::string& suffix);
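
The documented behaviour, including the ‘nullptr’ case, can be reproduced with a short standalone sketch. The ‘_sketch’ names are hypothetical and only mirror the headers above; this is not the EVL implementation.

```cpp
#include <string>

// True when 'str' begins with 'prefix' (byte-wise comparison).
bool starts_with_sketch(const std::string& str, const std::string& prefix) {
    return str.size() >= prefix.size() &&
           str.compare(0, prefix.size(), prefix) == 0;
}

// True when 'str' ends with 'suffix' (byte-wise comparison).
bool ends_with_sketch(const std::string& str, const std::string& suffix) {
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pointer variant: NULL never matches, so the result is false.
bool starts_with_sketch(const std::string* str, const std::string& prefix) {
    return str != nullptr && starts_with_sketch(*str, prefix);
}
```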

str_compress, str_uncompress

(since EVL 2.0)

Compress/uncompress the given string. Examples which return pointers:

str_compress(in->string_field_to_compress)       // snappy by default
str_compress(in->string_field_to_compress, compression::gzip)
str_uncompress(in->snappy_field)                 // snappy by default
str_uncompress(in->gzipped_field, compression::gzip)

Examples which return string values:

str_compress(*in->string_field_to_compress)      // snappy by default
str_compress(*in->string_field_to_compress, compression::gzip)
str_uncompress(*in->snappy_field)                // snappy by default
str_uncompress(*in->gzipped_field, compression::gzip)

When the first argument is ‘nullptr’, it returns ‘nullptr’.

In mapping it might look like this:

out->gzipped_field = str_compress(in->string_field, compression::gzip);

Function headers:

std::string str_compress(const std::string& str, \
             const compression method = compression::snappy);
std::string* str_compress(const std::string* const str, \
             const compression method = compression::snappy);
std::string str_uncompress(const std::string& str, \
             const compression method = compression::snappy);
std::string* str_uncompress(const std::string* const str, \
             const compression method = compression::snappy);

str_count

(since EVL 1.3)

Counts the number of occurrences of the given substring or character. Example:

str_count("Some text, another text.", ' ')     // returns 3
str_count("Some text, another text.", "text")  // returns 2

When the first argument is ‘nullptr’, it returns ‘nullptr’.

In mapping it might look like this (using pointers):

out->jan_cnt  = str_count(in->first_name, "Jan");

or without pointers:

*out->jan_cnt = str_count(*in->first_name, "Jan");

Function headers:

std::size_t  str_count(const std::string& str, const char ch);
std::size_t* str_count(const std::string* const str, const char ch);
std::size_t  str_count(const std::string& str, const char* const substr);
std::size_t* str_count(const std::string* const str, \
                       const char* const substr);
std::size_t  str_count(const std::string& str, const std::string& substr);
std::size_t* str_count(const std::string* const str, \
                       const std::string& substr);
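
A minimal standalone sketch of the counting logic follows. ‘count_sketch’ is a hypothetical name, not the EVL implementation. Two points are assumptions of this sketch, since the section does not specify them: matches are counted without overlap, and an empty pattern yields 0.

```cpp
#include <cstddef>
#include <string>

// Hypothetical re-implementation of the value variant, for illustration only.
std::size_t count_sketch(const std::string& str, const std::string& substr) {
    if (substr.empty()) return 0;  // assumption: an empty pattern matches nothing
    std::size_t count = 0;
    std::size_t pos = str.find(substr);
    while (pos != std::string::npos) {
        ++count;
        pos = str.find(substr, pos + substr.size());  // skip past this match
    }
    return count;
}
```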

str_index, str_rindex

(since EVL 2.0)

str_index(str,substr)

it returns the index (counted from 0) of the first occurrence of the given substring,

str_rindex(str,substr)

it returns the index (counted from 0) of the last occurrence of the given substring.

When there is no match, ‘-1’ is returned.

When the string is ‘nullptr’, it returns ‘nullptr’.

Examples:

str_index("Some text text", "text")   // return 5
str_index("Some text text", "xyz")    // return -1
str_index(nullptr, "x")               // return nullptr
str_rindex("Some text text", "text")  // return 10

Function headers:

std::int64_t  str_index(const std::string& str, const char* const substr);
std::int64_t* str_index(const std::string* const str, \
                        const char* const substr);
std::int64_t  str_index(const std::string& str, const std::string& substr);
std::int64_t* str_index(const std::string* const str, \
                        const std::string& substr);
std::int64_t  str_rindex(const std::string& str, const char* const substr);
std::int64_t* str_rindex(const std::string* const str, \
                         const char* const substr);
std::int64_t  str_rindex(const std::string& str, const std::string& substr);
std::int64_t* str_rindex(const std::string* const str, \
                         const std::string& substr);
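
The ‘-1’ convention maps naturally onto ‘std::string::find’ and ‘std::string::rfind’. The following standalone sketch shows one way to express it; the ‘_sketch’ names are hypothetical and this is not the EVL implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Index of the first occurrence, or -1 when there is none.
std::int64_t index_sketch(const std::string& str, const std::string& substr) {
    const std::size_t pos = str.find(substr);
    return pos == std::string::npos ? -1 : static_cast<std::int64_t>(pos);
}

// Index of the last occurrence, or -1 when there is none.
std::int64_t rindex_sketch(const std::string& str, const std::string& substr) {
    const std::size_t pos = str.rfind(substr);
    return pos == std::string::npos ? -1 : static_cast<std::int64_t>(pos);
}
```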

str_join

(since EVL 2.4)

str_join(vector_of_strings,delimiter)

it returns the string of concatenated vector members, delimited by a specified delimiter.

When the vector is ‘nullptr’, it returns ‘nullptr’.

Examples of a mapping:

static std::vector<std::string> x{"Here", "is", "a", "hardcoded", "vector."};

*out->x_spaced = str_join(x,' ');   // returns "Here is a hardcoded vector."
*out->x_dashed = str_join(x,'-');   // returns "Here-is-a-hardcoded-vector."
*out->x_longer = str_join(x,"---"); // returns "Here---is---a---hardcoded---vector."

Function headers:

std::string  str_join(const std::vector<std::string>& strings, \
                      const char delimiter);
std::string* str_join(const std::vector<std::string*>* strings, \
                      const char delimiter);
std::string  str_join(const std::vector<std::string>& strings, \
                      const std::string_view delimiter);
std::string* str_join(const std::vector<std::string*>* strings, \
                      const std::string_view delimiter);
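
The joining logic can be sketched in a few lines of standard C++. ‘join_sketch’ is a hypothetical name used for illustration, not the EVL implementation; returning an empty string for an empty vector is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical re-implementation of the value variant, for illustration only.
std::string join_sketch(const std::vector<std::string>& strings,
                        const std::string& delimiter) {
    std::string result;
    for (std::size_t i = 0; i < strings.size(); ++i) {
        if (i != 0) result += delimiter;  // delimiter only between members
        result += strings[i];
    }
    return result;
}
```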

str_mask_left, str_mask_right

(since EVL 2.1)

These functions return the string with its visible characters replaced by the given character from the given direction, while keeping the specified number of characters unchanged.

Example:

str_mask_left("abcd  text efgh", 6)   // returns "abcd  tex* ****"
str_mask_right("1234567890", 3, '-')  // returns "---4567890"

Without the third argument, an asterisk ‘*’ is assumed.

When the first argument is ‘nullptr’, these functions return ‘nullptr’.

Function headers:

std::string  str_mask_left(const std::string& str, \
                 const std::size_t keep, const char ch = '*');
std::string* str_mask_left(const std::string* const str, \
                 const std::size_t keep, const char ch = '*');
std::string  str_mask_right(const std::string& str, \
                 const std::size_t keep, const char ch = '*');
std::string* str_mask_right(const std::string* const str, \
                 const std::size_t keep, const char ch = '*');

str_pad_left, str_pad_right

(since EVL 2.1)

Pads the string from the left/right with the specified character (space by default) up to the given length. The length is counted in bytes, not characters, so be careful with multibyte encodings.

Example:

str_pad_left("123",7,'0')     // returns "0000123"
str_pad_right("text",7)       // returns "text   "
str_pad_right("text",2)       // returns "text"
str_pad_left("Groß",6,'*')    // returns "*Groß" as "ß" has 2 Bytes

When the first argument is ‘nullptr’, these functions return ‘nullptr’.

Function headers:

std::string  str_pad_left(const std::string& str, \
                 const std::size_t length, const char ch = ' ');
std::string* str_pad_left(const std::string* const str, \
                 const std::size_t length, const char ch = ' ');
std::string  str_pad_right(const std::string& str, \
                 const std::size_t length, const char ch = ' ');
std::string* str_pad_right(const std::string* const str, \
                 const std::size_t length, const char ch = ' ');
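
The byte-counting behaviour is easy to see in a standalone sketch. The ‘_sketch’ names are hypothetical and this is not the EVL implementation; it only mirrors the value variants of the headers above.

```cpp
#include <cstddef>
#include <string>

// Prepend 'ch' until the string is 'length' bytes long.
// Strings that are already long enough are returned unchanged.
std::string pad_left_sketch(const std::string& str, std::size_t length,
                            char ch = ' ') {
    if (str.size() >= length) return str;  // size() counts bytes
    return std::string(length - str.size(), ch) + str;
}

// Append 'ch' until the string is 'length' bytes long.
std::string pad_right_sketch(const std::string& str, std::size_t length,
                             char ch = ' ') {
    if (str.size() >= length) return str;
    return str + std::string(length - str.size(), ch);
}
```

Because ‘size()’ reports bytes, a UTF-8 string such as "Groß" (5 bytes, 4 characters) receives one padding byte fewer than its visible width suggests.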

str_replace

(since EVL 1.3)

Examples:

str_replace("Some text", ' ', '-')        // returns "Some-text"
str_replace("Some text", "Some", "Any")   // returns "Any text"
str_replace("Some text", ' ', "SPACE")    // returns "SomeSPACEtext"

When the first argument is ‘nullptr’, it returns ‘nullptr’.

In mapping it might look like this:

out->name = str_replace(in->name, ' ', '-');

Function headers:

std::string  str_replace(const std::string& str, \
                 const char old_ch, const char new_ch);
std::string* str_replace(const std::string* const str, \
                 const char old_ch, const char new_ch);
std::string  str_replace(const std::string& str, \
                 const char* const old_substr, const char* const new_substr);
std::string* str_replace(const std::string* const str, \
                 const char* const old_substr, const char* const new_substr);
std::string  str_replace(const std::string& str, \
                 const std::string& old_substr, const std::string& new_substr);
std::string* str_replace(const std::string* const str, \
                 const std::string& old_substr, const std::string& new_substr);
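
A standalone sketch of substring replacement follows. ‘replace_sketch’ is a hypothetical name, not the EVL implementation. It replaces every occurrence, scanning left to right; that, and returning the input unchanged for an empty search string, are assumptions of this sketch, since the examples above contain only a single match each.

```cpp
#include <cstddef>
#include <string>

// Hypothetical re-implementation of the value variant, for illustration only.
std::string replace_sketch(const std::string& str,
                           const std::string& old_substr,
                           const std::string& new_substr) {
    if (old_substr.empty()) return str;  // assumption: nothing to search for
    std::string result;
    std::size_t start = 0;
    std::size_t pos;
    while ((pos = str.find(old_substr, start)) != std::string::npos) {
        result.append(str, start, pos - start);  // text before the match
        result += new_substr;
        start = pos + old_substr.size();         // continue after the match
    }
    result.append(str, start, std::string::npos);  // remaining tail
    return result;
}
```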

str_to_base64, base64_to_str

(since EVL 2.6)

Encode/decode string to/from Base64 form.

When the first argument is ‘nullptr’, it also returns ‘nullptr’.

Examples:

str_to_base64("Some\r\nbíňářý text.")   // return "U29tZQ0KYsOtxYjDocWZw70gdGV4dC4="
base64_to_str("U29tZQ0KYsOtxYjDocWZw70gdGV4dC4=")   // return "Some\r\nbíňářý text."

Function headers:

std::string  str_to_base64(const std::string& str);
std::string* str_to_base64(const std::string* const str);
std::string  base64_to_str(const std::string& str);
std::string* base64_to_str(const std::string* const str);
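
To make the encoding direction concrete, here is a standalone sketch of standard Base64 encoding with ‘=’ padding, which is the form the example output above uses. ‘to_base64_sketch’ is a hypothetical name, not the EVL implementation, and decoding is left out for brevity.

```cpp
#include <cstddef>
#include <string>

// Hypothetical encoder, for illustration only: standard alphabet, '=' padding.
std::string to_base64_sketch(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t left = in.size() - i;  // bytes available in this group
        // Pack up to three bytes into a 24-bit value, zero-filling the rest.
        unsigned long n = static_cast<unsigned char>(in[i]) << 16;
        if (left > 1) n |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (left > 2) n |= static_cast<unsigned char>(in[i + 2]);
        // Emit four 6-bit symbols; missing input bytes become '='.
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        out += left > 1 ? table[(n >> 6) & 63] : '=';
        out += left > 2 ? table[n & 63] : '=';
    }
    return out;
}
```

The encoder works on bytes, so the result for text with diacritics depends on the encoding of the input (UTF-8 in the example above).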

str_to_hex, hex_to_str

(since EVL 2.0)

Convert string or ustring to its hexadecimal representation and vice versa. (Ustring support has been added in EVL v2.6.)

When the first argument is ‘nullptr’, it also returns ‘nullptr’.

Examples:

str_to_hex("Some text")            // return "536f6d652074657874"
hex_to_str("536f6d652074657874")   // return "Some text"

Function headers:

std::string  str_to_hex(const std::string& str);
std::string* str_to_hex(const std::string* const str);
ustring      str_to_hex(const __detail::u16str& str);
ustring*     str_to_hex(const ustring* const str);
std::string  hex_to_str(const std::string& str);
std::string* hex_to_str(const std::string* const str);
ustring      hex_to_str(const __detail::u16str& str);
ustring*     hex_to_str(const ustring* const str);
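
The ‘std::string’ conversions can be sketched with the standard library alone. The ‘_sketch’ names are hypothetical and this is not the EVL implementation. The sketch covers only ‘std::string’, emits lowercase digits as in the example above, and makes two assumptions about invalid input that this section does not specify: an odd trailing digit is ignored, and a non-hexadecimal digit raises the exception thrown by ‘std::stoi’.

```cpp
#include <cstddef>
#include <string>

// Each byte becomes two lowercase hexadecimal digits.
std::string to_hex_sketch(const std::string& in) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(in.size() * 2);
    for (const char c : in) {
        const unsigned char byte = static_cast<unsigned char>(c);
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0f];  // low nibble
    }
    return out;
}

// Each pair of hexadecimal digits becomes one byte.
std::string from_hex_sketch(const std::string& hex) {
    std::string out;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out += static_cast<char>(std::stoi(hex.substr(i, 2), nullptr, 16));
    }
    return out;
}
```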

substr

(since EVL 2.0)

Returns a substring starting at the given position (counted from 0) with the specified length.

Example:

substr("123456789",0,2)      // returns "12"
substr("123456789",6)        // returns "789"

Without the third argument, it returns the rest of the string.

When the first argument is ‘nullptr’, the function returns ‘nullptr’.

Function headers:

std::string  substr(const std::string& str, const std::size_t pos = 0,
       const std::int64_t count = std::numeric_limits<std::int64_t>::max());
std::string* substr(const std::string* const str, const std::size_t pos = 0,
       const std::int64_t count = std::numeric_limits<std::int64_t>::max());
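
The examples above match the behaviour of ‘std::string::substr’, as the following standalone sketch shows. ‘substr_sketch’ is a hypothetical name, not the EVL implementation. Returning an empty string for a position beyond the end is an assumption of this sketch (‘std::string::substr’ itself would throw), and negative counts are not modelled.

```cpp
#include <cstddef>
#include <string>

// Hypothetical re-implementation of the value variant, for illustration only.
std::string substr_sketch(const std::string& str, std::size_t pos = 0,
                          std::size_t count = std::string::npos) {
    if (pos >= str.size()) return std::string();  // assumption: no exception
    return str.substr(pos, count);  // count is clamped to the end of the string
}
```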

trim, trim_left, trim_right

(since EVL 1.0)

Example:

trim("  text ")              // returns "text"
trim_left("  text ")         // returns "text "
trim_right("--text---", '-') // returns "--text"

These functions trim the character ‘ch’ from both sides, from the left, and from the right, respectively. Without the second argument, a space is assumed.

When the first argument is ‘nullptr’, these functions return ‘nullptr’.

Function headers:

std::string  trim(const std::string& str, const char ch = ' ');
std::string* trim(const std::string* const str, const char ch = ' ');
  
std::string  trim_left(const std::string& str, const char ch = ' ');
std::string* trim_left(const std::string* const str, const char ch = ' ');

std::string  trim_right(const std::string& str, const char ch = ' ');
std::string* trim_right(const std::string* const str, const char ch = ' ');
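
Trimming a single character maps onto ‘find_first_not_of’ and ‘find_last_not_of’. The standalone sketch below uses hypothetical ‘_sketch’ names and is not the EVL implementation; it mirrors only the value variants.

```cpp
#include <cstddef>
#include <string>

// Remove leading occurrences of 'ch'.
std::string trim_left_sketch(const std::string& str, char ch = ' ') {
    const std::size_t first = str.find_first_not_of(ch);
    return first == std::string::npos ? std::string() : str.substr(first);
}

// Remove trailing occurrences of 'ch'.
std::string trim_right_sketch(const std::string& str, char ch = ' ') {
    const std::size_t last = str.find_last_not_of(ch);
    return last == std::string::npos ? std::string() : str.substr(0, last + 1);
}

// Remove occurrences of 'ch' from both ends.
std::string trim_sketch(const std::string& str, char ch = ' ') {
    return trim_left_sketch(trim_right_sketch(str, ch), ch);
}
```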

uppercase, lowercase

(since EVL 1.0)

Examples:

uppercase("AbCd")   // returns "ABCD"
lowercase("AbCd")   // returns "abcd"

When the argument is ‘nullptr’, these functions return ‘nullptr’.

Without the second parameter, these functions act only on ‘A-Z’ and ‘a-z’.

When national letters (with diacritics, for example) also need to be converted, the locale can be specified as the second parameter:

static std::locale de_locale("de_DE.utf8");
*out->field_upcase = uppercase(*in->field, de_locale);

It is possible to specify the locale in the function as a string, but defining the locale once as a static variable is recommended for performance reasons.

Function headers:

std::string  uppercase(const std::string& str);
std::string* uppercase(const std::string* const str);
std::string  uppercase(const std::string& str, const std::locale& locale);
std::string* uppercase(const std::string* const str, const std::locale& locale);

std::string  lowercase(const std::string& str);
std::string* lowercase(const std::string* const str);
std::string  lowercase(const std::string& str, const std::locale& locale);
std::string* lowercase(const std::string* const str, const std::locale& locale);
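
The default, locale-free behaviour (only ‘A-Z’ and ‘a-z’ are affected) can be sketched as follows. The ‘_sketch’ names are hypothetical and this is not the EVL implementation. The sketch assumes an ASCII-compatible encoding such as UTF-8, where all other bytes pass through unchanged.

```cpp
#include <string>

// Convert only the ASCII letters a-z to A-Z.
std::string uppercase_sketch(const std::string& str) {
    std::string result = str;
    for (char& c : result) {
        if (c >= 'a' && c <= 'z') c = static_cast<char>(c - 'a' + 'A');
    }
    return result;
}

// Convert only the ASCII letters A-Z to a-z.
std::string lowercase_sketch(const std::string& str) {
    std::string result = str;
    for (char& c : result) {
        if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
    }
    return result;
}
```

With this ASCII-only rule, "Groß" becomes "GROß": the multibyte letter is left as it is, which is the case the locale parameter is meant to cover.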