EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

String

The Standard C++ library class ‘std::basic_string’ is used for strings. For details see

http://en.cppreference.com/w/cpp/string/basic_string

string

size: up to 2^64 bytes (i.e. limited only by memory)


Example of an EVD file:

field_name1  string(10)
field_name2  string(10) sep=""
field_name3  string     sep=";" null="NULL"
field_name4  string             null=""              quote="\""
field_name5  string             null=["","N/A","NA"]
last_field   string

where

field_name1

cannot be NULL and has a fixed length of 10 bytes, followed by the value of the $EVL_DEFAULT_FIELD_SEPARATOR environment variable.

field_name2

cannot be NULL and has a fixed length of 10 bytes, with no separator.

field_name3

is nullable and the string ‘NULL’ is interpreted as a NULL value. The end of the field is represented by the character ‘;’.

field_name4

is nullable and an empty string is interpreted as a NULL value. The field is quoted by ‘"’, but for an empty string, quotes are not needed. The end of the field is represented by $EVL_DEFAULT_FIELD_SEPARATOR.

field_name5

is nullable; an empty string, ‘N/A’ and ‘NA’ are all interpreted as a NULL value when reading, but when writing into a text file, NULL is represented by the first one, i.e. an empty string. The end of the field is represented by $EVL_DEFAULT_FIELD_SEPARATOR.

last_field

cannot be NULL and the end of the field is represented by $EVL_DEFAULT_RECORD_SEPARATOR.

Example of four records that can be parsed by the above EVD file definition:

          |          NULL;"second string field"|NA|last field
0123456789|0123456789first string field;""|N/A|last field
----------|----------;"  ;  second field  |  "|third string field|last field
abcdefghij|abcdefghij       ;||last field

Neither EVL_DEFAULT_FIELD_SEPARATOR nor EVL_DEFAULT_RECORD_SEPARATOR is set, so default values are used, i.e. ‘|’ and ‘\n’.
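To make the mapping between the EVD definition and the bytes of a record more concrete, the following is a minimal C++ sketch that splits one record into the six fields described above. It is not EVL's parser: the names ‘Field’, ‘Cursor’, ‘nullable’ and ‘parse_record’ are hypothetical, the separators are hard-coded to the defaults ‘|’ and ‘\n’, and treating a quoted empty string as NULL for field_name4 is an assumption of this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical types for illustration only; this is not EVL's parser.
struct Field {
    bool is_null;
    std::string value;
};

class Cursor {
public:
    explicit Cursor(const std::string& rec) : rec_(rec), pos_(0) {}

    // Reads at most n bytes (fixed-length field).
    std::string fixed(std::size_t n) {
        std::string out = rec_.substr(pos_, n);
        pos_ += out.size();
        return out;
    }
    // Skips one separator character if it is next in the input.
    void skip(char sep) {
        if (pos_ < rec_.size() && rec_[pos_] == sep) ++pos_;
    }
    // Reads up to the next sep (or end of record) and consumes the separator.
    std::string until(char sep) {
        std::size_t end = rec_.find(sep, pos_);
        if (end == std::string::npos) end = rec_.size();
        std::string out = rec_.substr(pos_, end - pos_);
        pos_ = end;
        skip(sep);
        return out;
    }
    // Reads an optionally quoted field; separators inside quotes are data.
    std::string quoted(char quote, char sep) {
        if (pos_ >= rec_.size() || rec_[pos_] != quote) return until(sep);
        std::size_t end = rec_.find(quote, pos_ + 1);
        if (end == std::string::npos) end = rec_.size();
        std::string out = rec_.substr(pos_ + 1, end - pos_ - 1);
        pos_ = (end < rec_.size()) ? end + 1 : end;
        skip(sep);
        return out;
    }

private:
    const std::string& rec_;
    std::size_t pos_;
};

// Marks the field as NULL when its text equals one of the null strings.
Field nullable(const std::string& v, const std::vector<std::string>& nulls) {
    for (const std::string& n : nulls)
        if (v == n) return Field{true, ""};
    return Field{false, v};
}

std::vector<Field> parse_record(const std::string& rec) {
    Cursor c(rec);
    std::vector<Field> f;
    f.push_back(Field{false, c.fixed(10)});                  // field_name1
    c.skip('|');                                             // default field separator
    f.push_back(Field{false, c.fixed(10)});                  // field_name2, sep=""
    f.push_back(nullable(c.until(';'), {"NULL"}));           // field_name3
    f.push_back(nullable(c.quoted('"', '|'), {""}));         // field_name4 (assumption: "" after unquoting is NULL)
    f.push_back(nullable(c.until('|'), {"", "N/A", "NA"}));  // field_name5
    f.push_back(Field{false, c.until('\n')});                // last_field
    return f;
}
```

Calling ‘parse_record’ on the first sample record would yield a NULL for field_name3 and field_name5, while the third record shows that ‘;’ and ‘|’ inside the quotes of field_name4 are kept as data.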

Manipulation

Standard methods from the library ‘basic_string’ (where ‘a’, ‘b’ are strings, ‘c’ is a char, ‘i’ is an (unsigned) int):

a.empty()

checks whether the string ‘a’ is empty,

a.size(), a.length()

returns the number of characters,

a.clear()

clears the contents of string ‘a’,

a.insert()

inserts characters,

a.erase(position,size)

removes ‘size’ characters from string ‘a’, starting at the zero-based index ‘position’,

a.push_back(c)

appends a character ‘c’ to the end,

a.pop_back()

removes the last character,

a.append(b), +=

appends characters to the end,

operator +

concatenates two strings or a string and a char,

a.replace(position,size,b)

replaces ‘size’ characters of string ‘a’, starting at the zero-based index ‘position’, by string ‘b’,

a.substr(position,size)

returns the substring of ‘a’ that starts at the zero-based index ‘position’ and has length ‘size’,

a.copy(dest,size,position)

copies up to ‘size’ characters of ‘a’, starting at ‘position’, into the character array ‘dest’,

a.resize(i,c)

changes the number of characters stored: if ‘i’ is smaller than the current length, the string is simply truncated; if ‘i’ is larger, the character ‘c’ is appended until the length is ‘i’,

a.swap(b)

swaps the contents of ‘a’ and ‘b’.
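As an illustration of the methods listed above, here is a short sketch that chains several of them on one string. The function name ‘manipulate_demo’ is hypothetical and only serves the example; all positions are zero-based indexes, as in the Standard C++ library.

```cpp
#include <string>

// Illustrative helper (hypothetical name): applies several basic_string
// methods in sequence and returns the result.
std::string manipulate_demo() {
    std::string a = "0123456789";
    a.erase(2, 3);          // removes "234"            -> "0156789"
    a.replace(0, 2, "ab");  // replaces "01" by "ab"    -> "ab56789"
    a.insert(2, "--");      // inserts "--" at index 2  -> "ab--56789"
    a.push_back('!');       // appends one character    -> "ab--56789!"
    a.pop_back();           // removes it again         -> "ab--56789"
    a.resize(4);            // truncates to 4 chars     -> "ab--"
    a.resize(6, '*');       // pads with '*' to 6 chars -> "ab--**"
    a += a.substr(0, 2);    // appends a copy of "ab"   -> "ab--**ab"
    return a;
}
```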

Comparison

Operators ==, !=, <, >, <=, >=

lexicographically compare two strings,

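a.compare(b)

compares two strings and returns a negative value, zero, or a positive value if ‘a’ is less than, equal to, or greater than ‘b’.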
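The difference between the comparison operators and ‘compare()’ can be sketched as follows. The helper ‘sign_of_compare’ is a hypothetical name used only here; it reduces the result of ‘compare()’ to -1, 0 or 1, because the standard guarantees only the sign of the returned value, not its magnitude.

```cpp
#include <string>

// Hypothetical helper: normalizes the result of compare() to -1, 0 or 1.
int sign_of_compare(const std::string& a, const std::string& b) {
    int r = a.compare(b);
    return (r > 0) - (r < 0);
}

// The operators give the same ordering as compare(), as a bool.
bool is_before(const std::string& a, const std::string& b) {
    return a < b;  // equivalent to a.compare(b) < 0
}
```

Note that the comparison is done character by character on the character codes, so with the usual ASCII encoding an uppercase letter sorts before any lowercase letter.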

Numeric conversions

stoi()

converts a string to an integer,

stol()

converts a string to a long,

stoul()

converts a string to an unsigned long,

stof()

converts a string to a float,

stod()

converts a string to a double,

to_string()

converts an integral or floating point value to string.
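The conversion functions above throw an exception when the input cannot be converted, which is worth keeping in mind when processing untrusted data. The following sketch shows one possible way to guard ‘std::stoi’; the helper name ‘parse_int_or’ and the fallback policy are assumptions of this example, not part of EVL.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: parses an int, returning `fallback` on any failure.
int parse_int_or(const std::string& s, int fallback) {
    try {
        std::size_t used = 0;
        int v = std::stoi(s, &used);  // `used` receives the number of characters consumed
        return used == s.size() ? v : fallback;  // reject trailing junk such as "12abc"
    } catch (const std::invalid_argument&) {
        return fallback;  // no digits at all, e.g. "abc"
    } catch (const std::out_of_range&) {
        return fallback;  // value does not fit into an int
    }
}
```

For example, ‘parse_int_or("42", 0)’ returns 42, while ‘parse_int_or("abc", -1)’ and ‘parse_int_or("12abc", -1)’ both return -1.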

EVL specific string functions

The advantage of the EVL specific functions is that they handle NULLs: when the input string is NULL, the output is also NULL. With native C++ functions, NULLs must be handled conditionally.

The list of such functions:

hex_to_str(str), str_to_hex(str)

to convert a hexadecimal representation to an ordinary string and vice versa,

length(str)

returns the length of given string,

md5sum(str)

returns MD5 checksum,

sha256sum(str), ...

SHA checksum functions,

split(str,char)

to split a string into a vector,

starts_with(str,substr), ends_with(str,substr)

to check if a string starts or ends with a given character or string ‘substr’,

str_compress(str,method), str_uncompress(str,method)

to compress or uncompress a string by the given method: snappy or gzip,

str_count(str,substr)

returns the number of ‘substr’ occurrences,

str_index(str,substr), str_rindex(str,substr)

returns the position of ‘substr’ from left or right,

str_mask_left(str,len,char), str_mask_right(str,len,char)

to replace by ‘char’ the specified number of characters from left/right,

str_pad_left(str,len,char), str_pad_right(str,len,char)

to add from left/right the specified character, up to the given length,

str_replace(str,strA,strB)

to replace a string or character ‘strA’ by ‘strB’,

substr(str,pos,len)

returns a substring starting after position ‘pos’ of the specified length ‘len’,

trim(str), trim_left(str), trim_right(str)

to trim a string by specified character,

uppercase(str), lowercase(str)

to convert a string to uppercase or lowercase, ...

where ‘str’ is a string or a pointer to a string.

See String Functions for details.
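To illustrate what handling NULLs conditionally means when using native C++ functions, the following sketch wraps an uppercase conversion so that a NULL input produces a NULL output. It only mimics the behaviour described above and is not the EVL implementation: the name ‘uppercase_or_null’ is hypothetical, and a null pointer is assumed to stand for a NULL value.

```cpp
#include <cctype>
#include <memory>
#include <string>

// Hypothetical NULL-propagating wrapper; not the EVL implementation.
// A null pointer stands in for a NULL field value.
std::unique_ptr<std::string> uppercase_or_null(const std::string* str) {
    if (str == nullptr) return nullptr;  // NULL in, NULL out
    std::unique_ptr<std::string> out(new std::string(*str));
    for (char& ch : *out)
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    return out;
}
```

Without such a wrapper, every call site would need its own check for NULL before calling a native function on the string.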