String
The C++ standard library class ‘std::basic_string’ is used for strings. For details see
http://en.cppreference.com/w/cpp/string/basic_string
string
size: up to 2^64 bytes (i.e. limited only by memory)
An EVD file example:
field_name1  string(10)
field_name2  string(10)  sep=""
field_name3  string      sep=";" null="NULL"
field_name4  string      null="" quote="\""
field_name5  string      null=["","N/A","NA"]
last_field   string
where
- ‘field_name1’
cannot be NULL and has a fixed length of 10 bytes, followed by the value of the $EVL_DEFAULT_FIELD_SEPARATOR environment variable.
- ‘field_name2’
cannot be NULL and has a fixed length of 10 bytes, with no separator.
- ‘field_name3’
is nullable, and the string ‘NULL’ is interpreted as a NULL value. The end of the field is represented by the character ‘;’.
- ‘field_name4’
is nullable, and an empty string is interpreted as a NULL value. The field is quoted by ‘"’, but quotes are not needed for an empty string. The end of the field is represented by $EVL_DEFAULT_FIELD_SEPARATOR.
- ‘field_name5’
is nullable; an empty string, ‘N/A’ and ‘NA’ are all interpreted as NULL when reading, but when writing to a text file NULL is represented by the first of them, i.e. an empty string. The end of the field is represented by $EVL_DEFAULT_FIELD_SEPARATOR.
- ‘last_field’
cannot be NULL, and the end of the field is represented by $EVL_DEFAULT_RECORD_SEPARATOR.
An example of four records that can be parsed by the above EVD file definition:
          |          NULL;"second string field"|NA|last field
0123456789|0123456789first string field;""|N/A|last field
----------|----------;" ; second field | "|third string field|last field
abcdefghij|abcdefghij ;||last field
Neither EVL_DEFAULT_FIELD_SEPARATOR nor EVL_DEFAULT_RECORD_SEPARATOR is set, so the default values are used, i.e. ‘|’ and ‘\n’.
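To make the field rules above concrete, here is a minimal C++17 sketch that splits one such record by hand. It is not how EVL itself parses data: the names ‘Field’, ‘take_until’ and ‘parse_record’ are hypothetical, ‘std::nullopt’ stands in for NULL, the separators are hard-coded to ‘;’ and the default ‘|’, and escaped quotes inside a quoted field are not handled.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical representation: std::nullopt stands for a NULL field.
using Field = std::optional<std::string>;

// Returns the text from `pos` up to `sep` and moves `pos` past the separator.
std::string take_until(const std::string& rec, std::size_t& pos, char sep) {
    const std::size_t end = rec.find(sep, pos);
    if (end == std::string::npos) throw std::runtime_error("separator not found");
    std::string value = rec.substr(pos, end - pos);
    pos = end + 1;
    return value;
}

// Splits one record (without the trailing '\n') laid out as in the EVD example.
std::vector<Field> parse_record(const std::string& rec) {
    if (rec.size() < 21) throw std::runtime_error("record too short");
    std::vector<Field> out;
    std::size_t pos = 0;

    // field_name1: 10 bytes followed by the default field separator '|'.
    out.push_back(rec.substr(pos, 10));
    if (rec[10] != '|') throw std::runtime_error("missing '|' after field_name1");
    pos = 11;

    // field_name2: 10 bytes, no separator.
    out.push_back(rec.substr(pos, 10));
    pos += 10;

    // field_name3: ends at ';', and the text NULL means NULL.
    const std::string f3 = take_until(rec, pos, ';');
    out.push_back(f3 == "NULL" ? Field{} : Field{f3});

    // field_name4: optionally quoted, ends at '|', and an empty string means NULL.
    std::string f4;
    if (pos < rec.size() && rec[pos] == '"') {
        ++pos;                           // skip the opening quote
        f4 = take_until(rec, pos, '"');  // text up to the closing quote
        if (pos >= rec.size() || rec[pos] != '|')
            throw std::runtime_error("missing '|' after field_name4");
        ++pos;                           // skip the field separator
    } else {
        f4 = take_until(rec, pos, '|');
    }
    out.push_back(f4.empty() ? Field{} : Field{f4});

    // field_name5: ends at '|'; "", "N/A" and "NA" all mean NULL.
    const std::string f5 = take_until(rec, pos, '|');
    out.push_back((f5.empty() || f5 == "N/A" || f5 == "NA") ? Field{} : Field{f5});

    // last_field: everything up to the record separator.
    out.push_back(rec.substr(pos));
    return out;
}
```

Calling ‘parse_record’ on the third record above yields an empty (but non-NULL) ‘field_name3’, because only the text ‘NULL’ is declared as its null value, and a ‘field_name4’ that keeps the ‘;’ and ‘|’ characters inside the quotes.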
Manipulation
Standard methods from the ‘basic_string’ library (where ‘a’ and ‘b’ are strings, ‘c’ is a char, and ‘i’ is an unsigned int):
a.empty()
checks whether the string ‘a’ is empty,
a.size(), a.length()
returns the number of characters,
a.clear()
clears the contents of string ‘a’,
a.insert()
inserts characters,
a.erase(position,size)
removes ‘size’ characters from string ‘a’, starting at the zero-based index ‘position’,
a.push_back(c)
appends a character ‘c’ to the end,
a.pop_back()
removes the last character,
a.append(b), +=
appends characters to the end,
operator +
concatenates two strings or a string and a char,
a.replace(position,size,b)
replaces ‘size’ characters of string ‘a’, starting at the zero-based index ‘position’, with string ‘b’,
a.substr(position,size)
returns the substring of ‘a’ that starts at the zero-based index ‘position’ and has length ‘size’,
a.copy(dest,size,position)
copies up to ‘size’ characters of ‘a’, starting at index ‘position’, into the character array ‘dest’,
a.resize(i,c)
changes the number of characters stored: if ‘i’ is less than the current length, the string is truncated; if ‘i’ is greater, the character ‘c’ is appended to fill the length ‘i’,
a.swap(b)
swaps the contents of ‘a’ and ‘b’.
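The position-based methods above are easy to get off by one, so here is a small sketch of how ‘erase’, ‘replace’ and ‘resize’ behave. The wrapper names (‘erase_part’, ‘replace_part’, ‘resize_to’) are hypothetical and exist only for illustration; the calls inside them are the standard ‘std::string’ methods, which throw ‘std::out_of_range’ when ‘position’ is greater than the string length.

```cpp
#include <cstddef>
#include <string>

// erase(position, size): drops `size` characters starting at index `position`.
std::string erase_part(std::string a, std::size_t position, std::size_t size) {
    a.erase(position, size);
    return a;
}

// replace(position, size, b): overwrites that same range with `b`.
std::string replace_part(std::string a, std::size_t position, std::size_t size,
                         const std::string& b) {
    a.replace(position, size, b);
    return a;
}

// resize(i, c): truncates when `i` is below the current length, otherwise pads with `c`.
std::string resize_to(std::string a, std::size_t i, char c) {
    a.resize(i, c);
    return a;
}
```

With the string ‘0123456789’, ‘erase_part(s, 2, 3)’ gives ‘0156789’, ‘replace_part(s, 2, 3, "ab")’ gives ‘01ab56789’, and ‘s.substr(2, 3)’ gives ‘234’: in each case the range covers the characters at indexes 2, 3 and 4.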
Search
find()
finds the first occurrence of a substring,
rfind()
finds the last occurrence of a substring,
find_first_of()
finds the first occurrence of any of the given characters,
find_first_not_of()
finds the first character that is not one of the given characters,
find_last_of()
finds the last occurrence of any of the given characters,
find_last_not_of()
finds the last character that is not one of the given characters.
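All of the search methods return a zero-based index, or ‘std::string::npos’ when nothing is found, so the result should be checked before it is used. A brief sketch follows; the helper names ‘after_last_slash’ and ‘strip_leading’ are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns the text after the last '/' in `path`, or the whole string if there is none.
std::string after_last_slash(const std::string& path) {
    const std::size_t pos = path.rfind('/');
    if (pos == std::string::npos) return path;
    return path.substr(pos + 1);
}

// Drops leading characters that belong to `chars`, using find_first_not_of.
std::string strip_leading(const std::string& a, const std::string& chars) {
    const std::size_t pos = a.find_first_not_of(chars);
    if (pos == std::string::npos) return "";  // every character was in `chars`
    return a.substr(pos);
}
```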
Comparison
Operators ==, !=, <, >, <=, >=
lexicographically compare two strings,
compare()
compares two strings and returns a negative value, zero, or a positive value.
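As a short illustration of the comparison methods: only the sign of the value returned by ‘compare()’ is specified, so the sketch below reduces it to -1, 0 or 1. The helper name ‘sign_of_compare’ is hypothetical. Comparison is by character code, which means that in ASCII every uppercase letter sorts before every lowercase one.

```cpp
#include <string>

// Maps the result of compare() to -1, 0 or 1.
int sign_of_compare(const std::string& a, const std::string& b) {
    const int r = a.compare(b);
    return (r > 0) - (r < 0);
}
```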
Numeric conversions
EVL specific string functions
The advantage of using EVL specific functions is that they handle NULLs, i.e. when the string is NULL, the output is also NULL. When native C++ functions are used, NULLs need to be handled conditionally.
The list of such functions:
hex_to_str(str), str_to_hex(str)
to convert ordinary string to its hexadecimal representation and vice versa,
length(str)
returns the length of given string,
md5sum(str)
returns MD5 checksum,
sha256sum(str), ...
SHA checksum functions,
split(str,char)
to split a string into a vector,
starts_with(str,substr), ends_with(str,substr)
to check if a string starts or ends with a given character or string ‘substr’,
str_compress(str,method), str_uncompress(str,method)
to compress or uncompress a string by the given method: snappy or gzip,
str_count(str,substr)
returns the number of ‘substr’ occurrences,
str_index(str,substr), str_rindex(str,substr)
returns the position of ‘substr’ from left or right,
str_mask_left(str,len,char), str_mask_right(str,len,char)
to replace by ‘char’ the specified number of characters from left/right,
str_pad_left(str,len,char), str_pad_right(str,len,char)
to add from left/right the specified character, up to the given length,
str_replace(str,strA,strB)
to replace a string or character ‘strA’ by ‘strB’,
substr(str,pos,len)
returns a substring of the specified length ‘len’, starting after position ‘pos’,
trim(str), trim_left(str), trim_right(str)
to trim a string by specified character,
uppercase(str), lowercase(str)
to convert a string to uppercase or lowercase, ...
where ‘str’ is the string or a pointer to the string.
See String Functions for details.
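To illustrate what handling NULLs conditionally means with native C++ methods, here is a minimal C++17 sketch. It assumes that a NULL string is represented by a null pointer, which is an assumption made for this example only. The helper ‘length_or_null’ is hypothetical and is not an EVL function.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper, not part of EVL: a null pointer stands for a NULL string.
std::optional<std::size_t> length_or_null(const std::string* str) {
    if (str == nullptr) return std::nullopt;  // NULL in, NULL out
    return str->size();                       // native method, safe to call here
}
```

Every native call on a nullable value needs a check of this kind, which is the step the EVL specific functions make unnecessary.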