EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Table of Contents

EVD Options

Structure of the data is described in an EVD file – an EVL data types definition file – with file extension .evd.

Separator Definition

Field separator is defined by ‘sep="X"’, where ‘X’ can be an empty string or an ascii character below 128 specified as normal string or special character ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, ‘\\’, or in hexa ‘\x??’ (0-7E) (as it is always a single character, ‘\x?’ is also possible).

Default separators can be defined:

EVL_DEFAULT_FIELD_SEPARATOR

defines default field separator, when not set, EVL_DEFAULT_FIELD_SEPARATOR='|' is used,

EVL_DEFAULT_RECORD_SEPARATOR

defines default record separator, i.e. the last field separator, when not set, EVL_DEFAULT_RECORD_SEPARATOR='\n' is used.

When these variables are set, then no ‘sep=’ options are needed in the above EVD example and these defaults are used instead.

Note: It is recommended to use these variables only for project-wide settings in project.sh. Try to avoid to set them in jobs. Better use ‘sep=’ option in evd file.

In case we want to have an empty separator, for example after fixed length field, we can use ‘sep=""’.

Null Option

A null string by ‘null="X"’ or list of strings ‘null=["X","Y",...]’ can be specified. Then such string(s) will be read as ‘null’ values when ‘--text-input’ is used by the component.

When writing the ‘null’ value by the output component with ‘--text-output’ option, such string will be used instead.

When the list of null values is specified, then the first one will be used to write.

To type a special character, like newline or ‘TAB’, standard hexadecimal notation can be used: ‘\x??’, or also special notation for often used special characters: ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, ‘\\’. So then to interpret a tabulator as NULL value use ‘null="\t"’.

Quote Option

When reading csv files, fields might be quoted by some character, usually by double quotes: ‘"’.

Proper parsing of such field is done by specifying attributes ‘quote=’ or ‘optional_quote=’.

Specified string might be any ascii character below 128 specified as normal string or special character ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, ‘\\’, or in hexa ‘\x??’ (0-7E) (as it is always a single character, ‘\x?’ is also possible).

quote="<quote_char>"

Use this attribute when the field is always quoted.

optional_quote="<quote_char>"

Using this attribute, the field doesn’t need to be quoted.

Encoding and Locale

(since EVL 2.5)

enc="<encoding>"

To specify an encoding of given field, string functions then behaves according to that.

locale="<locale>"

To specify a locale of given field, components (like sort) then behaves according to that.

Examples

czech_string_in_utf8  string  enc="utf8" locale="cs_CZ"
en_string_in_utf8     string  enc="utf8" locale="en_GB"

When there is no encoding or locale specified in an EVD, then following environment variables can be used:

EVL_DEFAULT_STRING_ENC=""

defines default encoding, when not set, empty encoding is used,

EVL_DEFAULT_STRING_LOCALE="C"

defines default locale, when not set, generic ‘C’ locale is used.

Max string length

(since EVL 2.5)

Attributes which are used to specify maximal length of given string field. So far used only in case of load/unload tables.

max_bytes="<number>"

To specify maximum Bytes of given string field. Is populated when generated based on table definition, e.g. ‘VARCHAR(100 BYTES)’.

max_chars="<number>"

To specify maximum characters of given string field. Is populated when generated based on table definition, e.g. ‘VARCHAR(100 CHARS)’.

Examples:

string_20_bytes  string  enc="utf8" max_bytes="20"  // VARCHAR(20 BYTES)
string_20_chars  string  enc="utf8" max_chars="20"  // VARCHAR(20 CHARS)

Both attributes are currently used in ‘Writeora’ component to know the maximal length of a string field.

QVD options

(since EVL 2.4)

qvd:format="<format_string>"

To specify a format string for ‘timestamp’, ‘datetime’, and ‘date’ data types when read/write Qlik’s QVD files. Example:

request_dt  timestamp   qvd:format="%d/%m/%Y %H:%M:%S"
some_date   date        qvd:format="%d.%m.%Y"
qvd:interval
qvd:time

To be used as an attribute for ‘timestamp’ and ‘datetime’ data types to get an interval or time data type in Qlik’s QVD files. Example:

request_time1  timestamp  qvd:time
request_time2  timestamp  qvd:interval

Compared to Qlik’s time data type, interval can be larger than 24 hours. For example input timestamp ‘1970-01-02 03:05:30’ would be ‘03:05:30’ as time, but ‘27:05:30’ as interval.

Default Values

It is important to keep in mind, that when no output record is specified in the EVM mapping (see EVM Mappings), then default value is taken, i.e. not ‘nullptr’ is taken!

For string it is empty string, for integers, floats and decimal it is ‘0’ and for date, datetime and timestamp it is ‘1970-01-01’ and for time it is ‘00:00:00’.