EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

EVD and Data Types

‘EVD’ stands for ‘EVL Data Definition’; it is the way to specify the structure of data sets in EVL. It can be used either inline, as a component option, or in an *.evd file.

EVL mostly uses standard C++ data types, so most of them will be familiar.

EVD Structure

First, an example of an EVD file that defines data types for a CSV file:

ID           int                sep=";"
Name         string             sep=";"  null=""
"Birth Date" date("%-d.%-m.%Y") sep=";"  null="1.1.1970"
Amount       decimal(12,3)      sep=";"  thousands_sep=","
"Created At" datetime           sep="\n" null="0000-00-00 00:00:00"

In general, each nonempty line of an EVD file looks like this:

<indent> Field_Name <blank> Data_Type <blank> EVD_Options

where

<indent>

may be empty or consist of 2 spaces, 4 spaces, 6 spaces, etc.; it defines a substructure of compound data types, see Compound Types for details,

Field_Name

is a sequence of printable ASCII characters (codes below 128). When a space is used, the whole field name must be enclosed in double quotes. Special characters (again only ASCII ones below 128) must be escaped, e.g. ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, ‘\\’, or in hexadecimal as ‘\x??’. In mappings, characters other than letters, digits and underscores are replaced by underscores. All these field names are valid:

                            // Name in mapping:
recommended_field_name      // recommended_field_name
"Field with a Space"        // Field_with_a_Space
'field-with-a-hyphen'       // _field_with_a_hyphen_
"$field_with_dollar"        // _field_with_dollar
'single_quoted'field'       // _single_quoted_field_
"with\nnewline"             // with_newline

Data_Type

is one of:

EVD_Options

is a <blank>-separated list of options, see EVD Options,

<blank>

is one or more spaces and/or tabs.
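
The line layout and the field-name rule described above can be illustrated with a short Python sketch. It is not EVL's own parser: the helper names split_evd_line and mapping_name are hypothetical, two spaces per nesting level is assumed from the description of <indent>, and comments are not handled. It only splits a line on blanks outside double quotes and replaces every character other than a letter, digit or underscore with an underscore.

```python
import re

# One token: a run of non-blank characters in which double-quoted parts
# (with backslash escapes) may contain blanks.
TOKEN = re.compile(r'(?:[^\s"]|"(?:\\.|[^"\\])*")+')


def split_evd_line(line):
    """Split one EVD line into (level, field name, data type, options)."""
    body = line.lstrip(" ")
    level = (len(line) - len(body)) // 2  # assumption: two spaces per level
    tokens = TOKEN.findall(body)
    if len(tokens) < 2:
        raise ValueError(f"expected at least a name and a type: {line!r}")
    name, data_type, *options = tokens
    return level, name, data_type, options


def mapping_name(field_name):
    """Approximate the name a field gets in mappings (hypothetical helper)."""
    # Enclosing double quotes only delimit the name; single quotes are
    # ordinary characters, as in the examples above.
    if len(field_name) >= 2 and field_name[0] == field_name[-1] == '"':
        field_name = field_name[1:-1]
    # Decode escapes such as \n or \x41 (field names are ASCII only).
    decoded = field_name.encode("ascii").decode("unicode_escape")
    return re.sub(r"[^A-Za-z0-9_]", "_", decoded)


if __name__ == "__main__":
    line = '"Birth Date" date("%-d.%-m.%Y") sep=";"  null="1.1.1970"'
    level, name, data_type, options = split_evd_line(line)
    print(level, name, data_type, options, mapping_name(name))
```

Applied to the field names listed above, mapping_name gives the same results as the "Name in mapping" column, which is a useful sanity check of the rule, though EVL itself may treat edge cases differently.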


Comments

Standard C-style comments can be used in an EVD file, for example:

street_id    int
street_name  string
street_code  string  null=""  // but NOT NULL in DB
/* COMBAK: street_code will be replaced by street_num later this year
street_num   long
*/
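
To see which lines of such a file remain as field definitions, the comments can be stripped with a few lines of Python. This is only a sketch under assumptions: strip_comments is a hypothetical helper, not part of EVL, and it assumes that comment markers inside double-quoted values (for example sep="//") must be left alone.

```python
import re

# Match a double-quoted string first, so that comment markers inside it
# are never mistaken for the start of a comment.
_STRING_OR_COMMENT = re.compile(
    r'"(?:\\.|[^"\\])*"'   # double-quoted string with escapes
    r'|//[^\n]*'           # line comment up to the end of the line
    r'|/\*.*?\*/',         # block comment, possibly spanning lines
    re.DOTALL,
)


def strip_comments(text):
    """Remove // and /* */ comments, keeping quoted strings untouched."""
    def keep_strings(match):
        token = match.group(0)
        return token if token.startswith('"') else ""
    return _STRING_OR_COMMENT.sub(keep_strings, text)
```

For the example above, only the street_id, street_name and street_code lines survive; the commented-out street_num field is dropped together with its block comment.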

Inline EVD

For most EVL components, an inline EVD can be specified as an option. In that case comments are not allowed, and the format is the same as for EVD in a file, except that commas instead of newlines separate the field definitions.

The structure from the EVD example above, written as a component option (a comma-separated list of fields with data types and options):

--data-definition='id int sep=";",
    name string sep=";" null="",
    birth_date date sep=";" null="1970-01-01",
    amount decimal(12,3) sep=";" thousands_sep=",",
    created_at datetime sep="\n" null="0000-00-00 00:00:00"'
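
Note that commas also occur inside the definition itself, in decimal(12,3) and in thousands_sep=",", so an inline EVD cannot be split on every comma. The following Python sketch shows one way to separate the fields; split_inline_evd is a hypothetical helper, and the assumption that commas inside parentheses or double quotes do not separate fields is inferred from the example, not taken from EVL's implementation.

```python
def split_inline_evd(definition):
    """Split an inline EVD into field definitions.

    Commas inside double quotes or parentheses are kept as part of
    the field definition (assumed behaviour, see the lead-in).
    """
    fields, current = [], []
    depth, in_quotes, escaped = 0, False, False
    for ch in definition:
        if in_quotes:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_quotes = False
        elif ch == '"':
            in_quotes = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            # A top-level comma ends the current field definition.
            fields.append("".join(current).strip())
            current = []
            continue
        current.append(ch)
    tail = "".join(current).strip()
    if tail:
        fields.append(tail)
    return fields
```

Called with the option value shown above, the function returns five field definitions, one per field of the original file-based EVD.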

Variables in EVD

Shell variables used in an EVD file are resolved. For example:

$COMMON_EVD
specific_field1  string  null=""
specific_field2  date    null=""

resolves the $COMMON_EVD environment variable and uses its value as EVD. So, having defined

export COMMON_EVD="$(cat common.evd)"

where common.evd might look like this:

id          int
valid_from  datetime
valid_to    datetime

the EVD finally used would be:

id          int
valid_from  datetime
valid_to    datetime
specific_field1  string  null=""
specific_field2  date    null=""

In some situations, such as when a dollar character is part of a field name, this functionality can be switched off by setting

export EVL_ENVSUBST_EVD=0

in your job or in the shell.
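
The effect of the substitution and of the EVL_ENVSUBST_EVD switch can be imitated in Python. This is a minimal sketch, not EVL's actual mechanism: resolve_evd is a hypothetical helper built on string.Template, and unlike a typical shell-style substitution it leaves undefined variables in place instead of replacing them with an empty string.

```python
import os
from string import Template


def resolve_evd(text, env=None):
    """Expand $VARIABLE references in EVD text from the environment.

    Setting EVL_ENVSUBST_EVD=0 leaves the text untouched, mirroring
    the switch described above.
    """
    env = os.environ if env is None else env
    if env.get("EVL_ENVSUBST_EVD") == "0":
        return text
    # safe_substitute keeps unknown $names as they are (an assumption
    # of this sketch; EVL may behave differently).
    return Template(text).safe_substitute(env)


if __name__ == "__main__":
    common = "id          int\nvalid_from  datetime\nvalid_to    datetime"
    evd = '$COMMON_EVD\nspecific_field1  string  null=""\n'
    print(resolve_evd(evd, {"COMMON_EVD": common}))
```

Passing the environment as an explicit dictionary keeps the sketch easy to try without exporting anything in the shell.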