EVL

Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2020 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

EVD and Data Types

EVD stands for EVL Data definition. It specifies the structure of data sets in EVL and can be used either inline, as a component option, or in an *.evd file.

Base Data Types

EVL uses mostly standard C++ data types.

Data type     Bytes  Min                     Max                     Library
char          1      −128                    127                     native C++
uchar         1      0                       255                     native C++
short         2      −32,768                 32,767                  native C++
ushort        2      0                       65,535                  native C++
int           4      −2,147,483,648          2,147,483,647           native C++
uint          4      0                       4,294,967,295           native C++
long          8      −2^63 (ca. −9×10^18)    2^63−1 (ca. 9×10^18)    native C++
ulong         8      0                       2^64−1 (ca. 18×10^18)   native C++
decimal(m,n)  16                                                     EVL
float         4      ±3.4×10^±38 (7 digits)                          native C++
double        8      ±1.7×10^±308 (15 digits)                        native C++
string                                                               std::basic_string
date          8      1970-01-01 ± ca. 6×10^11 years                  std::time_t
timestamp     8      1970-01-01 ± ca. 6×10^11 years                  std::time_t
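
The integral ranges in the table follow from the byte counts, assuming two's-complement signed types. The sketch below recomputes them. It is written in Python for illustration only; ‘int_range’ and ‘INTEGRAL_TYPES’ are hypothetical names and not part of EVL.

```python
def int_range(n_bytes, signed):
    """Return (min, max) of an integer stored in n_bytes bytes."""
    bits = 8 * n_bytes
    if signed:
        # Two's complement: one bit is used for the sign.
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


# Byte counts and signedness as listed in the table above.
INTEGRAL_TYPES = {
    "char": (1, True), "uchar": (1, False),
    "short": (2, True), "ushort": (2, False),
    "int": (4, True), "uint": (4, False),
    "long": (8, True), "ulong": (8, False),
}

for name, (n_bytes, signed) in INTEGRAL_TYPES.items():
    lo, hi = int_range(n_bytes, signed)
    print(f"{name:<7}{n_bytes}  {lo:,} .. {hi:,}")
```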

EVD contains the definition of fields (columns) of the data. (Sometimes called DDL or schema.)

Example of an *.evd file that defines the fields of a semicolon-delimited CSV file:

int_field        int            sep=";"  null="\N"
long_field       long           sep=";"  null=[0,"---"]
float_field      float          sep=";"  null="0000"
double_field     double         sep=";"  null="NULL"
date_field1      date           sep=";"  null="1973-01-01"
date_field2      date("%Y%m%d") sep=";"
timestamp_field1 timestamp("%Y%m%d%H%M%S")
timestamp_field2 timestamp      sep=";"  null="0000-00-00 00:00:00"
decimal_field    decimal(8,3)   sep=";"  decimal_sep="." thousands_sep=","
string_field     string         sep="\n" 

All possible primitive data types:

  • String types: ‘char’, ‘string’.
  • Integral types: ‘char’, ‘uchar’, ‘short’, ‘ushort’, ‘int’, ‘uint’, ‘long’, ‘ulong’.
  • Decimal type: ‘decimal’.
  • Float types: ‘float’, ‘double’.
  • Date/time types: ‘date’, ‘timestamp’.

Default Values

Keep in mind that when no output record is specified in the EVM mapping (see EVM Mappings), the default value of the data type is used, not ‘nullptr’.

The default is an empty string for string, ‘0’ for integers, floats and decimal, and ‘1970-01-01’ for date and timestamp.
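
The defaults above can be summarized as a lookup table. The sketch below is Python for illustration only; ‘DEFAULTS’ and ‘default_for’ are hypothetical names, not EVL API. It also shows that ‘1970-01-01’ is the day denoted by a zero ‘std::time_t’, which is presumably why it serves as the default (this link is an assumption, the text does not state it).

```python
from datetime import datetime, timezone

# Default per EVD type, as described in the text above.
# 'char' is left out because it is listed as both a string and an integral type.
DEFAULTS = {
    "string": "",
    "uchar": 0, "short": 0, "ushort": 0,
    "int": 0, "uint": 0, "long": 0, "ulong": 0,
    "float": 0.0, "double": 0.0,
    "decimal": 0,
    "date": "1970-01-01",
    "timestamp": "1970-01-01",
}


def default_for(evd_type):
    """Return the default for a type, ignoring arguments such as decimal(8,3)."""
    base = evd_type.split("(", 1)[0].strip()
    return DEFAULTS[base]


# Unix time 0 falls on 1970-01-01 (UTC).
epoch_day = datetime.fromtimestamp(0, tz=timezone.utc).strftime("%Y-%m-%d")
print(epoch_day)
```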

Separator Definition

The field separator is defined by ‘sep="X"’, where ‘X’ is either an empty string or a single ASCII character (code below 128). The character can be written literally, as one of the special characters ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, or in hexadecimal as ‘\x??’ (0-7E). As it is always a single character, ‘\x?’ is also possible.
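
The accepted forms can be sketched as a small decoder that turns the text between the quotes into the separator character. This is Python for illustration only, not EVL's implementation; ‘parse_sep’ and ‘ESCAPES’ are hypothetical names, and treating a lone backslash as a literal character is an assumption.

```python
import string

# Escapes listed in the text above, mapped to the character they denote.
ESCAPES = {
    "n": "\n", "r": "\r", "t": "\t", "v": "\v",
    "b": "\b", "f": "\f", "a": "\a", '"': '"',
}


def parse_sep(spec):
    """Decode the text between the quotes of sep="..." into a separator.

    Returns "" for an empty separator, otherwise a single character.
    """
    if spec == "":
        return ""
    digits = spec[2:]
    if len(spec) == 1:
        ch = spec                                   # literal character
    elif spec[0] == "\\" and spec[1:] in ESCAPES:
        ch = ESCAPES[spec[1:]]                      # special character
    elif (spec[:2] == "\\x" and 1 <= len(digits) <= 2
          and all(d in string.hexdigits for d in digits)):
        ch = chr(int(digits, 16))                   # \x? or \x??
    else:
        raise ValueError(f"not a single-character separator: {spec!r}")
    if ord(ch) > 0x7E:                              # the text gives 0-7E as the range
        raise ValueError(f"separator out of range: {spec!r}")
    return ch
```

With this sketch, ‘parse_sep(r"\x3B")’ and ‘parse_sep(";")’ both yield the semicolon.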

Default separators can be defined:

EVL_DEFAULT_FIELD_SEPARATOR

defines default field separator, and

EVL_DEFAULT_RECORD_SEPARATOR

defines default record separator, i.e. the last field separator.

These characters are set by default:

export EVL_DEFAULT_FIELD_SEPARATOR="|"
export EVL_DEFAULT_RECORD_SEPARATOR=$'\n'

but can be changed in project.sh for example.

When these variables are set, then no ‘sep=’ options are needed in the above EVD example and these defaults are used instead.

Note: It is recommended to use these variables only for project-wide settings in project.sh. Avoid setting them in jobs.
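
How the defaults combine with per-field ‘sep=’ options can be sketched as follows. This is Python for illustration only; ‘effective_separators’ is a hypothetical helper, and the rule that an explicit ‘sep=’ takes precedence over the default is an assumption. The fallbacks ‘|’ and newline mirror the default values given above.

```python
import os


def effective_separators(fields, env=os.environ):
    """Return (name, separator) pairs for all fields of a record.

    fields: list of (name, sep) pairs in record order; sep is None when the
    EVD line has no sep= option.
    """
    field_default = env.get("EVL_DEFAULT_FIELD_SEPARATOR", "|")
    record_default = env.get("EVL_DEFAULT_RECORD_SEPARATOR", "\n")
    result = []
    for i, (name, sep) in enumerate(fields):
        if sep is None:
            # The record separator is the separator of the last field.
            is_last = i == len(fields) - 1
            sep = record_default if is_last else field_default
        result.append((name, sep))
    return result


print(effective_separators([("id", None), ("name", ";"), ("note", None)], env={}))
```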

To have an empty separator, for example after a fixed-length field, use ‘sep=""’.

Null Option

A null string can be specified by ‘null="X"’, or a list of null strings by ‘null=["X","Y",...]’. Such strings are read as ‘null’ values when the component uses ‘--text-input’.

When an output component with the ‘--text-output’ option writes a ‘null’ value, the null string is written in its place.

When a list of null strings is specified, the first one is used for writing.
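
The read and write rules can be sketched like this. It is Python for illustration only; ‘read_null’ and ‘write_null’ are hypothetical names, field values are kept as text, and the case of a field without any ‘null=’ option is not covered.

```python
def read_null(text, nulls):
    """Return None when the input text is one of the field's null strings."""
    return None if text in nulls else text


def write_null(value, nulls):
    """Write None as the first null string; pass other values through."""
    if value is None:
        return nulls[0]
    return value


nulls = ["NULL", "---"]            # as if the field had null=["NULL","---"]
for text in ("NULL", "---", "42"):
    value = read_null(text, nulls)
    print(repr(text), "->", repr(value), "->", repr(write_null(value, nulls)))
```

Note that a value read from ‘---’ is written back as ‘NULL’, because only the first string of the list is used for writing.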

Quote Option

When reading CSV files, fields might be quoted by some character, usually by ‘"’.

Such fields are parsed properly when the attribute ‘quote=’ or ‘optional_quote=’ is specified.

The quote can be any single ASCII character (code below 128), written literally, as one of the special characters ‘\n’, ‘\r’, ‘\t’, ‘\v’, ‘\b’, ‘\f’, ‘\a’, ‘\"’, or in hexadecimal as ‘\x??’ (0-7E). As it is always a single character, ‘\x?’ is also possible.

quote=

Use this attribute when a field is always quoted.

optional_quote=

Use this attribute when a field may, but need not, be quoted.
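
The difference between the two attributes can be sketched as below. This is Python for illustration only and not EVL's parser; ‘read_quoted_field’ is a hypothetical helper, and it assumes that the quote character never occurs inside a quoted value (how EVL handles embedded quotes is not described here).

```python
def read_quoted_field(line, pos, sep, quote, optional=False):
    """Read one field from line starting at pos.

    Returns (value, next_pos), where next_pos is just past the separator.
    optional=False models quote= (the field must be quoted),
    optional=True models optional_quote= (the field may be quoted).
    """
    if line.startswith(quote, pos):
        end = line.index(quote, pos + 1)   # closing quote; ValueError if missing
        if not line.startswith(sep, end + 1):
            raise ValueError("separator expected after closing quote")
        return line[pos + 1:end], end + 1 + len(sep)
    if not optional:
        raise ValueError("field is not quoted")
    end = line.index(sep, pos)             # unquoted: read up to the separator
    return line[pos:end], end + len(sep)


line = '"Smith; John";42\n'
name, pos = read_quoted_field(line, 0, ";", '"')
age, pos = read_quoted_field(line, pos, "\n", '"', optional=True)
print(name, age)
```

The separator inside the quoted first field is kept as part of the value, which is the point of quoting.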

Compound Data Types

vector

Members can be of any primitive data type.

struct

Members can be of any primitive data type, a vector, or another structure.

Example of an *.evd file that defines ‘vector’ and ‘struct’ data types:

int_field         int          sep="|"
struct_field      struct       sep="|"
  double_field    double       sep=";"
  date_field1     date         sep=";"  null="1973-01-01"
vector_field      vector       sep="\n"
  timestamp_field timestamp    sep=","  null="0000-00-00 00:00:00"

In a mapping you can then manipulate the whole structure or vector with ‘in->struct_field’ or ‘in->vector_field’. A particular element of a ‘struct’ is reached by, for example, ‘in->struct_field->double_field’.

Elements of a ‘vector’ or ‘struct’ are distinguished by indentation, in YAML style.
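
How indentation determines nesting can be sketched with a small reader that turns the example above into a tree. It is Python for illustration only, not EVL's parser; ‘parse_evd’ is a hypothetical helper that looks only at the field name, the type and the leading spaces, and ignores all options.

```python
EVD = r"""
int_field         int          sep="|"
struct_field      struct       sep="|"
  double_field    double       sep=";"
  date_field1     date         sep=";"  null="1973-01-01"
vector_field      vector       sep="\n"
  timestamp_field timestamp    sep=","  null="0000-00-00 00:00:00"
"""


def parse_evd(text):
    """Parse EVD lines into a list of top-level fields with nested children."""
    root = {"indent": -1, "children": []}
    stack = [root]                     # chain of currently open parents
    for line in text.splitlines():
        if not line.strip():
            continue                   # skip blank lines
        indent = len(line) - len(line.lstrip(" "))
        name, dtype = line.split()[:2]
        node = {"name": name, "type": dtype, "indent": indent, "children": []}
        # Close every open field that is not less indented than this line.
        while stack[-1]["indent"] >= indent:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


for field in parse_evd(EVD):
    members = [child["name"] for child in field["children"]]
    print(field["name"], field["type"], members)
```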

‘struct’ and ‘vector’ are especially useful for reading and writing JSON and XML files.