EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Write

(since EVL 1.0)

Write <f_in> into a file or table specified by

[uri://][[user@]host[:port]]/path/basename[.format][.compression]

Besides the options mentioned below, which change the file suffix behaviour, one can use the generic ‘--cmd=<cmd>’ option, which appends something like ‘| <cmd> > <path>’ at the end. <cmd> can also be a pipeline. See the examples below for inspiration.
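
The ‘| <cmd> > <path>’ shape can be tried out with plain shell tools, without EVL. The following is only a sketch: the function `records`, the temporary target path and the sample <cmd> are hypothetical, and running <cmd> through `bash -c` is an assumption about how such a command could be invoked, not a description of what EVL does internally.

```shell
#!/usr/bin/env bash
# Stand-in for the text records that would arrive at the end of the pipeline.
records() { printf '1;alpha\n2;beta\n'; }

target=$(mktemp)              # hypothetical <path>
cmd='tr a-z A-Z | gzip -c'    # hypothetical <cmd>; it is itself a pipeline

# Same shape as "| <cmd> > <path>": pipe the records into <cmd>
# and redirect its output to <path>.
records | bash -c "$cmd" > "$target"

# Read the file back to check what was written, then clean up.
result=$(gzip -dc "$target")
printf '%s\n' "$result"
rm -f "$target"
```

Because <cmd> is a pipeline here, upper-casing and compression happen in one pass before anything reaches the target file.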

Write

is to be used in an EVS job structure definition file. <f_in> is either an input file or a flow name.

evl write

is intended for standalone usage, i.e. to be invoked from the command line, reading its input from standard input.

EVD is the EVL data definition file; for details see evl-evd(5).

URI Scheme for file

Based on the URI scheme ‘uri://’, the component calls the appropriate utility to write the file to its destination.

no scheme, ‘file://’

assumes the local filesystem

gdrive://

calls ‘gdrive’ utility

gs://

calls ‘gsutil’ utility

hdfs://

calls ‘hadoop fs’ utility

s3://

calls ‘aws s3’ utility

sftp://

calls ‘ssh’ utility

smb://

calls ‘smbclient’ utility
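
The scheme list above amounts to a lookup from URI prefix to utility. A minimal sketch of that lookup follows; the function name `scheme_utility` is hypothetical and the code only restates the list, it is not taken from EVL itself. How EVL reacts to an unlisted scheme is not documented here, so returning an error for it is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the utility the list above names for a target.
scheme_utility() {
    case "$1" in
        file://*)   echo "local filesystem" ;;
        gdrive://*) echo "gdrive" ;;
        gs://*)     echo "gsutil" ;;
        hdfs://*)   echo "hadoop fs" ;;
        s3://*)     echo "aws s3" ;;
        sftp://*)   echo "ssh" ;;
        smb://*)    echo "smbclient" ;;
        *://*)      echo "unknown scheme" >&2; return 1 ;;  # assumption
        *)          echo "local filesystem" ;;              # no scheme
    esac
}

scheme_utility "s3://mybucket/file.json"
scheme_utility "/home/myself/file.csv"
```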

URI Scheme for table

Based on the URI scheme ‘uri://’, the component switches to the appropriate EVL component.

postgres://

calls the Writepg component to write to a PostgreSQL table
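
To make the target pattern ‘[uri://][[user@]host[:port]]/path’ concrete, the sketch below splits a target string into its parts using only bash parameter expansion. The function name `parse_target` and its output format are hypothetical; this illustrates the pattern and is not how EVL parses the argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split [uri://][[user@]host[:port]]/path into parts.
parse_target() {
    local target="$1" scheme="" rest="" authority="" user="" host="" port="" path=""
    case "$target" in
        *://*)
            scheme=${target%%://*}     # text before the first "://"
            rest=${target#*://}        # everything after "scheme://"
            authority=${rest%%/*}      # up to the first "/" (may be empty)
            case "$rest" in
                */*) path=/${rest#*/} ;;   # remainder, leading "/" restored
            esac
            ;;
        *)
            path=$target               # no scheme: plain local path
            ;;
    esac
    case "$authority" in
        *@*) user=${authority%%@*}; authority=${authority#*@} ;;
    esac
    case "$authority" in
        *:*) host=${authority%%:*}; port=${authority##*:} ;;
        *)   host=$authority ;;
    esac
    printf 'scheme=%s user=%s host=%s port=%s path=%s\n' \
        "$scheme" "$user" "$host" "$port" "$path"
}

parse_target "postgres://tech_user@10.11.12.13:5432/my_database/my_table"
parse_target "hdfs:///some/path/file.parquet"
```

For a table target the path part has the form ‘/database/table’, while for a file target it is ‘/path/basename[.format][.compression]’.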

Compression:

Compression file suffix behaviour (applied in the following order):

‘*.bz2’, ‘*.BZ2’

calls ‘bzip2 -c’

‘*.gz’, ‘*.GZ’

calls ‘gzip -c’

‘*.zip’, ‘*.ZIP’

calls ‘zip’
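
The compression rules above can be read as an ordered suffix match. The sketch below mirrors the list with a shell `case` statement; the function name `compression_for` and the "none" result for unmatched names are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the compression command the list above
# names for a given target, checking the suffixes in the listed order.
compression_for() {
    case "$1" in
        *.bz2|*.BZ2) echo "bzip2 -c" ;;
        *.gz|*.GZ)   echo "gzip -c" ;;
        *.zip|*.ZIP) echo "zip" ;;
        *)           echo "none" ;;   # no compression suffix recognised
    esac
}

compression_for "gs://some_bucket/some/path/file.csv.gz"
compression_for "/home/myself/file.csv"
```

Only the exact lower-case and upper-case spellings are matched, as in the list; whether mixed-case suffixes such as ‘.Gz’ are recognised is not stated.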

File Type:

The Write component behaves according to the <file> suffix.

Specific file format suffix behaviour:

‘*.avro’, ‘*.AVRO’

calls ‘evl writeavro’

‘*.csv’, ‘*.CSV’, ‘*.txt’, ‘*.TXT’

writes the file with the ‘--text-output’ option; an end-of-line character other than the standard Unix one (‘\n’) can be specified by the option ‘--dos-eol’ or ‘--mac-eol’

‘*.json’, ‘*.JSON’

calls ‘evl writejson’

‘*.parquet’, ‘*.parq’, ‘*.PARQUET’, ‘*.PARQ’

calls ‘evl writeparquet’

‘*.qvd’, ‘*.QVD’

calls ‘evl writeqvd’

‘*.qvx’, ‘*.QVX’

calls ‘evl writeqvx’

‘*.xlsx’, ‘*.XLSX’

calls ‘evl writexlsx’

‘*.xml’, ‘*.XML’

calls ‘evl writexml’
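
The file type rules can be sketched the same way. The function name `writer_for` is hypothetical. Stripping a trailing compression suffix before looking at the format suffix is an assumption based on the ‘basename[.format][.compression]’ pattern, and the "unrecognised" result is illustrative only, since the behaviour for other suffixes is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print what the list above names for a target's format.
writer_for() {
    local name="$1"
    # Assumption: a compression suffix, if any, comes last and is removed first.
    case "$name" in
        *.bz2|*.BZ2|*.gz|*.GZ|*.zip|*.ZIP) name=${name%.*} ;;
    esac
    case "$name" in
        *.avro|*.AVRO)                     echo "evl writeavro" ;;
        *.csv|*.CSV|*.txt|*.TXT)           echo "--text-output" ;;
        *.json|*.JSON)                     echo "evl writejson" ;;
        *.parquet|*.parq|*.PARQUET|*.PARQ) echo "evl writeparquet" ;;
        *.qvd|*.QVD)                       echo "evl writeqvd" ;;
        *.qvx|*.QVX)                       echo "evl writeqvx" ;;
        *.xlsx|*.XLSX)                     echo "evl writexlsx" ;;
        *.xml|*.XML)                       echo "evl writexml" ;;
        *)                                 echo "unrecognised" ;;
    esac
}

writer_for "s3://mybucket/file.json"
writer_for "sftp://my_user@10.11.12.13:22/some/path/example.csv.gz"
```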

Synopsis

Write
  <f_in>
  ( [uri://][[user@]host[:port]]/path/basename[.format][.compression] |
    [uri://][[user@]host[:port]]/database/table )
  (<evd>|-d <inline_evd>) [--append]
  [--footer-file=<f_in>] [--header-file=<f_in> | -h|--header] 
  [ --avro |
    --json [--omit-null-fields] [--array-output] |
    --parquet |
    --qvd | --qvx |
    --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>]
          [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol]
  ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate]

evl write
  ( [uri://][[user@]host[:port]]/path/basename[.format][.compression] |
    [uri://][[user@]host[:port]]/database/table )
  (<evd>|-d <inline_evd>) [--append]
  [--footer-file=<file>] [--header-file=<file> | -h|--header] 
  [ --avro |
    --json [--omit-null-fields] [--array-output] |
    --parquet |
    --qvd | --qvx |
    --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>]
          [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol]
  ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate]
  [-v|--verbose]

evl write
  ( --help | --usage | --version )

Options

Common options:

-d, --data-definition=<inline_evd>

either this option or the file <evd> must be specified

--footer-file=<file>

add <file> after last written record

-h, --header

add a header line with field names. Applicable only to text files (e.g. CSV) and XLSX files.

--header-file=<file>

add <file> before the first record

--validate

without this option, no fields are checked against data types. With this option, all output fields are checked

-x, --text-input

treat the input as text, not binary

--dos-eol

assume the output is text with CRLF as the end of line

--mac-eol

assume the output is text with CR as the end of line

-y, --text-output

write the output as text, not binary
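
The three end-of-line conventions behind ‘--dos-eol’, ‘--mac-eol’ and the Unix default differ only in the bytes that terminate each record. The sketch below shows those bytes with standard tools; the helper names `to_dos_eol`, `to_mac_eol` and `hex` are hypothetical, and the conversion is done with awk purely for illustration, not by EVL.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: re-terminate Unix text with CRLF or CR.
to_dos_eol() { awk '{ printf "%s\r\n", $0 }'; }
to_mac_eol() { awk '{ printf "%s\r", $0 }'; }

# Print the input as a plain string of hex byte values.
hex() { od -An -tx1 | tr -d '[:space:]'; echo; }

printf 'a\nb\n' | hex                # Unix: each line ends with 0a (LF)
printf 'a\nb\n' | to_dos_eol | hex   # DOS:  each line ends with 0d 0a (CRLF)
printf 'a\nb\n' | to_mac_eol | hex   # Mac:  each line ends with 0d (CR)
```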

Standard options:

--help

print this help and exit

--usage

print short usage information and exit

-v, --verbose

print to stderr info/debug messages of the component

--version

print version and exit

Options changing file suffix behaviour:

--avro

whatever file’s suffix, write the file in Avro file format

--cmd=<cmd>

the bash command <cmd> is used to write into <file>. In that case, recognition of the file’s suffix is switched off. See the examples below for inspiration.

--csv

whatever file’s suffix, write the file as CSV using delimiters based on the EVD (same as the ‘--text-output’ option)

--gz

whatever file’s suffix, use ‘gzip’ to compress the file

--ignore-suffix

ignore file’s suffix, act only based on options

--json

whatever file’s suffix, write the file as JSON

--parquet

whatever file’s suffix, write the file in Parquet columnar file format

--qvd

whatever file’s suffix, write the file as Qlik’s QVD file

--qvx

whatever file’s suffix, write the file as Qlik’s QVX file

--xml

whatever file’s suffix, write the file as XML

--xlsx

whatever file’s suffix, write the file as MS Excel sheet

XML specific options:

--document-tag=<tag>

this option is ignored for files other than XML. Check ‘man evl writexml’ for details.

--record-tag=<tag>

this option is ignored for files other than XML. Check ‘man evl writexml’ for details.

--vector-element-tag=<tag>

this option is ignored for files other than XML. Check ‘man evl writexml’ for details.

JSON specific options:

--array-output

with this flag, the JSON output is an array of records, i.e. ‘[{...},{...},...,{...}]’

--omit-null-fields

this option is ignored for files other than JSON. Check ‘man evl writejson’ for details.

Examples

When a password is needed in the following examples, it is taken from the $HOME/.evlpass file.

  1. Write local CSV file in EVL graph (an EVS file):
    TARGET_FILE="/home/myself/file.csv"
    ...
    Map   FLOW1 FLOW2  evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
    
  2. Write JSON file to AWS S3 bucket:
    TARGET_FILE="s3://mybucket/file.json"
    ...
    Map   FLOW1 FLOW2  evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
    
  3. Write Parquet file to Hadoop file system:
    TARGET_FILE="hdfs:///some/path/file.parquet"
    ...
    Map   FLOW1 FLOW2  evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
    
  4. Write gzipped CSV file to Google Storage:
    TARGET_FILE="gs://some_bucket/some/path/file.csv.gz"
    ...
    Map   FLOW1 FLOW2  evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
    
  5. Load data to Postgres table:
    TARGET_FILE="postgres://tech_user@10.11.12.13:5432/my_database/my_table"
    ...
    Map   FLOW1 FLOW2  evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
    
  6. Example of standalone usage: Write gzipped CSV file with header and validated data types over SFTP to some server:
    evl write -d 'id int sep=";", value string sep="\n"' --header -xy --validate \
          sftp://my_user@10.11.12.13:22/some/path/example.csv.gz < example.csv