EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.


Gather

(since EVL 1.2)

Gather several input flows or files into one output flow or file in round-robin fashion.

Gather

is to be used in an EVS job structure definition file. <f_in> and <f_out> are the input and output, each given as either a file or a flow name.

evl gather

is intended for standalone usage, i.e. to be invoked from the command line. When <file> is ‘-’, input is read from stdin.

EVD is the EVL data definition file; for details see evl-evd(5).

Synopsis

Gather
  <f_in>... <f_out> (<evd>|-d <inline_evd>)
  [--validate] [-x|--text-input] [-y|--text-output]

evl gather
  [<file>...]  (<evd>|-d <inline_evd>)
  [--validate] [-x|--text-input] [-y|--text-output]
  [-v|--verbose]

evl gather
  ( --help | --usage | --version )

Options

-d, --data-definition=<inline_evd>

either this option or the file <evd> must be present. Example: -d 'id int, user_id string(6) enc=iso-8859-1'

--validate

without this option, fields are not checked against their data types; with this option, all output fields are checked

-x, --text-input

treat the input as text, not binary

-y, --text-output

write the output as text, not binary

Standard options:

--help

print this help and exit

--usage

print short usage information and exit

-v, --verbose

print info/debug messages of the component to stderr

--version

print version and exit

Examples

  1. The following command:
    evl gather file.a file.b file.c file.evd -xy

prints to stdout the first record of ‘file.a’, then the first record of ‘file.b’, then the first record of ‘file.c’, then the second records, and so on.

  2. To gather a partitioned flow in an EVL job:
    Read      s3://my_bucket/cust.csv CUSTOMERS $EVD_CUST
    Partition CUSTOMERS CUST_P  $EVD_CUST --round-robin
    Map       CUST_P    PROC_M  $EVD_CUST $EVD_PROC $EVM_PROC
    Gather    PROC_M    PROC_G  $EVD_PROC
    Write     PROC_G    sftp:///some/path/proc.csv.gz $EVD_PROC
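
The round-robin order described in the first example can be illustrated with a small Python sketch. This is not EVL code and says nothing about how EVL is implemented; the function name gather_round_robin and the sample records are hypothetical. It also assumes that an input which runs out of records is simply skipped, which this document does not state.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking an input that has run out of records


def gather_round_robin(*inputs):
    """Yield the first record of each input, then the second records, and so on."""
    for row in zip_longest(*inputs, fillvalue=_MISSING):
        for record in row:
            # Assumption (not confirmed by the EVL manual): exhausted inputs are skipped.
            if record is not _MISSING:
                yield record


# Hypothetical contents standing in for file.a, file.b and file.c.
file_a = ["a1", "a2", "a3"]
file_b = ["b1", "b2", "b3"]
file_c = ["c1", "c2", "c3"]

gathered = list(gather_round_robin(file_a, file_b, file_c))
print(gathered)
```

With three inputs of equal length, as above, the records come out interleaved as a1, b1, c1, a2, b2, c2, a3, b3, c3, which matches the order described for the evl gather command in the first example.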