EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Departition

(since EVL 1.2)

Gather or merge partitions into one output flow or file. When ‘-k <key>’ is specified, each input partition is expected to be sorted by that key, and the output is sorted as well (i.e. the partitions are merged). Without ‘-k <key>’, input partitions are gathered in round-robin fashion. Applied to a single partition, the component simply writes the input to the output. EVD is the EVL data definition file; for details see evl-evd(5).
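The two gathering modes described above can be illustrated with a small Python sketch. This is not EVL code and says nothing about how EVL is implemented; the function names and sample records are hypothetical, and the record-by-record granularity of the round-robin gather is an assumption made for illustration only.

```python
import heapq
from itertools import zip_longest
from operator import itemgetter

_MISSING = object()  # sentinel marking a partition that has run out of records


def departition_by_key(partitions, key):
    """Merge partitions that are each already sorted by `key` (k-way merge)."""
    return list(heapq.merge(*partitions, key=key))


def departition_round_robin(partitions):
    """Take one record from each partition in turn until all are exhausted.

    Assumption: one record per turn; the real component may gather in blocks.
    """
    out = []
    for row in zip_longest(*partitions, fillvalue=_MISSING):
        out.extend(rec for rec in row if rec is not _MISSING)
    return out


# Hypothetical sample data: (id, user_id) records, each partition sorted by id.
p0 = [(1, "a"), (4, "d"), (6, "f")]
p1 = [(2, "b"), (3, "c")]
p2 = [(5, "e")]

merged = departition_by_key([p0, p1, p2], key=itemgetter(0))
gathered = departition_round_robin([p0, p1, p2])

print(merged)    # sorted by id across all partitions
print(gathered)  # interleaved, not sorted
```

In this sketch the keyed merge yields a globally sorted result only because every partition is sorted on input, which mirrors the precondition stated for ‘-k <key>’. The round-robin gather preserves no ordering across partitions.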

Synopsis

Departition
  <f_in>... <f_out> (<evd>|-d <inline_evd>)
  (--key=<key> | --round-robin)
  [--validate] [-x|--text-input] [-y|--text-output]

evl departition
  <file_in> <file_out> (<evd>|-d <inline_evd>)
  (--key=<key> | --round-robin)
  [--validate] [-x|--text-input] [-y|--text-output]
  [-v|--verbose]

evl departition
  ( --help | --usage | --version )

Options

-d, --data-definition=<inline_evd>

either this option or the file <evd> must be present. Example: -d 'id int, user_id string(6) enc=iso-8859-1'

-k, --key=<key>

merge partitioned flows/files according to the key; each input partition must already be sorted by this key, and the output is sorted by it as well

-r, --round-robin

gather in round-robin fashion

--validate

without this option, fields are not checked against their data types; with this option, all output fields are checked

-x, --text-input

treat the input as text, not binary

-y, --text-output

write the output as text, not binary

Standard options:

--help

print this help and exit

--usage

print short usage information and exit

-v, --verbose

print info/debug messages of the component to stderr

--version

print version and exit

Examples

  1. To departition a partitioned flow in an EVL job:
    Read        gs://my_bucket/cust.csv  CUST  $EVD_CUST
    Partition   CUST      CUST_P  $EVD_CUST --round-robin
    Map         CUST_P    PROC_M  $EVD_CUST $EVD_PROC $EVM_PROC
    Departition PROC_M    PROC_G  $EVD_PROC --round-robin
    Write       PROC_G    gdrive://proc.xlsx $EVD_PROC