Partition
(since EVL 1.2)
Reads an input flow or file and distributes its records across several output flows or files, according to either ‘--key’ or ‘--round-robin’ logic. The number of partitions depends on the ‘EVL_PARTITIONS’ environment variable and also on the EVL version/edition.
- Partition
-
is to be used in an EVS job structure definition file. <f_in> and <f_out> are the names of the input and output file or flow, respectively.
- evl partition
-
is intended for standalone usage, i.e. to be invoked from the command line.
EVD is an EVL data definition file; for details see evl-evd(5).
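The two distribution strategies can be illustrated with a small Python sketch. It is a conceptual model only, not EVL's implementation: the function names, the dict-based records, and the use of CRC32 as the hash are assumptions made for illustration, and the partition count is passed in directly rather than taken from ‘EVL_PARTITIONS’.

```python
import zlib


def partition_round_robin(records, n_partitions):
    """Send record i to partition i mod n_partitions."""
    parts = [[] for _ in range(n_partitions)]
    for i, rec in enumerate(records):
        parts[i % n_partitions].append(rec)
    return parts


def partition_by_key(records, key, n_partitions):
    """Send each record to the partition selected by a hash of its key field.

    CRC32 is a stand-in hash (an assumption); any deterministic hash
    would serve for this illustration.
    """
    parts = [[] for _ in range(n_partitions)]
    for rec in records:
        h = zlib.crc32(str(rec[key]).encode("utf-8"))
        parts[h % n_partitions].append(rec)
    return parts


# Hypothetical sample data.
customers = [
    {"id": 1, "country": "CZ"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "CZ"},
    {"id": 4, "country": "FR"},
    {"id": 5, "country": "DE"},
]

rr = partition_round_robin(customers, 2)
print([[r["id"] for r in p] for p in rr])  # [[1, 3, 5], [2, 4]]

by_key = partition_by_key(customers, "country", 3)
for p in by_key:
    print(sorted({r["country"] for r in p}))
```

In this model, round-robin yields partitions of nearly equal size regardless of content, while key-based partitioning guarantees that records sharing a key value land in the same partition, at the cost of possibly uneven sizes.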
Synopsis
Partition <f_in> <f_out> (<evd>|-d <inline_evd>) (--key=<key> | --round-robin) [--validate] [-x|--text-input] [-y|--text-output]
evl partition <file_in> <file_out> (<evd>|-d <inline_evd>) (--key=<key> | --round-robin) [--validate] [-x|--text-input] [-y|--text-output] [-v|--verbose]
evl partition ( --help | --usage | --version | --max-partitions )
Options
- -d, --data-definition=<inline_evd>
-
either this option or the file <evd> must be provided
- -k, --key=<key>
-
key by which the data are distributed
- -m, --max-partitions
-
return the maximum possible number of partitions
- -r, --round-robin
-
split by round-robin, i.e. records are sent one after another to the output flows/files in turn
- --validate
-
without this option, no fields are checked against data types. With this option, all output fields are checked
- -x, --text-input
-
treat the input as text, not binary
- -y, --text-output
-
write the output as text, not binary
Standard options:
- --help
-
print this help and exit
- --usage
-
print short usage information and exit
- -v, --verbose
-
print info/debug messages of the component to stderr
- --version
-
print version and exit
Examples
- To partition a flow in an EVL job:
Read s3://my_bucket/cust.csv CUST $EVD_CUST
Partition CUST CUST_P $EVD_CUST --round-robin
Map CUST_P PROC_M $EVD_CUST $EVD_PROC $EVM_PROC
Departition PROC_M PROC_G $EVD_PROC --round-robin
Write PROC_G sftp:///some/path/proc.csv.gz $EVD_PROC
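The data flow of this job can be mimicked in plain Python to show what each step contributes. This is only a rough model under stated assumptions: the helper names are hypothetical, upper-casing a name stands in for whatever $EVM_PROC actually defines, the partition count of 3 is arbitrary, and round-robin Departition is assumed to take one record from each partition in turn.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel marking exhausted partitions


def partition_rr(records, n):
    """Model of 'Partition ... --round-robin': record i goes to partition i mod n."""
    return [records[i::n] for i in range(n)]


def map_partition(part, fn):
    """Model of 'Map': apply the same mapping to every record of one partition."""
    return [fn(r) for r in part]


def departition_rr(parts):
    """Model of 'Departition ... --round-robin' (assumed semantics):
    take one record from each partition in turn until all are exhausted."""
    out = []
    for row in zip_longest(*parts, fillvalue=_MISSING):
        out.extend(r for r in row if r is not _MISSING)
    return out


cust = ["alice", "bob", "carol", "dave", "eve"]        # stands in for CUST
cust_p = partition_rr(cust, 3)                          # CUST_P
proc_m = [map_partition(p, str.upper) for p in cust_p]  # PROC_M
proc_g = departition_rr(proc_m)                         # PROC_G
print(proc_g)  # ['ALICE', 'BOB', 'CAROL', 'DAVE', 'EVE']
```

Under these assumptions, gathering round-robin after a round-robin split returns the records in their original order, because each partition is drained in the same cycle in which it was filled.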