EVL


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2020 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Sort

(since EVL 1.0)

The command reads records from stdin or <f_in>, sorts them by <key>, and writes them to stdout or <f_out>. With the -u option it also deduplicates the data. At present it supports only the traditional sort order (i.e. as with LC_ALL=C), not national collation.

Sort

is to be used in an EVS job structure definition file. <f_in> and <f_out> are each either a file name or a flow name.

evl sort

is intended for standalone use, i.e. it is invoked from the command line, reads records from standard input, and writes to standard output.

EVD and EVS are EVL definition files; for details see evl-evd(5) and evl-evs(5).
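To make the "traditional sort order" mentioned above concrete, here is a minimal sketch that uses the standard POSIX sort utility (not evl itself) under LC_ALL=C. The helper name c_sort is hypothetical and exists only for this illustration.

```shell
# Illustration only: the standard POSIX sort utility in the C locale,
# which is the ordering the description compares EVL's sort to.
# c_sort is a hypothetical helper name, not part of EVL.
c_sort() {
  LC_ALL=C sort
}

printf 'b\nA\na\nB\n' | c_sort
# prints A, B, a, b (one per line): uppercase ASCII letters sort before lowercase
```

Under a national locale the same input would typically interleave upper and lower case, which is the behaviour the component does not currently offer.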

Synopsis

Sort
  <f_in> <f_out> (<evd>|-d <inline_evd>) -k <key>
  [-u <unique_key> [-t|--keep-first] [-r|--reject=<reject_file>]]
  [-c|--check-sort] [-f|--file-storage] [-i|--ignore-case]
  [--validate] [-x|--text-input] [-y|--text-output]

evl sort
  (<evd>|-d <inline_evd>) -k <key>
  [-u <unique_key> [-t|--keep-first] [-r|--reject=<reject_file>]]
  [-c|--check-sort] [-f|--file-storage] [-i|--ignore-case]
  [--validate] [-x|--text-input] [-y|--text-output]
  [-v|--verbose]

evl sort
  ( --help | --usage | --version )

Options

-c, --check-sort

only check whether the input is sorted, and fail if it is not

-d, --data-definition=<inline_evd>

either this option or the file <evd> must be present. Example: -d 'id int, user_id string(6) enc=iso-8859-1'

-f, --file-storage

store temporary files on disk instead of using memory

-i, --ignore-case

ignore case when comparing key fields

-k, --key=<key>

sort by a key, where <key> is a comma-separated list of fields, each with an optional sort type (the default type is ASC). Example: -k 'id,user_id DESC,modify_dt ASC'

-r, --reject=<reject_file>

when used with option -u, write the duplicate records to <reject_file>

-t, --keep-first

when deduplicating by --unique-key, keep the first record of each group

-u, --unique-key=<unique_key>

deduplicate the output by <unique_key>; keep only the last record of each group unless --keep-first is specified. Duplicate records can be captured with the -r option. Example: -u 'id,user_id'

--validate

without this option, fields are not checked against their data types; with it, all output fields are checked

-x, --text-input

treat the input as text, not binary

-y, --text-output

write the output as text, not binary

Standard options:

--help

print this help and exit

--usage

print short usage information and exit

-v, --verbose

print the component's info/debug messages to stderr

--version

print version and exit

Examples

Sort the text input by the whole record (i.e. by all fields) and write the result to a text output file:

evl sort example.evd -k '' -xy <in.txt >out.txt

Deduplicate binary input (for example from another EVL component), keeping the first record in each group with the same id (the one with the lowest updated date); write the result to output.csv and the duplicates to duplicates.csv:

cat input.bin | \
evl sort -ty -k'id,updated' -u'id' \
  -d'id int sep=",", updated date sep="\n"' -r duplicates.csv >output.csv
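The difference between keeping the first and the last record of each group can be sketched with plain awk over CSV text that is already sorted by id and then by updated. This only mimics the semantics described for -u with and without --keep-first; it is not how evl sort is implemented, and the function names keep_first and keep_last are hypothetical.

```shell
# Illustration only: plain awk over text already sorted by id, then updated.
# keep_first and keep_last are hypothetical helpers, not EVL commands.

# Keep the first record of each run of equal ids (like -u 'id' with --keep-first).
keep_first() {
  awk -F, 'NR == 1 || $1 != prev { print } { prev = $1 }'
}

# Keep the last record of each run of equal ids (like -u 'id' without --keep-first).
keep_last() {
  awk -F, 'NR > 1 && $1 != prev { print line }
           { prev = $1; line = $0 }
           END { if (NR > 0) print line }'
}

sorted='1,2020-01-01
1,2020-03-01
2,2020-02-01'

printf '%s\n' "$sorted" | keep_first   # prints 1,2020-01-01 then 2,2020-02-01
printf '%s\n' "$sorted" | keep_last    # prints 1,2020-03-01 then 2,2020-02-01
```

Because the sort key places the lowest updated date first within each id, keeping the first record selects the earliest row, which matches the wording of the example above.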

Check, case-insensitively, that the text file input.txt is sorted, and write the output to the file output.bin in binary (i.e. not as text):

evl sort -cix -k'name' -d'name string sep="|", personal_id int sep="\n"' \
  <input.txt >output.bin
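A script usually needs to branch on the outcome of such a check. The sketch below shows the pattern with the standard POSIX sort utility (sort -c for checking, -f for folding case), not with evl itself. That evl sort -c reports an unsorted input through a non-zero exit status is an assumption here, since the option text only says that it fails. The helper name is_sorted_nocase is hypothetical.

```shell
# Illustration only: POSIX sort -c exits with status 0 when the input is
# sorted and non-zero otherwise; -f folds lowercase to uppercase when comparing.
# is_sorted_nocase is a hypothetical helper, not part of EVL.
is_sorted_nocase() {
  LC_ALL=C sort -c -f 2>/dev/null
}

if printf 'alice\nBob\ncarol\n' | is_sorted_nocase; then
  echo "sorted"
else
  echo "not sorted"
fi
# prints "sorted": with case folded, alice < Bob < carol
```

Without case folding the same input would be rejected in the C locale, because the byte value of "a" is greater than that of "B".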