Write
(since EVL 1.0)
Write <f_in> into <target>, which is a file or a table specified in general by one of:
    [scheme://][[user@]host[:port]]/path/basename[.format][.compression]
    [scheme://][[user@]host[:port]/]database?(table=<table>|query=<query>)
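To make the file-style target pattern concrete, the following is a minimal POSIX shell sketch that splits a target into scheme, [user@]host[:port] part and path. The helper name split_target is hypothetical and is not part of EVL; the sketch ignores the database form with ‘?table=’ or ‘?query=’ and only illustrates how the pieces of the pattern line up.

```shell
# Sketch only: split a file-style target with POSIX parameter expansion.
# split_target is a hypothetical helper, not an EVL command.
split_target() {
    target=$1
    case $target in
        *://*)
            scheme=${target%%://*}      # text before the first "://"
            rest=${target#*://}         # text after the first "://"
            authority=${rest%%/*}       # optional [user@]host[:port]
            path=/${rest#*/}            # everything from the first "/"
            ;;
        *)
            scheme=file                 # no scheme: local filesystem
            authority=
            path=$target
            ;;
    esac
    printf '%s|%s|%s\n' "$scheme" "$authority" "$path"
}

split_target "sftp://my_user@10.11.12.13:22/some/path/example.csv.gz"
split_target "hdfs:///some/path/file.parquet"
split_target "/home/myself/file.csv"
```

The three calls use targets that appear in the examples of this page; an empty middle field means that no host part was given.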
Besides the options mentioned below that change file suffix behaviour, one can use the generic
‘--cmd=<cmd>’ option, which calls something like ‘| <cmd> > <path>’ at the end.
<cmd> can also be a pipeline. See examples below for inspiration.
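The ‘| <cmd> > <path>’ behaviour can be pictured with standard tools alone. This is a sketch, not EVL's implementation: printf stands in for the records the component would emit, and the values of cmd and path are illustrative assumptions.

```shell
# Sketch only: mimic "<records> | <cmd> > <path>" without EVL.
cmd='tr ";" "," | gzip -c'   # illustrative <cmd>; it is itself a pipeline
path=$(mktemp)               # illustrative <path>

# printf stands in for the text records written by the component.
printf 'id;value\n1;foo\n' | sh -c "$cmd" > "$path"

# Inspect what ended up in <path>.
gzip -dc "$path"
```

Here the command both changes the delimiter and compresses the stream, which is the kind of post-processing a pipeline in <cmd> allows.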
- Write
-
is to be used in an EVS job structure definition file. <f_in> is either an input file or a flow name.
- evl write
-
is intended for standalone usage, i.e. to be invoked from the command line, reading its input from standard input.
EVD is EVL data definition file, for details see evl-evd(5).
URI Scheme:
Based on the URI scheme ‘scheme://’, the component calls the appropriate utility to write the file to the destination.
- no scheme, ‘file://’
-
assumes the local filesystem
- ‘gdrive://’
-
calls ‘gdrive’ utility
- ‘gs://’
-
calls ‘gsutil’ utility
- ‘hdfs://’
-
calls ‘hadoop fs’ utility
- ‘s3://’
-
calls ‘aws s3’ utility
- ‘sftp://’
-
calls ‘ssh’ utility
- ‘smb://’
-
calls ‘smbclient’ utility
For database targets, the component switches to the appropriate EVL component based on the URI scheme ‘scheme://’.
- ‘mysql://’
-
calls Writemysql component to write to MySQL/MariaDB table
- ‘oracle://’
-
calls Writeora component to write to Oracle table
- ‘postgres://’
-
calls Writepg component to write to PostgreSQL table
- ‘sqlite://’
-
calls Writesqlite component to write to SQLite table
- ‘teradata://’
-
calls Writetd component to write to Teradata table
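The two scheme lists above amount to a lookup from scheme to backend. The sketch below restates them as a shell ‘case’; backend_for is a hypothetical helper for illustration only and says nothing about how EVL performs the dispatch internally.

```shell
# Sketch only: the scheme-to-backend mapping from the lists above.
# backend_for is a hypothetical helper, not an EVL command.
backend_for() {
    case $1 in
        gdrive://*)   echo "gdrive" ;;
        gs://*)       echo "gsutil" ;;
        hdfs://*)     echo "hadoop fs" ;;
        s3://*)       echo "aws s3" ;;
        sftp://*)     echo "ssh" ;;
        smb://*)      echo "smbclient" ;;
        mysql://*)    echo "Writemysql" ;;
        oracle://*)   echo "Writeora" ;;
        postgres://*) echo "Writepg" ;;
        sqlite://*)   echo "Writesqlite" ;;
        teradata://*) echo "Writetd" ;;
        *)            echo "local filesystem" ;;   # no scheme or file://
    esac
}

backend_for "s3://mybucket/file.json"
backend_for "postgres://tech_user@10.11.12.13:5432/my_database/my_table"
backend_for "/home/myself/file.csv"
```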
Compression:
Compression is chosen by the file suffix (rules are applied in the following order):
- ‘*.bz2’, ‘*.BZ2’
-
calls ‘bzip2 -c’
- ‘*.gz’, ‘*.GZ’
-
calls ‘gzip -c’
- ‘*.zip’, ‘*.ZIP’
-
calls ‘zip’
File Type:
The Write component behaves according to the suffix of <target>.
Suffix behaviour for specific file formats:
- ‘*.avro’, ‘*.AVRO’
-
calls ‘evl writeavro’
- ‘*.csv’, ‘*.CSV’, ‘*.txt’, ‘*.TXT’
-
writes the file with the ‘--text-output’ option; an end-of-line other than the standard Unix character (‘\n’) can be specified by the ‘--dos-eol’ or ‘--mac-eol’ option
- ‘*.json’, ‘*.JSON’
-
calls ‘evl writejson’
- ‘*.parquet’, ‘*.parq’, ‘*.PARQUET’, ‘*.PARQ’
-
calls ‘evl writeparquet’
- ‘*.qvd’, ‘*.QVD’
-
calls ‘evl writeqvd’
- ‘*.qvx’, ‘*.QVX’
-
calls ‘evl writeqvx’
- ‘*.xlsx’, ‘*.XLSX’
-
calls ‘evl writexlsx’
- ‘*.xml’, ‘*.XML’
-
calls ‘evl writexml’
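The suffix rules above can be read as a two-step lookup: first the compression suffix, then the format suffix. The sketch below encodes both tables; plan_for is a hypothetical helper, and the assumption that the compression suffix is stripped before the format suffix is examined is inferred from names such as ‘file.csv.gz’ and is not stated explicitly in this page.

```shell
# Sketch only: derive writer and compressor from a file name using the
# suffix tables above. plan_for is a hypothetical helper, not an EVL command.
plan_for() {
    name=$1
    compress=none
    # Step 1: compression suffix (assumed to be stripped before step 2).
    case $name in
        *.bz2|*.BZ2) compress='bzip2 -c'; name=${name%.*} ;;
        *.gz|*.GZ)   compress='gzip -c';  name=${name%.*} ;;
        *.zip|*.ZIP) compress='zip';      name=${name%.*} ;;
    esac
    # Step 2: file format suffix.
    case $name in
        *.avro|*.AVRO)                     writer='evl writeavro' ;;
        *.csv|*.CSV|*.txt|*.TXT)           writer='--text-output' ;;
        *.json|*.JSON)                     writer='evl writejson' ;;
        *.parquet|*.parq|*.PARQUET|*.PARQ) writer='evl writeparquet' ;;
        *.qvd|*.QVD)                       writer='evl writeqvd' ;;
        *.qvx|*.QVX)                       writer='evl writeqvx' ;;
        *.xlsx|*.XLSX)                     writer='evl writexlsx' ;;
        *.xml|*.XML)                       writer='evl writexml' ;;
        *)                                 writer='no table entry' ;;
    esac
    printf '%s|%s\n' "$writer" "$compress"
}

plan_for example.csv.gz
plan_for file.parquet
```

The last branch only marks suffixes that the tables do not cover; what the component does in that case is not described here.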
Synopsis
Write
  <f_in> <target> (<evd>|-d <inline_evd>)
  [--append] [--footer-file=<f_in>] [--header-file=<f_in> | -h|--header]
  [ --avro |
    --json [--omit-null-fields] [--array-output] |
    --parquet |
    --qvd |
    --qvx |
    --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>] [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol] ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate]

evl write
  <target> (<evd>|-d <inline_evd>)
  [--append] [--footer-file=<file>] [--header-file=<file> | -h|--header]
  [ --avro |
    --json [--omit-null-fields] [--array-output] |
    --parquet |
    --qvd |
    --qvx |
    --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>] [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol] ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate] [-v|--verbose]

evl write
  ( --help | --usage | --version )
Options
Common options:
- -d, --data-definition=<inline_evd>
-
either this option or the file <evd> must be present
- --footer-file=<file>
-
add <file> after the last written record
- -h, --header
-
add a header line with field names. Applicable only to text files (e.g. CSV) and XLSX files.
- --header-file=<file>
-
add <file> before the first record
- --validate
-
without this option, no fields are checked against their data types. With this option, all output fields are checked.
- -x, --text-input
-
treat the input as text, not binary
- --dos-eol
-
use CRLF as the end of line in text output
- --mac-eol
-
use CR as the end of line in text output
- -y, --text-output
-
write the output as text, not binary
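For readers unsure how the three end-of-line conventions differ on disk, here is a small sketch using only standard tools. It does not call EVL; awk and tr merely imitate the line endings that ‘--dos-eol’ and ‘--mac-eol’ are described as producing, and the two sample records are made up.

```shell
# Sketch only: the same two records with LF, CRLF and CR line endings.
unix=$(mktemp); dos=$(mktemp); mac=$(mktemp)

printf 'a;1\nb;2\n' > "$unix"                      # LF, the Unix default
awk '{ printf "%s\r\n", $0 }' "$unix" > "$dos"     # CRLF, as with --dos-eol
tr '\n' '\r' < "$unix" > "$mac"                    # CR, as with --mac-eol

# Byte counts: CRLF adds one byte per record, CR keeps the size unchanged.
for f in "$unix" "$dos" "$mac"; do
    echo $(( $(wc -c < "$f") ))
done
```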
Standard options:
- --help
-
print this help and exit
- --usage
-
print short usage information and exit
- -v, --verbose
-
print to stderr info/debug messages of the component
- --version
-
print version and exit
Options changing file suffix behaviour:
- --avro
-
whatever the file’s suffix, write the file in Avro file format
- --cmd=<cmd>
-
the bash command <cmd> is used to write into <file>. In that case, recognition of the file’s suffix is switched off. See examples below for inspiration.
- --csv
-
whatever the file’s suffix, write the file as CSV using delimiters based on the EVD (same as the ‘--text-output’ option)
- --gz
-
whatever the file’s suffix, use ‘gzip’ to compress the file
- --ignore-suffix
-
ignore the file’s suffix, act only based on options
- --json
-
whatever the file’s suffix, write the file as JSON
- --parquet
-
whatever the file’s suffix, write the file in Parquet columnar file format
- --qvd
-
whatever the file’s suffix, write the file as Qlik’s QVD file
- --qvx
-
whatever the file’s suffix, write the file as Qlik’s QVX file
- --xml
-
whatever the file’s suffix, write the file as XML
- --xlsx
-
whatever the file’s suffix, write the file as an MS Excel sheet
XML specific options:
- --document-tag=<tag>
-
this option is ignored for files other than XML. See ‘man evl writexml’ for details.
- --record-tag=<tag>
-
this option is ignored for files other than XML. See ‘man evl writexml’ for details.
- --vector-element-tag=<tag>
-
this option is ignored for files other than XML. See ‘man evl writexml’ for details.
JSON specific options:
- --array-output
-
with this flag, the JSON output is an array of records, i.e. ‘[{...},{...},...,{...}]’
- --omit-null-fields
-
this option is ignored for files other than JSON. See ‘man evl writejson’ for details.
Examples
When a password is needed in the following examples, it is taken from the $HOME/.evlpass file.
- Write local CSV file in EVL graph (an EVS file):
    TARGET_FILE="/home/myself/file.csv"
    ...
    Map FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
- Write JSON file to AWS S3 bucket:
    TARGET_FILE="s3://mybucket/file.json"
    ...
    Map FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
- Write Parquet file to Hadoop file system:
    TARGET_FILE="hdfs:///some/path/file.parquet"
    ...
    Map FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
- Load gzipped CSV file to Google Storage:
    TARGET_FILE="gs://some_bucket/some/path/file.csv.gz"
    ...
    Map FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
- Load data to Postgres table:
    TARGET_FILE="postgres://tech_user@10.11.12.13:5432/my_database/my_table"
    ...
    Map FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
    Write FLOW2 $TARGET_FILE evd/f2.evd
- Example of standalone usage: write a gzipped CSV file with a header and validated data types over SFTP to some server:
    evl write -d 'id int sep=";", value string sep="\n"' --header -xy --validate \
      sftp://my_user@10.11.12.13:22/some/path/example.csv.gz < example.csv