Write
(since EVL 1.0)
Write <f_in> into <target>, which is a file or a table specified in general by
[scheme:][//[user@]host[:port]]/path/basename[.format][.compression]
[scheme:][//[user@]host[:port]/]database?(table=[schema.]<table>|query=<query>)
Besides the options mentioned below, which change the file suffix behaviour, one can use the generic ‘--cmd=<cmd>’ option, which calls something like ‘| <cmd> > <path>’ at the end. <cmd> can also be a pipeline. See the examples below for inspiration.
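To make the ‘| <cmd> > <path>’ shape concrete, here is a minimal plain-shell sketch of that last step. It is not EVL's implementation: the function name write_via_cmd and the sample records are hypothetical, and the data is simply read from standard input.

```shell
#!/bin/sh
# Hypothetical helper (not part of EVL): emulate '| <cmd> > <path>'.
# $1 = command, which may itself be a pipeline; $2 = destination path.
write_via_cmd() {
    cmd=$1
    path=$2
    # 'sh -c' lets <cmd> contain pipes; the function's stdin is passed through.
    sh -c "$cmd" > "$path"
}

out=$(mktemp)
# Here <cmd> is a two-stage pipeline: uppercase everything, then sort descending.
printf 'a;1\nb;2\n' | write_via_cmd "tr 'a-z' 'A-Z' | sort -r" "$out"
cat "$out"
```

Because the whole command string is handed to a shell, anything that is valid after a pipe symbol can be used as <cmd> in this sketch.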
- Write
-
is to be used in an EVS job structure definition file. <f_in> is either an input file or a flow name.
- evl write
-
is intended for standalone usage, i.e. to be invoked from the command line, reading records from standard input.
EVD is the EVL data definition file; for details see evl-evd(5).
URI Scheme:
Based on the URI scheme ‘scheme:’, the component calls the appropriate utility to write the file to the destination.
- no scheme, ‘file:’
-
assumes the local filesystem
- ‘gdrive:’
-
calls ‘gdrive’ utility
- ‘gs:’
-
calls ‘gsutil’ utility
- ‘hdfs:’
-
calls ‘hadoop fs’ utility
- ‘s3:’
-
calls ‘aws s3’ utility
- ‘sftp:’
-
calls ‘ssh’ utility
- ‘smb:’
-
calls ‘smbclient’ utility
Based on the URI scheme ‘scheme:’, the component switches to the appropriate EVL component.
- ‘mysql:’
-
calls Writemysql component to write to MySQL/MariaDB table
- ‘oracle:’
-
calls Writeora component to write to Oracle table
- ‘postgres:’
-
calls Writepg component to write to PostgreSQL table
- ‘sqlite:’
-
calls Writesqlite component to write to SQLite table
- ‘teradata:’
-
calls Writetd component to write to Teradata table
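The two scheme lists above amount to a simple dispatch on the URI prefix. The following shell sketch only prints the utility or component named in those lists. The function name target_handler is hypothetical, and treating every unrecognised prefix as a local path is an assumption made here, not documented behaviour.

```shell
#!/bin/sh
# Hypothetical helper (not part of EVL): print which utility or component
# the lists above associate with the scheme of a target URI.
target_handler() {
    case $1 in
        gdrive:*)   echo "gdrive" ;;
        gs:*)       echo "gsutil" ;;
        hdfs:*)     echo "hadoop fs" ;;
        s3:*)       echo "aws s3" ;;
        sftp:*)     echo "ssh" ;;
        smb:*)      echo "smbclient" ;;
        mysql:*)    echo "Writemysql" ;;
        oracle:*)   echo "Writeora" ;;
        postgres:*) echo "Writepg" ;;
        sqlite:*)   echo "Writesqlite" ;;
        teradata:*) echo "Writetd" ;;
        # 'file:' or no scheme; anything else also lands here (assumption).
        *)          echo "local filesystem" ;;
    esac
}

target_handler "s3://mybucket/file.json"
target_handler "/home/myself/file.csv"
target_handler "hdfs:///some/path/file.parquet"
```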
Compression:
Compression file suffix behaviour (applied in the following order):
- ‘*.bz2’, ‘*.BZ2’
-
calls ‘bzip2 -c’
- ‘*.gz’, ‘*.GZ’
-
calls ‘gzip -c’
- ‘*.zip’, ‘*.ZIP’
-
calls ‘zip’
File Type:
The Write component behaves according to the <file> suffix.
Specific file format suffix behaviour:
- ‘*.avro’, ‘*.AVRO’
-
calls ‘evl writeavro’
- ‘*.csv’, ‘*.CSV’, ‘*.txt’, ‘*.TXT’
-
writes the file with the ‘--text-output’ option; an end-of-line character other than the standard Unix one (‘\n’) can be specified by the option ‘--dos-eol’ or ‘--mac-eol’
- ‘*.json’, ‘*.JSON’
-
calls ‘evl writejson’
- ‘*.parquet’, ‘*.parq’, ‘*.PARQUET’, ‘*.PARQ’
-
calls ‘evl writeparquet’
- ‘*.qvd’, ‘*.QVD’
-
calls ‘evl writeqvd’
- ‘*.qvx’, ‘*.QVX’
-
calls ‘evl writeqvx’
- ‘*.xlsx’, ‘*.XLSX’
-
calls ‘evl writexlsx’
- ‘*.xml’, ‘*.XML’
-
calls ‘evl writexml’
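How the compression and file type suffixes combine on a name such as ‘file.csv.gz’ can be sketched in plain shell. This is an illustration only: the function name describe_target is hypothetical, and it assumes the compression suffix is stripped before the format suffix is examined, which matches the ‘basename[.format][.compression]’ pattern but is not spelled out above. The value ‘unknown’ for unlisted suffixes is a placeholder, not documented behaviour.

```shell
#!/bin/sh
# Hypothetical helper (not part of EVL): report what the suffix lists above
# would select for a given target path.
describe_target() {
    name=${1##*/}            # keep only the basename
    compression=none
    case $name in
        *.bz2|*.BZ2) compression="bzip2 -c"; name=${name%.*} ;;
        *.gz|*.GZ)   compression="gzip -c";  name=${name%.*} ;;
        *.zip|*.ZIP) compression="zip";      name=${name%.*} ;;
    esac
    case $name in
        *.avro|*.AVRO)                     format="evl writeavro" ;;
        *.csv|*.CSV|*.txt|*.TXT)           format="--text-output" ;;
        *.json|*.JSON)                     format="evl writejson" ;;
        *.parquet|*.parq|*.PARQUET|*.PARQ) format="evl writeparquet" ;;
        *.qvd|*.QVD)                       format="evl writeqvd" ;;
        *.qvx|*.QVX)                       format="evl writeqvx" ;;
        *.xlsx|*.XLSX)                     format="evl writexlsx" ;;
        *.xml|*.XML)                       format="evl writexml" ;;
        *)                                 format="unknown" ;;  # placeholder
    esac
    printf 'format=%s compression=%s\n' "$format" "$compression"
}

describe_target "gs://some_bucket/some/path/file.csv.gz"
describe_target "/home/myself/file.json"
```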
Synopsis
Write <f_in> <target> (<evd>|-d <inline_evd>)
  [-a|--append] [--footer-file=<file>] [--header-file=<file> | -h|--header]
  [ --avro | --json [--omit-null-fields] [--array-output] | --parquet | --qvd | --qvx | --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>] [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol] ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate]

evl write <target> (<evd>|-d <inline_evd>)
  [-a|--append] [--footer-file=<file>] [--header-file=<file> | -h|--header]
  [ --avro | --json [--omit-null-fields] [--array-output] | --parquet | --qvd | --qvx | --xlsx |
    --xml [--document-tag=<tag>] [--record-tag=<tag>] [--vector-element-tag=<tag>] |
    -y|--text-output [--dos-eol] [--mac-eol] ]
  [--gz] [--cmd=<cmd>] [--ignore-suffix]
  [-x|--text-input] [--validate] [-v|--verbose]

evl write
  ( --help | --usage | --version )
Options
Common options:
- -a, --append
-
do not overwrite the target file or table, only append to it. When used with the file formats Avro, Parquet, QVD, QVX or XLSX, a warning is displayed and the target file is overwritten, because for these formats appending does not make sense or is not possible. Use this option with care; it is safer to concatenate the increment with the previous version of the file or table and then move the result over the original.
- -d, --data-definition=<inline_evd>
-
either this option or the file <evd> must be provided
- --footer-file=<file>
-
add <file> after the last written record. When used with the file formats Parquet, QVD, QVX or XLSX, a warning is displayed and no <file> is appended. (It doesn’t make sense for these formats as they are binary.)
- -h, --header
-
add header line with field names. Applicable only for text files (e.g. CSV) and XLSX file. When used with file formats Avro, JSON, Parquet, QVD, QVX or XML, warning is displayed and no header is written. It doesn’t make sense for these formats.
- --header-file=<file>
-
add <file> before the first record. When used with the file formats Parquet, QVD, QVX or XLSX, a warning is displayed and no <file> is prepended. (It doesn’t make sense for these formats as they are binary.)
- --validate
-
without this option, no fields are checked against data types. With this option, all output fields are checked
- -x, --text-input
-
treat the input as text, not binary
- --dos-eol
-
assume the output is text with CRLF as end of line
- --mac-eol
-
assume the output is text with CR as end of line
- -y, --text-output
-
write the output as text, not binary
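The advice given under ‘--append’ (concatenate the increment with the previous version, then move the result over) can be sketched in plain shell for text files. This is not an EVL feature. All file names below are hypothetical, and the temporary file is assumed to live in the same directory as the target so that the final ‘mv’ is a rename on the same filesystem.

```shell
#!/bin/sh
set -e
workdir=$(mktemp -d)
current="$workdir/data.csv"         # hypothetical existing target file
increment="$workdir/increment.csv"  # hypothetical newly written records
printf 'id;value\n1;a\n' > "$current"
printf '2;b\n' > "$increment"

# Build the new version next to the target, then replace the target in one
# step, so a reader never sees a half-written file.
tmp="$current.tmp"
cat "$current" "$increment" > "$tmp"
mv "$tmp" "$current"

cat "$current"
```

For the binary formats listed under ‘--append’, a plain ‘cat’ is not a valid way to concatenate; the sketch applies to text files only.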
Standard options:
- --help
-
print this help and exit
- --usage
-
print short usage information and exit
- -v, --verbose
-
print to stderr info/debug messages of the component
- --version
-
print version and exit
Options changing file suffix behaviour:
- --avro
-
whatever file’s suffix, write the file in Avro file format
- --cmd=<cmd>
-
the bash command <cmd> is used to write into <file>. In such a case, recognition of the file’s suffix is switched off. See the examples below for inspiration.
- --csv
-
whatever file’s suffix, write the file as CSV using delimiters based on the EVD (same as the ‘--text-output’ option)
- --gz
-
whatever file’s suffix, use ‘gzip’ to compress the file
- --ignore-suffix
-
ignore file’s suffix, act only based on options
- --json
-
whatever file’s suffix, write the file as JSON
- --parquet
-
whatever file’s suffix, write the file in Parquet columnar file format
- --qvd
-
whatever file’s suffix, write the file as Qlik’s QVD file
- --qvx
-
whatever file’s suffix, write the file as Qlik’s QVX file
- --xml
-
whatever file’s suffix, write the file as XML
- --xlsx
-
whatever file’s suffix, write the file as MS Excel sheet
XML specific options:
- --document-tag=<tag>
-
this option is ignored for files other than XML. Check ‘man evl writexml’ for details.
- --record-tag=<tag>
-
this option is ignored for files other than XML. Check ‘man evl writexml’ for details.
- --vector-element-tag=<tag>
-
this option is ignored for files other than XML. Check ‘man evl writexml’ for details.
JSON specific options:
- --array-output
-
with this flag the JSON output is an array of records, i.e. ‘[{...},{...},...,{...}]’
- --omit-null-fields
-
this option is ignored for files other than JSON. Check ‘man evl writejson’ for details.
Examples
When a password is needed in the following examples, it is taken from the $HOME/.evlpass file.
- Write local CSV file in EVL graph (an EVS file):
  TARGET_FILE="/home/myself/file.csv"
  ...
  Map   FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
  Write FLOW2 $TARGET_FILE evd/f2.evd
- Write JSON file to AWS S3 bucket:
  TARGET_FILE="s3://mybucket/file.json"
  ...
  Map   FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
  Write FLOW2 $TARGET_FILE evd/f2.evd
- Write Parquet file to Hadoop file system:
  TARGET_FILE="hdfs:///some/path/file.parquet"
  ...
  Map   FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
  Write FLOW2 $TARGET_FILE evd/f2.evd
- Load gzipped CSV file to Google Storage:
  TARGET_FILE="gs://some_bucket/some/path/file.csv.gz"
  ...
  Map   FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
  Write FLOW2 $TARGET_FILE evd/f2.evd
- Load data to Postgres table:
  TARGET_FILE="postgres://tech_user@10.11.12.13:5432/my_database/my_table"
  ...
  Map   FLOW1 FLOW2 evd/f1.evd evd/f2.evd evm/f.evm
  Write FLOW2 $TARGET_FILE evd/f2.evd
- Example of standalone usage: write a gzipped CSV file with a header and validated data types over SFTP to some server:
  evl write -d 'id int sep=";", value string sep="\n"' --header -xy --validate \
    sftp://my_user@10.11.12.13:22/some/path/example.csv.gz < example.csv