Writeparquet
(since EVL 2.0)
Write standard input or <f_in> into the <parquet> directory as files of approximately <file_size> MB each. Compression can be turned on with the --compression option.
- Writeparquet
-
is to be used in an EVS job structure definition file. <f_in> is either an input file or a flow name.
- evl writeparquet
-
is intended for standalone usage, i.e. to be invoked from the command line, reading records from standard input.
EVD and EVS are EVL definition files; for details, see evl-evd(5) and evl-evs(5).
Synopsis
Writeparquet <f_in> <parquet> (<evd>|-d <inline_evd>)
  [-x|--text-input]
  [--compression=(gzip|snappy|lz4|brotli|zstd)]
  [--size=<file_size>] [--impala]
evl writeparquet <parquet> (<evd>|-d <inline_evd>)
  [-x|--text-input]
  [--compression=(gzip|snappy|lz4|brotli|zstd)]
  [--size=<file_size>] [--impala]
  [-v|--verbose]
evl writeparquet ( --help | --usage | --version )
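The standalone form can be sketched as follows. This is only an illustration under stated assumptions: records.bin is a hypothetical file that already holds records in EVL binary format, /tmp/people.parquet is a hypothetical target directory, and the compression and size values are arbitrary picks from the documented options. The command is run only when both evl and the input file are available.

```shell
#!/bin/sh
# Hypothetical names, adjust to your environment.
input=records.bin            # assumed: records already in EVL binary format
target=/tmp/people.parquet   # directory that will receive the Parquet files

if command -v evl >/dev/null 2>&1 && [ -r "$input" ]; then
    # Records are read from standard input; the structure is given inline.
    evl writeparquet "$target" \
        -d 'id int, name string, started timestamp' \
        --compression=snappy --size=128 < "$input"
else
    echo "evl or $input not available, skipping" >&2
fi
```

For text input, the -x option would be added; the field and record separators then follow the EVD definition, see evl-evd(5).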
Options
- -d, --data-definition=<inline_evd>
-
either this option or the file <evd> must be specified. Example: -d 'id int, name string, started timestamp'
- --compression=<compression>
-
compression to be used; possible values are gzip, snappy, lz4, brotli, zstd, none. The default is 'none', i.e. no compression is applied.
- --size=<file_size>
-
approximate size of each resulting file, in MB; the default is 256 MB
- --impala
-
produce Parquet file(s) intended to be read by Apache Impala, i.e. store TIMESTAMP as INT96
- -x, --text-input
-
treat the input as text, not binary
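As a rough aid for choosing --size, the number of resulting files can be estimated by ceiling division of the expected total output size by <file_size>. The helper below is a hypothetical sketch and not part of EVL. The total output size is itself an assumption, since it depends on the data and the chosen compression, and the component only promises approximate file sizes.

```shell
#!/bin/sh
# estimate_parts TOTAL_MB [SIZE_MB]
# Hypothetical helper: ceiling division of the expected output size
# by the per-file size (256 MB is the documented default of --size).
estimate_parts() {
    total_mb=$1
    size_mb=${2:-256}
    echo $(( ($total_mb + $size_mb - 1) / $size_mb ))
}

estimate_parts 1000        # default size, prints 4
estimate_parts 1000 128    # as with --size=128, prints 8
```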
Standard options:
- --help
-
print this help and exit
- --usage
-
print short usage information and exit
- -v, --verbose
-
print info/debug messages of the component to stderr
- --version
-
print version and exit