Spark
(since EVL 2.0)
If a jar file is specified (i.e. a file matching the mask *.jar), the component invokes:
$EVL_SPARK_SUBMIT <spark_submit_options> <jar_file> --name <name>
where EVL_SPARK_SUBMIT is ‘spark-submit’ by default.
If a file other than a jar is given, the code is first built with ‘$EVL_SPARK_BUILD’ (‘sbt’ by default), and the resulting jar file is then run in the manner described above.
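The dispatch logic above can be sketched as a small shell function. This is a hypothetical illustration, not the actual EVL implementation: the function, variable defaults, and the target/ jar path produced by the build step are all assumptions made for the sketch (it only echoes the commands it would run).

```shell
# Sketch of the jar-vs-source dispatch described above (hypothetical,
# not the real EVL code). Echoes the command instead of executing it.
run_spark() {
  local file="$1" name="$2"
  # EVL_SPARK_SUBMIT defaults to 'spark-submit', EVL_SPARK_BUILD to 'sbt'
  local submit="${EVL_SPARK_SUBMIT:-spark-submit}"
  local build="${EVL_SPARK_BUILD:-sbt}"
  case "$file" in
    *.jar)
      # jar file: submit it directly
      echo "$submit $file --name $name"
      ;;
    *)
      # anything else: build first, then submit the built jar
      # (target/<base>.jar is an assumed build output path)
      echo "$build package && $submit target/${file%.scala}.jar --name $name"
      ;;
  esac
}

run_spark aggregate.jar myjob      # submits the jar as-is
run_spark aggregate.scala myjob    # builds, then submits
```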
Synopsis
Spark ( <jar_file> | <scala_source> ) [--name <name>]
evl spark ( <jar_file> | <scala_source> ) [--name <name>] [--verbose]
evl spark ( --help | --usage | --version )
Options
Standard options:
--help
    print this help and exit
--usage
    print short usage information and exit
-v, --verbose
    print the component's info/debug messages to stderr
--version
    print version and exit
Examples
Run already built scala code in YARN:
export EVL_SPARK_SUBMIT="--master yarn --executor-memory 2G --conf spark.executor.memoryOverhead=4G"
Spark aggregate_something.jar --name aggregate_something
Build and run Scala code in YARN:
export EVL_SPARK_SUBMIT="--master yarn --executor-memory 2G --conf spark.executor.memoryOverhead=4G"
Spark aggregate_something.scala --name aggregate_something