EVL – ETL Tool


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2022 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

csv2evd

(since EVL 2.2)

Read <file.csv> or standard input, guess the data types, the field separator, and whether strings are quoted, and write EVD to standard output or to <file.evd>.

The header line is used for field names; spaces are replaced by underscores.
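
The naming rule above can be illustrated with a small shell sketch. The function header_to_names is hypothetical and not part of EVL; it assumes the separator is a single character other than a space and that no field is quoted.

```shell
# Hypothetical helper (not part of EVL): split a header line on the
# given separator and replace spaces in each name by underscores.
# $1 = header line, $2 = single-character separator (not a space)
header_to_names() {
  printf '%s\n' "$1" | tr "$2" '\n' | tr ' ' '_'
}

# Prints three lines: id, start_date, value
header_to_names 'id;start date;value' ';'
```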

The separator is guessed in this order: ‘,’ (comma), ‘;’ (semicolon), ‘|’ (pipe), ‘\t’ (tab), ‘:’ (colon), ‘ ’ (space).

The quotation character is guessed in this order: double quotes, single quotes.
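
One way to picture an ordered guess of the separator is the sketch below. It is only an assumption about how such a guess could work (take the first candidate that occurs in the header line); the actual heuristic used by csv2evd is not described here, and guess_separator is a hypothetical name.

```shell
# Hypothetical helper (not part of EVL): print the first candidate
# separator, in the documented order, that occurs in the given line.
guess_separator() {
  line=$1
  tab=$(printf '\t')
  for sep in ',' ';' '|' "$tab" ':' ' '; do
    case $line in
      # Quoting "$sep" makes the pattern match the character literally.
      *"$sep"*) printf '%s\n' "$sep"; return 0 ;;
    esac
  done
  return 1   # no candidate found
}

# Prints ";"
guess_separator 'id;started;value'
```

When a guess like this picks the wrong character, for example because a comma appears inside a header name, the separator can be given explicitly with ‘-s’.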

EVD is the EVL data definition file; for details see man 5 evd.

Synopsis

csv2evd
  [<file.csv>] [-o|--output=<file.evd>]
  [--inline]
  [-d|--date=<format>]
  [-h|--header=<field_name>,...]
  [-n|--no-header]
  [-l|--null=<string>]
  [-q|--quote=<char>]
  [-s|--separator=<char>]
  [-t|--datetime=<format>]
  [-v|--verbose]

csv2evd
  ( --help | --usage | --version )

Options

-d, --date=<format>

consider <format> as the date format; by default it tries only ‘%Y-%m-%d’, then ‘%Y%m%d’, then ‘%d.%m.%Y’

-h, --header=<field_name>,...

use a comma-separated list of field names instead of the header line, for example when there is no header in the CSV file (option ‘-n’ must then be used) or when different field names are wanted

--inline

output EVD in the inline format (for example to pass the EVD to another component through its ‘-d’ option)

-n, --no-header

with this option it is assumed that there is no header line; fields will be named ‘field_001’, ‘field_002’, etc.

-l, --null=<string>

specify the string used for NULL values in the CSV; an empty string is allowed

-o, --output=<file.evd>

write output into file <file.evd> instead of standard output

-q, --quote=<char>

do not guess whether fields are quoted, but assume <char> is the quotation character

-s, --separator=<char>

do not guess the separator, but use <char> instead

-t, --datetime=<format>

consider <format> as the datetime format; by default it tries only ‘%Y-%m-%d %H:%M:%S’, then ‘%Y%m%d%H%M%S’

-v, --verbose

print info/debug messages to STDERR

--help

print this help and exit

--usage

print short usage information and exit

--version

print version and exit
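
The default date formats listed for ‘-d’ are tried in a fixed order. The sketch below illustrates such an ordered trial under a simplifying assumption: each format is checked only by its shape with a regular expression, so month and day ranges are not validated. The function guess_date_format is hypothetical and not part of EVL; the checks done by csv2evd itself may be stricter.

```shell
# Hypothetical helper (not part of EVL): print the first default date
# format whose shape matches the value, trying them in documented order.
guess_date_format() {
  if printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'; then
    printf '%s\n' '%Y-%m-%d'
  elif printf '%s\n' "$1" | grep -Eq '^[0-9]{8}$'; then
    printf '%s\n' '%Y%m%d'
  elif printf '%s\n' "$1" | grep -Eq '^[0-9]{2}\.[0-9]{2}\.[0-9]{4}$'; then
    printf '%s\n' '%d.%m.%Y'
  else
    return 1   # none of the default formats fits
  fi
}

# Prints "%Y-%m-%d"
guess_date_format '2019-06-06'
```

A value such as a day of year (‘%j’) matches none of these shapes, which is the kind of case where the format has to be given explicitly with ‘-d’.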

Examples

  1. Having table.csv:
    id;started;value
    1;2019-06-06;some string
    

    This command:

    csv2evd table.csv
    

    will try to guess the data types, the field separator and whether strings are quoted, and will use the header line for field names, producing EVD on standard output:

    id       int              null="" sep=";"
    started  date("%Y-%m-%d") null="" sep=";"
    value    string           null="" sep="\n"
    
  2. An alternative invocation, redirecting the output EVD to a file:
    csv2evd < table.csv > table.evd
    
  3. To skip the header and use different field names:
    csv2evd --header="first_field,other_field,last_one" \
      table.csv > table.evd
    
  4. When there is no header in the CSV file and the specified field names should be used:
    csv2evd --no-header --header="first_field,other_field,last_one" \
      table.csv > table.evd
    
  5. When there is no header in the CSV file and generated field names ‘field_001’, ‘field_002’, etc. should be used:
    csv2evd --no-header table.csv > table.evd
    
  6. To consider a specific date format, here day of year (‘001..366’), with ‘|’ as the field separator:
    csv2evd --date="%j" -s '|' table.csv > table.evd
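
The options that switch off guessing can also be combined. The following sketch uses only options listed in the Synopsis; the file names quoted.csv and quoted.evd and the sample data are made up for illustration, and the resulting EVD is not shown because it depends on what csv2evd infers from the data.

```shell
# Create a small sample file (hypothetical data) with double-quoted
# strings, a comma separator and the literal string NULL for missing values.
cat > quoted.csv <<'EOF'
id,name,note
1,"Alice",NULL
2,"Bob","second row"
EOF

# Run csv2evd only when it is installed; separator, quotation character
# and NULL string are stated explicitly instead of being guessed.
if command -v csv2evd >/dev/null 2>&1; then
  csv2evd --separator=',' --quote='"' --null='NULL' \
    --output=quoted.evd quoted.csv
else
  echo 'csv2evd not found, skipping' >&2
fi
```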