EVL – QVD Utils


Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2023 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

csv2qvd

(since EVL 0.1)

Read <file.csv>, guess data types or use <evd> file or <inline_evd> and write QVD file to <file.qvd> or standard output. For guessing data types (EVD) it uses utility ‘csv2evd’.

EVD is EVL data definition file, for details see man 5 evd.

Synopsis

csv2qvd
  <file.csv>
  [-o|--output=<file.qvd>]
  [-d|--date=<format>]
  [-h|--header=<field_name>,...]
  [-n|--no-header]
  [-l|--null=<string>]
  [-q|--quote=<char> | --optional-quote=<char>]
  [-s|--separator=<char>]
  [-t|--datetime=<format>]
  [--timestamp=<format>]
  [-a|--dos-eol | -b|--mac-eol]
  [-v|--verbose]

csv2qvd
  <file.csv> (<evd>|-d <inline_evd>)
  [-o|--output=<file.qvd>]
  [-a|--dos-eol | -b|--mac-eol]
  [-v|--verbose]

csv2qvd
  ( --help | --usage | --version )

Options

Standard options:

-d, --data-definition=<inline_evd>

either this option or the file <evd> must be presented to use already defined EVD

-a, --dos-eol

suppose DOS end-of-line, i.e. replace CR+LF (‘\r\n’) by LF (‘\n’) on input

-b, --mac-eol

suppose Mac end-of-line, i.e. replace CR (‘\r’) by LF (‘\n’) on input

-o, --output=<file.qvd>

write output into <file.qvd> instead of standard output

-v, --verbose

print to STDERR info/debug messages

--help

print this help and exit

--usage

print short usage information and exit

--version

print version and exit

EVD options:

--date=<format>

by default it tries only ‘%Y-%m-%d’, then ‘%d.%m.%Y

-h, --header=<field_name>,...

use comma separated list of field names instead of header line, for example when there is no header in csv file (option ‘-n’ must be used) or when other field names should be used

-n, --no-header

with this option it suppose there is no header. Fields will be named ‘field_001’, ‘field_002’, etc.

-l, --null=<string>

to specify what string is used for NULL values in CSV, empty string is allowed

--optional-quote=<char>

suppose optional quote character <char>, must be used together with ‘--separator

-q, --quote=<char>

do not guess if fields are quoted, but suppose <char> as quotation character

-s, --separator=<char>

do not guess the separator, but use <char> instead

-t, --datetime=<format>

by default it tries only ‘%Y-%m-%d %H:%M:%S

--timestamp=<format>

by default it tries only ‘%Y-%m-%d %H:%M:%S.%E*f

Examples

  1. Having ‘some.csv’:
    id;started;value
    1;2019-06-06;some string
    

    The command:

    csv2qvd --null="NULL" some.csv > some.qvd
    

    will produce some.qvd file with these field:

    id       int               null="NULL"  sep=";"
    started  date("%Y-%m-%d")  null="NULL"  sep=";"
    value    string            null="NULL"  sep="\n"
    
  2. Following invocation will guess data types, field separator and if strings are quoted or not, and use header line for field names:
    csv2qvd table.csv > table.qvd
    

    With the ‘--verbose’ option it will write to standard error the whole EVD file which was used:

    csv2qvd --verbose table.csv > table.qvd
    
  3. To skip header and use different field names:
    csv2qvd --header="first_field,other_field,last_one"
      table.csv > table.qvd
    
  4. Case when there is no header in CSV file, but use specified field names:
    csv2qvd --no-header --header="first_field,other_field,last_one" \
      table.csv > table.qvd
    
  5. No header in CSV and use generated field names ‘field_001’, ‘field_002’, etc.:
    csv2qvd --no-header table.csv > table.qvd
    
  6. Consider specific date format, here day of year (‘001..366’), and ‘|’ as a field separator:
    csv2qvd --date="%j" -s '|' table.csv > table.qvd
    
  7. To use own (specific or already generated) EVD file (i.e. data types definition):
    csv2qvd table.csv table.evd > table.qvd
    

Another example

Let’s have following sample_1.csv file with header and with semicolon as a field separator:

ID;Name;Code;Price;Created at
1;item_11;11;15,85;2016-05-04
2;item_12;12;21,25;2019-11-11
3;item_13;13;12,99;2019-05-05

Then by running:

csv2qvd -v < sample_1.csv > sample_1.qvd

will guess data types and use field names from the header and produce sample_1.qvd file.

And it will also write to standard error EVL data types which were used:

Used EVD:
---------
ID         int           null="" sep=";"
Name       string        null="" sep=";"
Code       int           null="" sep=";"
Price      decimal(8,2)  null="" sep=";"
Created_at date          null="" sep="\n"
---------

More complex example

If guessed data types are not correct for some reason, for example the date format would be different or field which looks like an integer might be next time a string, then it is better to provide own EVD file with EVL data definition description.

Suppose following sample_2.csv file with header and with semicolon as a field separator:

ID;Name;Code;Price;Created at
1;item_11;11;15,85;05/04/2016
2;item_12;12;21,25;11/11/2019
3;item_13;13;12,99;05/05/2019

Let’s start by guessing EVD by csv2evd utility:

csv2evd sample_2.csv > sample_2.evd

It will result in sample_2.evd:

ID         int          null="" sep=";"
Name       string       null="" sep=";"
Code       int          null="" sep=";"
Price      decimal(8,2) null="" sep=";"
Created_at string       null="" sep="\n"

We can either use option --date="%m/%d/%Y" or simply edit sample_2.evd manually. Both will result in:

ID         int              null="" sep=";"
Name       string           null="" sep=";"
Code       int              null="" sep=";"
Price      decimal(8,2)     null="" sep=";"
Created_at date("%m/%d/%Y") null="" sep="\n"