csv2qvd
(since EVL 0.1)
Read <file.csv>
, guess data types or use <evd>
file or <inline_evd>
and write
QVD file to <file.qvd>
or standard output. For guessing data types (EVD) it uses utility
‘csv2evd’.
EVD is EVL data definition file, for details see man 5 evd.
Synopsis
csv2qvd <file.csv> [-o|--output=<file.qvd>] [-d|--date=<format>] [-h|--header=<field_name>,...] [-n|--no-header] [-l|--null=<string>] [-q|--quote=<char> | --optional-quote=<char>] [-s|--separator=<char>] [-t|--datetime=<format>] [--timestamp=<format>] [-a|--dos-eol | -b|--mac-eol] [-v|--verbose] csv2qvd <file.csv> (<evd>|-d <inline_evd>) [-o|--output=<file.qvd>] [-a|--dos-eol | -b|--mac-eol] [-v|--verbose] csv2qvd ( --help | --usage | --version )
Options
Standard options:
- -d, --data-definition=<inline_evd>
-
either this option or the file
<evd>
must be presented to use already defined EVD - -a, --dos-eol
-
suppose DOS end-of-line, i.e. replace CR+LF (‘\r\n’) by LF (‘\n’) on input
- -b, --mac-eol
-
suppose Mac end-of-line, i.e. replace CR (‘\r’) by LF (‘\n’) on input
- -o, --output=<file.qvd>
-
write output into
<file.qvd>
instead of standard output - -v, --verbose
-
print to STDERR info/debug messages
- --help
-
print this help and exit
- --usage
-
print short usage information and exit
- --version
-
print version and exit
EVD options:
- --date=<format>
-
by default it tries only ‘%Y-%m-%d’, then ‘%d.%m.%Y’
- -h, --header=<field_name>,...
-
use comma separated list of field names instead of header line, for example when there is no header in csv file (option ‘-n’ must be used) or when other field names should be used
- -n, --no-header
-
with this option it suppose there is no header. Fields will be named ‘field_001’, ‘field_002’, etc.
- -l, --null=<string>
-
to specify what string is used for NULL values in CSV, empty string is allowed
- --optional-quote=<char>
-
suppose optional quote character
<char>
, must be used together with ‘--separator’ - -q, --quote=<char>
-
do not guess if fields are quoted, but suppose
<char>
as quotation character - -s, --separator=<char>
-
do not guess the separator, but use
<char>
instead - -t, --datetime=<format>
-
by default it tries only ‘%Y-%m-%d %H:%M:%S’
- --timestamp=<format>
-
by default it tries only ‘%Y-%m-%d %H:%M:%S.%E*f’
Examples
- Having ‘some.csv’:
id;started;value 1;2019-06-06;some string
The command:
csv2qvd --null="NULL" some.csv > some.qvd
will produce some.qvd file with these field:
id int null="NULL" sep=";" started date("%Y-%m-%d") null="NULL" sep=";" value string null="NULL" sep="\n"
- Following invocation will guess data types, field separator and if strings are quoted or not, and
use header line for field names:
csv2qvd table.csv > table.qvd
With the ‘--verbose’ option it will write to standard error the whole EVD file which was used:
csv2qvd --verbose table.csv > table.qvd
- To skip header and use different field names:
csv2qvd --header="first_field,other_field,last_one" table.csv > table.qvd
- Case when there is no header in CSV file, but use specified field names:
csv2qvd --no-header --header="first_field,other_field,last_one" \ table.csv > table.qvd
- No header in CSV and use generated field names ‘field_001’, ‘field_002’, etc.:
csv2qvd --no-header table.csv > table.qvd
- Consider specific date format, here day of year (‘001..366’), and ‘|’ as a field
separator:
csv2qvd --date="%j" -s '|' table.csv > table.qvd
- To use own (specific or already generated) EVD file (i.e. data types definition):
csv2qvd table.csv table.evd > table.qvd
Another example
Let’s have following sample_1.csv file with header and with semicolon as a field separator:
ID;Name;Code;Price;Created at 1;item_11;11;15,85;2016-05-04 2;item_12;12;21,25;2019-11-11 3;item_13;13;12,99;2019-05-05
Then by running:
csv2qvd -v < sample_1.csv > sample_1.qvd
will guess data types and use field names from the header and produce sample_1.qvd file.
And it will also write to standard error EVL data types which were used:
Used EVD: --------- ID int null="" sep=";" Name string null="" sep=";" Code int null="" sep=";" Price decimal(8,2) null="" sep=";" Created_at date null="" sep="\n" ---------
More complex example
If guessed data types are not correct for some reason, for example the date format would be different or field which looks like an integer might be next time a string, then it is better to provide own EVD file with EVL data definition description.
Suppose following sample_2.csv file with header and with semicolon as a field separator:
ID;Name;Code;Price;Created at 1;item_11;11;15,85;05/04/2016 2;item_12;12;21,25;11/11/2019 3;item_13;13;12,99;05/05/2019
Let’s start by guessing EVD by csv2evd
utility:
csv2evd sample_2.csv > sample_2.evd
It will result in sample_2.evd:
ID int null="" sep=";" Name string null="" sep=";" Code int null="" sep=";" Price decimal(8,2) null="" sep=";" Created_at string null="" sep="\n"
We can either use option --date="%m/%d/%Y" or simply edit sample_2.evd manually. Both will result in:
ID int null="" sep=";" Name string null="" sep=";" Code int null="" sep=";" Price decimal(8,2) null="" sep=";" Created_at date("%m/%d/%Y") null="" sep="\n"