EVL – ETL Tool

Products, services and company names referenced in this document may be either trademarks or registered trademarks of their respective owners.

Copyright © 2017–2021 EVL Tool, s.r.o.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts.

Generate

(since EVL 1.3)

Generates records according to a data definition (EVD file) and writes them to stdout, an output flow, or a file. EVD is the EVL data definition file format; for details see evl-evd(5).

When no <config_file> is specified:

Number data types

random values from the whole range of the given data type are generated

Date, timestamp

random values between 1970-01-01 and 2199-12-31 are generated

String

a string of random characters [a-zA-Z0-9] with a length between 0 and 10 is generated

Vector

a random number of elements, between 0 and 10, is generated
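
The defaults above can be illustrated with a minimal Python sketch. It is not EVL code and does not call EVL. The function names are hypothetical, the bounds are assumed to be inclusive, and the 32-bit signed width used for the integer case is an assumption chosen only for illustration.

```python
import datetime
import random
import string

# Character set named in the text: [a-zA-Z0-9]
ALPHABET = string.ascii_letters + string.digits


def default_int(rng, bits=32):
    """Integer from the whole range of the type (32-bit signed is an assumption)."""
    return rng.randint(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)


def default_date(rng):
    """Date between 1970-01-01 and 2199-12-31 (bounds assumed inclusive)."""
    low = datetime.date(1970, 1, 1).toordinal()
    high = datetime.date(2199, 12, 31).toordinal()
    return datetime.date.fromordinal(rng.randint(low, high))


def default_string(rng):
    """String of length 0..10 made of characters from ALPHABET."""
    length = rng.randint(0, 10)
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def default_vector(rng, make_element):
    """Vector with 0..10 elements, each produced by make_element."""
    return [make_element(rng) for _ in range(rng.randint(0, 10))]


rng = random.Random(0)
print(default_int(rng), default_date(rng), repr(default_string(rng)))
print(default_vector(rng, default_date))
```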

When a <config_file> in JSON format is specified, the following can be set for each data type:

Number data types

range, values, probability of nulls

Date, timestamp

range, values, probability of nulls

String

range, values, min-length, max-length, probability of nulls

Vector

range, values, min-elements, max-elements, probability of nulls

When both a probability of nulls and a list of values containing null are specified, only the probability is used. When ranges and values overlap, the overlap has no effect on the probability: every distinct value has the same probability of being generated. See the JSON examples below for details.
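
One way to read these two rules is sketched below in Python. This is an illustration of the described behaviour, not EVL's implementation. The names `build_pool` and `generate` are hypothetical, and treating a null listed in the values as one more equally likely candidate when no probability is given is an assumption.

```python
import random


def build_pool(ranges, values):
    """Union of all inclusive integer ranges and explicit non-null values.

    Using a set means a value covered by both a range and the values
    list is counted once, so overlaps do not change its probability.
    """
    pool = set()
    for low, high in ranges:
        pool.update(range(low, high + 1))
    pool.update(v for v in values if v is not None)
    return sorted(pool)


def generate(rng, ranges=(), values=(), null_prob=None):
    """Draw one value; an explicit null probability overrides a null in values."""
    pool = build_pool(ranges, values)
    if null_prob is not None:
        # Explicit probability wins; a null listed in values is ignored.
        if rng.random() < null_prob:
            return None
    elif None in values:
        # Assumption: without a probability, null is just another candidate.
        pool = pool + [None]
    return rng.choice(pool)


rng = random.Random(0)
print(build_pool([(0, 3)], [2, 10]))  # 2 appears only once in the pool
print([generate(rng, [(0, 3)], [None, 10], null_prob=0.5) for _ in range(8)])
```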

Synopsis

syntax/Generate
Generate
  <f_out> (<evd>|-d <inline_evd>) [<config_file>]
  [-n|--records <num>] [-y|--text-output]

evl generate
  (<evd>|-d <inline_evd>) [<config_file>]
  [-n|--records <num>] [-y|--text-output]
  [-v|--verbose]

evl generate
  ( --help | --usage | --version )

Options

-d, --data-definition=<inline_evd>

either this option or the <evd> file must be specified. Example: -d 'id int, user_id string(6) enc=iso-8859-1'

-n, --records=<num>

generate <num> records instead of the default of one record

-y, --text-output

write the output as text, not binary

Standard options:

--help

print this help and exit

--usage

print short usage information and exit

-v, --verbose

print info/debug messages of the component to stderr

--version

print version and exit

Examples

  1. Print to stdout one random uchar:
    evl generate -d 'value uchar' -y
    
  2. Example of config JSON file:
    {
      "int_field": {
        "values": [100, 200, 500],
        "range": { "min": 0, "max": 10 },
        "range": { "min": 50, "max": 60 },
        "null": 0.1
      },
      "float_field": {
        "range": { "min": 0, "max": 100 }
      },
      "date_field": {
        "values": [ null, "2018-03-07", "2018-03-08" ]
      },
      "struct_field.string_field1": {
        "min-length": 10,
        "max-length": 20
      },
      "struct_field.string_field2": {
        "values": ["abc", "def", "ghi", "jkl"]
      },
      "struct_field.decimal_field": {
        "range": { "min": "0.00", "max": "100.00" }
      },
      "vector_field": {
        "min-elements": 2,
        "max-elements": 5
      },
      "vector_field[]": {
        "range": { "min": "2018-03-07 05:00:00", "max": "2018-03-07 14:00:00" }
      }
    }
    

    where corresponding evd is:

    int_field        int           sep="|"  null=""
    float_field      float         sep="|"
    date_field       date          sep="|"  null=""
    struct_field     struct        sep="|"
      string_field1  string        sep=";"
      string_field2  string        sep=";"
      decimal_field  decimal(5.2)  sep=";"
    vector_field     vector        sep="\n"
      timestamp                    sep=","
    

    For ‘int_field’, values are drawn at random from 0, 1, ..., 10, 50, ..., 60, 100, 200, 500, and in 10% of cases a ‘NULL’ value is generated instead.
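
The ‘int_field’ entry above repeats the "range" key within one JSON object. If such a config file is inspected or pre-processed with a generic JSON library, repeated keys may be silently collapsed. The Python sketch below (not part of EVL; the helper name `keep_duplicates` is hypothetical) shows that the standard `json` module keeps only the last "range" by default, and how `object_pairs_hook` can retain both so that the candidate values can be counted.

```python
import json

CONFIG = """
{
  "int_field": {
    "values": [100, 200, 500],
    "range": { "min": 0, "max": 10 },
    "range": { "min": 50, "max": 60 },
    "null": 0.1
  }
}
"""


def keep_duplicates(pairs):
    """Collect the values of repeated keys into lists instead of overwriting."""
    merged = {}
    for key, value in pairs:
        merged.setdefault(key, []).append(value)
    return merged


# Plain json.loads keeps only the last "range" entry of the object.
naive_range = json.loads(CONFIG)["int_field"]["range"]

# With the hook, every key in every object maps to a list of its values.
field = json.loads(CONFIG, object_pairs_hook=keep_duplicates)["int_field"][0]

# Distinct candidate values: both ranges (inclusive) plus the explicit values.
pool = set(field["values"][0])
for r in field["range"]:
    pool.update(range(r["min"][0], r["max"][0] + 1))

null_prob = field["null"][0]
print(naive_range)
print(len(pool), null_prob)
```

Under the uniform reading described above, the 11 + 11 + 3 = 25 distinct values would each be generated with probability 0.9 / 25 = 0.036, with the remaining 0.1 going to ‘NULL’.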