Untitled :: Provoly documentation

How to import data using project APIs

Requirements

Use the following endpoints to initialize the data model and the dataset into which the data will be imported:

Import data

Use the import endpoint by specifying the dataset id.

To track the progress of the import, use this endpoint with the dataset version id returned by the import request.

During indexation, items are split into chunks. If the items have many properties, or if their values are large, reduce the chunk size so as not to saturate memory.

Conversely, if the data consists of many "small" items, the import can be sped up by increasing the chunk size.

To do this, go to the desired environment and edit the value of import-chunk-size in the data-virt configmap, then restart the pod.
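To make the trade-off concrete, here is a minimal Python sketch of chunked processing. It is not the actual indexation code: `chunked` is a hypothetical helper, and the only thing it illustrates is that the chunk size bounds how many items are held in memory at once, while a larger chunk size means fewer round trips.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield successive lists of at most chunk_size items (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        # Only one chunk is materialized at a time.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# 10 items with a chunk size of 4 are processed as 3 chunks: 4 + 4 + 2.
sizes = [len(chunk) for chunk in chunked(range(10), 4)]
print(sizes)
```

Wide items (many properties, large values) make each chunk heavier, which is why a smaller chunk size is safer in that case.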

CSV import

The first line of the CSV must contain the data model attributes. All attributes must be specified, even if they have no value. The separator must be a semicolon (;) and the encoding must be UTF-8.

CSV import becomes much less efficient when the file contains more than 150 columns.

Let car be a data model with the following attributes:

model : text field
color : text field
nbDoors : numeric field
creationDate : instant field
position : point field

The corresponding CSV could be:

model;color;nbDoors;creationDate;position
nissan;red;3;1977-04-22T00:00:00Z;"{""type"": ""Point"",""coordinates"": [102.0, 0.5]}"
peugeot;blue;5;1968-09-06T00:00:00Z;"{""type"": ""Point"",""coordinates"": [208.0, 0.10]}"

model;nbDoors;color;creationDate;position
nissan;3;red;1977-04-22T00:00:00Z;"{""type"": ""Point"",""coordinates"": [102.0, 0.5]}"
peugeot;;blue;;
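A file in this shape can be produced with Python's standard csv module, as in the sketch below. The function name `cars_to_csv` and the choice of "\n" as line terminator are assumptions made for illustration; the separator, the header row and the quoting of the point value follow the rules described above.

```python
import csv
import io
import json

# Attribute names of the example "car" data model.
FIELDS = ["model", "color", "nbDoors", "creationDate", "position"]


def cars_to_csv(cars):
    """Serialize car dicts to semicolon-separated CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, delimiter=";", lineterminator="\n"
    )
    writer.writeheader()
    for car in cars:
        row = dict(car)
        if row.get("position") is not None:
            # The point is written as GeoJSON text; the csv module wraps the
            # field in quotes and doubles the inner quotes automatically.
            row["position"] = json.dumps(row["position"])
        # Every attribute is written, even when it has no value.
        writer.writerow({name: row.get(name, "") for name in FIELDS})
    return buffer.getvalue()


cars = [
    {
        "model": "nissan",
        "color": "red",
        "nbDoors": 3,
        "creationDate": "1977-04-22T00:00:00Z",
        "position": {"type": "Point", "coordinates": [102.0, 0.5]},
    },
    {"model": "peugeot", "color": "blue"},
]
print(cars_to_csv(cars))
```

When writing the result to disk, open the file with encoding="utf-8" so that the encoding requirement is met.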

Geo attributes

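Geo data can be added to the CSV as a GeoJSON value. The value is enclosed in double quotes, and each double quote inside it is escaped by doubling it: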

Example 1. Geo data

"{""type"": ""Point"", ""coordinates"": [102.0, 0.5]}"

Multivalued attributes

The separator used for multivalued attributes is a pipe (|).

Example 2. String attribute with 2 values valueA and valueB

valueA|valueB
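A multivalued cell like the one above can be built by joining the individual values with a pipe. The sketch below uses a hypothetical helper, `join_multivalued`; it assumes that geometries are passed as dicts and serialized to GeoJSON text first, and it does not handle values that themselves contain a pipe, since the escaping rule for that case is not described here.

```python
import json


def join_multivalued(values):
    """Join the values of a multivalued attribute with a pipe (hypothetical helper)."""
    parts = []
    for value in values:
        # Geometries (dicts) become GeoJSON text; other values are used as text.
        parts.append(json.dumps(value) if isinstance(value, dict) else str(value))
    return "|".join(parts)


print(join_multivalued(["valueA", "valueB"]))
print(
    join_multivalued(
        [
            {"type": "Point", "coordinates": [102.0, 0.5]},
            {"type": "Point", "coordinates": [92.0, 2.5]},
        ]
    )
)
```

The returned text still contains plain double quotes. Writing it through a CSV writer (for example Python's csv module with a semicolon delimiter) wraps the cell in quotes and doubles the inner ones.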

Example 3. Geo attribute with 2 points [102.0, 0.5] and [92.0, 2.5]

"{""type"": ""Point"", ""coordinates"": [102.0, 0.5]}|{""type"": ""Point"", ""coordinates"": [92.0, 2.5]}"

Geo import

A geodetic datum is a reference frame used to measure locations on Earth.

To import geodetic data, provide a zipped folder containing at least the .shp, .dbf and .shx files. The archive may also contain the following files:

.cpg
.prj

The content-type must be application/shp.
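Before uploading, it can be useful to check that the archive contains the required files. The following sketch does this with Python's standard zipfile module; `missing_shapefile_parts` is a hypothetical helper and is not part of the product, and the demo archive with empty files exists only to exercise the check.

```python
import io
import zipfile
from pathlib import PurePosixPath

# Files that the archive must contain, identified by extension.
REQUIRED_EXTENSIONS = {".shp", ".dbf", ".shx"}


def missing_shapefile_parts(archive):
    """Return the required extensions absent from a zip archive (path or file object)."""
    with zipfile.ZipFile(archive) as zf:
        found = {PurePosixPath(name).suffix.lower() for name in zf.namelist()}
    return REQUIRED_EXTENSIONS - found


# Demo: an in-memory archive that lacks the .shx file.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    for name in ("roads.shp", "roads.dbf", "roads.prj"):
        zf.writestr(name, b"")
demo.seek(0)
print(missing_shapefile_parts(demo))  # {'.shx'}
```

An empty result means the three mandatory files are present; the optional .cpg and .prj files are not checked.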

The attribute carrying the geometry must be named the_geom in the data model, and there can only be one such attribute. All attributes must be filled in, even if their values are empty. The geometry must correspond exactly to the attribute type: it is not possible to have both lines and polygons in a dataset. However, it is possible to have both lines and multilines. To do this, set the normalizeGeo parameter to true when importing the data.

When you create a field, you can specify the Coordinate Reference System (CRS) of your choice; otherwise, WGS84 is used by default. All data will be added using the CRS specified in the field.

If the precision of the geometry is greater than 15, it will be truncated at 15.