O2A GeoCSV (.sdi.csv)
Basics
This specification extends the commonly known GeoCSV specification with support for basic relational structure. It also adds requirements that enable data to be used as an exchange format in automated O2A SDI dataflows (see SOPs).
Its main purpose is to serve single-source WMS/WFS layers whose data comes in a harmonised table structure and vocabulary.
File Types
This specification comprises two main file types: Layer Definition Files and Data Files.
The Layer Definition Files define the target structure of the database table that is or will be fueling a WMS/WFS layer. Data Files contain the actual data of that table.
To facilitate maintenance and extension of your data, two special data file types can be used: Join Files and Chunk Files. Both are optional concepts. Join Files enable you to reduce data redundancy. Chunk Files help you organise your data.
| File Type | File Name (Pattern) | Description |
|---|---|---|
| Layer Definition | _layer_def.csv | CSV, defines column names. |
| Layer Definition | _layer_def.csvt | CSVT, defines column types. |
| Data | <basename>.sdi.csv | CSV, contains data. |
| Data (Join) | <basename>.sdi.join.csv | CSV, contains data to be joined onto one (basic) or multiple (chunked) data files sharing the same <basename>. |
| Data (Chunk) | <basename>@<chunk>.sdi.csv | CSV, contains data. Allows one join file to be re-used for multiple data files sharing the same <basename>. |
File Types Overview
Technicalities
For all of these file types – with _layer_def.csvt being the exception – the CSV and GeoCSV specifications (giswiki.ch) apply. However, a common set of additional or more specific requirements is outlined below. Make sure to read the corresponding subsections, including the Examples section.
- delimiter: tab

  INFO
  Why tab separation? Compared to commas or semicolons, tabs rarely occur within strings. This minimises the need for quoting values or strings.
  Why not .sdi.tsv? The CSVT extension only works for files named *.csv.

- decimal separator: dot (.)
- encoding: UTF-8
- file name restrictions:
  - <basename> and <chunk>: alphanumeric, no special characters except underscore (_), dash (-), dot (.), hash symbol (#)
- column name restrictions:
  - adherence to column name conventions for basic metadata (see Metadata Vocabulary)
  - allowed characters: alphanumeric (a-zA-Z0-9), underscore (_), dot (.)
  - recommendation: lower case alphanumeric and underscores only
  - compliance with Postgres identifier naming constraints
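The technicalities above can be sketched in code. The following is a minimal, unofficial Python sketch (helper names and regexes are illustrative, not part of the specification) that writes tab-separated content and checks the naming rules:

```python
import csv
import io
import re

# Illustrative regexes for the character rules above (not an official validator):
BASENAME_RE = re.compile(r"^[A-Za-z0-9_.#-]+$")   # <basename> / <chunk>
COLUMN_RE = re.compile(r"^[a-z][a-z0-9_]*$")      # recommended lower-case style

def write_sdi_csv(stream, header, rows):
    """Write tab-separated O2A GeoCSV content (open real files with encoding='utf-8')."""
    writer = csv.writer(stream, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)

buf = io.StringIO()
write_sdi_csv(
    buf,
    ["date_time_start", "geometry", "temperature_degc"],
    [["2023-06-28T00:00:00+00", "POINT(8.5 53.5)", "28.76"]],
)
```

Note that the geometry value contains spaces but no tabs, so no quoting is needed – the motivation for the tab delimiter stated above.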
Metadata Vocabulary
Within O2A, and especially the O2A SDI ecosystem, specific vocabulary is used for specific metadata. The following lists compulsory and encouraged vocabulary to use as column headers whenever columns with the corresponding meaning are present.
WARNING
Compulsory vocabulary does not mean that the column itself is compulsory. It means that if a column with the corresponding meaning is present, it must be named accordingly.
INFO
Some vocabulary is encouraged to foster consistency among O2A-hosted OWS, but deviations are allowed (e.g. basis instead of platform).
Space/Time
| term | relevance | comment/meaning |
|---|---|---|
| date_time_start | compulsory | Date and time of the data point in ISO 8601 notation, or start of a time range. ⚠️ Note: Time zone specification (e.g. +00 for UTC) is mandatory but may differ from UTC. |
| date_time_end | compulsory | End of the time range in ISO 8601 notation. |
| elevation | compulsory | Elevation in meters. A negative value means below sea level, a positive value above sea level. See PANGAEA Geocode definition. ⚠️ Note: This is not the height/depth of the measurement/observation (unless it is taken on earth's surface) but the topographical elevation at the lon/lat position. |
| z_value | compulsory | Vertical position of the measurement (third spatial dimension), in meters. |
| z_type | compulsory | PANGAEA Geocode describing the type of z_value. |
| geometry | compulsory | Geometry in WKT notation without the third/vertical spatial dimension. The reference system must be EPSG:4326 and the unit decimal degrees. Longitude first, latitude second. The geometry type can be chosen freely; however, a simple POINT is usually the best choice. |
Vocabulary for space/time metadata
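For illustration, the value formats expected for geometry and date_time_start can be produced as follows. This is a sketch; the helper names are not part of the specification:

```python
from datetime import datetime, timezone

def wkt_point(lon, lat):
    """EPSG:4326 WKT point in decimal degrees, longitude first."""
    return f"POINT({lon} {lat})"

def iso_with_utc_offset(dt):
    """ISO 8601 date-time with an explicit time zone offset (here: UTC)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00")
```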
Acquisition
| term | relevance | comment/meaning |
|---|---|---|
| expedition | encouraged | see https://wiki.pangaea.de/wiki/Campaign |
| event | encouraged | see https://wiki.pangaea.de/wiki/Event |
| platform | encouraged | see https://wiki.pangaea.de/wiki/Basis |
| method | encouraged | Method used for measuring (see https://wiki.pangaea.de/wiki/Method#2._Methods) |
| device | encouraged | Instrument used for measuring (see https://wiki.pangaea.de/wiki/Method#1._Instruments) |
| registry_uri | compulsory | O2A Registry handle (not URN!) |
Vocabulary for data acquisition metadata
References
| term | relevance | comment/meaning |
|---|---|---|
| citation | compulsory | citation string |
| license | compulsory | license name (not license text) |
| data_url | compulsory | URL, pointing to original data source |
| metadata_url | compulsory | URL, pointing to (standard-compliant) metadata |
| sop_url | compulsory | URL, pointing to published Standard Operating Procedures |
| doi_url | compulsory | URL, pointing to according DOI |
Vocabulary for referencing metadata
Layer Definition Files
To define the desired (layer) table structure, including column types, two specific files need to be provided: _layer_def.csv and _layer_def.csvt. They define the structure of the database table.
A number of examples for layer definition files can be found in the Examples section! For even more information, see GDAL's CSV driver, which is used to process O2A GeoCSV files.
Column Names
The column names of the table are defined by the _layer_def.csv file. It is a header-only CSV file, consisting of exactly one line with exactly all column headers the target layer should contain, separated by tabs. Consider the section on metadata vocabulary within the O2A SDI when naming columns.
Three columns are mandatory, at least conditionally:
| Column Name | CSVT Type | Column Mandatoriness | Value Mandatoriness |
|---|---|---|---|
| geometry | WKT | yes | yes |
| date_time_start | DateTime | yes, if applicable | yes |
| z_type | String | yes, if z_value column present | yes, if z_value value is given |
Mandatoriness of columns
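The conditional rules in the table above can be expressed as a small check. The following is a sketch, not an official validator; the "if applicable" condition on date_time_start cannot be checked mechanically and is omitted:

```python
def check_mandatory_columns(columns):
    """Return a list of violations of the column-mandatoriness rules."""
    problems = []
    if "geometry" not in columns:
        problems.append("geometry column is missing")
    if "z_value" in columns and "z_type" not in columns:
        problems.append("z_type column is required when z_value is present")
    return problems
```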
Column Types
The column types of the table are defined by the _layer_def.csvt file. It is a CSVT file, consisting of exactly one line with column types for exactly all columns specified in the _layer_def.csv file, separated by commas (,). Supported column types are documented in the following table.
| CSVT Column Type | Meaning | (Numerical) Range |
|---|---|---|
| Integer(Boolean) | boolean | Values need to be 0 (meaning false) or 1 (meaning true). |
| Integer | whole number | -2147483648 to +2147483647 (integer in Postgres) |
| Integer64 | whole number | -9223372036854775808 to +9223372036854775807 (bigint in Postgres) |
| Real | decimal number | 15 decimal digits precision (double precision in Postgres) |
| Real(Float32) | decimal number | 6 decimal digits precision (float in Postgres) |
| String | string/text | Can contain lists/dictionaries formatted as strings (see example) |
| DateTime | date with time | |
| WKT | geometry | |
CSVT column/data types
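Since _layer_def.csv is tab-separated while _layer_def.csvt is comma-separated, the column and type counts should be cross-checked before ingest. A sketch of such a check (unofficial; function name is illustrative):

```python
# Supported CSVT types from the table above.
SUPPORTED_CSVT_TYPES = {
    "Integer(Boolean)", "Integer", "Integer64",
    "Real", "Real(Float32)", "String", "DateTime", "WKT",
}

def pair_layer_def(csv_line, csvt_line):
    """Pair each column name with its CSVT type; raise on inconsistencies."""
    columns = csv_line.rstrip("\n").split("\t")   # _layer_def.csv: tab-separated
    types = csvt_line.rstrip("\n").split(",")     # _layer_def.csvt: comma-separated
    if len(columns) != len(types):
        raise ValueError("column/type count mismatch")
    unknown = [t for t in types if t not in SUPPORTED_CSVT_TYPES]
    if unknown:
        raise ValueError(f"unsupported CSVT types: {unknown}")
    return list(zip(columns, types))
```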
Basic Data Files
Basic O2A GeoCSV data files are GeoCSV files adhering to the O2A-specific technicalities. They look much like the _layer_def.csv file, but with actual data in them. An arbitrary number of data files can contribute to the table defined by the layer definition files.
Not every data file needs to contain all columns of the target layer (except the mandatory ones). Columns defined in the layer definition but left out in a data file will hold null values in the according rows.
See examples for inspiration, especially the minimal examples #1 and #2.
Columns
Use these column names/headers in your data files:
| column header | column data type | column+value mandatory? | description | example values |
|---|---|---|---|---|
| csv_join | none | | Used to hold join keys for the use of join files. ⚠️ Note: Must not appear in layer definition files! | |
Optional Data Files
Join Files
Basic relations between files are supported. For each data file (<basename>.sdi.csv), a second CSV file (<basename>.sdi.join.csv) can be joined via the column csv_join to reduce redundancy. Both files are joined during database ingest using the following statement. Join files can hold arbitrary data and metadata columns (but not the geometry column). For details, see the according example.
```sql
select *
from <datafile>
left join <joinfile>
on <datafile>.csv_join = <joinfile>.csv_join;
```

Pseudo SQL code, demonstrating how basic data file and join file are combined during database ingest
Both data file and join file need to share the same <basename>, but the join file has to have the file extension .sdi.join.csv:

- data file: <basename>.sdi.csv
- join file: <basename>.sdi.join.csv
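The SQL above can be mimicked in plain Python for a quick local preview of the join. This is a sketch under the assumption that rows are already parsed into dicts; it is not part of the ingest tooling:

```python
def left_join(data_rows, join_rows, key="csv_join"):
    """Left-join join-file rows onto data rows via the csv_join column."""
    lookup = {row[key]: row for row in join_rows}
    joined = []
    for row in data_rows:
        # Unmatched join keys simply add no extra columns (left join semantics).
        extra = lookup.get(row.get(key), {})
        merged = {**row, **{k: v for k, v in extra.items() if k != key}}
        joined.append(merged)
    return joined
```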
Chunk Files
When multiple data files are to share one join file, they can be chunked. The data files can have different table structures as long as they adhere to the specifications of basic data files. Join files themselves cannot be chunked.
- data files: <basename>@<chunk>.sdi.csv
- join file: <basename>.sdi.join.csv (see Join Files)
For details, see the pure chunking example and the complex example.
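A sketch of how a consumer might group such files by <basename> (illustrative only; the actual SDI tooling may work differently):

```python
import re
from collections import defaultdict

# File name patterns from this specification.
DATA_RE = re.compile(r"^(?P<basename>[A-Za-z0-9_.#-]+?)(?:@(?P<chunk>[A-Za-z0-9_.#-]+))?\.sdi\.csv$")
JOIN_SUFFIX = ".sdi.join.csv"

def group_by_basename(filenames):
    """Map each <basename> to its (possibly chunked) data files and optional join file."""
    groups = defaultdict(lambda: {"data": [], "join": None})
    for name in filenames:
        if name.endswith(JOIN_SUFFIX):
            groups[name[: -len(JOIN_SUFFIX)]]["join"] = name
        elif (m := DATA_RE.match(name)):
            groups[m.group("basename")]["data"].append(name)
    return dict(groups)
```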
Examples
In this section, a number of examples showcase how this specification works.
Minimal #1
This example consists of one set of layer definition files and two data files. Both data files contain data for every column.
_layer_def.csv:

```tsv
date_time_start	geometry	temperature_degc
```

_layer_def.csvt:

```csv
DateTime,WKT,Real
```

First data file:

```tsv
date_time_start	geometry	temperature_degc
2023-06-28T00:00:00+00	POINT(8.5 53.5)	28.76
2023-06-28T01:00:00+00	POINT(8.5 53.5)	27.65
```

Second data file:

```tsv
date_time_start	geometry	temperature_degc
2023-06-28T02:00:00+00	POINT(8.5 53.5)	26.54
2023-06-28T03:00:00+00	POINT(8.5 53.5)	25.43
```

Using the input files above results in the following table.
| date_time_start | geometry | temperature_degc |
|---|---|---|
| 2023-06-28T00:00:00+00 | POINT(8.5 53.5) | 28.76 |
| 2023-06-28T01:00:00+00 | POINT(8.5 53.5) | 27.65 |
| 2023-06-28T02:00:00+00 | POINT(8.5 53.5) | 26.54 |
| 2023-06-28T03:00:00+00 | POINT(8.5 53.5) | 25.43 |
Minimal example #1: Resulting table
Minimal #2
This example consists of one set of layer definition files and two data files. The data files contain data in different columns. They also have different column order.
_layer_def.csv:

```tsv
date_time_start	geometry	temperature_degc	salinity_psu	platform
```

_layer_def.csvt:

```csv
DateTime,WKT,Real,Real,String
```

First data file:

```tsv
date_time_start	geometry	platform	temperature_degc
2023-06-28T00:00:00+00	POINT(8.5 53.5)		28.76
2023-06-28T01:00:00+00	POINT(8.5 53.5)	Black Pearl	27.65
```

Second data file:

```tsv
date_time_start	geometry	salinity_psu	platform
2023-06-28T02:00:00+00	POINT(8.5 53.5)	26.54	Flying Dutchman
2023-06-28T03:00:00+00	POINT(8.5 53.5)	25.43
```

Using the input files above results in the following table. It can be observed that the final column order is determined by the layer definition files and that varying column order in the data files does not matter. It also becomes clear how missing values are treated.
| date_time_start | geometry | temperature_degc | salinity_psu | platform |
|---|---|---|---|---|
| 2023-06-28T00:00:00+00 | POINT(8.5 53.5) | 28.76 | null | null |
| 2023-06-28T01:00:00+00 | POINT(8.5 53.5) | 27.65 | null | Black Pearl |
| 2023-06-28T02:00:00+00 | POINT(8.5 53.5) | null | 26.54 | Flying Dutchman |
| 2023-06-28T03:00:00+00 | POINT(8.5 53.5) | null | 25.43 | null |
Minimal example #2: Resulting table
Joining
This example consists of one set of layer definition files, one data file and one join file.
The data file contains data on people, the join file contains data on institutions. Both files share the column csv_join which will be used to append institution data to the people table.
_layer_def.csv:

```tsv
date_time_start	geometry	name	institute	department	division	group	phone
```

_layer_def.csvt:

```csv
DateTime,WKT,String,String,String,String,String,String
```

Data file:

```tsv
date_time_start	geometry	name	phone	csv_join
2015-01-01T01:00:00+00	POINT(8.5 53.5)	Andreas	-1744	awi-se
2019-08-01T01:00:00+00	POINT(8.5 53.5)	Robin		awi-se
2019-11-01T00:00:00+00	POINT(8.5 53.5)	Kono	-2362	awi-se
2020-01-01T00:00:00+00	POINT(8.5 53.5)	Christopher		awi-se
2020-01-01T00:00:00+00	POINT(8.5 53.5)	Max	-2561	awi-dls
2021-01-01T00:00:00+00	POINT(10.12 54.2)	Felix		geomar
2000-01-01T00:00:00+00	POINT(8.5 53.5)	Antje		awi-dir
```

Join file:

```tsv
csv_join	institute	department	division	group
awi-se	AWI	Computing & Data Centre	DATA	Software Engineering
awi-dls	AWI	Computing & Data Centre	DATA	Data Logistics Support
geomar	GEOMAR
awi-dir	AWI	Board of Directors
```

Using the input files above results in the following table.
| date_time_start | geometry | name | institute | department | division | group | phone |
|---|---|---|---|---|---|---|---|
| 2015-01-01T01:00:00+00 | POINT(8.5 53.5) | Andreas | AWI | Computing & Data Centre | DATA | Software Engineering | -1744 |
| 2019-08-01T01:00:00+00 | POINT(8.5 53.5) | Robin | AWI | Computing & Data Centre | DATA | Software Engineering | null |
| 2019-11-01T00:00:00+00 | POINT(8.5 53.5) | Kono | AWI | Computing & Data Centre | DATA | Software Engineering | -2362 |
| 2020-01-01T00:00:00+00 | POINT(8.5 53.5) | Christopher | AWI | Computing & Data Centre | DATA | Software Engineering | null |
| 2020-01-01T00:00:00+00 | POINT(8.5 53.5) | Max | AWI | Computing & Data Centre | DATA | Data Logistics Support | -2561 |
| 2021-01-01T00:00:00+00 | POINT(10.12 54.2) | Felix | GEOMAR | null | null | null | null |
| 2000-01-01T00:00:00+00 | POINT(8.5 53.5) | Antje | AWI | Board of Directors | null | null | null |
Join example: Resulting table
Chunking
This example consists of one set of layer definition files and two data files. It is much like [minimal example #1](#minimal-1), but both files share the same <basename> and their file names differ only in the @<chunk> part.
Data for this example is taken from wetterkontor.de.
_layer_def.csv:

```tsv
date_time_start	date_time_end	geometry	precipitation_mm
```

_layer_def.csvt:

```csv
DateTime,DateTime,WKT,Real
```

First chunk file:

```tsv
date_time_start	date_time_end	geometry	precipitation_mm
2022-01-01T00:00:00+00	2022-02-01T00:00:00+00	POINT(8.57 53.54)	54.9
2022-02-01T00:00:00+00	2022-03-01T00:00:00+00	POINT(8.57 53.54)	129.5
2022-03-01T00:00:00+00	2022-04-01T00:00:00+00	POINT(8.57 53.54)	24.2
2022-04-01T00:00:00+00	2022-05-01T00:00:00+00	POINT(8.57 53.54)	57.7
2022-05-01T00:00:00+00	2022-06-01T00:00:00+00	POINT(8.57 53.54)	76.1
2022-06-01T00:00:00+00	2022-07-01T00:00:00+00	POINT(8.57 53.54)	78.4
2022-07-01T00:00:00+00	2022-08-01T00:00:00+00	POINT(8.57 53.54)	60.1
2022-08-01T00:00:00+00	2022-09-01T00:00:00+00	POINT(8.57 53.54)	21.0
2022-09-01T00:00:00+00	2022-10-01T00:00:00+00	POINT(8.57 53.54)	170.9
2022-10-01T00:00:00+00	2022-11-01T00:00:00+00	POINT(8.57 53.54)	24.6
2022-11-01T00:00:00+00	2022-12-01T00:00:00+00	POINT(8.57 53.54)	47.8
2022-12-01T00:00:00+00	2023-01-01T00:00:00+00	POINT(8.57 53.54)	67.1
```

Second chunk file:

```tsv
date_time_start	date_time_end	geometry	precipitation_mm
2023-01-01T00:00:00+00	2023-02-01T00:00:00+00	POINT(8.57 53.54)	93.0
2023-02-01T00:00:00+00	2023-03-01T00:00:00+00	POINT(8.57 53.54)	41.9
2023-03-01T00:00:00+00	2023-04-01T00:00:00+00	POINT(8.57 53.54)	96.1
2023-04-01T00:00:00+00	2023-05-01T00:00:00+00	POINT(8.57 53.54)	71.0
2023-05-01T00:00:00+00	2023-06-01T00:00:00+00	POINT(8.57 53.54)	14.8
2023-06-01T00:00:00+00	2023-07-01T00:00:00+00	POINT(8.57 53.54)	48.4
```

Using the input files above results in the following table.
| date_time_start | date_time_end | geometry | precipitation_mm |
|---|---|---|---|
| 2022-01-01T00:00:00+00 | 2022-02-01T00:00:00+00 | POINT(8.57 53.54) | 54.9 |
| 2022-02-01T00:00:00+00 | 2022-03-01T00:00:00+00 | POINT(8.57 53.54) | 129.5 |
| 2022-03-01T00:00:00+00 | 2022-04-01T00:00:00+00 | POINT(8.57 53.54) | 24.2 |
| 2022-04-01T00:00:00+00 | 2022-05-01T00:00:00+00 | POINT(8.57 53.54) | 57.7 |
| 2022-05-01T00:00:00+00 | 2022-06-01T00:00:00+00 | POINT(8.57 53.54) | 76.1 |
| 2022-06-01T00:00:00+00 | 2022-07-01T00:00:00+00 | POINT(8.57 53.54) | 78.4 |
| 2022-07-01T00:00:00+00 | 2022-08-01T00:00:00+00 | POINT(8.57 53.54) | 60.1 |
| 2022-08-01T00:00:00+00 | 2022-09-01T00:00:00+00 | POINT(8.57 53.54) | 21.0 |
| 2022-09-01T00:00:00+00 | 2022-10-01T00:00:00+00 | POINT(8.57 53.54) | 170.9 |
| 2022-10-01T00:00:00+00 | 2022-11-01T00:00:00+00 | POINT(8.57 53.54) | 24.6 |
| 2022-11-01T00:00:00+00 | 2022-12-01T00:00:00+00 | POINT(8.57 53.54) | 47.8 |
| 2022-12-01T00:00:00+00 | 2023-01-01T00:00:00+00 | POINT(8.57 53.54) | 67.1 |
| 2023-01-01T00:00:00+00 | 2023-02-01T00:00:00+00 | POINT(8.57 53.54) | 93.0 |
| 2023-02-01T00:00:00+00 | 2023-03-01T00:00:00+00 | POINT(8.57 53.54) | 41.9 |
| 2023-03-01T00:00:00+00 | 2023-04-01T00:00:00+00 | POINT(8.57 53.54) | 96.1 |
| 2023-04-01T00:00:00+00 | 2023-05-01T00:00:00+00 | POINT(8.57 53.54) | 71.0 |
| 2023-05-01T00:00:00+00 | 2023-06-01T00:00:00+00 | POINT(8.57 53.54) | 14.8 |
| 2023-06-01T00:00:00+00 | 2023-07-01T00:00:00+00 | POINT(8.57 53.54) | 48.4 |
Chunking example: Resulting table
The same result could have been achieved without chunking, simply by using different <basename> values; this example purely demonstrates the chunking principle. To see what chunking is actually useful for, check out the complex example, which combines joining and chunking.
Complex
This example consists of one set of layer definition files, one unchunked data file with a join file and one set of two chunked data files with another join file.
_layer_def.csv:

```tsv
date_time_start	date_time_end	geometry	state	city	area_km²	population	unemployment_%
```

_layer_def.csvt:

```csv
DateTime,DateTime,WKT,String,String,Real,Real,Real
```

First chunked data file:

```tsv
date_time_start	date_time_end	geometry	population	csv_join
2000-01-01T00:00:00+00	2001-01-01T00:00:00+00	POINT(8.54 53.54)	120822	bhv
2001-01-01T00:00:00+00	2002-01-01T00:00:00+00	POINT(8.54 53.54)	118701	bhv
2002-01-01T00:00:00+00	2003-01-01T00:00:00+00	POINT(8.54 53.54)	119111	bhv
2003-01-01T00:00:00+00	2004-01-01T00:00:00+00	POINT(8.54 53.54)	118276	bhv
2004-01-01T00:00:00+00	2005-01-01T00:00:00+00	POINT(8.54 53.54)	117281	bhv
```

Second chunked data file:

```tsv
date_time_start	date_time_end	geometry	unemployment_%	csv_join
2005-01-01T00:00:00+00	2006-01-01T00:00:00+00	POINT(8.54 53.54)	23.7	bhv
2006-01-01T00:00:00+00	2007-01-01T00:00:00+00	POINT(8.54 53.54)	20.7	bhv
2007-01-01T00:00:00+00	2008-01-01T00:00:00+00	POINT(8.54 53.54)	18.5	bhv
2008-01-01T00:00:00+00	2009-01-01T00:00:00+00	POINT(8.54 53.54)	16.7	bhv
2009-01-01T00:00:00+00	2010-01-01T00:00:00+00	POINT(8.54 53.54)	15.4	bhv
```

Unchunked data file:

```tsv
date_time_start	date_time_end	geometry	population	unemployment_%	csv_join	comment
2000-01-01T00:00:00+00	2001-01-01T00:00:00+00	POINT(7.10 51.51)	278695		ge	unemployment data missing
2001-01-01T00:00:00+00	2002-01-01T00:00:00+00	POINT(7.10 51.51)	275835	15.3	ge
2002-01-01T00:00:00+00	2003-01-01T00:00:00+00	POINT(7.10 51.51)	274926	16.0	ge
2003-01-01T00:00:00+00	2004-01-01T00:00:00+00	POINT(7.10 51.51)	273782	17.0	ge
2004-01-01T00:00:00+00	2005-01-01T00:00:00+00	POINT(7.10 51.51)	270109	18.0	ge
2005-01-01T00:00:00+00	2006-01-01T00:00:00+00	POINT(7.10 51.51)	268102	23.4	ge
2006-01-01T00:00:00+00	2007-01-01T00:00:00+00	POINT(7.10 51.51)	266772	20.1	ge
2007-01-01T00:00:00+00	2008-01-01T00:00:00+00	POINT(7.10 51.51)	264765	16.7	ge
2008-01-01T00:00:00+00	2009-01-01T00:00:00+00	POINT(7.10 51.51)	262063	15.2	ge
2009-01-01T00:00:00+00	2010-01-01T00:00:00+00	POINT(7.10 51.51)	259744	15.1	ge
2000-01-01T00:00:00+00	2001-01-01T00:00:00+00	POINT(7.63 51.96)	265609		ms	unemployment data missing
2001-01-01T00:00:00+00	2002-01-01T00:00:00+00	POINT(7.63 51.96)	267197	6.7	ms
2002-01-01T00:00:00+00	2003-01-01T00:00:00+00	POINT(7.63 51.96)	268945	7.3	ms
2003-01-01T00:00:00+00	2004-01-01T00:00:00+00	POINT(7.63 51.96)	269579	7.8	ms
2004-01-01T00:00:00+00	2005-01-01T00:00:00+00	POINT(7.63 51.96)	270038	8.3	ms
2005-01-01T00:00:00+00	2006-01-01T00:00:00+00	POINT(7.63 51.96)	270868	9.1	ms
2006-01-01T00:00:00+00	2007-01-01T00:00:00+00	POINT(7.63 51.96)	272106	8.4	ms
2007-01-01T00:00:00+00	2008-01-01T00:00:00+00	POINT(7.63 51.96)	272951	7.1	ms
2008-01-01T00:00:00+00	2009-01-01T00:00:00+00	POINT(7.63 51.96)	273875	6.4	ms
2009-01-01T00:00:00+00	2010-01-01T00:00:00+00	POINT(7.63 51.96)	275543	6.4	ms
```

Join file for the chunked data files:

```tsv
csv_join	city	state	area_km²
bhv	Bremerhaven	Bremen	93.8
```

Join file for the unchunked data file:

```tsv
csv_join	city	state	area_km²
ge	Gelsenkirchen	NRW	302.9
ms	Münster	NRW	104.8
```

Using the input files above results in the following table.
| date_time_start | date_time_end | geometry | state | city | area_km² | population | unemployment_% |
|---|---|---|---|---|---|---|---|
| 2000-01-01T00:00:00+00 | 2001-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | 120822 | null |
| 2001-01-01T00:00:00+00 | 2002-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | 118701 | null |
| 2002-01-01T00:00:00+00 | 2003-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | 119111 | null |
| 2003-01-01T00:00:00+00 | 2004-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | 118276 | null |
| 2004-01-01T00:00:00+00 | 2005-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | 117281 | null |
| 2005-01-01T00:00:00+00 | 2006-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | null | 23.7 |
| 2006-01-01T00:00:00+00 | 2007-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | null | 20.7 |
| 2007-01-01T00:00:00+00 | 2008-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | null | 18.5 |
| 2008-01-01T00:00:00+00 | 2009-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | null | 16.7 |
| 2009-01-01T00:00:00+00 | 2010-01-01T00:00:00+00 | POINT(8.54 53.54) | Bremen | Bremerhaven | 93.8 | null | 15.4 |
| 2000-01-01T00:00:00+00 | 2001-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 278695 | null |
| 2001-01-01T00:00:00+00 | 2002-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 275835 | 15.3 |
| 2002-01-01T00:00:00+00 | 2003-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 274926 | 16.0 |
| 2003-01-01T00:00:00+00 | 2004-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 273782 | 17.0 |
| 2004-01-01T00:00:00+00 | 2005-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 270109 | 18.0 |
| 2005-01-01T00:00:00+00 | 2006-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 268102 | 23.4 |
| 2006-01-01T00:00:00+00 | 2007-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 266772 | 20.1 |
| 2007-01-01T00:00:00+00 | 2008-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 264765 | 16.7 |
| 2008-01-01T00:00:00+00 | 2009-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 262063 | 15.2 |
| 2009-01-01T00:00:00+00 | 2010-01-01T00:00:00+00 | POINT(7.10 51.51) | NRW | Gelsenkirchen | 302.9 | 259744 | 15.1 |
| 2000-01-01T00:00:00+00 | 2001-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 265609 | null |
| 2001-01-01T00:00:00+00 | 2002-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 267197 | 6.7 |
| 2002-01-01T00:00:00+00 | 2003-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 268945 | 7.3 |
| 2003-01-01T00:00:00+00 | 2004-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 269579 | 7.8 |
| 2004-01-01T00:00:00+00 | 2005-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 270038 | 8.3 |
| 2005-01-01T00:00:00+00 | 2006-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 270868 | 9.1 |
| 2006-01-01T00:00:00+00 | 2007-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 272106 | 8.4 |
| 2007-01-01T00:00:00+00 | 2008-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 272951 | 7.1 |
| 2008-01-01T00:00:00+00 | 2009-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 273875 | 6.4 |
| 2009-01-01T00:00:00+00 | 2010-01-01T00:00:00+00 | POINT(7.63 51.96) | NRW | Münster | 104.8 | 275543 | 6.4 |
Complex example: Resulting table
Lists and Dictionaries
_layer_def.csv:

```tsv
date_time_start	geometry	people_dict	people_list
```

_layer_def.csvt:

```csv
DateTime,WKT,String,String
```

Data file:

```tsv
date_time_start	geometry	people_dict	people_list
2023-06-28T00:00:00+00	POINT(8.5 53.5)	{"Alice": {"age": 18, "favorite_colour": "blue"}, "Bob": {"age": 19, "favorite_colour": "pink"}}
2023-06-28T01:00:00+00	POINT(8.5 53.5)		["Alice", "Bob", "Charlie"]
```

Using the input files above results in the following table.
| date_time_start | geometry | people_dict | people_list |
|---|---|---|---|
| 2023-06-28T00:00:00+00 | POINT(8.5 53.5) | {"Alice": {"age": 18, "favorite_colour": "blue"}, "Bob": {"age": 19, "favorite_colour": "pink"}} | null |
| 2023-06-28T01:00:00+00 | POINT(8.5 53.5) | null | ["Alice", "Bob", "Charlie"] |
List/dict example: Resulting table
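Since the example values above use JSON syntax, a consumer can decode such String cells after reading the table. A sketch, assuming JSON-formatted cells as shown (the function name is illustrative):

```python
import json

def parse_json_cell(cell):
    """Decode a JSON-formatted String cell; empty cells become None."""
    if not cell:
        return None
    return json.loads(cell)
```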
Unit Representation
There is no single way to represent units in O2A GeoCSV files. All of the following are valid. However, be consistent!

- column naming: e.g. temperature_degc
- additional unit column: e.g. temperature and temperature_unit
- string values
  - e.g. "28.76 °C" instead of "28.76"
  - will prohibit range filtering
- add unit information to the layer abstract (see Data Product Configuration specification)