Metadata specification
Background
In the OpenEPI project, it is important to establish a clear set of rules for datasets, known as a metadata specification. This is more than just a formality – it's a way to guide our partners, make data easy to find, ensure consistent quality, and create combined datasets with shared information.
Firstly, having a metadata specification acts as a guide for our partners. It tells them exactly what their datasets need to have to be a part of OpenEPI. This helps everyone understand what's required and keeps things consistent across all the different datasets.
Secondly, it helps make data more discoverable in OpenEPI. By using a standard set of details about each dataset, we make it easier for users to find and understand the information they're looking for. This makes the whole OpenEPI system more user-friendly.
Another important point is maintaining a consistent level of quality across all datasets. The metadata specification sets the standards for data quality, ensuring that every dataset meets the same requirements. This consistency is essential for building trust in the accuracy and reliability of the information within OpenEPI.
Lastly, having a metadata specification is crucial when we want to combine or compare datasets. It ensures that different datasets can be brought together seamlessly because they all follow the same set of rules. This harmonization is key for conducting thorough research and analysis within the OpenEPI project.
Specification
This specification details the metadata required for datasets that aim to be part of OpenEPI. It does not specify anything about the actual data contained in the dataset. The specification is inspired by and adapted from https://github.com/awslabs/open-data-registry. It also follows the FAIR principles. The metadata should be specified in either JSON or YAML, and be encoded in UTF-8.
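As a minimal sketch of the encoding requirement, the snippet below decodes raw bytes strictly as UTF-8 before parsing them as JSON. The function name `load_metadata_json` and the sample document are hypothetical and only serve as illustration; YAML input would need a third-party parser and is not covered here.

```python
import json


def load_metadata_json(raw: bytes) -> dict:
    """Decode UTF-8 bytes and parse them as a JSON metadata document."""
    # bytes.decode raises UnicodeDecodeError if the input is not valid UTF-8.
    text = raw.decode("utf-8")
    document = json.loads(text)
    if not isinstance(document, dict):
        raise ValueError("metadata must be a JSON object at the top level")
    return document


# Hypothetical sample containing a non-ASCII character.
sample = '{"title": "Example dataset", "details": "Tromsø"}'.encode("utf-8")
metadata = load_metadata_json(sample)
print(metadata["details"])
```

Decoding explicitly, rather than relying on a platform default, makes a wrongly encoded file fail early instead of producing garbled region or place names.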
Relevance to the INSPIRE Directive
In developing our metadata specification, we balanced the requirements of the INSPIRE Directive with a streamlined approach for broader usability and simplicity. While simplifying the format, we retained some key field names from INSPIRE, adapting others for clarity and accessibility. The use of JSON and YAML formats further enhances this accessibility and simplicity. Our approach, though simplified, preserves the core values of INSPIRE, ensuring our metadata remains relevant and valuable.
Fields
Top level
Field | Required | Type | Description |
---|---|---|---|
Identifier | Y | UUID | Universally unique identifier (UUID) for this dataset |
Title | Y | String | Title of the dataset |
Abstract | Y | String | Short description of data contained in the dataset |
Documentation | | URL | Link to where documentation about the dataset can be found |
Data producer | | String | Name of the person(s) that created this dataset |
Distribution agency | Y | String | The organization or entity that published this dataset |
Contact | Y | String | Contact information for the responsible publisher |
Release date | Y | Date | The date on which the dataset was made publicly available or released for use |
Version | | Number | The version of this dataset, e.g. 2.0 |
Maintenance and update frequency | Y | String | Description of how often this dataset is updated (Daily, Weekly, Monthly, Annually, As needed, Irregular, Never) |
Temporal extent | | Object | See Temporal extent |
Spatial extent | | Object | See Spatial extent |
License | Y | Object | See License |
Keywords | | List of strings | Keywords/tags for the dataset that can be used in filtering |
Resources | Y | List of objects | See Resources |
Examples | | List of objects | See Examples |
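The required fields in the table above can be checked mechanically. The following is a rough sketch, not an official validator: `check_top_level`, the sample record, and its values are hypothetical. Keys use the lowercase, underscore-separated form described under Data format, and treating the listed update frequencies as a closed set is an assumption.

```python
import uuid

# Fields marked "Y" in the top-level table, in their JSON/YAML key form.
REQUIRED_TOP_LEVEL = (
    "identifier", "title", "abstract", "distribution_agency", "contact",
    "release_date", "maintenance_and_update_frequency", "license", "resources",
)

# Assumption: the values listed in the table form a closed set.
UPDATE_FREQUENCIES = (
    "Daily", "Weekly", "Monthly", "Annually", "As needed", "Irregular", "Never",
)


def check_top_level(metadata: dict) -> list:
    """Return human-readable problems found in the top-level fields."""
    problems = [
        f"missing required field: {key}"
        for key in REQUIRED_TOP_LEVEL
        if key not in metadata
    ]
    identifier = metadata.get("identifier")
    if identifier is not None:
        try:
            uuid.UUID(str(identifier))  # raises ValueError if malformed
        except ValueError:
            problems.append(f"identifier is not a valid UUID: {identifier!r}")
    frequency = metadata.get("maintenance_and_update_frequency")
    if frequency is not None and frequency not in UPDATE_FREQUENCIES:
        problems.append(f"unknown update frequency: {frequency!r}")
    return problems


# Hypothetical record; all values are placeholders.
record = {
    "identifier": str(uuid.uuid4()),
    "title": "Example dataset",
    "abstract": "Placeholder abstract.",
    "distribution_agency": "Example Agency",
    "contact": "data@example.org",
    "release_date": "2024-02-15",
    "maintenance_and_update_frequency": "Annually",
    "license": {"name": "Example License"},
    "resources": [{"type": "File", "url": "https://example.com/data.csv"}],
}
print(check_top_level(record))  # an empty list means no problems were found
```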
Temporal extent
Field | Required | Type | Description |
---|---|---|---|
Start date | Y | Date | The start date (inclusive) the data covers |
End date | Y | Date | The end date (inclusive) the data covers |
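Because both dates are inclusive, a consumer can test whether a given day falls inside the extent with a closed-interval comparison. The sketch below assumes ISO 8601 date strings (YYYY-MM-DD); the helper names `parse_temporal_extent` and `covers` are hypothetical.

```python
from datetime import date
from typing import Tuple


def parse_temporal_extent(extent: dict) -> Tuple[date, date]:
    """Parse start_date and end_date and check that they are ordered."""
    start = date.fromisoformat(extent["start_date"])
    end = date.fromisoformat(extent["end_date"])
    if end < start:
        raise ValueError("end_date must not be earlier than start_date")
    return start, end


def covers(extent: dict, day: date) -> bool:
    """Return True if day lies within the extent, both ends inclusive."""
    start, end = parse_temporal_extent(extent)
    return start <= day <= end


extent = {"start_date": "2020-01-01", "end_date": "2023-12-31"}
print(covers(extent, date(2023, 12, 31)))  # last covered day
print(covers(extent, date(2024, 1, 1)))    # first day after the extent
```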
Spatial extent
Field | Required | Type | Description |
---|---|---|---|
Type | Y | String | Either “GLOBAL” or “REGION” |
Region | | String | Required if Type is “REGION”. String representing the region the data covers |
Details | | String | Additional information about the region. Can for example be a comma-separated list of countries or regions within a country |
Coordinates | | List of coordinates | Bounding box coordinates in the following order: min_lon, min_lat, max_lon, max_lat |
Spatial resolution | | String | Description of the resolution of data, e.g. 10m x 10m or 5° x 5° |
Coordinate reference system | | String | Definition of the geographical coordinate reference system |
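The conditional rule on Region and the coordinate order are easy to get wrong, so a small check can help. This is an illustrative sketch only: `check_spatial_extent` is a hypothetical helper, and it assumes coordinates are numeric decimal degrees, which the specification does not state explicitly.

```python
def check_spatial_extent(extent: dict) -> list:
    """Return a list of problems found in a spatial_extent object."""
    problems = []
    kind = extent.get("type")
    if kind not in ("GLOBAL", "REGION"):
        problems.append("type must be GLOBAL or REGION")
    if kind == "REGION" and not extent.get("region"):
        problems.append("region is required when type is REGION")

    coordinates = extent.get("coordinates")
    if coordinates is not None:
        if len(coordinates) != 4:
            problems.append("coordinates must hold exactly four values")
        else:
            # Order defined by the specification.
            min_lon, min_lat, max_lon, max_lat = coordinates
            # Assumption: decimal degrees, so longitudes lie in [-180, 180]
            # and latitudes in [-90, 90].
            if not all(-180 <= lon <= 180 for lon in (min_lon, max_lon)):
                problems.append("longitude out of range")
            if not -90 <= min_lat <= max_lat <= 90:
                problems.append("latitudes out of range or min_lat > max_lat")
    return problems


norway = {
    "type": "REGION",
    "region": "Norway",
    "coordinates": [4.88, 57.99, 31.10, 71.38],  # min_lon, min_lat, max_lon, max_lat
}
print(check_spatial_extent(norway))
```

The check does not compare min_lon with max_lon, since a bounding box that crosses the antimeridian can legitimately have min_lon greater than max_lon.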
License
Field | Required | Type | Description |
---|---|---|---|
Name | Y | String | The name of the license |
URL | | URL | URL to the license text |
Description | | String | Short description of the license |
Resources
Field | Required | Type | Description |
---|---|---|---|
Type | Y | String | Type of resource, e.g. API, File, DB |
Description | | String | Description of this resource |
Data specification | | String | Specification of the standard used in the data contents |
URL | Y | URL | URL for direct access to this resource |
Examples
Field | Required | Type | Description |
---|---|---|---|
Type | Y | String | Type of example, e.g. Application, Publication, Tutorial |
Description | | String | Description of this example |
URL | Y | URL | URL to the example use of the dataset |
Data format
We support metadata in both JSON and YAML format.
Field names undergo a conversion for standardization and compatibility. This conversion involves two steps:
- Translation to Lowercase: All characters in the field name are converted to lowercase. This uniformity ensures consistency across different systems and platforms.
- Replacing Spaces with Underscores: Spaces within the field names are replaced with underscores ( _ ).
For example, a field name like "Spatial Resolution" in a standard text format would be converted to "spatial_resolution" in JSON and YAML. This approach maintains readability while adhering to the naming conventions commonly used in programming and data formats.
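The two conversion steps can be sketched as a one-line function. The name `to_field_key` is hypothetical and not part of the specification.

```python
def to_field_key(field_name: str) -> str:
    """Convert a human-readable field name to its JSON/YAML key."""
    # Step 1: lowercase everything. Step 2: replace spaces with underscores.
    return field_name.lower().replace(" ", "_")


print(to_field_key("Spatial Resolution"))
print(to_field_key("Maintenance and update frequency"))
```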
Examples
JSON
{
  "identifier": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "title": "Norway Weather Data 2020-2023",
  "abstract": "Comprehensive weather data collected across Norway, including temperature, precipitation, and wind speed measurements.",
  "documentation": "https://met.no/en/services/norway-weather-data-documentation",
  "data_producer": "Norwegian Meteorological Institute",
  "distribution_agency": "Norwegian Meteorological Institute",
  "contact": "[email protected]",
  "release_date": "2024-02-15",
  "version": 1.0,
  "maintenance_and_update_frequency": "Annually",
  "temporal_extent": {
    "start_date": "2020-01-01",
    "end_date": "2023-12-31"
  },
  "spatial_extent": {
    "type": "REGION",
    "region": "Norway",
    "details": "Oslo, Bergen, Trondheim, Stavanger, Tromsø",
    "coordinates": [4.88, 57.99, 31.10, 71.38],
    "spatial_resolution": "5km x 5km",
    "coordinate_reference_system": "ETRS89 / UTM zone 32N"
  },
  "license": {
    "name": "Open Government Licence - Norway",
    "url": "https://data.norge.no/nlod/en/1.0",
    "description": "License allowing free use and distribution of the data, as long as the Norwegian Meteorological Institute is credited."
  },
  "keywords": ["Weather", "Climate", "Norway", "Meteorology"],
  "resources": [
    {
      "type": "API",
      "description": "Real-time weather data API",
      "url": "https://api.met.no/weatherapi/locationforecast/2.0/"
    },
    {
      "type": "File",
      "description": "Historical weather data files",
      "url": "https://met.no/en/free-meteorological-data/Historical-data"
    }
  ],
  "examples": [
    {
      "type": "Application",
      "description": "Weather forecasting app using this dataset",
      "url": "https://example.com/weatherapp"
    },
    {
      "type": "Research",
      "description": "Study on climate trends in Norway based on this dataset",
      "url": "https://example.com/climatetrendsresearch"
    }
  ]
}
YAML
identifier: f47ac10b-58cc-4372-a567-0e02b2c3d479
title: Norway Weather Data 2020-2023
abstract: Comprehensive weather data collected across Norway, including temperature, precipitation, and wind speed measurements.
documentation: https://met.no/en/services/norway-weather-data-documentation
data_producer: Norwegian Meteorological Institute
distribution_agency: Norwegian Meteorological Institute
contact: "[email protected]"
release_date: 2024-02-15
version: 1.0
maintenance_and_update_frequency: Annually
temporal_extent:
  start_date: 2020-01-01
  end_date: 2023-12-31
spatial_extent:
  type: REGION
  region: Norway
  details: Oslo, Bergen, Trondheim, Stavanger, Tromsø
  coordinates: [4.88, 57.99, 31.10, 71.38]
  spatial_resolution: 5km x 5km
  coordinate_reference_system: ETRS89 / UTM zone 32N
license:
  name: Open Government Licence - Norway
  url: https://data.norge.no/nlod/en/1.0
  description: License allowing free use and distribution of the data, as long as the Norwegian Meteorological Institute is credited.
keywords: [Weather, Climate, Norway, Meteorology]
resources:
  - type: API
    description: Real-time weather data API
    url: https://api.met.no/weatherapi/locationforecast/2.0/
  - type: File
    description: Historical weather data files
    url: https://met.no/en/free-meteorological-data/Historical-data
examples:
  - type: Application
    description: Weather forecasting app using this dataset
    url: https://example.com/weatherapp
  - type: Research
    description: Study on climate trends in Norway based on this dataset
    url: https://example.com/climatetrendsresearch
JSON Schema
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "identifier": {
      "type": "string",
      "format": "uuid"
    },
    "title": {
      "type": "string"
    },
    "abstract": {
      "type": "string"
    },
    "documentation": {
      "type": "string",
      "format": "uri"
    },
    "data_producer": {
      "type": "string"
    },
    "distribution_agency": {
      "type": "string"
    },
    "contact": {
      "type": "string"
    },
    "release_date": {
      "type": "string",
      "format": "date"
    },
    "version": {
      "type": "number"
    },
    "maintenance_and_update_frequency": {
      "type": "string"
    },
    "temporal_extent": {
      "type": "object",
      "properties": {
        "start_date": {
          "type": "string",
          "format": "date"
        },
        "end_date": {
          "type": "string",
          "format": "date"
        }
      },
      "required": ["start_date", "end_date"]
    },
    "spatial_extent": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string"
        },
        "region": {
          "type": "string"
        },
        "details": {
          "type": "string"
        },
        "coordinates": {
          "type": "array",
          "items": {
            "type": "number"
          }
        },
        "spatial_resolution": {
          "type": "string"
        },
        "coordinate_reference_system": {
          "type": "string"
        }
      },
      "required": ["type"]
    },
    "license": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "url": {
          "type": "string",
          "format": "uri"
        },
        "description": {
          "type": "string"
        }
      },
      "required": ["name"]
    },
    "keywords": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "resources": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "type": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "data_specification": {
            "type": "string"
          },
          "url": {
            "type": "string",
            "format": "uri"
          }
        },
        "required": ["type", "url"]
      }
    },
    "examples": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "type": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "url": {
            "type": "string",
            "format": "uri"
          }
        },
        "required": ["type", "url"]
      }
    }
  },
  "required": [
    "identifier",
    "title",
    "abstract",
    "distribution_agency",
    "contact",
    "release_date",
    "maintenance_and_update_frequency",
    "temporal_extent",
    "spatial_extent",
    "license",
    "resources"
  ]
}
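To illustrate how the "required" lists in the schema nest, the sketch below walks a schema and a document together and reports missing keys with their paths. It is deliberately not a full JSON Schema validator: it ignores "type" and "format" checks, the function name `missing_required` is hypothetical, and the schema fragment is a shortened excerpt used only for the demonstration. A complete validation would normally rely on a dedicated JSON Schema library.

```python
def missing_required(schema: dict, document: dict, path: str = "") -> list:
    """List required keys that are absent, following nested objects and arrays."""
    missing = [path + key for key in schema.get("required", []) if key not in document]
    for key, subschema in schema.get("properties", {}).items():
        value = document.get(key)
        if subschema.get("type") == "object" and isinstance(value, dict):
            # Recurse into nested objects such as license or spatial_extent.
            missing += missing_required(subschema, value, f"{path}{key}.")
        elif subschema.get("type") == "array" and isinstance(value, list):
            item_schema = subschema.get("items", {})
            if item_schema.get("type") == "object":
                # Check every object in lists such as resources or examples.
                for index, item in enumerate(value):
                    if isinstance(item, dict):
                        missing += missing_required(
                            item_schema, item, f"{path}{key}[{index}]."
                        )
    return missing


# Shortened excerpt of the schema, for demonstration only.
fragment = {
    "type": "object",
    "properties": {
        "license": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
        },
        "resources": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"type": {"type": "string"}, "url": {"type": "string"}},
                "required": ["type", "url"],
            },
        },
    },
    "required": ["license", "resources"],
}

document = {"license": {}, "resources": [{"type": "API"}]}
print(missing_required(fragment, document))  # ['license.name', 'resources[0].url']
```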