
Metadata specification

Background

In the OpenEPI project, it is important to establish a clear set of rules for datasets, known as a metadata specification. This is more than just a formality – it's a way to guide our partners, make data easy to find, ensure consistent quality, and create combined datasets with shared information.

Firstly, having a metadata specification acts as a guide for our partners. It tells them exactly what their datasets need to have to be a part of OpenEPI. This helps everyone understand what's required and keeps things consistent across all the different datasets.

Secondly, it helps make data more discoverable in OpenEPI. By using a standard set of details about each dataset, we make it easier for users to find and understand the information they're looking for. This makes the whole OpenEPI system more user-friendly.

Another important point is maintaining a consistent level of quality across all datasets. The metadata specification sets the standards for data quality, ensuring that every dataset meets the same requirements. This consistency is essential for building trust in the accuracy and reliability of the information within OpenEPI.

Lastly, having a metadata specification is crucial when we want to combine or compare datasets. It ensures that different datasets can be brought together seamlessly because they all follow the same set of rules. This harmonization is key for conducting thorough research and analysis within the OpenEPI project.

Specification

This specification details the metadata required for datasets that aim to be part of OpenEPI. It does not specify anything about the actual data contained in the dataset. The specification is inspired by and adapted from https://github.com/awslabs/open-data-registry. It also follows the FAIR principles. The metadata should be specified in either JSON or YAML, and be encoded in UTF-8.

Relevance to the INSPIRE Directive

In developing our metadata specification, we balanced the requirements of the INSPIRE Directive with a streamlined approach for broader usability and simplicity. While simplifying the format, we retained some key field names from INSPIRE, adapting others for clarity and accessibility. The use of JSON and YAML formats further enhances this accessibility and simplicity. Our approach, though simplified, preserves the core values of INSPIRE, ensuring our metadata remains relevant and valuable.


Fields

Top level

| Field | Required | Type | Description |
|---|---|---|---|
| Identifier | Y | UUID | Universally unique identifier (UUID) for this dataset |
| Title | Y | String | Title of the dataset |
| Abstract | Y | String | Short description of the data contained in the dataset |
| Documentation | | URL | Link to where documentation about the dataset can be found |
| Data producer | | String | Name of the person(s) that created this dataset |
| Distribution agency | Y | String | The organization or entity that published this dataset |
| Contact | Y | String | Contact information for the responsible publisher |
| Release date | Y | Date | The date on which the dataset was made publicly available or released for use |
| Version | | Number | The version of this dataset, e.g. 2.0 |
| Maintenance and update frequency | Y | String | Description of how often this dataset is updated (Daily, Weekly, Monthly, Annually, As needed, Irregular, Never) |
| Temporal extent | | Object | See Temporal extent |
| Spatial extent | | Object | See Spatial extent |
| License | Y | Object | See License |
| Keywords | | List of strings | Keywords/tags for the dataset that can be used in filtering |
| Resources | Y | List of objects | See Resources |
| Examples | | List of objects | See Examples |
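
As a rough illustration of how the top-level rules could be checked in practice, the Python sketch below looks for the fields marked Y in the table, using the lowercase, underscore-separated keys described under Data format. The function name `check_top_level` is hypothetical, and treating the listed update frequencies as a closed set is an assumption; the table only gives them as a description. The sketch follows the table, whereas the JSON Schema at the end of this page additionally lists `temporal_extent` and `spatial_extent` as required.

```python
import uuid

# Keys for the rows marked "Y" in the top-level table.
REQUIRED_TOP_LEVEL = {
    "identifier", "title", "abstract", "distribution_agency", "contact",
    "release_date", "maintenance_and_update_frequency", "license", "resources",
}

# Assumption: the frequencies listed in the table form a closed set.
UPDATE_FREQUENCIES = {
    "Daily", "Weekly", "Monthly", "Annually", "As needed", "Irregular", "Never",
}


def check_top_level(metadata: dict) -> list[str]:
    """Return a list of problems; an empty list means none were found."""
    missing = sorted(REQUIRED_TOP_LEVEL - set(metadata))
    problems = [f"missing required field: {key}" for key in missing]

    identifier = metadata.get("identifier")
    if identifier is not None:
        try:
            # uuid.UUID raises ValueError for strings it cannot parse.
            uuid.UUID(str(identifier))
        except ValueError:
            problems.append("identifier is not a valid UUID")

    frequency = metadata.get("maintenance_and_update_frequency")
    if frequency is not None and frequency not in UPDATE_FREQUENCIES:
        problems.append(f"unexpected update frequency: {frequency}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a partner see everything that needs fixing in a single pass.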

Temporal extent

| Field | Required | Type | Description |
|---|---|---|---|
| Start date | Y | Date | The start date (inclusive) the data covers |
| End date | Y | Date | The end date (inclusive) the data covers |
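
A minimal sketch of a temporal extent check, assuming dates are written as ISO 8601 calendar dates (`YYYY-MM-DD`), which is the form used in the examples further down. The function name `check_temporal_extent` is hypothetical.

```python
from datetime import date


def check_temporal_extent(extent: dict) -> list[str]:
    """Return a list of problems found in a temporal_extent object."""
    problems = []
    parsed = {}
    for key in ("start_date", "end_date"):
        value = extent.get(key)
        if value is None:
            problems.append(f"missing required field: {key}")
            continue
        try:
            parsed[key] = date.fromisoformat(value)
        except (TypeError, ValueError):
            problems.append(f"{key} is not an ISO 8601 date: {value!r}")

    # Both bounds are inclusive, so equal dates are allowed.
    if len(parsed) == 2 and parsed["start_date"] > parsed["end_date"]:
        problems.append("start_date is after end_date")
    return problems
```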

Spatial extent

| Field | Required | Type | Description |
|---|---|---|---|
| Type | Y | String | Either “GLOBAL” or “REGION” |
| Region | | String | Required if Type is “REGION”. String representing the region the data covers |
| Details | | String | Additional information about the region, for example a comma-separated list of countries or regions within a country |
| Coordinates | | List of coordinates | Coordinates in the following order: min_lon, min_lat, max_lon, max_lat |
| Spatial resolution | | String | Description of the resolution of the data, e.g. 10m x 10m or 5° x 5° |
| Coordinate reference system | | String | Definition of the geographical coordinate reference system |
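
The conditional rule on Region and the coordinate order are easy to get wrong, so a small check can help. The sketch below is illustrative only: `check_spatial_extent` is a hypothetical name, and it assumes the coordinates are numbers in decimal degrees. Longitudes are only range-checked, since a box crossing the antimeridian could legitimately have min_lon greater than max_lon.

```python
def check_spatial_extent(extent: dict) -> list[str]:
    """Return a list of problems found in a spatial_extent object."""
    problems = []
    extent_type = extent.get("type")
    if extent_type not in ("GLOBAL", "REGION"):
        problems.append('type must be "GLOBAL" or "REGION"')
    if extent_type == "REGION" and not extent.get("region"):
        problems.append('region is required when type is "REGION"')

    coords = extent.get("coordinates")
    if coords is not None:
        is_four_numbers = (
            isinstance(coords, list)
            and len(coords) == 4
            and all(isinstance(c, (int, float)) for c in coords)
        )
        if not is_four_numbers:
            problems.append("coordinates must be a list of four numbers")
        else:
            # Order defined in the table: min_lon, min_lat, max_lon, max_lat.
            min_lon, min_lat, max_lon, max_lat = coords
            if not all(-180 <= lon <= 180 for lon in (min_lon, max_lon)):
                problems.append("longitude outside [-180, 180]")
            if not (-90 <= min_lat <= max_lat <= 90):
                problems.append("latitudes must satisfy -90 <= min_lat <= max_lat <= 90")
    return problems
```

For a box covering Norway, `[4.88, 57.99, 31.10, 71.38]` passes, while the same numbers given latitudes first, `[57.99, 71.38, 4.88, 31.10]`, are reported because the value in the min_lat position exceeds the one in the max_lat position.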

License

| Field | Required | Type | Description |
|---|---|---|---|
| Name | Y | String | The name of the license |
| URL | | URL | URL to the license text |
| Description | | String | Short description of the license |

Resources

| Field | Required | Type | Description |
|---|---|---|---|
| Type | Y | String | Type of resource, e.g. API, File, DB |
| Description | | String | Description of this resource |
| Data Specification | | String | Specification of the standard used in the data contents |
| URL | Y | URL | URL for direct access to this resource |

Examples

| Field | Required | Type | Description |
|---|---|---|---|
| Type | Y | String | Type of example, e.g. Application, Publication, Tutorial |
| Description | | String | Description of this example |
| URL | Y | URL | URL to the example use of the dataset |
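
Entries under Resources and Examples share the same required fields (Type and URL), so one helper can cover both. This is a sketch under assumptions: `check_linked_entry` is a hypothetical name, and requiring an absolute URL with a scheme and host is an interpretation of the URL type, not something the tables state.

```python
from urllib.parse import urlsplit


def check_linked_entry(entry: dict) -> list[str]:
    """Return a list of problems for one item of `resources` or `examples`."""
    problems = [
        f"missing required field: {key}"
        for key in ("type", "url")
        if not entry.get(key)
    ]

    url = entry.get("url")
    if isinstance(url, str) and url:
        parts = urlsplit(url)
        # Assumption: URLs must be absolute, i.e. carry a scheme and a host.
        if not (parts.scheme and parts.netloc):
            problems.append(f"url is not absolute: {url}")
    return problems
```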

Data format

We support metadata in both JSON and YAML format.

Field names undergo a conversion for standardization and compatibility. This conversion involves two steps:

  1. Translation to Lowercase: All characters in the field name are converted to lowercase. This uniformity ensures consistency across different systems and platforms.
  2. Replacing Spaces with Underscores: Spaces within the field names are replaced with underscores ( _ ).

For example, a field name like "Spatial Resolution" in a standard text format would be converted to "spatial_resolution" in JSON and YAML. This approach maintains readability while adhering to the naming conventions commonly used in programming and data formats.
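
The two conversion steps can be sketched in Python as follows. The function name `to_key` is hypothetical and only illustrates the rule described above.

```python
def to_key(field_name: str) -> str:
    """Convert a human-readable field name to its JSON/YAML key."""
    # Step 1: lowercase every character.
    # Step 2: replace each space with an underscore.
    return field_name.lower().replace(" ", "_")


print(to_key("Spatial Resolution"))  # spatial_resolution
```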


Examples

JSON

{
  "identifier": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
  "title": "Norway Weather Data 2020-2023",
  "abstract": "Comprehensive weather data collected across Norway, including temperature, precipitation, and wind speed measurements.",
  "documentation": "https://met.no/en/services/norway-weather-data-documentation",
  "data_producer": "Norwegian Meteorological Institute",
  "distribution_agency": "Norwegian Meteorological Institute",
  "contact": "[email protected]",
  "release_date": "2024-02-15",
  "version": 1.0,
  "maintenance_and_update_frequency": "Annually",
  "temporal_extent": {
    "start_date": "2020-01-01",
    "end_date": "2023-12-31"
  },
  "spatial_extent": {
    "type": "REGION",
    "region": "Norway",
    "details": "Oslo, Bergen, Trondheim, Stavanger, Tromsø",
    "coordinates": [4.88, 57.99, 31.10, 71.38],
    "spatial_resolution": "5km x 5km",
    "coordinate_reference_system": "ETRS89 / UTM zone 32N"
  },
  "license": {
    "name": "Open Government Licence - Norway",
    "url": "https://data.norge.no/nlod/en/1.0",
    "description": "License allowing free use and distribution of the data, as long as the Norwegian Meteorological Institute is credited."
  },
  "keywords": ["Weather", "Climate", "Norway", "Meteorology"],
  "resources": [
    {
      "type": "API",
      "description": "Real-time weather data API",
      "url": "https://api.met.no/weatherapi/locationforecast/2.0/"
    },
    {
      "type": "File",
      "description": "Historical weather data files",
      "url": "https://met.no/en/free-meteorological-data/Historical-data"
    }
  ],
  "examples": [
    {
      "type": "Application",
      "description": "Weather forecasting app using this dataset",
      "url": "https://example.com/weatherapp"
    },
    {
      "type": "Research",
      "description": "Study on climate trends in Norway based on this dataset",
      "url": "https://example.com/climatetrendsresearch"
    }
  ]
}

YAML

identifier: f47ac10b-58cc-4372-a567-0e02b2c3d479
title: Norway Weather Data 2020-2023
abstract: Comprehensive weather data collected across Norway, including temperature, precipitation, and wind speed measurements.
documentation: https://met.no/en/services/norway-weather-data-documentation
data_producer: Norwegian Meteorological Institute
distribution_agency: Norwegian Meteorological Institute
contact: "[email protected]"
release_date: 2024-02-15
version: 1.0
maintenance_and_update_frequency: Annually
temporal_extent:
  start_date: 2020-01-01
  end_date: 2023-12-31
spatial_extent:
  type: REGION
  region: Norway
  details: Oslo, Bergen, Trondheim, Stavanger, Tromsø
  coordinates: [4.88, 57.99, 31.10, 71.38]
  spatial_resolution: 5km x 5km
  coordinate_reference_system: ETRS89 / UTM zone 32N
license:
  name: Open Government Licence - Norway
  url: https://data.norge.no/nlod/en/1.0
  description: License allowing free use and distribution of the data, as long as the Norwegian Meteorological Institute is credited.
keywords: [Weather, Climate, Norway, Meteorology]
resources:
  - type: API
    description: Real-time weather data API
    url: https://api.met.no/weatherapi/locationforecast/2.0/
  - type: File
    description: Historical weather data files
    url: https://met.no/en/free-meteorological-data/Historical-data
examples:
  - type: Application
    description: Weather forecasting app using this dataset
    url: https://example.com/weatherapp
  - type: Research
    description: Study on climate trends in Norway based on this dataset
    url: https://example.com/climatetrendsresearch

JSON Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "identifier": {
      "type": "string",
      "format": "uuid"
    },
    "title": {
      "type": "string"
    },
    "abstract": {
      "type": "string"
    },
    "documentation": {
      "type": "string",
      "format": "uri"
    },
    "data_producer": {
      "type": "string"
    },
    "distribution_agency": {
      "type": "string"
    },
    "contact": {
      "type": "string"
    },
    "release_date": {
      "type": "string",
      "format": "date"
    },
    "version": {
      "type": "number"
    },
    "maintenance_and_update_frequency": {
      "type": "string"
    },
    "temporal_extent": {
      "type": "object",
      "properties": {
        "start_date": {
          "type": "string",
          "format": "date"
        },
        "end_date": {
          "type": "string",
          "format": "date"
        }
      },
      "required": ["start_date", "end_date"]
    },
    "spatial_extent": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string"
        },
        "region": {
          "type": "string"
        },
        "details": {
          "type": "string"
        },
        "coordinates": {
          "type": "array",
          "items": {
            "type": "number"
          }
        },
        "spatial_resolution": {
          "type": "string"
        },
        "coordinate_reference_system": {
          "type": "string"
        }
      },
      "required": ["type"]
    },
    "license": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string"
        },
        "url": {
          "type": "string",
          "format": "uri"
        },
        "description": {
          "type": "string"
        }
      },
      "required": ["name"]
    },
    "keywords": {
      "type": "array",
      "items": {
        "type": "string"
      }
    },
    "resources": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "type": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "url": {
            "type": "string",
            "format": "uri"
          }
        },
        "required": ["type", "url"]
      }
    },
    "examples": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "type": {
            "type": "string"
          },
          "description": {
            "type": "string"
          },
          "url": {
            "type": "string",
            "format": "uri"
          }
        },
        "required": ["type", "url"]
      }
    }
  },
  "required": [
    "identifier", 
    "title", 
    "abstract", 
    "distribution_agency", 
    "contact", 
    "release_date", 
    "maintenance_and_update_frequency", 
    "temporal_extent", 
    "spatial_extent", 
    "license", 
    "resources"
  ]
}
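
To give a feel for what the schema above enforces, here is a deliberately small Python sketch that interprets only four of its keywords: `type`, `required`, `properties` and `items`. It ignores `format` and everything else, so it is not a JSON Schema validator; a dedicated library such as the `jsonschema` package would be the more appropriate choice in practice. The function name `iter_problems` and the schema fragment in the usage lines are illustrative.

```python
# Maps the JSON Schema type names used above to Python types.
_TYPES = {"object": dict, "array": list, "string": str, "number": (int, float)}


def iter_problems(instance, schema, path="$"):
    """Yield problems for `type`, `required`, `properties` and `items` only."""
    expected = schema.get("type")
    if expected in _TYPES:
        matches = isinstance(instance, _TYPES[expected])
        # bool is a subclass of int in Python but is not a JSON number.
        if expected == "number" and isinstance(instance, bool):
            matches = False
        if not matches:
            yield f"{path}: expected {expected}"
            return

    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                yield f"{path}: missing required property {key!r}"
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:
                yield from iter_problems(instance[key], subschema, f"{path}.{key}")

    if isinstance(instance, list) and "items" in schema:
        for index, item in enumerate(instance):
            yield from iter_problems(item, schema["items"], f"{path}[{index}]")


# Usage with a fragment shaped like the spatial_extent part of the schema.
fragment = {
    "type": "object",
    "properties": {
        "type": {"type": "string"},
        "coordinates": {"type": "array", "items": {"type": "number"}},
    },
    "required": ["type"],
}
for problem in iter_problems({"coordinates": ["57.99"]}, fragment):
    print(problem)
```

With the full schema loaded through `json.load`, the same function can be pointed at a complete metadata document.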