wiki:s1.0

Differences

This shows you the differences between two versions of the page.

wiki:s1.0 [2025/08/28 10:31] – [1. Explanation of the Background and Benefits of the Recommendation] dkottmeierwiki:s1.0 [2025/09/01 11:47] (current) – [Motivation for this Recommendation:] dkottmeier
Line 1: Line 1:
-**Recommendation S1.0 **+**Recommendation **
  
-======Recommendation to enrich data with rich metadata======+======Recommendation to decompose metadata according to community-recognized frameworks======
  
 =====Description===== =====Description=====
  
-Status: Under development, Date: 2025/07/08 10:18, Version: 001+Status: Under development, Date: 2025/07/07 10:18, Version: 001
  
 =====Motivation for this Recommendation: ===== =====Motivation for this Recommendation: =====
-As data can only be embedded in semantic frameworks when it is described with rich metadata, the first step toward a standardized approach for implementing semantic resources is for data producers to enrich data with metadata - even if this may seem self-evident. Only the standardized use of metadata enables the annotation with identifiable terms from recognized controlled vocabularies, which allows machines to interpret and connect data across disciplinary and institutional boundaries. Since the needs for these metadata can vary greatly between research communities, data infrastructures, and use cases, we recommend using existing metadata schemas commonly used in each of these. In addition, there are generally applicable schemas that can be recommended.+Many metadata elements, e.g., measured quantities or methods, are complex, meaning they combine terms from different categories into a single compound concept. In other words, they consist of multiple metadata components drawn from different categories.
  
-=====Recommendation summary====+For example, the variable air temperature (°C) does not only specify the measured quantity (temperature) but also includes additional components: the measurement context (air) and the unit (°C).
  
-All data producers in Helmholtz Earth & Environment should enrich their datasets with rich, standardized metadata at the time of dataset creation, submission to repositories or publication. Repositories, sensor registries and other data infrastructures should ensure that metadata is requested and archived in a standardized and structured manner. This should be done following metadata categories specified in established general or discipline-specific metadata schemas or workflows (see also I2.0 Define exchange format).+Because most metadata elements leave room for interpretation regarding which information they should capture, it is crucial to establish binding standards. Such standards must clearly define which components belong to a given metadata element. 
 + 
 +The individual components of a metadata element can either be stored in separate fields or, alternatively, combined into a single text string within one field, following a community-agreed syntax. 
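The two storage options can be illustrated with a minimal Python sketch. It is not part of the recommendation: the class name `VariableComponents`, its field names, and the `"; "` separator are assumptions chosen for illustration, and an actual community-agreed syntax may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableComponents:
    """Hypothetical decomposition of one compound metadata element."""
    quantity: str  # measured quantity, e.g. "temperature"
    context: str   # measurement context, e.g. "air"
    unit: str      # unit, e.g. "°C"


# Assumed separator for the single-field form; not a community standard.
SEPARATOR = "; "


def to_single_field(components: VariableComponents) -> str:
    """Combine the components into one text string for a single field."""
    return SEPARATOR.join(
        [components.quantity, components.context, components.unit]
    )


def from_single_field(text: str) -> VariableComponents:
    """Split a single-field string back into its separate components."""
    parts = text.split(SEPARATOR)
    if len(parts) != 3:
        raise ValueError(f"expected 3 components, got {len(parts)}: {text!r}")
    return VariableComponents(*parts)


# Option 1: separate fields.
air_temperature = VariableComponents(
    quantity="temperature", context="air", unit="°C"
)

# Option 2: one text string following the assumed syntax.
packed = to_single_field(air_temperature)

# Both forms carry the same information when the syntax is unambiguous.
assert from_single_field(packed) == air_temperature
```

The round trip only works if the separator never occurs inside a component, which is one reason the syntax has to be agreed on by the community rather than chosen per dataset.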
 +=====Recommendation ==== 
 + 
 +[shortened from below] 
 + 
 +[Format: Who! does what! where! when! under which conditions!]
  
 =====Binding Convention: ===== =====Binding Convention: =====
Line 18: Line 25:
  
 ^                         ^ mandatory  ^ conditional           ^ optional ^ ^                         ^ mandatory  ^ conditional           ^ optional ^
-^ Helmholtz FAIR Principle|     x       |      |          |+^ Helmholtz FAIR Principle|            |      |          |
  
 =====Precondition for Implementation: ===== =====Precondition for Implementation: =====
Line 24: Line 31:
 =====Related Recommendations ===== =====Related Recommendations =====
  
-Parent: S0+Parent:
  
-Dependent: S3.0+Dependent:
  
-Other: related to I2.0+Other: none
  
 =====Contributors===== =====Contributors=====
  
 +Names of contributors to this recommendation
  
 =====Content===== =====Content=====
  
 ====1. Explanation of the Background and Benefits of the Recommendation ==== ====1. Explanation of the Background and Benefits of the Recommendation ====
-A metadata schema defines the structure, content, and semantics of metadata elements used to describe a dataset. It specifies what metadata should be captured, how it should be named, and in which format it should be stored. Schemas often include controlled vocabularies and formal structures, allowing metadata to be understood both by humans and machines. Widely used schemas include DataCite Metadata Schema for citation metadata, ISO 19115 for geospatial data (has room for details, many dependencies; to be considered when drafting the full text), DCAT for XXX (public authorities, flat) and Dublin Core for general resource description (library metadata/attributes? (some of these attributes are mandatory fields in DataCite and ISO), from the 1970s). 
  
-Across Helmholtz Earth and Environmental sciences, metadata standards are applied in diverse repositories and infrastructures, to name just a few of them: +__About__
-  * PANGAEA (AWI/MARUM) applies metadata workflows combining descriptive (e.g. dataset titles, abstracts, parameters, methods), structural (e.g. campaign and event hierarchies), and administrative (e.g. file formats, DOIs, licenses) elements. PANGAEA uses ISO 19115, DIF, Dublin Core, and its own metadata schema ensuring interoperability across geosciences and marine research (https://wiki.pangaea.de/wiki/Metadata). +
-  * GFZ Data Services (GFZ Potsdam) supports metadata based on ISO 19115, NASA GCMD DIF, and DataCite, with its own metadata entry system providing templates and controlled vocabularies for FAIR compliance. +
-  * Helmholtz Coastal Data Center (HCDC) relies on ISO 19115 and NetCDF CF Conventions, and in some cases OGC SensorML, to capture observational and sensor-based data. +
-  * DataCite Metadata Schema provides the global backbone for dataset citation and retrieval (DataCite Schema+
  
-Providing sufficient enrichment of data with metadata forms the basis for the implementation of standardized semantic concepts.+__History and structure__ 
 + 
 +__Current Use of ...__ 
 + 
 +__Motivation__
  
 ====2. Possible alternative solutions==== ====2. Possible alternative solutions====
Line 54: Line 61:
  
 ====4. The Recommendation==== ====4. The Recommendation====
-**Bibliographic Metadata**/(Administrative metadata?) 
  
-It is recommended that for the accurate and consistent identification of a resource for citation and retrieval purposes, each published dataset should be provided with the core metadata elements defined in the most up-to-date DataCite Metadata Schema (see https://schema.datacite.org/).+**Instruments/Devices** 
 +**Manufacturers’ names** should always be reported as they were //valid at the time of production//. In practice, this means using the name that appears on the instrument label or in the official manual.
  
-**Community and Repository Alignment**+  * If an instrument is marketed under a brand name, the brand (e.g., Thermo Scientific), not the parent company name (Thermo Fisher Scientific), should be used. If no brand is indicated, the official company name should be given. 
 +  * If the instrument was produced by a subsidiary company, use the subsidiary’s name at the time of production (e.g., Spectra GmbH), not the later acquirer (X Corp.). Subsequent changes, such as company sales, mergers, or renamings, should //not// be reflected in the metadata. 
 +  * In general, the most granular level available (e.g., the concrete brand or subsidiary rather than only the corporate group) should be recorded to ensure precision and avoid ambiguity. 
 + 
 +**Instrument model names** and numbers should be reproduced exactly as they are written on the instrument label or in the accompanying manual, including spaces, special characters, and capitalization. This ensures consistency and guarantees that identical instruments are always represented in the same way across datasets.
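
The naming rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed implementation: the class `InstrumentLabel`, its field names, and the model string `"X-100 /b"` are hypothetical, and the precedence of brand over subsidiary is an assumption, since the text only asks for the most granular name available.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstrumentLabel:
    """Names as printed on the instrument label or in the official manual.

    All field names are hypothetical. The values are those valid at the
    time of production, never names from later sales, mergers, or renamings.
    """
    model: str
    brand: Optional[str] = None
    subsidiary: Optional[str] = None
    company: Optional[str] = None


def manufacturer_name(label: InstrumentLabel) -> str:
    """Return the most granular manufacturer name that is available.

    Assumed precedence: brand, then subsidiary, then official company name.
    """
    for candidate in (label.brand, label.subsidiary, label.company):
        if candidate:
            return candidate
    raise ValueError("no manufacturer name found on label or in manual")


def instrument_record(label: InstrumentLabel) -> dict:
    """Build a metadata record; the model name is copied without changes."""
    return {
        "manufacturer": manufacturer_name(label),
        # No trimming, case folding, or character replacement is applied.
        "model": label.model,
    }


label = InstrumentLabel(
    model="X-100 /b",  # hypothetical model name
    brand="Thermo Scientific",
    company="Thermo Fisher Scientific",
)
record = instrument_record(label)
```

Copying the model string unchanged, including spaces, special characters, and capitalization, is what keeps identical instruments identical across datasets.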
  
-When selecting metadata schemas, data producers should always consider the intended purpose of the metadata: 
  
-  - If datasets are to be published in repositories such as PANGAEA or GFZ Data Services, metadata must follow repository-specific workflows and schemas. 
-  - If datasets need to be interoperable within a scientific community (e.g., oceanography, climate science), community standards like CF Conventions, NetCDF Climate & Forecast metadata, or GCMD keywords should be adopted. 
-  - If datasets must be integrated into international portals and infrastructures, metadata must be aligned with globally recognized schemas such as ISO 19115 or DataCite. 
 ====5. Naming of communities that have already implemented the recommendation==== ====5. Naming of communities that have already implemented the recommendation====
  
Line 72: Line 79:
  
 ====7. Examples of Instances==== ====7. Examples of Instances====
 +Comment: EXPLAIN HERE HOW this is documented in XML or JSON; example. How it is packaged for transfer in the protocol. Differs depending on the metadata schema; e.g., PANGAEA "comma-separated in one field" vs. SMS or Registry
 ====8. Further Information==== ====8. Further Information====
  
wiki/s1.0.1756377108.txt.gz · Last modified: by dkottmeier