RDM and MD Landscape in Earth & Environment

Compilation of Recommendations

Details


Short Title

Identifying releases of data products

Source Document

Principles and best practices in data versioning for all datasets big and small

Source Document Link

https://doi.org/10.15497/RDA00042

Publishing Organisation

RDA Data Versioning WG

Date of Publication

2020-01-16

Topic

Quality control/curation

Addressed Stakeholders

data stewards

Keywords

releases, versioning

Text

In some cases, the production of a dataset can be quite complex. The dataset may go through a number of revisions before it is considered to be “final”. The publication of such a “final” version of a dataset is called a “release”. The release of a new version of a dataset should be accompanied by a description of the nature and the significance of the change. The significance of this change will depend on the intended use of the data by its designated user community. For instance, the release of a new version could signify changes in the data format and its compatibility with existing data processing pipelines, or significant changes to the content of the dataset. Concepts such as Semantic Versioning (Preston-Werner, 2013), which have been widely adopted in software development, describe a commonly used practice for communicating the significance of a version change in a dataset release.
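
As an illustration of how the significance of a change might be communicated, the following is a minimal Python sketch of Semantic Versioning-style release numbers (MAJOR.MINOR.PATCH) applied to a dataset. The names `Version`, `bump`, and `release_note` are hypothetical, and the mapping of dataset changes to the three levels (format changes that break existing pipelines as "major", compatible content additions as "minor", corrections as "patch") is an assumption made for this example, not a rule stated in the source document. Each community would need to define what counts as significant for its own users.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A MAJOR.MINOR.PATCH release number (hypothetical helper class)."""

    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "1.4.2".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"

    def bump(self, significance: str) -> "Version":
        # Assumed mapping for datasets:
        #   "major": incompatible change (e.g. new format breaks pipelines)
        #   "minor": compatible change (e.g. new records or variables added)
        #   "patch": compatible correction (e.g. fixed erroneous values)
        if significance == "major":
            return Version(self.major + 1, 0, 0)
        if significance == "minor":
            return Version(self.major, self.minor + 1, 0)
        if significance == "patch":
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown significance: {significance!r}")


def release_note(previous: str, significance: str, description: str) -> str:
    """Build a one-line note pairing the new release number with a description."""
    new_version = Version.parse(previous).bump(significance)
    return f"{new_version} ({significance} change from {previous}): {description}"


if __name__ == "__main__":
    print(release_note("1.4.2", "major",
                       "column names changed; existing pipelines need updating"))
    # prints: 2.0.0 (major change from 1.4.2): column names changed; existing pipelines need updating
```

Pairing the number with a free-text description reflects the recommendation that a release be accompanied by a description of the nature and significance of the change; the number alone only signals how large the change is.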