Compilation of Recommendations
Details
Short Title
Identifying releases of data products
Source Document
Principles and best practices in data versioning for all datasets big and small
Source Document Link
https://doi.org/10.15497/RDA00042
Publishing Organisation
RDA Data Versioning WG
Date of Publication
2020-01-16
Topic
Quality control/curation
Addressed Stakeholders
data stewards
Keywords
releases, versioning
Text
In some cases, the production of a dataset can be quite complex. The dataset may go through a number of revisions before it is considered to be “final”. The publication of such a “final” version of a dataset is called a “release”. The release of a new version of a dataset should be accompanied by a description of the nature and the significance of the change. The significance of this change will depend on the intended use of the data by its designated user community. For instance, the release of a new version could signify changes in the data format and its compatibility with existing data processing pipelines, or significant changes to the content of the dataset. Concepts such as Semantic Versioning (Preston-Werner, 2013), which have been widely adopted in software development, describe a commonly used practice for communicating the significance of a version change in a dataset release.
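As an illustration of how Semantic Versioning might be applied to dataset releases, the following is a minimal Python sketch. The `Version` class, the `next_release` function, and the three significance labels (`breaking`, `additive`, `correction`) are hypothetical names chosen for this example and are not taken from the source document. The mapping of dataset changes to the MAJOR.MINOR.PATCH parts is likewise an assumption; a data steward would need to define it with the designated user community in mind.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """A MAJOR.MINOR.PATCH release identifier in the style of Semantic Versioning."""
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Expects exactly three dot-separated integers, e.g. "1.4.2".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_release(current: Version, significance: str) -> Version:
    """Return the identifier of the next release.

    Assumed (hypothetical) mapping of dataset changes:
      "breaking"   - format or structure change that may break existing
                     processing pipelines: bump MAJOR
      "additive"   - content added in a compatible way: bump MINOR
      "correction" - compatible fix to existing content: bump PATCH
    """
    if significance == "breaking":
        return Version(current.major + 1, 0, 0)
    if significance == "additive":
        return Version(current.major, current.minor + 1, 0)
    if significance == "correction":
        return Version(current.major, current.minor, current.patch + 1)
    raise ValueError(f"unknown significance: {significance!r}")


if __name__ == "__main__":
    current = Version.parse("1.4.2")
    for kind in ("correction", "additive", "breaking"):
        print(kind, "->", next_release(current, kind))
```

Under this scheme, a user can judge from the release identifier alone whether an existing pipeline is likely to need changes. The identifier complements, but does not replace, the written description of the change that should accompany each release.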