Identifying releases of data products
- Short Title: Identifying releases of data products
- Source Document: Principles and best practices in data versioning for all datasets big and small
- Source Document Link: https://doi.org/10.15497/RDA00042
- Publishing Organisation: RDA Data Versioning WG
- Date of Publication: 2020-01-16
- Topic: Quality control/ curation
- Keywords: releases, versioning
- Addressed Stakeholders: data stewards
- Full Text: In some cases, the production of a dataset can be quite complex. The dataset may go through a number of revisions before it is considered to be “final”. The publication of such a “final” version of a dataset is called a “release”. The release of a new version of a dataset should be accompanied by a description of the nature and the significance of the change. The significance of this change will depend on the intended use of the data by its designated user community. For instance, the release of a new version could signify changes in the data format that affect its compatibility with existing data processing pipelines, or significant changes to the content of the dataset. Concepts such as Semantic Versioning (Preston-Werner, 2013), which have been widely adopted in software development, describe a commonly used practice for communicating the significance of a version change in a dataset release.
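
As an illustration of how a Semantic Versioning style number could signal the significance of a dataset release, here is a minimal Python sketch. The `Version` class, the `next_release` function, and the three change labels ("breaking", "addition", "correction") are hypothetical names chosen for this example. The mapping of dataset changes to the major, minor, and patch positions is an assumption, not a rule stated in the source document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    """A MAJOR.MINOR.PATCH version number for a dataset release."""
    major: int
    minor: int
    patch: int

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def next_release(current: Version, change: str) -> Version:
    """Return the version number for the next release.

    The change categories below are illustrative assumptions:
    - "breaking":   e.g. a format change incompatible with existing pipelines
    - "addition":   e.g. new content that leaves existing usage intact
    - "correction": e.g. fixes to erroneous values
    """
    if change == "breaking":
        return Version(current.major + 1, 0, 0)
    if change == "addition":
        return Version(current.major, current.minor + 1, 0)
    if change == "correction":
        return Version(current.major, current.minor, current.patch + 1)
    raise ValueError(f"unknown change type: {change!r}")


if __name__ == "__main__":
    released = Version(1, 2, 3)
    for change in ("correction", "addition", "breaking"):
        print(f"{change}: {released} -> {next_release(released, change)}")
```

Under this sketch, a user community could read the version number alone to judge whether a new release requires changes to its processing pipelines, while the accompanying release description would still explain the nature of the change.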