Compilation of Recommendations
Details
Short Title
Identification of data collections
Source Document
Principles and best practices in data versioning for all datasets big and small
Source Document Link
https://doi.org/10.15497/RDA00042
Publishing Organisation
RDA Data Versioning WG
Date of Publication
2020-01-16
Topic
Policy, Quality control/curation
Addressed Stakeholders
data stewards, policy makers
Keywords
data collections, PID, persistent identifiers
Text
Datasets may be aggregated into collections or time series. These collections can be seen as “works of works” (Hourclé, 2009), similar to a journal series. Following this practice, the collection (work of works) should be identified and versioned, and so should each of its constituent datasets (works) (Klump et al., 2016). Some data collections, such as time series data, are expected to change over time as new data are added. Here, the entire time series should be identified, as should time-stamped revisions if the series is updated frequently (Rauber et al., 2016). Because changes may result not only from the addition of data over time but also from corrections, recalibrations, etc., it is also recommended to adopt a dataset release policy for time series data.
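The recommendation can be pictured as a small data model. The following is only an illustrative sketch in Python, not something taken from the source document: the class names (Revision, Dataset, Collection), the method names, and the "example:..." identifiers are all hypothetical, and a real repository would mint persistent identifiers through its PID provider. It shows a collection and each member dataset carrying their own identifier and their own list of time-stamped revisions, with a reason recorded for each release.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Revision:
    """One time-stamped, released state of a dataset or collection."""
    version: str         # hypothetical version label, e.g. "1.1"
    timestamp: datetime  # release time (timezone-aware)
    reason: str          # e.g. "new data", "correction", "recalibration"


@dataclass
class Dataset:
    """A constituent dataset (a "work") with its own identifier and history."""
    pid: str  # placeholder identifier, not a real PID
    revisions: List[Revision] = field(default_factory=list)

    def release(self, version: str, reason: str,
                timestamp: Optional[datetime] = None) -> Revision:
        """Record a new revision; defaults to the current UTC time."""
        rev = Revision(version, timestamp or datetime.now(timezone.utc), reason)
        self.revisions.append(rev)
        return rev

    @property
    def current(self) -> Optional[Revision]:
        """Most recently released revision, or None if nothing was released."""
        return self.revisions[-1] if self.revisions else None

    def as_of(self, when: datetime) -> Optional[Revision]:
        """Latest revision released at or before `when` (timezone-aware)."""
        candidates = [r for r in self.revisions if r.timestamp <= when]
        return max(candidates, key=lambda r: r.timestamp) if candidates else None


@dataclass
class Collection(Dataset):
    """A "work of works": identified and versioned like any dataset,
    and additionally holding its separately identified members."""
    members: Dict[str, Dataset] = field(default_factory=dict)

    def add_member(self, dataset: Dataset) -> None:
        self.members[dataset.pid] = dataset


# Usage with made-up identifiers and dates.
series = Collection(pid="example:collection/temperature")
station = Dataset(pid="example:dataset/station-a")
series.add_member(station)

station.release("1.0", "initial release",
                datetime(2020, 1, 1, tzinfo=timezone.utc))
station.release("1.1", "recalibration",
                datetime(2020, 6, 1, tzinfo=timezone.utc))
# The collection gets its own time-stamped revision when a member changes.
series.release("2020-06-01", "member station-a recalibrated",
               datetime(2020, 6, 1, tzinfo=timezone.utc))
```

Recording a reason with every revision is one possible way to support a release policy, since it distinguishes routine additions of new data from corrections and recalibrations. Whether a change to one member should always trigger a new collection revision is a policy choice that this sketch leaves open.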