1. All datasets intended for citation must have a globally unique persistent identifier that can be expressed as an unambiguous URL.
- Short Title: 1. All datasets intended for citation must have a globally unique persistent identifier that can be expressed as an unambiguous URL.
- Source Document: A data citation roadmap for scholarly data repositories
- Source Document Link: https://doi.org/10.1038/s41597-019-0031-8
- Publishing Organisation: FORCE11.org and BioCADDIE Data Citation Implementation Pilot Repositories Expert Group
- Date of Publication: 2019-04-10
- Topic: Discovery/ indexing/ search, Interlinking/ interoperability
- Keywords: PID, identifier, URL
- Addressed Stakeholders: data service providers
- Full Text: A data citation must include a persistent method for identification that is machine actionable, globally unique, and widely used by a community (JDDCP, principle #4). The use of the persistent identifier should follow community best practices. For implementation by data repositories, this means:
  - Persistent method for identification. Unique identifiers, and metadata describing the data, and its disposition, must persist, even beyond the lifespan of the data they describe (JDDCP, principle #6). As an extension to this principle, data repositories should make provisions to keep unique identifiers and metadata available beyond the lifespan of the data or repository, ideally in a well-recognized and accepted standard metadata format.
  - Machine actionable. The persistent identifier must be understood, and be resolvable, as an HTTP URI in accordance with IETF RFC 3986, including support for content negotiation.
  - Globally unique. The identifier must use a prefix (namespace) if the identifier character string is only unique within a particular database, e.g. an accession number; and the prefix must be registered with a robust, institutionally stable global resolver such as the identifiers.org system at EMBL/EBI.
  - Widely used by a community. The persistent identifier must be widely used in the community. For the life sciences this includes accession numbers, in combination with the database name for global uniqueness.
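
The "globally unique" and "machine actionable" requirements can be illustrated with a minimal Python sketch. It assumes the `prefix:accession` form resolved by identifiers.org; the function names, the sample `pdb` accession, and the `application/ld+json` media type are illustrative assumptions, not part of the recommendation, and a given repository may support different metadata formats. The sketch only builds the URL and the request object; it does not contact any resolver.

```python
from urllib.parse import urlsplit
from urllib.request import Request

# Resolver named in the recommendation text.
IDENTIFIERS_ORG = "https://identifiers.org"


def compact_identifier(prefix: str, accession: str) -> str:
    """Combine a registered namespace prefix with a locally unique accession."""
    prefix = prefix.strip()
    accession = accession.strip()
    if not prefix or not accession:
        raise ValueError("both prefix and accession are required")
    return f"{prefix}:{accession}"


def resolver_url(prefix: str, accession: str, resolver: str = IDENTIFIERS_ORG) -> str:
    """Express the identifier as an HTTP URI via a global resolver."""
    return f"{resolver.rstrip('/')}/{compact_identifier(prefix, accession)}"


def is_http_uri(uri: str) -> bool:
    """Rough check that a string is an absolute http(s) URI with a host."""
    parts = urlsplit(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def metadata_request(uri: str, media_type: str = "application/ld+json") -> Request:
    """Prepare (but do not send) a content-negotiation request.

    The default media type is an assumption for illustration only.
    """
    if not is_http_uri(uri):
        raise ValueError(f"not an HTTP URI: {uri!r}")
    return Request(uri, headers={"Accept": media_type})


if __name__ == "__main__":
    url = resolver_url("pdb", "2gc4")  # hypothetical sample accession
    req = metadata_request(url)
    print(url)
    print(req.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual resolution; whether the response honors the `Accept` header depends on the resolver and the repository behind it.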