Academic Article

Ten simple rules for making a vocabulary FAIR.
Document Type
Article
Source
PLoS Computational Biology. 6/16/2021, Vol. 17 Issue 6, p1-15. 15p. 3 Charts.
Subject
*KNOWLEDGE representation (Information theory)
*VOCABULARY
*DATA integration
*METADATA
*CONSUMPTION (Economics)
Language
ISSN
1553-734X
Abstract
We present ten simple rules that support converting a legacy vocabulary (a list of terms available in a print-based glossary or in a table not accessible using web standards) into a FAIR vocabulary. Various pathways may be followed to publish the FAIR vocabulary, but we particularly emphasise the goal of providing a globally unique, resolvable identifier for each term or concept. When the individual web identifier is resolved, a standard representation of the concept should be returned, using SKOS or OWL serialised in an RDF-based representation for machine interchange and in a web page for human consumption. Guidelines for vocabulary and term metadata are provided, as well as development and maintenance considerations. The rules are arranged as a stepwise recipe for creating a FAIR vocabulary based on the legacy vocabulary. By following these rules you can convert a legacy vocabulary into a standalone FAIR vocabulary, which can be used for unambiguous data annotation. In turn, this increases data interoperability and enables data integration.
Author summary: We present ten simple rules that support converting a list of terms not currently accessible using web standards into a vocabulary conforming to the FAIR principles: Findable, Accessible, Interoperable and Reusable. In a FAIR vocabulary each term has its own persistent web identifier, and its definition can be downloaded in both human-readable and standard machine-readable formats. The goal is to enable terminology to be unambiguously cited within technical datasets, both in the dataset description and in individual fields within the data, so that data can be discovered and integrated. The rules consider arrangements for governance of a terminology alongside the technical aspects of converting (typically) print-based forms to standards-based knowledge representations. The rules are presented in the sequence in which they should be considered in a conversion process. [ABSTRACT FROM AUTHOR]
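As a rough illustration of the outcome the abstract describes, the following minimal sketch turns a flat list of legacy terms into SKOS concepts serialised as Turtle, giving each term its own web identifier. It is not taken from the article: the base IRI `https://example.org/vocab/`, the scheme identifier, the `Term` class and the helper names are all hypothetical, the sample term is invented for illustration, and local identifiers are assumed to be URL-safe already. Making the identifiers actually resolve (and return HTML or RDF as appropriate) is a separate hosting concern not shown here.

```python
from dataclasses import dataclass
from typing import Iterable

# SKOS namespace as defined by the W3C SKOS Reference.
SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass(frozen=True)
class Term:
    """One row of a legacy glossary (hypothetical structure)."""
    local_id: str      # assumed to be URL-safe already
    label: str
    definition: str


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def term_iri(base: str, term: Term) -> str:
    """Build one globally unique identifier per term under the base IRI."""
    return base.rstrip("/") + "/" + term.local_id


def to_turtle(base: str, scheme_id: str, terms: Iterable[Term]) -> str:
    """Serialise the terms as SKOS concepts in a single concept scheme."""
    scheme = base.rstrip("/") + "/" + scheme_id
    lines = [
        f"@prefix skos: <{SKOS}> .",
        "",
        f"<{scheme}> a skos:ConceptScheme .",
        "",
    ]
    for t in terms:
        lines += [
            f"<{term_iri(base, t)}> a skos:Concept ;",
            f'    skos:prefLabel "{escape_literal(t.label)}"@en ;',
            f'    skos:definition "{escape_literal(t.definition)}"@en ;',
            f"    skos:inScheme <{scheme}> .",
            "",
        ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample data, purely for illustration.
    sample = [Term("basalt", "basalt", 'A fine-grained "mafic" volcanic rock.')]
    print(to_turtle("https://example.org/vocab/", "scheme", sample))
```

The sketch uses only the standard library to stay self-contained; in practice an RDF library would usually handle serialisation and escaping more robustly.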