AIRR Ontologies and Vocabularies Team
Summary
The “Ontologies and Vocabularies Team” was formed as a joint interest group of the Common Repository (ComRepo) and the Minimal Standards (MiniStd) working groups of the AIRR Community. The long-term aim of the Team is to define standard vocabularies and ontologies to be used by AIRR-compliant databases.
Ontology Data Representation
The nodes in an ontology are typically either concepts (e.g., capital)
or instances thereof (e.g., Paris). These nodes have local IDs (often
numbers), which are unique within an ontology. They also typically have
labels, which are the human-readable names of the nodes. Ontology
entities in the AIRR Data Standard reflect this model, with each AIRR
field that is represented as an ontology recorded with a global
ontology ID (id) and the corresponding label (label).
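As a minimal sketch of this model, an ontology-typed field can be pictured as a pair of an ID and a label. The nested dictionary layout and the helper name `is_ontology_entity` are assumptions made for illustration only; they are not taken from the AIRR schema or any AIRR library.

```python
def is_ontology_entity(value):
    """Return True if value looks like an ontology entity with id and label.

    Hypothetical helper: it only checks the shape described in the text
    (a 'PREFIX:local_id' style ID plus a non-empty label).
    """
    if not isinstance(value, dict):
        return False
    curie = value.get("id")
    label = value.get("label")
    if not isinstance(curie, str) or not isinstance(label, str):
        return False
    # Split at the first colon into prefix and local ID.
    prefix, sep, local_id = curie.partition(":")
    return bool(prefix and sep and local_id and label)


# Values taken from the cell_subset example later in this document.
cell_subset = {"id": "CL:0000542", "label": "lymphocyte"}
print(is_ontology_entity(cell_subset))
```

The check is purely structural: it does not verify that the label actually belongs to the ID in the referenced ontology.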
Within the AIRR Standards, Compact URIs (CURIEs) are used to represent ontology IDs. CURIEs are a standardized way to abbreviate Internationalized Resource Identifiers (IRIs, [RFC3987]), which include URIs as a subset. They were originally conceived to simplify the handling of attributes, e.g., in XML or SPARQL, by making them more compact and readable. CURIEs are also used by IEDB databases to reduce redundancies (mainly in the leading part of IRIs).
For example, a typical CURIE would look like NCBITAXON:9258. In this
case, NCBITAXON is the prefix, a custom string that will be
replaced by a repository-defined IRI component (e.g.,
http://purl.obolibrary.org/obo/NCBITaxon_). Note that there is no
connection between NCBITAXON in the CURIE and NCBITaxon in the
IRI; the former is just a placeholder.
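The prefix replacement described above can be sketched in a few lines. The names `CURIE_PREFIX_MAP` and `expand_curie` are hypothetical; the single map entry is the example IRI component given in the text.

```python
# Illustrative prefix map (an assumption for this sketch); the entry is the
# repository-defined IRI component from the example above.
CURIE_PREFIX_MAP = {
    "NCBITAXON": "http://purl.obolibrary.org/obo/NCBITaxon_",
}


def expand_curie(curie, prefix_map=CURIE_PREFIX_MAP):
    """Expand a CURIE such as 'NCBITAXON:9258' into a full IRI."""
    # Split at the first colon: the part before it is the CURIE prefix.
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    if prefix not in prefix_map:
        raise KeyError(f"unknown CURIE prefix: {prefix!r}")
    # The prefix is a pure placeholder: it is replaced, not transformed.
    return prefix_map[prefix] + local_id


print(expand_curie("NCBITAXON:9258"))
# http://purl.obolibrary.org/obo/NCBITaxon_9258
```

Because the lookup is a plain dictionary access, the spelling difference between `NCBITAXON` and `NCBITaxon_` needs no special handling.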
The AIRR schema will provide a list of AIRR-approved CURIE prefixes along with a list of at least one IRI prefix (i.e., replacement string) for each of them. This list serves two purposes:
- It provides a controlled namespace for CURIE prefixes used in the AIRR schema. For now, custom additions to or replacements of these prefixes in the schema are prohibited. This does not affect the ability of repositories to use such custom prefixes internally.
- It simplifies resolution of CURIEs by non-repositories. The lists of IRI prefixes for each CURIE prefix should not be considered exhaustive. However, when using custom IRI prefixes, it must be ensured that they refer to the same ontology as the provided prefixes.
Note that the IRI prefix list is not a recommendation of any particular provider. It is left up to users to decide how to resolve the resulting IRIs, e.g., via DNS/HTTP (if possible) or by using a provider of their choice.
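Since one CURIE prefix may map to several IRI prefixes, resolution yields a list of candidate IRIs, and different IRIs can compact to the same CURIE. The sketch below assumes a hypothetical table `IRI_PREFIXES` filled with the IRI prefixes this document lists for NCBITAXON and NCIT; the function names are illustrative and do not belong to any AIRR library.

```python
# Hypothetical lookup table; the IRI prefixes are the ones listed in this
# document for NCBITAXON and NCIT.
IRI_PREFIXES = {
    "NCBITAXON": [
        "http://purl.obolibrary.org/obo/NCBITaxon_",
        "http://purl.bioontology.org/ontology/NCBITAXON/",
    ],
    "NCIT": [
        "http://purl.obolibrary.org/obo/NCIT_",
        "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#",
    ],
}


def candidate_iris(curie):
    """Return every IRI a CURIE may expand to, in listed order."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id or prefix not in IRI_PREFIXES:
        raise ValueError(f"cannot resolve CURIE: {curie!r}")
    return [iri_prefix + local_id for iri_prefix in IRI_PREFIXES[prefix]]


def compact_iri(iri):
    """Return the CURIE for an IRI, or None if no known prefix matches."""
    for curie_prefix, iri_prefixes in IRI_PREFIXES.items():
        for iri_prefix in iri_prefixes:
            if iri.startswith(iri_prefix):
                return f"{curie_prefix}:{iri[len(iri_prefix):]}"
    return None


for iri in candidate_iris("NCBITAXON:9606"):
    print(iri, "->", compact_iri(iri))
```

Which candidate to dereference is left to the caller, in line with the note that the list does not recommend any provider.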
Approved Ontologies
Cell ontology (CL)
used in:
Cell subset (cell_subset, Tissue and Cell Processing)
CURIE summary
CURIE Prefix: CL
CURIE IRI Prefix: http://purl.obolibrary.org/obo/CL_
example AIRR use
“cell_subset.id” : “CL:0000542”
“cell_subset.label” : “lymphocyte”
default root node
label: lymphocyte
local ID: CL_0000542
path: ``
license: CC BY
latest release (as of 2020-05-20): 2020-03-02
maintainer: Alexander Diehl, Buffalo, NY, US (addiehl@buffalo.edu)
Human disease ontology (DOID)
used in:
Diagnosis (disease_diagnosis, Diagnosis)
CURIE summary
CURIE Prefix: DOID
CURIE IRI Prefix: http://purl.obolibrary.org/obo/DOID_
example AIRR use
“disease_diagnosis.id” : “DOID:9538”
“disease_diagnosis.label” : “multiple myeloma”
default root node
label: disease
local ID: DOID:4
path: disease
license: CC0
latest release (as of 2020-05-20): 2020-04-20
repo: https://github.com/DiseaseOntology/HumanDiseaseOntology
maintainer: Lynn Schriml, U Maryland, MD, US (lynn.schriml@gmail.com)
notes: Features ICD cross-reference
NCBI organismal taxonomy (NCBITAXON)
used in:
Species (species, Subject)
Cell species (cell_species, Tissue and Cell Processing)
CURIE summary
CURIE Prefix: NCBITAXON
CURIE IRI Prefixes: http://purl.obolibrary.org/obo/NCBITaxon_, http://purl.bioontology.org/ontology/NCBITAXON/
example AIRR use
“species.id” : “NCBITAXON:9606”
“species.label” : “Homo sapiens”
default root node
label: Gnathostomata
local ID: 7776
path: cellular organisms/Eukaryota/Opisthokonta/Metazoa/Eumetazoa/Bilateria/Deuterostomia/Chordata/Craniata/Vertebrata/Gnathostomata
license: UMLS
latest release (as of 2020-05-20): 2020-04-18
maintainer: NCBI (info@ncbi.nlm.nih.gov)
NCI thesaurus (NCIT)
used in:
Study type (study_type, Study)
CURIE summary
CURIE Prefix: NCIT
CURIE IRI Prefixes: http://purl.obolibrary.org/obo/NCIT_, http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#
example AIRR use
“study_type.id” : “NCIT:C15197”
“study_type.label” : “Case-Control Study”
default root node
label: Study
local ID: C63536
path: Activity/Clinical or Research Activity/Research Activity/Study
license: Public domain, credit of NCI is requested
repo: https://github.com/NCI-Thesaurus/thesaurus-obo-edition
latest release (as of 2020-05-20): 2020-05-04
maintainer: NCI (ncicbiitappssupport@mail.nih.gov)
Units of measurement ontology (UO)
used in:
Age unit (age_unit, Subject)
CURIE summary
CURIE Prefix: UO
CURIE IRI Prefix: http://purl.obolibrary.org/obo/UO_
example AIRR use
“age_unit.id” : “UO:0000036”
“age_unit.label” : “year”
default root node
label: time unit
local ID: UO_0000003
path: unit/time unit
license: CC BY (per Github repo)
repo: https://github.com/bio-ontology-research-group/unit-ontology
latest release (as of 2020-05-20): 2020-05-18
maintainer: unknown
Uber-anatomy ontology (Uberon)
used in:
Tissue (tissue, Sample)
CURIE summary
CURIE Prefix: UBERON
CURIE IRI Prefix: http://purl.obolibrary.org/obo/UBERON_
example AIRR use
“tissue.id” : “UBERON:0002371”
“tissue.label” : “bone marrow”
default root node
label: multicellular anatomical structure
local ID: UBERON:0010000
path: /BFO_0000002/BFO_0000004/anatomical entity/material anatomical entity/anatomical structure/multicellular anatomical structure
license: CC BY
latest release (as of 2020-05-20): 2019-11-22
maintainer: Chris Mungall, LBL, CA, US (cjmungall@lbl.gov)
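The field-to-ontology assignments listed in this section can be collected into a small lookup table, for example to check that a value uses the CURIE prefix approved for its field. This is a minimal sketch: the names `FIELD_PREFIX` and `uses_approved_prefix` are hypothetical, and treating prefixes as case-sensitive is an assumption.

```python
# Field names and CURIE prefixes as listed in the entries above.
FIELD_PREFIX = {
    "cell_subset": "CL",
    "disease_diagnosis": "DOID",
    "species": "NCBITAXON",
    "cell_species": "NCBITAXON",
    "study_type": "NCIT",
    "age_unit": "UO",
    "tissue": "UBERON",
}


def uses_approved_prefix(field, curie):
    """Return True if curie carries the prefix approved for field."""
    expected = FIELD_PREFIX.get(field)
    if expected is None:
        raise KeyError(f"no ontology registered for field: {field!r}")
    prefix, sep, local_id = curie.partition(":")
    # Assumption: prefixes are compared exactly (case-sensitive).
    return bool(sep and local_id) and prefix == expected


print(uses_approved_prefix("tissue", "UBERON:0002371"))   # True
print(uses_approved_prefix("species", "DOID:9538"))       # False
```

The check only covers the prefix; it does not confirm that the term sits below the default root node given for each ontology.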
Sprint Reports
[RFC3987] Internationalized Resource Identifiers (IRIs). DOI:10.17487/RFC3987