This document describes the AIRR Data Model. It begins with an
overview of the structure and semantics of the Repertoire schema,
including best practices for documenting data processing, principles for
linking related data, and definitions of key concepts such as
Repertoire and Rearrangement. This is followed by a specification
of the file format and a detailed description of individual Repertoire
fields.
Repertoire Schema#
A Repertoire is an abstract organizational unit of analysis that
is defined by the researcher and consists of study metadata, subject
metadata, sample metadata, cell processing metadata, nucleic acid
processing metadata, sequencing run metadata, a set of raw sequence
files, data processing metadata, and a set of Rearrangements. A
Repertoire gathers all of this information together into a
composite object, which can be easily accessed by computer programs
for data entry, analysis and visualization.
A Repertoire is specific to a single subject and, ideally, to a
specific sample, with any number of raw sequence files, and any number
of rearrangements. It can also consist of any number of data
processing metadata objects that describe the processing of raw
sequence files into Rearrangements.
Typically, a Repertoire corresponds to the biological concept of
the immune repertoire which the researcher experimentally measures
and computationally analyzes. However, researchers can have different
interpretations about what constitutes the biological immune repertoire;
therefore, the Repertoire schema attempts to be flexible and broadly
useful for all AIRR-seq studies.
Multiple Data Processing on a Repertoire#
Data processing can be a complicated multi-stage
process. Documenting the process in a formal way is challenging
because of the diversity of actions that may be performed. The MiAIRR
standard requires documentation of the process but in an informal way
with free text descriptions. A Repertoire might be processed in
several different ways for any number of reasons, e.g. to
compare the results from different toolchains, or to compare different
settings for the same toolchain.
It is expected that all of the Samples of a Repertoire will be
processed together within a DataProcessing. That is, a
DataProcessing that only uses some but not all samples in a
Repertoire could be confusing to users and appear as though data
is missing. Likewise, processing some samples within a Repertoire
with one DataProcessing and the remaining samples with a
different DataProcessing could also confuse users. Because
DataProcessing is unstructured information, it is not possible
to validate that all Samples in a Repertoire are being
processed together, so this expectation cannot be strictly
enforced.
Having multiple DataProcessing objects for a Repertoire will
create multiple sets of Rearrangements that are distinct and
separate from each other. Analysis tools need to be careful not to mix
Rearrangements from different DataProcessing objects
because doing so can generate incorrect results. The identifier
data_processing_id was added so that each Rearrangement can
identify its specific DataProcessing.
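As a minimal sketch of how an analysis tool might keep these sets apart, the following groups Rearrangement records by the pair (repertoire_id, data_processing_id) before any further analysis. It assumes the records are already loaded as Python dictionaries; the function name `group_by_processing`, the `sequence_id` values, and the identifier values are illustrative assumptions, not part of this specification.

```python
from collections import defaultdict


def group_by_processing(rearrangements):
    """Partition Rearrangement records by (repertoire_id, data_processing_id).

    Records without a data_processing_id are grouped under None for
    their repertoire.
    """
    groups = defaultdict(list)
    for rearrangement in rearrangements:
        key = (
            rearrangement["repertoire_id"],
            rearrangement.get("data_processing_id"),
        )
        groups[key].append(rearrangement)
    return dict(groups)


# Hypothetical records: one Repertoire annotated by two DataProcessing objects.
rearrangements = [
    {"sequence_id": "seq1", "repertoire_id": "rep-1", "data_processing_id": "dp-a"},
    {"sequence_id": "seq2", "repertoire_id": "rep-1", "data_processing_id": "dp-b"},
    {"sequence_id": "seq3", "repertoire_id": "rep-1", "data_processing_id": "dp-a"},
]
groups = group_by_processing(rearrangements)
```

Each value in `groups` can then be analyzed on its own, so that counts or frequencies are never computed across two DataProcessing results for the same Repertoire.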
Linking Data#
Each Repertoire has a unique repertoire_id identifier. This
identifier should be globally unique so that repertoires from multiple
studies can be combined together without conflict. The
repertoire_id is used to link other AIRR data to a
Repertoire. Specifically, the Rearrangements Schema includes repertoire_id for referencing the
specific Repertoire for that Rearrangement.
If a Repertoire has multiple DataProcessing then
data_processing_id should be used to distinguish the
appropriate DataProcessing within the Repertoire. The
Rearrangement schema contains data_processing_id for this
purpose. The data_processing_id is only unique within a
Repertoire, so repertoire_id should first be used to get the
appropriate Repertoire object and then data_processing_id
used to acquire the appropriate DataProcessing.
It is expected that typical Repertoires might only have a single
DataProcessing, in which case repertoire_id and
data_processing_id will be semantically equivalent and only the
former should be used.
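The two-step lookup described above could be sketched as follows. This is an illustration under stated assumptions: Repertoires are held in a dictionary keyed by repertoire_id, the list of DataProcessing objects is stored under a `data_processing` key, and the helper name `find_data_processing` is hypothetical. Raising an error when the identifier is missing but several DataProcessing objects exist is a design choice of this sketch, not a rule of the schema.

```python
def find_data_processing(repertoires_by_id, rearrangement):
    """Resolve the DataProcessing object that produced a Rearrangement."""
    # Step 1: repertoire_id selects the Repertoire.
    repertoire = repertoires_by_id[rearrangement["repertoire_id"]]
    processings = repertoire["data_processing"]  # assumed key name

    # Step 2: data_processing_id selects within that Repertoire only.
    dp_id = rearrangement.get("data_processing_id")
    if dp_id is None:
        if len(processings) == 1:
            # Single DataProcessing: repertoire_id alone is sufficient.
            return processings[0]
        raise ValueError("data_processing_id is needed to disambiguate")
    for processing in processings:
        if processing.get("data_processing_id") == dp_id:
            return processing
    raise KeyError(dp_id)


# Hypothetical data: "dp-1" appears in both Repertoires, which is allowed
# because data_processing_id is only unique within a Repertoire.
repertoires_by_id = {
    "rep-1": {
        "repertoire_id": "rep-1",
        "data_processing": [
            {"data_processing_id": "dp-1"},
            {"data_processing_id": "dp-2"},
        ],
    },
    "rep-2": {
        "repertoire_id": "rep-2",
        "data_processing": [{"data_processing_id": "dp-1"}],
    },
}
```

Looking up by data_processing_id alone would be ambiguous here, which is why the Repertoire is resolved first.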
If a Repertoire has multiple sample processing objects in the sample
array then sample_processing_id should be used to distinguish the
appropriate sample processing object within the Repertoire. The
Rearrangement object can contain a sample_processing_id to uniquely
identify a sample processing object within a Repertoire. Like
data_processing_id, the sample_processing_id is only unique within
the Repertoire so repertoire_id should first be used to get the
appropriate Repertoire object and then sample_processing_id should
be used to determine the appropriate sample processing object that is associated
with the Rearrangement. If the Rearrangement object does not have a
sample_processing_id then it can be assumed that the rearrangement is
associated with all of the samples in the Repertoire (e.g. the rearrangement
is a collapsed rearrangement across multiple samples).
It is expected that Repertoires might often have a single
sample processing object, in which case repertoire_id and
sample_processing_id will be semantically equivalent and only the
former should be used.
Finally, if it is necessary to link a Rearrangement object with a unique
pairing of sample processing and DataProcessing, the repertoire_id of
the Rearrangement object should be used to identify the correct Repertoire
object and then the data_processing_id should be used to identify the correct
DataProcessing metadata and the sample_processing_id should be used to
identify the correct sample processing metadata within that Repertoire.
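The sample side of this lookup, including the rule that a missing sample_processing_id means "all samples in the Repertoire", might look like the sketch below. It assumes the Repertoire is a Python dictionary whose `sample` key holds the array of sample processing objects; the function name `find_sample_processing` and the identifier values are illustrative.

```python
def find_sample_processing(repertoire, rearrangement):
    """Return the sample processing objects associated with a Rearrangement.

    The result is always a list: one entry when sample_processing_id is
    given, every entry of the sample array when it is absent.
    """
    samples = repertoire["sample"]
    sp_id = rearrangement.get("sample_processing_id")
    if sp_id is None:
        # No identifier: the rearrangement is associated with all samples,
        # e.g. a rearrangement collapsed across multiple samples.
        return list(samples)
    matches = [s for s in samples if s.get("sample_processing_id") == sp_id]
    if not matches:
        raise KeyError(sp_id)
    return matches


# Hypothetical Repertoire with two sample processing objects.
repertoire = {
    "repertoire_id": "rep-1",
    "sample": [
        {"sample_processing_id": "sp-1"},
        {"sample_processing_id": "sp-2"},
    ],
}
one = find_sample_processing(
    repertoire, {"repertoire_id": "rep-1", "sample_processing_id": "sp-2"}
)
everything = find_sample_processing(repertoire, {"repertoire_id": "rep-1"})
```

The caller is expected to have already selected `repertoire` by the Rearrangement's repertoire_id, since sample_processing_id carries no meaning outside that Repertoire.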
Duality between Repertoires and Rearrangements#
There is an important duality relationship between Repertoires and
Rearrangements, specifically with the experimental protocols
described in the Repertoire versus the annotations on
Rearrangements. A Repertoire defines an experimental design
for what a researcher intends to measure or observe, while the
Rearrangements are what was actually measured and
observed. Technically, the border between the two occurs at
sequencing, that is, when the physical biological entity (prepared DNA)
is measured and recorded as information (nucleotide sequence).
This duality is important when considering how to answer certain
questions. For example, locus for Rearrangements may have the
value “IGH” which indicates that B cell heavy chain receptors were
measured, yet the Repertoire might have “T cell” in
cell_subset which indicates the researcher intended to measure T
cells. This conflict between the two indicates something is
wrong. Such differences can arise in many ways, for example through errors in the
experimental protocol, or through data processing that incorrectly
processed the raw sequencing data and produced invalid annotations.
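A consistency check of this kind could be sketched as follows. The sketch makes several simplifying assumptions that are not part of the schema: cell_subset is treated as a plain text label (as in the "T cell" example above), the intended lineage is inferred from the words "B cell" or "T cell" appearing in that label, and the first two letters of locus ("IG" or "TR") indicate what was measured. The function name `locus_conflicts_with_subset` is hypothetical.

```python
import re

# Assumed mapping from locus prefix to the lineage expected in cell_subset.
EXPECTED_LINEAGE = {"IG": "b cell", "TR": "t cell"}


def _mentions(label, lineage):
    """True if the lineage appears as whole words in the label."""
    return re.search(rf"\b{re.escape(lineage)}\b", label) is not None


def locus_conflicts_with_subset(locus, cell_subset):
    """Return True when the measured locus contradicts the intended subset.

    Returns False when no judgement is possible, e.g. an unknown locus
    prefix or a label that names neither lineage.
    """
    expected = EXPECTED_LINEAGE.get(locus[:2].upper())
    if expected is None or not cell_subset:
        return False
    label = cell_subset.lower()
    if _mentions(label, expected):
        return False
    others = [v for v in EXPECTED_LINEAGE.values() if v != expected]
    return any(_mentions(label, other) for other in others)


conflict = locus_conflicts_with_subset("IGH", "T cell")
consistent = locus_conflicts_with_subset("TRB", "T cell")
```

A conflict flagged this way only says that the two sides disagree. Deciding whether the protocol or the data processing is at fault still requires inspecting both.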
RepertoireFilter Schema#
As a Repertoire corresponds to a discrete biological unit, it
will often be the case that an experiment or analysis will span
multiple Repertoires. Common examples include comparing
individuals with and without a particular diagnosis or tracking
repertoire evolution across a time series. Conversely, a
researcher may sometimes be interested in only a specific subset
of a Repertoire such as “productive rearrangements for IGHV4”.
All of these cases can be represented using an array of
RepertoireFilters contained in a RepertoireGroup.
A RepertoireFilter incorporates its underlying Repertoires
by reference to their repertoire_ids and thus retains the
ability to access all of the associated MiAIRR metadata. The
RepertoireFilter also describes the selection criteria for
the included repertoires and how they have been filtered by
building a query equivalent to one that would be used in the
ADC API.
RepertoireGroups can be associated with the same study as
the underlying Repertoires or a new one, as appropriate.
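To make the idea of a stored selection query concrete, the sketch below pairs a repertoire_id with a query for "productive rearrangements for IGHV4" and evaluates it against a few records. Everything beyond repertoire_id is an assumption made for illustration: the `filter` key name, the nested `op`/`content`/`field`/`value` query shape (modelled on how ADC API filters are commonly written), the three operators supported, and the Rearrangement fields `productive` and `v_call`. The ADC API documentation is the authority on the actual query syntax.

```python
def matches(query, record):
    """Evaluate a small subset of an ADC-API-style filter against a record."""
    op, content = query["op"], query["content"]
    if op == "and":
        return all(matches(sub_query, record) for sub_query in content)
    if op == "=":
        return record.get(content["field"]) == content["value"]
    if op == "contains":
        value = record.get(content["field"])
        return isinstance(value, str) and content["value"] in value
    raise ValueError(f"unsupported operator in this sketch: {op}")


# Hypothetical RepertoireFilter-like object; only repertoire_id is a
# field name taken from the text above.
repertoire_filter = {
    "repertoire_id": "rep-1",
    "filter": {
        "op": "and",
        "content": [
            {"op": "=", "content": {"field": "productive", "value": True}},
            {"op": "contains", "content": {"field": "v_call", "value": "IGHV4"}},
        ],
    },
}

# Hypothetical Rearrangement records.
records = [
    {"repertoire_id": "rep-1", "productive": True, "v_call": "IGHV4-34*01"},
    {"repertoire_id": "rep-1", "productive": False, "v_call": "IGHV4-59*01"},
    {"repertoire_id": "rep-1", "productive": True, "v_call": "IGHV1-69*01"},
]
selected = [
    r for r in records
    if r["repertoire_id"] == repertoire_filter["repertoire_id"]
    and matches(repertoire_filter["filter"], r)
]
```

Because the filter is stored as data rather than as code, the same object can document the selection and be replayed against a repository.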
File Format Specification#
Files are YAML/JSON with a structure defined below. Files should be
encoded as UTF-8. Identifiers are case-sensitive. Files should have the
extension .yaml, .yml, or .json.
File Structure#
The file as a whole is considered a dictionary (key/value pair) structure with the keys
Info and Repertoire.
The file can (optionally) contain an Info object, at the beginning of the file, based upon the Info schema in the OpenAPI V2 specification. If provided, version in Info should reference the version of the AIRR schema for the file.
The file should correspond to a list of Repertoire objects, using Repertoire as the key to the list.
Each Repertoire object should contain a top-level key/value pair for repertoire_id that uniquely identifies the repertoire.
Some fields require the use of a particular ontology or controlled vocabulary.
The structure is the same regardless of whether the data is stored in a file or a data repository. For example, the ADC API will return a properly structured JSON object that can be saved to a file and used directly without modification.
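The structure described above can be illustrated with a small JSON document and a loader that checks the points made in this section. This is a sketch using only the Python standard library, not a replacement for validation against the schema. The function name `load_repertoires`, the title, the version string, and the identifier values are hypothetical, and YAML files would need a separate parser.

```python
import json


def load_repertoires(text):
    """Parse a repertoire metadata document and index it by repertoire_id.

    Returns (info, repertoires_by_id), where info is None if the optional
    Info object is absent.
    """
    data = json.loads(text)
    repertoires = data["Repertoire"]  # list of Repertoire objects
    by_id = {}
    for repertoire in repertoires:
        repertoire_id = repertoire.get("repertoire_id")
        if repertoire_id is None:
            raise ValueError("Repertoire without repertoire_id")
        if repertoire_id in by_id:
            raise ValueError(f"duplicate repertoire_id: {repertoire_id}")
        by_id[repertoire_id] = repertoire
    return data.get("Info"), by_id


# Hypothetical minimal document; real Repertoire objects carry the
# study, subject, sample and data processing metadata as well.
document = """
{
  "Info": {"title": "Example repertoire metadata", "version": "0.0"},
  "Repertoire": [
    {"repertoire_id": "rep-1"},
    {"repertoire_id": "rep-2"}
  ]
}
"""
info, repertoires_by_id = load_repertoires(document)
```

Indexing by repertoire_id at load time makes the later lookups from Rearrangement records to their Repertoire a simple dictionary access.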
Schema Field Definitions#
Repertoire Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
optional, identifier, nullable |
Identifier for the repertoire object. This identifier should be globally unique so that repertoires from multiple studies can be combined together without conflict. The repertoire_id is used to link other AIRR data to a Repertoire. Specifically, the Rearrangements Schema includes repertoire_id for referencing the specific Repertoire for that Rearrangement. |
|
string |
optional, nullable |
Short generic display name for the repertoire |
|
string |
optional, nullable |
Generic repertoire description |
|
string |
optional, nullable |
Repertoire type (source). Most often the type should be “observed,” meaning it corresponds to an actual physical sample that was sequenced. Other allowed values are “simulated,” i.e. the Rearrangements were generated in silico and there is no linked Subject/Sample metadata, and “inferred” for Rearrangements that are phylogenetically reconstructed from observed sequences. Inferred Repertoires should point to the same Subject/Sample metadata as the corresponding physical Repertoires. |
|
required |
Study object |
|
|
required |
Subject object |
|
|
array of SampleProcessing |
required |
List of Sample Processing objects |
|
array of DataProcessing |
required |
List of Data Processing objects |
Repertoire Filter Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
optional |
Identifier to the repertoire |
|
string |
optional, nullable |
Description of this repertoire within the group |
|
optional, nullable |
Time point designation for this repertoire within the group |
|
|
object |
optional, nullable |
A JSON object describing how this Repertoire was filtered using the same structure as an ADC API query |
Study Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
Unique ID assigned by study registry such as one of the International Nucleotide Sequence Database Collaboration (INSDC) repositories. |
|
string |
required, nullable |
Descriptive study title |
|
required, nullable |
Type of study design |
|
|
string |
optional, nullable |
Generic study description |
|
string |
required, nullable |
List of criteria for inclusion/exclusion for the study |
|
string |
required, nullable |
Funding agencies and grant numbers |
|
array of Contributor |
required |
List of individuals who contributed to the study. Note that these are not necessarily identical with the authors on an associated manuscript or other scholarly communication. Further note that typically at least the three CRediT contributor roles “supervision”, “investigation” and “data curation” should be assigned. The corresponding author should be listed last. |
|
string |
DEPRECATED |
Full contact information of the contact persons for this study. This should include an e-mail address and a persistent identifier such as an ORCID ID. |
|
string |
DEPRECATED |
Full contact information of the data collector, i.e. the person who is legally responsible for data collection and release. This should include an e-mail address and a persistent identifier such as an ORCID ID. |
|
string |
DEPRECATED |
Department of data collector |
|
string |
DEPRECATED |
Institution and institutional address of data collector |
|
string |
DEPRECATED |
Full contact information of the data depositor, i.e., the person submitting the data to a repository. This should include an e-mail address and a persistent identifier such as an ORCID ID. This is supposed to be a short-lived and technical role until the submission is released. |
|
array of string |
required, nullable |
Array of publications describing the rationale and/or outcome of the study as an array of CURIE objects such as a DOI or Pubmed ID. Where more than one publication is given, if there is a primary publication for the study it should come first. |
|
array of string |
required, nullable |
Keywords describing properties of one or more data sets in a study. “contains_schema” keywords indicate that the study contains data objects from the AIRR Schema of that type (Rearrangement, Clone, Cell, Receptor) while the other keywords indicate that the study design considers the type of data indicated (e.g. it is possible to have a study that “contains_paired_chain” but does not “contains_schema_cell”). |
|
string |
optional, nullable |
Date the study was first published in the AIRR Data Commons. |
|
string |
optional, nullable |
Date the study data was updated in the AIRR Data Commons. |
Subject Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
Subject ID assigned by submitter, unique within study. If possible, a persistent subject ID linked to an INSDC or similar repository study should be used. |
|
boolean |
required |
TRUE for libraries in which the diversity has been synthetically generated (e.g. phage display) |
|
required |
Binomial designation of subject’s species |
|
|
DEPRECATED |
Binomial designation of subject’s species |
|
|
string |
required, nullable |
Biological sex of subject |
|
required, nullable |
Age of subject expressed as a time interval. If singular time point then min == max in the time interval. |
|
|
string |
required, nullable |
Event in the study schedule to which Age refers. For NCBI BioSample this MUST be sampling. For other implementations submitters need to be aware that there is currently no mechanism to encode the potential delta between Age event and Sample collection time, hence the chosen events should be in temporal proximity. |
|
number |
DEPRECATED |
|
|
number |
DEPRECATED |
|
|
DEPRECATED |
||
|
required, nullable |
Broad geographic origin of ancestry (continent) |
|
|
optional, nullable |
Self-reported location of birth of the subject, preferred granularity is country-level |
|
|
string |
required, nullable |
Ethnic group of subject (defined as cultural/language-based membership) |
|
string |
required, nullable |
Racial group of subject (as defined by NIH) |
|
string |
required, nullable |
Non-human designation of the strain or breed of animal used |
|
string |
required, nullable |
Subject ID to which Relation type refers |
|
string |
required, nullable |
Relation between subject and linked_subjects, can be genetic or environmental (e.g. exposure) |
|
array of Diagnosis |
optional |
Diagnosis information for subject |
|
optional, nullable |
Diagnosis Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, nullable |
Designation of study arm to which the subject is assigned |
|
optional, nullable |
Time point for the diagnosis |
|
|
required, nullable |
Diagnosis of subject |
|
|
required, nullable |
Time duration between initial diagnosis and current intervention |
|
|
string |
required, nullable |
Stage of disease at current intervention |
|
string |
required, nullable |
List of all relevant previous therapies applied to subject for treatment of Diagnosis |
|
string |
required, nullable |
Antigen, vaccine or drug applied to subject at this intervention |
|
string |
required, nullable |
Description of intervention |
|
string |
required, nullable |
Medical history of subject that is relevant to assess the course of disease and/or treatment |
Sample Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
Sample ID assigned by submitter, unique within study. If possible, a persistent sample ID linked to INSDC or similar repository study should be used. |
|
string |
required, nullable |
The way the sample was obtained, e.g. fine-needle aspirate, organ harvest, peripheral venous puncture |
|
required, nullable |
The actual tissue sampled, e.g. lymph node, liver, peripheral blood |
|
|
string |
required, nullable |
The anatomic location of the tissue, e.g. Inguinal, femur |
|
string |
required, nullable |
Histopathologic evaluation of the sample |
|
required, nullable |
Time point at which sample was taken, relative to label event |
|
|
DEPRECATED |
||
|
string |
DEPRECATED |
Event in the study schedule to which Sample collection time relates |
|
optional, nullable |
Location where the sample was taken, preferred granularity is country-level |
|
|
string |
required, nullable |
Name and address of the entity providing the sample |
Sample Processing Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
optional, identifier, nullable |
Identifier for the sample processing object. This field should be unique within the repertoire. This field can be used to uniquely identify the combination of sample, cell processing, nucleic acid processing and sequencing run information for the repertoire. |
Tissue and Cell Processing Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, nullable |
Enzymatic digestion and/or physical methods used to isolate cells from sample |
|
required, nullable |
Commonly-used designation of isolated cell population |
|
|
string |
required, nullable |
List of cellular markers and their expression levels used to isolate the cell population. |
|
string |
optional, nullable |
Free text cell type annotation. Primarily used for annotating cell types that are not provided in the Cell Ontology. |
|
optional, nullable |
Binomial designation of the species from which the analyzed cells originate. Typically, this value should be identical to species, in which case it SHOULD NOT be set explicitly. However, there are valid experimental setups in which the two might differ, e.g., chimeric animal models. If set, this key will overwrite the species information for all lower layers of the schema. |
|
|
boolean |
required, nullable |
TRUE if single cells were isolated into separate compartments |
|
integer |
required, nullable |
Total number of cells that went into the experiment |
|
integer |
required, nullable |
Number of cells for each biological replicate |
|
boolean |
required, nullable |
TRUE if cells were cryo-preserved between isolation and further processing |
|
string |
required, nullable |
Relative amount of viable cells after preparation and (if applicable) thawing |
|
string |
required, nullable |
Description of the procedure used for marker-based isolation or enrichment of cells |
|
string |
required, nullable |
Description of the methods applied to the sample including cell preparation/ isolation/enrichment and nucleic acid extraction. This should closely mirror the Materials and methods section in the manuscript. |
Nucleic Acid Processing Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required |
The class of nucleic acid that was used as primary starting material for the following procedures |
|
string |
required, nullable |
Description and results of the quality control performed on the template material |
|
PhysicalQuantity |
required, nullable |
Amount of template that went into the process |
|
DEPRECATED |
||
|
string |
required |
Generic type of library generation |
|
string |
required, nullable |
Description of processes applied to substrate to obtain a library that is ready for sequencing |
|
string |
required, nullable |
When using a library generation protocol from a commercial provider, provide the protocol version number |
|
array of PCRTarget |
optional |
If a PCR step was performed that specifically targets the IG/TR loci, the target and primer locations need to be provided here. This field holds an array of PCRTarget objects, so that multiplex PCR setups amplifying multiple loci at the same time can be annotated using one record per locus. PCR setups not targeting any specific locus must not annotate this field but select the appropriate library_generation_method instead. |
|
string |
required |
To be considered complete, the procedure used for library construction MUST generate sequences that 1) include the first V gene codon that encodes the mature polypeptide chain (i.e. after the leader sequence) and 2) include the last complete codon of the J gene (i.e. 1 bp 5’ of the J->C splice site) and 3) provide sequence information for all positions between 1) and 2). To be considered complete & untemplated, the sections of the sequences defined in points 1) to 3) of the previous sentence MUST be untemplated, i.e. MUST NOT overlap with the primers used in library preparation. mixed should only be used if the procedure used for library construction will likely produce multiple categories of sequences in the given experiment. It SHOULD NOT be used as a replacement of a NULL value. |
|
string |
required |
In case an experimental setup is used that physically links nucleic acids derived from distinct Rearrangements before library preparation, this field describes the mode of that linkage. All hetero_* terms indicate that in case of paired-read sequencing, the two reads should be expected to map to distinct IG/TR loci. *_head-head refers to techniques that link the 5’ ends of transcripts in a single-cell context. *_tail-head refers to techniques that link the 3’ end of one transcript to the 5’ end of another one in a single-cell context. This term does not provide any information whether a continuous reading-frame between the two is generated. *_prelinked refers to constructs in which the linkage was already present on the DNA level (e.g. scFv). |
PCR Target Locus Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, nullable |
Designation of the target locus. Note that this field uses a controlled vocabulary that is meant to provide a generic classification of the locus, not necessarily the correct designation according to a specific nomenclature. |
|
string |
required, nullable |
Position of the most distal nucleotide templated by the forward primer or primer mix |
|
string |
required, nullable |
Position of the most proximal nucleotide templated by the reverse primer or primer mix |
Sequencing Data Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
Persistent identifier of raw data stored in an archive (e.g. INSDC run ID). Data archive should be identified in the CURIE prefix. |
|
string |
required, nullable |
File format for the raw reads or sequences |
|
string |
required, nullable |
File name for the raw reads or sequences. The first file in paired-read sequencing. |
|
string |
required, nullable |
Read direction for the raw reads or sequences. The first file in paired-read sequencing. |
|
integer |
required, nullable |
Read length in bases for the first file in paired-read sequencing |
|
string |
required, nullable |
File name for the second file in paired-read sequencing |
|
string |
required, nullable |
Read direction for the second file in paired-read sequencing |
|
integer |
required, nullable |
Read length in bases for the second file in paired-read sequencing |
|
string |
optional, nullable |
File name for the index file |
|
integer |
optional, nullable |
Read length in bases for the index file |
Sequencing Run Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
ID of sequencing run assigned by the sequencing facility |
|
integer |
required, nullable |
Number of usable reads for analysis |
|
string |
required, nullable |
Designation of sequencing instrument used |
|
string |
required, nullable |
Name and address of sequencing facility |
|
string |
required, nullable |
Date of sequencing run |
|
string |
required, nullable |
Name, manufacturer, order and lot numbers of sequencing kit |
|
optional |
Set of sequencing files produced by the sequencing run |
Data Processing Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
optional, identifier, nullable |
Identifier for the data processing object. |
|
boolean |
optional, identifier |
If true, indicates this is the primary or default data processing for the repertoire and its rearrangements. If false, indicates this is a secondary or additional data processing. |
|
string |
required, nullable |
Version number and / or date, include company pipelines |
|
string |
required, nullable |
How paired end reads were assembled into a single receptor sequence |
|
string |
required, nullable |
How/if sequences were removed from (4) based on base quality scores |
|
string |
required, nullable |
How primers were identified in the sequences, were they removed/masked/etc? |
|
string |
required, nullable |
The method used for combining multiple sequences from (4) into a single sequence in (5) |
|
string |
required, nullable |
General description of how QC is performed |
|
array of string |
optional, nullable |
Array of file names for data produced by this data processing. |
|
string |
required, nullable |
Source of germline V(D)J genes with version number or date accessed. |
|
string |
optional, nullable |
Unique identifier of the germline set and version, in CURIE format |
|
string |
optional, nullable |
Identifier for machine-readable PROV model of analysis provenance |
Cell Processing Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, nullable |
Enzymatic digestion and/or physical methods used to isolate cells from sample |
|
required, nullable |
Commonly-used designation of isolated cell population |
|
|
string |
required, nullable |
List of cellular markers and their expression levels used to isolate the cell population. |
|
string |
optional, nullable |
Free text cell type annotation. Primarily used for annotating cell types that are not provided in the Cell Ontology. |
|
optional, nullable |
Binomial designation of the species from which the analyzed cells originate. Typically, this value should be identical to species, in which case it SHOULD NOT be set explicitly. However, there are valid experimental setups in which the two might differ, e.g., chimeric animal models. If set, this key will overwrite the species information for all lower layers of the schema. |
|
|
boolean |
required, nullable |
TRUE if single cells were isolated into separate compartments |
|
integer |
required, nullable |
Total number of cells that went into the experiment |
|
integer |
required, nullable |
Number of cells for each biological replicate |
|
boolean |
required, nullable |
TRUE if cells were cryo-preserved between isolation and further processing |
|
string |
required, nullable |
Relative amount of viable cells after preparation and (if applicable) thawing |
|
string |
required, nullable |
Description of the procedure used for marker-based isolation or enrichment of cells |
|
string |
required, nullable |
Description of the methods applied to the sample including cell preparation/ isolation/enrichment and nucleic acid extraction. This should closely mirror the Materials and methods section in the manuscript. |
Contributor Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
Unique identifier of this contributor within the file |
|
string |
required |
Full name of contributor |
|
string |
optional, nullable |
ORCID identifier of the contributor. Note that if present, the label of the ORCID record should take precedence over the name reported in the name property. |
|
string |
optional, nullable |
ROR of the contributor’s primary affiliation. Note that RORs are only minted for institutions, not for individual institutes, divisions or departments. |
|
string |
optional, nullable |
Additional information regarding the contributor’s primary affiliation. Can be used to specify individual institutes, divisions or departments. |
|
array of ContributorContribution |
optional, nullable |
List of all roles the contributor had in a project |
Subject Genotype Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
optional, nullable |
Immune receptor genotype set for this subject. |
|
|
optional, nullable |
MHC genotype set for this subject. |
Genotype Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
A unique identifier within the file for this Receptor Genotype, typically generated by the repository hosting the schema, for example from the underlying ID of the database record. |
|
string |
required |
Gene locus |
|
array of DocumentedAllele |
optional, nullable |
List of alleles documented in reference set(s) |
|
array of UndocumentedAllele |
optional, nullable |
List of alleles inferred to be present and not documented in an identified GermlineSet |
|
array of DeletedGene |
optional, nullable |
Array of genes identified as being deleted in this genotype |
|
string |
optional, nullable |
Information on how the genotype was acquired. Controlled vocabulary. |
Genotype Set Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
A unique identifier for this Receptor Genotype Set, typically generated by the repository hosting the schema, for example from the underlying ID of the database record. |
|
array of Genotype |
optional, nullable |
List of Genotypes included in this Receptor Genotype Set. |
MHC Genotype Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
A unique identifier for this MHCGenotype, assumed to be unique in the context of the study |
|
string |
required |
Class of MHC alleles described by the MHCGenotype |
|
array of MHCAllele |
required, nullable |
List of MHC alleles of the indicated mhc_class identified in an individual |
|
string |
optional, nullable |
Information on how the genotype was determined. The content of this field should come from a list of recommended terms provided in the AIRR Schema documentation. |
MHC Genotype Set Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
required, identifier, nullable |
A unique identifier for this MHCGenotypeSet |
|
array of MHCGenotype |
required, nullable |
List of MHCGenotypes included in this set |
Time Point Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
string |
optional, nullable |
Informative label for the time point |
|
number |
optional |
Value of the time point |
|
optional |
Unit of the time point |
Time Interval Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
number |
optional |
Lower/minimum value of the time interval |
|
number |
optional |
Upper/maximum value of the time interval |
|
optional |
Unit of the time interval |
Time Quantity Fields#
Name |
Type |
Attributes |
Definition |
|---|---|---|---|
|
number |
optional |
Time quantity |
|
optional |
Unit of time |