Alignment Schema (Experimental)

See the format overview for details on how to structure this data.

Note: this schema definition is still experimental and should not be considered final.


Download as TSV.

| Name | Type | Priority | Description |
|------|------|----------|-------------|
| sequence_id | string | required | Unique query sequence identifier within the file. Most often this will be the input sequence header or a substring thereof, but it may also be a custom identifier defined by the tool in cases where query sequences have been combined in some fashion prior to alignment. |
| segment | string | required | The gene segment for this alignment. One of V, D, J or C. |
| rev_comp | boolean | optional | Alignment result is from the reverse complement of the query sequence. |
| call | string | required | Allele assignment. |
| score | number | required | Alignment score. |
| identity | number | optional | Alignment fractional identity. |
| support | number | optional | Alignment E-value, p-value, likelihood, probability or other similar measure of support for the segment assignment as defined by the alignment tool. |
| cigar | string | required | Alignment CIGAR string. |
| sequence_start | integer | optional | Start position of the segment in the query sequence (1-based closed interval). |
| sequence_end | integer | optional | End position of the segment in the query sequence (1-based closed interval). |
| germline_start | integer | optional | Alignment start position in the reference sequence (1-based closed interval). |
| germline_end | integer | optional | Alignment end position in the reference sequence (1-based closed interval). |
| rank | integer | optional | Alignment rank. |
| rearrangement_id | string | optional | Identifier for the Rearrangement object. May be identical to sequence_id, but will usually be a universally unique record locator for database applications. |
| rearrangement_set_id | string | optional | Identifier for grouping Rearrangement objects. |
| germline_database | string | optional | Source of germline V(D)J segments, with version number or date accessed. For example, ‘IMGT/GENE-DB 3.1.18 (15 March 2018)’. |
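
The table above can be read as a set of per-field checks. The following is a minimal sketch, not a reference implementation, of how a tool might parse one tab-delimited alignment row and check it against these definitions. The names `ALIGNMENT_FIELDS`, `parse_alignment_row` and `query_slice` are hypothetical, the `T`/`F` boolean encoding and the `end >= start` check are assumptions, and the example row is made up for illustration.

```python
import csv
import io

# Field name -> (Python type, required?), transcribed from the table above.
ALIGNMENT_FIELDS = {
    "sequence_id": (str, True),
    "segment": (str, True),
    "rev_comp": (bool, False),
    "call": (str, True),
    "score": (float, True),
    "identity": (float, False),
    "support": (float, False),
    "cigar": (str, True),
    "sequence_start": (int, False),
    "sequence_end": (int, False),
    "germline_start": (int, False),
    "germline_end": (int, False),
    "rank": (int, False),
    "rearrangement_id": (str, False),
    "rearrangement_set_id": (str, False),
    "germline_database": (str, False),
}

# Assumption: booleans are written as T/F; adjust to the encoding actually used.
BOOL_VALUES = {"T": True, "F": False}


def parse_alignment_row(row):
    """Convert a dict of strings to typed values; raise ValueError on problems."""
    record = {}
    for name, (kind, required) in ALIGNMENT_FIELDS.items():
        raw = row.get(name, "")
        if raw is None or raw == "":
            if required:
                raise ValueError(f"missing required field: {name}")
            record[name] = None
        elif kind is bool:
            if raw not in BOOL_VALUES:
                raise ValueError(f"{name}: expected T or F, got {raw!r}")
            record[name] = BOOL_VALUES[raw]
        else:
            record[name] = kind(raw)  # int()/float() raise ValueError on bad input

    if record["segment"] not in {"V", "D", "J", "C"}:
        raise ValueError(f"segment must be V, D, J or C, got {record['segment']!r}")

    # Coordinates are 1-based closed intervals, so the smallest valid start is 1.
    # Assumption: an interval whose end precedes its start is treated as an error.
    for prefix in ("sequence", "germline"):
        start, end = record[f"{prefix}_start"], record[f"{prefix}_end"]
        if start is not None and start < 1:
            raise ValueError(f"{prefix}_start must be >= 1")
        if start is not None and end is not None and end < start:
            raise ValueError(f"{prefix}_end must be >= {prefix}_start")
    return record


def query_slice(record):
    """Map the 1-based closed query interval to a 0-based half-open slice."""
    return slice(record["sequence_start"] - 1, record["sequence_end"])


# Made-up example row containing the required fields plus query coordinates.
EXAMPLE_TSV = (
    "sequence_id\tsegment\tcall\tscore\tcigar\tsequence_start\tsequence_end\n"
    "seq1\tV\tIGHV1-2*01\t120.5\t10M\t3\t12\n"
)
rows = list(csv.DictReader(io.StringIO(EXAMPLE_TSV), delimiter="\t"))
record = parse_alignment_row(rows[0])
query = "ACGTACGTACGTACGT"
aligned = query[query_slice(record)]
print(aligned)
```

Because the coordinates are 1-based and closed, positions 3 to 12 cover ten bases, which `query_slice` turns into the Python slice `[2:12]`. Optional fields that are absent from the file are stored as `None` rather than rejected.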