Alignment Schema (Experimental)
An Alignment is the output from a V(D)J assignment process for a
single V, D, J, or C gene for a sequence. It is not necessary
that the V(D)J assignment process performs a sequence alignment
algorithm, as the schema can support any algorithmic process. Multiple
Alignment records are supported and expected for a single sequence
with context-dependent fields (score, identity, support, rank) for assessing the quality of assignments that can vary
considerably in definition based on the methodology used.
Note, this schema definition is still experimental and should not be considered final.
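As a rough illustration of several Alignment records existing for one sequence, the sketch below groups records by query and segment and keeps the top-ranked one. The dictionary keys (sequence_id, segment, call, score, rank) are assumed to match the schema's field names, the record values are invented, and the rule that a lower rank is better is an assumption, since the meaning of rank depends on the tool.

```python
from collections import defaultdict

# Hypothetical Alignment records for a single query sequence (values are made up).
records = [
    {"sequence_id": "seq1", "segment": "V", "call": "IGHV1-2*02", "score": 310.0, "rank": 1},
    {"sequence_id": "seq1", "segment": "V", "call": "IGHV1-2*04", "score": 305.0, "rank": 2},
    {"sequence_id": "seq1", "segment": "J", "call": "IGHJ4*02", "score": 88.0, "rank": 1},
]


def top_ranked(records):
    """Return the best-ranked record per (sequence_id, segment).

    Assumption: a lower rank means a better assignment. Records whose
    rank is null (None) are only chosen if nothing else is available.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["sequence_id"], rec["segment"])].append(rec)

    best = {}
    for key, recs in groups.items():
        # Sort key: records with a rank come first, then by ascending rank.
        best[key] = min(recs, key=lambda r: (r.get("rank") is None, r.get("rank") or 0))
    return best


best = top_ranked(records)
for (seq_id, segment), rec in sorted(best.items()):
    print(seq_id, segment, rec["call"])
```

Because score, identity, and support are defined by each tool, comparing them across records produced by different methods is unlikely to be meaningful; the sketch relies on rank alone for that reason.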
File Format Specification
The format specification describes the file format and how to structure this data.
Fields
Name | Type | Attributes | Definition
---|---|---|---
sequence_id | string | required, identifier, nullable | Unique query sequence identifier within the file. Most often this will be the input sequence header or a substring thereof, but may also be a custom identifier defined by the tool in cases where query sequences have been combined in some fashion prior to alignment.
segment | string | required, nullable | The segment for this alignment. One of V, D, J or C.
rev_comp | boolean | optional, nullable | Alignment result is from the reverse complement of the query sequence.
call | string | required, nullable | Gene assignment with allele.
score | number | required, nullable | Alignment score.
identity | number | optional, nullable | Alignment fractional identity.
support | number | optional, nullable | Alignment E-value, p-value, likelihood, probability or other similar measure of support for the gene assignment as defined by the alignment tool.
cigar | string | required, nullable | Alignment CIGAR string.
sequence_start | integer | optional, nullable | Start position of the segment in the query sequence (1-based closed interval).
sequence_end | integer | optional, nullable | End position of the segment in the query sequence (1-based closed interval).
germline_start | integer | optional, nullable | Alignment start position in the reference sequence (1-based closed interval).
germline_end | integer | optional, nullable | Alignment end position in the reference sequence (1-based closed interval).
rank | integer | optional, nullable | Alignment rank.
rearrangement_id | string | DEPRECATED | Identifier for the Rearrangement object. May be identical to sequence_id, but will usually be a universally unique record locator for database applications.
data_processing_id | string | optional, nullable | Identifier to the data processing object in the repertoire metadata for this rearrangement. If this field is empty then the primary data processing object is assumed.
germline_database | string | DEPRECATED | Source of germline V(D)J genes with version number or date accessed.
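The attributes and coordinate conventions in the table can be checked with a few lines of code. The sketch below is not part of the specification: the function names are hypothetical, the keys sequence_id, segment, call, score, cigar, sequence_start, and sequence_end are assumed to be the names of the fields described above, and "required, nullable" is read as "the key must be present but its value may be null".

```python
# Fields marked "required" in the table (assumed key names).
REQUIRED = ("sequence_id", "segment", "call", "score", "cigar")
# Allowed values for the segment field, per its definition.
SEGMENTS = {"V", "D", "J", "C"}


def check_alignment(rec):
    """Return a list of problems found in one Alignment record (a dict)."""
    # Required fields must be present, though their value may be None.
    problems = [f"missing required field: {name}" for name in REQUIRED if name not in rec]

    segment = rec.get("segment")
    if segment is not None and segment not in SEGMENTS:
        problems.append(f"invalid segment: {segment!r}")

    # Coordinates are 1-based and closed, so start >= 1 and start <= end.
    start, end = rec.get("sequence_start"), rec.get("sequence_end")
    if start is not None and end is not None and not (1 <= start <= end):
        problems.append("sequence_start/sequence_end must satisfy 1 <= start <= end")
    return problems


def query_slice(query, rec):
    """Extract the aligned part of the query from 1-based closed coordinates."""
    start, end = rec["sequence_start"], rec["sequence_end"]
    # 1-based closed [start, end] maps to the Python slice [start - 1:end].
    return query[start - 1:end]


# Made-up record used only to exercise the helpers.
example = {
    "sequence_id": "seq1", "segment": "V", "call": "IGHV1-2*02",
    "score": 310.0, "cigar": "4M", "sequence_start": 3, "sequence_end": 6,
}
print(check_alignment(example))
print(query_slice("AACCGGTT", example))
```

The same offset applies to the reference coordinates: a 1-based closed interval of length n always satisfies end - start + 1 == n.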