Alignment Schema (Experimental)

An Alignment record is the output of a V(D)J assignment process for a single V, D, J, or C gene of a query sequence. The assignment process need not run a sequence alignment algorithm; the schema can represent the output of any assignment method. A single sequence is expected to have multiple Alignment records. The context-dependent fields (score, identity, support, rank) are used to assess the quality of each assignment, and their definitions can vary considerably with the methodology used.
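Because one sequence can carry several Alignment records per segment, a consumer typically has to group them and pick one. The following is a minimal sketch of that idea, not part of the specification. The records are made-up illustrative values, the helper names (`group_alignments`, `sort_key`) are hypothetical, and the ordering rule (lower `rank` is better, with higher `score` as a tie-breaker) is an assumption, since the meaning of these fields depends on the tool that produced them.

```python
from collections import defaultdict

# Hypothetical records for illustration only; the values are made up.
records = [
    {"sequence_id": "seq1", "segment": "V", "call": "IGHV1-69*01", "score": 410.0, "rank": 1},
    {"sequence_id": "seq1", "segment": "V", "call": "IGHV1-69*02", "score": 405.0, "rank": 2},
    {"sequence_id": "seq1", "segment": "J", "call": "IGHJ4*02", "score": 88.0, "rank": 1},
]


def group_alignments(rows):
    """Group Alignment records by (sequence_id, segment)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["sequence_id"], row["segment"])].append(row)
    return dict(grouped)


def sort_key(row):
    """Assumed ordering: lower rank first, then higher score.

    Records with a null rank or score sort after records that have one.
    """
    rank = row.get("rank")
    score = row.get("score")
    return (
        rank is None,
        rank if rank is not None else 0,
        -(score if score is not None else float("-inf")),
    )


grouped = group_alignments(records)
best = {key: min(rows, key=sort_key) for key, rows in grouped.items()}

for (sequence_id, segment), row in sorted(best.items()):
    print(sequence_id, segment, row["call"])
```

A tool that defines `rank` or `score` differently would need a different `sort_key`; the grouping step stays the same.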

Note: this schema definition is still experimental and should not be considered final.

File Format Specification

The format specification describes the file format and how to structure the data.

Fields


Name	Type	Attributes	Definition
`sequence_id`	string	required, identifier, nullable	Unique query sequence identifier within the file. Most often this will be the input sequence header or a substring thereof, but may also be a custom identifier defined by the tool in cases where query sequences have been combined in some fashion prior to alignment.
`segment`	string	required, nullable	The segment for this alignment. One of V, D, J or C.
`rev_comp`	boolean	optional, nullable	Alignment result is from the reverse complement of the query sequence.
`call`	string	required, nullable	Gene assignment with allele.
`score`	number	required, nullable	Alignment score.
`identity`	number	optional, nullable	Alignment fractional identity.
`support`	number	optional, nullable	Alignment E-value, p-value, likelihood, probability or other similar measure of support for the gene assignment as defined by the alignment tool.
`cigar`	string	required, nullable	Alignment CIGAR string.
`sequence_start`	integer	optional, nullable	Start position of the segment in the query sequence (1-based closed interval).
`sequence_end`	integer	optional, nullable	End position of the segment in the query sequence (1-based closed interval).
`germline_start`	integer	optional, nullable	Alignment start position in the reference sequence (1-based closed interval).
`germline_end`	integer	optional, nullable	Alignment end position in the reference sequence (1-based closed interval).
`rank`	integer	optional, nullable	Alignment rank.
`rearrangement_id`	string	DEPRECATED	Identifier for the Rearrangement object. May be identical to sequence_id, but will usually be a universally unique record locator for database applications.
`data_processing_id`	string	optional, nullable	Identifier of the data processing object in the repertoire metadata for this rearrangement. If this field is empty, then the primary data processing object is assumed.
`germline_database`	string	DEPRECATED	Source of germline V(D)J genes with version number or date accessed.
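The field table above can be exercised with a small reader. This is a sketch under stated assumptions rather than a reference implementation: it assumes a tab-separated file with a header row, null values written as empty strings, and booleans written as `T` and `F`. It also reads "required, nullable" as "the column must be present, but a value may be empty". The function names (`parse_row`, `read_alignments`, `aligned_length`, `query_slice`) and the sample row are hypothetical.

```python
import csv
import io

REQUIRED = ("sequence_id", "segment", "call", "score", "cigar")
INTEGER_FIELDS = ("sequence_start", "sequence_end",
                  "germline_start", "germline_end", "rank")
NUMBER_FIELDS = ("score", "identity", "support")
SEGMENTS = {"V", "D", "J", "C"}


def parse_row(raw):
    """Convert one row of strings to typed values (empty string -> None)."""
    row = {}
    for name, value in raw.items():
        if value is None or value == "":
            row[name] = None
        elif name in INTEGER_FIELDS:
            row[name] = int(value)
        elif name in NUMBER_FIELDS:
            row[name] = float(value)
        elif name == "rev_comp":
            row[name] = value == "T"  # assumption: booleans are T/F
        else:
            row[name] = value
    return row


def read_alignments(handle):
    """Read Alignment records and check required columns and segment values."""
    reader = csv.DictReader(handle, delimiter="\t")
    missing = [n for n in REQUIRED if n not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = [parse_row(raw) for raw in reader]
    for row in rows:
        if row["segment"] is not None and row["segment"] not in SEGMENTS:
            raise ValueError(f"unexpected segment: {row['segment']!r}")
    return rows


def aligned_length(start, end):
    """Length of a 1-based closed interval, or None if either end is null."""
    if start is None or end is None:
        return None
    return end - start + 1


def query_slice(sequence, start, end):
    """Extract a 1-based closed interval using a 0-based half-open slice."""
    return sequence[start - 1:end]


# Made-up sample row for illustration only.
header = ["sequence_id", "segment", "rev_comp", "call", "score",
          "cigar", "sequence_start", "sequence_end"]
values = ["seq1", "V", "F", "IGHV1-69*01", "410.0", "", "1", "296"]
text = "\t".join(header) + "\n" + "\t".join(values) + "\n"

rows = read_alignments(io.StringIO(text))
print(rows[0]["call"], aligned_length(rows[0]["sequence_start"], rows[0]["sequence_end"]))
```

The coordinate helpers matter because the position fields are 1-based closed intervals: a segment spanning positions 1 to 296 covers 296 bases, and the equivalent Python slice is `sequence[0:296]`.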
