Clone and Lineage Tree Schema (Experimental)
A clone object represents a unique inferred clone that has been constructed within a single data processing, for a single repertoire, from a subset of its sequences and/or rearrangements.
A clone object may have one or more inferred lineage trees. Each tree is represented by a Newick string for its edges and a dictionary of node objects.
File Format Specification
The file format has not been specified yet.
Clone Fields
| Name | Type | Attributes | Definition |
|---|---|---|---|
| | string | required, nullable | Identifier for the clone. |
| | string | optional, nullable | Identifier to the associated repertoire in study metadata. |
| | string | optional, nullable | Identifier of the data processing object in the repertoire metadata for this clone. |
| | array of string | optional, nullable | List of sequence_id strings that act as keys to the Rearrangement records for members of the clone. |
| | string | optional, nullable | V gene with allele of the inferred ancestor of the clone. For example, IGHV4-59*01. |
| | string | optional, nullable | D gene with allele of the inferred ancestor of the clone. For example, IGHD3-10*01. |
| | string | optional, nullable | J gene with allele of the inferred ancestor of the clone. For example, IGHJ4*02. |
| | string | optional, nullable | Nucleotide sequence for the junction region of the inferred ancestor of the clone, where the junction is defined as the CDR3 plus the two flanking conserved codons. |
| | string | optional, nullable | Amino acid translation of the junction. |
| | integer | optional, nullable | Number of nucleotides in the junction. |
| | integer | optional, nullable | Number of amino acids in junction_aa. |
| | string | required, nullable | Assembled, aligned, full-length inferred ancestor of the clone spanning the same region as the sequence_alignment field of nodes (typically the V(D)J region) and including the same set of corrections and spacers (if any). |
| | string | optional, nullable | Amino acid translation of germline_alignment. |
| | integer | optional, nullable | Start position of the V gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | End position of the V gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | Start position of the D gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | End position of the D gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | Start position of the J gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | End position of the J gene alignment in both the sequence_alignment and germline_alignment fields (1-based closed interval). |
| | integer | optional, nullable | Junction region start position in the alignment (1-based closed interval). |
| | integer | optional, nullable | Junction region end position in the alignment (1-based closed interval). |
| | integer | optional, nullable | Number of distinct UMIs observed across all sequences (Rearrangement records) in this clone. |
| | integer | optional, nullable | Absolute count of the size (number of members) of this clone in the repertoire. This could simply be the number of sequences (Rearrangement records) observed in this clone, the number of distinct cell barcodes (unique cell_id values), or a more sophisticated calculation appropriate to the experimental protocol. An absolute count is provided rather than a frequency so that downstream analysis tools can perform their own normalization. |
| | string | optional, nullable | sequence_id of the seed sequence. Empty string (or null) if there is no seed sequence. |
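
Several of the positions above are given as 1-based closed intervals, which differ from Python's 0-based half-open slices. The following is a minimal sketch of the conversion; the helper names `to_slice` and `interval_length` and the example sequence are hypothetical and not part of the schema.

```python
def to_slice(start: int, end: int) -> slice:
    """Convert a 1-based closed interval [start, end] to a Python slice."""
    if start < 1 or end < start:
        raise ValueError(f"invalid 1-based closed interval: {start}..{end}")
    # 1-based closed [start, end] covers the same characters as
    # 0-based half-open [start - 1, end).
    return slice(start - 1, end)


def interval_length(start: int, end: int) -> int:
    """Number of positions covered by a 1-based closed interval."""
    region = to_slice(start, end)
    return region.stop - region.start


# Made-up alignment string, used only to illustrate the arithmetic.
alignment = "ATGCATGCATGC"
region = alignment[to_slice(4, 9)]
```

Because both ends are inclusive, the interval length is `end - start + 1`, so a junction whose start and end positions are 4 and 9 spans six nucleotides.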
Tree Fields
| Name | Type | Attributes | Definition |
|---|---|---|---|
| | string | required, nullable | Identifier for the tree. |
| | string | required, nullable | Identifier for the clone. |
| | string | required, nullable | Newick string of the tree edges. |
| | object | optional, nullable | Dictionary of nodes in the tree, keyed by sequence_id string. |
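
Since the node dictionary is keyed by the same identifiers that appear in the Newick string, a consumer may want to check that the two agree. The sketch below is one way to do that under simplifying assumptions: labels are unquoted and contain no whitespace, and the dictionary keys `newick` and `nodes` are placeholder names chosen for this example. It is not a full Newick parser.

```python
import re

# A label is a run of characters that follows the start of the string,
# "(", "," or ")", and contains no Newick punctuation. Branch lengths
# follow ":" and are therefore never captured.
_LABEL = re.compile(r"(?:^|[(),])\s*([^\s(),:;]+)")


def newick_labels(newick: str) -> list[str]:
    """Return node labels in the order they appear in the Newick string."""
    return _LABEL.findall(newick)


def missing_nodes(tree: dict) -> set[str]:
    """Labels used in the Newick string that have no entry in the node dictionary."""
    nodes = tree.get("nodes") or {}  # the node dictionary is optional and nullable
    return set(newick_labels(tree["newick"])) - set(nodes)


# Made-up tree for illustration only.
example_tree = {
    "newick": "((a:1,b:2)n1:0.5,c:3)root;",
    "nodes": {"a": {}, "b": {}, "c": {}, "n1": {}},
}
unmatched = missing_nodes(example_tree)
```

Here `unmatched` contains only `root`, the one label in the Newick string without a corresponding node entry.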
Node Fields
| Name | Type | Attributes | Definition |
|---|---|---|---|
| | string | required, nullable | Identifier for this node that matches the identifier in the Newick string and, where possible, the sequence_id in the source repertoire. |
| | string | optional, nullable | Nucleotide sequence of the node, aligned to the germline_alignment for this clone, including any indel corrections or spacers. |
| | string | optional, nullable | Junction region nucleotide sequence for the node, where the junction is defined as the CDR3 plus the two flanking conserved codons. |
| | string | optional, nullable | Amino acid translation of the junction. |
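
Because a node's aligned sequence spans the same region as the clone's germline_alignment, the two strings can be compared position by position. The sketch below counts mismatches between them. It assumes both strings have equal length and treats `.`, `-`, and `N` as uninformative spacer or gap characters; that character set and the function name `count_mismatches` are assumptions made for this example, not something the schema defines.

```python
# Characters treated as spacers, gaps, or unknown bases (an assumption).
UNINFORMATIVE = {".", "-", "N"}


def count_mismatches(node_alignment: str, germline_alignment: str) -> int:
    """Count positions where the node differs from the germline.

    Positions where either string holds an uninformative character are skipped.
    """
    if len(node_alignment) != len(germline_alignment):
        raise ValueError("aligned sequences must have the same length")
    mismatches = 0
    for node_base, germline_base in zip(node_alignment.upper(), germline_alignment.upper()):
        if node_base in UNINFORMATIVE or germline_base in UNINFORMATIVE:
            continue
        if node_base != germline_base:
            mismatches += 1
    return mismatches


# Made-up aligned sequences for illustration only.
n_mismatches = count_mismatches("ATC...CTA", "ATG...CCA")
```

In this made-up pair the node differs from the germline at two informative positions, and the three spacer positions are ignored.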