API Reference

Rearrangement Interface

airr.read_rearrangement(filename, validate=False, debug=False)

Open an iterator to read an AIRR rearrangements file

Parameters
  • filename (str) – path to the input file.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

iterable reader class.

Return type

airr.io.RearrangementReader
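Because the reader is an iterable that yields one dictionary per record, it can be passed to any function that consumes an iterable of dicts. The sketch below tallies the values of a single field. The helper name count_values, the file name in the comment, and the sample rows are illustrative assumptions, not part of the library.

```python
from collections import Counter


def count_values(rows, field):
    """Tally the values of one field across an iterable of row dicts."""
    counts = Counter()
    for row in rows:
        # Rows that lack the field are tallied under None.
        counts[row.get(field)] += 1
    return counts


# Hypothetical usage with the reader (needs the airr package and a real file):
#   import airr
#   reader = airr.read_rearrangement("rearrangements.tsv")
#   print(count_values(reader, "productive"))

# Stand-in rows so the sketch runs without an input file.
sample_rows = [
    {"sequence_id": "seq1", "productive": True},
    {"sequence_id": "seq2", "productive": False},
    {"sequence_id": "seq3", "productive": True},
]
productive_counts = count_values(sample_rows, "productive")
```

Keeping the helper independent of the file format means the same code can be exercised with plain lists of dicts before it is pointed at a real file.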

airr.create_rearrangement(filename, fields=None, debug=False)

Create an empty AIRR rearrangements file writer

Parameters
  • filename (str) – output file path.

  • fields (list) – additional non-required fields to add to the output.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

open writer class.

Return type

airr.io.RearrangementWriter

airr.derive_rearrangement(out_filename, in_filename, fields=None, debug=False)

Create an empty AIRR rearrangements file with fields derived from an existing file

Parameters
  • out_filename (str) – output file path.

  • in_filename (str) – existing file to derive fields from.

  • fields (list) – additional non-required fields to add to the output.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

open writer class.

Return type

airr.io.RearrangementWriter
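A common pattern is to derive a writer from an existing file, then copy each record across while adding a new field. The following is a minimal sketch of that loop. It relies only on the reader being iterable and on the writer exposing write(row); the helper copy_with_field, the ListWriter stand-in, the file names, and the field name junction_aa_len are hypothetical.

```python
def copy_with_field(reader, writer, field, compute):
    """Copy every row from reader to writer, adding one computed field."""
    written = 0
    for row in reader:
        new_row = dict(row)            # copy so the source row is not mutated
        new_row[field] = compute(row)  # value for the additional field
        writer.write(new_row)
        written += 1
    return written


class ListWriter:
    """In-memory stand-in for a writer: collects rows instead of writing a file."""

    def __init__(self):
        self.rows = []

    def write(self, row):
        self.rows.append(row)


# Hypothetical usage with the library (needs the airr package and real files):
#   import airr
#   reader = airr.read_rearrangement("in.tsv")
#   writer = airr.derive_rearrangement("out.tsv", "in.tsv", fields=["junction_aa_len"])
#   copy_with_field(reader, writer, "junction_aa_len", lambda r: len(r["junction_aa"]))
#   writer.close()

rows = [{"sequence_id": "seq1", "junction_aa": "CARDYW"}]
sink = ListWriter()
n_written = copy_with_field(rows, sink, "junction_aa_len",
                            lambda r: len(r["junction_aa"]))
```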

airr.load_rearrangement(filename, validate=False, debug=False)

Load the contents of an AIRR rearrangements file into a data frame

Parameters
  • filename (str) – input file path.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

Rearrangement records as rows of a data frame.

Return type

pandas.DataFrame

airr.dump_rearrangement(dataframe, filename, debug=False)

Write the contents of a data frame to an AIRR rearrangements file

Parameters
  • dataframe (pandas.DataFrame) – data frame of rearrangement data.

  • filename (str) – output file path.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

True if the file is written without error.

Return type

bool

airr.merge_rearrangement(out_filename, in_filenames, drop=False, debug=False)

Merge one or more AIRR rearrangements files

Parameters
  • out_filename (str) – output file path.

  • in_filenames (list) – list of input files to merge.

  • drop (bool) – drop flag. If True then drop fields that do not exist in all input files, otherwise combine fields from all input files.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

True if files were successfully merged, otherwise False.

Return type

bool
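The effect of the drop flag on the output columns can be illustrated with plain lists of header names. This is a sketch of the described semantics only, not the library's implementation; the function name merged_fields and the ordering rule (first appearance wins) are assumptions.

```python
def merged_fields(headers, drop=False):
    """Compute output field names from a list of input headers.

    drop=True  keeps only fields present in every header (intersection).
    drop=False keeps every field seen in any header (union).
    """
    if not headers:
        return []
    if drop:
        common = set(headers[0]).intersection(*headers[1:])
        # Preserve the order of the first header.
        return [name for name in headers[0] if name in common]
    combined = []
    for header in headers:
        for name in header:
            if name not in combined:
                combined.append(name)
    return combined


file_a = ["sequence_id", "v_call", "clone_id"]
file_b = ["sequence_id", "v_call", "cell_id"]
shared = merged_fields([file_a, file_b], drop=True)
combined = merged_fields([file_a, file_b], drop=False)
```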

airr.validate_rearrangement(filename, debug=False)

Validate an AIRR rearrangements file

Parameters
  • filename (str) – path of the file to validate.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

True if the file passed validation, otherwise False.

Return type

bool

Repertoire Interface

airr.load_repertoire(filename, validate=False, debug=False)

Load an AIRR repertoire metadata file

Parameters
  • filename (str) – path to the input file.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

list of Repertoire dictionaries.

Return type

list
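Once loaded, repertoire metadata is often looked up by identifier. The sketch below builds such an index. It assumes the return value is a list of dictionaries as described above and that each dictionary carries a repertoire_id key; the helper name index_by_id and the sample data are hypothetical.

```python
def index_by_id(repertoires, key="repertoire_id"):
    """Build a dict mapping each repertoire's identifier to its record."""
    index = {}
    for rep in repertoires:
        rep_id = rep[key]
        if rep_id in index:
            # Duplicate identifiers would silently overwrite, so fail loudly.
            raise ValueError("duplicate %s: %r" % (key, rep_id))
        index[rep_id] = rep
    return index


# Hypothetical usage with the library (needs the airr package and a real file):
#   import airr
#   repertoires = airr.load_repertoire("repertoires.yaml")
#   by_id = index_by_id(repertoires)

# Stand-in data so the sketch runs without an input file.
sample = [
    {"repertoire_id": "rep-1", "study": {"study_id": "S1"}},
    {"repertoire_id": "rep-2", "study": {"study_id": "S1"}},
]
by_id = index_by_id(sample)
```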

airr.write_repertoire(filename, repertoires, info=None, debug=False)

Write an AIRR repertoire metadata file

Parameters
  • filename (str) – path to the output file.

  • repertoires (list) – array of repertoire objects.

  • info (object) – info object to write. Will write current AIRR Schema info if not specified.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

True if the file is written without error.

Return type

bool

airr.validate_repertoire(filename, debug=False)

Validate an AIRR repertoire metadata file

Parameters
  • filename (str) – path of the file to validate.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns

True if the file passed validation, otherwise False.

Return type

bool

airr.repertoire_template()

Return a blank repertoire object from the template. This object has the complete structure with all of the fields and all values set to None or empty string.

Returns

empty repertoire object.

Return type

object

Classes

class airr.io.RearrangementReader(handle, base=1, validate=False, debug=False)

Iterator for reading Rearrangement objects in TSV format

fields

field names in the input Rearrangement file.

Type

list

external_fields

list of fields in the input file that are not part of the Rearrangement definition.

Type

list

__init__(handle, base=1, validate=False, debug=False)

Initialization

Parameters
  • handle (file) – file handle of the open Rearrangement file.

  • base (int) – one of 0 or 1 specifying the coordinate schema in the input file. If 1, then the file is assumed to contain 1-based closed intervals that will be converted to Python-style 0-based half-open intervals for known fields. If 0, then values will be unchanged.

  • validate (bool) – perform validation. If True then basic validation will be performed while reading the data. A ValidationError exception will be raised if an error is found.

  • debug (bool) – debug state. If True prints debug information.
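The arithmetic behind the base parameter is small: a 1-based closed interval becomes a 0-based half-open interval by subtracting 1 from the start and leaving the end unchanged. The sketch below illustrates only that arithmetic; the function names and the example sequence are hypothetical, and which fields the library treats as coordinates is not shown here.

```python
def to_zero_based(start, end):
    """Convert a 1-based closed interval to a 0-based half-open interval."""
    return start - 1, end


def to_one_based(start, end):
    """Convert a 0-based half-open interval to a 1-based closed interval."""
    return start + 1, end


sequence = "ACGTACGTAC"
# Bases 3 through 6 inclusive, counted from 1.
start, end = to_zero_based(3, 6)
# The converted pair can be used directly as a Python slice.
segment = sequence[start:end]
```

With half-open coordinates the segment length is simply end - start, which is why the converted values are convenient for slicing.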

Returns

reader object.

Return type

airr.io.RearrangementReader

__iter__()

Iterator initializer

Returns

airr.io.RearrangementReader

__next__()

Next method

Returns

parsed Rearrangement data.

Return type

dict

close()

Closes the Rearrangement file

next()

Next method

class airr.io.RearrangementWriter(handle, fields=None, base=1, debug=False)

Writer class for Rearrangement objects in TSV format

fields

field names in the output Rearrangement file.

Type

list

external_fields

list of fields in the output file that are not part of the Rearrangement definition.

Type

list

__init__(handle, fields=None, base=1, debug=False)

Initialization

Parameters
  • handle (file) – file handle of the open Rearrangements file.

  • fields (list) – list of non-required fields to add. May include fields undefined by the schema.

  • base (int) – one of 0 or 1 specifying the coordinate schema in the output file. Data provided to the writer is assumed to be in Python-style 0-based half-open intervals. If 1, then data will be converted to 1-based closed intervals for known fields before writing. If 0, then values will be unchanged.

  • debug (bool) – debug state. If True prints debug information.

Returns

writer object.

Return type

airr.io.RearrangementWriter

close()

Closes the Rearrangement file

write(row)

Write a row to the Rearrangement file

Parameters

row (dict) – row to write.

class airr.schema.Schema(definition)

AIRR schema definitions

properties

field definitions.

Type

collections.OrderedDict

info

schema info.

Type

collections.OrderedDict

required

list of mandatory fields.

Type

list

optional

list of non-required fields.

Type

list

false_values

accepted string values for False.

Type

list

true_values

accepted values for True.

Type

list

from_bool(value, validate=False)

Converts a boolean to a string

Parameters
  • value (bool) – logical value.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns

conversion of True or False to ‘T’ or ‘F’.

Return type

str

Raises

airr.ValidationError – raised if value is invalid when validate is set True.

pandas_types()

Returns

mapping dictionary for pandas types

Return type

dict

spec(field)

Get the properties for a field

Parameters

field (str) – field name.

Returns

definition for the field.

Return type

collections.OrderedDict

to_bool(value, validate=False)

Convert a string to a boolean

Parameters
  • value (str) – logical value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns

conversion of the string to True or False.

Return type

bool

Raises

airr.ValidationError – raised if value is invalid when validate is set True.
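The validate flag follows the same pattern across the conversion methods: an invalid value either raises or becomes None. The following sketch mimics that behavior for booleans. It is not the library's code; the ValidationError class is a local stand-in for airr.ValidationError, and the accepted strings are hypothetical placeholders for the schema's true_values and false_values.

```python
class ValidationError(Exception):
    """Local stand-in for airr.ValidationError."""


# Hypothetical accepted strings; the real lists live in the schema's
# true_values and false_values attributes.
TRUE_STRINGS = {"T", "True", "true"}
FALSE_STRINGS = {"F", "False", "false"}


def to_bool_sketch(value, validate=False):
    """Convert a string to a bool, raising or returning None when invalid."""
    if value in TRUE_STRINGS:
        return True
    if value in FALSE_STRINGS:
        return False
    if validate:
        raise ValidationError("invalid bool value: %r" % (value,))
    return None


lenient = to_bool_sketch("maybe")  # invalid value, validate=False
```

Returning None in the lenient case lets a caller load imperfect data and handle missing values later, while validate=True stops at the first bad value.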

to_float(value, validate=False)

Converts a string to a float

Parameters
  • value (str) – float value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns

conversion of the string to a float.

Return type

float

Raises

airr.ValidationError – raised if value is invalid when validate is set True.

to_int(value, validate=False)

Converts a string to an integer

Parameters
  • value (str) – integer value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns

conversion of the string to an integer.

Return type

int

Raises

airr.ValidationError – raised if value is invalid when validate is set True.

type(field)

Get the type for a field

Parameters

field (str) – field name.

Returns

the type definition for the field

Return type

str

validate_header(header)

Validate header against the schema

Parameters

header (list) – list of header fields.

Returns

True if a ValidationError exception is not raised.

Return type

bool

Raises

airr.ValidationError – raised if header fails validation.
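One way to picture header validation is as a check that every mandatory field appears in the header. The sketch below implements only that check, under the assumption that this is the core of the validation; the exact rules applied by validate_header are not spelled out here, and the function name and the example required list are hypothetical.

```python
def missing_required(header, required):
    """Return the required field names that are absent from the header."""
    present = set(header)
    # Keep the order of the required list so messages are stable.
    return [name for name in required if name not in present]


# Hypothetical required list; the real one is the schema's required attribute.
required_fields = ["sequence_id", "sequence", "productive"]
header = ["sequence_id", "sequence", "v_call"]
missing = missing_required(header, required_fields)
```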

validate_object(obj, missing=True, nonairr=True, context=None)

Validate Repertoire object data against schema

Parameters
  • obj (dict) – dictionary containing a single repertoire object.

  • missing (bool) – provides warnings for missing optional fields.

  • nonairr (bool) – provides warnings for non-AIRR fields that cannot be validated.

  • context (str) – used by recursion to indicate place in object hierarchy.

Returns

True if a ValidationError exception is not raised.

Return type

bool

Raises

airr.ValidationError – raised if object fails validation.

validate_row(row)

Validate Rearrangements row data against schema

Parameters

row (dict) – dictionary containing a single record.

Returns

True if a ValidationError exception is not raised.

Return type

bool

Raises

airr.ValidationError – raised if row fails validation.

Schema

airr.schema.AlignmentSchema

Schema object for the Alignment definition

AIRR schema definitions

airr.schema.properties

field definitions.

Type

collections.OrderedDict

airr.schema.info

schema info.

Type

collections.OrderedDict

airr.schema.required

list of mandatory fields.

Type

list

airr.schema.optional

list of non-required fields.

Type

list

airr.schema.false_values

accepted string values for False.

Type

list

airr.schema.true_values

accepted values for True.

Type

list

airr.schema.RearrangementSchema

Schema object for the Rearrangement definition

AIRR schema definitions

airr.schema.properties

field definitions.

Type

collections.OrderedDict

airr.schema.info

schema info.

Type

collections.OrderedDict

airr.schema.required

list of mandatory fields.

Type

list

airr.schema.optional

list of non-required fields.

Type

list

airr.schema.false_values

accepted string values for False.

Type

list

airr.schema.true_values

accepted values for True.

Type

list

airr.schema.RepertoireSchema

Schema object for the Repertoire definition

AIRR schema definitions

airr.schema.properties

field definitions.

Type

collections.OrderedDict

airr.schema.info

schema info.

Type

collections.OrderedDict

airr.schema.required

list of mandatory fields.

Type

list

airr.schema.optional

list of non-required fields.

Type

list

airr.schema.false_values

accepted string values for False.

Type

list

airr.schema.true_values

accepted values for True.

Type

list