API Reference

Rearrangement Interface#

airr.read_rearrangement(filename, validate=False, debug=False)#

Open an iterator to read an AIRR rearrangements file

Parameters:
  • filename (str) – path to the input file.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

iterable reader class.

Return type:

airr.io.RearrangementReader
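
As a rough illustration of consuming the reader: according to the `__next__` entry of `airr.io.RearrangementReader`, each iteration yields one record as a dict, so a function that accepts any iterable of dicts can take the reader directly. The sketch below assumes the `productive` field arrives as a Python boolean (or None); the helper name `count_productive`, the file name in the comment, and the sample records are invented for the example.

```python
def count_productive(rows):
    """Count records whose 'productive' field is exactly True."""
    return sum(1 for row in rows if row.get("productive") is True)


# With the library installed, the reader could be passed in directly
# (the file name is hypothetical):
#   import airr
#   n = count_productive(airr.read_rearrangement("rearrangements.tsv"))

# Stand-in records so the sketch runs without an input file.
sample = [
    {"sequence_id": "seq1", "productive": True},
    {"sequence_id": "seq2", "productive": False},
    {"sequence_id": "seq3", "productive": None},
]
print(count_productive(sample))
```

Because the reader is consumed lazily, a count like this does not need the whole file in memory.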

airr.create_rearrangement(filename, fields=None, debug=False)#

Create an empty AIRR rearrangements file writer

Parameters:
  • filename (str) – output file path.

  • fields (list) – additional non-required fields to add to the output.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

open writer class.

Return type:

airr.io.RearrangementWriter

airr.derive_rearrangement(out_filename, in_filename, fields=None, debug=False)#

Create an empty AIRR rearrangements file with fields derived from an existing file

Parameters:
  • out_filename (str) – output file path.

  • in_filename (str) – existing file to derive fields from.

  • fields (list) – additional non-required fields to add to the output.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

open writer class.

Return type:

airr.io.RearrangementWriter
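
One plausible use of a derived writer is to copy records from an existing file while adding a computed column. The sketch below only relies on the documented `write(row)` and `close()` methods of the writer; the helper `annotate_rows`, the stand-in `ListWriter` class, the file names, and the `junction_nt_count` field are all hypothetical.

```python
def annotate_rows(reader, writer, field, func):
    """Copy every row from reader to writer, adding one computed field."""
    count = 0
    for row in reader:
        row[field] = func(row)
        writer.write(row)
        count += 1
    writer.close()
    return count


# With the library installed, the pieces might be wired together like this
# (file names and the extra field are hypothetical):
#   import airr
#   reader = airr.read_rearrangement("in.tsv")
#   writer = airr.derive_rearrangement("out.tsv", "in.tsv",
#                                      fields=["junction_nt_count"])
#   annotate_rows(reader, writer, "junction_nt_count",
#                 lambda r: len(r["junction"]))


class ListWriter:
    """Stand-in exposing write() and close(), like RearrangementWriter."""

    def __init__(self):
        self.rows = []
        self.closed = False

    def write(self, row):
        self.rows.append(dict(row))

    def close(self):
        self.closed = True


rows = [{"sequence_id": "seq1", "junction": "TGTGCG"}]
out = ListWriter()
copied = annotate_rows(rows, out, "junction_nt_count",
                       lambda r: len(r["junction"]))
print(copied, out.rows[0]["junction_nt_count"])
```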

airr.load_rearrangement(filename, validate=False, debug=False)#

Load the contents of an AIRR rearrangements file into a data frame

Parameters:
  • filename (str) – input file path.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

Rearrangement records as rows of a data frame.

Return type:

pandas.DataFrame

airr.dump_rearrangement(dataframe, filename, debug=False)#

Write the contents of a data frame to an AIRR rearrangements file

Parameters:
  • dataframe (pandas.DataFrame) – data frame of rearrangement data.

  • filename (str) – output file path.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

True if the file is written without error.

Return type:

bool

airr.merge_rearrangement(out_filename, in_filenames, drop=False, debug=False)#

Merge one or more AIRR rearrangements files

Parameters:
  • out_filename (str) – output file path.

  • in_filenames (list) – list of input files to merge.

  • drop (bool) – drop flag. If True then drop fields that do not exist in all input files, otherwise combine fields from all input files.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

True if files were successfully merged, otherwise False.

Return type:

bool
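
The effect of the `drop` flag on the merged header can be pictured as an intersection versus a union of the input field lists. The following is a minimal sketch of that idea only, not the library's implementation; the function name `merged_fields` and the ordering rule (first appearance wins) are assumptions.

```python
def merged_fields(headers, drop=False):
    """Combine several header lists into one list of output fields.

    drop=True  keeps only fields present in every header (intersection).
    drop=False keeps every field seen in any header (union).
    Fields are ordered by first appearance, which is an assumption.
    """
    if not headers:
        return []
    if drop:
        common = set(headers[0]).intersection(*headers[1:])
        return [f for f in headers[0] if f in common]
    merged = []
    for header in headers:
        for f in header:
            if f not in merged:
                merged.append(f)
    return merged


file_a = ["sequence_id", "sequence", "v_call"]
file_b = ["sequence_id", "v_call", "clone_id"]
print(merged_fields([file_a, file_b], drop=True))
print(merged_fields([file_a, file_b], drop=False))
```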

airr.validate_rearrangement(filename, debug=False)#

Validates an AIRR rearrangements file

Parameters:
  • filename (str) – path of the file to validate.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

True if the file passed validation, otherwise False.

Return type:

bool

AIRR Data Model Interface#

airr.read_airr(filename, format=None, validate=False, model=True, debug=False, check_nullable=True)#

Load an AIRR Data file

Parameters:
  • filename (str) – path to the input file.

  • format (str) – input file format. Valid strings are “yaml” or “json”. If set to None, the file format will be automatically detected from the file extension.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of a validation failure.

  • model (bool) – If True only validate objects defined in the AIRR DataFile schema. If False, attempt validation of all top-level objects. Ignored if validate=False.

  • debug (bool) – debug flag. If True print debugging information to standard error.

  • check_nullable (bool) – whether to check for nullable fields if validating the data.

Returns:

dictionary of AIRR Data objects.

Return type:

dict
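
The `format=None` behaviour described above amounts to a lookup on the file extension. A minimal sketch of that logic follows; the extension table (including `.yml`), the `ValueError` for unknown extensions, and the name `detect_format` are assumptions, since the reference only names the “yaml” and “json” formats.

```python
import os

# Assumed extension table; only "yaml" and "json" are named in the reference.
EXTENSIONS = {".yaml": "yaml", ".yml": "yaml", ".json": "json"}


def detect_format(filename, format=None):
    """Return the explicit format if given, else infer it from the extension."""
    if format is not None:
        return format
    ext = os.path.splitext(filename)[1].lower()
    if ext not in EXTENSIONS:
        raise ValueError("cannot infer file format from %r" % filename)
    return EXTENSIONS[ext]


print(detect_format("repertoires.airr.json"))
print(detect_format("metadata.YML"))
print(detect_format("data.txt", format="yaml"))
```

Passing `format` explicitly sidesteps the lookup, which is useful when a file carries a non-standard extension.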

airr.write_airr(filename, data, format=None, info=None, validate=False, model=True, debug=False, check_nullable=True)#

Write an AIRR Data file

Parameters:
  • filename (str) – path to the output file.

  • data (dict) – dictionary of AIRR Data Model objects.

  • format (str) – output file format. Valid strings are “yaml” or “json”. If set to None, the file format will be automatically detected from the file extension.

  • info (object) – info object to write. Will write current AIRR Schema info if not specified.

  • validate (bool) – whether to validate data before it is written, raising a ValidationError exception in the event of a validation failure.

  • model (bool) – If True only validate and write objects defined in the AIRR DataFile schema. If False, attempt validation and write of all top-level objects.

  • debug (bool) – debug flag. If True print debugging information to standard error.

  • check_nullable (bool) – whether to check for nullable fields if validating the data.

Returns:

True if the file is written without error.

Return type:

bool

airr.validate_airr(data, model=True, debug=False, check_nullable=True)#

Validates a dictionary of AIRR Data Model objects

Parameters:
  • data (dict) – dictionary containing AIRR Data Model objects.

  • model (bool) – If True only validate objects defined in the AIRR DataFile schema. If False, attempt validation of all top-level objects.

  • debug (bool) – debug flag. If True print debugging information to standard error.

  • check_nullable (bool) – whether to check for nullable fields if validating the data.

Returns:

True if the data passed validation, otherwise False.

Return type:

bool

Classes#

class airr.io.RearrangementReader(handle, base=1, validate=False, debug=False)#

Iterator for reading Rearrangement objects in TSV format

fields#

field names in the input Rearrangement file.

Type:

list

external_fields#

list of fields in the input file that are not part of the Rearrangement definition.

Type:

list

__init__(handle, base=1, validate=False, debug=False)#

Initialization

Parameters:
  • handle (file) – file handle of the open Rearrangement file.

  • base (int) – one of 0 or 1 specifying the coordinate scheme in the input file. If 1, then the file is assumed to contain 1-based closed intervals that will be converted to Python-style 0-based half-open intervals for known fields. If 0, then values will be unchanged.

  • validate (bool) – perform validation. If True then basic validation will be performed while reading the data. A ValidationError exception will be raised if an error is found.

  • debug (bool) – debug state. If True prints debug information.

Returns:

reader object.

Return type:

airr.io.RearrangementReader
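
The `base` conversion is simple arithmetic: moving from a 1-based closed interval to a 0-based half-open one subtracts 1 from the start and leaves the end unchanged, and the reverse adds 1 back. The two helper names below are hypothetical and the sketch shows only the arithmetic, not which fields the reader treats as coordinates.

```python
def to_python_interval(start, end):
    """Convert a 1-based closed interval to a 0-based half-open interval."""
    return start - 1, end


def to_one_based_interval(start, end):
    """Convert a 0-based half-open interval to a 1-based closed interval."""
    return start + 1, end


sequence = "ACGTACGT"
# Positions 3 through 5 (1-based, inclusive) cover the bases G, T, A.
start, end = to_python_interval(3, 5)
print(start, end)            # 0-based half-open bounds: 2 5
print(sequence[start:end])   # GTA
print(to_one_based_interval(start, end))
```

In both conventions the interval length is easy to recover: `end - start` for half-open coordinates and `end - start + 1` for closed ones.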

__iter__()#

Iterator initializer

Returns:

airr.io.RearrangementReader

__next__()#

Next method

Returns:

parsed Rearrangement data.

Return type:

dict

close()#

Closes the Rearrangement file

next()#

Next method

class airr.io.RearrangementWriter(handle, fields=None, base=1, debug=False)#

Writer class for Rearrangement objects in TSV format

fields#

field names in the output Rearrangement file.

Type:

list

external_fields#

list of fields in the output file that are not part of the Rearrangement definition.

Type:

list

__init__(handle, fields=None, base=1, debug=False)#

Initialization

Parameters:
  • handle (file) – file handle of the open Rearrangements file.

  • fields (list) – list of non-required fields to add. May include fields undefined by the schema.

  • base (int) – one of 0 or 1 specifying the coordinate scheme in the output file. Data provided to the writer is assumed to be in Python-style 0-based half-open intervals. If 1, then data will be converted to 1-based closed intervals for known fields before writing. If 0, then values will be unchanged.

  • debug (bool) – debug state. If True prints debug information.

Returns:

writer object.

Return type:

airr.io.RearrangementWriter

close()#

Closes the Rearrangement file

write(row)#

Write a row to the Rearrangement file

Parameters:

row (dict) – row to write.

class airr.schema.Schema(definition)#

AIRR schema definitions

definition#

name of the schema definition.

info#

schema info.

Type:

collections.OrderedDict

properties#

field definitions.

Type:

collections.OrderedDict

required#

list of mandatory fields.

Type:

list

optional#

list of non-required fields.

Type:

list

false_values#

accepted string values for False.

Type:

list

true_values#

accepted values for True.

Type:

list

from_bool(value, validate=False)#

Converts a boolean to a string

Parameters:
  • value (bool) – logical value.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns:

conversion of True or False to ‘T’ or ‘F’.

Return type:

str

Raises:

airr.ValidationError – raised if value is invalid when validate is set True.

pandas_types()#

Map of schema types to pandas types

Returns:

mapping dictionary for pandas types

Return type:

dict

spec(field)#

Get the properties for a field

Parameters:

field (str) – field name.

Returns:

definition for the field.

Return type:

collections.OrderedDict

template()#

Create an empty template object

Returns:

dictionary with all schema properties set as None or an empty list.

Return type:

collections.OrderedDict
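
To make the shape of a template concrete, here is a flat sketch that builds one from a mapping of field definitions, using an empty list for array-typed fields and None for everything else. It assumes each definition carries a `type` key and ignores nested objects; the function name `make_template` and the `keywords` field are hypothetical.

```python
from collections import OrderedDict


def make_template(properties):
    """Return an ordered dict with every field set to None or an empty list."""
    template = OrderedDict()
    for name, spec in properties.items():
        # Assumption: array-typed fields are the ones that start as [].
        template[name] = [] if spec.get("type") == "array" else None
    return template


properties = OrderedDict([
    ("sequence_id", {"type": "string"}),
    ("productive", {"type": "boolean"}),
    ("keywords", {"type": "array"}),  # hypothetical array-typed field
])
blank = make_template(properties)
print(dict(blank))
```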

to_bool(value, validate=False)#

Convert a string to a boolean

Parameters:
  • value (str) – logical value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns:

conversion of the string to True or False.

Return type:

bool

Raises:

airr.ValidationError – raised if value is invalid when validate is set True.
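
The `validate` flag shared by the conversion methods follows one pattern: an unrecognized value either raises or silently becomes None. A standalone sketch of that pattern for booleans is shown below; the accepted strings are illustrative rather than the library's actual `true_values` and `false_values`, and the local `ValidationError` class is a stand-in for `airr.ValidationError`.

```python
class ValidationError(ValueError):
    """Stand-in for airr.ValidationError so the sketch runs on its own."""


# Hypothetical accepted strings; the real lists live on the Schema object.
TRUE_VALUES = ("T", "True", "true")
FALSE_VALUES = ("F", "False", "false")


def to_bool(value, validate=False):
    """Convert a string to a boolean, or None when it is not recognized."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    if validate:
        raise ValidationError("invalid boolean value: %r" % (value,))
    return None


print(to_bool("T"), to_bool("F"), to_bool("maybe"))
```

Returning None in the lenient mode lets a caller read a whole file and inspect the bad values afterwards, while `validate=True` stops at the first problem.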

to_float(value, validate=False)#

Converts a string to a float

Parameters:
  • value (str) – float value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns:

conversion of the string to a float.

Return type:

float

Raises:

airr.ValidationError – raised if value is invalid when validate is set True.

to_int(value, validate=False)#

Converts a string to an integer

Parameters:
  • value (str) – integer value as a string.

  • validate (bool) – when True raise a ValidationError for an invalid value. Otherwise, set invalid values to None.

Returns:

conversion of the string to an integer.

Return type:

int

Raises:

airr.ValidationError – raised if value is invalid when validate is set True.

type(field)#

Get the type for a field

Parameters:

field (str) – field name.

Returns:

the type definition for the field

Return type:

str

validate_header(header)#

Validate header against the schema

Parameters:

header (list) – list of header fields.

Returns:

True if a ValidationError exception is not raised.

Return type:

bool

Raises:

airr.ValidationError – raised if header fails validation.
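
The return-or-raise contract described here (True on success, an exception on failure) can be sketched as a check that every mandatory field appears in the header. That this is what the check consists of is an assumption, as are the function's second parameter and the stand-in exception class.

```python
class ValidationError(ValueError):
    """Stand-in for airr.ValidationError so the sketch runs on its own."""


def validate_header(header, required):
    """Return True if every required field is in the header, else raise."""
    missing = [field for field in required if field not in header]
    if missing:
        raise ValidationError("missing required fields: " + ", ".join(missing))
    return True


required = ["sequence_id", "sequence"]
print(validate_header(["sequence_id", "sequence", "v_call"], required))

try:
    validate_header(["sequence_id"], required)
except ValidationError as err:
    print(err)
```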

validate_object(obj, missing=True, nonairr=True, context=None, check_nullable=True)#

Validate Repertoire object data against schema

Parameters:
  • obj (dict) – dictionary containing a single repertoire object.

  • missing (bool) – provides warnings for missing optional fields.

  • nonairr (bool) – provides warnings for non-AIRR fields that cannot be validated.

  • context (str) – used by recursion to indicate place in object hierarchy.

  • check_nullable (bool) – check if data complies with the required fields as determined by the nullable flag.

Returns:

True if a ValidationError exception is not raised.

Return type:

bool

Raises:

airr.ValidationError – raised if object fails validation.

validate_row(row)#

Validate Rearrangements row data against schema

Parameters:

row (dict) – dictionary containing a single record.

Returns:

True if a ValidationError exception is not raised.

Return type:

bool

Raises:

airr.ValidationError – raised if row fails validation.

Schema#

airr.schema.InfoSchema#

Schema object for the Info definition

airr.schema.DataFileSchema#

Schema object for the DataFile definition

airr.schema.AlignmentSchema#

Schema object for the Alignment definition

airr.schema.RearrangementSchema#

Schema object for the Rearrangement definition

airr.schema.RepertoireSchema#

Schema object for the Repertoire definition

airr.schema.GermlineSetSchema#

Schema object for the GermlineSet definition

airr.schema.GenotypeSetSchema#

Schema object for the GenotypeSet definition

Deprecated#

airr.load_repertoire(filename, validate=False, debug=False)#

Load an AIRR repertoire metadata file

Parameters:
  • filename (str) – path to the input file.

  • validate (bool) – whether to validate data as it is read, raising a ValidationError exception in the event of an error.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

dictionary of AIRR Data objects.

Return type:

dict

Deprecated since version 1.4: Use read_airr() instead.

airr.write_repertoire(filename, repertoires, info=None, debug=False)#

Write an AIRR repertoire metadata file

Parameters:
  • filename (str) – path to the output file.

  • repertoires (list) – array of repertoire objects.

  • info (object) – info object to write. Will write current AIRR Schema info if not specified.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

True if the file is written without error.

Return type:

bool

Deprecated since version 1.4: Use write_airr() instead.

airr.validate_repertoire(filename, debug=False)#

Validates an AIRR repertoire metadata file

Parameters:
  • filename (str) – path of the file to validate.

  • debug (bool) – debug flag. If True print debugging information to standard error.

Returns:

True if the file passed validation, otherwise False.

Return type:

bool

Deprecated since version 1.4: Use validate_airr() instead.

airr.repertoire_template()#

Return a blank repertoire object from the template. This object has the complete structure with all of the fields and all values set to None or empty string.

Returns:

empty repertoire object.

Return type:

object

Deprecated since version 1.4: Use schema.Schema.template() instead.