cassiopeia.pp.error_correct_intbcs_to_whitelist

cassiopeia.pp.error_correct_intbcs_to_whitelist(input_df, whitelist, intbc_dist_thresh=1)

Corrects all intBCs to the provided whitelist.

The whitelist can be supplied either as a list of intBCs or as the path to a plaintext file containing them.

Parameters:
input_df DataFrame

Input DataFrame of alignments.

whitelist Union[str, List[str]]

May be either a single path to a plaintext file containing the barcode whitelist, one barcode per line, or a list of barcodes.

intbc_dist_thresh int (default: 1)

The maximum Levenshtein distance allowed between an observed intBC sequence and a whitelisted intBC for the observed sequence to be corrected to that whitelisted intBC.
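The two accepted forms of `whitelist` can be illustrated with a small standalone sketch. The helper name `load_whitelist` is hypothetical and not part of Cassiopeia, and skipping blank lines is an assumption of this sketch rather than documented behavior.

```python
from pathlib import Path
from typing import List, Union


def load_whitelist(whitelist: Union[str, List[str]]) -> List[str]:
    """Hypothetical helper: normalize either whitelist form to a list of barcodes."""
    if isinstance(whitelist, str):
        # A string is treated as a path to a plaintext file, one barcode per line.
        lines = Path(whitelist).read_text().splitlines()
        # Assumption of this sketch: surrounding whitespace and blank lines are ignored.
        return [line.strip() for line in lines if line.strip()]
    # Anything else is treated as an in-memory collection of barcodes.
    return list(whitelist)
```

Both `load_whitelist("whitelist.txt")` and `load_whitelist(["ACGTACGT", "TTGGCCAA"])` would then yield a plain list of barcode strings.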

Return type:

DataFrame

Returns:

A DataFrame of error-corrected intBCs.
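To make the role of `intbc_dist_thresh` concrete, the following is a minimal pure-Python sketch of threshold-based correction against a whitelist. It is not Cassiopeia's implementation: the helper names `levenshtein` and `correct_intbc` are hypothetical, and two behaviors are assumptions of this sketch only, namely that a sequence is left uncorrected (`None`) when several whitelist entries are equally close and when no entry lies within the threshold.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting substitutions, insertions, and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def correct_intbc(intbc: str, whitelist: List[str], thresh: int = 1) -> Optional[str]:
    """Hypothetical helper: map `intbc` to a whitelisted barcode, or None."""
    if not whitelist:
        return None
    if intbc in whitelist:
        return intbc  # exact matches need no correction
    distances = {wl: levenshtein(intbc, wl) for wl in whitelist}
    best = min(distances.values())
    closest = [wl for wl, dist in distances.items() if dist == best]
    # Assumption: correct only when exactly one entry is closest and within thresh.
    if best <= thresh and len(closest) == 1:
        return closest[0]
    return None


whitelist = ["ACGTACGT", "TTGGCCAA"]
observed = ["ACGTACGT", "ACGTACGA", "GGGGGGGG"]
corrected = {seq: correct_intbc(seq, whitelist, thresh=1) for seq in observed}
```

Here `ACGTACGA` differs from `ACGTACGT` by a single substitution, so it is mapped to that whitelist entry, while `GGGGGGGG` is more than one edit away from every entry and stays unmatched. With Cassiopeia installed, the corresponding call on a DataFrame of alignments follows the documented signature: `cassiopeia.pp.error_correct_intbcs_to_whitelist(input_df, whitelist, intbc_dist_thresh=1)`.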