cassiopeia.pp.resolve_umi_sequence

cassiopeia.pp.resolve_umi_sequence(molecule_table, output_directory, min_umi_per_cell=10, min_avg_reads_per_umi=2.0, plot=True)

Resolve a consensus sequence for each UMI.

This procedure filters UMIs and cell barcodes (cellBCs) on the basis of reads per UMI and UMIs per cell. Then, wherever a UMI has a set of conflicting sequences, it assigns the most abundant sequence to that UMI.
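The "most abundant sequence" step can be illustrated with a minimal pandas sketch. This is not the library's implementation: the helper `pick_most_abundant` is hypothetical, and the column names `cellBC`, `UMI`, `seq`, and `readCount` are assumptions made for the example.

```python
import pandas as pd

# Toy molecule table. Column names are assumptions made for this sketch.
molecule_table = pd.DataFrame(
    {
        "cellBC": ["A", "A", "A", "B"],
        "UMI": ["u1", "u1", "u2", "u1"],
        "seq": ["ACGT", "ACGA", "TTTT", "GGGG"],
        "readCount": [8, 2, 5, 3],
    }
)


def pick_most_abundant(table: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep the best-supported row per (cellBC, UMI)."""
    # Put the rows with the most reads first, then keep the first row
    # seen for each cellBC-UMI pair.
    ordered = table.sort_values("readCount", ascending=False, kind="stable")
    resolved = ordered.drop_duplicates(subset=["cellBC", "UMI"], keep="first")
    return resolved.sort_values(["cellBC", "UMI"]).reset_index(drop=True)


resolved = pick_most_abundant(molecule_table)
print(resolved)
```

In this toy table, cell `A` has two conflicting sequences for UMI `u1`; the sketch keeps the one supported by more reads.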

Parameters:

molecule_table DataFrame: Molecule table to resolve
output_directory str: Directory in which to store results
min_umi_per_cell int (default: 10): The minimum number of UMIs a cell must have to be retained during filtering
min_avg_reads_per_umi float (default: 2.0): The minimum coverage (i.e. average number of reads per UMI) a cell must have to be retained during filtering
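How the two thresholds interact can be sketched on a toy table. The helper `filter_cells` below is hypothetical and is not the library's code; the column names `cellBC`, `UMI`, and `readCount` are assumptions, as is the choice to treat both thresholds as inclusive (`>=`).

```python
import pandas as pd

# Toy molecule table. Column names are assumptions made for this sketch.
molecule_table = pd.DataFrame(
    {
        "cellBC": ["A", "A", "A", "B", "B", "C"],
        "UMI": ["u1", "u2", "u3", "u1", "u2", "u1"],
        "readCount": [6, 4, 5, 1, 1, 9],
    }
)


def filter_cells(table, min_umi_per_cell=10, min_avg_reads_per_umi=2.0):
    """Hypothetical helper: keep cells that pass both per-cell thresholds."""
    per_cell = table.groupby("cellBC").agg(
        n_umi=("UMI", "nunique"),
        n_reads=("readCount", "sum"),
    )
    per_cell["avg_reads_per_umi"] = per_cell["n_reads"] / per_cell["n_umi"]
    # Assumption: both thresholds are inclusive.
    passing = per_cell[
        (per_cell["n_umi"] >= min_umi_per_cell)
        & (per_cell["avg_reads_per_umi"] >= min_avg_reads_per_umi)
    ].index
    return table[table["cellBC"].isin(passing)].reset_index(drop=True)


# Lowered UMI threshold so that the toy data is not filtered out entirely.
filtered = filter_cells(
    molecule_table, min_umi_per_cell=2, min_avg_reads_per_umi=2.0
)
print(filtered)
```

Here cell `B` has enough UMIs but too few reads per UMI, and cell `C` has good coverage but too few UMIs, so only cell `A` survives.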

Return type:

DataFrame

Returns:

A molecule table in which each cellBC-UMI pair maps to a single sequence.
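One way to check this property on a returned table is to look for repeated cellBC-UMI pairs. The sketch below uses plain pandas; the helper `has_unique_pairs` is hypothetical, and the column names `cellBC`, `UMI`, and `seq` are assumptions.

```python
import pandas as pd


def has_unique_pairs(table: pd.DataFrame) -> bool:
    """Hypothetical check: True if no (cellBC, UMI) pair occurs twice."""
    return not table.duplicated(subset=["cellBC", "UMI"]).any()


# Toy tables. Column names are assumptions made for this sketch.
resolved = pd.DataFrame(
    {
        "cellBC": ["A", "A", "B"],
        "UMI": ["u1", "u2", "u1"],
        "seq": ["ACGT", "TTTT", "GGGG"],
    }
)
unresolved = pd.DataFrame(
    {
        "cellBC": ["A", "A"],
        "UMI": ["u1", "u1"],
        "seq": ["ACGT", "ACGA"],
    }
)

print(has_unique_pairs(resolved))
print(has_unique_pairs(unresolved))
```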