cassiopeia.pp.call_lineage_groups

cassiopeia.pp.call_lineage_groups(input_df, output_directory, min_umi_per_cell=10, min_avg_reads_per_umi=2.0, min_cluster_prop=0.005, min_intbc_thresh=0.05, inter_doublet_threshold=0.35, kinship_thresh=0.25, plot=False)

Assigns cells to their clonal populations.

Performs multiple rounds of filtering and assigning to lineage groups:

1. Iteratively generates putative lineage groups by forming intBC groups for each lineage group and then assigning cells based on how many intBCs they share with each intBC group (kinship).

2. Refines these putative groups by removing non-informative intBCs and reassigning cells through kinship.

3. Removes all inter-lineage doublets, defined as cells that have relatively equal kinship scores across multiple lineages and whose assignments are therefore ambiguous.

4. Finally, performs one more round of filtering non-informative intBCs and cellBCs with low UMI counts before returning a final table of lineage assignments, allele information, and read and UMI counts for each sample.
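The kinship-based assignment in step 1 and the doublet filter in step 3 can be sketched in plain Python. This is an illustration only, not Cassiopeia's implementation: the helper names (`kinship`, `assign_cell`, `is_doublet`) and the example intBC sets are hypothetical, and the sketch assumes that kinship is the fraction of a group's intBC set found in the cell and that both thresholds are inclusive.

```python
def kinship(cell_intbcs, group_intbcs):
    """Fraction of the group's intBC set found in the cell (assumed definition)."""
    if not group_intbcs:
        return 0.0
    return len(set(cell_intbcs) & set(group_intbcs)) / len(group_intbcs)


def assign_cell(cell_intbcs, groups, kinship_thresh=0.25):
    """Return (best_group, scores); best_group is None if below kinship_thresh."""
    scores = {name: kinship(cell_intbcs, ibcs) for name, ibcs in groups.items()}
    best = max(scores, key=scores.get)
    if scores[best] < kinship_thresh:
        return None, scores
    return best, scores


def is_doublet(scores, assigned, inter_doublet_threshold=0.35):
    """Flag a cell whose assigned group holds too small a share of its total kinship."""
    total = sum(scores.values())
    if total == 0:
        return True
    return scores[assigned] / total < inter_doublet_threshold


# Hypothetical intBC sets for three putative lineage groups.
groups = {
    "LG1": {"A", "B", "C", "D"},
    "LG2": {"E", "F", "G", "H"},
    "LG3": {"I", "J", "K", "L"},
}

# A cell whose intBCs all come from one group is assigned and retained.
group, scores = assign_cell({"A", "B", "C"}, groups)
clean_result = (group, is_doublet(scores, group))

# A cell with equal kinship to all three groups is flagged as a doublet.
group, scores = assign_cell({"A", "B", "E", "F", "I", "J"}, groups)
mixed_result = (group, is_doublet(scores, group))
```

Under these assumptions, the second cell's assigned group accounts for only one third of its total kinship, which falls below the default `inter_doublet_threshold` of 0.35.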

Parameters:
input_df DataFrame

The allele table of cellBC-UMI-allele groups to be annotated with lineage assignments

output_directory str

The folder to store the final table as well as plots

min_umi_per_cell int (default: 10)

The minimum number of UMIs a cell must have to be retained during filtering

min_avg_reads_per_umi float (default: 2.0)

The minimum coverage (i.e. average number of reads per UMI) a cell must have to be retained during filtering

min_cluster_prop float (default: 0.005)

The minimum cluster size in the putative lineage assignment step, as a proportion of the number of cells

min_intbc_thresh float (default: 0.05)

The threshold specifying the minimum proportion of cells in a lineage group that need to have an intBC in order for it to be retained during filtering. Also specifies the minimum proportion of cells that share an intBC with the most frequent intBC in forming putative lineage groups

inter_doublet_threshold float (default: 0.35)

The threshold specifying the minimum proportion of kinship a cell shares with its assigned lineage group out of all lineage groups for it to be retained during doublet filtering

kinship_thresh float (default: 0.25)

The threshold specifying the minimum proportion of intBCs shared between a cell and the intBC set of a lineage group needed to assign that cell to that lineage group in putative assignment

plot bool (default: False)

Indicates whether to generate plots
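The cell-level and intBC-level thresholds described above (`min_umi_per_cell`, `min_avg_reads_per_umi`, `min_intbc_thresh`) can be illustrated with a small plain-Python sketch. The helper names and the toy data are hypothetical, the thresholds are assumed to be inclusive, and the code is not taken from Cassiopeia itself.

```python
def passes_cell_filter(reads_per_umi, min_umi_per_cell=10, min_avg_reads_per_umi=2.0):
    """Keep a cell only if it has enough UMIs and enough average reads per UMI."""
    n_umis = len(reads_per_umi)
    if n_umis < min_umi_per_cell:
        return False
    return sum(reads_per_umi) / n_umis >= min_avg_reads_per_umi


def informative_intbcs(cell_to_intbcs, min_intbc_thresh=0.05):
    """Return intBCs present in at least min_intbc_thresh of a group's cells."""
    n_cells = len(cell_to_intbcs)
    counts = {}
    for intbcs in cell_to_intbcs.values():
        for ibc in set(intbcs):
            counts[ibc] = counts.get(ibc, 0) + 1
    return {ibc for ibc, n in counts.items() if n / n_cells >= min_intbc_thresh}


# Each list holds the read count of every UMI observed in one hypothetical cell.
deep_cell = [3] * 12      # 12 UMIs, 3 reads per UMI on average
sparse_cell = [5] * 4     # too few UMIs
shallow_cell = [1] * 15   # enough UMIs, but too few reads per UMI

# A hypothetical lineage group of 40 cells: all carry intBC "A", one also carries "X".
lineage_group = {f"cell{i}": {"A"} for i in range(40)}
lineage_group["cell0"] = {"A", "X"}
kept_intbcs = informative_intbcs(lineage_group)
```

Here intBC "X" appears in 1 of 40 cells (2.5%), below the default `min_intbc_thresh` of 0.05, so only "A" is kept.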

Return type:

DataFrame

Returns:

The final allele table annotated with lineage group assignments. The table is also saved to output_directory.
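A call with the documented defaults might look like the sketch below. The wrapper `annotate_lineage_groups` and the `LINEAGE_GROUP_KWARGS` dictionary are hypothetical conveniences, not part of Cassiopeia; only the function name and keyword arguments come from the signature above. The sketch assumes Cassiopeia is installed and that the function returns the annotated table, as the declared return type suggests.

```python
# Documented defaults from the signature above, collected so they can be overridden explicitly.
LINEAGE_GROUP_KWARGS = {
    "min_umi_per_cell": 10,
    "min_avg_reads_per_umi": 2.0,
    "min_cluster_prop": 0.005,
    "min_intbc_thresh": 0.05,
    "inter_doublet_threshold": 0.35,
    "kinship_thresh": 0.25,
    "plot": False,
}


def annotate_lineage_groups(allele_table, output_directory, **overrides):
    """Hypothetical wrapper: call call_lineage_groups with defaults plus overrides."""
    unknown = set(overrides) - set(LINEAGE_GROUP_KWARGS)
    if unknown:
        raise TypeError(f"Unknown parameters: {sorted(unknown)}")

    # Imported lazily so the parameter check above works without Cassiopeia installed.
    import cassiopeia as cas

    kwargs = {**LINEAGE_GROUP_KWARGS, **overrides}
    return cas.pp.call_lineage_groups(allele_table, output_directory, **kwargs)


# Example (requires Cassiopeia and an allele table of cellBC-UMI-allele groups):
# annotated = annotate_lineage_groups(allele_table, "lineage_output", plot=True)
```

Validating keyword names before the call catches misspelled thresholds early instead of silently falling back to a default.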