cassiopeia.pp.compute_empirical_indel_priors
- cassiopeia.pp.compute_empirical_indel_priors(allele_table, grouping_variables=['intBC'], cut_sites=None)
Computes indel prior probabilities.
Generates indel prior probabilities from the input allele table. The general idea behind this procedure is to count the number of times an indel occurs independently. By default, each intBC is treated as an independent group, which is valid if the input allele table is a single clonal population. In this case, the procedure counts the number of intBCs that contain a particular indel and divides by the total number of intBCs in the allele table. A user can refine this by grouping intBCs by additional variables, such as lineage group (this is especially useful if the same intBC might occur in several clonal populations). The procedure then counts the number of unique lineage-intBC combinations in which an indel occurs.
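The counting idea described above can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not Cassiopeia's implementation: the helper name `sketch_indel_priors`, the toy table, the placeholder indel labels, and the output column names `count` and `freq` are all hypothetical, and the sketch makes no attempt to exclude uncut or missing states.

```python
import pandas as pd


def sketch_indel_priors(allele_table, grouping_variables=("intBC",), cut_sites=("r1", "r2")):
    """Illustrative only (hypothetical helper, not the library's code)."""
    group_cols = list(grouping_variables)
    # Long format: one row per (group columns..., cut site, indel) observation.
    long = allele_table.melt(
        id_vars=group_cols, value_vars=list(cut_sites), value_name="indel"
    )
    # Keep one row per distinct (group, indel) pair so that an indel is
    # counted at most once per independent group.
    observed = long[group_cols + ["indel"]].drop_duplicates()
    # Denominator: number of distinct groups in the table.
    n_groups = len(allele_table[group_cols].drop_duplicates())
    counts = observed["indel"].value_counts()
    return pd.DataFrame({"count": counts, "freq": counts / n_groups})


# Toy allele table; column names and indel labels are assumptions for illustration.
toy = pd.DataFrame(
    {
        "cellBC": ["c1", "c2", "c3", "c4"],
        "intBC": ["A", "A", "A", "B"],
        "lineageGrp": [1, 1, 2, 2],
        "r1": ["ind_x", "ind_x", "ind_x", "ind_y"],
        "r2": ["ind_z", "ind_z", "ind_w", "ind_z"],
    }
)

by_intbc = sketch_indel_priors(toy)
by_lineage_intbc = sketch_indel_priors(toy, grouping_variables=("intBC", "lineageGrp"))
```

In the toy table, intBC `A` appears in two lineage groups. Grouping by `intBC` alone treats it as one group, so `ind_x` is counted once out of two groups. Grouping by `intBC` and `lineageGrp` treats it as two independent groups, so `ind_x` is counted twice out of three.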
- Parameters:
- allele_table (DataFrame): AlleleTable
- grouping_variables (List[str], default: ['intBC']): Variables to stratify data by, to treat as independent groups in counting indel occurrences. These must be columns in the allele table.
- cut_sites (Optional[List[str]], default: None): Columns in the AlleleTable to treat as cut sites. If None, we assume that the cut sites are denoted by columns of the form "r{int}" (e.g. "r1").
- Return type: DataFrame
- Returns: A DataFrame mapping indel identities to their prior probabilities.
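When cut_sites is None, columns of the form "r{int}" are assumed to hold the cut sites. One way such a naming convention could be matched is sketched below. The helper `guess_cut_site_columns` and the example column list are hypothetical, and the library's actual matching logic may differ.

```python
import re


def guess_cut_site_columns(columns):
    """Hypothetical helper: pick column names that look like 'r1', 'r2', ..."""
    # Full match on "r" followed by one or more digits.
    pattern = re.compile(r"r\d+")
    return [c for c in columns if pattern.fullmatch(str(c))]


# Example column names (assumed for illustration).
cols = ["cellBC", "intBC", "r1", "r2", "r3", "readCount", "lineageGrp"]
detected = guess_cut_site_columns(cols)
```

Using a full match rather than a prefix match keeps columns such as `readCount` from being mistaken for cut sites.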