cassiopeia.data.sample_bootstrap_allele_tables

cassiopeia.data.sample_bootstrap_allele_tables(allele_table, indel_priors=None, num_bootstraps=10, random_state=None, cut_sites=None)

Generates bootstrap character matrices from an allele table.

This function takes an allele table generated by the Cassiopeia preprocessing pipeline and produces several bootstrap character matrices, sampled at the level of intBCs rather than individual cut sites as in sample_bootstrap_character_matrices. This is useful because cut sites on the same intBC TargetSite are often dependent on one another.
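To illustrate what sampling at the intBC level means, here is a minimal standard-library sketch. It is not Cassiopeia's implementation: the helper names resample_intbcs and expand_to_characters, the intBC labels, and the "intBC_site" naming of characters are all hypothetical, and it assumes the usual bootstrap convention of drawing as many units as there are in the original set, with replacement.

```python
import random


def resample_intbcs(intbcs, seed=None):
    """Draw len(intbcs) intBCs with replacement (one bootstrap replicate)."""
    rng = random.Random(seed)
    return [rng.choice(intbcs) for _ in intbcs]


def expand_to_characters(sampled_intbcs, cut_sites):
    """Expand each sampled intBC into all of its cut sites.

    Because cut sites are emitted per intBC, sites that share an intBC
    always enter (or leave) a replicate together.
    """
    return [
        f"{intbc}_{site}"
        for intbc in sampled_intbcs
        for site in cut_sites
    ]


# Hypothetical labels, for illustration only.
intbcs = ["intBC_A", "intBC_B", "intBC_C"]
cut_sites = ["r1", "r2", "r3"]

replicate = resample_intbcs(intbcs, seed=0)
characters = expand_to_characters(replicate, cut_sites)
```

Resampling whole intBCs keeps dependent cut sites together in each replicate, whereas resampling individual cut sites would treat them as independent.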

Parameters:
allele_table DataFrame

AlleleTable from the Cassiopeia preprocessing pipeline.

indel_priors Optional[DataFrame] (default: None)

A dataframe mapping indel identities to prior probabilities.

num_bootstraps int (default: 10)

Number of bootstrap samples to create.

random_state Optional[RandomState] (default: None)

A NumPy random state, for reproducibility.

cut_sites Optional[List[str]] (default: None)

Columns in the AlleleTable to treat as cut sites. If None, the cut sites are assumed to be the columns named in the form “r{int}” (e.g. “r1”).
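As a rough illustration of the “r{int}” convention, the sketch below picks out matching column names with a regular expression. The helper infer_cut_sites, the pattern, and the example column names are assumptions made for this example; how Cassiopeia itself detects these columns may differ.

```python
import re

# Assumed pattern: the letter "r" followed by one or more digits, nothing else.
CUT_SITE_PATTERN = re.compile(r"r\d+")


def infer_cut_sites(columns):
    """Return the column names that look like cut sites ("r1", "r2", ...)."""
    return [c for c in columns if CUT_SITE_PATTERN.fullmatch(c)]


# Illustrative column names only.
columns = ["cellBC", "intBC", "r1", "r2", "r3", "readCount", "UMI"]
detected = infer_cut_sites(columns)  # ["r1", "r2", "r3"]
```

If the cut-site columns in your allele table follow a different naming scheme, pass them explicitly through cut_sites.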

Return type:

List[Tuple[DataFrame, Dict[int, Dict[int, float]], Dict[int, Dict[int, str]], List[str]]]

Returns:

A list of bootstrap samples, each a tuple of the form (bootstrapped character matrix, prior dictionary, state-to-indel mapping, bootstrapped intBC set).
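The returned list can be consumed by unpacking each four-element tuple. The sketch below shows that pattern on stand-in data; summarize_bootstraps is a hypothetical helper, and the toy sample uses a nested list in place of the DataFrame so that the example runs without Cassiopeia or pandas installed.

```python
def summarize_bootstraps(samples):
    """Summarize each (matrix, priors, state_to_indel, intbc_set) tuple."""
    summaries = []
    for character_matrix, priors, state_to_indel, intbc_set in samples:
        summaries.append(
            {
                # len() of a DataFrame (or of the stand-in list) is its row count.
                "n_rows": len(character_matrix),
                # Number of outer keys in the prior dictionary.
                "n_prior_entries": len(priors),
                # Number of intBCs drawn for this replicate.
                "n_intbcs": len(intbc_set),
            }
        )
    return summaries


# Stand-in data with the documented tuple shape; values are made up.
toy_samples = [
    ([[1, 0, 2], [1, 1, 0]], {0: {1: 0.5}}, {0: {1: "indel_x"}}, ["intBC_A"]),
]
summary = summarize_bootstraps(toy_samples)
```

With real output, samples would be the list returned by sample_bootstrap_allele_tables, and its length would be num_bootstraps.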