cassiopeia.data.sample_bootstrap_allele_tables
- cassiopeia.data.sample_bootstrap_allele_tables(allele_table, indel_priors=None, num_bootstraps=10, random_state=None, cut_sites=None)
Generates bootstrap character matrices from an allele table.
This function takes an allele table generated by the Cassiopeia preprocessing pipeline and produces several bootstrap character matrices, resampling with respect to intBCs rather than individual cut-sites as in sample_bootstrap_character_matrices. This is useful because cut-sites on the same intBC TargetSite are often dependent on one another.
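The difference between resampling intBCs and resampling individual cut-sites can be illustrated with a minimal, standard-library-only sketch. It is not Cassiopeia's implementation: the toy table, the helper names `bootstrap_intbcs` and `characters_for`, and the `intBC_site` character naming are all assumptions made for illustration.

```python
import random

# Toy allele table: one row per (cell, intBC); r1..r3 are cut sites on that intBC.
# All values are made-up indel labels, used only for illustration.
allele_table = [
    {"cellBC": "cellA", "intBC": "intBC1", "r1": "1D", "r2": "None", "r3": "2I"},
    {"cellBC": "cellA", "intBC": "intBC2", "r1": "None", "r2": "3D", "r3": "None"},
    {"cellBC": "cellB", "intBC": "intBC1", "r1": "1D", "r2": "None", "r3": "None"},
    {"cellBC": "cellB", "intBC": "intBC3", "r1": "4D", "r2": "4D", "r3": "4D"},
]
cut_sites = ["r1", "r2", "r3"]


def bootstrap_intbcs(table, rng):
    """Sample intBCs with replacement (hypothetical helper, not Cassiopeia's code)."""
    intbcs = sorted({row["intBC"] for row in table})
    # Draw as many intBCs as there are unique intBCs; repeats are allowed.
    return [rng.choice(intbcs) for _ in intbcs]


def characters_for(sampled_intbcs, sites):
    """Expand each sampled intBC into all of its cut-site characters.

    The "intBC_site" naming is an assumption for this sketch.
    """
    return [f"{intbc}_{site}" for intbc in sampled_intbcs for site in sites]


rng = random.Random(0)  # seeded for reproducibility
sampled = bootstrap_intbcs(allele_table, rng)
characters = characters_for(sampled, cut_sites)
```

Because whole intBCs are drawn, the three cut-sites of any sampled intBC always enter the bootstrap sample together, which preserves the dependencies between them. Sampling cut-sites individually would break those groups apart.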
- Parameters:
- allele_table (DataFrame): AlleleTable from the Cassiopeia preprocessing pipeline.
- indel_priors (Optional[DataFrame], default: None): A dataframe mapping indel identities to prior probabilities.
- num_bootstraps (int, default: 10): Number of bootstrap samples to create.
- random_state (Optional[RandomState], default: None): A numpy random state for reproducibility.
- cut_sites (Optional[List[str]], default: None): Columns in the AlleleTable to treat as cut sites. If None, the cut-sites are assumed to be the columns of the form "r{int}" (e.g. "r1").
- Return type:
List[Tuple[DataFrame, Dict[int, Dict[int, float]], Dict[int, Dict[int, str]], List[str]]]
- Returns:
- A list of bootstrap samples, each a tuple of (bootstrapped character matrix, prior dictionary, state-to-indel mapping, bootstrapped intBC set).
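The default behavior of cut_sites (treating columns of the form "r{int}" as cut sites when None is passed) can be sketched roughly as follows. The helper `infer_cut_sites` is hypothetical, and the exact matching rule Cassiopeia applies may differ. This only illustrates the naming convention described in the parameter list.

```python
import re


def infer_cut_sites(columns):
    """Return the column names of the form r<int>, e.g. "r1".

    Hypothetical helper for illustration; not part of the Cassiopeia API.
    """
    pattern = re.compile(r"r\d+")
    # fullmatch ensures the whole name is "r" followed only by digits.
    return [column for column in columns if pattern.fullmatch(column)]


# Example column names; the non-cut-site names are assumptions for this sketch.
columns = ["cellBC", "intBC", "r1", "r2", "r3", "readCount", "rX", "UMI"]
inferred = infer_cut_sites(columns)
```

If an allele table uses different column names for its cut sites, passing them explicitly through cut_sites avoids relying on this convention.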