cassiopeia.tl.estimate_mutation_rate

cassiopeia.tl.estimate_mutation_rate(tree, continuous=True, assume_root_implicit_branch=True, layer=None)

Estimates the mutation rate from a tree and character matrix.

Infers the mutation rate from the proportion of cell/character entries in the leaves that have a mutated (non-0) state, together with the node depth or the total time of the tree. The result is either a continuous rate or a per-generation rate at which lineages accumulate mutations.

In estimating the mutation rate, we use the observed proportion of mutated entries in the character matrix as an estimate of the probability that a mutation occurs on a lineage. Using this probability, we can then infer the mutation rate.

This function uses the observed mutation proportion stored as mutation_proportion in tree.parameters. If this field is not populated, the proportion is inferred using get_proportion_of_mutation.
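As a rough illustration of what an observed mutation proportion means, the sketch below counts mutated entries in a small character matrix held as a list of lists. It is not the library's get_proportion_of_mutation: the function name mutated_proportion, the use of -1 as a missing-data marker, and the choice to exclude missing entries from the denominator are all assumptions made for this example.

```python
# Hypothetical stand-in for a character matrix: rows are cells, columns are characters.
# Assumed convention (not taken from the library): 0 = uncut, positive = mutated, -1 = missing.
MISSING = -1


def mutated_proportion(matrix, missing=MISSING):
    """Return the fraction of non-missing entries that carry a mutated (non-0) state."""
    observed = [state for row in matrix for state in row if state != missing]
    if not observed:
        raise ValueError("character matrix has no observed entries")
    mutated = sum(1 for state in observed if state != 0)
    return mutated / len(observed)


character_matrix = [
    [0, 1, 2],
    [0, 1, -1],
    [3, 0, 0],
]

# 8 observed entries, 4 of them mutated.
proportion = mutated_proportion(character_matrix)
print(proportion)
```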

In the case where the rate is per-generation (probability a mutation occurs on an edge), it is estimated using:

mutated proportion = 1 - (1 - mutation_rate) ^ (average depth of tree)
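Solving this relation for the rate gives mutation_rate = 1 - (1 - mutated proportion) ^ (1 / average depth). The following is a minimal sketch of that inversion in plain Python; the function name per_generation_rate and its input checks are illustrative assumptions, not the library's implementation.

```python
def per_generation_rate(mutated_proportion, average_depth):
    """Invert p = 1 - (1 - r) ** d for the per-generation rate r."""
    # Illustrative input checks; the library's own validation may differ.
    if not 0 <= mutated_proportion < 1:
        raise ValueError("mutated_proportion must be in [0, 1)")
    if average_depth <= 0:
        raise ValueError("average_depth must be positive")
    return 1 - (1 - mutated_proportion) ** (1 / average_depth)


# With 75% of entries mutated after an average of 2 generations,
# each generation mutates an unmutated site with probability 0.5.
rate = per_generation_rate(0.75, 2)
print(rate)
```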

In the case when the rate is continuous, it is estimated using:

mutated proportion = ExponentialCDF(average time of tree, mutation rate)
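Since the exponential CDF is 1 - exp(-rate * time), this relation can be inverted as mutation_rate = -ln(1 - mutated proportion) / average time. The sketch below shows that inversion in plain Python; continuous_rate is a hypothetical helper name, and the input checks are assumptions for the example rather than the library's behavior.

```python
import math


def continuous_rate(mutated_proportion, average_time):
    """Invert p = 1 - exp(-r * t) for the continuous rate r."""
    # Illustrative input checks; the library's own validation may differ.
    if not 0 <= mutated_proportion < 1:
        raise ValueError("mutated_proportion must be in [0, 1)")
    if average_time <= 0:
        raise ValueError("average_time must be positive")
    # log1p(-p) computes ln(1 - p) accurately for small p.
    return -math.log1p(-mutated_proportion) / average_time


# Round trip: simulate the proportion expected for a known rate, then recover the rate.
true_rate = 0.3
average_time = 2.0
proportion = 1 - math.exp(-true_rate * average_time)
rate = continuous_rate(proportion, average_time)
print(rate)
```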

Note that these naive estimates perform better when the tree is ultrametric in depth or time. The average depth/lineage time of the tree is used as a proxy for the depth/total time when the tree is not ultrametric.

In the inference, we need to consider whether to assume an implicit root, specified by assume_root_implicit_branch. In the case where the tree does not have a single leading edge from the root representing the progenitor cell before cell division begins, this additional edge is added to the total time in calculating the estimate if assume_root_implicit_branch is True.
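To make the role of the implicit root branch concrete for the per-generation case, here is a small sketch that computes the average leaf depth of a tree stored as a parent-to-children dictionary. It assumes that the implicit branch counts as one extra generation and that it is added only when the root does not already have exactly one child; both points, and the helper names, are assumptions for illustration rather than a description of the library's code.

```python
def leaf_depths(children, root):
    """Return the depth (number of edges from the root) of every leaf."""
    depths = []
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        kids = children.get(node, [])
        if not kids:
            depths.append(depth)
        for kid in kids:
            stack.append((kid, depth + 1))
    return depths


def effective_mean_depth(children, root, assume_root_implicit_branch=True):
    """Average leaf depth, optionally extended by an implicit root branch."""
    depths = leaf_depths(children, root)
    mean_depth = sum(depths) / len(depths)
    # Assumption: the implicit branch adds one generation, and only when the
    # root lacks a single leading edge of its own.
    if assume_root_implicit_branch and len(children.get(root, [])) != 1:
        mean_depth += 1
    return mean_depth


# Root splits immediately, so there is no explicit leading edge.
bifurcating = {"root": ["a", "b"], "a": ["c", "d"]}
# Root has a single child, so the leading edge is already present.
with_leading_edge = {"root": ["p"], "p": ["x", "y"]}

depth_with_branch = effective_mean_depth(bifurcating, "root")
depth_without_branch = effective_mean_depth(bifurcating, "root", assume_root_implicit_branch=False)
depth_explicit = effective_mean_depth(with_leading_edge, "root")
print(depth_with_branch, depth_without_branch, depth_explicit)
```

Because the tree in the first case is not ultrametric (leaf depths 1, 2 and 2), the average depth serves as the proxy described above.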

Parameters:
tree CassiopeiaTree

The CassiopeiaTree specifying the tree and the character matrix

continuous bool (default: True)

Whether to calculate a continuous mutation rate, accounting for branch lengths. If False, a discrete (per-generation) mutation rate is calculated using the node depths.

assume_root_implicit_branch bool (default: True)

Whether to assume that there is an implicit branch leading from the root if the tree does not already have one.

layer str | None (default: None)

Layer to use for character matrix. If this is None, then the current character_matrix variable will be used.

Return type:

float

Returns:

The estimated mutation rate

Raises:

ParameterEstimateError if the mutation_proportion parameter is not between 0 and 1