cassiopeia.tl.IIDExponentialBayesian#

class cassiopeia.tl.IIDExponentialBayesian(mutation_rate, birth_rate, sampling_probability, discretization_level=600)[source]#

Inference under a subsampled Birth Process with IID memoryless mutations.

In more detail, this model assumes that CRISPR/Cas9 mutates each site independently and identically, with an exponential waiting time, and that the phylogeny follows a subsampled Birth Process. It further assumes that the tree has depth exactly 1. Conditional on the observed topology and on the character data, the posterior mean branch lengths are used as branch length estimates.
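The mutation assumption above can be illustrated with a minimal sketch. With an exponential waiting time at rate r, a site is still unmutated after a branch of length t with probability exp(-r * t). The function names below are hypothetical and are not part of cassiopeia; the sketch also ignores which state a mutated site ends up in, so it is not the full likelihood used by the estimator.

```python
import math


def mutation_probability(mutation_rate: float, branch_length: float) -> float:
    """Probability that a single site mutates along one branch.

    Assumes an exponential waiting time with the given rate, so the
    result is 1 - exp(-mutation_rate * branch_length).
    """
    if mutation_rate < 0 or branch_length < 0:
        raise ValueError("mutation_rate and branch_length must be non-negative")
    # -expm1(-x) equals 1 - exp(-x) and is accurate for small x.
    return -math.expm1(-mutation_rate * branch_length)


def site_pattern_log_probability(
    mutation_rate: float,
    branch_length: float,
    n_mutated: int,
    n_unmutated: int,
) -> float:
    """Log-probability that n_mutated sites mutate and n_unmutated do not.

    Sites are treated as independent and identically distributed, so the
    log-probabilities of the individual sites simply add up.
    """
    if mutation_rate <= 0 or branch_length <= 0:
        raise ValueError("mutation_rate and branch_length must be positive")
    log_stay = -mutation_rate * branch_length
    log_mutate = math.log(-math.expm1(log_stay))
    return n_mutated * log_mutate + n_unmutated * log_stay
```

Because the sites are IID, only the counts of mutated and unmutated sites on a branch matter in this simplified view, not their positions.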

This estimator requires that the ancestral characters be provided (these can be imputed with CassiopeiaTree’s reconstruct_ancestral_characters method if they are not known, which is usually the case for real data).

This estimator also assumes that the tree is binary (except for the root, which should have degree 1).
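The topology requirement can be checked before calling the estimator. The following is a minimal sketch that works on a plain dictionary mapping each node to its list of children; it is not how cassiopeia validates trees internally, and the function name is hypothetical.

```python
from typing import Dict, List


def is_valid_topology(children: Dict[str, List[str]], root: str) -> bool:
    """Return True if the root has exactly one child and every other
    node is either a leaf (no children) or has exactly two children."""
    if len(children.get(root, [])) != 1:
        return False
    # Walk the tree below the root with an explicit stack.
    stack = list(children[root])
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if len(kids) not in (0, 2):
            return False
        stack.extend(kids)
    return True
```

For example, `{"root": ["a"], "a": ["b", "c"]}` passes the check, while a tree whose root has two children does not.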

Missing states are treated as missing at random by the model.

Parameters:
mutation_rate float

The CRISPR/Cas9 mutation rate.

birth_rate float

The phylogeny birth rate.

sampling_probability float

The probability that a leaf in the ground truth tree was sampled. Must be in (0, 1].

discretization_level int (default: 600)

How many timesteps are used to discretize time.

mutation_rate#

The CRISPR/Cas9 mutation rate.

birth_rate#

The phylogeny birth rate.

sampling_probability#

The probability that a leaf in the ground truth tree was sampled.

discretization_level#

How many timesteps are used to discretize time.

log_likelihood#

The log-likelihood of the training data under the model.

Methods

estimate_branch_lengths(tree)[source]#

Estimate branch lengths of the tree using the given model.

The tree must be binary except for the root, which should have degree 1.

This method raises a ValueError if the discretization_level is too small or the tree topology is not valid.

The computational complexity of this method is: O(discretization_level * tree.n_cell * tree.n_character)

Raises:
  • ValueError if discretization_level is too small or the tree topology is not valid.

Return type:

None

property log_likelihood#

The log-likelihood of the training data under the estimated model.

log_joints(node)[source]#

Log joint node time probabilities.

The log joint probability density of the observed tree topology, state vectors, and all possible times for the node. In other words: log P(node time = t, character states, tree topology) for t in [0, T] where T is the discretization_level.

Parameters:
node str

An internal node of the tree, for which to compute the posterior log joint.

Return type:

array

Returns:

log P(node time = t, character states, tree topology) for t in [0, T], where T is the discretization_level.
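The log joint values are related to the posterior over node times by normalization: dividing the joint by its sum over t gives P(node time = t | character states, tree topology). The following is a minimal sketch of that normalization using the log-sum-exp trick for numerical stability. It operates on a plain list of floats, the function name is hypothetical, and it is not necessarily how the library computes the posterior.

```python
import math
from typing import List


def normalize_log_joints(log_joints: List[float]) -> List[float]:
    """Convert log joint values over t = 0..T into posterior probabilities."""
    largest = max(log_joints)
    if largest == -math.inf:
        raise ValueError("all log joint values are -inf")
    # Subtract the maximum before exponentiating to avoid underflow.
    log_norm = largest + math.log(sum(math.exp(x - largest) for x in log_joints))
    return [math.exp(x - log_norm) for x in log_joints]
```

Working in log space matters here because joint probabilities over many characters are typically far too small to represent directly as floats.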

posterior_time(node)[source]#

The posterior distribution of the time for a node.

The posterior time distribution of a node, numerically computed, i.e.: P(node time = t | character states, tree topology) for t in [0, T] where T is the discretization_level.

Parameters:
node str

An internal node of the CassiopeiaTree.

Return type:

array

Returns:

P(node time = t | character states, tree topology) for t in [0, T], where T is the discretization_level.

Raises:

ValueError if the node is not an internal node.
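A posterior over discretized node times can be summarized by its mean. The following is a minimal sketch under two assumptions that are not stated in this page: the array has T + 1 entries for t = 0, ..., T, and grid point t corresponds to time t / T, in line with the tree having depth exactly 1. The function name is hypothetical.

```python
from typing import Sequence


def posterior_mean_time(posterior: Sequence[float]) -> float:
    """Posterior mean node time on a [0, 1] scale.

    Assumes posterior[t] = P(node time = t | data) for t = 0..T and
    that grid point t maps to time t / T.
    """
    n_steps = len(posterior) - 1
    if n_steps < 1:
        raise ValueError("posterior must have at least two entries")
    if abs(sum(posterior) - 1.0) > 1e-6:
        raise ValueError("posterior must sum to 1")
    return sum(t * p for t, p in enumerate(posterior)) / n_steps
```

By linearity of expectation, the posterior mean length of a branch is the posterior mean time of the child node minus that of the parent node.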

property mutation_rate#

The CRISPR/Cas9 mutation rate.

property birth_rate#

The phylogeny birth rate.

property discretization_level#

How many timesteps are used to discretize time.

property sampling_probability#

The probability that a leaf in the ground truth tree was sampled.