cassiopeia.tl.IIDExponentialMLE

class cassiopeia.tl.IIDExponentialMLE(minimum_branch_length=0.01, relative_mutation_rates=None, verbose=False, solver='SCS')

MLE under a model of IID memoryless CRISPR/Cas9 mutations.

In more detail, this model assumes that CRISPR/Cas9 mutates each site independently and identically, with an exponential waiting time. The tree is assumed to have depth exactly 1, and the user can provide a minimum branch length. The MLE under these assumptions can be formulated as a special kind of convex optimization problem known as an exponential cone program, which off-the-shelf open-source solvers handle readily.
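To make the model concrete, here is a minimal sketch of the log-likelihood it implies for a single edge, assuming one shared rate for all sites. The function names `edge_log_likelihood` and `single_edge_mle` are illustrative and not part of Cassiopeia; the estimator itself fits all edges jointly through a convex program rather than one edge at a time.

```python
import math


def edge_log_likelihood(rate, length, n_unmutated, n_mutated):
    """Log-likelihood of one edge under IID exponential waiting times.

    n_unmutated: sites that are unmutated at both ends of the edge.
    n_mutated:   sites that acquire a mutation somewhere along the edge.

    A site stays unmutated over the edge with probability
    exp(-rate * length) and mutates with probability 1 - exp(-rate * length).
    """
    if rate <= 0 or length <= 0:
        raise ValueError("rate and length must be positive")
    x = rate * length
    log_stay = -x
    # log(1 - exp(-x)), written with expm1 for accuracy when x is small.
    log_mutate = math.log(-math.expm1(-x))
    return n_unmutated * log_stay + n_mutated * log_mutate


def single_edge_mle(n_unmutated, n_mutated):
    """Closed-form maximizer of rate * length for one isolated edge."""
    if n_unmutated == 0 or n_mutated == 0:
        # With no mutations the optimum is 0; with no survivors it is unbounded.
        raise ValueError("degenerate edge: the optimum is 0 or unbounded")
    return -math.log(n_unmutated / (n_unmutated + n_mutated))
```

For an isolated edge with no mutations the optimum would sit at length 0, which is the collapse that a minimum branch length guards against.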

This estimator requires that the ancestral characters be provided (these can be imputed with CassiopeiaTree’s reconstruct_ancestral_characters method if they are not known, which is usually the case for real data).

The estimated mutation rate(s) will be stored as an attribute called mutation_rate. The log-likelihood will be stored in an attribute called log_likelihood.
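The workflow described so far can be sketched as a small helper. It is duck-typed and uses only the method and attribute names mentioned on this page; `fit_branch_lengths` and its `impute_ancestral` flag are hypothetical names introduced for illustration. With the real library, `tree` would be a CassiopeiaTree and `estimator` an instance such as `IIDExponentialMLE(minimum_branch_length=0.01)`.

```python
def fit_branch_lengths(tree, estimator, impute_ancestral=True):
    """Run the estimator on a tree and collect the fitted quantities.

    `tree` is expected to behave like a CassiopeiaTree and `estimator`
    like an IIDExponentialMLE instance.
    """
    if impute_ancestral:
        # The estimator needs ancestral characters; impute them when unknown.
        tree.reconstruct_ancestral_characters()
    # Returns None; the results are read back from the estimator afterwards.
    estimator.estimate_branch_lengths(tree)
    return {
        "mutation_rate": estimator.mutation_rate,
        "log_likelihood": estimator.log_likelihood,
    }
```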

Missing states are treated as missing at random by the model.
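One way to picture the missing-at-random treatment is that a site with a missing state simply contributes nothing to the likelihood of the edge. The sketch below counts the informative sites on one edge under three assumptions that are not stated on this page: 0 encodes the unmutated state, -1 encodes a missing state, and mutations are irreversible. `count_edge_sites` is a hypothetical helper, not a Cassiopeia function.

```python
def count_edge_sites(parent_states, child_states, unmutated=0, missing=-1):
    """Count informative sites on one edge, skipping missing entries.

    Returns (n_unmutated, n_mutated): sites that stay unmutated, and sites
    that go from unmutated at the parent to mutated at the child.
    """
    if len(parent_states) != len(child_states):
        raise ValueError("parent and child must have the same number of sites")
    n_unmutated = 0
    n_mutated = 0
    for parent, child in zip(parent_states, child_states):
        if parent == missing or child == missing:
            continue  # missing at random: the site is ignored on this edge
        if parent != unmutated:
            continue  # assumed irreversible: already mutated at the parent
        if child == unmutated:
            n_unmutated += 1
        else:
            n_mutated += 1
    return n_unmutated, n_mutated
```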

Parameters:
minimum_branch_length float (default: 0.01)

Estimated branch lengths will be constrained to have length at least this value. By default it is set to 0.01, since the MLE tends to collapse mutationless edges to length 0.

relative_mutation_rates Optional[List[float]] (default: None)

List of positive floats, one per character site, giving the relative mutation rate at each site. Must be either fully specified or None, in which case all sites are assumed to evolve at the same rate.

solver str (default: 'SCS')

Convex optimization solver to use. Can be “SCS”, “ECOS”, or “MOSEK”. Note that the MOSEK solver must be installed separately.

verbose bool (default: False)

Verbosity level.
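The constraints stated for these parameters can be checked before constructing the estimator. The following is a minimal sketch; `build_estimator_kwargs` and `SUPPORTED_SOLVERS` are hypothetical names, and the library may perform its own, possibly different, validation.

```python
SUPPORTED_SOLVERS = ("SCS", "ECOS", "MOSEK")


def build_estimator_kwargs(
    n_sites,
    minimum_branch_length=0.01,
    relative_mutation_rates=None,
    solver="SCS",
    verbose=False,
):
    """Validate the documented constraints and return constructor kwargs."""
    if solver not in SUPPORTED_SOLVERS:
        raise ValueError(f"solver must be one of {SUPPORTED_SOLVERS}")
    if relative_mutation_rates is not None:
        rates = list(relative_mutation_rates)
        # Must be fully specified: exactly one rate per character site.
        if len(rates) != n_sites:
            raise ValueError("need exactly one relative rate per character site")
        if any(rate <= 0 for rate in rates):
            raise ValueError("relative mutation rates must be positive")
        relative_mutation_rates = rates
    return {
        "minimum_branch_length": minimum_branch_length,
        "relative_mutation_rates": relative_mutation_rates,
        "solver": solver,
        "verbose": verbose,
    }
```

The resulting dictionary could then be unpacked into the constructor, for example `IIDExponentialMLE(**kwargs)`.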

mutation_rate

The estimated CRISPR/Cas9 mutation rate, assuming that the tree has depth exactly 1.

log_likelihood

The log-likelihood of the training data under the estimated model.

Methods

estimate_branch_lengths(tree)

Estimates branch lengths for the given tree by maximum likelihood under a model of IID memoryless CRISPR/Cas9 mutations.

The only caveat is that this method raises an IIDExponentialMLEError if the underlying convex optimization solver fails, or a ValueError if the character matrix is degenerate (fully mutated, or fully unmutated).

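Raises:
IIDExponentialMLEError if the underlying convex optimization solver fails.
ValueError if the character matrix is degenerate (fully mutated, or fully unmutated).

Return type:
None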
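Because a solver failure and a degenerate character matrix surface as different exceptions, a caller can treat them differently. The sketch below retries with another solver on solver failure and lets ValueError propagate, since changing the solver cannot repair degenerate data. `estimate_with_fallback` is a hypothetical helper; the exception class is passed in as `solver_error` (IIDExponentialMLEError in the library) so that no import path is assumed here.

```python
def estimate_with_fallback(tree, make_estimator, solver_error, solvers=("SCS", "ECOS")):
    """Try each solver in turn until one succeeds.

    make_estimator: callable accepting `solver=` and returning an estimator,
        for example a lambda wrapping the IIDExponentialMLE constructor.
    solver_error: exception class raised when the solver fails.
    Returns (estimator, failures), where failures lists (solver, exception)
    pairs for the solvers that were tried and failed first.
    """
    failures = []
    for solver in solvers:
        estimator = make_estimator(solver=solver)
        try:
            estimator.estimate_branch_lengths(tree)
        except solver_error as exc:
            failures.append((solver, exc))
            continue
        return estimator, failures
    tried = [name for name, _ in failures]
    raise RuntimeError(f"all solvers failed: {tried}")
```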

property log_likelihood

The log-likelihood of the training data under the estimated model.

property mutation_rate

The estimated CRISPR/Cas9 mutation rate(s) under the given model. If relative_mutation_rates is specified, we return a list of rates (one per site). Otherwise all sites have the same rate and that rate is returned.
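Since the return value is either a single rate or a list with one rate per site, downstream code may want a uniform shape. A minimal sketch of such a normalization follows; `per_site_rates` is a hypothetical helper, and it assumes the single rate is returned as a plain number.

```python
def per_site_rates(mutation_rate, n_sites):
    """Return a list with one mutation rate per character site."""
    if isinstance(mutation_rate, (int, float)):
        # A single shared rate: repeat it for every site.
        return [float(mutation_rate)] * n_sites
    rates = [float(rate) for rate in mutation_rate]
    if len(rates) != n_sites:
        raise ValueError("expected one rate per character site")
    return rates
```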