cassiopeia.solver.NeighborJoiningSolver

class cassiopeia.solver.NeighborJoiningSolver(dissimilarity_function=<function weighted_hamming_distance>, add_root=False, prior_transformation='negative_log')[source]

Neighbor-Joining class for Cassiopeia.

Implements the Neighbor-Joining algorithm described by Saitou and Nei (1987) as a derived class of DistanceSolver. This class inherits the generic solve method, but implements its own procedure for finding cherries by minimizing the Q-criterion between samples.

Parameters
dissimilarity_function : Callable[[array, array, int, Dict[int, Dict[int, float]]], float], NoneOptional[Callable[[array, array, int, Dict[int, Dict[int, float]]], float]] (default: <function weighted_hamming_distance at 0x7fa4d45d9598>)

A function by which to compute the dissimilarity map. Optional if a dissimilarity map is already provided.

add_root : boolbool (default: False)

Whether or not to add an implicit root the tree, i.e. a root with unmutated characters. If set to False, and no explicit root is provided in the CassiopeiaTree, then will return an unrooted, undirected tree

prior_transformation : strstr (default: 'negative_log')

Function to use when transforming priors into weights. Supports the following transformations:

”negative_log”: Transforms each probability by the negative

log (default)

”inverse”: Transforms each probability p by taking 1/p “square_root_inverse”: Transforms each probability by the

the square root of 1/p

dissimilarity_function

Function used to compute dissimilarity between samples.

add_root

Whether or not to add an implicit root the tree.

prior_transformation

Function to use when transforming priors into weights.

Methods

root_tree(tree, root_sample, remaining_samples)[source]

Roots a tree produced by Neighbor-Joining at the specified root.

Uses the specified root to root the tree passed in

Parameters
tree : GraphGraph

Networkx object representing the tree topology

root_sample : strstr

Sample to treat as the root

remaining_samples : List[str]List[str]

The last two unjoined nodes in the tree

Return type

DiGraphDiGraph

Returns

A rooted tree

find_cherry(dissimilarity_matrix)[source]

Finds a pair of samples to join into a cherry.

Proceeds by minimizing the Q-criterion as in Saitou and Nei (1987) to select a pair of samples to join.

Parameters
dissimilarity_matrix : arrayarray

A sample x sample dissimilarity matrix

Return type

Tuple[int, int]Tuple[int, int]

Returns

A tuple of intgers representing rows in the dissimilarity matrix

to join.

static compute_q(dissimilarity_map)[source]

Computes the Q-criterion for every pair of samples.

Computes the Q-criterion defined by Saitou and Nei (1987):

Q(i,j) = d(i, j) - 1/(n-2) (sum(d(i, :)) + sum(d(j,:)))

Parameters
dissimilarity_map : <class ‘int’><class ‘int’>

A sample x sample dissimilarity map

Return type

arrayarray

Returns

A matrix storing the Q-criterion for every pair of samples.

update_dissimilarity_map(dissimilarity_map, cherry, new_node)[source]

Update dissimilarity map after finding a cherry.

Updates the dissimilarity map after joining together two nodes (m1, m2) at a cherry m. For all nodes v, the new dissimilarity map d’ is:

d’(m, v) = 0.5 * (d(v, m1) + d(v, m2) - d(m1, m2))

Parameters
dissimilarity_map : DataFrameDataFrame

A dissimilarity map to update

cherry : Tuple[str, str]Tuple[str, str]

A tuple of indices in the dissimilarity map that are joining

new_node : strstr

New node name, to be added to the new dissimilarity map

Return type

DataFrameDataFrame

Returns

A new dissimilarity map, updated with the new node

setup_root_finder(cassiopeia_tree)[source]

Defines the implicit rooting strategy for the NeighborJoiningSolver.

By default, the NeighborJoining algorithm returns an unrooted tree. To root this tree, an implicit root of all zeros is added to the character matrix. Then, the dissimilarity map is recalculated using the updated character matrix. If the tree already has a computed dissimilarity map, only the new dissimilarities are calculated.

Parameters
cassiopeia_tree : CassiopeiaTreeCassiopeiaTree

Input CassiopeiaTree to solve

Return type

NoneNone