cassiopeia.solver.NeighborJoiningSolver#
- class cassiopeia.solver.NeighborJoiningSolver(dissimilarity_function=<function weighted_hamming_distance>, add_root=False, prior_transformation='negative_log', fast=False, implementation='ccphylo_dnj', threads=1)[source]#
Neighbor-Joining class for Cassiopeia.
Implements the Neighbor-Joining algorithm described by Saitou and Nei (1987) as a derived class of DistanceSolver. This class inherits the generic solve method, but implements its own procedure for finding cherries by minimizing the Q-criterion between samples. If fast is set to True, a fast NJ implementation of is used.
- Parameters:
- dissimilarity_function
Optional
[Callable
[[array
,array
,int
,Dict
[int
,Dict
[int
,float
]]],float
]] (default:<function weighted_hamming_distance at 0x766614485700>
) A function by which to compute the dissimilarity map. Optional if a dissimilarity map is already provided.
- add_root
bool
(default:False
) Whether or not to add an implicit root the tree, i.e. a root with unmutated characters. If set to False, and no explicit root is provided in the CassiopeiaTree, then will return an unrooted, undirected tree
- prior_transformation
str
(default:'negative_log'
) Function to use when transforming priors into weights. Supports the following transformations:
- ”negative_log”: Transforms each probability by the negative
log (default)
”inverse”: Transforms each probability p by taking 1/p “square_root_inverse”: Transforms each probability by the
the square root of 1/p
- fast
bool
(default:False
) Whether to use a fast implementation of Neighbor-Joining.
- implementation
str
(default:'ccphylo_dnj'
) Which fast implementation to use. Options are: “ccphylo_dnj”: CCPhylo implementation the Dynamic Neighbor-Joining
algorithm described by Clausen (2023). Solution in guaranteed to be exact.
- ”ccphylo_hnj”: CCPhylo implementation of the Heuristic
Neighbor-Joining algorithm described by Clausen (2023). Solution is not guaranteed to be exact.
”ccphylo_nj”: CCPhylo implementation of the Neighbor-Joining.
- threads
int
(default:1
) Number of threads to use for solver.
- dissimilarity_function
- dissimilarity_function#
Function used to compute dissimilarity between samples.
- add_root#
Whether or not to add an implicit root the tree.
- prior_transformation#
Function to use when transforming priors into weights.
- threads#
Number of threads to use for solver.
Methods
- root_tree(tree, root_sample, remaining_samples)[source]#
Roots a tree produced by Neighbor-Joining at the specified root.
Uses the specified root to root the tree passed in
- find_cherry(dissimilarity_matrix)[source]#
Finds a pair of samples to join into a cherry.
Proceeds by minimizing the Q-criterion as in Saitou and Nei (1987) to select a pair of samples to join.
- static compute_q(dissimilarity_map)[source]#
Computes the Q-criterion for every pair of samples.
Computes the Q-criterion defined by Saitou and Nei (1987):
Q(i,j) = d(i, j) - 1/(n-2) (sum(d(i, :)) + sum(d(j,:)))
- Parameters:
- dissimilarity_map <class ‘int’>
A sample x sample dissimilarity map
- Return type:
array
- Returns:
A matrix storing the Q-criterion for every pair of samples.
- update_dissimilarity_map(dissimilarity_map, cherry, new_node)[source]#
Update dissimilarity map after finding a cherry.
Updates the dissimilarity map after joining together two nodes (m1, m2) at a cherry m. For all nodes v, the new dissimilarity map d’ is:
d’(m, v) = 0.5 * (d(v, m1) + d(v, m2) - d(m1, m2))
- Parameters:
- Return type:
- Returns:
A new dissimilarity map, updated with the new node
- setup_root_finder(cassiopeia_tree)[source]#
Defines the implicit rooting strategy for the NeighborJoiningSolver.
By default, the NeighborJoining algorithm returns an unrooted tree. To root this tree, an implicit root of all zeros is added to the character matrix. Then, the dissimilarity map is recalculated using the updated character matrix. If the tree already has a computed dissimilarity map, only the new dissimilarities are calculated.
- Parameters:
- cassiopeia_tree
CassiopeiaTree
Input CassiopeiaTree to solve
- cassiopeia_tree
- Return type: