cassiopeia.solver.HybridSolver
- class cassiopeia.solver.HybridSolver(top_solver, bottom_solver, lca_cutoff=None, cell_cutoff=None, threads=1, prior_transformation='negative_log')
The Hybrid Cassiopeia solver.
HybridSolver is a class representing the structure of Cassiopeia Hybrid inference algorithms. The solver builds a tree starting with a top-down greedy algorithm until a predetermined criterion is reached, at which point a more complex algorithm is used to reconstruct each subproblem. The top-down algorithm _must_ be a subclass of GreedySolver, as it must provide the functions find_split and perform_split. The solver employed at the bottom of the tree can be any CassiopeiaSolver subclass and need only have a solve method.
- Parameters:
- top_solver
GreedySolver
An algorithm to be applied at the top of the tree. Must be a subclass of GreedySolver.
- bottom_solver
CassiopeiaSolver
An algorithm to be applied at the bottom of the tree. Must be a subclass of CassiopeiaSolver.
- lca_cutoff
Optional[float] (default: None)
Distance to the latest-common-ancestor (LCA) of a subclade to be used as a cutoff for transitioning to the bottom solver.
- cell_cutoff
Optional[int] (default: None)
Number of cells in a subclade to be used as a cutoff for transitioning to the bottom solver.
- threads
int (default: 1)
Number of threads to be used. This corresponds to the number of subproblems to be run concurrently with the bottom solver.
- prior_transformation
str (default: 'negative_log')
Function to use when transforming priors into weights. Supports the following transformations:
- "negative_log": Transforms each probability p by taking the negative log of p (default)
- "inverse": Transforms each probability p by taking 1/p
- "square_root_inverse": Transforms each probability p by taking the square root of 1/p
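The three transformations listed above can be sketched in plain Python. This is an illustrative sketch, not Cassiopeia's implementation: the helper name `transform_prior` and the example `priors` dictionary are hypothetical, and the nested dictionary layout is assumed to mirror the `Dict[int, Dict[int, float]]` weights type documented for `apply_top_solver`.

```python
import math


def transform_prior(p: float, transformation: str = "negative_log") -> float:
    """Turn a state prior probability p (0 < p <= 1) into a weight.

    Hypothetical helper mirroring the three documented option names.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("prior must lie in (0, 1]")
    if transformation == "negative_log":
        return -math.log(p)
    if transformation == "inverse":
        return 1.0 / p
    if transformation == "square_root_inverse":
        return math.sqrt(1.0 / p)
    raise ValueError(f"unknown transformation: {transformation!r}")


# Made-up priors: character index -> {state -> probability}.
priors = {0: {1: 0.5, 2: 0.25}, 1: {1: 0.125}}

# Apply one transformation to every character-state prior.
weights = {
    character: {
        state: transform_prior(p, "inverse") for state, p in states.items()
    }
    for character, states in priors.items()
}
```

Under all three options a rarer state (smaller p) receives a larger weight; the options differ only in how steeply the weight grows as p shrinks.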
Methods
- solve(cassiopeia_tree, layer=None, collapse_mutationless_edges=False, logfile='stdout.log')
The general hybrid solver routine.
The hybrid solver proceeds by clustering cells together using the algorithm stored in top_solver until a cutoff criterion is reached. Once this criterion is reached, the bottom_solver is applied to each subproblem left over from the "greedy" clustering.
- Parameters:
- cassiopeia_tree
CassiopeiaTree
CassiopeiaTree that stores the character matrix and priors for reconstruction.
- layer
Optional[str] (default: None)
Layer storing the character matrix for solving. If None, the default character matrix is used in the CassiopeiaTree.
- collapse_mutationless_edges
bool (default: False)
Indicates if the final reconstructed tree should collapse mutationless edges based on internal states inferred by Camin-Sokal parsimony. In scoring accuracy, this removes artifacts caused by arbitrarily resolving polytomies.
- logfile
str (default: 'stdout.log')
Location to log progress.
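A possible way to wire the pieces together is sketched below. It assumes that `cassiopeia` is installed and that `VanillaGreedySolver` and `ILPSolver` are available under `cassiopeia.solver` with usable default arguments; the helper names `build_hybrid_solver` and `reconstruct` and the cutoff value of 40 are illustrative choices, not recommendations from this page. The bottom solver may also need an external ILP backend at solve time.

```python
def build_hybrid_solver(cell_cutoff: int = 40, threads: int = 1):
    """Assemble a HybridSolver from a greedy top solver and a bottom solver."""
    # Imported lazily so this sketch can be loaded without Cassiopeia installed.
    import cassiopeia as cas

    # Assumption: both solver classes exist with workable defaults.
    top = cas.solver.VanillaGreedySolver()
    bottom = cas.solver.ILPSolver()

    return cas.solver.HybridSolver(
        top_solver=top,
        bottom_solver=bottom,
        cell_cutoff=cell_cutoff,  # switch to the bottom solver at this size
        threads=threads,          # subproblems solved concurrently
    )


def reconstruct(cassiopeia_tree, logfile: str = "stdout.log") -> None:
    """Run the hybrid solver on an existing CassiopeiaTree."""
    solver = build_hybrid_solver()
    solver.solve(
        cassiopeia_tree,
        collapse_mutationless_edges=True,
        logfile=logfile,
    )
```

Only the keyword arguments shown in the signatures on this page are used for `HybridSolver` and `solve`; everything else is left at its default.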
- apply_top_solver(character_matrix, samples, tree, node_name_generator, weights=None, missing_state_indicator=-1, root=None)
Applies the top solver to samples.
A helper method that applies the top solver to the samples until a cutoff criterion is met. The subproblems and the root ID are returned.
- Parameters:
- character_matrix
DataFrame
Character matrix
- samples
List[str]
Samples in the subclade of interest.
- tree
DiGraph
In progress tree for the HybridSolver.
- node_name_generator
Generator[str, None, None]
Generator for creating unique node names while applying the top-solver.
- weights
Optional[Dict[int, Dict[int, float]]] (default: None)
Weights of character-state combinations, derived from priors if these are available.
- missing_state_indicator
int (default: -1)
Indicator for missing data
- root
Optional[int] (default: None)
Node ID of the root in the subtree containing the samples.
- Return type:
- Returns:
- The ID of the node serving as the root of the tree containing the samples, and a list of subproblems in the form [subtree-root, subtree-samples].
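The top-down phase and its return value can be illustrated with a toy recursion. This is a simplified sketch, not the library's code: `apply_top`, `node_names`, and `halve` are hypothetical names, the split rule simply halves the sample list instead of using a greedy criterion, the tree is kept as a plain edge list rather than a DiGraph, node names are strings, and only a cell-count cutoff (assumed here to mean "at most `cell_cutoff` cells") is modelled.

```python
import itertools
from typing import Callable, Iterator, List, Tuple

Split = Tuple[List[str], List[str]]
Subproblem = Tuple[str, List[str]]


def node_names() -> Iterator[str]:
    """Yield unique internal node names: 'node0', 'node1', ..."""
    for i in itertools.count():
        yield f"node{i}"


def apply_top(
    samples: List[str],
    find_split: Callable[[List[str]], Split],
    names: Iterator[str],
    edges: List[Tuple[str, str]],
    cell_cutoff: int,
) -> Tuple[str, List[Subproblem]]:
    """Split samples top-down until a group has at most cell_cutoff cells."""
    root = next(names)
    if len(samples) <= max(cell_cutoff, 1):
        # Small enough: leave this group for the bottom solver.
        return root, [(root, samples)]
    subproblems: List[Subproblem] = []
    for side in find_split(samples):
        child, child_problems = apply_top(
            side, find_split, names, edges, cell_cutoff
        )
        edges.append((root, child))  # grow the in-progress tree
        subproblems.extend(child_problems)
    return root, subproblems


def halve(samples: List[str]) -> Split:
    """Stand-in for a greedy split: cut the sample list in the middle."""
    mid = len(samples) // 2
    return samples[:mid], samples[mid:]


edges: List[Tuple[str, str]] = []
cells = [f"cell{i}" for i in range(5)]
root, subproblems = apply_top(cells, halve, node_names(), edges, cell_cutoff=2)
```

With five cells and a cutoff of two, the recursion stops at three groups, each paired with the node that the bottom solver would later use as its subtree root.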
- apply_bottom_solver(cassiopeia_tree, root, samples=typing.List[str], logfile='stdout.log', layer=None)
Apply the bottom solver to subproblems.
A private method that solves the subproblems identified by the top-down solver, using the more precise bottom solver of this HybridSolver instance. The function creates a unique log file based on the root, sets up a new instance of the bottom solver, and solves the subproblem.
The function will return a tree for the subproblem and the identifier of the root of the tree.
- Parameters:
- cassiopeia_tree
CassiopeiaTree
CassiopeiaTree for the entire dataset. This will be subsetted with respect to the samples specified.
- root
int
Identifier of the root in the master tree
- samples
(default: typing.List[str])
A list of samples for which to infer a tree.
- logfile
str (default: 'stdout.log')
Base location for logging output. A specific logfile will be created from this base logfile name.
- layer
Optional[str] (default: None)
Layer storing the character matrix for solving. If None, the default character matrix is used in the CassiopeiaTree.
- Return type:
- Returns:
- A tree in the form of a NetworkX graph and the original root identifier
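The per-subproblem logging and the concurrency controlled by `threads` can be sketched as follows. Everything here is an assumption made for illustration: the log-file naming scheme, the helper names, the use of a thread pool, and the stand-in "bottom solver" that simply attaches every sample to the root. The library's own naming and parallelisation may differ.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Edge = Tuple[int, str]


def subproblem_logfile(base: str, root: int) -> str:
    """Derive a per-subproblem log path from the base log file name."""
    stem, ext = os.path.splitext(base)
    return f"{stem}_{root}{ext}"


def solve_subproblem(
    root: int, samples: List[str], logfile: str = "stdout.log"
) -> Tuple[List[Edge], int, str]:
    """Stand-in bottom solver: attach every sample directly to the root."""
    log = subproblem_logfile(logfile, root)
    edges = [(root, sample) for sample in samples]
    return edges, root, log


def solve_all(
    subproblems: List[Tuple[int, List[str]]],
    threads: int = 1,
    logfile: str = "stdout.log",
) -> List[Tuple[List[Edge], int, str]]:
    """Solve subproblems concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(
            pool.map(
                lambda p: solve_subproblem(p[0], p[1], logfile),
                subproblems,
            )
        )


results = solve_all([(1, ["cell0", "cell1"]), (3, ["cell2"])], threads=2)
```

Returning the root alongside each subtree lets the caller graft the solved subtree back onto the matching node of the tree built by the top solver.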