cassiopeia.solver.ILPSolver¶
- class cassiopeia.solver.ILPSolver(convergence_time_limit=12600, convergence_iteration_limit=0, maximum_potential_graph_layer_size=10000, maximum_potential_graph_lca_distance=None, weighted=False, seed=None, mip_gap=0.01, prior_transformation='negative_log')[source]¶
The Cassiopeia ILP-based maximum parsimony solver.
ILPSolver is a subclass of CassiopeiaSolver and implements the Cassiopeia-ILP algorithm described in Jones et al, Genome Biol 2020. The solver proceeds by constructing a tree over a network of possible evolutionary states known as the potential graph. The procedure for constructing this tree is done by solving for a Steiner Tree with an integer linear programming (ILP) optimization approach.
- Parameters
- convergence_time_limit :
intint(default:12600) Amount of time allotted to the ILP for
- Ignored if set to 0. : convergence.
- convergence_iteration_limit :
intint(default:0) Number of iterations allowed for ILP convergence. Ignored if set to 0.
- maximum_potential_graph_layer_size :
intint(default:10000) Maximum size allowed for an iteration of the potential graph inference procedure. If this is exceeded, we return the previous iteration’s graph or abort altogether.
- maximum_potential_graph_lca_distance :
int,NoneOptional[int] (default:None) Maximum height of LCA to add to the potential graph. If this parameter is not provided or the specified value is 0, the maximum distance between any pair of samples is used as the maximum lca height.
- weighted :
boolbool(default:False) Weight edges on the potential graph by the negative log likelihood of the mutations.
- seed :
int,NoneOptional[int] (default:None) Random seed to use during ILP optimization.
- mip_gap :
floatfloat(default:0.01) Objective gap for mixed integer linear programming problem.
- logfile
A file to log output to. This will contain information around the potential graph inference procedure as well as the Steiner Tree optimization.
- prior_transformation :
strstr(default:'negative_log') Function to use when transforming priors into weights. Supports the following transformations:
- ”negative_log”: Transforms each probability by the negative
log (default)
”inverse”: Transforms each probability p by taking 1/p “square_root_inverse”: Transforms each probability by the
the square root of 1/p
- convergence_time_limit :
Methods
- solve(cassiopeia_tree, layer=None, collapse_mutationless_edges=False, logfile='stdout.log')[source]¶
Infers a tree with Cassiopeia-ILP.
Solves a tree using the Cassiopeia-ILP algorithm and populates a tree in the provided CassiopeiaTree.
- Parameters
- cassiopeia_tree :
CassiopeiaTreeCassiopeiaTree Input CassiopeiaTree
- layer :
str,NoneOptional[str] (default:None) Layer storing the character matrix for solving. If None, the default character matrix is used in the CassiopeiaTree.
- collapse_mutationless_edges :
boolbool(default:False) Indicates if the final reconstructed tree should collapse mutationless edges based on internal states inferred by Camin-Sokal parsimony. In scoring accuracy, this removes artifacts caused by arbitrarily resolving polytomies.
- logfile :
strstr(default:'stdout.log') Location to log progress.
- cassiopeia_tree :
- infer_potential_graph(character_matrix, pid, lca_height, weights=None, missing_state_indicator=- 1)[source]¶
Infers a potential graph for the observed states.
Using the set of samples in the character matrix for this solver, this procedure creates a network which contains potential ancestors, or evolutionary intermediates.
This procedure invokes ilp_solver_utilities.infer_potential_graph_cython which returns the edges of the potential graph in character string format (e.g., “1|2|3|…”). The procedure here decodes these strings, creates a Networkx directed graph, and adds edges to the graph. These weights are added to the edges of the graph using priors, if they are specified in the CassiopeiaTree, or the number of mutations along an edge.
- Parameters
- character_matrix :
DataFrameDataFrame Character matrix
- root
Specified root node, represented as a list of character states
- pid :
intint Process ID for future reference
- lca_height :
intint Maximum lca height to consider for connecting nodes to an LCA
- weights : {
int: {int:str}},NoneOptional[Dict[int,Dict[int,str]]] (default:None) Weights for character-state pairs, derived from the priors if these are available.
- missing_state_indicator :
intint(default:-1) Indicator for missing data.
- character_matrix :
- Return type
DiGraphDiGraph- Returns
A potential graph represented by a directed graph.
- add_edge_weights(potential_graph, weights=None, missing_state_indicator=- 1)[source]¶
Annotates edges with the weight.
Given a graph where nodes are iterable entities containing character states, annotated edges with respect to the number of mutations. If a prior dictionary is passed into the constructor, the log likelihood of the mutations is added instead. These values are stored in the weight attribute of the networkx graph.
- Parameters
- Return type
DiGraphDiGraph- Returns
- The potential graph with edge weights added, stored in the weight
attribute.
- generate_steiner_model(potential_graph, root, targets)[source]¶
Generates a Gurobi instance for Steiner Tree inference.
Given a potential graph, a root to treat as the source, and a list of targets, create a Gurobi mixed integer linear programming instance for computing the Steiner Tree.
- Parameters
- potential_graph :
DiGraphDiGraph Potential graph representing the evolutionary space on which to solve for the Steiner Tree.
- root :
List[int]List[int] A node in the graph to treat as the source.
- targets :
List[List[int]]List[List[int]] A list of nodes in the tree that serve as targets for the Steiner Tree procedure.
- potential_graph :
- Returns
A Gurobipy Model instance and the edge variables involved.
- solve_steiner_instance(model, edge_variables, potential_graph, pid, logfile)[source]¶
Solves for a Steiner Tree from the Gurobi instance.
This function works with a model that has been specified via Gurobi, and will solve the model using the stopping criteria that the user has specified in this class instance.
- Parameters
- model
A Gurobi model corresponding to the Steiner Tree problem. This should be created with generate_steiner_model.
- edge_variables
Edge variables that were created during model generation. These are Gurobi variables that indicate whether two nodes are connected to one another in the Potential Graph; we use these variables to recreate a tree at the end from the Gurobi solution.
- potential_graph :
DiGraphDiGraph Potential Graph that was used as input to the Steiner Tree problem.
- pid :
intint Process ID
- logfile :
strstr Location to store standard out.
- Return type
- Returns
A list of solutions