cassiopeia.sp.impute_alleles_from_spatial_data
- cassiopeia.sp.impute_alleles_from_spatial_data(character_matrix, adata=None, spatial_graph=None, neighborhood_size=None, neighborhood_radius=30.0, imputation_hops=2, imputation_concordance=0.8, num_imputation_iterations=1, max_neighbor_distance=inf, coordinates=None, connect_key='spatial_connectivities')
Imputes data based on spatial location.
This procedure iterates over spots in a given AnnData and imputes missing data based on spatial neighborhoods.
The procedure is an iterative algorithm: in each iteration, a missing character is imputed based on the neighborhood of a node. Node neighborhoods are built from a spatial graph that can be passed in directly or inferred from a specified spatial AnnData. In the simplest case, a node's neighborhood is just its immediate neighbors, but users can also include neighbors-of-neighbors (and neighbors-of-neighbors-of-neighbors, and so on) using the imputation_hops argument. An imputation is accepted if the fraction of neighbors that agree with an observed non-zero allele exceeds the specified imputation_concordance threshold (0.8 by default). The procedure can be repeated for several rounds, controlled by the num_imputation_iterations argument, and in this way approximates a message-passing process.
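The voting scheme described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation. Several details are assumptions made for illustration: the matrix is a plain dict of lists rather than a DataFrame, the graph is a plain adjacency dict, missing data is coded as -1, the concordance fraction is computed over neighbors with observed (non-missing) states, and each round reads votes only from the matrix as it stood at the start of that round. The helper names (k_hop_neighbors, impute_once, impute) are hypothetical.

```python
from collections import Counter, deque

MISSING = -1  # assumed missing-data code; not stated in the text above


def k_hop_neighbors(graph, node, hops):
    """Return every node within `hops` edges of `node`, excluding `node`."""
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if depth == hops:
            continue  # do not expand past the requested number of hops
        for nbr in graph.get(current, ()):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    seen.discard(node)
    return seen


def impute_once(matrix, graph, hops=2, concordance=0.8):
    """Run one imputation round; votes are read from the input matrix only."""
    imputed = {spot: list(states) for spot, states in matrix.items()}
    for spot, states in matrix.items():
        neighbors = k_hop_neighbors(graph, spot, hops)
        for i, state in enumerate(states):
            if state != MISSING:
                continue
            # Collect observed states for this character among the neighbors.
            votes = [matrix[n][i] for n in neighbors if matrix[n][i] != MISSING]
            if not votes:
                continue
            allele, count = Counter(votes).most_common(1)[0]
            # Accept only a non-zero allele whose support exceeds the threshold.
            if allele != 0 and count / len(votes) > concordance:
                imputed[spot][i] = allele
    return imputed


def impute(matrix, graph, hops=2, concordance=0.8, iterations=1):
    """Repeat the imputation round `iterations` times."""
    for _ in range(iterations):
        matrix = impute_once(matrix, graph, hops, concordance)
    return matrix


# Five spots on a line, one character, only spot "a" observed.
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
matrix = {"a": [5], "b": [MISSING], "c": [MISSING], "d": [MISSING], "e": [MISSING]}

one_round = impute(matrix, graph, hops=1, iterations=1)
two_rounds = impute(matrix, graph, hops=1, iterations=2)
```

In this toy example, a single round with one hop fills in only spot "b", and a second round carries the allele one step further to "c". This is the sense in which repeated rounds behave like message passing along the graph.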
- Parameters:
- character_matrix
DataFrame A character matrix of spots, constructed using a function like convert_allele_table_to_character_matrix.
- adata
AnnData|None(default:None) AnnData of spatially resolved data. Only the spatial coordinates need to be stored; they are used to construct the spatial graph.
- spatial_graph
Graph|None(default:None) Optionally, the user can provide a spatial connectivity graph instead of passing in an adata.
- neighborhood_size
int|None(default:None) If a connectivity graph is being constructed, this is the number of nearest neighbors to connect to each node. If both neighborhood_size and neighborhood_radius are passed in, neighborhood_size takes precedence.
- neighborhood_radius
float(default:30.0) Radius used to construct the connectivity graph, as an alternative to passing in neighborhood_size.
- imputation_hops
int(default:2) Number of hops of adjacency to query. For example, if this is 2, imputation uses not just the nearest neighbors of a given node, but also each nearest neighbor’s nearest neighbors.
- imputation_concordance
float(default:0.8) Fraction of votes that must agree in order to accept an imputation.
- num_imputation_iterations
int(default:1) Number of iterations for imputation procedure.
- max_neighbor_distance
float(default:inf) Maximum distance to a neighbor for it to be used for imputation.
- coordinates
DataFrame|None(default:None) If an AnnData is not specified, and you wish to set an upper limit on the distance for spatial imputation, these coordinates can be passed to the imputation procedure.
- connect_key
str(default:'spatial_connectivities') Key used to store spatial connectivities in adata.obsp. This will be passed into the key_added argument of sq.gr.spatial_neighbors, and an entry of the form {connect_key}_connectivities will be added to adata.obsp.
- Return type:
- Returns:
An imputed character matrix.
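To make the interplay of neighborhood_size, neighborhood_radius and max_neighbor_distance concrete, here is a small standard-library sketch of how a spatial graph might be derived from coordinates. It is an illustration only and does not reproduce the sq.gr.spatial_neighbors call mentioned under connect_key. The function names (build_spatial_graph, within_distance), the dict-based coordinate format, the inclusive distance comparisons and the symmetrization of the nearest-neighbor graph are all assumptions made for this example.

```python
import math


def build_spatial_graph(coords, neighborhood_size=None, neighborhood_radius=30.0):
    """Build an undirected adjacency dict from {spot: (x, y)} coordinates."""
    graph = {spot: set() for spot in coords}
    for spot, xy in coords.items():
        # Other spots sorted by Euclidean distance from `spot`.
        others = sorted((math.dist(xy, coords[o]), o) for o in coords if o != spot)
        if neighborhood_size is not None:
            # neighborhood_size is preferred when both arguments are given.
            chosen = [o for _, o in others[:neighborhood_size]]
        else:
            chosen = [o for d, o in others if d <= neighborhood_radius]
        for o in chosen:
            graph[spot].add(o)
            graph[o].add(spot)  # symmetrize (a choice made for this sketch)
    return graph


def within_distance(coords, spot, neighbors, max_neighbor_distance=math.inf):
    """Keep only the neighbors no farther than max_neighbor_distance from `spot`."""
    return {
        n for n in neighbors
        if math.dist(coords[spot], coords[n]) <= max_neighbor_distance
    }


# Four spots on a line; "d" is far away from the rest.
coords = {"a": (0, 0), "b": (10, 0), "c": (25, 0), "d": (100, 0)}

radius_graph = build_spatial_graph(coords, neighborhood_radius=30.0)
knn_graph = build_spatial_graph(coords, neighborhood_size=1, neighborhood_radius=30.0)
usable = within_distance(coords, "c", knn_graph["c"], max_neighbor_distance=30.0)
```

With the radius-based graph, the distant spot "d" has no neighbors at all. With neighborhood_size=1 it is connected to "c" regardless of distance, and a finite max_neighbor_distance then excludes that long edge from voting.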