cassiopeia.sp.impute_alleles_from_spatial_data

cassiopeia.sp.impute_alleles_from_spatial_data(character_matrix, adata=None, spatial_graph=None, neighborhood_size=None, neighborhood_radius=30.0, imputation_hops=2, imputation_concordance=0.8, num_imputation_iterations=1, max_neighbor_distance=inf, coordinates=None, connect_key='spatial_connectivities')

Imputes data based on spatial location.

This procedure iterates over the spots in a given AnnData object and imputes missing data based on spatial neighborhoods.

The procedure is an iterative algorithm: in each iteration, a missing character is imputed based on the neighborhood of a node. Node neighborhoods are built from a spatial graph that can be passed in directly or inferred from a specified spatial AnnData. In the simplest case, a node's neighborhood is just its immediate neighbors, but users can also include neighbors-of-neighbors (and neighbors-of-neighbors-of-neighbors, and so on) using the imputation_hops argument. An imputation is accepted if the fraction of neighbors that agree on an observed non-zero allele exceeds the specified imputation_concordance threshold (0.8 by default). The procedure can be repeated for several rounds, controlled by the num_imputation_iterations argument, and in this way approximates a message-passing process.
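The voting rule and the repeated rounds described above can be illustrated with a minimal pure-Python sketch for a single character. This is not the library's implementation: the names `impute_once` and `impute`, the `-1` missing-data marker, and the choice to divide by the number of neighbors that cast a vote (rather than by all neighbors) are assumptions made for illustration. Neighborhoods are taken as given here, already expanded to the desired number of hops.

```python
from collections import Counter

MISSING = -1  # assumed missing-data marker; not stated in the text above


def impute_once(states, neighborhoods, concordance=0.8):
    """Run one voting round for a single character (hypothetical helper).

    states: dict mapping node -> state (MISSING if unobserved, 0 if unedited).
    neighborhoods: dict mapping node -> iterable of neighboring nodes.
    """
    updated = dict(states)
    for node, state in states.items():
        if state != MISSING:
            continue
        # Only neighbors with an observed, non-zero allele cast a vote.
        votes = [
            states[nbr]
            for nbr in neighborhoods.get(node, ())
            if states[nbr] not in (MISSING, 0)
        ]
        if not votes:
            continue
        allele, count = Counter(votes).most_common(1)[0]
        # Assumption: the fraction is taken over voting neighbors only,
        # and it must strictly exceed the threshold.
        if count / len(votes) > concordance:
            updated[node] = allele
    return updated


def impute(states, neighborhoods, concordance=0.8, iterations=1):
    """Repeat the voting round; each round reads the previous round's output."""
    for _ in range(iterations):
        states = impute_once(states, neighborhoods, concordance)
    return states


# Four spots on a line; only "a" has an observed allele (5).
line = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
observed = {"a": 5, "b": MISSING, "c": MISSING, "d": MISSING}
one_round = impute(observed, line, iterations=1)
three_rounds = impute(observed, line, iterations=3)
```

With one round only `b` is filled in, because `c` and `d` have no observed neighbors yet; after three rounds the allele has propagated along the whole line, which is the message-passing behavior the paragraph refers to.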

Parameters:
character_matrix DataFrame

A character matrix of spots, constructed using a function like convert_allele_table_to_character_matrix.

adata AnnData | None (default: None)

AnnData of spatially resolved data. Only the spatial coordinates need to be stored; they are used to construct the spatial graph.

spatial_graph Graph | None (default: None)

A spatial connectivity graph that the user can optionally provide instead of passing in an adata.

neighborhood_size int | None (default: None)

If a connectivity graph is being constructed, this is the number of nearest neighbors to connect to each node. If both neighborhood_size and neighborhood_radius are passed in, neighborhood_size is preferred.

neighborhood_radius float (default: 30.0)

Used instead of neighborhood_size: the radius within which nodes are connected when the connectivity graph is constructed.

imputation_hops int (default: 2)

Number of hops to traverse in the spatial graph when collecting a node's neighborhood. For example, if this is 2, imputation uses not just the nearest neighbors of a given node, but also each nearest neighbor’s nearest neighbors.

imputation_concordance float (default: 0.8)

Fraction of votes that must agree in order to accept an imputation.

num_imputation_iterations int (default: 1)

Number of iterations (rounds) of the imputation procedure.

max_neighbor_distance float (default: inf)

Maximum distance at which a neighbor can still be used for imputation.

coordinates DataFrame | None (default: None)

If an AnnData is not specified, and you wish to set an upper limit on the distance for spatial imputation, these coordinates can be passed to the imputation procedure.

connect_key str (default: 'spatial_connectivities')

Key used to store spatial connectivities in adata.obsp. This will be passed into the key_added argument of sq.gr.spatial_neighbors, and an entry of the form {connect_key}_connectivities will be added to adata.obsp.
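To make the interaction between neighborhood_size, neighborhood_radius, imputation_hops and max_neighbor_distance concrete, here is a small pure-Python sketch of how a neighborhood could be assembled from spot coordinates. It is a stand-in for illustration only: per the connect_key description, the library builds its graph through sq.gr.spatial_neighbors, and the helper names below (`build_graph`, `k_hop_neighborhood`, `within_distance`), the symmetric edges, and the inclusive distance comparisons are all assumptions.

```python
import math


def build_graph(coords, neighborhood_size=None, neighborhood_radius=30.0):
    """Connect spots by k nearest neighbors, or by radius if no k is given.

    coords: dict mapping node -> (x, y). Returns dict node -> set of neighbors.
    """
    graph = {node: set() for node in coords}
    for node, point in coords.items():
        # Distances to every other spot, nearest first.
        others = sorted(
            (math.dist(point, other_point), other)
            for other, other_point in coords.items()
            if other != node
        )
        if neighborhood_size is not None:
            # neighborhood_size takes precedence over the radius.
            chosen = [other for _, other in others[:neighborhood_size]]
        else:
            chosen = [o for dist, o in others if dist <= neighborhood_radius]
        for other in chosen:
            # Store both directions so the graph is undirected (assumption).
            graph[node].add(other)
            graph[other].add(node)
    return graph


def k_hop_neighborhood(graph, node, hops=2):
    """Collect all nodes reachable from `node` within `hops` edges."""
    seen = {node}
    frontier = {node}
    for _ in range(hops):
        # Step one hop outward from the current frontier.
        frontier = {nbr for cur in frontier for nbr in graph[cur]} - seen
        seen |= frontier
    return seen - {node}


def within_distance(neighbors, node, coords, max_neighbor_distance=math.inf):
    """Drop neighbors that lie farther away than max_neighbor_distance."""
    return {
        nbr
        for nbr in neighbors
        if math.dist(coords[node], coords[nbr]) <= max_neighbor_distance
    }


# Four spots on a line at x = 0, 10, 25 and 100.
coords = {"a": (0, 0), "b": (10, 0), "c": (25, 0), "d": (100, 0)}
by_radius = build_graph(coords)                    # radius of 30.0
by_knn = build_graph(coords, neighborhood_size=1)  # 1 nearest neighbor
two_hops = k_hop_neighborhood(by_knn, "a", hops=2)
three_hops = k_hop_neighborhood(by_knn, "a", hops=3)
capped = within_distance(three_hops, "a", coords, max_neighbor_distance=30)
```

In this toy layout the radius graph leaves the distant spot `d` isolated, while the nearest-neighbor graph links it to `c`. Three hops from `a` then reach `d`, and the distance cap removes it again, which is the situation max_neighbor_distance and coordinates are meant to handle.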

Return type:

DataFrame

Returns:

An imputed character matrix.