cassiopeia.tl.compute_morans_i

cassiopeia.tl.compute_morans_i(tree, meta_columns=None, X=None, W=None, inverse_weight_fn=<function <lambda>>)

Computes Moran’s I statistic.

Using the cross-correlation between leaves as specified on the tree, compute the Moran’s I statistic for each of the data items specified. This only works for numerical data and will throw an error otherwise.

Generally, this statistic takes in a weight matrix (which can be computed directly from a phylogenetic tree) and a set of numerical observations that are centered and standardized (i.e., mean 0 and population standard deviation of 1). Then, the Moran’s I statistic is:

I = X’ * Wn * X

where X’ denotes the transpose of X, * denotes matrix multiplication, and Wn is the weight matrix normalized such that sum([w_i,j for all i,j]) = 1.
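The formula above can be checked by hand on a small example. The following is a minimal NumPy sketch, not the library's implementation: the helper name morans_i and the toy weight matrix are hypothetical, and the sketch handles a single numeric vector only.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I for one numeric vector x and a non-negative weight matrix W.

    Hypothetical helper for illustration; not part of cassiopeia.
    """
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    # Center and scale with the population standard deviation (ddof=0).
    z = (x - x.mean()) / x.std(ddof=0)
    # Normalize the weights so that all entries sum to 1.
    Wn = W / W.sum()
    # I = X' * Wn * X
    return float(z @ Wn @ z)

# Toy weights (made up): leaves 0 and 1 are close, as are leaves 2 and 3.
W = np.array([
    [0.0, 1.0, 0.1, 0.1],
    [1.0, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 1.0],
    [0.1, 0.1, 1.0, 0.0],
])

# Similar values on close leaves give a positive statistic (about 0.667).
clustered = morans_i([1.0, 1.0, -1.0, -1.0], W)
# Dissimilar values on close leaves give a negative statistic (about -0.833).
alternating = morans_i([1.0, -1.0, 1.0, -1.0], W)
```

A positive value indicates that closely related leaves tend to carry similar values, and a negative value indicates the opposite.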

Inspired by the tools and code used in Chaligne et al., Nature Genetics (2021).

The mathematical details of the statistic can be found in:

Wartenberg, “Multivariate Spatial Correlation: A Method for Exploratory Geographical Analysis”, Geographical Analysis (1985)

Parameters:
tree CassiopeiaTree

CassiopeiaTree

meta_columns Optional[List] (default: None)

Columns in the CassiopeiaTree’s cell_meta object for which to compute autocorrelations.

X Optional[DataFrame] (default: None)

Extra data matrix for computing autocorrelations.

W Optional[DataFrame] (default: None)

Phylogenetic weight matrix. If this is not specified, then the weight matrix will be computed within the function.

inverse_weight_fn Callable[[Union[int, float]], float] (default: <function <lambda>>)

Inverse function to apply to the weights, if the weight matrix must be computed.
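To illustrate what an inverse weight function does, here is a hedged sketch that turns a matrix of pairwise leaf distances into a weight matrix. The helper weights_from_distances, the reciprocal default, the zeroed diagonal, and the distance values are all assumptions made for illustration; the library's internal computation may differ.

```python
import numpy as np

def weights_from_distances(D, inverse_weight_fn=lambda d: 1.0 / d):
    """Apply inverse_weight_fn to every off-diagonal distance.

    Hypothetical helper; the reciprocal default is an assumption.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    W = np.zeros_like(D)
    for i in range(n):
        for j in range(n):
            # Assumption: a leaf gets zero weight with itself.
            if i != j:
                W[i, j] = inverse_weight_fn(D[i, j])
    return W

# Made-up pairwise distances between three leaves.
D = np.array([
    [0.0, 2.0, 4.0],
    [2.0, 0.0, 4.0],
    [4.0, 4.0, 0.0],
])

W_reciprocal = weights_from_distances(D)
# A steeper function down-weights distant leaves more strongly.
W_squared = weights_from_distances(D, lambda d: 1.0 / d**2)
```

Under these assumptions, closer leaves receive larger weights, and the choice of function controls how quickly the influence of distant leaves decays.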

Return type:

Union[float, DataFrame]

Returns:

Moran’s I statistic