spatiomic.neighbor

Exposes neighborhood graph functions.

Classes

knn_graph

A class that exposes a static method for k-nearest neighbor graph construction.

snn_graph

A class that exposes a static method for shared nearest neighbor graph construction.

Package Contents

class spatiomic.neighbor.knn_graph

A class that exposes a static method for k-nearest neighbor graph construction.

classmethod create(data, batch=None, neighbor_count=20, distance_metric='euclidean', method='simple', accuracy='accurate', distance_max=None, job_count=-1, use_gpu=True)

Construct a k-nearest neighbor graph of the data.

Parameters:
  • data (Union[NDArray, Som]) – A channel-last array of the data points to be used for graph construction.

  • batch (Optional[Union[NDArray, List[int], List[str]]], optional) – The batch labels for the data points. Defaults to None.

  • neighbor_count (int, optional) – The neighbor count for the neighborhood graph. Defaults to 20.

  • distance_metric (Literal["euclidean", "manhattan", "correlation", "cosine"], optional) – The distance metric to be used for nearest neighbor calculation. Defaults to “euclidean”.

  • method (Literal["simple", "batch_balanced"], optional) – The method for nearest neighbor calculation. Defaults to “simple”.

  • accuracy (Literal["fast", "accurate"], optional) – The accuracy of the nearest neighbor calculation. Defaults to “accurate”.

  • distance_max (Optional[float], optional) – The maximum distance for nearest neighbor calculation. Currently only supported for the simple method. Defaults to None.

  • job_count (int, optional) – The number of cores to use for parallelization when method is “simple”. Defaults to -1.

  • use_gpu (bool, optional) – Whether to use the GPU for nearest neighbor calculation if possible. Defaults to True.

Raises:
  • ValueError – Raised when the distance_max parameter is used with the batch_balanced method.

  • NotImplementedError – Raised when the specified neighborhood identification method has not been implemented.

Returns:

The neighborhood graph.

Return type:

Graph
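To make the neighbor_count and distance_max parameters concrete, here is a minimal brute-force sketch of what a Euclidean k-nearest neighbor search computes. It uses only NumPy and is not spatiomic's implementation. The function name `knn_indices_sketch` is hypothetical, and the handling of `distance_max` (dropping neighbors beyond the cutoff) is an assumption made for illustration.

```python
import numpy as np


def knn_indices_sketch(data, neighbor_count=2, distance_max=None):
    """Brute-force Euclidean k-nearest neighbors (illustrative, not spatiomic's code)."""
    # Flatten a channel-last array to (n_points, n_channels).
    points = np.asarray(data, dtype=float).reshape(-1, np.shape(data)[-1])
    # Pairwise Euclidean distances, shape (n_points, n_points).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each point from its own neighbor list.
    np.fill_diagonal(dist, np.inf)
    # Indices of the neighbor_count closest points per row.
    idx = np.argsort(dist, axis=1, kind="stable")[:, :neighbor_count]
    neighbors = []
    for i, row in enumerate(idx):
        if distance_max is not None:
            # Assumption: neighbors farther away than distance_max are dropped.
            row = row[dist[i, row] <= distance_max]
        neighbors.append(row.tolist())
    return neighbors


points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [10.0, 10.0]])
neighbors = knn_indices_sketch(points, neighbor_count=2)
```

In this toy input the fourth point lies far from the others, so passing a small `distance_max` leaves it without neighbors. Nodes in that situation are what the snn_graph documentation calls lonely nodes.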

static get_edges(neighbor_idx)

Create an edgelist from a neighbor index array.

Parameters:

neighbor_idx (NDArray) – The neighbor index array.

Returns:

The edgelist.

Return type:

List
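The conversion from a neighbor index array to an edgelist can be sketched as follows. This is an illustration only: it assumes that neighbor_idx has shape (n_nodes, k) and that row i holds the indices of the neighbors of node i, which the documentation above does not state explicitly. The function name `edges_from_neighbor_idx` is hypothetical.

```python
import numpy as np


def edges_from_neighbor_idx(neighbor_idx):
    """Turn an (n_nodes, k) neighbor index array into (source, target) pairs."""
    neighbor_idx = np.asarray(neighbor_idx)
    n_nodes, k = neighbor_idx.shape
    # Repeat each row index k times so it lines up with the flattened neighbors.
    sources = np.repeat(np.arange(n_nodes), k)
    targets = neighbor_idx.ravel()
    return [(int(s), int(t)) for s, t in zip(sources, targets)]


neighbor_idx = np.array([[1, 2], [0, 2], [0, 1]])
edges = edges_from_neighbor_idx(neighbor_idx)
```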

class spatiomic.neighbor.snn_graph

A class that exposes a static method for shared nearest neighbor graph construction.

classmethod create(knn_graph, shared_count_threshold=1, fix_lonely_nodes=True, fix_lonely_nodes_method='distance', distance_data=None, distance_metric='euclidean', accuracy='accurate', job_count=-1, use_gpu=True)

Construct a shared nearest neighbor graph based on a k-nearest neighbor graph.

The shared nearest neighbor graph is an undirected graph where two nodes are connected if they have at least shared_count_threshold neighbors in common.

If fix_lonely_nodes is True, lonely nodes (nodes that don’t have any neighbors in the k-nearest neighbor graph) are connected to another node, chosen according to fix_lonely_nodes_method. With random, each lonely node is connected to a random neighbor. With distance, each lonely node is connected to its nearest neighbor by distance, which requires distance_data.
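The two strategies for lonely nodes can be illustrated with a small NumPy sketch. It is not spatiomic's implementation: the function name `connect_lonely_nodes`, the `seed` argument, and the choice to treat any node without an edge as lonely are assumptions made for this example, and only the Euclidean metric is shown.

```python
import numpy as np


def connect_lonely_nodes(edges, points, method="distance", seed=0):
    """Attach every node without an edge to one other node (illustrative only)."""
    points = np.asarray(points, dtype=float)
    node_count = len(points)
    # Nodes that already appear in at least one edge.
    connected = {node for edge in edges for node in edge}
    lonely = [node for node in range(node_count) if node not in connected]
    rng = np.random.default_rng(seed)
    new_edges = list(edges)
    for node in lonely:
        others = np.array([other for other in range(node_count) if other != node])
        if method == "distance":
            # Euclidean distance from the lonely node to every other node.
            dist = np.linalg.norm(points[others] - points[node], axis=1)
            target = int(others[np.argmin(dist)])
        elif method == "random":
            target = int(rng.choice(others))
        else:
            raise NotImplementedError(f"Unsupported method: {method}")
        new_edges.append((node, target))
    return new_edges


points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 2.0]])
fixed = connect_lonely_nodes([(0, 1), (0, 2)], points, method="distance")
```

Here node 3 has no edge, so the distance strategy links it to node 1, the closest point. The random strategy would link it to any of the other three nodes.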

Warning

This method is not optimized for batch-balanced nearest neighbor calculation.

Parameters:
  • knn_graph (Graph) – A k-nearest neighbor graph.

  • shared_count_threshold (int, optional) – The minimum number of shared neighbors for two nodes to be connected. Defaults to 1.

  • fix_lonely_nodes (bool, optional) – Whether to connect lonely nodes to the nearest neighbor. Defaults to True.

  • fix_lonely_nodes_method (Literal["random", "distance"], optional) – The method to use for connecting lonely nodes. Defaults to “distance”.

  • distance_data (Optional[Union[NDArray, Som]], optional) – The channel-last array of the data points used for knn graph construction. Required for distance calculation if fix_lonely_nodes_method is distance. Defaults to None.

  • distance_metric (Literal["euclidean", "manhattan", "correlation", "cosine"], optional) – The distance metric to be used for nearest neighbor calculation. Defaults to “euclidean”.

  • accuracy (Literal["fast", "accurate"], optional) – The accuracy of the nearest neighbor calculation. Used only when method is simple or batch_balanced. Defaults to “accurate”.

  • job_count (int, optional) – Parallelization core count when method is simple. Defaults to -1.

  • use_gpu (bool, optional) – Whether to use the GPU for nearest neighbor calculation if possible. Defaults to True.

Returns:

The shared nearest neighbor graph.

Return type:

Graph

Raises:
  • ValueError – Raised when the distance data is not provided when using the distance method.

  • NotImplementedError – Raised when the specified method for connecting lonely nodes is not supported.

static get_shared_neighbors(nodes, neighbors, shared_count_threshold=1)

Get the shared neighbors between nodes.

Parameters:
  • nodes (List[int]) – The nodes.

  • neighbors (List[set]) – The neighbors of each node.

  • shared_count_threshold (int, optional) – The minimum number of shared neighbors for two nodes to be linked. Defaults to 1.

Returns:

A list of edges.

Return type:

List[tuple]
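The shared-neighbor rule can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's code: the function name `shared_neighbor_edges` is hypothetical, `neighbors` is assumed to be indexed by node id, and the pairwise loop is quadratic in the number of nodes.

```python
from itertools import combinations


def shared_neighbor_edges(nodes, neighbors, shared_count_threshold=1):
    """Link node pairs that share at least `shared_count_threshold` neighbors."""
    edges = []
    for a, b in combinations(nodes, 2):
        # Size of the intersection of the two neighbor sets.
        if len(neighbors[a] & neighbors[b]) >= shared_count_threshold:
            edges.append((a, b))
    return edges


nodes = [0, 1, 2, 3]
neighbors = [{1, 2}, {0, 2}, {0, 1}, {2}]
edges = shared_neighbor_edges(nodes, neighbors, shared_count_threshold=1)
```

Raising `shared_count_threshold` makes the graph sparser. In this toy input no pair shares two neighbors, so a threshold of 2 yields no edges.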