spatiomic.tool

Expose helper functions in the tool submodule.

Functions

`count_clusters`(file_paths, cluster_count[, sort, ...])	Count the number of clusters in each image.
`get_stats`(data, group[, channel_names, comparison, ...])	Calculate the statistics for marker or cluster expression/abundance between groups.
`mean_cluster_intensity`(data, clusters[, channel_names])	Calculate the mean intensity for the channels of data for each label category.

Package Contents

spatiomic.tool.count_clusters(file_paths, cluster_count, sort=False, normalize=False)

Count the number of clusters in each image.

Parameters:

file_paths (List[str]) – The paths to the files to count the clusters in.
cluster_count (int) – The number of clusters to count.
sort (bool, optional) – Whether to sort the file paths. Defaults to False.
normalize (bool, optional) – Whether to return the row-normalized cluster count. Defaults to False.

Returns:

A DataFrame containing the cluster counts.

Return type:

pd.DataFrame
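
To illustrate what a row-normalized cluster count looks like, here is a minimal sketch. It is not spatiomic's implementation: it takes in-memory label lists instead of file paths, returns plain lists instead of a DataFrame, and assumes labels are integers from 0 to cluster_count - 1. The helper name `count_labels` is hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def count_labels(
    label_images: Sequence[Sequence[int]],
    cluster_count: int,
    normalize: bool = False,
) -> List[List[float]]:
    """Return one row of per-cluster counts for each flattened label image.

    Assumption: cluster labels are integers in the range 0..cluster_count - 1.
    """
    rows = []
    for labels in label_images:
        counts = Counter(labels)
        row = [float(counts.get(cluster, 0)) for cluster in range(cluster_count)]
        if normalize:
            total = sum(row)
            # Divide each row by its own total so that the row sums to 1.
            row = [value / total if total > 0 else 0.0 for value in row]
        rows.append(row)
    return rows


# Two tiny "images", already flattened to one label per pixel.
images = [[0, 0, 1, 2], [1, 1, 1, 1]]
raw = count_labels(images, cluster_count=3)
fractions = count_labels(images, cluster_count=3, normalize=True)
```

With normalization, each row holds the fraction of pixels per cluster, which makes images of different sizes comparable.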

spatiomic.tool.get_stats(data, group, channel_names=None, comparison='all', is_log1p=False, test='t', dependent=False, equal_variance=None, correction='holm-sidak', correction_family_alpha=0.05, permutation_count=None, permutation_seed=0, test_kwargs=None)

Calculate the statistics for marker or cluster expression/abundance between groups.

Warning

A pseudo count of 1e-38 is added to the data to avoid log(0) errors when calculating log fold change and log10 p-values.
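
The effect of the pseudo count can be sketched as follows. This is an illustration only: the use of base 2 for the fold change, the exact placement of the pseudo count, and the helper names `log2_fold_change` and `log10_p` are assumptions, not taken from the spatiomic source.

```python
import math
from statistics import mean
from typing import Sequence

PSEUDO_COUNT = 1e-38  # value stated in the warning above


def log2_fold_change(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Log2 ratio of the group means, with the pseudo count added to both.

    Assumption: base 2 and adding the pseudo count to the means are
    illustrative choices.
    """
    return math.log2((mean(group_a) + PSEUDO_COUNT) / (mean(group_b) + PSEUDO_COUNT))


def log10_p(p_value: float) -> float:
    """Log10 of a p-value that stays finite even when the p-value is 0."""
    return math.log10(p_value + PSEUDO_COUNT)


healthy = [0, 0, 0, 0]
disease = [1, 1, 1, 1]

# Without the pseudo count this would be log2(0), which raises an error.
lfc = log2_fold_change(healthy, disease)
```

A group mean of exactly zero therefore produces a very large but finite log fold change, which is worth keeping in mind when reading the results.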

Usage:

from spatiomic.tool import get_stats

data = [
    0,
    0,
    0,
    0,
    1,
    1,
    1,
    1,
    0,
    1,
    0,
    1,
]
group = [
    "healthy",
    "healthy",
    "healthy",
    "healthy",
    "disease",
    "disease",
    "disease",
    "disease",
    "treated",
    "treated",
    "treated",
    "treated",
]
# if only one comparison group is provided, all other groups are compared to it
comparison = "healthy"

df = get_stats(
    data,
    group,
    comparison=comparison,
    test="t",
    dependent=False,
    correction="bonferroni",
)

Parameters:

data (NDArray) – Image data to calculate the statistics for.
group (Union[NDArray, List[Union[str, int]]]) – The group labels for each pixel.
channel_names (Optional[List[str]], optional) – The names of the channels. Defaults to None.
comparison (Union[Literal["all", "each"], str, int, List[Union[str, int]]]) – The group labels to compare to. Defaults to “all”.
is_log1p (bool, optional) – Whether the data is log1p transformed. Defaults to False.
test (Literal["t", "wilcoxon", "mwu", "mannwhitneyu"], optional) – The statistical test to be used. When permutation_count is not None, this refers to the statistic to use for the permutation test. You can provide your own statistic that accepts the data to compare as the first two positional arguments and also accepts **kwargs. If the function has an axis and/or a dependent parameter, the respective arguments will also be passed to the custom function. Defaults to “t”.
dependent (bool, optional) – Whether the data is dependent. Defaults to False.
equal_variance (bool, optional) – Whether to assume equal variance for the t-test. If False or None, Welch’s t-test is used. Defaults to None.
correction (Union[Literal["holmsidak", "bonferroni", "fdr"], None], optional) – The correction to apply to the p-values to control the family-wise error rate. Defaults to “holmsidak”.
correction_family_alpha (float, optional) – The family-wise alpha value to use for the correction. Currently has no effect as only the corrected p-values are returned but may be used in the future. Defaults to 0.05.
permutation_count (int, optional) – The number of permutations to perform. If None or 0, no permutation test is performed. Defaults to None.
permutation_seed (int, optional) – The random seed to use for the permutations. Defaults to 0.
test_kwargs (dict, optional) – Additional keyword arguments to be passed to the statistical test function. Defaults to None.

Returns:

A pandas DataFrame containing the statistics for each marker and comparison.

Return type:

pd.DataFrame
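
For readers unfamiliar with the default correction, the following is a minimal sketch of the standard Holm-Sidak step-down adjustment. This reference does not say how spatiomic computes the correction internally, so treat the function `holm_sidak` as a hypothetical, dependency-free illustration of the procedure rather than the library's code.

```python
from typing import List, Sequence


def holm_sidak(p_values: Sequence[float]) -> List[float]:
    """Return Holm-Sidak adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, index in enumerate(order):
        # Sidak adjustment over the m - rank hypotheses remaining at this step.
        candidate = 1.0 - (1.0 - p_values[index]) ** (m - rank)
        # Adjusted p-values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[index] = min(running_max, 1.0)
    return adjusted


adjusted = holm_sidak([0.01, 0.04, 0.03])
```

The smallest p-value is adjusted most strongly, and later adjustments can never fall below earlier ones, so the ordering of the raw p-values is preserved.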

spatiomic.tool.mean_cluster_intensity(data, clusters, channel_names=None)

Calculate the mean intensity for the channels of data for each label category.

Parameters:

data (NDArray) – The image data, channel-last.
clusters (NDArray) – The clusters for the data points.
channel_names (Optional[Union[List[str], List[int]]], optional) – The column names for the channels in the mean label intensity DataFrame. Defaults to None.

Returns:

A DataFrame of the channel-wise mean intensity per label, with the clusters as rows and the channels as columns.

Return type:

pd.DataFrame
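
The computation described above can be sketched without any dependencies. This is an illustration under stated assumptions, not the library's implementation: it uses nested lists in place of NumPy arrays, returns a dictionary instead of a DataFrame, and the name `mean_intensity_per_cluster` is hypothetical.

```python
from typing import Dict, List, Sequence


def mean_intensity_per_cluster(
    data: Sequence[Sequence[Sequence[float]]],
    clusters: Sequence[Sequence[int]],
) -> Dict[int, List[float]]:
    """Map each cluster label to its per-channel mean intensity.

    `data` is a channel-last image (height x width x channels) and
    `clusters` holds one label per pixel (height x width).
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for data_row, cluster_row in zip(data, clusters):
        for pixel, label in zip(data_row, cluster_row):
            if label not in sums:
                sums[label] = [0.0] * len(pixel)
                counts[label] = 0
            # Accumulate the channel values of this pixel for its cluster.
            sums[label] = [total + value for total, value in zip(sums[label], pixel)]
            counts[label] += 1
    # One entry per cluster (rows), one mean per channel (columns).
    return {
        label: [total / counts[label] for total in sums[label]]
        for label in sorted(sums)
    }


# A 2 x 2 image with two channels and one cluster label per pixel.
image = [
    [[1.0, 10.0], [3.0, 30.0]],
    [[5.0, 50.0], [8.0, 80.0]],
]
labels = [
    [0, 0],
    [1, 0],
]
means = mean_intensity_per_cluster(image, labels)
```

Each dictionary key corresponds to a row of the returned table and each list position to a channel column.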