spatiomic.tool
==============

.. py:module:: spatiomic.tool

.. autoapi-nested-parse::

   Expose helper functions in the tool submodule.


Functions
---------

.. autoapisummary::

   spatiomic.tool.count_clusters
   spatiomic.tool.get_stats
   spatiomic.tool.mean_cluster_intensity


Package Contents
----------------

.. py:function:: count_clusters(file_paths, cluster_count, sort = False, normalize = False)

   Count the number of clusters in each image.

   :param file_paths: The paths to the files to count the clusters in.
   :type file_paths: List[str]
   :param cluster_count: The number of clusters to count.
   :type cluster_count: int
   :param sort: Whether to sort the file paths. Defaults to False.
   :type sort: bool, optional
   :param normalize: Whether to return the row-normalized cluster count. Defaults to False.
   :type normalize: bool, optional

   :returns: A DataFrame containing the cluster counts.
   :rtype: pd.DataFrame
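
   The sketch below illustrates one way to read "row-normalized cluster count". It assumes that each row
   holds the per-cluster pixel counts of one image and that normalization divides each row by its total. The helper
   names ``count_labels`` and ``row_normalize`` are hypothetical and use only the standard library; this is not
   spatiomic's implementation.

   .. code-block:: python

       from collections import Counter


       def count_labels(labels, cluster_count):
           """Return how often each cluster id in range(cluster_count) occurs."""
           counts = Counter(labels)
           return [counts.get(cluster, 0) for cluster in range(cluster_count)]


       def row_normalize(row):
           """Scale one row of counts so that its entries sum to 1."""
           total = sum(row)
           # Leave an all-zero row untouched to avoid dividing by zero.
           return [value / total for value in row] if total else list(row)


       # Hypothetical cluster labels of two tiny images, flattened to 1D.
       images = {
           "image_a": [0, 0, 1, 2],
           "image_b": [1, 1, 1, 1],
       }
       counts = {name: count_labels(labels, 3) for name, labels in images.items()}
       fractions = {name: row_normalize(row) for name, row in counts.items()}

   With counts normalized this way, images of different sizes can be compared by cluster proportion rather than by
   absolute pixel count.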


.. py:function:: get_stats(data, group, channel_names = None, comparison = 'all', is_log1p = False, test = 't', dependent = False, equal_variance = None, correction = 'holm-sidak', correction_family_alpha = 0.05, permutation_count = None, permutation_seed = 0, test_kwargs = None)

   Calculate the statistics for marker or cluster expression/abundance between groups.

   .. warning:: A pseudo count of 1e-38 is added to the data to avoid log(0) errors when calculating log fold change
       and log10 p-values.
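
   To see why such a pseudo count keeps the results finite, consider the minimal sketch below. It assumes a log2
   ratio of group means and a negative log10 of the p-value; the source does not state which logarithm base or
   formula is used internally, and the function names are hypothetical.

   .. code-block:: python

       import math

       PSEUDO_COUNT = 1e-38  # value quoted in the warning above


       def log2_fold_change(mean_a, mean_b, pseudo_count=PSEUDO_COUNT):
           """Log2 ratio of two group means, shifted to stay finite at zero."""
           return math.log2((mean_a + pseudo_count) / (mean_b + pseudo_count))


       def neg_log10_p(p_value, pseudo_count=PSEUDO_COUNT):
           """Negative log10 of a p-value, shifted to stay finite at zero."""
           return -math.log10(p_value + pseudo_count)


       both_zero = log2_fold_change(0.0, 0.0)      # ratio of equal values
       zero_reference = log2_fold_change(1.0, 0.0)  # large but finite
       zero_p = neg_log10_p(0.0)                    # finite instead of an error

   Note that a mean of exactly zero in the reference group yields a very large fold change, so such values are best
   interpreted with care.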

   Usage:

   .. code-block:: python

       from spatiomic.tool import get_stats

       data = [
           0,
           0,
           0,
           0,
           1,
           1,
           1,
           1,
           0,
           1,
           0,
           1,
       ]
       group = [
           "healthy",
           "healthy",
           "healthy",
           "healthy",
           "disease",
           "disease",
           "disease",
           "disease",
           "treated",
           "treated",
           "treated",
           "treated",
       ]
       # If only one comparison group is provided, all other groups are compared to it.
       comparison = "healthy"

       # Pass comparison by keyword: the third positional parameter is channel_names.
       df = get_stats(
           data,
           group,
           comparison=comparison,
           test="t",
           dependent=False,
           correction="bonferroni",
       )

   :param data: Image data to calculate the statistics for.
   :type data: NDArray
   :param group: The group labels for each pixel.
   :type group: Union[NDArray, List[Union[str, int]]]
   :param channel_names: The names of the channels. Defaults to None.
   :type channel_names: Optional[List[str]], optional
   :param comparison: The group label or labels to compare against. If a single group is provided,
                      all other groups are compared to it.
                      Defaults to "all".
   :type comparison: Union[Literal["all", "each"], str, int, List[Union[str, int]]]
   :param is_log1p: Whether the data is log1p transformed. Defaults to False.
   :type is_log1p: bool, optional
   :param test: The statistical test to be used. When
                ``permutation_count`` is not None, this refers to the statistic to use for the permutation test. You can provide
                your own statistic that accepts the data to compare as the first two positional arguments and also accepts
                ``**kwargs``. If the function has an ``axis`` and/or a ``dependent`` parameter, the respective arguments will also
                be passed to the custom function.
                Defaults to "t".
   :type test: Literal["t", "wilcoxon", "mwu", "mannwhitneyu"], optional
   :param dependent: Whether the data is dependent. Defaults to False.
   :type dependent: bool, optional
   :param equal_variance: Whether to assume equal variance for the t-test. If False or None, Welch's
                          t-test is used. Defaults to None.
   :type equal_variance: bool, optional
   :param correction: The correction to apply to the
                      p-values to control the family-wise error rate.
                      Defaults to "holm-sidak".
   :type correction: Union[Literal["holmsidak", "bonferroni", "fdr"], None], optional
   :param correction_family_alpha: The family-wise alpha value to use for the correction. Currently
                                   has no effect, as only the corrected p-values are returned, but may be used in the future.
                                   Defaults to 0.05.
   :type correction_family_alpha: float, optional
   :param permutation_count: The number of permutations to perform. If None or 0, no permutation test is
                             performed. Defaults to None.
   :type permutation_count: int, optional
   :param permutation_seed: The random seed to use for the permutations. Defaults to 0.
   :type permutation_seed: int, optional
   :param test_kwargs: Additional keyword arguments to be passed to the statistical test function.
                       Defaults to None.
   :type test_kwargs: dict, optional

   :returns: A pandas DataFrame containing the statistics for each marker and comparison.
   :rtype: pd.DataFrame
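
   For readers unfamiliar with the default correction, the following is a minimal standard-library sketch of the
   textbook Holm-Sidak step-down adjustment. It is given for illustration only; the function name ``holm_sidak``
   is hypothetical, and the source does not confirm that the library computes the correction in exactly this way.

   .. code-block:: python

       def holm_sidak(p_values):
           """Return Holm-Sidak adjusted p-values in the original order."""
           m = len(p_values)
           # Indices of the p-values sorted from smallest to largest.
           order = sorted(range(m), key=lambda i: p_values[i])
           adjusted = [0.0] * m
           running_max = 0.0
           for rank, index in enumerate(order):
               # Sidak adjustment over the m - rank hypotheses still in play.
               candidate = 1.0 - (1.0 - p_values[index]) ** (m - rank)
               # Adjusted values must not decrease as the raw p-values grow.
               running_max = max(running_max, candidate)
               adjusted[index] = min(running_max, 1.0)
           return adjusted


       raw = [0.01, 0.04, 0.03]
       corrected = holm_sidak(raw)

   The smallest raw p-value receives the strongest adjustment, and each later value is adjusted over fewer remaining
   hypotheses, which makes the procedure less conservative than a plain Bonferroni correction.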


.. py:function:: mean_cluster_intensity(data, clusters, channel_names = None)

   Calculate the mean intensity for the channels of data for each label category.

   :param data: The image data, channel-last.
   :type data: NDArray
   :param clusters: The clusters for the data points.
   :type clusters: NDArray
   :param channel_names: The column names for the channels
                         in the mean label intensity DataFrame. Defaults to None.
   :type channel_names: Optional[Union[List[str], List[int]]], optional

   :returns: A DataFrame of the channel-wise mean intensity per label, with the clusters as rows
             and the channels as columns.
   :rtype: pd.DataFrame
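
   The computation described above can be sketched with the standard library alone. The example assumes that the
   channel-last data has been flattened to one list of channel values per pixel; the function name
   ``mean_intensity_per_cluster`` is hypothetical, and the result is a plain dictionary rather than the DataFrame
   that the library returns.

   .. code-block:: python

       from collections import defaultdict


       def mean_intensity_per_cluster(pixels, clusters):
           """Average each channel over the pixels that share a cluster label."""
           sums = {}
           counts = defaultdict(int)
           for values, label in zip(pixels, clusters):
               if label not in sums:
                   sums[label] = [0.0] * len(values)
               for channel, value in enumerate(values):
                   sums[label][channel] += value
               counts[label] += 1
           # One entry per cluster (row), one mean per channel (column).
           return {
               label: [total / counts[label] for total in sums[label]]
               for label in sorted(sums)
           }


       # Four pixels with two channels each (channel-last), already flattened.
       pixels = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0], [7.0, 70.0]]
       clusters = [0, 0, 1, 1]
       means = mean_intensity_per_cluster(pixels, clusters)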