scib.metrics.cluster_optimal_resolution

scib.metrics.cluster_optimal_resolution(adata, label_key, cluster_key, cluster_function=None, metric=None, resolutions=None, use_rep=None, force=False, verbose=True, return_all=False, metric_kwargs=None, **kwargs)

Optimised clustering

Leiden, Louvain or any custom clustering algorithm, with the resolution optimised against a metric.

Parameters:
  • adata – anndata object

  • label_key – name of column in adata.obs containing biological labels to be optimised against

  • cluster_key – name and prefix of columns to be added to adata.obs during clustering. Each resolution will be saved under “{cluster_key}_{resolution}”, while the optimal clustering will be under cluster_key. If force=True and one of the keys already exists, it will be overwritten.

  • cluster_function – a clustering function that takes an anndata.AnnData object. Default: Leiden clustering

  • metric – function that computes the score to be optimised. Must take the arguments (adata, label_key, cluster_key, **metric_kwargs) and return a number to be maximised. Default: nmi()

  • resolutions – list of resolutions to be optimised over. If resolutions=None, by default 10 equally spaced resolutions ranging between 0 and 2 will be used (see get_resolutions())

  • use_rep – key of the embedding to use. Only used if adata.uns['neighbors'] is not defined; otherwise it is ignored

  • force – whether to overwrite existing cluster assignments in adata.obs[cluster_key]

  • verbose – whether to print out intermediate results

  • return_all – whether to return results for all resolutions

  • metric_kwargs – arguments to be passed to metric

  • kwargs – arguments to pass to clustering function
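
The metric signature described above can be illustrated with a small custom metric. This is a minimal sketch, not part of scib: the function name cluster_purity, the toy object and its column names are hypothetical, and a plain namespace stands in for an AnnData object because the metric only reads adata.obs. Purity tends to favour finer clusterings, so it serves here only to show the expected interface.

```python
from collections import Counter
from types import SimpleNamespace


def cluster_purity(adata, label_key, cluster_key, **metric_kwargs):
    """Fraction of cells carrying the majority label of their cluster.

    Hypothetical example metric; higher is better, so it can be maximised.
    """
    labels = list(adata.obs[label_key])
    clusters = list(adata.obs[cluster_key])
    # Count how often each label occurs within each cluster.
    counts = {}
    for label, cluster in zip(labels, clusters):
        counts.setdefault(cluster, Counter())[label] += 1
    # Add up the majority-label count of every cluster and normalise.
    majority = sum(c.most_common(1)[0][1] for c in counts.values())
    return majority / len(labels)


# Stand-in for an AnnData object: this metric only needs `.obs`.
toy = SimpleNamespace(obs={
    "cell_type": ["B", "B", "T", "T", "T", "NK"],
    "cluster": ["0", "0", "1", "1", "0", "1"],
})
score = cluster_purity(toy, "cell_type", "cluster")
```

A function with this shape could then be passed as metric=cluster_purity, with any extra keyword arguments supplied through metric_kwargs.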

Returns:

Only if return_all=True, returns a tuple (res_max, score_max, score_all), where res_max is the resolution with the maximum score, score_max is the maximum score, and score_all is a pd.DataFrame containing the scores at all resolutions, which can be used to plot the score profile.

If you specify an embedding that was not used for the kNN graph (i.e. adata.uns["neighbors"]["params"]["use_rep"] is not the same as use_rep), the neighbors will be recomputed in-place.
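
The optimisation described on this page can be sketched as a simple loop over resolutions. The following is an illustration under stated assumptions, not scib's implementation: optimise_resolution, toy_cluster and pair_agreement are hypothetical names, the clustering function is assumed to accept resolution and key_added keyword arguments, the force check and neighbour computation are omitted, and score_all is a plain list of (resolution, score) pairs rather than a pd.DataFrame.

```python
from itertools import combinations
from types import SimpleNamespace


def optimise_resolution(adata, label_key, cluster_key, cluster_function,
                        metric, resolutions, metric_kwargs=None, **kwargs):
    """Cluster at each resolution, score it, and keep the best clustering."""
    metric_kwargs = metric_kwargs or {}
    score_all = []
    res_max, score_max, best = None, float("-inf"), None
    for res in resolutions:
        key = f"{cluster_key}_{res}"
        # Assumption: the clustering function writes to adata.obs[key].
        cluster_function(adata, resolution=res, key_added=key, **kwargs)
        score = metric(adata, label_key, key, **metric_kwargs)
        score_all.append((res, score))
        if score > score_max:  # ties keep the first resolution seen
            res_max, score_max, best = res, score, adata.obs[key]
    # The optimal clustering is stored under cluster_key itself.
    adata.obs[cluster_key] = best
    return res_max, score_max, score_all


def toy_cluster(adata, resolution, key_added):
    """Toy clustering: bin a 1-D value; higher resolution gives finer bins."""
    adata.obs[key_added] = [str(int(v * resolution)) for v in adata.obs["value"]]


def pair_agreement(adata, label_key, cluster_key):
    """Fraction of cell pairs on which labels and clusters agree."""
    labels, clusters = adata.obs[label_key], adata.obs[cluster_key]
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum((labels[i] == labels[j]) == (clusters[i] == clusters[j])
                for i, j in pairs)
    return agree / len(pairs)


# Stand-in for an AnnData object with two well-separated groups.
toy = SimpleNamespace(obs={
    "value": [0.1, 0.2, 0.3, 1.1, 1.2, 1.3],
    "label": ["a", "a", "a", "b", "b", "b"],
})
res_max, score_max, score_all = optimise_resolution(
    toy, "label", "cluster", toy_cluster, pair_agreement,
    resolutions=[0.5, 1.0, 4.0],
)
```

In this toy setting the lowest resolution merges both groups and the highest splits them further, so the middle resolution scores best; score_all holds the full profile that could be plotted.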