scib.metrics.cluster_optimal_resolution

scib.metrics.cluster_optimal_resolution(adata, label_key, cluster_key, cluster_function=None, metric=None, resolutions=None, use_rep=None, force=True, verbose=True, return_all=False, metric_kwargs=None, **kwargs)

Optimised clustering

Cluster with Leiden, Louvain, or any custom clustering algorithm, choosing the resolution that maximises a metric.

Parameters:
  • adata – anndata object

  • label_key – name of column in adata.obs containing biological labels to be optimised against

  • cluster_key – name of column to be added to adata.obs during clustering. An existing column of that name is overwritten if force=True

  • cluster_function – a clustering function that takes an anndata.AnnData object. Default: Leiden clustering

  • metric – function that computes the score to be optimised over. Must take the arguments (adata, label_key, cluster_key, **metric_kwargs) and return a number to be maximised. Default: nmi()

  • resolutions – list of resolutions to be optimised over. If resolutions=None, default resolutions of 10 values ranging between 0.1 and 2 will be used

  • use_rep – key of the embedding to use. Only used if adata.uns['neighbors'] is not defined; otherwise it is ignored

  • force – whether to overwrite existing cluster assignments in adata.obs[cluster_key]

  • verbose – whether to print out intermediate results

  • return_all – whether to return results for all resolutions

  • metric_kwargs – arguments to be passed to metric

  • kwargs – arguments to pass to clustering function
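The metric parameter accepts any callable with the signature (adata, label_key, cluster_key, **metric_kwargs) that returns a number to be maximised. A minimal sketch of such a callable follows. The purity function and the toy object are illustrative assumptions, not part of scib: the toy object only imitates column lookup on .obs and is not a real AnnData object, and purity is not the default metric (the default is nmi()).

```python
from collections import Counter, defaultdict
from types import SimpleNamespace


def purity(adata, label_key, cluster_key, **metric_kwargs):
    """Hypothetical metric: fraction of cells whose label matches the
    majority label of their cluster. Higher is better.

    metric_kwargs is accepted to match the documented signature but is
    not used in this sketch.
    """
    labels = list(adata.obs[label_key])
    clusters = list(adata.obs[cluster_key])

    # Count how often each label occurs within each cluster.
    by_cluster = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        by_cluster[cluster][label] += 1

    # Sum the size of the majority label in every cluster.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(labels)


# Stand-in for an AnnData object (assumption): only .obs[column] is imitated.
toy = SimpleNamespace(
    obs={
        "cell_type": ["B", "B", "T", "T", "T", "NK"],
        "cluster": ["0", "0", "1", "1", "0", "1"],
    }
)

score = purity(toy, "cell_type", "cluster")
print(round(score, 3))
```

Purity on its own tends to reward over-clustering, so it is shown here only to illustrate the expected signature.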

Returns:

Only if return_all=True: a tuple (res_max, score_max, score_all), where

  • res_max – resolution of maximum score

  • score_max – maximum score

  • score_all – pd.DataFrame containing the scores at all resolutions. Can be used to plot the score profile
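The relationship between res_max, score_max and score_all can be read as the result of a simple sweep: cluster at each resolution, score the clustering against the labels, and keep the best. The sketch below is not scib's implementation. It assumes toy stand-ins throughout: bin_cluster and rand_index are hypothetical helpers, the data are plain lists rather than an AnnData object, and score_all is a list of (resolution, score) tuples rather than a pd.DataFrame.

```python
from itertools import combinations


def bin_cluster(points, resolution):
    """Toy clustering (assumption): bin 1-D points; higher resolution gives more bins."""
    return [int(p * resolution) for p in points]


def rand_index(labels, clusters):
    """Toy metric (assumption): fraction of cell pairs on which labels and
    clusters agree about 'same group' versus 'different group'."""
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum(
        (labels[i] == labels[j]) == (clusters[i] == clusters[j]) for i, j in pairs
    )
    return agree / len(pairs)


def sweep_resolutions(points, labels, cluster_function, metric, resolutions):
    """Score every resolution and return the best one plus the full profile."""
    score_all = []
    res_max, score_max = None, float("-inf")
    for res in resolutions:
        clusters = cluster_function(points, res)
        score = metric(labels, clusters)
        score_all.append((res, score))
        # Keep the first resolution that reaches the highest score so far.
        if score > score_max:
            res_max, score_max = res, score
    return res_max, score_max, score_all


points = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
labels = ["a", "a", "a", "b", "b", "b"]

res_max, score_max, score_all = sweep_resolutions(
    points, labels, bin_cluster, rand_index, resolutions=[0.5, 1.0, 4.0]
)
print(res_max, score_max)
for res, score in score_all:
    print(f"resolution={res}: score={score:.3f}")
```

In this toy data the lowest resolution merges both groups and the highest splits them too finely, so the score profile peaks at the middle value.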