scib.metrics.cluster_optimal_resolution

scib.metrics.cluster_optimal_resolution(adata, label_key, cluster_key, cluster_function=None, metric=None, resolutions=None, use_rep=None, force=True, verbose=True, return_all=False, metric_kwargs=None, **kwargs)

Optimised clustering

Leiden, Louvain or any custom clustering algorithm, with the resolution optimised against a metric.

Parameters:

adata – anndata object
label_key – name of column in adata.obs containing biological labels to be optimised against
cluster_key – name of column to be added to adata.obs during clustering. Overwritten if it already exists and force=True
cluster_function – a clustering function that takes an anndata.AnnData object. Default: Leiden clustering
metric – function that computes the score to be optimised. Must take the arguments (adata, label_key, cluster_key, **metric_kwargs) and return a number to be maximised. Default: nmi()
resolutions – list of resolutions to be optimised over. If resolutions=None, default resolutions of 10 values ranging between 0.1 and 2 will be used
use_rep – key of the embedding to use. Only used if adata.uns['neighbors'] is not defined; ignored otherwise
force – whether to overwrite existing cluster assignments in adata.obs[cluster_key]
verbose – whether to print out intermediate results
return_all – whether to return results for all resolutions
metric_kwargs – arguments to be passed to metric
kwargs – arguments to pass to clustering function
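
The signature required of metric can be illustrated with a small custom scoring function. This is a minimal sketch, not part of scib: the name purity and the toy object are hypothetical, and the toy object only imitates the .obs access of an AnnData object so the example runs with the standard library alone. Purity tends to favour very fine clusterings, so it is used here only to show the expected arguments and return value.

```python
from collections import Counter, defaultdict
from types import SimpleNamespace


def purity(adata, label_key, cluster_key, **metric_kwargs):
    """Fraction of cells whose label is the majority label of their cluster.

    Hypothetical metric; metric_kwargs is accepted only to match the
    documented signature and is not used here.
    """
    labels = list(adata.obs[label_key])
    clusters = list(adata.obs[cluster_key])

    # Count how often each label occurs inside each cluster.
    per_cluster = defaultdict(Counter)
    for label, cluster in zip(labels, clusters):
        per_cluster[cluster][label] += 1

    # Sum the size of the majority label over all clusters.
    majority = sum(max(counts.values()) for counts in per_cluster.values())
    return majority / len(labels)  # higher is better, as metric requires


# Stand-in for an AnnData object (assumption: only .obs is needed here).
toy = SimpleNamespace(
    obs={
        "cell_type": ["B", "B", "T", "T", "T", "NK"],
        "cluster": ["0", "0", "0", "1", "1", "1"],
    }
)
score = purity(toy, "cell_type", "cluster")
```

A function of this shape could then be passed as metric=purity, with any extra keyword arguments supplied through metric_kwargs.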

Returns:

Only if return_all=True, returns a tuple (res_max, score_max, score_all):

res_max – resolution with the maximum score
score_max – the maximum score
score_all – pd.DataFrame containing the scores at all resolutions. Can be used to plot the score profile.
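
The resolution search and the shape of the returned values can be sketched as follows. This is an illustrative stand-in, not the scib implementation: search_resolution, toy_cluster and rand_index are hypothetical names, the way the clustering function is called (resolution= and key_added= keywords) is an assumption, and score_all is a plain list of (resolution, score) pairs here rather than a pd.DataFrame.

```python
from itertools import combinations
from types import SimpleNamespace


def toy_cluster(adata, resolution, key_added):
    """Hypothetical clustering: bin a 1-D value into bins of width 1 / resolution."""
    width = 1.0 / resolution
    adata.obs[key_added] = [str(int(v // width)) for v in adata.obs["value"]]


def rand_index(adata, label_key, cluster_key, **metric_kwargs):
    """Fraction of cell pairs on which labels and clusters agree (Rand index)."""
    labels = list(adata.obs[label_key])
    clusters = list(adata.obs[cluster_key])
    pairs = list(combinations(range(len(labels)), 2))
    agree = sum(
        (labels[i] == labels[j]) == (clusters[i] == clusters[j]) for i, j in pairs
    )
    return agree / len(pairs)


def search_resolution(adata, label_key, cluster_key, cluster_function, metric, resolutions):
    """Cluster at each resolution, score against the labels, keep the best."""
    score_all = []
    res_max, score_max, best = None, float("-inf"), None
    for res in resolutions:
        cluster_function(adata, resolution=res, key_added=cluster_key)
        score = metric(adata, label_key, cluster_key)
        score_all.append((res, score))
        if score > score_max:  # ties keep the first (lowest) resolution tried
            res_max, score_max = res, score
            best = list(adata.obs[cluster_key])
    # Leave the best-scoring assignment in adata.obs[cluster_key].
    adata.obs[cluster_key] = best
    return res_max, score_max, score_all


# Stand-in for an AnnData object (assumption: only .obs is needed here).
toy = SimpleNamespace(
    obs={
        "value": [0.1, 0.2, 0.6, 1.1, 1.2, 1.6, 2.1, 2.2],
        "cell_type": ["a", "a", "a", "b", "b", "b", "c", "c"],
    }
)
res_max, score_max, score_all = search_resolution(
    toy, "cell_type", "cluster", toy_cluster, rand_index, [0.25, 0.5, 1.0, 2.0]
)
```

In this toy setting the score rises and then falls again as the resolution increases, which is the kind of profile score_all is meant to expose for plotting.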