scib.metrics.isolated_labels_f1

scib.metrics.isolated_labels_f1(adata, label_key, batch_key, embed, cluster_key='iso_label', resolutions=None, iso_threshold=None, verbose=True, **kwargs)

Isolated label score F1

Score how well isolated labels are distinguished from other labels by data-driven clustering. The F1 score is used to evaluate clustering with respect to the ground truth labels.

Parameters:
  • adata – anndata object

  • label_key – column in adata.obs containing the ground truth labels

  • batch_key – column in adata.obs containing the batch assignment

  • embed – key in adata.obsm used as the representation for kNN graph computation. If embed=None, use the existing kNN graph in adata.uns['neighbors'].

  • iso_threshold – maximum number of batches in which a label may occur for it to be considered isolated, if iso_threshold is an integer. If iso_threshold=None, the threshold is the minimum number of batches that any label is present in.

  • cluster_key – clustering key prefix to look up or recompute for each resolution in resolutions. It is passed to cluster_optimal_resolution()

  • resolutions – list of resolutions to be passed to cluster_optimal_resolution()

  • verbose

Params **kwargs:

additional arguments to be passed to cluster_optimal_resolution()

Returns:

Mean of F1 scores over all isolated labels
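
The selection of isolated labels and the averaging step described above can be illustrated with a small, self-contained sketch. It is not scib's implementation: the helper names find_isolated_labels and best_cluster_f1 are hypothetical, the cluster assignments are written by hand instead of being computed on a kNN graph, and scoring each isolated label by its best F1 over all clusters is an assumption about how a clustering is matched to a label.

```python
from collections import defaultdict


def find_isolated_labels(labels, batches, iso_threshold=None):
    """Return labels that occur in at most `iso_threshold` batches (hypothetical helper)."""
    batches_per_label = defaultdict(set)
    for label, batch in zip(labels, batches):
        batches_per_label[label].add(batch)
    n_batches = {label: len(seen) for label, seen in batches_per_label.items()}
    if iso_threshold is None:
        # Fall back to the smallest number of batches any label occurs in.
        iso_threshold = min(n_batches.values())
    return sorted(label for label, n in n_batches.items() if n <= iso_threshold)


def best_cluster_f1(labels, clusters, iso_label):
    """Best F1 over clusters, reading each cluster as a prediction of `iso_label`.

    Assumption: the cluster that matches the label best defines its score.
    """
    n_label = sum(label == iso_label for label in labels)
    best = 0.0
    for cluster in set(clusters):
        n_cluster = sum(c == cluster for c in clusters)
        true_pos = sum(
            label == iso_label and c == cluster
            for label, c in zip(labels, clusters)
        )
        if true_pos == 0:
            continue  # F1 is 0 for clusters that contain no cell of the label
        precision = true_pos / n_cluster
        recall = true_pos / n_label
        best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy data: label "B" occurs only in batch "b1", so it is the isolated label.
labels = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
batches = ["b1", "b1", "b2", "b2", "b1", "b1", "b1", "b1", "b2", "b2"]
clusters = ["0", "0", "0", "0", "1", "1", "0", "2", "2", "2"]

isolated = find_isolated_labels(labels, batches)
scores = {label: best_cluster_f1(labels, clusters, label) for label in isolated}
mean_f1 = sum(scores.values()) / len(scores)
print(isolated, round(mean_f1, 3))
```

In this toy data, two of the three "B" cells fall into cluster "1" and one into cluster "0", so cluster "1" has perfect precision but incomplete recall for "B". Passing iso_threshold=2 would treat all three labels as isolated.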

This function performs clustering on a kNN graph and can therefore be applied to all integration output types. The adata must contain a kNN graph, and a precomputed clustering is used if one is available (see the examples below). Precomputed clusters must be saved under adata.obs[cluster_key] as well as adata.obs[f"{cluster_key}_{resolution}"] for every resolution.

See User Guide for more information on preprocessing.
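
The naming convention for precomputed clusters can be checked before calling the metric. The following is a minimal sketch under stated assumptions: the helper missing_cluster_columns is hypothetical and not part of scib, a plain list of column names stands in for adata.obs.columns, and the resolution values are made up for illustration. The exact string form of each resolution depends on how the clustering was stored.

```python
def expected_cluster_columns(cluster_key, resolutions):
    """Column names a precomputed clustering is expected to occupy."""
    return [cluster_key] + [
        f"{cluster_key}_{resolution}" for resolution in resolutions
    ]


def missing_cluster_columns(obs_columns, cluster_key, resolutions):
    """Return the expected clustering columns that are absent (hypothetical helper)."""
    present = set(obs_columns)
    return [
        column
        for column in expected_cluster_columns(cluster_key, resolutions)
        if column not in present
    ]


# Stand-in for list(adata.obs.columns); the resolutions are illustrative only.
obs_columns = ["celltype", "batch", "iso_label", "iso_label_0.1", "iso_label_0.2"]
missing = missing_cluster_columns(obs_columns, "iso_label", [0.1, 0.2, 0.3])
print(missing)
```

An empty result means every expected column exists. Any name that is returned points to a resolution whose clustering has not been stored.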

Examples

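import scanpy as sc
import scib

# full feature output: reduce_data computes PCA and the kNN graph
scib.pp.reduce_data(
    adata, n_top_genes=2000, batch_key="batch", pca=True, neighbors=True
)
# embed=None reuses the kNN graph stored in adata.uns['neighbors']
scib.me.isolated_labels_f1(adata, label_key="celltype", batch_key="batch", embed=None)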

# embedding output
sc.pp.neighbors(adata, use_rep="X_emb")
scib.me.isolated_labels_f1(adata, batch_key="batch", label_key="celltype", embed="X_emb")

# knn output
scib.me.isolated_labels_f1(adata, batch_key="batch", label_key="celltype", embed=None)

# use precomputed clustering
scib.cl.cluster_optimal_resolution(adata, cluster_key="iso_label", label_key="celltype")
scib.me.isolated_labels_f1(adata, batch_key="batch", label_key="celltype", embed=None)

# overwrite existing clustering
scib.me.isolated_labels_f1(adata, batch_key="batch", label_key="celltype", embed=None, force=True)