scib.metrics.ari

scib.metrics.ari(adata, cluster_key, label_key, implementation=None)

Adjusted Rand Index

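The adjusted Rand index (ARI) is the Rand index corrected for chance. It evaluates the pairwise agreement between cluster assignments and ground-truth labels. A score of 1 indicates perfect agreement and a score near 0 indicates agreement no better than random assignment; because of the chance correction, slightly negative values can occur when agreement is worse than random. Larger values indicate that the cell identities recovered by clustering the integrated data match the annotated labels more closely.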
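To make the chance adjustment concrete, here is a minimal sketch of the standard ARI formula computed from the contingency table of labels versus clusters. It is an illustration in plain Python and is not taken from scib's native implementation; the function name `adjusted_rand_index` is hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Illustrative ARI from a contingency table (hypothetical helper, not scib code)."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")

    total_pairs = comb(n, 2)
    if total_pairs == 0:
        # Fewer than two observations: no pairs to compare.
        return 1.0

    # Counts of (label, cluster) co-occurrences and of each marginal.
    contingency = Counter(zip(labels_true, labels_pred))
    label_sizes = Counter(labels_true)
    cluster_sizes = Counter(labels_pred)

    # Number of pairs that share a cell, a label, or a cluster.
    sum_cells = sum(comb(c, 2) for c in contingency.values())
    sum_labels = sum(comb(c, 2) for c in label_sizes.values())
    sum_clusters = sum(comb(c, 2) for c in cluster_sizes.values())

    # Expected pair agreement under random assignment, and its maximum.
    expected = sum_labels * sum_clusters / total_pairs
    max_index = (sum_labels + sum_clusters) / 2

    if max_index == expected:
        # Degenerate case: both partitions are trivial and identical in structure.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Same partition with renamed clusters: perfect agreement.
perfect = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])  # 1.0
# Clusters that split every label evenly: worse than chance.
worst = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # -0.5
# Partial agreement.
partial = adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1])  # about 0.571
```

The score depends only on which observations are grouped together, so renaming clusters leaves it unchanged. The second call shows that the formula can fall below 0 when the clustering agrees with the labels less than random assignment would.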

Parameters:

adata – anndata object with cluster assignments in adata.obs[cluster_key]
cluster_key – string of column in adata.obs containing cluster assignments
label_key – string of column in adata.obs containing labels
implementation – if set to 'sklearn', sklearn's implementation is used; otherwise the native implementation is used

This function can be applied to all integration output types. The adata must contain cluster assignments based on the kNN graph that is either given by or derived from the integration method's output. For this metric, you need to include all steps required for clustering. See the User Guide for more information on preprocessing.

Examples

import scanpy as sc
import scib

# feature output
scib.pp.reduce_data(
    adata, n_top_genes=2000, batch_key="batch", pca=True, neighbors=True
)
scib.me.cluster_optimal_resolution(adata, cluster_key="cluster", label_key="celltype")
scib.me.ari(adata, cluster_key="cluster", label_key="celltype")

# embedding output
sc.pp.neighbors(adata, use_rep="X_emb")
scib.me.cluster_optimal_resolution(adata, cluster_key="cluster", label_key="celltype")
scib.me.ari(adata, cluster_key="cluster", label_key="celltype")

# knn output
scib.me.cluster_optimal_resolution(adata, cluster_key="cluster", label_key="celltype")
scib.me.ari(adata, cluster_key="cluster", label_key="celltype")