scib.metrics.metrics
- scib.metrics.metrics(adata, adata_int, batch_key, label_key, embed='X_pca', cluster_key='cluster', cluster_nmi=None, ari_=False, nmi_=False, nmi_method='arithmetic', nmi_dir=None, silhouette_=False, si_metric='euclidean', pcr_=False, cell_cycle_=False, organism='mouse', hvg_score_=False, isolated_labels_=False, isolated_labels_f1_=False, isolated_labels_asw_=False, n_isolated=None, graph_conn_=False, trajectory_=False, kBET_=False, lisi_graph_=False, ilisi_=False, clisi_=False, subsample=0.5, n_cores=1, type_=None, verbose=False)
Master metrics function
Wrapper for all metrics used in the study. Computes the metrics enabled through the flags below, given an unintegrated and an integrated anndata object.
- Parameters:
adata – unintegrated, preprocessed anndata object
adata_int – integrated anndata object
batch_key – name of batch column in adata.obs and adata_int.obs
label_key – name of biological label (cell type) column in adata.obs and adata_int.obs
embed – embedding representation of adata_int
Used for:
silhouette scores (label ASW, batch ASW),
PC regression,
cell cycle conservation,
isolated label scores, and
kBET
cluster_key – name of column to store cluster assignments. Will be overwritten if it exists
cluster_nmi – where to save cluster resolutions and NMI for optimal clustering. If None, these results will not be saved
ari_ – whether to compute ARI using
ari()
nmi_ – whether to compute NMI using
nmi()
nmi_method – which implementation of NMI to use
nmi_dir – directory of NMI code for some implementations of NMI
silhouette_ – whether to compute the average silhouette width scores for labels and batch using
silhouette()
and silhouette_batch()
si_metric – which distance metric to use for silhouette scores
pcr_ – whether to compute principal component regression using
pc_comparison()
cell_cycle_ – whether to compute cell cycle score conservation using
cell_cycle()
organism – organism of the datasets, used for computing cell cycle scores on gene names
hvg_score_ – whether to compute highly variable gene conservation using
hvg_overlap()
isolated_labels_ – whether to compute both isolated label scores using
isolated_labels()
isolated_labels_f1_ – whether to compute isolated label score based on F1 score of clusters vs labels using
isolated_labels()
isolated_labels_asw_ – whether to compute isolated label score based on ASW (average silhouette width) using
isolated_labels()
n_isolated – maximum number of batches a label may occur in for that label to be considered isolated
graph_conn_ – whether to compute graph connectivity score using
graph_connectivity()
trajectory_ – whether to compute trajectory score using
trajectory_conservation()
kBET_ – whether to compute kBET score using
kBET()
lisi_graph_ – whether to compute both cLISI and iLISI using
lisi_graph()
clisi_ – whether to compute cLISI using
clisi_graph()
ilisi_ – whether to compute iLISI using
ilisi_graph()
subsample – subsample fraction for LISI scores
n_cores – number of cores to be used for LISI functions
type_ – one of ‘full’, ‘embed’ or ‘knn’ (used for kBET and LISI scores)
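
Because every metric is switched on by its own boolean flag, it can help to assemble the keyword arguments in one place before calling the wrapper. The following is a minimal sketch, not part of scib: build_metric_kwargs is a hypothetical helper, and the flag names and type_ values are copied from the signature and parameter list above. The final call is left as a comment because it needs scib installed and two AnnData objects; the column names "batch" and "cell_type" in it are placeholders.

```python
# Boolean switches copied from the signature above (each ends in an underscore).
METRIC_FLAGS = (
    "ari_", "nmi_", "silhouette_", "pcr_", "cell_cycle_", "hvg_score_",
    "isolated_labels_", "isolated_labels_f1_", "isolated_labels_asw_",
    "graph_conn_", "trajectory_", "kBET_", "lisi_graph_", "ilisi_", "clisi_",
)

# Values documented for type_; None is the default in the signature.
VALID_TYPES = ("full", "embed", "knn")


def build_metric_kwargs(enabled, type_=None, **extra):
    """Hypothetical helper: turn a set of flag names into keyword arguments.

    Flags listed in `enabled` are set to True, all others to False.
    `extra` passes through any other documented parameter, e.g. embed="X_pca".
    """
    unknown = set(enabled) - set(METRIC_FLAGS)
    if unknown:
        raise ValueError(f"unknown metric flags: {sorted(unknown)}")
    if type_ is not None and type_ not in VALID_TYPES:
        raise ValueError(f"type_ must be one of {VALID_TYPES} or None")

    kwargs = {flag: flag in enabled for flag in METRIC_FLAGS}
    kwargs["type_"] = type_
    kwargs.update(extra)
    return kwargs


kwargs = build_metric_kwargs(
    {"ari_", "nmi_", "silhouette_"}, type_="embed", embed="X_pca"
)

# With scib installed, the call would then look roughly like this:
# import scib
# results = scib.metrics.metrics(
#     adata, adata_int, batch_key="batch", label_key="cell_type", **kwargs
# )
```

Building the dictionary first makes it easy to check which metrics are enabled and to catch a misspelled flag (for example a missing trailing underscore) before a long computation starts.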