scib.metrics.kBET
- scib.metrics.kBET(adata, batch_key, label_key, type_, embed=None, scaled=True, return_df=False, verbose=False)
kBET score
Compute the k-nearest neighbour batch effect test (kBET) score, averaged over labels. This function wraps the implementation by Büttner et al. 2019. kBET measures the bias of a batch variable in the kNN graph: it is quantified as the average rejection rate of Chi-squared tests that compare local and global batch label distributions, so smaller raw values indicate better batch mixing. By default, the original kBET score is scaled between 0 and 1 so that larger scores indicate better batch mixing.
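The rejection-rate idea can be illustrated with a toy sketch. This is not the wrapped implementation: it assumes exactly two batches (so the Chi-squared test has one degree of freedom and its p-value has a closed form), uses brute-force neighbour search, and the names `kbet_rejection_rate`, `k` and `alpha` are chosen here for illustration only.

```python
import math
import random


def chi2_sf_df1(stat):
    """Survival function of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def kbet_rejection_rate(points, batches, k=10, alpha=0.05):
    """Toy rejection rate for exactly two batches labelled 0 and 1 (illustrative only)."""
    n = len(points)
    # Global batch frequencies define the expected composition of a neighbourhood.
    global_freq = [batches.count(b) / n for b in (0, 1)]
    expected = [k * f for f in global_freq]
    rejected = 0
    for i in range(n):
        # Brute-force k nearest neighbours of cell i (the cell itself is included).
        order = sorted(range(n), key=lambda j: math.dist(points[i], points[j]))
        neighbours = order[:k]
        observed = [sum(1 for j in neighbours if batches[j] == b) for b in (0, 1)]
        # Pearson chi-squared statistic: local vs global batch distribution.
        stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
        if chi2_sf_df1(stat) < alpha:
            rejected += 1
    return rejected / n


rng = random.Random(0)
batches = [i % 2 for i in range(100)]

# Well mixed: both batches are drawn from the same unit square.
mixed_points = [(rng.random(), rng.random()) for _ in range(100)]
# Separated: batch 1 is shifted far away from batch 0.
separated_points = [(rng.random() + 10 * b, rng.random() + 10 * b) for b in batches]

mixed_rate = kbet_rejection_rate(mixed_points, batches)
separated_rate = kbet_rejection_rate(separated_points, batches)
print(mixed_rate, separated_rate)
```

In the separated case every neighbourhood contains a single batch, so every test is rejected and the rate is 1; in the mixed case only a small fraction of neighbourhoods should be rejected.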
- Parameters:
adata – anndata object to compute kBET on
batch_key – name of batch column in adata.obs
label_key – name of cell identity labels column in adata.obs
type_ – type of data integration, one of 'knn', 'embed' or 'full'
embed – embedding key in adata.obsm for embedding and feature input
scaled – whether to scale between 0 and 1, with 0 meaning low batch mixing and 1 meaning optimal batch mixing; if scaled=False, 0 means optimal batch mixing and 1 means low batch mixing
- Returns:
kBET score (average of kBET per label) based on observed rejection rate. If return_df=True, also return a pd.DataFrame with kBET observed rejection rate per cluster
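How the per-label rejection rates relate to the returned score can be sketched as follows. This is an assumption-laden illustration, not the library's code: the helper `summarise_kbet` is hypothetical, the scaling is assumed to be `1 - mean rejection rate` (which matches the 0/1 meanings given for `scaled`), and labels without a score (NaN) are assumed to be skipped.

```python
import math


def summarise_kbet(rejection_by_label, scaled=True):
    """Hypothetical helper: average per-label rejection rates, skipping NaN entries."""
    valid = [r for r in rejection_by_label.values() if not math.isnan(r)]
    mean_rejection = sum(valid) / len(valid)
    # Assumed scaling: flip the rejection rate so that 1 means optimal batch mixing.
    return 1.0 - mean_rejection if scaled else mean_rejection


# Made-up per-label rejection rates, for illustration only.
rates = {"B cell": 0.10, "T cell": 0.30, "NK cell": float("nan")}
scaled_score = summarise_kbet(rates)            # larger is better
raw_score = summarise_kbet(rates, scaled=False)  # smaller is better
print(round(scaled_score, 3), round(raw_score, 3))
```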
This function can be applied to all integration output types and recomputes the kNN graph for feature and embedding output with specific parameters. Thus, no preprocessing is required, but the correct output type must be specified in type_.
Examples
# full feature integration output or unintegrated data
scib.me.kBET(
    adata, batch_key="batch", label_key="celltype", type_="full", embed="X_pca"
)

# embedding output
scib.me.kBET(
    adata, batch_key="batch", label_key="celltype", type_="embed", embed="X_emb"
)

# kNN output
scib.me.kBET(adata, batch_key="batch", label_key="celltype", type_="knn")
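When several integration outputs are evaluated in a loop, the choice of type_ and embed could be kept in one place. The helper below is hypothetical (it is not part of scib); the embedding keys "X_pca" and "X_emb" and the column names "batch" and "celltype" are simply taken from the examples above and may differ in your data.

```python
def kbet_kwargs(output_type, batch_key="batch", label_key="celltype"):
    """Hypothetical helper: build keyword arguments for a kBET call per output type."""
    # Embedding keys follow the examples above; adjust them to your adata.obsm keys.
    embed_by_type = {"full": "X_pca", "embed": "X_emb", "knn": None}
    if output_type not in embed_by_type:
        raise ValueError(
            f"type_ must be one of {sorted(embed_by_type)}, got {output_type!r}"
        )
    kwargs = {"batch_key": batch_key, "label_key": label_key, "type_": output_type}
    # kNN output needs no embedding key, so embed is only set for the other types.
    if embed_by_type[output_type] is not None:
        kwargs["embed"] = embed_by_type[output_type]
    return kwargs


for output_type in ("full", "embed", "knn"):
    print(output_type, kbet_kwargs(output_type))
```

With such a helper, a call would look like scib.me.kBET(adata, **kbet_kwargs("embed")).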