scib.metrics.pc_regression
- scib.metrics.pc_regression(data, covariate, pca_var=None, n_comps=50, svd_solver='arpack', linreg_method='sklearn', verbose=False, n_threads=1)
Principal component regression
Compute the overall variance contribution given a covariate according to the following formula:
\[Var(C|B) = \sum^G_{i=1} Var(C|PC_i) \cdot R^2(PC_i|B)\]
for \(G\) principal components (\(PC_i\)), where \(Var(C|PC_i)\) is the variance of the data matrix \(C\) explained by the i-th principal component, and \(R^2(PC_i|B)\) is the \(R^2\) of the i-th principal component regressed against a covariate \(B\).
- Parameters:
data – Expression or PC matrix. Assumed to be a PC matrix if pca_var is given.
covariate – series or list of batch assignments
n_comps – Number of PCA components to compute; only used when pca_var is not given. If pca_var is not given and n_comps=None, PCA is computed without reducing the data.
pca_var – Iterable of variances for n_comps components. If pca_var is not None, data is assumed to already contain PCs; otherwise PCA is computed on data.
linreg_method – Regression backend. One of 'sklearn', 'numpy', or 'sequential'.
'sequential' calls linreg_sklearn(), the original implementation that fits one model per PC and is typically much slower.
'sklearn' calls linreg_multiple_sklearn(), a multi-output linear regression backend.
'numpy' calls linreg_multiple_np(), a vectorized numpy backend with a categorical one-way ANOVA shortcut.
svd_solver
n_threads – Number of threads passed to the selected regression backend.
verbose
- Returns:
Variance contribution of regression
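- Example:
The formula above can be illustrated with a minimal NumPy-only sketch. This is not scib's implementation: the helper name pc_regression_sketch is hypothetical, PCA is done with a plain SVD, the covariate is assumed to be categorical and is one-hot encoded, and \(Var(C|PC_i)\) is assumed to be expressed as a fraction of the total variance of the retained components, so the result lies between 0 and 1.

```python
import numpy as np


def pc_regression_sketch(data, covariate, n_comps=None):
    """Illustrative PC regression score (hypothetical helper, not scib's code)."""
    X = np.asarray(data, dtype=float)
    Xc = X - X.mean(axis=0)  # center features before PCA

    # PCA via SVD: PC scores are U * S, variances are proportional to S**2
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    if n_comps is not None:
        U, S = U[:, :n_comps], S[:n_comps]
    pcs = U * S
    # Assumption: variance per PC as a fraction of the retained total
    var_frac = S**2 / np.sum(S**2)

    # One-hot design matrix for a categorical covariate
    labels, codes = np.unique(np.asarray(covariate), return_inverse=True)
    design = np.eye(len(labels))[codes]

    # Regress all PCs on the covariate at once and compute R^2 per PC
    coef, *_ = np.linalg.lstsq(design, pcs, rcond=None)
    ss_res = ((pcs - design @ coef) ** 2).sum(axis=0)
    ss_tot = ((pcs - pcs.mean(axis=0)) ** 2).sum(axis=0)

    # PCs with (numerically) zero variance contribute nothing
    r2 = np.zeros_like(ss_tot)
    ok = ss_tot > 1e-12 * ss_tot.max()
    r2[ok] = 1.0 - ss_res[ok] / ss_tot[ok]

    # Var(C|B) = sum_i Var(C|PC_i) * R^2(PC_i|B)
    return float(np.sum(var_frac * r2))


batch = ["a", "a", "b", "b"]
# All variation follows the batch labels
confounded = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
# Same values, but both batches have identical means
balanced = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]])

score_confounded = pc_regression_sketch(confounded, batch)
score_balanced = pc_regression_sketch(balanced, batch)
print(score_confounded, score_balanced)
```

In this toy setup the first matrix varies only between batches, so its score is close to 1, while the second has identical batch means, so its score is close to 0.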