Fig. 5 | Genome Biology

Fig. 5

From: Analytic Pearson residuals for normalization of single-cell RNA-seq UMI data

Benchmarking the effect of normalization on cell type separation in reduced dimensionality. We used the Zhengmix8eq dataset with eight ground-truth FACS-sorted cell types [42, 43] (3,994 cells) and added ten pseudo-genes expressed in a random group of 50 cells from one type. All HVG selection methods were set up to select 2,000 genes, and all normalization and dimensionality reduction methods reduced the data to 50 dimensions. For details, see “Methods”. a t-SNE embedding after the seurat_v3 HVG selection as implemented in Scanpy, followed by depth normalization, median scaling, square-root transform, and PCA. Colors denote ground-truth cell types; the artificially added type is shown in red. b t-SNE embedding after HVG selection by Pearson residuals (θ=100), followed by transformation to Pearson residuals (θ=100) and PCA. The black arrow points to the artificially added type. c Macro F1 score (harmonic mean of precision and recall, averaged across classes to counteract class imbalance) for kNN classification (k=15) of the nine ground-truth cell types (the eight original types plus the artificially added one) for each of the 70 combinations of HVG selection and data transformation approaches
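
The two quantities named in the caption, Pearson residuals with θ=100 and the macro F1 score, can be illustrated with a minimal NumPy sketch. This is not the authors' code. It assumes the standard negative binomial form of the residuals, with expected counts taken from the row and column sums, and clipping to ±√n, where n is the number of cells. The caption states none of these details. The function names `pearson_residuals` and `macro_f1` are hypothetical.

```python
import numpy as np


def pearson_residuals(counts, theta=100.0, clip=True):
    """Pearson residuals for a cells x genes UMI count matrix.

    Assumption: expected counts are mu_cg = (depth of cell c) * (total of
    gene g) / (grand total), with negative binomial variance
    mu + mu^2 / theta. Every cell and gene must have at least one count.
    """
    counts = np.asarray(counts, dtype=float)
    cell_depth = counts.sum(axis=1, keepdims=True)  # shape (cells, 1)
    gene_total = counts.sum(axis=0, keepdims=True)  # shape (1, genes)
    mu = cell_depth @ gene_total / counts.sum()     # expected counts
    residuals = (counts - mu) / np.sqrt(mu + mu**2 / theta)
    if clip:
        # Assumed clipping threshold: square root of the number of cells.
        bound = np.sqrt(counts.shape[0])
        residuals = np.clip(residuals, -bound, bound)
    return residuals


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over the classes in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    scores = []
    for label in np.unique(y_true):
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
        scores.append(2 * tp / denom if denom else 0.0)
    return float(np.mean(scores))


# Toy usage: two cells, two genes; every expected count equals 1.
toy_residuals = pearson_residuals([[2, 0], [0, 2]], theta=100.0)
toy_score = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Because every class contributes equally to the average, a small class such as the 50-cell artificial type weighs as much in the macro F1 score as any of the large FACS-sorted types.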