Fig. 2 | Genome Biology

Fig. 2

From: DISCERN: deep single-cell expression reconstruction for improved cell clustering and cell subtype and state detection

Expression reconstruction benchmark of DISCERN and five state-of-the-art batch correction and imputation algorithms. A Comparison of performance on smartseq2 data. The smartseq2 data were split into a smartseq2-lq and a smartseq2-hq batch. The smartseq2-lq batch was modified by setting to zero the expression of all genes of a cell-type-determining pathway (top ranked by GSEA). The expression of the in silico altered pathway genes was then compared between the reconstructed-hq data and the unaltered smartseq2-hq data. B Correlation of DEG (t-statistics) and pathway enrichment (normalized enrichment scores) between the reconstructed-hq data and the values expected before removal, restricted to the genes that were removed in the smartseq2-lq batch. The smartseq2-lq data were the same as in A. C Mean expression correlation of reconstructed-hq data with the expected expression in smartseq2-hq data for different ratios of lq to hq data. The standard deviation reflects the variation in correlation across cell types. The datasets were created as described in A. D Alpha cells were removed from the smartseq2-hq batch and left in the low-quality batches. The number of cell types shared between the hq and lq data was then varied by removing shared cell types from the lq data before preprocessing and expression reconstruction. The x-axis shows the ratio of the intersection size. The y-axis shows the correlation between the t-statistics of alpha cells from the lq batches vs other cells from the smartseq2 batch and the ground truth from the uncorrected smartseq2 batch. Spearman rank correlation was used for the comparison, since no gene subset was used. E t-SNE visualization of the cell type removal experiment (as in D) with no overlap between lq and hq. F Spearman correlation of the t-statistics of alpha cells as in D. The dataset was the same as in E. The dotted line indicates the correlation achieved without reconstruction.
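The two operations the caption relies on, zeroing a pathway's genes in the lq batch (panel A) and comparing statistics by Spearman rank correlation (panels D and F), can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`zero_out_pathway`, `average_ranks`, `pearson`, `spearman`), the list-of-lists expression layout (cells in rows, genes in columns), and the toy values are all assumptions made for illustration.

```python
from statistics import mean


def zero_out_pathway(expr, gene_names, pathway_genes):
    """Return a copy of expr (cells x genes) with the pathway genes set to 0.

    Hypothetical helper mimicking the in silico removal described for panel A.
    """
    drop = {j for j, gene in enumerate(gene_names) if gene in pathway_genes}
    return [[0.0 if j in drop else v for j, v in enumerate(row)] for row in expr]


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (made up): 2 cells x 3 genes, with gene "B" in the removed pathway.
genes = ["A", "B", "C"]
hq = [[1.0, 5.0, 2.0], [0.5, 4.0, 3.0]]
lq = zero_out_pathway(hq, genes, {"B"})

# Toy per-gene t-statistics: ground truth vs. a hypothetical reconstruction.
t_truth = [2.1, -0.3, 1.4, 0.0]
t_recon = [1.8, -0.1, 1.5, 0.2]
rho = spearman(t_truth, t_recon)
```

A rank-based measure such as this depends only on the ordering of the t-statistics, not on their scale, which is one reason it suits a genome-wide comparison like the one in panels D and F.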
