Fig. 5 | Genome Biology

Fig. 5

From: Gene set enrichment analysis for genome-wide DNA methylation data

Evaluation of the performance of GOregion on sorted blood cell data. A Bias plot showing that genes with more measured CpGs are more likely to have a differentially methylated region (DMR). This plot is produced from EPIC array sorted blood cell type data, comparing B cells to NK cells. B Number of DMRs identified by DMRcate for each cell type comparison: CD4 T cells vs. CD8 T cells, monocytes vs. neutrophils and B cells vs. NK cells. The blue bar is the number of DMRs before filtering; the pink bar is the number of DMRs after filtering out DMRs with < 3 underlying CpGs and an absolute mean |Δβ| < 0.1. C Cumulative number of GO terms, as ranked by GOregion and a simple hypergeometric test (HGT), that are present in each truth set for the B cells vs. NK cells comparison. ISP Terms = immune-system process child terms truth set; RNAseq Terms = top 100 terms from RNAseq analysis of the same cell types. D Bubble plots of the top 10 GO terms as ranked by GOregion and a simple HGT for the B cells vs. NK cells comparison. The size of the bubble indicates the relative number of genes in the set. The color of the bubble indicates whether the term is present in the RNAseq truth set only (purple), the ISP truth set only (green), both (red) or neither (blue). E Upset plot showing the characteristics of the CpGs selected as “significant” for the B cells vs. NK cells comparison by three strategies: a probe-wise differential methylation analysis using a significance cutoff (FDR < 0.05), the top 5000 CpGs as ranked by the probe-wise analysis (Top 5000), or the CpGs underlying the filtered DMRcate regions (DMRcate). The probe-wise analysis with FDR < 0.05 identified over 60,000 CpGs as “significant” and had the most unique CpGs. Although DMRcate identified fewer “significant” CpGs (~25,000), almost half of them (~12,000) are unique. F Proportion of “significant” CpGs that are annotated to genes, as identified by the three different strategies.
G Upset plot showing the characteristics of the genes that “significant” CpGs are annotated to, as identified by the three different strategies, for the B cells vs. NK cells comparison. CpGs identified by the probe-wise analysis with FDR < 0.05 map to over 12,000 genes. Although the CpGs identified by DMRcate map to far fewer genes (~2900), a number of them are unique to this approach.
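
For readers unfamiliar with the "simple hypergeometric test (HGT)" used as the baseline in panels C and D, the following is a minimal Python sketch of an upper-tail hypergeometric enrichment p-value for one gene set. It is an illustration only and is not the GOregion method; a plain HGT of this kind treats all genes as equally likely to be selected and so does not adjust for the bias shown in panel A. The function name, argument names and the numbers in the usage lines are hypothetical and are not taken from the article.

```python
from math import comb


def hypergeom_enrichment_p(overlap, set_size, n_significant, n_universe):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    overlap       -- significant genes that belong to the gene set
    set_size      -- genes in the gene set (within the universe)
    n_significant -- genes called significant
    n_universe    -- all genes considered
    All names are illustrative assumptions, not an existing API.
    """
    max_overlap = min(set_size, n_significant)
    # Number of ways to choose the significant genes from the universe.
    total = comb(n_universe, n_significant)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(set_size, i) * comb(n_universe - set_size, n_significant - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy usage with made-up counts: 40 of 500 significant genes fall in a
# 300-gene set, out of a universe of 20,000 genes.
p_value = hypergeom_enrichment_p(40, 300, 500, 20000)
print(p_value)
```

Ranking GO terms by this p-value gives an ordering analogous to the HGT curves in panel C, under the assumption that every gene in the universe is equally likely to be called significant.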
