Fig. 2 | Genome Biology

Fig. 2

From: Intergenic disease-associated regions are abundant in novel transcripts

Defining properties of novel transcripts. a Previous observation of portions of captured transcripts in public databases. Percent of captured transcripts overlapping previously annotated transcripts in GENCODE at the time of the experiment design (v.12), GENCODE v.19 and v.27, AceView, MiTranscriptome, and the EST database. Gray shades indicate the length of overlap between the novel transcript and the previously observed sequences. b Aggregated data for cap analysis gene expression (CAGE) clusters, centered on the 5’ end of captured transcripts. The positive control was defined as lncRNA transcripts with the same median of the expression distribution across tissues as the captured transcripts, based on Illumina Body Map data. X-axes indicate distance from the 5’ end of transcripts in base pairs. Y-axes represent counts of CAGE clusters, normalized by the number of transcripts (see “Methods”). c Fraction of promoters of captured transcripts, lncRNAs, pseudogenes, and protein-coding genes occupied by CAGE and epigenetic marks: CAGE (blue), H3K4me3 (red), H3K27ac (yellow), H3K4me1 (purple). Hollow circles represent randomized controls, in which CAGE and epigenetic peaks were randomly distributed across the genome.
