Fig. 4 | Genome Biology

Fig. 4

From: Statistical learning quantifies transposable element-mediated cis-regulation


Cis-regulatory activities are more pronounced at epigenetically active TEs. A Overview of the procedure whereby TE subfamilies are split into so-called “functional” and “non-functional” fractions based on additional evidence, e.g., differential chromatin accessibility. The regulatory susceptibility scores tying TE subfamilies to protein-coding genes are distributed between the functional and non-functional fractions of each TE subfamily, leading to an experiment-specific column-wise expansion of N. Concretely, the functional and non-functional fractions of a TE subfamily are treated as independent TE subfamilies in the subsequent cis-regulatory activity estimation process. B Estimated differences in cis-regulatory activity for the functional (in red) and non-functional (in blue) fractions of LTR5-Hs and SVA subfamilies under CRISPRi-mediated epigenetic repression in naïve hESCs [33]. The cis-regulatory activities for the unsplit subfamilies were estimated in a separate iteration of craTEs, using the standard distance-weighted N matrix (\(L=250\) kb), and are shown in black. The dotted line represents the significance threshold, a BH-adjusted \(p\)-value of 0.05. Note that even though only selected subfamilies are plotted for clarity, all TE subfamilies were included in the fitting process. C Estimated differences in cis-regulatory activity for the functional and non-functional fractions of selected TE subfamilies, according to definitions of the functional state based either on differential chromatin states (1st and 3rd panels from the left) or on differential TF binding (2nd and 4th panels from the left) at integrants [33]. D Estimated differences in cis-regulatory activity for the functional (bound by both GATA6 and EOMES [70]) vs. non-functional fractions of selected TE subfamilies during hESC-derived endoderm differentiation, 48 h vs. 24 h, \(n=3\) [70] (left); for the functional (GATA6-bound [70]) vs. non-functional fractions of selected TE subfamilies upon GATA6 KO in iPSC-derived mesendoderm, \(n=2\) [69] (center); and upon GATA6 rescue in GATA6 KO iPSC-derived mesendoderm, \(n=2\) (right). E Estimated differences in cis-regulatory activity for the functional (SOX15-bound) vs. non-functional fractions of selected TE subfamilies between DP hESC-derived hPGCLCs and DN somatic cells, day 6, \(n=2\) [86] (top) and upon SOX15 KO in DP hESC-derived hPGCLCs (bottom). F Multiple sequence alignment (MSA) of all 152 LTR6B integrants considered by craTEs (central white rectangle). Gray patches within the central white rectangle indicate gaps. Sequences at loci found in the gray rectangles flanking the MSA region are shown for convenience and were not aligned. The intensity of GATA6 (left, \(n=1\)) and EOMES (right, \(n=2\)) ChIP-seq signal is indicated at the corresponding genomic loci. The fraction of sequences covered by ChIP-seq signal at each position is shown on top. Consensus sequences found underneath high-density ChIP-seq signal regions (\(>\frac{1}{3}\) of sequences overlapping ChIP-seq reads) with the highest density of GATA6 (underlined) or EOMES (entire consensus) signal are reported, with GATA consensus DNA-binding sites in bold.
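
The column-wise expansion of N described in panel A can be illustrated with a minimal NumPy sketch. It is not the craTEs implementation: the function name `split_susceptibility`, the per-integrant input layout (one gene index, subfamily index, weight, and functional flag per integrant-gene pair), and the column ordering are assumptions made for illustration, and for simplicity every subfamily is split, whereas the caption only states that the expansion is experiment-specific.

```python
import numpy as np

def split_susceptibility(gene_idx, subfam_idx, weights, is_functional,
                         n_genes, n_subfams):
    """Build a column-expanded susceptibility matrix (hypothetical layout).

    Column 2*j holds the functional fraction of subfamily j and
    column 2*j + 1 holds its non-functional fraction.
    """
    gene_idx = np.asarray(gene_idx, dtype=int)
    subfam_idx = np.asarray(subfam_idx, dtype=int)
    weights = np.asarray(weights, dtype=float)
    is_functional = np.asarray(is_functional, dtype=bool)

    # Route each integrant's weight to the functional or non-functional column.
    cols = 2 * subfam_idx + (~is_functional).astype(int)

    n_split = np.zeros((n_genes, 2 * n_subfams))
    # np.add.at accumulates correctly when (gene, column) pairs repeat.
    np.add.at(n_split, (gene_idx, cols), weights)
    return n_split

# Toy data (made up): 3 genes, 2 subfamilies, 4 integrant-gene pairs.
gene_idx = [0, 0, 1, 2]
subfam_idx = [0, 0, 1, 1]
weights = [0.5, 0.25, 1.0, 0.75]
is_functional = [True, False, True, True]

N_split = split_susceptibility(gene_idx, subfam_idx, weights,
                               is_functional, n_genes=3, n_subfams=2)

# Summing the two fractions of each subfamily recovers the unsplit matrix.
N_unsplit = N_split[:, 0::2] + N_split[:, 1::2]
```

Under this layout, the functional and non-functional columns of a subfamily sum to its unsplit column, so the split only redistributes susceptibility scores without changing their total; each fraction can then enter the fit as if it were an independent TE subfamily.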
