Table 3 Frequency of occurrence of pairs of the ten most significant motifs in the same core promoter

From: Computational analysis of core promoters in the Drosophila genome

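Columns 1 to 10 give the percent of promoters with each motif (row) that also contain the indicated second motif (column).

| Motif | Percent of promoters with motif | 1 | 2 DRE | 3 TATA | 4 Inr | 5 | 6 | 7 | 8 | 9 DPE | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 25.1 | 100.0 | 21.3 | 13.1 | 12.7 | 20.5 | 28.3 | 27.0 | 27.0 | 4.9 | 6.1 |
| 2 | 26.0 | 20.6 | 100.0 | 14.9 | 16.8 | 20.0 | 14.1 | 33.1 | 19.4 | 5.7 | 6.9 |
| 3 | 19.3 | 17.1 | 20.1 | 100.0 | 28.9 | 13.9 | 14.4 | 12.6 | 24.9 | 4.8 | 9.4 |
| 4 | 26.3 | 12.1 | 16.6 | 21.1 | 100.0 | 14.1 | 12.1 | 12.9 | 25.2 | 14.9 | 12.9 |
| 5 | 18.5 | 27.9 | 28.1 | 14.5 | 20.1 | 100.0 | 14.8 | 29.2 | 30.6 | 6.7 | 8.4 |
| 6 | 15.8 | 45.1 | 23.2 | 17.6 | 20.3 | 17.3 | 100.0 | 18.6 | 19.6 | 4.6 | 4.2 |
| 7 | 23.3 | 29.2 | 36.9 | 10.4 | 14.6 | 23.2 | 12.6 | 100.0 | 30.3 | 4.9 | 6.0 |
| 8 | 23.2 | 29.3 | 21.8 | 20.7 | 28.7 | 24.4 | 13.3 | 30.4 | 100.0 | 7.6 | 10.0 |
| 9 | 7.9 | 15.6 | 18.8 | 11.7 | 49.4 | 15.6 | 9.1 | 14.3 | 22.1 | 100.0 | 8.4 |
| 10 | 8.5 | 18.2 | 21.2 | 21.2 | 40.0 | 18.2 | 7.9 | 16.4 | 27.3 | 7.9 | 100.0 |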

  1. The first column lists the motifs given in Table 2. The second column shows the percentage of promoters with a hit to the corresponding weight matrix model (p-value threshold 1.0e-3). Each of the other columns is labeled with a motif number, and the intersection of a row and a column shows the frequency with which the two motifs occur in the same core promoter. We did not normalize for the different sizes of the subsets, but entries in the same column can be compared. As we set all thresholds to deliver the same false-positive rate of one in 1,000 nucleotides, we would expect 8.5% of random sequences to contain a match to motifs 1, 5-8 and 10, as the length of the sequence searched allows for 85 different alignment positions of a 15-base motif. Because the sequence windows searched for the other motifs were smaller, the expected false-positive rate was reduced to 4.5% for the TATA box, 3.0% for the Inr, and 1.5% for DPE. Note that the percentage of promoters with TATA boxes or Inr motifs is much lower when estimated using the weight matrix models and Patser than when using matches to the more degenerate consensus strings.
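
The expected false-positive percentages in the note follow from multiplying the per-position rate by the number of alignment positions searched. The sketch below reproduces that arithmetic. Only the 85 positions are stated in the note; the 45, 30 and 15 positions used for TATA, Inr and DPE are back-calculated from the quoted 4.5%, 3.0% and 1.5% and are an assumption, as are the function names. The second function shows the probability of at least one chance match, assuming independent positions, which comes out slightly below the linear estimate.

```python
# Per-position false-positive rate quoted in the table note (one in 1,000).
FALSE_POSITIVE_RATE = 1.0e-3


def expected_hits(positions, rate=FALSE_POSITIVE_RATE):
    """Linear estimate used in the note: positions times per-position rate."""
    return positions * rate


def prob_at_least_one(positions, rate=FALSE_POSITIVE_RATE):
    """Chance of one or more spurious matches, assuming independent positions."""
    return 1.0 - (1.0 - rate) ** positions


# 85 positions is stated in the note. The other three values are inferred
# from the quoted percentages (assumption, not given in the source).
WINDOW_POSITIONS = {
    "motifs 1, 5-8, 10": 85,
    "TATA": 45,
    "Inr": 30,
    "DPE": 15,
}

if __name__ == "__main__":
    for name, positions in WINDOW_POSITIONS.items():
        linear = 100.0 * expected_hits(positions)
        exact = 100.0 * prob_at_least_one(positions)
        print(f"{name}: linear {linear:.1f}%, at least one hit {exact:.2f}%")
```

These background rates are worth keeping in mind when reading the table: an observed frequency such as 7.9% for DPE is compared against an expected 1.5% by chance, not against zero.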