Table 7 Comparison of conservation scores between the highest-scoring k-mers and position weight matrices (PWMs) for 20 known regulatory elements in S. cerevisiae, obtained by comparing S. cerevisiae and S. bayanus

From: Fast and systematic genome-wide discovery of conserved regulatory elements using a non-alignment based approach

| Name | k-mer sequence | k-mer score | PWM consensus | PWM score |
|---|---|---|---|---|
| Bas1 | AAGAGTCA | 93.8* | [AG][AG]NANGAGTCA | 80.9 |
| Cbf1 | CACGTGA | 421.3* | [AG][AG]TCACGTG | 406.5 |
| Fkh1/2 | TAAACAA | 110.3 | GTAAACAA[AT] | 114.1* |
| Gcn4 | TGACTCA | 93.4 | [AG][AG]TGA[CG]TCA | 135.4* |
| Gcr1 | TGGAAGC | 82.7* | [AG]GCTTCCT[CG]T | 42.7 |
| Hap4 | CCAATCA | 104.2* | G[AG][AG]CCAATCA | 96.6 |
| Ino4 | CATGTGA | 91.2* | CAT[CG]TGAAAA | 61.1 |
| Mbp1 | ACGCGTC | 204.1 | ACGCGTNA[AG]N | 210.2* |
| Msn2/4 | AAAGGGG | 140.1 | A[AG]GGGG | 169.7* |
| PAC | GCGATGAG | 404.6 | GCGATGAGNT | 520.3* |
| Pdr3 | CCGCGGA | 76.9 | [CG]NNTCCG[CT]GGAA | 102.5* |
| Rap1 | TGGGTGT | 103.8 | [AG]TGTN[CT]GG[AG]TG | 253.2* |
| Reb1 | CGGGTAA | Inf | [CG]CGGGTAA[CT] | Inf |
| Rpn4 | TTTGCCACC | 218.6 | GGTGGCAAAA | 259.4* |
| RRPE | AAAAATTT | 509.9* | TGAAAAATTT | 388.80 |
| Ste12 | TGAAACA | 81.4 | ANNNTGAAACA | 100.0* |
| Sum1/Ndt80 | TGACACA | 135.4* | [AG][CT]G[AT]CA[CG][AT]AA[AT] | 100.0 |
| Swi4 | CGCGAAA | 224.1* | NNNNC[AG]CGAAAA | 116.6 |
| Ume6 | TAGCCGCC | 377.2 | TCGGCGGC[AT]A | 410.0* |
| Xbp1 | CCTCGAG | 86.7 | GCCTCGA[AG]G[AC]G[AG] | 141.7* |

*Indicates the regulatory element representation (k-mer or weight matrix) that obtained the higher conservation score. Inf denotes very large conservation scores, which arise when taking the negative natural logarithm of near-zero hypergeometric p-values.
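
The footnote's scoring rule (negative natural logarithm of a hypergeometric p-value, with Inf for near-zero p-values) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the parameter names (`population`, `successes`, `draws`, `observed`) and the use of the upper-tail probability are assumptions made for illustration, and how the paper maps them to counts of genes or promoters is not stated in this table.

```python
import math
from fractions import Fraction


def hypergeom_tail(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric X.

    Hypothetical parameterisation: `successes` items of interest in a
    `population`, of which `draws` are sampled without replacement.
    """
    total = math.comb(population, draws)
    upper = min(successes, draws)
    numerator = sum(
        math.comb(successes, i) * math.comb(population - successes, draws - i)
        for i in range(observed, upper + 1)
    )
    # Exact rational arithmetic; no rounding until conversion to float.
    return Fraction(numerator, total)


def conservation_score(population, successes, draws, observed):
    """Negative natural log of the p-value; inf if the p-value underflows."""
    p = float(hypergeom_tail(population, successes, draws, observed))
    if p == 0.0:
        # The p-value is too small for a double, so -ln(p) is reported as inf.
        return math.inf
    return -math.log(p)


if __name__ == "__main__":
    # Small case: p = 1/252, so the score is ln(252), about 5.53.
    print(conservation_score(10, 5, 5, 5))
    # Extreme case: p underflows to 0.0, so the score prints as inf.
    print(conservation_score(6000, 1500, 1500, 1500))
```

Under these assumptions, an Inf entry such as the one for Reb1 would not mean the element is infinitely conserved, only that its p-value was too small to represent in floating point.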