Table 9 AUROC per bacterial species in the validation data set and mean AUROC ± standard deviation for each model

From: Promotech: a general tool for bacterial promoter recognition

| Model | M. smegmatis | L. phytofermentans | B. amyloliquefaciens | R. capsulatus | Mean AUROC ± SD |
|---|---|---|---|---|---|
| RF-HOT | 0.939 | 0.591 | 0.640 | 0.660 | 0.708 ± 0.157 |
| RF-TETRA | 0.814 | 0.608 | 0.837 | 0.674 | 0.733 ± 0.110 |
| GRU-0 | 0.630 | 0.488 | 0.496 | 0.577 | 0.548 ± 0.068 |
| GRU-1 | 0.601 | 0.487 | 0.502 | 0.566 | 0.539 ± 0.054 |
| LSTM-3 | 0.622 | 0.489 | 0.481 | 0.553 | 0.536 ± 0.066 |
| LSTM-4 | 0.592 | 0.498 | 0.506 | 0.546 | 0.536 ± 0.043 |
| MULTiPly | 0.684 | 0.470 | 0.700 | 0.593 | 0.612 ± 0.106 |
| iPro70-FMWin | 0.642 | 0.587 | 0.779 | 0.575 | 0.646 ± 0.093 |
| bTSSFinder | (0.272, 0.265) | (0.944, 0.924) | (0, 0) | (0.250, 0.245) | NA |
| G4PromFinder | (0.938, 0.932) | (0.216, 0.269) | (0.339, 0.554) | (0.960, 0.953) | NA |
| BProm | (0.006, 0.002) | (0.560, 0.398) | (0.421, 0.181) | (0.011, 0.007) | NA |

  1. AUROC is roughly the probability that a randomly chosen positive instance receives a higher predicted promoter probability than a randomly chosen negative instance. These results were obtained on data sets not seen during training, with a 1:1 ratio of positive to negative instances. The numbers in bold indicate the model with the highest AUROC. For BPROM, bTSSFinder, and G4PromFinder, the numbers in parentheses are the true-positive rate and false-positive rate, reported because these tools do not provide a probability for each instance in the data set.
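
The quantities in this table can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names `auroc_pairwise` and `tpr_fpr` are hypothetical, and the toy scores and labels are invented for illustration. The first function computes AUROC directly from the pairwise interpretation given in the footnote, the second computes the (true-positive rate, false-positive rate) pair for tools that only output hard predictions, and the last lines recompute the mean ± standard deviation of the RF-TETRA row. Using the sample standard deviation there is an assumption, although it appears consistent with the reported value.

```python
from statistics import mean, stdev


def auroc_pairwise(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher.

    Ties count as half a win, which makes the result equal to the area
    under the ROC curve.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def tpr_fpr(labels, predictions):
    """True-positive and false-positive rates from hard 0/1 predictions."""
    tp = sum(1 for y, y_hat in zip(labels, predictions) if y == 1 and y_hat == 1)
    fp = sum(1 for y, y_hat in zip(labels, predictions) if y == 0 and y_hat == 1)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    return tp / n_pos, fp / n_neg


# Toy scores, invented for illustration only (not data from the paper).
toy_pos = [0.9, 0.8, 0.4]
toy_neg = [0.7, 0.3, 0.2]
print(auroc_pairwise(toy_pos, toy_neg))

# Toy hard predictions, as produced by a tool that reports no probabilities.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_preds = [1, 1, 0, 1, 0, 0]
print(tpr_fpr(toy_labels, toy_preds))

# Per-species AUROC values of the RF-TETRA row, copied from the table.
rf_tetra = [0.814, 0.608, 0.837, 0.674]
# stdev() is the sample standard deviation (n - 1 in the denominator).
rf_tetra_summary = f"{mean(rf_tetra):.3f} +/- {stdev(rf_tetra):.3f}"
print(rf_tetra_summary)
```

A single (true-positive rate, false-positive rate) pair is one point on a ROC curve rather than an area under it, which is why the table lists NA as the mean AUROC for the three tools that return only hard predictions.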