Table 1 Statistics of the 25 selected tracks, arranged in the order of the UCSC genome browser

From: AceView: a comprehensive cDNA-supported gene and transcripts annotation

| UCSC track | Model with introns | Model with introns and CDS | Single exon model (some clipped) | Unique introns in mRNA | All introns in mRNA | Input or method |
|---|---|---|---|---|---|---|
| HAVANA Gencode (Sanger, UK) known + putative | 1,691 | 649 | 70 | 3,618 | 9,693 | MEP,CA,H |
| EGASP model submissions | | | | | | |
| AceView (NCBI, US) | 1,630 | 1,460 | 24 | 3,530 | 9,597 | ME,(H) |
| UP Dogfish (Sanger, UK) | 204 | 204 | 15 | 1,679 | 1,679 | CA |
| Exogean (ENS, France) | 554 | 538 | 2 | 2,855 | 6,178 | MEP,CA |
| UP ExonHunter (U Waterloo, Canada) | 807 | 807 | 220 | 3,237 | 3,237 | MEP,CA |
| Fgenesh (U London, UK) | 462 | 458 | 97 | 2,610 | 3,241 | P,CA |
| UP GeneId (IMIM, Spain) | 267 | 267 | 51 | 1,905 | 1,905 | A |
| UP GeneMark (Georgia IT, US) | 551 | 551 | 81 | 2,185 | 2,185 | A |
| UP Jigsaw (TIGR, US) | 259 | 259 | 67 | 2,168 | 2,168 | MEP,CA |
| PairagonAny (Wash U, US) | 471 | 437 | 38 | 2,300 | 3,470 | MEP?,CA |
| UP SGP2 (IMIM, Spain) | 552 | 552 | 159 | 2,645 | 2,645 | P,CA |
| P Twinscan-MARS (Wash U, US) | 547 | 547 | 108 | 2,501 | 4,943 | CA |
| UP Augustus Any (U Göttingen, Germany) | 312 | 316 | 87 | 2,291 | 2,291 | MEP,CA |
| UP GeneZilla (TIGR, US) | 477 | 477 | 179 | 2,758 | 2,758 | A |
| UP Saga (UC Berkeley, US) | 331 | 331 | 47 | 1,737 | 1,737 | CA |
| UCSC gene tracks | | | | | | |
| *Known Gene (UCSC) | 501 | 477 | 53 | 2,264 | 4,427 | MP |
| *P CCDS | 201 | 201 | 14 | 1,296 | 1,508 | MP,H |
| *RefSeq (NCBI, US) | 342 | 325 | 41 | 2,082 | 2,922 | M(E)P,H |
| *MGC | 323 | 310 | 19 | 1,400 | 2,101 | M |
| *Ensembl (EBI, UK) | 427 | 418 | 58 | 2,429 | 3,548 | MEP,CA |
| *AceView (Aug 2005 NCBI) | 1,792 | 1,627 | 902 | 3,812 | 9,792 | ME,(H) |
| *ECgene (Korea) | 3,851 | 3,551 | 2,569 | 3,942 | 30,660 | ME,C |
| *U NscanEst (Wash U, US) | 282 | 252 | 27 | 2,292 | 2,292 | ME,CA |
| *UP GenScan (MIT, US) | 395 | 395 | 59 | 3,042 | 3,042 | A |

  1. The number of models, with or without introns (after clipping at region boundaries), the number of spliced coding models, and the number of unique and multiply used introns are given over the 31 ENCODE test regions. Coded information has been added in front of the track name: an asterisk distinguishes a standard gene track, available genome-wide, from an ENCODE-only track; U marks a track that predicts a unique model per gene; P marks a track that predicts protein-coding regions only. According to their documentation, the programs use different inputs or methods: M, E, and P stand for human mRNA, EST, and protein sequences or alignments, respectively; C stands for conservation, or use of cDNA or protein evidence from other species; A stands for ab initio prediction; H stands for hand curation; and parenthesized letters stand for minimal use of the particular type. Notice the low proportion of Gencode mRNA models with an annotated CDS (649 of 1,691 models with introns).
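
The coding scheme described in this footnote can be unpacked mechanically. The following Python sketch is only an illustration and is not part of the original study: the function names `parse_track` and `decode_methods` and the output field names are hypothetical. The footnote does not define the `?` that appears in `MEP?,CA`, so the sketch assumes it marks the preceding letter as uncertain.

```python
# Illustrative decoder for the Table 1 codes; all names here are hypothetical.
METHOD_CODES = {
    "M": "human mRNA",
    "E": "human EST",
    "P": "human protein",
    "C": "conservation or cross-species cDNA/protein evidence",
    "A": "ab initio prediction",
    "H": "hand curation",
}


def parse_track(label):
    """Split a track label such as '*UP GenScan (MIT, US)' into name and flags."""
    label = label.strip()
    genome_wide = label.startswith("*")  # asterisk: standard genome-wide track
    if genome_wide:
        label = label[1:].lstrip()
    head, _, rest = label.partition(" ")
    # A leading token made only of 'U' and 'P' is treated as the flag prefix.
    flags = head if rest and set(head) <= {"U", "P"} else ""
    return {
        "name": rest if flags else label,
        "genome_wide": genome_wide,
        "unique_model_per_gene": "U" in flags,
        "protein_coding_only": "P" in flags,
    }


def decode_methods(code):
    """Decode an 'Input or method' code such as 'M(E)P,H' into (meaning, usage) pairs."""
    decoded = []
    minimal = False  # True while inside parentheses (minimal use)
    for ch in code:
        if ch == "(":
            minimal = True
        elif ch == ")":
            minimal = False
        elif ch == "?":
            # Assumption: '?' is undefined in the footnote; read it as "uncertain".
            if decoded:
                decoded[-1] = (decoded[-1][0], "uncertain")
        elif ch in METHOD_CODES:
            decoded.append((METHOD_CODES[ch], "minimal" if minimal else "full"))
        elif ch not in ", ":
            raise ValueError(f"unknown method code character: {ch!r}")
    return decoded


if __name__ == "__main__":
    print(parse_track("*UP GenScan (MIT, US)"))
    print(decode_methods("M(E)P,H"))
```

Note that the letter P means different things in the two columns: in front of a track name it flags a protein-coding-only predictor, whereas in the "Input or method" column it denotes protein evidence, which is why the sketch keeps the two decoders separate.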