Table 1 Summary of data types

From: Consistent probabilistic outputs for protein function prediction

| Data type | Description | BP | CC | MF |
|---|---|---|---|---|
| Phenotype | | | | |
| MGI | Mammalian phenotype ontology terms (33) | 1,994 | 2,157 | 1,898 |
| OMIM | Diseases (2,488) associated with human homologs | 998 | 1,166 | 978 |
| Phylogenetic profile | | | | |
| Inparanoid | Orthologs across 21 species | 6,131 | 7,092 | 6,556 |
| Biomart | Orthologs across 18 species | 6,269 | 7,242 | 6,695 |
| Protein domain | | | | |
| Interpro | Functional sites and domains | 7,131 | 8,027 | 7,603 |
| PfamA | Protein domains | 6,790 | 7,648 | 7,239 |
| Protein-protein interaction | | | | |
| PPI | Transferred via orthology from human (OPHID) | 3,273 | 3,690 | 3,509 |
| Gene expression data | | | | |
| Su et al. [9] | Oligonucleotide arrays (55 tissues) | 6,555 | 7,587 | 7,029 |
| Zhang et al. [7] | Affymetrix arrays (61 tissues) | 5,097 | 5,716 | 5,447 |
| SAGE | Tag counts from SAGE library (99% cutoff) | 6,323 | 7,231 | 6,753 |
| Total | | 7,968 | 9,005 | 8,427 |

The table lists the ten data types from [1]. For each data type and each ontology, it gives the number of proteins that are annotated with at least one term of that ontology and for which the data type is available. BP, biological process; CC, cellular component; MF, molecular function.
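The counting rule in the table note can be illustrated with a minimal sketch. It assumes the annotations and the data-type availability are each held as sets of protein identifiers. The function name `coverage_counts` and the toy identifiers `p1` to `p4` are hypothetical and are not taken from the paper or its data.

```python
from typing import Dict, Set


def coverage_counts(
    annotated: Dict[str, Set[str]],
    available: Dict[str, Set[str]],
) -> Dict[str, Dict[str, int]]:
    """Count, per data type and ontology, the proteins that have at least
    one term in that ontology AND have that data type available."""
    return {
        data_type: {
            ontology: len(proteins_with_terms & proteins_with_data)
            for ontology, proteins_with_terms in annotated.items()
        }
        for data_type, proteins_with_data in available.items()
    }


# Toy, made-up inputs (hypothetical protein identifiers, not real data).
annotated = {
    "BP": {"p1", "p2", "p3"},  # proteins with >= 1 biological process term
    "CC": {"p2", "p3", "p4"},  # proteins with >= 1 cellular component term
    "MF": {"p1", "p4"},        # proteins with >= 1 molecular function term
}
available = {
    "PfamA": {"p1", "p2", "p4"},  # proteins with this data type available
    "PPI": {"p3"},
}

counts = coverage_counts(annotated, available)
print(counts["PfamA"])
print(counts["PPI"])
```

Each cell is simply the size of a set intersection, so a protein contributes to a cell only if it is both annotated in that ontology and covered by that data type.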