Figure 1 | Genome Biology

Figure 1

From: Most partial domains in proteins are alignment and annotation artifacts

Distribution of partial domain lengths, Pfam27 and RPD2. (A) Cumulative fraction of domains versus fractional domain length. Cumulative fractions are shown for Pfam27 domains found in proteins marked as not fragments (24 million domains in total, of which 945,100 are <50% of model length, blue squares) and for the RPD2 domains in Pfam27 (290,148 domains, of which 30,030 are <50% of model length, red circles). Also shown are Pfam27 domains from families with more than 200 match states (6.9 million domains, of which 658,089 are <50% partials, blue diamonds). (B) Cumulative number of sequences with increasing domain length. Cumulative fractions are shown for Pfam27 sequences (16 million sequences, 820,000 with <50% partials, blue squares) and for RPD2 sequences (274,000 in total, 27,000 with a domain <50% of model length, red circles). Blue diamonds show sequences containing domains with model length >200 match states (6.3 million sequences, 557,941 with <50% partials).
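
As a reading aid for panel (A), the following is a minimal sketch of how a cumulative-fraction curve over fractional domain length could be computed, assuming fractional length means aligned domain length divided by model length. The function names and the example pairs are hypothetical and are not taken from the paper; only the counts in `caption_counts` come from the caption, and the totals given there in millions are rounded, so the resulting shares are approximate.

```python
from bisect import bisect_right


def fractional_length(aligned_len, model_len):
    """Aligned domain length as a fraction of the model length (assumed definition)."""
    return aligned_len / model_len


def cumulative_fraction(fractions, thresholds):
    """Fraction of domains whose fractional length is <= each threshold."""
    ordered = sorted(fractions)
    total = len(ordered)
    return [bisect_right(ordered, t) / total for t in thresholds]


def partial_share(fractions, cutoff=0.5):
    """Fraction of domains strictly shorter than `cutoff` of the model length."""
    return sum(1 for f in fractions if f < cutoff) / len(fractions)


# Hypothetical (aligned_length, model_length) pairs, for illustration only.
domains = [(40, 100), (95, 100), (210, 220), (60, 250), (180, 200)]
fractions = [fractional_length(a, m) for a, m in domains]
curve = cumulative_fraction(fractions, [0.25, 0.5, 0.75, 1.0])

# Counts quoted in the caption for panel (A): (<50% partials, total domains).
caption_counts = {
    "Pfam27, non-fragment proteins": (945_100, 24_000_000),
    "RPD2 domains in Pfam27": (30_030, 290_148),
    "Pfam27, >200 match states": (658_089, 6_900_000),
}
caption_shares = {name: part / total for name, (part, total) in caption_counts.items()}

if __name__ == "__main__":
    print(curve)
    for name, share in caption_shares.items():
        print(f"{name}: {share:.1%} of domains are <50% of model length")
```

Under these counts, roughly 4% of the Pfam27 non-fragment domains, about 10% of the RPD2 domains, and just under 10% of the domains from long (>200 match state) families fall below half the model length.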
