Figure 3 | Genome Biology

Figure 3

From: Most partial domains in proteins are alignment and annotation artifacts

Longer alignments with extended partial domains. Two proteins annotated with partial [Pfam:PF01544] domains, [UniProt:Q9S9N4:112-411] and [UniProt:A5BS21], were compared to the RPD2 proteins using SSEARCH36 (BLOSUM62 scoring matrix, gap open/extend −11/−1). (A), (B), and (C) show three representative alignments from the 156 RPD2 sequences sharing statistically significant similarity (E() < 10⁻⁶) with [UniProt:Q9S9N4] (residues 112 to 411). Lines indicate the protein sequences; open trapezoidal boxes show the projection of the alignments onto the sequences. Shaded boxes map the [Pfam:PF01544] domains annotated on the proteins. The numbers in the shaded boxes report model start and end coordinates from Pfam27. (A) One ([UniProt:Q9S9N4:UniProt:F2EH86]) of the 67 longer alignments (>200 amino acids) between proteins with short domain annotations (<200 residues). Non-self-alignments in this category ranged from 26% to 99% identical, with E() values from 10⁻¹⁰ to 10⁻¹⁵⁷. (B) Alignment of [UniProt:Q9S9N4] with [UniProt:A5BS21], a short alignment (150 residues) between two much longer proteins. (C) One alignment ([UniProt:Q9S9N4:UniProt:P87149]) representative of the 86 long alignments (>200 residues, E() < 10⁻⁶) to proteins with >50% partial [Pfam:PF01544] domain annotations. (D) One ([UniProt:A5BS21:UniProt:Q9LJN2]) of the five non-self-alignments >200 amino acids between proteins with [Pfam:PF01544] domain annotations <200 amino acids (51% identity, E() < 10⁻⁵⁴). (E) A short alignment (156 residues, 37% identity, E() < 10⁻¹⁵, [UniProt:A5BS21:UniProt:B4FQF3]) where one protein is annotated to contain ≥200 matches to [Pfam:PF01544].
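
The panel categories above rest on two cutoffs quoted in the caption: a significance threshold of E() < 10⁻⁶ and a 200-residue boundary between long and short alignments or domain annotations. The sketch below shows one way such a grouping could be expressed. It is a simplified illustration, not the authors' pipeline: the `Alignment` record, its field names, the `classify` function, and the category labels are all hypothetical, and the example values are made up apart from the alignment lengths and E() bounds that echo panels (D) and (E).

```python
from dataclasses import dataclass

E_CUTOFF = 1e-6      # significance threshold quoted in the caption
LENGTH_CUTOFF = 200  # residues; the caption's long/short boundary


@dataclass
class Alignment:
    """Hypothetical record: one pairwise alignment plus annotated domain lengths."""
    length: int              # number of aligned residues
    evalue: float            # E() value reported for the alignment
    query_domain_len: int    # annotated PF01544 residues on the query
    subject_domain_len: int  # annotated PF01544 residues on the subject


def classify(aln: Alignment) -> str:
    """Assign an illustrative category label using the caption's cutoffs."""
    if aln.evalue >= E_CUTOFF:
        return "not significant"
    if aln.length <= LENGTH_CUTOFF:
        # Comparable to panels (B) and (E): a short alignment.
        return "short alignment"
    if (aln.query_domain_len < LENGTH_CUTOFF
            and aln.subject_domain_len < LENGTH_CUTOFF):
        # Comparable to panels (A) and (D): the alignment is longer
        # than either annotated partial domain.
        return "long alignment, short annotations"
    return "long alignment, longer annotation"


# Made-up domain lengths; alignment lengths and E() bounds echo (D) and (E).
panel_d_like = Alignment(length=300, evalue=1e-54,
                         query_domain_len=120, subject_domain_len=150)
panel_e_like = Alignment(length=156, evalue=1e-15,
                         query_domain_len=120, subject_domain_len=210)

print(classify(panel_d_like))  # long alignment, short annotations
print(classify(panel_e_like))  # short alignment
```

A long, significant alignment between two proteins whose annotated domains are both short is the case the figure highlights, since it suggests the homologous region extends beyond the annotated partial domain.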
