
Table 1 Interkingdom domain fusions and their probable origins

From: Interkingdom gene fusions

Columns: IKF gene (GI number and gene name) and origin of domains; Best 'native' hit (E-value, amino acid residue range, species)/domain function; Best 'alien' hit (E-value, amino acid residue range, species)/domain function; Protein function; Stand-alone paralog of the alien domain; Comment
Archaea      
   Aeropyrum pernix      
5106104_ 2621953_Mth 2633525_Bs Hydroxymethyl- None Pyrococci encode proteins with
APE2400 5e-27; 4e-54; pyrimidine   the same domain organization
Archaeal-bacterial 282-445; 16-272; phosphate kinase   and closest similarity to A. pernix;
  uncharacterized domain hydroxymethyl- involved in thiamine   M. jannaschii encodes a protein
  conserved among pyrimidine phosphate biosynthesis   with the same domain
  archaea (homolog kinase (additional function?)   organization but low similarity;
  of the amino-terminal     Mt encodes a HMP-kinase with
  domain of sialic acid     moderate similarity
  synthase)     
   Methanococcus jannaschii      
1591138_ 2128140_Mj; 7270033_At; Unknown; None The amino-terminal domain is
MJ0434 1e-19; 0.003; possible role   present in several stand-alone
Archaeal- 2-94; 120-222; in stress response   copies in M. jannaschii, but
bacterial-eukaryotic uncharacterized AIG2-like    otherwise, is seen mostly in
  domain stress-related    bacteria; the possibility of
   protein    acquisition of a bacterial gene
      by the Methanococcus lineage
      is conceivable
   Methanobacterium thermoautotrophicum      
2621249_ 5103547_Ap; 1651798_Ssp; Membrane-associated None In Ssp, the amino-terminal
MTH204 1e-34; 0.002; 5-formyl-   domain is fused to another
Archaeal- 137-326; 8-139; tetrahydrofolate   uncharacterized domain. An
eukaryotic/ 5-formyl- uncharacterized cyclo-ligase(?);   ortholog with conserved
bacterial tetrahydrofolate membrane-associated exact function   domain organization is seen
  cyclo-ligase domain unknown   in Mycobacterium, but many
      other bacteria encode stand-
      alone versions of this domain,
      which could be the actual sources
      of horizontal gene transfer
2621673_ 3256572_Ph; 2984130_Aa; GTPase, possible 2621855  
MTH594 3e-10; 6e-19; role in signal   
Archaeal-bacterial 5-137; 233-390; transduction   
  inactivated RecA GTPase    
  domain     
2622642_ 5105992_Ap; 2569943_Axy; Glucose-1-phosphate None  
MTH1523 3e-36; 2e-05; thymidylyl transferase/   
Archaeal-bacterial 5-226; 226-334; glucose-6-phosphate   
  glucose-1-phosphate mannose-6- isomerase   
  thymidylyl transferase phosphate isomerase    
   Bacteria      
Aquifex aeolicus      
2983622_ 2633696_Bs; 2650176_Af; Signal None  
aq_1151 5e-65; 0.005; transduction   
Bacterial-archaeal 325-795; 116-279; c-di-GMP   
  c-di-GMP phospho- PAS/PAC phospho-diesterase   
  diesterase domain    
2984285_ 586875_Bs; 3915955_Mj; Molybdenum None  
aq_2060 4e-63; 3e-09; cofactor   
Bacterial-archaeal 1-252; 270-441; biosynthesis enzyme(?)   
  PHP superfamily pyruvate    
  hydrolase formate-lyase    
   activating enzyme    
   (Fe-S cluster    
   oxidoreductase)    
   Bacillus subtilis      
2632283_yaaH, 4980914_Tm; 399377_Rn; Chitinase 2635915 B. subtilis encodes two
1945087_ydhD 1e-06; 2e-11;    paralogous proteins with the
Bacterial-eukaryotic 2-92; 221-402;    same domain architecture
  LysM repeat domain chitinase    
2633242_yhcR 645819_Dr; 2622704_Mth; Nuclease-nucleotidase None  
Bacterial-archaeal 1e-64; 0.008; (probable repair   
  584-1068; 151-257; enzyme)   
  5'-nucleotidase; nucleic acid-binding    
  1175987_ domain (OB-fold)    
  ECR100;     
  2e-09;     
  377-521;     
  thermonuclease     
2632325_yabN 4981449_Tm; 3873806_Ce; Methyl-transferase/ None Other than in chlamydiae,
Bacterial-eukaryotic 2e-62; 0.003; pyro-phosphatase   the SWI domain is seen
  223-483; 7-125; (metabolic enzyme   in eukaryotic chromatin-
  MazG (predicted pyro- SAM-dependent of an unknown   associated proteins, leading
  phosphatase) methyl-transferase pathway?)   to the suggestion that
      chlamydial topoisomerase
      is involved in chromosome
      condensation
   Chlamydophila pneumoniae      
4377077_ 730965_Bs; 3581917_Sp; DNA topoisomerase I, 7189103 SWI is a typical eukaryotic
CPn0769 e-148; 3e-10; possibly involved in   domain not found in
Bacterial-eukaryotic 1-727; 792-866; chromatin   prokaryotes other than
  DNA topoisomerase I SWI domain condensation   chlamydia (the ortholog
      in Chlamydia trachomatis has the
      same domain architecture)
   Deinococcus radiodurans      
6459294_ 7248325_Sco; 6754878_Mm; DNase None The G9a domain is not
DR1533 0.001; 9e-28;    detectable in other prokaryotes.
Bacterial-eukaryotic 171-265; 4-148;    In eukaryotes, this domain so
  McrA family G9a domain (DNA-    far has been found only as part
  endonuclease binding?)    of multidomain nuclear proteins,
      including transcription factors
   Escherichia coli      
1787179_ 94933_Ppu; 3747107_Rn; Oxidoreductase None The eukaryotic domain is present
b0947 3e-10; 3e-32;    (as a partial sequence) also in the
Bacterial-eukaryotic 287-367; 4-261;    beta-proteobacterium Vogesella.
  ferredoxin uncharacterized    This domain contains a conserved
   domain (thiol    pair of cysteines, which together
   oxidoreductase?)    with the ferredoxin fusion, may
      suggest a thiol oxidoreductase
      activity. Most of the eukaryotic
      proteins containing this domain
      appear to be mitochondrial,
      suggesting the possibility of an
      alternative evolutionary scenario
1787678_ 487713_Sli; 5459012_Pab; Methyl-transferase/ None  
b1410 3e-05; 1e-17; Lipase (exact function   
Bacterial-archaeal 408-522; 33-274; unclear)   
  SAM-dependent lyso-phospholipase    
  methyl-transferase     
1787679_ynbD 1591375_Mj; 7160233_Sp; Membrane-associated None An unusual case of fusion
Archaeal-eukaryotic 4e-04; 1e-06; bifunctional   between an apparently archaeal
  50-218; 346-415; phosphatase   and a typical eukaryotic domain
  membrane-associated tyrosine phosphatase    in a bacterium
  acid phosphatase     
1788589_ 5763950_Sco; 3860247_At; Bifunctional enzyme; None  
b2255 4e-35; 1e-55; exact function unclear   
Bacterial-eukaryotic 1-259; 318-652;    
  methionyl-tRNA dTDP-glucose 4-6-    
  formyl-transferase dehydratase    
1788938_yfiQ 929735_Nsp; 2649370_Af; Acetyl-CoA synthetase/ None  
Bacterial-archaeal/ 8e-32; 4e-85; acetyl-transferase; exact   
eukaryotic 637-874; 6-689; function unclear   
  acetyl-transferase acetyl-CoA synthetase    
   Mycobacterium tuberculosis      
2909507_ 6469244_Sco; 4151109_Tbr; Adenylate cyclase/ 7476546, M. tuberculosis encodes three
Rv2488c, 5e-64; 6e-04; ATPase; probable 7476738 paralogous proteins that consist
2791528_Rv1358, 19-603; 6-167; transcription regulator   of three domains, the eukaryotic-
1419061_ 4726088_Rer; adenylate cyclase    type adenylate cyclase, AP
Rv1358 2e-12;     (apoptotic) ATPase and DNA-
Bacterial-eukaryotic 818-1073     binding response regulator, and
      two stand-alone versions of
      adenylate cyclase, which show the
      closest similarity to the cyclase
      domain of the multidomain
      proteins
1314025_ 120037_Tt; 178213_Hs; Ferredoxin/ 2076681 D. radiodurans also encodes the
Rv0886 1e-11; 4e-65; ferredoxin reductase   eukaryotic-type ferredoxin
Bacterial-eukaryotic 2-79; 93-543;    reductase, but the ferredoxin
  ferredoxin ferredoxin reductase    fusion is unique to mycobacteria
3261732_ 2661695_Sco; 279520_Dd; cAMP-dependent 4455714  
Rv0998 3e-13; 7e-07; acetyl-transferase(?) (M. leprae)  
Bacterial-eukaryotic 148-328; 30-105;    
  acetyl-transferase cAMP-binding domain    
2326726_ 421331_Cvi; 2645721_Mm; Bifunctional enzyme of 1929080  
Rv1683 1e-24; 6e-26; poly (3-hydroxy-butyrate)   
Bacterial-eukaryotic 23-359; 456-972; synthesis   
  poly (3-hydroxy- very-long-chain    
  butyrate) synthase acyl-CoA synthetase    
1403447_ 6752338_Sco; 3892714_At; Polyfunctional enzyme 2661651 In this protein, the domain of
Rv2006 2e-27; 8e-27; of trehalose metabolism   apparent eukaryotic origin
Bacterial-eukaryotic 23-240; 264-521;    is flanked by bacterial domains
  phosphatase; trehalose-6-phosphate    from both sides
  6448751_Sco; phosphatase    
  0.0;     
  534-1320;     
  trehalose hydrolase     
2896788_ 117648_Ec; 3073773_Mm; Polyfunctional enzyme 2337823 The presence of the stand-alone
Rv2051c 1e-16; 4e-31; of lipid metabolism (M. leprae); version of the eukaryotic
Bacterial-eukaryotic 94-514; 588-829;   6468712 domain in Streptomyces suggests
  apolipoprotein dolichol-phosphate-   (Streptomyces an ancient horizontal transfer
  N-acyltransferase mannose synthase   coelicolor)  
2791523_ 6225563_Scy; 1098605_Cnu; Multifunctional enzyme None  
Rv2483c 7e-16; 5e-22; of phospholipid   
Bacterial-eukaryotic 36-253; 289-492; metabolism   
  phosphoserine 1-acyl-sn-    
  phosphatase glycerol-3-phosphate    
   acyltransferase    
2894233_ 2633801_Bs; 4538974_At; Molybdopterin synthase 2076687 The same domain organization
Rv3323c 3e-19; 7e-06;    is seen in D. radiodurans, but in
Bacterial-eukaryotic 89-208; 2-82;    this case, both components
  molybdopterin molybdopterin    appear to be of bacterial origin
  synthase large subunit synthase small subunit    
  (MoaE) (MoaD)    
2960152_ 4753872_Sco; 466119_Ce; cAMP-regulated 2501688 M. tuberculosis encodes two
Rv3728, 1e-35; 7e-20; efflux pump(?)   strongly similar paralogs with
7477551_ 56-428; 549-964;    the same domain architecture
Rv3239c transmembrane cAMP-binding domain-    
Bacterial-eukaryotic efflux protein phosphoesterase    
2960153_ 4731342_Sl; 1591330_Mj; Bifunctional enzyme 1806159 The amino-terminal domain
Rv3729 3e-14; 3e-58; of molybdenum   stand-alone paralog is more
Bacterial-archaeal 510-776; molybdenum cofactor biosynthesis   similar to archaeal homologs
  C5-O-methyl- cofactor biosynthesis    than to the stand-alone paralog,
  transferase protein MoaA    but nevertheless, the latter
  (mitomycin (Fe-S oxidoreductase)    appears to be of archaeal origin
  biosynthesis)     
3261806_ 40487_Cg; 7304009_Dm; Secreted protein 7649504 The stand-alone version of the
Rv3811 3e-12; 2e-12;   (S. coelicolor) eukaryotic domain is present
Bacterial-eukaryotic 404-494; 198-384;    only in Streptomyces
  major secreted peptidoglycan    
  protein recognition protein    
   Treponema pallidum      
3322964_ 7225946_Nm; 320868_Sc; Uridine kinase None A co-linear ortholog is present
TP0667 9e-04; 2e-13;    in Thermotoga
Bacterial-eukaryotic 10-154; 290-488;    
  threonyl-tRNA uridine kinase    
  synthetase (TGS and     
  H3H domains)     
   Thermotoga maritima      
4981276_ 68516_Bs; 3218401_Sp; Uridine kinase None A co-linear ortholog is present
TM0751 3e-07; 2e-11;    in Treponema
Bacterial-eukaryotic 11-200; 288-475;    
  threonyl-tRNA uridine kinase    
  synthetase (TGS and     
  H3H domains)     
Eukaryotes      
   Saccharomyces cerevisiae      
536367_ 586134_Bt; 7450047_Aa; Bifunctional signal- 5249 SurE homologs are not
Ybr094w 9e-10; 8e-09; transduction protein (Yarrowia detectable in eukaryotes other
Eukaryotic/ tubulin-tyrosine ligase acid phosphatase   lipolytica) than yeasts
Bacterial-archaeal   (SurE)    
1431219_ 577625_Hs; 3328426_Ct;    
YDL141w 1e-39; 5e-27;    
Eukaryotic- Biotin-[propionyl- biotin protein ligase Bifunctional biotin- None An ortholog with an identical
bacterial CoA-carboxylase(ATP-   protein ligase   domain architecture is present
  hydrolysing)] ligase     in S. pombe
458922_ 477096_Gg; 1653075_Ssp; Heat shock None An ortholog with an identical
YHR206W 8e-18; 7e-17; transcription   domain architecture is present
Eukaryotic-bacterial 78-216; 375-503; factor   in S. pombe (3327019)
  heat shock CheY domain    
  transcription factor     
  domain 2983676_Aa; Siroheme synthase 2330809 S. pombe also encodes a co-linear
486539_ 1146165_At; 1e-04;   (S. pombe) ortholog (3581882); apparent
YKR069w 3e-34; 22-188;    displacement of the bacterial
Eukaryotic-bacterial 249-556; precorrin-2 oxidase    precorrin-2 oxidase by a distinct
  uroporphyrin III     Rossmann fold domain
  methylase     
1302305_ 4938476_At; 3212189_Hi; Multifunctional enzyme None Co-linear orthologs in S. pombe
YNL256w 5e-65; 5e-05; of folate biosynthesis   (7490442) and Pneumocystis
Eukaryotic-bacterial 324-861; 62-148;    carinii (283062)
  7,8-dihydro-6- 187-297;    
  hydroxymethylpterin- dihydro-neopterin    
  pyro-phosphokinase+ aldolase    
  Dihydro-pteroate     
  synthase     
1419887_ 7297709_Dm; 5918510_Sco; Bifunctional RNA 2213559 The known bacterial homologs
YOL066c 2e-72; 2e-10; modification enzyme (S. pombe) have a two-domain organization;
Eukaryotic-bacterial 42-408; 436-574;    the evolutionary scenario could
  large ribosomal pyrimidine deaminase    have included domain
  subunit pseudoU     rearrangements
  synthase     
1419865_ 2462827_At; 1075360_Hi; Transcriptional regulator None   Yeast encodes three strongly
YOL055c, 1e-39; 6e-24; of thiamine biosynthesis   similar paralogs with identical
2132251_ 22-390; 342-549; genes(?)   domain organization; co-linear
YPL258c, phosphomethyl transcriptional    orthologs are present in other
2132289_ pyrimidine kinase activator    ascomycetes
YPR121w (thiamine biosynthesis)     
Eukaryotic-bacterial      
1370444_ YPL214c 2746079_Bn; 2648451_Af; Bifunctional thiamine None Except for the one from
Eukaryotic-archaeal/ 1e-27; 9e-27; biosynthesis enzyme   A. fulgidus, all highly conserved
Bacterial 9-233; 251-531;    homologs of the kinase domain
  thiamin-phosphate hydroxyethyl-thiazole    of this protein are bacterial; it
  pyro-phosphorylase kinase    appears likely that the A. fulgidus
      gene is the result of horizontal
      transfer
  1. The following complete genomes were analyzed. Archaea: Aeropyrum pernix (Ap); Archaeoglobus fulgidus (Af); Methanococcus jannaschii (Mj); Methanobacterium thermoautotrophicum (Mth); Pyrococcus horikoshii (Ph). Bacteria: Aquifex aeolicus (Aa); Borrelia burgdorferi (Bb); Bacillus subtilis (Bs); Chlamydophila pneumoniae (Cp); Deinococcus radiodurans (Dr); Escherichia coli (Ec); Haemophilus influenzae (Hi); Helicobacter pylori (Hp); Mycobacterium tuberculosis (Mt); Mycoplasma pneumoniae (Mp); Rickettsia prowazekii (Rp); Synechocystis sp. (Ssp); Thermotoga maritima (Tm); Treponema pallidum (Tp). No IKFs were detected in the genomes that are not shown in the table. Additional species name abbreviations: At, Arabidopsis thaliana; Axy, Acetobacter xylinus; Bn, Brassica napus; Ce, Caenorhabditis elegans; Cvi, Chromatium vinosum; Gg, Gallus gallus; Hs, Homo sapiens; Mm, Mus musculus; Rn, Rattus norvegicus; Sco, Streptomyces coelicolor; Sl, Streptomyces lavendulae.