Lateral gene transfer and parallel evolution in the history of glutathione biosynthesis genes
© Copley and Dhillon, licensee BioMed Central Ltd 2002
Received: 28 December 2001
Accepted: 5 March 2002
Published: 29 April 2002
Glutathione is found primarily in eukaryotes and in Gram-negative bacteria. It has been proposed that eukaryotes acquired the genes for glutathione biosynthesis from the alpha-proteobacterial progenitor of mitochondria. To evaluate this, we have used bioinformatics to analyze sequences of the biosynthetic enzymes γ-glutamylcysteine ligase and glutathione synthetase.
γ-Glutamylcysteine ligase sequences fall into three groups: sequences primarily from gamma-proteobacteria; sequences from non-plant eukaryotes; and sequences primarily from alpha-proteobacteria and plants. Although pairwise sequence identities between groups are insignificant, conserved sequence motifs are found, suggesting that the proteins are distantly related. The data suggest numerous examples of lateral gene transfer, including a transfer from an alpha-proteobacterium to a plant. Glutathione synthetase sequences fall into two distinct groups: bacterial and eukaryotic. Proteins in both groups have a common structural fold, but the sequences are so divergent that it is uncertain whether these proteins are homologous or arose by convergent evolution.
The evolutionary history of the glutathione biosynthesis genes is more complex than anticipated. Our analysis suggests that the two genes in the pathway were acquired independently. The gene for γ-glutamylcysteine ligase most probably arose in cyanobacteria and was transferred to other bacteria, eukaryotes and at least one archaeon, although other scenarios cannot be ruled out. Because of high divergence in the sequences, the data neither support nor refute the hypothesis that the eukaryotic gene comes from a mitochondrial progenitor. After acquiring γ-glutamylcysteine ligase, eukaryotes and most bacteria apparently recruited a protein with the ATP-grasp superfamily structural fold to catalyze synthesis of glutathione from γ-glutamylcysteine and glycine. The eukaryotic glutathione synthetase did not evolve directly from the bacterial glutathione synthetase.
Aerobic organisms produce intracellular thiols such as glutathione (GSH), homoglutathione, γ-glutamylcysteine (γ-Glu-Cys), γ-glutamylcysteinylserine and mycothiol for protection against reactive oxygen species formed as by-products of aerobic metabolism. GSH is the most common of these. In addition to buffering the redox status of the cytoplasm and protecting biomolecules against oxidative damage, GSH provides reducing equivalents to several enzymes (including ribonucleotide reductase, 3'-phosphoadenosine 5'-phosphosulfate reductase and arsenate reductase), and serves as a substrate for glutathione-S-transferases, which detoxify potentially dangerous electrophiles.
We have analyzed the sequences of genes encoding GshA and GshB and have discovered that the evolutionary history of these genes is more complex than expected. Our results are consistent with two possible explanations for the distribution of GshA genes. The most likely possibility is that GshA arose in the bacterial domain, and the gene was transferred to eukaryotes at an early stage in their evolution. A second, less appealing, possibility is that a GshA gene was present in the last common ancestor and was subsequently lost from many organisms, primarily those that live under anaerobic conditions. GshB appears to have arisen independently within bacteria and eukaryotes subsequent to the acquisition of the GshA gene. Notably, in each domain, the scaffold typical of the ATP-grasp superfamily was utilized to provide GshB. Multiple examples of lateral transfer of the GshA gene are evident, the most dramatic being a trans-domain transfer from an alpha-proteobacterium to a plant sometime before 300 million years ago.
Results and discussion
GshA sequences fall into three distinct groups
We assembled an initial set of GshA sequences by searching the NCBI Protein database using either GshA or glutamate-cysteine ligase as query words. The set was expanded using the output of BLAST searches with the sequences from Escherichia coli, Saccharomyces cerevisiae and Arabidopsis thaliana. Some of the sequences found in the BLAST searches correspond to hypothetical proteins whose functions have not been experimentally verified. For organisms that are known to synthesize GSH, these sequences are likely to encode GshA. However, the assignment of function is uncertain for the sequences from Mycobacterium tuberculosis, Streptomyces coelicolor and Clostridium acetobutylicum because these organisms are not known to synthesize GSH [4,17].
Three groups of known and putative GshAs (table; column headings include "Eukaryotes (except plants)" and "Primarily alpha-proteobacteria and plants*"; entries include Buchnera sp. APS and (Streptococcus pneumoniae TIGR4))
p-values for blocks found in some known and putative GshAs (table columns: p-value block 1, p-value block 2, p-value block 3)
Possible explanations for the distribution of GshA: lateral transfer versus massive gene loss
The possibility that GshA arose early in the bacterial lineage and then was transferred to the eukaryotic lineage and into at least one archaeon (Halobacterium) is perhaps more appealing. GshA may have arisen in cyanobacteria concomitant with the origin of oxygenic photosynthesis, a particularly attractive hypothesis because this would have been the first time protection against reactive oxygen species would have been important. Later, as proteobacteria developed aerobic metabolic processes that took advantage of the growing O2 concentration in the atmosphere, the consequent production of reactive oxygen species would have given a selective advantage to microbes that acquired a GshA gene by lateral gene transfer from cyanobacteria. GshA would have been advantageous to aerobic eukaryotes as well, providing selective pressure for the acquisition and retention of a bacterial GshA gene. Transfer of a GshA gene from the alpha-proteobacterial progenitor of mitochondria, as postulated previously [8,11,12], is one possible mechanism for the spread of GshA into eukaryotes. Our analysis does not show a significant association between eukaryotic and alpha-proteobacterial sequences that would support this mechanism (Figure 4). However, the extreme divergence of the sequences limits our ability to resolve phylogenetic relationships, so this mechanism remains a possibility. There are other possible mechanisms for an early trans-domain lateral gene transfer as well. For example, Doolittle has proposed that protists that consume bacteria as food may incorporate bacterial genes into their genomes.
Lateral transfer of GshA genes
Pairwise sequence identities between GshA sequences from plants and alpha-proteobacteria
1. M. loti
2. C. crescentus
3. X. fastidiosa
4. Z. mobilis
5. B. japonicum
6. A. tumefaciens
7. S. meliloti
8. L. esculentum
9. M. truncatula
10. P. sativum
11. B. juncea
12. P. vulgaris
13. A. thaliana
Lateral gene transfer from alpha-proteobacteria to plants is particularly intriguing because transfer and retention of a foreign gene in a sophisticated multicellular organism is more difficult than in bacteria. Bacteria can acquire DNA from their environment in multiple ways (transformation, transduction and conjugation). Furthermore, a transferred gene can be easily transmitted to progeny after recombination into genomic or plasmid DNA. However, known mechanisms for transfer of DNA into plants are more limited. The best understood mechanism is the transfer of T-strand DNA from the Ti-plasmid of Agrobacterium tumefaciens into wounded plant tissues, a process resulting in the formation of tumors. It is not known whether foreign genes can be transferred into plants by this mechanism in nature, but such a process is plausible. Perpetuation of a transferred gene is also not as easily achieved in seed plants as it is in bacteria, because the gene must be incorporated into genomic DNA in apical meristem cells, undifferentiated stem cells that produce new organs, including the cones or flowers that generate male and female gametes. An interesting issue is whether the group 3 alpha-proteobacterial gene displaced an ancestral group 2 eukaryotic gene, or whether the ancestral gene was first lost, allowing the alpha-proteobacterial gene to fill the functional gap.
Additional potential cases of lateral gene transfer are also suggested by our data. GshA from Xylella fastidiosa, a gamma-proteobacterium, clusters with group 3, rather than with the other gamma-proteobacterial sequences in group 1. This organism is a plant pathogen, and its physical association with plants has apparently provided an opportunity for gene transfer, either from a plant or, more likely, from an associated alpha-proteobacterium. Sequences from Halobacterium sp. NRC-1 (archaeon), Mycobacterium tuberculosis and Streptomyces coelicolor (high-GC Gram-positive bacteria), and Streptococcus pneumoniae (low-GC Gram-positive bacterium) cluster with the plant and alpha-proteobacterial sequences in group 3, and a sequence from Clostridium acetobutylicum, a low-GC Gram-positive bacterium, clusters with the gamma-proteobacterial sequences in group 1. (Note that synthesis of γ-glutamylcysteine has been demonstrated in Halobacterium sp. NRC-1, but not in Mycobacterium tuberculosis, Streptomyces coelicolor or Streptococcus pneumoniae.) The occurrence of GshA homologs in these organisms could reflect persistence of an ancestral GshA in only some genera in the Archaea and in the Gram-positive bacteria, or could be the result of lateral transfer into a limited number of organisms. As discussed above, it is difficult to distinguish between these two explanations, although the lateral gene transfer hypothesis is the more appealing.
Why are GshA sequences so divergent?
The three groups of GshA sequences are so divergent that it was difficult to demonstrate an evolutionary relationship between them. This level of sequence divergence is unexpected, and warrants some thought. Three obvious factors contribute to divergence of sequence in orthologs. First, the organisms being compared may be very distant. This explanation is probably not sufficient to explain the sequence divergence in the GshAs. The gamma-proteobacteria represented in group 1 are reasonably closely related to the alpha-proteobacteria in group 3, and the crown eukaryotes in group 2 are reasonably closely related to the plants in group 3. Orthologous relationships between these groups can often be identified. Of the 1,923 COGs (clusters of orthologous groups) [32,33,34] identified in P. aeruginosa (a gamma-proteobacterium), 80% are also found in the combined group containing Mesorhizobium loti and Caulobacter crescentus (both alpha-proteobacteria). As a specific example, the GshB sequences from E. coli (a gamma-proteobacterium) and C. crescentus have 41% identity, and those from Arabidopsis and human have 43% identity. Detection of orthologs in even more distantly related organisms is also possible in many cases. We have used PSI-BLAST to find orthologs of enzymes that use glutathione (including glutaredoxins, glutathione-S-transferases, glutathione reductases and glutathione peroxidases) in both bacteria and eukaryotes (data not shown).
A second reason that homologous sequences may be very divergent is that little selective pressure has been required to maintain function at the level required for the organism to succeed. This situation may occur if the reaction being catalyzed is not very demanding, or if the product of the reaction does not contribute to the fitness of the organism in an important way. An example of the first scenario is o-succinylbenzoate synthase, which catalyzes the dehydration of 2-hydroxy-6-succinyl-2,4-cyclohexadiene carboxylate to form o-succinylbenzoate in the biosynthetic pathway for menaquinone synthesis. o-Succinylbenzoate synthases from various organisms have very low pairwise sequence identities compared to those seen for other members of the enolase superfamily. The low sequence identities in these enzymes have been interpreted as reflecting relatively low constraints upon the sequence because the reaction is quite facile even in the absence of the enzyme (because it forms an aromatic product), and thus the enzyme is not required to provide a great deal of assistance. This scenario is unlikely to account for the divergence of the GshA genes, as formation of a peptide bond is a quite difficult reaction. In fact, members of the ATP-grasp superfamily that catalyze comparable reactions (that is, GshBs (see further below), ribosomal S6 modification enzymes, D-Ala-D-Ala ligases) are sufficiently well conserved to be easily detected by PSI-BLAST, even though they utilize different substrates. With respect to the second scenario, it is clear that the ability to synthesize GSH provides a significant advantage. Bacteria and yeast that lack functional GshA are viable, but are hypersensitive to oxidative damage [37,38]. GshA-deficient strains of A. thaliana are viable, but are hypersensitive to cadmium. Mice in which γ-Glu-Cys ligase has been knocked out die before gestational day 13. Thus, low selection pressure cannot account for the high levels of divergence among the GshA sequences.
Finally, sequences of homologs may diverge if they are subject to different selective pressures in different lineages. This might occur if the protein has a second function (a 'moonlighting' function) in some lineages and is subject to selective pressure that alters regions of the protein involved in that function. Alternatively, if a protein interacts with other proteins, then differences in those partner proteins will drive changes in the regions of the protein involved in the interaction. The possibility that the high level of divergence in GshA proteins is due to one of these factors is intriguing and worth experimental exploration.
GshB: a different story
A set of GshBs was assembled by searching the NCBI protein database using either GshB or glutathione synthetase as query words, and from the outputs of BLAST searches with the E. coli and S. cerevisiae proteins as query sequences. The sequences in the GshB set fall into two distinct groups, corresponding to bacteria and eukaryotes, that have no significant relationship to each other on the basis of pairwise sequence identities. As seen with GshA, PSI-BLAST searches with GshB sequences from either group did not find GshB sequences from the other group. A PSI-BLAST search with the human GshB converged after two iterations. The output contained only eukaryotic GshBs, a few putative homoglutathione synthetases, and a few eukaryotic proteins of unknown function. No bacterial GshBs were found in the output. After multiple iterations, a PSI-BLAST search using the E. coli GshB (gi121663) as an initial query sequence found 509 sequences of enzymes in the ATP-grasp superfamily, to which the bacterial enzyme is known to belong, but no eukaryotic GshB sequences. The ATP-grasp superfamily includes at least 15 families of enzymes that catalyze formation of a bond between a carboxylate group of one substrate and an amino, imino or thiol group of a second substrate. The bacterial GshBs are most closely related to ribosomal S6 modification enzymes, which catalyze the addition of glutamate to the carboxyl terminus of ribosomal protein S6. Other members of the superfamily include carbamoyl phosphate synthases, cyanophycin synthetases and D-Ala-D-Ala ligases. Notably, many members of the ATP-grasp superfamily were found in eukaryotes. For example, A. thaliana has at least three superfamily members in the PSI-BLAST output, including carbamoyl phosphate synthetase, 3-methylcrotonyl-CoA carboxylase, and acetyl CoA carboxylase. However, the Arabidopsis GshB is not found in the output. Thus, ATP-grasp superfamily members in eukaryotes are more closely related to bacterial GshBs than to eukaryotic GshBs.
The structural similarity between the E. coli and human GshBs indicates that the bacterial and eukaryotic enzymes could be related to each other, but it does not prove a relationship, because the common structure might have arisen by convergent evolution from different progenitors. This consideration is especially worth noting in light of the very low conservation of residues in the active site. Consequently, we looked for evidence of an evolutionary relationship by searching for proteins that might be distantly related to both the bacterial and eukaryotic GshB proteins and might therefore bridge the sequence gap between them. For this search we used the Shotgun algorithm, which was designed to facilitate searches for distant relationships between proteins. The algorithm performs a BLAST search with each of a set of query sequences. It then sorts the hits found by all of the query sequences and, for each hit, identifies the query sequences that found that hit. A hit that is found by multiple members of two distinct groups of proteins, even with low BLAST scores, can be examined closely to determine whether it provides a link between the groups.
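The bookkeeping step described above (collect the hits from every query, then ask which queries found each hit) can be illustrated with a minimal sketch. This is not the Shotgun implementation itself: the function name `shotgun_links`, the `min_per_group` parameter and the toy identifiers are all hypothetical, and the BLAST searches are assumed to have been run already, with their hit lists supplied as plain Python data.

```python
from collections import defaultdict


def shotgun_links(hits_by_query, group_of, min_per_group=2):
    """Return hits found by several queries from each of two or more groups.

    hits_by_query: dict mapping a query ID to the list of hit IDs it found.
    group_of:      dict mapping each query ID to its group label.
    min_per_group: hypothetical cut-off for "multiple members" of a group.
    """
    # Invert the search results: for each hit, record which queries found it.
    finders = defaultdict(set)
    for query, hits in hits_by_query.items():
        for hit in hits:
            finders[hit].add(query)

    links = {}
    for hit, queries in finders.items():
        if hit in group_of:
            continue  # a query sequence cannot serve as its own bridge
        # Split the finding queries by group.
        per_group = defaultdict(set)
        for query in queries:
            per_group[group_of[query]].add(query)
        bridging = {
            group: sorted(members)
            for group, members in per_group.items()
            if len(members) >= min_per_group
        }
        # Keep the hit only if at least two groups found it often enough.
        if len(bridging) >= 2:
            links[hit] = bridging
    return links


# Toy data (invented identifiers, not results from the paper).
hits_by_query = {
    "bact_1": ["hitA", "hitB"],
    "bact_2": ["hitA", "hitC"],
    "euk_1": ["hitA", "hitD"],
    "euk_2": ["hitA", "hitB"],
}
group_of = {
    "bact_1": "bacterial",
    "bact_2": "bacterial",
    "euk_1": "eukaryotic",
    "euk_2": "eukaryotic",
}
links = shotgun_links(hits_by_query, group_of)
```

In this toy input only `hitA` is reported, because it is the only hit found by two queries from each group; `hitB` is found by just one query per group and is dropped. Any hit flagged this way would still need the close manual examination described in the text.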
Motifs found in GshBs and related proteins (table; labels include "Best possible match" and "S6 modification enzymes")
In a case such as this one, it can be difficult to determine whether common sequence motifs have arisen by divergence from a common progenitor, or by convergent evolution driven by a common function. The two common motifs found in the bacterial and eukaryotic GshBs correspond to the ATP-binding pocket. We therefore tested whether motifs 1 and 5 arose by convergent evolution, driven by the need to bind ATP, by looking for these motifs in other proteins that bind ATP. We searched the non-redundant database with motifs 1 and 5 using the MAST algorithm. Searches with motifs 1 and 5 retrieved 145 and 40 sequences, respectively, with E-values less than 10. Nearly all of the proteins with known functions in the output were GshBs or other members of the ATP-grasp superfamily. All of the 145 proteins found by motif 1 were members of the ATP-grasp superfamily except for two transcriptional regulators with E-values of 4.8 (gi15613287) and 7.8 (gi15224768), dihydroorotate dehydrogenases with E-values greater than 7 from several organisms, and a proline/betaine transporter (gi1589297) with an E-value of 9.9. All of the proteins with known functions found by motif 5 were members of the ATP-grasp superfamily except the LIM-containing protein kinase 2t (gi3273207), which had an E-value of 6.2. Thus, motifs 1 and 5 are characteristic of the ATP-binding region of the ATP-grasp superfamily enzymes.
The question of whether eukaryotic GshBs are members of the ATP-grasp superfamily is difficult to answer with certainty because the eukaryotic GshBs are so dramatically different from the other superfamily members. The two conserved sequence motifs involved in ATP binding do provide a link between eukaryotic GshBs and the ATP-grasp superfamily, but it is rather tenuous, as it is possible that these sequences provide the best way to bind ATP within the context of this structural fold and have evolved by convergent evolution in eukaryotic GshBs and the ATP-grasp superfamily members. The lack of conservation in the glutathione-binding region of bacterial and eukaryotic GshBs is also an important consideration. We feel that the evidence for a true evolutionary relationship between the eukaryotic GshBs and the ATP-grasp superfamily is rather weak given the evidence available at this time. For our purposes here, it is sufficient to conclude, in agreement with Polekhina et al., that eukaryotic GshBs did not evolve directly from bacterial GshBs, but rather that both evolved from ancestors that had the characteristic fold of the ATP-grasp superfamily.
A different twist? A fused GshA-ATP-grasp superfamily homolog in an odd collection of bacteria
In most organisms, GshA and GshB are encoded by separate genes that are not in close proximity. An interesting variation on this theme may occur in a small number of bacteria. Clostridium perfringens, Listeria monocytogenes, Listeria innocua and Pasteurella multocida have an open reading frame (ORF) that could encode a GshA homolog fused to an ATP-grasp superfamily member, raising the possibility that this protein might combine the two activities required for synthesis of GSH. C. perfringens and L. monocytogenes contain low levels of GSH (0.25 μmol/g residual dry weight) that are about 20-fold lower than those found in E. coli, but it is not known whether they synthesize GSH or simply import it from the medium. Experimental determination of the function of these fused proteins is clearly needed.
Percentage identities between fused GshA-ATP-grasp superfamily homologs and GshA, GshB and cyanophycin synthetase (table columns: gene ID for fused GshA-ATP-grasp superfamily homolog; percent identity between amino-terminal region and E. coli GshA; percent identity between carboxy-terminal region and E. coli GshB; percent identity between carboxy-terminal region and Anabaena variabilis cyanophycin synthetase)
The occurrence of an ORF for the fusion protein in this cluster of bacteria is curious because C. perfringens, L. monocytogenes and L. innocua are low-GC Gram-positive bacteria, while P. multocida is a gamma-proteobacterium. C. perfringens is found in soil and sewage and is often part of the normal intestinal flora of animals and humans. It causes gangrene and food poisoning in humans. L. monocytogenes and L. innocua are ubiquitous contaminants of soil and water, and L. monocytogenes causes listeriosis, a serious food-borne illness. P. multocida colonizes the nasopharynx and gastrointestinal tract of many animals and birds, and causes a wide range of illnesses. Human infections are most often caused by dog or cat bites. Thus, the association of these bacteria with animals as either commensal or pathogenic organisms has apparently provided an opportunity for lateral transfer of the gene encoding the fusion protein.
The ORF for the putative fusion protein in P. multocida is particularly intriguing because gamma-proteobacteria typically have a group 1 GshA and a typical bacterial GshB. P. multocida has neither of these, and neither does its close relative, Haemophilus influenzae. The lineage leading to P. multocida and H. influenzae diverged from other gamma-proteobacteria approximately 270 million years ago. P. multocida and H. influenzae have considerably smaller genomes (2,014 and 1,743 predicted coding regions, respectively [50,51]) than E. coli (4,288 predicted coding regions), suggesting that this lineage has undergone substantial genome reduction. It is possible that the lineage leading to P. multocida and H. influenzae lost GshA, and P. multocida subsequently acquired the fused GshA-ATP-grasp superfamily homolog in its place.
Putting together the pieces: thoughts on the evolution of the pathway
The pathway for GSH biosynthesis involves two enzymes, and it is of interest to consider which of these evolved first. Horowitz has postulated that biosynthetic pathways evolve in a retrograde fashion, beginning with the last enzyme in the pathway. This hypothesis rests on the assumption that organisms had, at one time, access to a supply of precursors for biological polymers such as DNA, RNA, proteins and polysaccharides. As the supply of a given precursor dwindled, the most successful organisms would be those that 'invented' an enzyme with which to catalyze formation of that precursor from compounds present in the environment. Thus, there would be continuous selective pressure to add enzymes in the retrograde direction to catalyze synthesis of precursors from ever more simple constituents. Evolution of a biosynthetic pathway in the forward direction was deemed unlikely, as there would be no selective pressure for evolution of enzymes to produce intermediates of no further use to the organism. Horowitz's proposal is logical and appealing. There are cases, however, in which forward evolution of a pathway seems more likely. For example, many organisms make complex natural products whose roles generally involve killing or manipulating other organisms. The pathways for building these complex structures have probably evolved in a forward direction by addition of enzymes capable of adding to the complexity of a pre-existing molecule and thereby contributing to its biological potency.
The GSH biosynthesis pathway is most likely to have evolved in a forward direction. If the pathway had evolved in a retrograde direction, the Horowitz theory would postulate that GshB arose to take advantage of γ-Glu-Cys present in the environment. It is unlikely that γ-Glu-Cys would have been available because formation of the high-energy amide bond would be unlikely to occur abiotically. Furthermore, this molecule would be unstable to oxidation in aerobic environments. However, evolution of the GSH biosynthesis pathway in the forward direction makes considerable sense. γ-Glu-Cys can serve some of the functions of GSH, and therefore could be advantageous to an organism even in the absence of GshB. Indeed, halobacteria contain millimolar levels of γ-Glu-Cys, but do not convert it further to GSH. However, γ-Glu-Cys is not an ideal solution, as it is more easily oxidized than GSH. Furthermore, the reactivity of a thiol depends upon its pKa, as thiolates are orders of magnitude more nucleophilic than thiols. The nucleophilicity of the thiol in γ-Glu-Cys should be diminished by the proximity of the negatively charged carboxylate. Further reaction of γ-Glu-Cys with Gly to form GSH would improve its properties with respect to both oxidation and nucleophilicity, thus providing selective pressure for evolution of a GSH synthetase (GshB).
One possible source for an enzyme to catalyze the next step in a pathway evolving in the forward direction is the enzyme that catalyzed the last step, as this enzyme has a binding site that accommodates the product of the last reaction, and that product is the substrate for the next reaction. A similar situation occurs for pathways evolving in a retrograde direction. This type of enzyme recruitment, which takes advantage of already existing substrate-specificity determinants, but requires changes in catalytic groups, appears to occur rather infrequently. For example, among 510 proteins involved in the small-molecule metabolic pathways in E. coli, homology between consecutive enzymes in a pathway occurs only six times. Most often, enzymes are recruited to catalyze new reactions by virtue of the catalytic abilities of their active sites, and interactions required for substrate binding are then optimized. The GSH biosynthesis genes, however, appear to be an optimal case for recruitment of one enzyme to catalyze a subsequent reaction. As GshA has a binding site that accommodates γ-Glu-Cys, it would appear to be an ideal progenitor of GshB, which uses γ-Glu-Cys as a substrate and also catalyzes the ATP-dependent formation of an amide bond. However, GshA and GshB appear to be structurally distinct. There are no experimental structures for GshAs, but recent work suggests that GshAs are homologs of glutamine synthetases. GshBs have a different structural fold, characteristic of the ATP-grasp superfamily. Thus, the data support a scenario in which emergence of GshA was followed, in most organisms, by the recruitment of a different protein to serve as the progenitor of GshB. It is particularly interesting that, in both the bacterial and eukaryotic lineages, the ATP-grasp structural fold provided the starting point for the evolution of GshB.
Our analysis of the sequences of GshAs and GshBs suggests that the evolutionary history of these proteins is more complex than expected on the basis of the distribution of GSH in extant organisms. Our results, as well as the observation that GshA and GshB genes are generally not found in proximity in microbial genomes, suggest that these genes did not evolve together. Therefore, we must consider the evolutionary history of the two genes separately. Although the origin of the GshA gene cannot be unequivocally determined, it is most plausible to suppose that it arose in cyanobacteria, which would have been the first cells to require the protection conferred by γ-Glu-Cys against reactive oxygen species. If this hypothesis is correct, then subsequent lateral gene transfers must have occurred to spread the gene to the proteobacteria and eukaryotes, as well as to at least one archaeon and possibly to some Gram-positive bacteria. Because of the high level of sequence divergence, there is no clear indication in the sequence data as to whether eukaryotes acquired a GshA gene from a cyanobacterium or a proteobacterium. After the acquisition of GshA, a further improvement in protection against reactive oxygen species was obtained in most organisms by recruitment of an enzyme to convert γ-Glu-Cys to GSH. This recruitment apparently took place independently in the bacterial and eukaryotic lineages, since the sequence of the eukaryotic GshB is remarkably different from that of the bacterial GshBs, despite the structural similarities between these two proteins. At least for GshB, therefore, we can eliminate the possibility of transfer from the mitochondrial progenitor into an early eukaryote. The emerging picture of the evolution of the glutathione biosynthesis pathway is significant because it suggests that the pathway evolved in a forward direction, in contradiction to the Horowitz hypothesis.
Materials and methods
BLAST and PSI-BLAST searches were carried out at the NCBI website. Multiple sequence alignment was performed using ClustalW at the Pittsburgh Supercomputing Center. Pairwise sequence identities were determined using the Distances algorithm in the GCG package at the Pittsburgh Supercomputing Center. Motif analyses were carried out using MEME at the San Diego Supercomputing Center and Block Maker at the Fred Hutchinson Cancer Center. Phylogenetic analyses were carried out using PAUP 4.0b.
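As an illustration of what a pairwise sequence identity measures, the following minimal sketch computes percent identity from two rows of an existing alignment. It is not the GCG Distances algorithm: the function name `percent_identity`, the choice to skip every column that contains a gap, and the toy sequences are assumptions made for this example. Different programs use different denominators, so values from this sketch need not match those reported in the tables.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the gap-free columns of two aligned sequences.

    Assumption for this sketch: columns where either sequence has a gap
    are excluded from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0   # columns where both sequences have a residue
    identical = 0  # compared columns where the residues match
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment (invented sequences): 6 gap-free columns, 5 of them identical.
example = percent_identity("MKV-LAGT", "MRVELA-T")
```

For the toy alignment, five of the six gap-free columns match, giving roughly 83% identity.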
We thank Norman Pace, William Friedman and Scott Kelley for helpful discussions, and Patricia Babbitt for carrying out the Shotgun analysis. Financial support was provided by the NASA Astrobiology Program and a research computing grant from the Pittsburgh Supercomputing Center.
- Carnegie PR: Structure and properties of a homolog of glutathione. Biochem J. 1963, 89: 471-478.PubMedPubMed CentralView ArticleGoogle Scholar
- Newton GL, Javor B: gamma-Glutamylcysteine and thiosulfate are the major low-molecular-weight thiols in halobacteria. J Bacteriol. 1985, 161: 438-441.PubMedPubMed CentralGoogle Scholar
- Klapheck S, Chrost B, Starke J, Zimmermann H: γ-Glutamylcysteinylserine - a new homolog of glutathione in plants of the family Poaceae. Bot Acta. 1992, 105: 174-179.View ArticleGoogle Scholar
- Newton GL, Arnold K, Price MS, Sherrill C, Delcardayre SB, Aharonowitz Y, Cohen G, Davies J, Fahey RC, Davis C: Distribution of thiols in microorganisms: mycothiol is a major thiol in most actinomycetes. J Bacteriol. 1996, 178: 1990-1995.PubMedPubMed CentralGoogle Scholar
- Holmgren A: Thioredoxin and glutaredoxin systems. J Biol Chem. 1989, 264: 13963-13966.PubMedGoogle Scholar
- Russel M, Model P, Holmgren A: Thioredoxin or glutaredoxin in Esherichia coli is essential for sulfate reduction but not for deoxyribonucleotide synthesis. J Bacteriol. 1990, 172: 1923-1929.PubMedPubMed CentralGoogle Scholar
- Gladysheva TB, Oden KL, Rosen BP: Properties of the arsenate reductase of plasmid R773. Biochemistry. 1994, 33: 7288-7293.
- Fahey RC, Newton GL, Arrick B, Overdank-Bogart T, Aley SB: Entamoeba histolytica: a eukaryote without glutathione metabolism. Science. 1984, 224: 70-72.
- Brown DM, Upcroft JA, Upcroft P: Cysteine is the major low-molecular weight thiol in Giardia duodenalis. Mol Biochem Parasitol. 1993, 61: 155-158. 10.1016/0166-6851(93)90169-X.
- Ellis JE, Yarlett N, Cole D, Humphreys MJ, Lloyd D: Antioxidant defenses in the microaerophilic protozoan Trichomonas vaginalis: comparison of metronidazole-resistant and sensitive strains. Microbiology. 1994, 140: 2489-2494.
- Fahey RC, Sundquist AR: Evolution of glutathione metabolism. In Adv Enzymol Rel Areas Mol Biol. Edited by A Meister. New York: John Wiley and Sons. 1991, 1-53.
- Fahey RC: Novel thiols of prokaryotes. Annu Rev Microbiol. 2001, 55: 333-356. 10.1146/annurev.micro.55.1.333.
- Yang D, Oyaizu H, Olsen GJ, Woese CR: Mitochondrial origins. Proc Natl Acad Sci USA. 1985, 82: 4443-4447.
- Gray MW: Origin and evolution of organelle genomes. Curr Opin Genet Dev. 1993, 3: 884-890.
- NCBI protein database. [http://www.ncbi.nlm.nih.gov/]
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Newton GL, Fahey RC, Cohen G, Aharonowitz Y: Low-molecular-weight thiols in Streptomycetes and their potential role as antioxidants. J Bacteriol. 1993, 175: 2734-2742.
- Pfam database. [http://pfam.wustl.edu/]
- Henikoff S, Henikoff JG, Alford WJ, Pietrokovski S: Automated construction and graphical presentation of protein blocks from unaligned sequences. Gene. 1995, 163: GC17-GC26. 10.1016/0378-1119(95)00486-P.
- Blocks WWW Server. [http://blocks.fhcrc.org/blocks/]
- Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates: Sunderland, MA. 2001.
- DesMarais DJ: When did photosynthesis emerge on Earth? Science. 2000, 289: 1703-1705.
- Doolittle WF: You are what you eat: a gene transfer ratchet could account for bacterial genes in eukaryotic nuclear genomes. Trends Genet. 1998, 14: 307-311. 10.1016/S0168-9525(98)01494-2.
- Garcia-Vallvé S, Romeu A, Palau J: Horizontal gene transfer in bacterial and archaeal complete genomes. Genome Res. 2000, 10: 1719-1725. 10.1101/gr.130000.
- Jain R, Rivera MC, Lake JA: Horizontal gene transfer among genomes: the complexity hypothesis. Proc Natl Acad Sci USA. 1999, 96: 3801-3806. 10.1073/pnas.96.7.3801.
- Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405: 299-304.
- Katz LA: Transkingdom transfer of the phosphoglucose isomerase gene. J Mol Evol. 1996, 43: 453-459.
- Wolf YI, Aravind L, Grishin NV, Koonin EV: Evolution of aminoacyl-tRNA synthetases - analysis of unique domain architectures and phylogenetic trees reveals a complex history of horizontal gene transfer events. Genome Res. 1999, 9: 689-710.
- Stewart WN, Rothwell GW: Paleobotany and the Evolution of Plants, 2nd edn. Cambridge University Press: Cambridge. 1993.
- Schell J, Koncz C: The Ti-plasmid and plant molecular biology. Discoveries Plant Biol. 2000, 3: 393-409.
- Sundquist AR, Fahey RC: The function of γ-glutamylcysteine and bis-γ-glutamylcysteine reductase in Halobacterium halobium. J Biol Chem. 1989, 264: 719-725.
- Tatusov RL, Koonin EV, Lipman DJ: A genomic perspective on protein families. Science. 1997, 278: 631-637. 10.1126/science.278.5338.631.
- Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, Kiryutin B, Galperin MY, Fedorova ND, Koonin EV: The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001, 29: 22-28. 10.1093/nar/29.1.22.
- Clusters of Orthologous Groups (COG) Database. [http://www.ncbi.nlm.nih.gov/COG/]
- Palmer DR, Garrett JB, Sharma V, Meganathan R, Babbitt PC, Gerlt JA: Unexpected divergence of enzyme function and sequence: "N-acylamino acid racemase" is o-succinylbenzoate synthase. Biochemistry. 1999, 38: 4252-4258. 10.1021/bi990140p.
- Galperin MY, Koonin EV: A diverse superfamily of enzymes with ATP-dependent carboxylate-amine/thiol ligase activity. Protein Sci. 1997, 6: 2639-2643.
- Harrop HA, Held KD, Michael BD: The oxygen effect: variation of the K-value and lifetimes of oxygen-dependent damage in some glutathione-deficient mutants of Escherichia coli. Int J Radiat Biol. 1991, 59: 1237-1251.
- Stephen DWS, Jamieson DJ: Glutathione is an important antioxidant molecule in the yeast Saccharomyces cerevisiae. FEMS Microbiol Lett. 1996, 141: 207-212. 10.1016/0378-1097(96)00223-6.
- Cobbett CS, May MJ, Howden R, Rolls B: The glutathione-deficient, cadmium-sensitive mutant, cad-1, of Arabidopsis thaliana is deficient in γ-glutamylcysteine synthetase. Plant J. 1998, 16: 73-78. 10.1046/j.1365-313X.1998.00262.x.
- Dalton PD, Dieter MZ, Yang Y, Shertzer HG, Nebert DW: Knockout of the mouse glutamate cysteine ligase catalytic subunit (Gclc) gene: embryonic lethal when homozygous, and proposed model for moderate glutathione deficiency when heterozygous. Biochem Biophys Res Commun. 2000, 279: 324-329. 10.1006/bbrc.2000.3930.
- Hara T, Kato H, Katsube Y, Oda J: A pseudo-Michaelis quaternary complex in the reverse reaction of a ligase: structure of Escherichia coli B glutathione synthetase complexed with ADP, glutathione, and sulfate at 2.0 Å resolution. Biochemistry. 1996, 35: 11967-11974. 10.1021/bi9605245.
- Polekhina G, Board PG, Gali RR, Rossjohn J, Parker MW: Molecular basis of glutathione synthetase deficiency and a rare gene permutation event. EMBO J. 1999, 18: 3204-3213. 10.1093/emboj/18.12.3204.
- Pegg SC-H, Babbitt PC: Shotgun: getting more from sequence similarity searches. Bioinformatics. 1999, 15: 729-740. 10.1093/bioinformatics/15.9.729.
- Thompson JD, Higgins DG, Gibson TJ: ClustalW: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680.
- Bailey TL, Elkan C: Fitting a mixture model by expectation maximization to discover motifs in biopolymers. In Proc Int Conf Intell Syst Mol Biol. 1994, 2: 28-36.
- MEME 3.0. [http://meme.sdsc.edu/meme/website/]
- Madigan MT, Martinko JM, Parker J: Brock Biology of Microorganisms, 9th edn. Upper Saddle River, NJ: Prentice-Hall. 2000.
- Klein NC, Cunha BA: Pasteurella multocida pneumonia. Semin Resp Infect. 1997, 12: 54-56.
- May BJ, Zhang Q, Li LL, Paustian ML, Whittam TS, Kapur V: Complete genome sequence of Pasteurella multocida, Pm70. Proc Natl Acad Sci USA. 2001, 98: 3460-3465. 10.1073/pnas.051634598.
- Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, et al: Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995, 269: 496-511.
- Blattner FR, Plunkett G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, et al: The complete genome sequence of Escherichia coli K-12. Science. 1997, 277: 1453-1462. 10.1126/science.277.5331.1453.
- Horowitz NH: On the evolution of biochemical syntheses. Proc Natl Acad Sci USA. 1945, 31: 153-157.
- Roberts DD, Lewis SD, Ballou DP, Olson ST, Shafer JA: Reactivity of small thiolate anions and cysteine-25 in papain toward methyl methanethiosulfonate. Biochemistry. 1986, 25: 5595-5601.
- Gerlt JA, Babbitt PC: Divergent evolution of enzymatic function: mechanistically diverse superfamilies and functionally distinct suprafamilies. Annu Rev Biochem. 2001, 70: 209-246. 10.1146/annurev.biochem.70.1.209.
- Teichmann SA, Rison SCG, Thornton JM, Riley M, Gough J, Chothia C: The evolution and structural anatomy of the small molecule metabolic pathways in Escherichia coli. J Mol Biol. 2001, 311: 693-708. 10.1006/jmbi.2001.4912.
- Abbott JJ, Pei J, Ford JL, Qi Y, Grishin YN, Pitcher LA, Phillips MA, Grishin NV: Structure prediction and active site analysis of the metal binding determinants in γ-glutamylcysteine synthetase. J Biol Chem. 2001, 276: 42099-42107. 10.1074/jbc.M104672200.
- Sherrill C, Fahey RC: Import and metabolism of glutathione by Streptococcus mutans. J Bacteriol. 1998, 180: 1454-1459.
- Fahey RC, Buschbacher RM, Newton GL: The evolution of glutathione metabolism in phototrophic microorganisms. J Mol Evol. 1987, 25: 81-88.
- Okumura N, Masamoto K, Wada H: The gshB gene in the cyanobacterium Synechococcus sp. PCC 7942 encodes a functional glutathione synthetase. Microbiology. 1997, 143: 2883-2890.