- Open Access
The ASRG database: identification and survey of Arabidopsis thaliana genes involved in pre-mRNA splicing
© Wang and Brendel; licensee BioMed Central Ltd. 2004
- Received: 25 June 2004
- Accepted: 20 October 2004
- Published: 29 November 2004
A total of 74 small nuclear RNA (snRNA) genes and 395 genes encoding splicing-related proteins were identified in the Arabidopsis genome by sequence comparison and motif searches, including the previously elusive U4atac snRNA gene. Most of the genes have not been studied experimentally. Classification of these genes and detailed information on gene structure, alternative splicing, gene duplications and phylogenetic relationships are made accessible as a comprehensive database of Arabidopsis Splicing Related Genes (ASRG) on our website.
- Splice Factor
- Intron Retention
- Splice Regulator
- Splice Mechanism
- Alternative Splice Pattern
Most eukaryotic genes contain introns that are spliced from the precursor mRNA (pre-mRNA). The correct interpretation of splicing signals is essential to generate authentic mature mRNAs that yield correct translation products. Splicing is also an important post-transcriptional control point: gene function can be regulated through the production of different mRNAs from a single pre-mRNA (reviewed in ). The general mechanism of splicing has been well studied in human and yeast systems and is largely conserved between these organisms. Plant RNA splicing mechanisms remain comparatively poorly understood, due in part to the lack of an in vitro plant splicing system. Although the splicing mechanisms in plants and animals appear to be similar overall, incorrect splicing of plant pre-mRNAs in mammalian systems (and vice versa) suggests that there are plant-specific characteristics, resulting from coevolution of splicing factors with the signals they recognize or from the requirement for additional splicing factors (reviewed in [2, 3]).
Genome projects are accelerating research on splicing. For example, with the majority of splicing-related genes already known in human and budding yeast, these gene sequences were used to query the Drosophila and fission yeast genomes in an effort to identify potential homologs [4, 5]. Most of the known genes were found to have homologs in both Drosophila and fission yeast. The availability of the near-complete genome of Arabidopsis thaliana  provides the foundation for the simultaneous study of all the genes involved in particular plant structures or physiological processes. For example, Barakat et al.  identified and mapped 249 genes encoding ribosomal proteins and analyzed gene number, chromosomal location, evolutionary history (including large-scale chromosomal duplications) and expression of those genes. Beisson et al.  catalogued all genes involved in acyl lipid metabolism. Wang et al.  surveyed more than 1,000 Arabidopsis protein kinases and computationally compared derived protein clusters with established gene families in budding yeast. Previous surveys of Arabidopsis gene families that contain some splicing-related genes include the DEAD box RNA helicase family  and RNA-recognition motif (RRM)-containing proteins . At present, the Arabidopsis Information Resource (TAIR) links to more than 850 such expert-maintained collections of gene families .
Here we present the results of computational identification of potentially all or nearly all Arabidopsis genes involved in pre-mRNA splicing. Recent mass spectrometry analyses revealed more than 200 proteins associated with human spliceosomes ([13–17], reviewed in ). By extensive sequence comparisons using known plant and animal splicing-related proteins as queries, we have identified 74 small nuclear RNA (snRNA) genes and 395 protein-coding genes in the Arabidopsis genome that are likely to be homologs of animal splicing-related genes. About half of the genes occur in multiple copies in the genome and appear to have been derived both from chromosomal duplication events and from duplication of individual genes. All genes were classified into gene families, named and annotated with respect to their inferred gene structure, predicted protein domain structure and presumed function. The classification and analysis results are available as an integrated web resource, the database of Arabidopsis Splicing Related Genes (ASRG), which should facilitate genome-wide studies of pre-mRNA splicing in plants.
Arabidopsis snRNA genes
4-61, 93%; 80-166, 88%
4-50, 91%; 91-166, 88%
10-60, 88%; 84-118, 88%
18-37, 100%; 60-102, 90%
1-46, 89%; 62-100, 89%
1-65, 95%; 81-110, 93%
Conservation of major snRNA genes
As shown in Table 1, each of five major snRNA genes (U1, U2, U4, U5 and U6) exists in more than 10 copies in the Arabidopsis genome. U2 snRNA has the largest copy number, with a total of 18 putative homologs identified. Both U1 and U5 snRNAs have 14 copies, U6 snRNA has 13 copies, and U4 snRNA has only 11 copies. Sequence comparisons within Arabidopsis snRNA gene families showed that the U6 snRNA genes are the most similar, and the U1 snRNA genes are the most divergent. Eight active U6 snRNA copies are more than 93% identical to each other in the genic region, whereas active U1 snRNAs are on average only 87% identical. The U2 and U4 snRNAs are also highly conserved within each type, with more than 92% identity among the active genes. Details about the individual snRNAs and the respective sequence alignments are displayed at .
Previous studies identified two conserved transcription signals in most major snRNA gene promoters: the USE (upstream sequence element, RTCCACATCG, where R is either A or G) and the TATA box [24–27]. All 14 U5 snRNAs have the USE and TATA box. Furthermore, their predicted secondary structures are similar to the known structure of their counterparts in human, indicating that all these genes are active and functional (structure data not shown; for a review of the structures of human snRNAs, see ). Similarly, we identified 17 U2, 10 U1, nine U4, and nine U6 snRNA genes as likely active genes, with a few additional genes more likely to be pseudogenes because of various deletions. U4-10 and U6-7 do not have the conserved USE in the promoter region, but their U4-U6 interaction regions (stem I and stem II) are fairly well conserved. U2-16 is also missing the USE but has a secondary structure similar to other U2 snRNAs. These genes may be active, but differences in promoter motifs suggest that their expression may be under different control compared with other snRNA homologs. The U2-17 snRNA has all conserved transcription signals, but 20 nucleotides are missing from its 3' end. The predicted secondary structure of U2-17 is similar to that of other U2 snRNAs, with a significantly shorter stem-loop in the 3' end as a result of the deletion. We are not sure if the U2-17 snRNA is functional, but the conserved transcription signals imply that it may be active.
Other conserved transcription signals were also identified in most active snRNAs, including the sequence element CAANTC (where N is either A, C, G or T) in U2 snRNAs (located at -6 to -1) , and the termination signal CA(N)3-10AGTNNAA (CA, followed by 3 to 10 arbitrary nucleotides, followed by AGTNNAA) in U snRNAs (U1, U2, U4 and U5) transcribed by RNA polymerase II (Pol II) [23, 24, 32]. The previously identified monocot-specific promoter element (MSP, RGCCCR, located upstream of USE) in U6.1 and U6.26  is also found in five other U6 snRNA genes (U6.29, U6-2, U6-3, U6-4, U6-5). In all seven U6 snRNAs the consensus MSP sequence extends by two thymine nucleotides to RGCCCRTT. Although the MSP does not contribute significantly to U6 snRNA transcription initiation in Nicotiana plumbaginifolia protoplasts , the extended consensus may imply a role in gene expression regulation in Arabidopsis.
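The promoter and terminator signals described above are short degenerate motifs, so they can be located with ordinary regular expressions. The following is a minimal sketch, not the procedure used in this study: the function names `motif_to_regex` and `find_motif` are hypothetical, only the IUPAC codes that occur in the text (R and N) are handled, and the termination signal is read as CA, 3 to 10 arbitrary bases, then AGTNNAA.

```python
import re

# IUPAC codes needed for the motifs in the text (R = A or G, N = any base).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate an IUPAC motif such as 'RTCCACATCG' into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in motif))


USE = motif_to_regex("RTCCACATCG")         # upstream sequence element
MSP_EXTENDED = motif_to_regex("RGCCCRTT")  # extended MSP consensus
# Termination signal, assuming the reading CA + 3..10 bases + AGTNNAA.
TERMINATOR = re.compile("CA[ACGT]{3,10}AGT[ACGT]{2}AA")


def find_motif(pattern: re.Pattern, sequence: str) -> list:
    """Return 0-based start positions of all matches, including overlapping ones."""
    seq = sequence.upper()
    # A lookahead wrapper lets finditer report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?=({pattern.pattern}))", seq)]
```

For example, `find_motif(USE, promoter_sequence)` lists every position where a USE-like element starts in a promoter string; the reverse strand would have to be scanned separately by reverse-complementing the sequence first.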
Low copy number of minor snRNA genes
The minor snRNAs function in the splicing of U12-type (AT-AC) introns. Four types of minor snRNAs, which correspond to four types of major snRNAs, exist in mammals. U11 is the analog of U1, U12 is the analog of U2, U4atac is the analog of U4, and U6atac is the analog of U6. The U5 snRNA seems to function in both the major and minor spliceosome . Two minor snRNAs (atU12 and atU6atac) were experimentally identified in Arabidopsis . Both have the conserved USE and TATA box in the promoter region. We identified another U6atac gene (atU6atac-2) by sequence mapping. This gene has a USE and a TATA box in the promoter region. The atU6atac-2 gene is more than 90% similar to atU6atac in both its 5' and 3' ends, with a 10-nucleotide deletion in the central region. The putative U4atac-U6atac interaction region in atU6atac-2 is 100% conserved with the interaction region previously identified in atU6atac [28, 35].
U11 and U4atac have not been experimentally identified in Arabidopsis. BLAST searches using the human U11 and U4atac homologs as queries against the Arabidopsis genome failed to find any significant hits, indicating divergence of the minor snRNAs in plants and mammals. Using the strategy described below, we successfully identified a putative Arabidopsis U4atac gene. It is a single-copy gene containing all conserved functional domains. We also found a single candidate U11 snRNA gene (chromosome 5, from 17,492,101 to 17,492,600) that has the USE and TATA box in the promoter region. This gene also contains a putative binding site for Sm protein and a region that could pair with the 5' splice site of the U12-type intron.
Identification of an Arabidopsis U4atac snRNA gene
The tentative U4atac snRNA gene contains not only the stem II sequence, but also the stem I sequence that presumably base-pairs with U6atac snRNA stem I. Furthermore, a highly conserved Sm-protein-binding region exists at the 3' end. The predicted secondary structure is nearly identical to that of hsU4atac, with a relatively longer single-stranded region (data not shown). With the highly conserved transcriptional signals, functional domains and secondary structure, this candidate gene is likely to be a real U4atac snRNA homolog. We named it AtU4atac and assigned At4g16065 as its tentative gene model because it is located between gene models At4g16060 and At4g16070 on chromosome 4.
Tandem arrays of snRNA genes
Previous studies had determined 30 snRNA genes and 46 protein-coding genes related to splicing in Arabidopsis (see Tables 1 and 2). In this study, we have computationally identified an additional 44 snRNA genes (Table 1) and 349 protein-coding genes (Table 2) that also may be involved in splicing. Among the five types of U snRNAs, U6 is the most conserved and U1 is the least conserved. We identified seven U1-U4 snRNA gene clusters. We were surprised to see so many U1-U4 clusters in Arabidopsis. In Drosophila, four snRNA clusters were reported , but none of them includes U1-U4 gene pairs. It is likely that a U1-U4 snRNA cluster existed in a progenitor of the current Arabidopsis genome, which was duplicated several times to form the extant seven clusters. The non-clustered U1 and U4 snRNA genes may have arisen by individual gene duplication or gene loss in duplicated clusters.
Among the proteins involved in splicing, most animal homologs are conserved in plants, indicating an ancient, monophyletic origin for the splicing mechanism. A striking feature of plant splicing-related genes is their duplication ratio. Fifty percent of the splicing genes are duplicated in Arabidopsis. The duplication ratio of the splicing-related genes increases from genes encoding snRNP proteins to genes encoding splicing regulators. These data strongly suggest that the general splicing mechanism is conserved, but that the control of splicing may be more diverse in plants.
The high duplication ratio of Arabidopsis splicing-related genes could be the result of evolutionary selection. Unlike animals, which can move around to maintain more homogeneous physiological conditions, plants are exposed to a larger range of stress conditions such as heat and cold. The duplicates will more probably be maintained in the genome as their functions become diversified, and potentially plant-specific, to ensure the fidelity of splicing under such varied conditions. Chromosome duplication has produced several Sm proteins, SR proteins and hnRNP proteins in Arabidopsis, which in turn could create positive selective pressures influencing the rate of duplication for functionally related genes. Because chromosome duplication occurred differentially within each plant lineage, we would expect different duplication patterns of these genes in, for example, monocots and dicots.
As introns evolve rapidly, the mechanism to recognize and splice them should either evolve correspondingly or be flexible enough to accommodate the changes. It seems that plants deploy the most economic and practical way by keeping a largely conserved splicing mechanism and a very flexible recognition and control mechanism. Direct evidence comes from the presence of plant-specific splicing proteins, such as the novel SR protein family and the superfamily of hnRNP A/B. The absence of the SMN complex and of some yeast U1 snRNP proteins from Arabidopsis indicates that other organisms have likewise integrated new proteins or pathways into their splicing mechanisms over the course of evolution. Other evidence supporting a conserved splicing mechanism with flexible regulation includes differential conservation among U snRNAs (U1 snRNAs are less conserved than U6 snRNAs) and high alternative splicing frequency in U1 snRNP proteins, SR proteins and hnRNP proteins. The SR proteins and U1 snRNP proteins are involved in early steps of splicing and 5' and 3' splice-site selection; multiple isoforms of these proteins may be functionally significant in the control of splicing.
It is interesting to note that the overall alternative splicing frequency in splicing-related genes is much higher than the frequency averaged over all Arabidopsis genes. More than half of SR proteins and U1 snRNP proteins show alternative splicing. Alternative splicing might increase protein diversity derived from splicing-related genes, which would further add flexibility to the splicing mechanism. The high frequency of alternative transcripts from splicing-related genes raises another interesting question: how is splicing regulated in these splicing-related genes? One possible answer is that some splicing-related genes may be autoregulated. Accumulation of one transcript would feed back to inhibit or promote other isoforms. Several splicing-related genes have been reported to be regulated in this way. For example, AtGRP7 (hnRNP A/B superfamily) is a circadian clock-regulated protein which negatively autoregulates its expression . When the AtGRP7 protein accumulates over the circadian cycle, it promotes production of alternative transcripts which use a cryptic 5' splice site. The alternative transcripts contain premature stop codons and, as a result of message instability, do not accumulate to high levels, thus decreasing the level of AtGRP7 protein . atSRp30 has similar effects on its own transcripts . Another possible answer is that some splicing-related genes might regulate the splicing of other splicing-related genes. For example, overexpression of AtGRP7 and atSRp30 is known to affect the splicing of AtGRP8 and atSR1, respectively [65, 79]. A third possibility is that the environment could affect the alternative splicing pattern. A good example is the SR1 gene. The ratio of two transcripts from the SR1 gene (SR1B/SR1) increases in a temperature-dependent manner . Generally, heat or cold stress could cause intron retention in some splicing regulators, which could further alter the splicing pattern of other genes.
Intronless genes are a fourth class of possible regulators. Combining all these possibilities, a pathway to regulate splicing could be inferred as follows: environmental changes → splicing pattern changes in some specific splicing-related genes and/or intronless genes → expression pattern changes (including splicing pattern changes) in general splicing-related genes → changes in splicing patterns for specific genes.
A large number of Arabidopsis splicing-related genes were computationally identified in this study by means of sequence comparisons and motif searches, including a tentative U4atac snRNA gene containing all conserved motifs, a new SR protein-coding gene (atRSp32) belonging to the atRSp31 family, and several genes related to genes encoding known splicing-related proteins (atULrp and atFCA2). A web-accessible database containing all the Arabidopsis splicing related genes has been constructed and will be expanded to other organisms in the near future. This compilation should provide a good foundation to study the splicing process in more detail and to determine to what extent these genes are conserved across the entire plant kingdom. Our data show that about 50% of the splicing-related genes are duplicated in Arabidopsis. The duplication ratios for splicing regulators are even higher, indicating that the splicing mechanism is generally conserved among plants, but that the regulation of splicing may be more variable and flexible, thus enabling plants to respond to their specific environments.
Search for Arabidopsis snRNAs
Sequences of the 15 experimentally identified major snRNAs were downloaded from GenBank. The two minor snRNA sequences were compiled from the literature . These genes were used to search against the Arabidopsis genome at the AtGDB BLAST server  and at the SALK T-DNA Express web server . Our initial analysis was based on Release 3.0 of the Arabidopsis genome (GenBank accession numbers NC_003070.4, NC_003071.3, NC_003074.4, NC_003075.3, and NC_003076.4). Local BLAST  was used to derive the locations of the snRNA homologs from more recently sequenced regions of the genome. Criteria used for local BLAST were '-e 1 -F F -W 7' (e-value cutoff of 1, dust filter off, with a minimum word size of 7). Human and maize snRNAs were also included as query sequences, and all hits with e-values less than 10^-5 were regarded as possible homologs. A total of 70 major snRNAs and three minor snRNAs were identified by this method. Each major snRNA type has 10-18 copies in the genome. A tentative gene name and gene model were assigned to each snRNA gene after comparison with the snRNAs identified in MATDB . Sequence-similarity values were based on BLAST alignments.
Search for Arabidopsis splicing-related proteins
A three-round BLAST search strategy was used to identify Arabidopsis splicing-related protein-coding genes. First, sequences of splicing-related proteins from human and Drosophila were downloaded from GenBank according to several recent proteomic studies [15–18] and the website compilation of Stephen Mount's group available at . Human hnRNP proteins identified in a recent review  were downloaded from GenBank. All these sequences were used as queries in a local BLAST search against Arabidopsis annotated proteins (obtained from TIGR at ). All hits with an e-value less than 10^-10 were collected as candidates. Many of these candidates had highly significant e-values (usually 10^-30 or below and much lower than other hits). These candidates were regarded as true homologs.
In the second step, all identified true homologs were used to query the Arabidopsis protein set again. An e-value of 10^-20 was used as a cutoff value to find possible paralogs of the true homologs. Sequences identified in both rounds of BLAST hits were regarded as main candidates for splicing-related proteins.
Finally, the main candidates were queried against GenPept and all annotated human proteins (obtained from Ensembl ). All candidates with significant similarity to proteins unrelated to splicing were removed from the main candidate list, and all candidates with significant similarity to proteins related to splicing were regarded as true splicing-related genes and were promoted to the status of true homologs. The remaining candidates were regarded as unclassified splicing-related proteins. BLAST results were initially analyzed by MuSeqBox . Two custom scripts were written to read MuSeqBox output files, largely automating the search procedure.
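The threshold and triage logic of this search can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the custom scripts mentioned above: the `Hit` record, the function names, and the "splicing"/"other" labels are hypothetical, and the decision of whether a match is to a splicing-related protein is assumed to have been made beforehand (in the study this relied on inspection of the BLAST results).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    query: str     # identifier of the query protein
    subject: str   # identifier of the database protein that was hit
    evalue: float  # BLAST expectation value


def collect_subjects(hits, cutoff):
    """Return every subject with at least one hit whose e-value is below the cutoff."""
    return {h.subject for h in hits if h.evalue < cutoff}


def triage(main_candidates, annotation):
    """Round three: sort candidates by the kind of protein they resemble.

    `annotation` maps a candidate to "splicing" or "other"; a candidate that
    is absent from the mapping had no significant similarity to either kind
    (hypothetical encoding).
    """
    true_homologs, removed, unclassified = set(), set(), set()
    for gene in main_candidates:
        label = annotation.get(gene)
        if label == "splicing":
            true_homologs.add(gene)   # promoted to true homolog
        elif label == "other":
            removed.add(gene)         # resembles a protein unrelated to splicing
        else:
            unclassified.add(gene)    # kept as unclassified splicing-related
    return true_homologs, removed, unclassified
```

With round-one hits in a list, `collect_subjects(hits, 1e-10)` gives the candidate set, and the same helper with a cutoff of `1e-20` applies to the second round when the hits are restricted to queries that are true homologs.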
Gene structure and chromosomal locations
The gene structure and chromosomal locations for the genes encoding splicing-related proteins were retrieved from AtGDB . The chromosomal locations of the snRNA genes were inferred from the BLAST results. The location maps (Figure 1) were generated using the AtGDB advanced search function . Spliced alignments of ESTs and cDNAs generated by GeneSeqer  were used to verify gene models. Gene structure information was used as an important criterion to group homologs into gene families.
InterProScan 3.3 was downloaded from  and was subsequently used to search protein domain databases using default parameters . A Perl script was written to process the text results from InterProScan. Protein domain information was used in comparisons of homologs from different species. The search of the National Center for Biotechnology Information Conserved Domain Database (NCBI-CDD)  was conducted manually for certain genes to confirm the InterPro results.
The gene families with multiple copies were inspected to determine whether they were likely to have derived from chromosome-duplication events. Gene models of the duplicated gene were searched against the gene list of each chromosome redundancy region at MATDB . If the gene and its duplicate were both in the list, they were regarded as a chromosome duplication pair. Otherwise, they were assumed to be produced by random gene duplication.
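The pair classification described here amounts to a membership test against the gene list of each duplicated region. A minimal sketch follows, assuming the region gene lists have already been obtained and are held as Python sets; the function name and the returned labels are hypothetical.

```python
def classify_duplicate_pair(gene_a, gene_b, redundancy_regions):
    """Label a paralog pair by its likely origin.

    `redundancy_regions` is an iterable of sets, one per chromosome
    redundancy region, each holding the gene models listed for that region.
    """
    for region_genes in redundancy_regions:
        # Both copies listed for the same duplicated region: treat the pair
        # as a product of chromosome duplication.
        if gene_a in region_genes and gene_b in region_genes:
            return "chromosome duplication"
    # Otherwise assume the copy arose by duplication of the individual gene.
    return "individual gene duplication"
```

Keeping the regions as sets makes each lookup constant time, which matters little for a few hundred genes but keeps the test readable.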
Identification of alternative splicing
All Arabidopsis ESTs and cDNAs were aligned against the genome using the spliced alignment program GeneSeqer as made available through AtGDB . We retrieved the intron and exon coordinates of the reliable cognate alignments from the database. Scripts were written to identify introns that overlap with other introns or exons. We defined the alternative splicing cases as follows: alternative donor (AltD): an intron has the same 3'-end coordinate as another overlapping intron but a different 5'-end coordinate; alternative acceptor (AltA): an intron has the same 5'-end coordinate as another intron but a different 3'-end coordinate; alternative position (AltP): an intron differs from another overlapping intron in both its 5'-end and 3'-end coordinates; exon skipping (ExonS): an annotated intron completely contains an alternatively identified exon in the same transcription direction; intron retention (IntronR): an annotated intron is completely contained by an alternatively identified exon.
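These definitions translate directly into coordinate comparisons. The sketch below is not the scripts used for the database; it assumes forward-strand features given as inclusive `(start, end)` tuples, so the 5' end is `start` and the 3' end is `end`, and it treats containment as strict. The function names are hypothetical.

```python
def overlaps(a, b):
    """True if two inclusive (start, end) intervals share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]


def classify_intron_pair(intron, other):
    """Compare two forward-strand introns; return AltD, AltA, AltP or None."""
    if intron == other or not overlaps(intron, other):
        return None
    same_5p = intron[0] == other[0]
    same_3p = intron[1] == other[1]
    if same_3p and not same_5p:
        return "AltD"  # shared acceptor, different donor
    if same_5p and not same_3p:
        return "AltA"  # shared donor, different acceptor
    return "AltP"      # both ends differ


def classify_intron_vs_exon(intron, exon):
    """Compare an annotated intron with an alternatively identified exon."""
    if intron[0] < exon[0] and exon[1] < intron[1]:
        return "ExonS"    # the intron completely contains the exon
    if exon[0] < intron[0] and intron[1] < exon[1]:
        return "IntronR"  # the exon completely contains the intron
    return None
```

For reverse-strand genes the roles of `start` and `end` swap, so the coordinates would need to be normalized to transcript orientation before calling these functions.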
Database and interface construction
Details about each splicing-related gene were saved in a MySQL database. PHP scripts were written to interact with the database and generate the interface web pages. Text and BLAST searches were implemented by Perl-cgi scripts.
We thank Shannon Schlueter for help with the web page and database design and implementation. We are also grateful to Shailesh Lal, Carolyn Lawrence and Michael Sparks for discussions and critical reading of the manuscript and to the anonymous reviewers for excellent suggestions. This work was supported in part by a grant from the ISU Plant Sciences Institute and NSF grants DBI-0110189 and DBI-0110254 to V.B.
- Kazan K: Alternative splicing and proteome diversity in plants: the tip of the iceberg has just emerged. Trends Plant Sci. 2003, 8: 468-471. 10.1016/j.tplants.2003.09.001.
- Lorkovic ZJ, Wieczorek Kirk DA, Lambermon MH, Filipowicz W: Pre-mRNA splicing in higher plants. Trends Plant Sci. 2000, 5: 160-167. 10.1016/S1360-1385(00)01595-8.
- Reddy ASN: Nuclear pre-mRNA splicing in plants. Critical Rev Plant Sci. 2001, 20: 523-571. 10.1016/S0735-2689(01)80004-6.
- Mount SM, Salz HK: Pre-messenger RNA processing factors in the Drosophila genome. J Cell Biol. 2000, 150: F37-F44. 10.1083/jcb.150.2.F37.
- Käufer NF, Potashkin J: Analysis of the splicing machinery in fission yeast: a comparison with budding yeast and mammals. Nucleic Acids Res. 2000, 28: 3003-3010. 10.1093/nar/28.16.3003.
- Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- Barakat A, Szick-Miranda K, Chang IF, Guyot R, Blanc G, Cooke R, Delseny M, Bailey-Serres J: The organization of cytoplasmic ribosomal protein genes in the Arabidopsis genome. Plant Physiol. 2001, 127: 398-415. 10.1104/pp.127.2.398.
- Beisson F, Koo AJ, Ruuska S, Schwender J, Pollard M, Thelen JJ, Paddock T, Salas JJ, Savage L, Milcamps A, et al: Arabidopsis genes involved in acyl lipid metabolism. A 2003 census of the candidates, a study of the distribution of expressed sequence tags in organs, and a web-based database. Plant Physiol. 2003, 132: 681-697. 10.1104/pp.103.022988.
- Wang D, Harper JF, Gribskov M: Systematic trans-genomic comparison of protein kinases between Arabidopsis and Saccharomyces cerevisiae. Plant Physiol. 2003, 132: 2152-2165. 10.1104/pp.103.021485.
- Aubourg S, Kreis M, Lecharny A: The DEAD box RNA helicase family in Arabidopsis thaliana. Nucleic Acids Res. 1999, 27: 628-636. 10.1093/nar/27.2.628.
- Lorkovic ZJ, Barta A: Genome analysis: RNA recognition motif (RRM) and K homology (KH) domain RNA-binding proteins from the flowering plant Arabidopsis thaliana. Nucleic Acids Res. 2002, 30: 623-635. 10.1093/nar/30.3.623.
- TAIR: gene family information. [http://www.arabidopsis.org/info/genefamily/genefamily.html]
- Makarova OV, Makarov EM, Urlaub H, Will CL, Gentzel M, Wilm M, Lührmann R: A subset of human 35S U5 proteins, including Prp19, function prior to catalytic step 1 of splicing. EMBO J. 2004, 23: 2381-2391. 10.1038/sj.emboj.7600241.
- Will CL, Schneider C, Hossbach M, Urlaub H, Rauhut R, Elbashir S, Tuschl T, Lührmann R: The human 18S U11/U12 snRNP contains a set of novel proteins not found in the U2-dependent spliceosome. RNA. 2004, 10: 929-941. 10.1261/rna.7320604.
- Zhou Z, Sim J, Griffith J, Reed R: Purification and electron microscopic visualization of functional human spliceosomes. Proc Natl Acad Sci USA. 2002, 99: 12203-12207. 10.1073/pnas.182427099.
- Makarov EM, Makarova OV, Urlaub H, Gentzel M, Will CL, Wilm M, Lührmann R: Small nuclear ribonucleoprotein remodeling during catalytic activation of the spliceosome. Science. 2002, 298: 2205-2208. 10.1126/science.1077783.
- Rappsilber J, Ryder U, Lamond AI, Mann M: Large-scale proteomic analysis of the human spliceosome. Genome Res. 2002, 12: 1231-1245. 10.1101/gr.473902.
- Jurica MS, Moore MJ: Pre-mRNA splicing: awash in a sea of proteins. Mol Cell. 2003, 12: 5-14. 10.1016/S1097-2765(03)00270-3.
- Arabidopsis Splicing Related Genes Database. [http://www.plantgdb.org/prj/SiP/SRGD/ASRG]
- Arabidopsis thaliana Genome Database. [http://www.plantgdb.org/AtGDB]
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680.
- Vankan P, Filipowicz W: Structure of U2 snRNA genes of Arabidopsis thaliana and their expression in electroporated plant protoplasts. EMBO J. 1988, 7: 791-799.
- Vankan P, Edoh D, Filipowicz W: Structure and expression of the U5 snRNA gene of Arabidopsis thaliana. Conserved upstream sequence elements in plant U-RNA genes. Nucleic Acids Res. 1988, 16: 10425-10440.
- Vankan P, Filipowicz W: A U-snRNA gene-specific upstream element and a -30 'TATA box' are required for transcription of the U2 snRNA gene of Arabidopsis thaliana. EMBO J. 1989, 8: 3875-3882.
- Waibel F, Filipowicz W: U6 snRNA genes of Arabidopsis are transcribed by RNA polymerase III but contain the same two upstream promoter elements as RNA polymerase II-transcribed U-snRNA genes. Nucleic Acids Res. 1990, 18: 3451-3458.
- Hofmann CJ, Marshallsay C, Waibel F, Filipowicz W: Characterization of the genes encoding U4 small nuclear RNAs in Arabidopsis thaliana. Mol Biol Rep. 1992, 17: 21-28.
- Shukla GC, Padgett RA: Conservation of functional features of U6atac and U12 snRNAs between vertebrates and higher plants. RNA. 1999, 5: 525-538. 10.1017/S1355838299982213.
- Marker C, Zemann A, Terhorst T, Kiefmann M, Kastenmayer JP, Green P, Bachellerie JP, Brosius J, Huttenhofer A: Experimental RNomics: identification of 140 candidates for small non-messenger RNAs in the plant Arabidopsis thaliana. Curr Biol. 2002, 12: 2002-2013. 10.1016/S0960-9822(02)01304-0.
- ASRG snRNAs. [http://www.plantgdb.org/prj/SiP/SRGD/ASRG/AtsnRNA.php]
- Patel AA, Steitz JA: Splicing double: insights from the second spliceosome. Nat Rev Mol Cell Biol. 2003, 4: 960-970. 10.1038/nrm1259.
- Connelly S, Filipowicz W: Activity of chimeric U small nuclear RNA (snRNA)/mRNA genes in transfected protoplasts of Nicotiana plumbaginifolia: U snRNA 3'-end formation and transcription initiation can occur independently in plants. Mol Cell Biol. 1993, 13: 6403-6415.
- Connelly S, Marshallsay C, Leader D, Brown JW, Filipowicz W: Small nuclear RNA genes transcribed by either RNA polymerase II or RNA polymerase III in monocot plants share three promoter elements and use a strategy to regulate gene expression different from that used by their dicot plant counterparts. Mol Cell Biol. 1994, 14: 5910-5919.
- Tarn WY, Steitz JA: Pre-mRNA splicing: the discovery of a new spliceosome doubles the challenge. Trends Biochem Sci. 1997, 22: 132-137. 10.1016/S0968-0004(97)01018-9.
- Shukla GC, Padgett RA: U4 small nuclear RNA can function in both the major and minor spliceosomes. Proc Natl Acad Sci USA. 2004, 101: 93-98. 10.1073/pnas.0304919101.
- Shukla GC, Cole AJ, Dietrich RC, Padgett RA: Domains of human U4atac snRNA required for U12-dependent splicing in vivo. Nucleic Acids Res. 2002, 30: 4650-4657. 10.1093/nar/gkf609.
- Krämer A: The structure and function of proteins involved in mammalian pre-mRNA splicing. Annu Rev Biochem. 1996, 65: 367-409. 10.1146/annurev.bi.65.070196.002055.
- Will CL, Lührmann R: Protein functions in pre-mRNA splicing. Curr Opin Cell Biol. 1997, 9: 320-328. 10.1016/S0955-0674(97)80003-8.
- ASRG proteins. [http://www.plantgdb.org/prj/SiP/SRGD/ASRG/ASRP-home.php]
- Stevens SW, Abelson J: Purification of the yeast U4/U6.U5 small nuclear ribonucleoprotein particle and identification of its proteins. Proc Natl Acad Sci USA. 1999, 96: 7226-7231. 10.1073/pnas.96.13.7226.
- Stevens SW, Barta I, Ge HY, Moore RE, Young MK, Lee TD, Abelson J: Biochemical and genetic analyses of the U5, U6, and U4/U6 × U5 small nuclear ribonucleoproteins from Saccharomyces cerevisiae. RNA. 2001, 7: 1543-1553.
- Gottschalk A, Neubauer G, Banroques J, Mann M, Lührmann R, Fabrizio P: Identification by mass spectrometry and functional analysis of novel proteins of the yeast [U4/U6.U5] tri-snRNP. EMBO J. 1999, 18: 4535-4548. 10.1093/emboj/18.16.4535.
- Caspary F, Shevchenko A, Wilm M, Seraphin B: Partial purification of the yeast U2 snRNP reveals a novel yeast pre-mRNA splicing factor required for pre-spliceosome assembly. EMBO J. 1999, 18: 3463-3474. 10.1093/emboj/18.12.3463.
- Krämer A, Grüter P, Gröning K, Kastner B: Combined biochemical and electron microscopic analyses reveal the architecture of the mammalian U2 snRNP. J Cell Biol. 1999, 145: 1355-1368. 10.1083/jcb.145.7.1355.
- Fabrizio P, Esser S, Kastner B, Lührmann R: Isolation of S. cerevisiae snRNPs: comparison of U1 and U4/U6.U5 to their human counterparts. Science. 1994, 264: 261-265.
- Will CL, Lührmann R: Spliceosomal UsnRNP biogenesis, structure and function. Curr Opin Cell Biol. 2001, 13: 290-301. 10.1016/S0955-0674(00)00211-8.
- Xiong L, Gong Z, Rock CD, Subramanian S, Guo Y, Xu W, Galbraith D, Zhu JK: Modulation of abscisic acid signal transduction and biosynthesis by an Sm-like protein in Arabidopsis. Dev Cell. 2001, 1: 771-781. 10.1016/S1534-5807(01)00087-9.
- Golovkin M, Reddy AS: Structure and expression of a plant U1 snRNP 70K gene: alternative splicing of U1 snRNP 70K pre-mRNAs produces two different transcripts. Plant Cell. 1996, 8: 1421-1435. 10.1105/tpc.8.8.1421.
- Simpson GG, Clark GP, Rothnie HM, Boelens W, van Venrooij W, Brown JW: Molecular characterization of the spliceosomal proteins U1A and U2B' from higher plants. EMBO J. 1995, 14: 4540-4550.
- Casacuberta E, Puigdomenech P, Monofort A: A genomic duplication in Arabidopsis thaliana contains a sequence similar to the human gene coding for SAP130. Plant Physiol Biochem. 2001, 39: 565-573. 10.1016/S0981-9428(01)01280-3.
- Golovkin M, Reddy AS: Expression of U1 small nuclear ribonucleoprotein 70K antisense transcript using APETALA3 promoter suppresses the development of sepals and petals. Plant Physiol. 2003, 132: 1884-1891. 10.1104/pp.103.023192.
- Gottschalk A, Tang J, Puig O, Salgado J, Neubauer G, Colot HV, Mann M, Seraphin B, Rosbash M, Lührmann R, Fabrizio P: A comprehensive biochemical and genetic analysis of the yeast U1 snRNP reveals five novel proteins. RNA. 1998, 4: 374-393.
- McLean MR, Rymond BC: Yeast pre-mRNA splicing requires a pair of U1 snRNP-associated tetratricopeptide repeat proteins. Mol Cell Biol. 1998, 18: 353-360.
- Huang T, Vilardell J, Query CC: Pre-spliceosome formation in S. pombe requires a stable complex of SF1-U2AF(59)-U2AF(23). EMBO J. 2002, 21: 5516-5526. 10.1093/emboj/cdf555.
- Lewis JD, Gorlich D, Mattaj IW: A yeast cap binding protein complex (yCBC) acts at an early step in pre-mRNA splicing. Nucleic Acids Res. 1996, 24: 3332-3336. 10.1093/nar/24.17.3332.
- Kmieciak M, Simpson CG, Lewandowska D, Brown JW, Jarmolowski A: Cloning and characterization of two subunits of Arabidopsis thaliana nuclear cap-binding complex. Gene. 2002, 283: 171-183.
- Hugouvieux V, Kwak JM, Schroeder JI: An mRNA cap binding protein, ABH1, modulates early abscisic acid signal transduction in Arabidopsis. Cell. 2001, 106: 477-487. 10.1016/S0092-8674(01)00460-3.
- Domon C, Lorkovic ZJ, Valcarcel J, Filipowicz W: Multiple forms of the U2 small nuclear ribonucleoprotein auxiliary factor U2AF subunits expressed in higher plants. J Biol Chem. 1998, 273: 34603-34610. 10.1074/jbc.273.51.34603.
- Lopato S, Waigmann E, Barta A: Characterization of a novel arginine/serine-rich splicing factor in Arabidopsis. Plant Cell. 1996, 8: 2255-2264. 10.1105/tpc.8.12.2255.
- Lopato S, Mayeda A, Krainer AR, Barta A: Pre-mRNA splicing in plants: characterization of Ser/Arg splicing factors. Proc Natl Acad Sci USA. 1996, 93: 3074-3079. 10.1073/pnas.93.7.3074.
- Lopato S, Forstner C, Kalyna M, Hilscher J, Langhammer U, Indrapichate K, Lorkovic ZJ, Barta A: Network of interactions of a novel plant-specific Arg/Ser-rich protein, atRSZ33, with atSC35-like splicing factors. J Biol Chem. 2002, 277: 39989-39998. 10.1074/jbc.M206455200.
- Golovkin M, Reddy AS: The plant U1 small nuclear ribonucleoprotein particle 70K protein interacts with two novel serine/arginine-rich proteins. Plant Cell. 1998, 10: 1637-1648. 10.1105/tpc.10.10.1637.
- Golovkin M, Reddy AS: An SC35-like protein and a novel serine/arginine-rich protein interact with Arabidopsis U1-70K protein. J Biol Chem. 1999, 274: 36428-36438. 10.1074/jbc.274.51.36428.
- Lazar G, Schaal T, Maniatis T, Goodman HM: Identification of a plant serine-arginine-rich protein similar to the mammalian splicing factor SF2/ASF. Proc Natl Acad Sci USA. 1995, 92: 7672-7676.
- Lopato S, Kalyna M, Dorner S, Kobayashi R, Krainer AR, Barta A: atSRp30, one of two SF2/ASF-like proteins from Arabidopsis thaliana, regulates splicing of specific plant genes. Genes Dev. 1999, 13: 987-1001.
- Lopato S, Gattoni R, Fabini G, Stevenin J, Barta A: A novel family of plant splicing factors with a Zn knuckle motif: examination of RNA binding and splicing activities. Plant Mol Biol. 1999, 39: 761-773. 10.1023/A:1006129615846.
- Lazar G, Goodman HM: The Arabidopsis splicing factor SR1 is regulated by alternative splicing. Plant Mol Biol. 2000, 42: 571-581. 10.1023/A:1006394207479.
- Tronchere H, Wang J, Fu XD: A protein related to splicing factor U2AF35 that interacts with U2AF65 and SR proteins in splicing of pre-mRNA. Nature. 1997, 388: 397-400. 10.1038/41137.
- Lin CH, Patton JG: Regulation of alternative 3' splice site selection by constitutive splicing factors. RNA. 1995, 1: 234-245.
- ASRG SR protein gene structure. [http://www.plantgdb.org/prj/SiP/SRGD/ASRG/Display.php?GID=2.2&Gst=1]
- Cowper AE, Caceres JF, Mayeda A, Screaton GR: Serine-arginine (SR) protein-like factors that antagonize authentic SR proteins and regulate alternative splicing. J Biol Chem. 2001, 276: 48908-48914. 10.1074/jbc.M103967200.
- Chan SP, Kao DI, Tsai WY, Cheng SC: The Prp19p-associated complex in spliceosome activation. Science. 2003, 302: 279-282. 10.1126/science.1086602.
- Yong J, Pellizzoni L, Dreyfuss G: Sequence-specific interaction of U1 snRNA with the SMN complex. EMBO J. 2002, 21: 1188-1196. 10.1093/emboj/21.5.1188.
- Bender J, Fink GR: AFC1, a LAMMER kinase from Arabidopsis thaliana, activates STE12-dependent processes in yeast. Proc Natl Acad Sci USA. 1994, 91: 12105-12109.
- Savaldi-Goldstein S, Aviv D, Davydov O, Fluhr R: Alternative splicing modulation by a LAMMER kinase impinges on developmental and transcriptome expression. Plant Cell. 2003, 15: 926-938. 10.1105/tpc.011056.
- Krecic AM, Swanson MS: hnRNP complexes: composition, structure, and function. Curr Opin Cell Biol. 1999, 11: 363-371. 10.1016/S0955-0674(99)80051-9.
- Heintzen C, Melzer S, Fischer R, Kappeler S, Apel K, Staiger D: A light- and temperature-entrained circadian clock controls expression of transcripts encoding nuclear proteins with homology to RNA-binding proteins in meristematic tissue. Plant J. 1994, 5: 799-813. 10.1046/j.1365-313X.1994.5060799.x.
- Lambermon MH, Fu Y, Wieczorek Kirk DA, Dupasquier M, Filipowicz W, Lorkovic ZJ: UBA1 and UBA2, two proteins that interact with UBP1, a multifunctional effector of pre-mRNA maturation in plants. Mol Cell Biol. 2002, 22: 4346-4357. 10.1128/MCB.22.12.4346-4357.2002.
- Staiger D, Zecca L, Wieczorek Kirk DA, Apel K, Eckstein L: The circadian clock regulated RNA-binding protein AtGRP7 autoregulates its expression by influencing alternative splicing of its own pre-mRNA. Plant J. 2003, 33: 361-371. 10.1046/j.1365-313X.2003.01629.x.
- Simpson GG, Dijkwel PP, Quesada V, Henderson I, Dean C: FY is an RNA 3' end-processing factor that interacts with FCA to control the Arabidopsis floral transition. Cell. 2003, 113: 777-787. 10.1016/S0092-8674(03)00425-2.
- Macknight R, Bancroft I, Page T, Lister C, Schmidt R, Love K, Westphal L, Murphy G, Sherson S, Cobbett C, Dean C: FCA, a gene controlling flowering time in Arabidopsis, encodes a protein containing RNA-binding domains. Cell. 1997, 89: 737-745. 10.1016/S0092-8674(00)80256-1.
- Quesada V, Macknight R, Dean C, Simpson GG: Autoregulation of FCA pre-mRNA processing controls Arabidopsis flowering time. EMBO J. 2003, 22: 3142-3152. 10.1093/emboj/cdg305.
- Paillard L, Legagneux V, Osborne HB: A functional deadenylation assay identifies human CUG-BP as a deadenylation factor. Biol Cell. 2003, 95: 107-113. 10.1016/S0248-4900(03)00010-8.
- Lambermon MH, Simpson GG, Wieczorek Kirk DA, Hemmings-Mieszczak M, Klahre U, Filipowicz W: UBP1, a novel hnRNP-like protein that functions at multiple steps of higher plant nuclear pre-mRNA maturation. EMBO J. 2000, 19: 1638-1649. 10.1093/emboj/19.7.1638.
- Lorkovic ZJ, Wieczorek Kirk DA, Klahre U, Hemmings-Mieszczak M, Filipowicz W: RBP45 and RBP47, two oligouridylate-specific hnRNP-like proteins interacting with poly(A)+ RNA in nuclei of plant cells. RNA. 2000, 6: 1610-1624. 10.1017/S1355838200001163.
- Riechmann JL, Heard J, Martin G, Reuber L, Jiang C, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR, et al: Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science. 2000, 290: 2105-2110. 10.1126/science.290.5499.2105.
- Vision TJ, Brown DG, Tanksley SD: The origins of genomic duplications in Arabidopsis. Science. 2000, 290: 2114-2117. 10.1126/science.290.5499.2114.
- AtGDB BLAST. [http://www.plantgdb.org/cgi-bin/PlantGDB/AtGDB/BRview.pl]
- T-DNAexpress: the SIGnAL Arabidopsis gene mapping tool. [http://signal.salk.edu/cgi-bin/tdnaexpress]
- MIPS: MATDB snRNAs. [http://mips.gsf.de/cgi-bin/proj/thal/search_type?all/185]
- Drosophila mRNA processing factors. [http://www.life.umd.edu/labs/Mount/factors]
- TIGR ftp site. [ftp://ftp.tigr.org/pub/data/a_thaliana/ath1/SEQUENCES/]
- Ensembl. [http://www.ensembl.org]
- Xing L, Brendel V: Multi-query sequence BLAST output examination with MuSeqBox. Bioinformatics. 2001, 17: 744-745. 10.1093/bioinformatics/17.8.744.
- AtGDB. [http://www.plantgdb.org/AtGDB]
- AtGDB advanced search. [http://www.plantgdb.org/cgi-bin/PlantGDB/AtGDB/ASview.pl]
- Brendel V, Xing L, Zhu W: Gene structure prediction from consensus spliced alignment of multiple ESTs matching the same genomic locus. Bioinformatics. 2004, 20: 1157-1169. 10.1093/bioinformatics/bth058.
- InterPro. [http://www.ebi.ac.uk/interpro]
- Zdobnov EM, Apweiler R: InterProScan - an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001, 17: 847-848. 10.1093/bioinformatics/17.9.847.
- NCBI-CDD search. [http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml]
- Arabidopsis thaliana: MATDB Redundancy Viewer. [http://mips.gsf.de/proj/thal/db/gv/rv/rv_frame.html]
- Zhu W, Schlueter SD, Brendel V: Refined annotation of the Arabidopsis genome by complete expressed sequence tag mapping. Plant Physiol. 2003, 132: 469-484. 10.1104/pp.102.018101.
- PHYLIP. [http://evolution.genetics.washington.edu/phylip.html]
- Hirayama T, Shinozaki K: A cdc5+ homolog of a higher plant, Arabidopsis thaliana. Proc Natl Acad Sci USA. 1996, 93: 13371-13376. 10.1073/pnas.93.23.13371.
- Landsberger M, Lorkovic ZJ, Oelmuller R: Molecular characterization of nucleus-localized RNA-binding proteins from higher plants. Plant Mol Biol. 2002, 48: 413-421. 10.1023/A:1014089531125.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.