Accelerated exon evolution within primate segmental duplications

Table 1 Number of exons, transcripts and genes analyzed

	Exons	Txs	Genes	Exons dup (%)	Txs dup	Genes dup
Total	193,165	28,099	18,850	11,132 (5.76)	3,966	2,445
Studied	178,295	26,383	17,367	10,634 (5.96)	3,642	2,187
D_e > D_i	25,559	16,405	11,059	3,231 (12.64)	2,220	1,387
q < 0.05	625	802	573	226 (36.16)	291	200
q < 0.05, coverage MMU, D_i > 0.01	244	319	226	133 (54.51)	173	119
Domains	39	46	36	16 (41.03)	17	14
PPs	35	52	33	19 (54.29)	30	18
Manually rejected	96	140	96	57 (59.38)	84	57
Good	74	86	64	41 (55.41)	43	31

'Exons' refers to the nonredundant list of human coding exons in RefSeq. We define a transcript as a unique combination of RefSeq ID, gene name, and coordinates, whereas genes are determined solely by gene name. Exons in the 'Studied' category are those for which a corresponding intronic region was determined. The proportion of duplicated exons (Exons dup) relative to all exons in the same row is shown in parentheses. 'D_e > D_i' refers to exons with a higher rate of change than their neighboring introns. Significant increases are shown as 'q < 0.05'. The following row additionally requires sufficient coverage of macaque reads in the introns (more than two reads on average) and an intronic rate greater than 0.01. Numbers of exons discarded because of tandem protein domains (Domains), because they are processed pseudogenes (PPs), or after visual inspection for even coverage (Manually rejected) are also shown; the exons that remain are listed as 'Good'. Dup, duplicated; MMU, Macaca mulatta; Txs, transcripts.
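The percentages in parentheses can be recomputed from the counts. The following is a minimal sketch, not part of the original analysis: the names `ROWS` and `dup_percentage` are hypothetical, the counts are copied from a subset of the rows of Table 1, and the assumption that the four final categories partition the filtered set is inferred from the counts rather than stated in the legend.

```python
# Counts copied from Table 1: category -> (exons, duplicated exons).
# ROWS and dup_percentage are illustrative names, not from the paper.
ROWS = {
    "Total": (193165, 11132),
    "Studied": (178295, 10634),
    "Filtered": (244, 133),  # q < 0.05, coverage MMU, D_i > 0.01
    "Domains": (39, 16),
    "PPs": (35, 19),
    "Manually rejected": (96, 57),
    "Good": (74, 41),
}


def dup_percentage(exons, exons_dup):
    """Percentage of duplicated exons among the exons of one row."""
    return round(100.0 * exons_dup / exons, 2)


# Assumption: the four final categories split the filtered set exactly.
FINAL = ("Domains", "PPs", "Manually rejected", "Good")
final_exons = sum(ROWS[name][0] for name in FINAL)
final_dup = sum(ROWS[name][1] for name in FINAL)

if __name__ == "__main__":
    for name, (exons, exons_dup) in ROWS.items():
        print(f"{name}: {exons_dup}/{exons} = {dup_percentage(exons, exons_dup)}%")
    print("Final categories sum to filtered set:",
          (final_exons, final_dup) == ROWS["Filtered"])
```

Each percentage uses the exon count of its own row as the denominator, which is why the proportion of duplicated exons can rise as the filters become stricter.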

ISSN: 1474-760X