Fig. 4 | Genome Biology

Fig. 4

From: A super-pangenome of the North American wild grape species

Characterization of genes in the super-pangenome. a Gene-based pangenome modeling. For each combination of 1–9 genomes, the number of genes is represented per class. The lines represent smoothed conditional means with a 0.95 confidence interval. b Gene class composition for each genome. c Percentage of genes attributed to the same class in both the graph-derived and orthology-based pangenomes. The number of genes classified consistently by both approaches is shown on the left of each bar, and the number classified differently is shown on the right. d Transcript length (kb); e CDS length (kb); f Number of exons; g Percentage of annotated domains; h Transcript abundance (TPM; transcripts per million); i TE-affected genes; j dN/dS ratios; and k Expanded/contracted gene families per class of genomes. P values were determined using ANOVA, and significant groups were assigned following Tukey’s “Honest Significant Difference” method. The middle bars represent the median, while the bottom and top of each box represent the 25th and 75th percentiles, respectively. The whiskers extend to 1.5 times the interquartile range, and data beyond the end of the whiskers are plotted individually as outlying points. For the dN/dS ratios (j), P values were determined using a two-tailed Student’s t test. To prevent the compaction of the y-axis by extreme outlying points, the upper limit was capped. Panels a–k share the same color legend. Enriched gene ontology (GO) terms for the core genes (l) and the variable (dispensable + private) genes (m) are represented as circles, with their size depending on the number of genes involved and their color based on the number of genomes in which the term was detected as significant. For each GO term, the number of significant genes relative to the total number of annotated genes for the term is represented as a gene ratio on the x-axis.
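The counting behind panel a can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `presence` mapping, and the class rules are assumptions made for illustration. Here a gene is called core if it occurs in every genome of the chosen combination, private if it occurs in exactly one, and dispensable otherwise; with a single genome, every gene therefore counts as core.

```python
from collections import Counter
from itertools import combinations


def classify_genes(presence, genomes):
    """Count genes per class for one combination of genomes.

    presence: dict mapping a gene ID to the set of genomes carrying it
              (hypothetical input format).
    genomes:  iterable of genome names forming the combination.
    """
    selected = set(genomes)
    counts = Counter()
    for carriers in presence.values():
        hits = len(carriers & selected)
        if hits == 0:
            continue  # gene absent from this combination
        if hits == len(selected):
            counts["core"] += 1  # assumed rule: present in all genomes
        elif hits == 1:
            counts["private"] += 1  # assumed rule: present in exactly one
        else:
            counts["dispensable"] += 1
    return counts


def pangenome_curve(presence, genomes):
    """Return one (k, core, dispensable, private) row per combination."""
    rows = []
    for k in range(1, len(genomes) + 1):
        for combo in combinations(genomes, k):
            c = classify_genes(presence, combo)
            rows.append((k, c["core"], c["dispensable"], c["private"]))
    return rows


# Toy presence/absence data, invented for this example only.
toy_presence = {
    "g1": {"A", "B", "C"},
    "g2": {"A", "B"},
    "g3": {"C"},
    "g4": {"A"},
}
toy_rows = pangenome_curve(toy_presence, ["A", "B", "C"])
```

Each row corresponds to one point in a plot such as panel a; a smoothed mean per class over `k` could then be fitted with any plotting library.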