Fig. 4 | Genome Biology

Fig. 4

From: DESMAN: a new tool for de novo extraction of strains from metagenomes

a Variant detection for the 75 CONCOCT clusters of the complex strain mock that were 75% pure and complete. Here, 36 clusters (shown) had variants, and 27 of these mapped onto multi-strain species, enabling us to calculate the number of variants that were present in the species (true positives or TPs), the number detected but not in the species (false positives or FPs) and the number we failed to detect (false negatives or FNs).

b Haplotype inference accuracy. For the 25 CONCOCT clusters that were 75% complete, possessed variants and mapped onto species with strain variation, we plot the true number of strains (x-axis) against the inferred number (y-axis), with random jitter to distinguish data points. The colour reflects the mean error rate in SNV predictions on single-copy core genes (Err) and the size reflects the total coverage of the cluster (see Additional file 1: Table S8 for actual values).

c Comparison of the true relative strain frequency and the inferred haplotype frequency across the 96 samples for the complex strain mock. The data points are coloured by the SNV error rate (E) in the haplotype prediction. (Linear regression of true vs. predicted frequency, all haplotypes: slope = 0.820, adjusted R-squared = 0.741, p-value < 2.2×10⁻¹⁶; haplotypes with E < 0.01: slope = 0.853, adjusted R-squared = 0.810, p-value < 2.2×10⁻¹⁶.)

d Haplotype SNV error vs. gene presence/absence inference error rate. For each of the 67 inferred haplotypes, we plot the SNV error rate on single-copy core genes relative to the closest reference strain against the error rate in the prediction of gene presence/absence in that strain.

Cov coverage, Err error, FN false negative, FP false positive, SNP single-nucleotide polymorphism, SNV single-nucleotide variant, TP true positive
