
Table 3 Using cardinality estimates does not decrease classification performance on the test dataset. KrakenUniq in the default mode, which uses HyperLogLog cardinality estimation with precision 14, classifies reads as accurately as KrakenUniq using exact counting, on both the species and genus level. (Only genus-level results are shown in the table, which also shows Kraken’s performance for comparison.) Note that we tested two versions of exact counting. In version 1, we implemented exact counting using the C++ standard library’s unordered_set. Most of the time is spent merging counters at the end for report generation. In version 2, we implemented exact counting using khash from klib (https://github.com/attractivechaos/klib/). KrakenUniq uses version 2. Both the unordered sets and the hash map require heap allocations for updating, which can incur a significant runtime cost because of global locks. Wall clock time for KrakenUniq includes report generation (which takes an additional 2m33s for Kraken).

From: KrakenUniq: confident and fast metagenomics classification using unique k-mer counts

 

| | Kraken | KrakenUniq Default | KrakenUniq Exact(1) | KrakenUniq Exact(2) |
|---|---|---|---|---|
| Computational performance | | | | |
| Wall clock time | 17m38s | 14m18s | 3h30m6s | 45m30s |
| Speed [Mbp/m] | 478.4 | 595.4 | 95.9 | 377.8 |
| Memory [GB] | 167.1 | 168.2 | 466.2 | 272.4 |
| Minor page faults × 10⁶ | 203.5 | 192.2 | 272.5 | 904.6 |
| Classification performance | | | | |
| Recall | 0.827 | 0.888 | 0.888 | 0.888 |
| F1 score | 0.922 | 0.935 | 0.935 | 0.935 |

  1. Bold values indicate the highest or lowest values in each row
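
The caption contrasts HyperLogLog estimation at precision 14 with exact distinct counting. As a rough illustration only, the Python sketch below implements the classic HyperLogLog estimator (with the linear-counting correction for small cardinalities) and compares it with an exact set-based count. It is not KrakenUniq's implementation. The class name `HyperLogLog`, the helper `compare_counts`, the use of BLAKE2b as a 64-bit hash, and the synthetic `kmer-i` strings are all assumptions made for this example.

```python
import hashlib
import math


class HyperLogLog:
    """Classic HyperLogLog sketch with 2**p one-byte registers (illustrative only)."""

    def __init__(self, p=14):
        self.p = p
        self.m = 1 << p                    # number of registers
        self.registers = bytearray(self.m)

    def add(self, item):
        # Hash the item to 64 bits (BLAKE2b is an assumption of this sketch).
        digest = hashlib.blake2b(str(item).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        tail_bits = 64 - self.p
        idx = h >> tail_bits               # first p bits select a register
        rest = h & ((1 << tail_bits) - 1)  # remaining bits determine the rank
        # Rank = 1-based position of the leftmost 1-bit in the tail
        # (tail_bits + 1 if the tail is all zeros).
        rank = tail_bits - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def estimate(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)   # bias constant, valid for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: fall back to linear counting.
            return m * math.log(m / zeros)
        return raw


def compare_counts(n_distinct, repeats=2, p=14):
    """Return (exact, estimated) distinct counts for synthetic items."""
    exact = set()
    hll = HyperLogLog(p)
    for _ in range(repeats):               # repeats add duplicates only
        for i in range(n_distinct):
            item = f"kmer-{i}"             # hypothetical stand-in for a k-mer
            exact.add(item)
            hll.add(item)
    return len(exact), hll.estimate()


if __name__ == "__main__":
    exact, approx = compare_counts(100_000)
    print(f"exact={exact} estimate={approx:.0f} "
          f"relative error={abs(approx - exact) / exact:.2%}")
```

With p = 14 the sketch keeps 2^14 = 16,384 one-byte registers no matter how many distinct items are added, and the expected relative standard error of the classic estimator is about 1.04/sqrt(2^14), roughly 0.8%. The exact set, by contrast, grows with the number of distinct items, which is in line with the higher memory figures reported for the exact-counting columns.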