Table 1 Runtime penalty for an increasing number of bins in an IBF. The values represent ratios by which the query run time increases when using an IBF with b bins compared to an IBF with 64 bins. We simulated b equally sized (user) bins for \(b\in \{64,256,512,...,32768\}\) of random sequence content (1 GiB in total) and sampled 1 million reads of length 250. We then constructed an IBF over these b bins using 32-mers and 2 hash functions. We conducted this experiment for different false-positive rates, including 0.0001, 0.0125, and 0.3125, which resulted in IBFs of different densities. Next, we measured the time required to count the k-mer occurrences in each of the b bins for all 1 million reads. The reported values are the average of five repetitions

From: Hierarchical Interleaved Bloom Filter: enabling ultrafast, approximate sequence queries

\(\textbf{b}\)	64	128	256	512	1024	2048	4096	8192	16384	32768
\(p_{fpr}=0.0001\)	1.00	1.06	1.35	1.35	1.57	1.96	3.41	5.49	6.81	10.95
\(p_{fpr}=0.0125\)	1.00	1.01	1.17	1.34	1.83	2.70	5.39	8.62	15.05	28.33
\(p_{fpr}=0.3125\)	1.00	1.28	1.55	2.26	3.78	6.54	12.94	24.45	47.63	93.47

Back to article page

ISSN: 1474-760X

Contact us

Submission enquiries: editorial@genomebiology.com
General enquiries: info@biomedcentral.com