
Table 5 Statistics of benchmark variants in chromosomes 1–22 of each genome aligned to the GRCh38 reference genome. Four genomes with GIAB benchmark variant calls, with v3.3.2 for HG001 and HG005-7, and v4.2.1 for HG002-4, together with the statistics within the high-confidence regions. For HX1, high-confidence regions are created by removing GIAB “all difficult-to-map” regions from the GRCh38 reference genome

From: NanoCaller for accurate detection of SNPs and indels in difficult-to-map regions from long-read sequencing by haplotype-aware deep neural networks

 

| Genome | Whole genome SNPs | Whole genome indels | High-confidence region SNPs | High-confidence region indels | High-confidence region total length (bp) | High-confidence region % of genome | Non-homopolymer region indels |
|---|---|---|---|---|---|---|---|
| HG001 | 3,002,314 | 517,177 | 2,960,486 | 483,941 | 2,330,204,759 | 81.05 | 181,036 |
| HG002 | 3,459,843 | 587,987 | 3,365,115 | 525,466 | 2,542,724,465 | 88.44 | 210,352 |
| HG003 | 3,430,611 | 569,180 | 3,327,480 | 504,497 | 2,529,085,210 | 87.97 | 199,302 |
| HG004 | 3,454,689 | 576,301 | 3,346,597 | 510,516 | 2,525,035,837 | 87.83 | 200,556 |
| HG005 | 2,945,666 | 432,747 | 2,904,403 | 403,859 | 2,290,538,775 | 79.67 | 172,678 |
| HG006 | 3,030,507 | 435,520 | 2,982,278 | 405,828 | 2,348,035,455 | 81.67 | 158,063 |
| HG007 | 3,048,404 | 437,866 | 3,000,039 | 407,892 | 2,345,850,549 | 81.59 | 157,966 |
| HX1 | 3,489,068 | 697,736 | 2,788,450 | 176,587 | 2,182,959,159 | 75.93 | |
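
The caption says the HX1 high-confidence regions are obtained by removing the GIAB "all difficult-to-map" regions from the reference. The source does not say which tool was used. In practice an interval tool such as `bedtools subtract` would be a common choice, but that is an assumption. The sketch below illustrates the same idea in plain Python on toy data: subtract excluded intervals from each chromosome, then report the remaining length and its share of the genome, analogous to the "Total Length" and "% of genome" columns. The function names (`merge`, `subtract`, `summarize`), the chromosome lengths, and the intervals are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), as in BED files


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and merge any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def subtract(chrom_length: int, excluded: List[Interval]) -> List[Interval]:
    """Return the parts of [0, chrom_length) not covered by `excluded`."""
    kept: List[Interval] = []
    cursor = 0  # first position not yet assigned to kept or excluded
    for start, end in merge(excluded):
        # Clip each excluded interval to the chromosome bounds.
        start, end = max(start, 0), min(end, chrom_length)
        if start >= end:
            continue
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < chrom_length:
        kept.append((cursor, chrom_length))
    return kept


def summarize(
    chrom_lengths: Dict[str, int],
    excluded: Dict[str, List[Interval]],
) -> Tuple[int, float]:
    """Return (retained length, retained percentage of total length)."""
    total = sum(chrom_lengths.values())
    kept_length = 0
    for chrom, length in chrom_lengths.items():
        for start, end in subtract(length, excluded.get(chrom, [])):
            kept_length += end - start
    return kept_length, round(100 * kept_length / total, 2)


# Toy data only: these are not real GRCh38 lengths or GIAB regions.
toy_lengths = {"chr1": 1000, "chr2": 500}
toy_difficult = {"chr1": [(100, 200), (150, 300)], "chr2": [(0, 50)]}

print(subtract(1000, toy_difficult["chr1"]))  # [(0, 100), (300, 1000)]
print(summarize(toy_lengths, toy_difficult))  # (1250, 83.33)
```

With real inputs, the chromosome lengths would come from the reference index for chromosomes 1–22 and the excluded intervals from the stratification BED file. Whether the paper's percentage uses exactly this denominator is not stated in the table.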