Fig. 2
From: happi: a hierarchical approach to pangenomics inference

We investigate the performance of methods for testing for differential gene presence under simulation. (Left) We find that logistic regression methods (e.g., GLM-Rao) do not control type 1 error, while happi-np controls type 1 error at nominal levels for all sample sizes. Additionally, we find that happi-a controls type 1 error for large sample sizes (\(n=100\)) and lower correlation between quality variables and the covariate of interest (\(\sigma_x=0.5\)). (Right) For tests that control error rates at nominal levels, we evaluate the power of happi-np and happi-a to reject a false null hypothesis, finding that happi-a has slightly higher power than happi-np at sample size \(n=100\). We find that power increases for all methods as sample sizes and effect sizes grow, but decreases with greater correlation between quality variables and the covariate of interest.
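
For readers unfamiliar with how type 1 error and power are estimated by simulation, the following is a minimal sketch of the general idea, not the simulation used for this figure. It does not implement happi or GLM-Rao: a pooled two-proportion z-test stands in as the test, and the function names, presence probabilities, group size, and number of replicates are all illustrative assumptions. The empirical type 1 error is the fraction of rejections when the null is true (equal presence probabilities), and the empirical power is the fraction of rejections when the null is false.

```python
import math
import random


def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test.

    This is a stand-in test chosen for illustration; it is NOT happi or GLM-Rao.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # gene present in all or no samples: no evidence against the null
    z = (x1 / n1 - x2 / n2) / se
    # 2 * (1 - Phi(|z|)) expressed through the complementary error function
    return math.erfc(abs(z) / math.sqrt(2))


def rejection_rate(p1, p2, n_per_group, n_sims=2000, alpha=0.05, seed=0):
    """Fraction of simulated datasets in which the test rejects at level alpha.

    p1 and p2 are hypothetical gene-presence probabilities in the two groups.
    With p1 == p2 this estimates type 1 error; with p1 != p2 it estimates power.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        x1 = sum(rng.random() < p1 for _ in range(n_per_group))
        x2 = sum(rng.random() < p2 for _ in range(n_per_group))
        if two_proportion_p_value(x1, n_per_group, x2, n_per_group) <= alpha:
            rejections += 1
    return rejections / n_sims


# Null is true: the rejection rate should sit near the nominal level alpha.
type1_error = rejection_rate(0.5, 0.5, n_per_group=50)
# Null is false: the rejection rate is the empirical power.
power = rejection_rate(0.3, 0.7, n_per_group=50)
print(f"empirical type 1 error: {type1_error:.3f}")
print(f"empirical power:        {power:.3f}")
```

A test "controls type 1 error at nominal levels" when the first rate stays at or below `alpha`, up to Monte Carlo noise. Power comparisons such as the one in the right panel are only meaningful among tests that pass this check.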