Scope | Main areas of improvement | Suggestions | Outcomes |
---|---|---|---|
1. Extensibility | Poor extensibility of benchmarks (addition of new components such as methods or metrics) | Manage code with workflow management systems (see above), improve documentation, and organize code so that new components can be added (increase modularity) | Greater potential for code reuse by method developers and higher overall research article quality. Reduced effort required for future benchmarks with the same scope. Ultimately, improved comparison of results across studies |
2. Output availability | Intermediate and final benchmarking outputs are often not made public or are not explorable | Provide (intermediate) outputs in a suitable format as supplementary material, or make the available code complete enough to fully regenerate intermediate results | Easier access to information for readers (e.g. for specific case studies). Outputs can be reused in other comparison studies |
3. Parameters | Most evaluated methods are run with default settings only | Evaluate the methods' sensitivity to parameters that require fine-tuning | Users become aware of the critical parameters to adjust when fine-tuning is necessary |
4. Workflow management and containers | Workflow management systems and containers are scarcely used | Encourage training in these tools through scientific workshops and undergraduate courses | Improved reproducibility of benchmarks and increased chance that they will be reused or extended, thus improving their visibility |
5. Code licences | Licences are seldom defined, making any code reuse unclear | Define a licence based on the scope of the research and the potential for commercial distribution | Increased reuse of code and decreased chance of (un)intentional misuse of intellectual property |
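The modularity recommended in Scopes 1 and 2 above can be sketched in code. The following is a minimal, hypothetical Python example (not drawn from any surveyed benchmark; all names, data, and file paths are illustrative): methods and metrics live in registries so that adding a new component is a single dictionary entry, and each method's intermediate predictions are written to disk so they remain explorable and reusable.

```python
import json
from pathlib import Path

# Hypothetical example data: ground-truth values and the inputs each method sees.
TRUTH = [1.0, 2.0, 3.0, 4.0]
INPUT = [1.1, 1.9, 3.2, 3.8]

# Registries keep the benchmark modular (Scope 1): adding a method or metric
# is one entry here, with no changes to the driver code below.
METHODS = {
    "identity": lambda xs: list(xs),
    "smooth": lambda xs: [sum(xs) / len(xs)] * len(xs),
}
METRICS = {
    "mae": lambda pred, truth: sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth),
    "max_err": lambda pred, truth: max(abs(p - t) for p, t in zip(pred, truth)),
}

def run_benchmark(outdir="results"):
    """Run every method, score it with every metric, and persist the
    intermediate predictions (Scope 2) so they can be re-explored later."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    scores = {}
    for m_name, method in METHODS.items():
        pred = method(INPUT)
        # Intermediate output: each method's raw predictions go to disk.
        (out / f"{m_name}_predictions.json").write_text(json.dumps(pred))
        scores[m_name] = {k: metric(pred, TRUTH) for k, metric in METRICS.items()}
    (out / "scores.json").write_text(json.dumps(scores, indent=2))
    return scores

if __name__ == "__main__":
    print(run_benchmark())
```

In a real benchmark the driver loop would typically be replaced by a workflow manager (Scope 4), with each method and metric as a separate rule or process, but the registry pattern above is what makes that translation straightforward.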