A meta-analysis of bioinformatics software benchmarks reveals that publication-bias unduly influences software accuracy





1. Background

Bioinformatics provides widely used and powerful software tools for analysing and making inferences from biological data.

As the volume and complexity of biological data increases, our reliance upon heuristic methods that trade software speed for mathematical completeness also increases. In this study we investigate whether trade-offs between speed and accuracy are reasonable, and explore the relationship between some commonly used proxies for quality (e.g. recency, number of users, journal impact and author reputation) and software accuracy.

2. Method

We developed a literature mining approach to identify published benchmarks of computational biology software using PubMed records. We manually collected data from manuscripts that met our three inclusion criteria: (1) the main focus of the article is a benchmark; (2) the authors are reasonably neutral; and (3) the test data and evaluation criteria are sensible.

We extracted data on the relative accuracy and speed of 243 software tools. For each of these we also collected data on recency (date of first publication), number of users (approximated by the number of citations), journal impact (impact factor and H5 index) and the reputation of the corresponding authors (H index and M index).
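For readers unfamiliar with the author-reputation metrics above: the H index is the largest h such that an author has h papers each cited at least h times, and the M index normalises this by career length. A minimal sketch, using hypothetical citation counts (not data from this study):

```python
def h_index(citations):
    """H index: largest h such that h papers have >= h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, n in enumerate(ranked, start=1):
        if n >= i:
            h = i
        else:
            break
    return h

def m_index(citations, years_since_first_paper):
    """M index: H index divided by years since the author's first publication."""
    return h_index(citations) / years_since_first_paper

# Hypothetical author with five papers, publishing for 8 years
print(h_index([10, 8, 5, 4, 3]))      # 4
print(m_index([10, 8, 5, 4, 3], 8))   # 0.5
```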

We attempted to model the potential relationships between software accuracy and each of speed, author reputation metrics, journal impact and method age, using Spearman’s rank correlations, linear models over all parameters and a weighted sum-Z test.
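Two of these tests can be sketched compactly. The following is a minimal illustration, using only the Python standard library and hypothetical numbers; the study's actual per-benchmark p-values and weights are not reproduced here, and the Spearman implementation below omits tie correction for simplicity:

```python
from statistics import NormalDist

def spearman_rho(x, y):
    """Spearman's rank correlation (no tie correction, for simplicity)."""
    def ranks(values):
        order = sorted(range(len(values)), key=lambda i: values[i])
        r = [0.0] * len(values)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def weighted_sum_z(p_values, weights):
    """Weighted Stouffer (sum-Z) combination of one-sided p-values."""
    nd = NormalDist()
    z = [nd.inv_cdf(1 - p) for p in p_values]
    numerator = sum(w * zi for w, zi in zip(weights, z))
    denominator = sum(w * w for w in weights) ** 0.5
    return 1 - nd.cdf(numerator / denominator)

# Hypothetical accuracy/speed ranks and per-benchmark p-values
print(spearman_rho([1, 2, 3, 4], [1, 3, 2, 4]))          # 0.8
print(weighted_sum_z([0.12, 0.40, 0.03], [10, 25, 8]))
```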

Using a permutation-based analysis, we show that a number of regions in the accuracy-speed landscape have either significant over- or under-representation of software tools.
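The logic of such a permutation test can be sketched as follows. Under the null hypothesis, accuracy and speed ranks are independent, so shuffling one axis and re-counting tools in a region of the landscape gives the null distribution for that region's occupancy. This sketch uses synthetic ranks, not the study's tool data:

```python
import random

def cell_count(acc, spd, acc_lo, acc_hi, spd_lo, spd_hi):
    """Number of tools falling inside one rectangular region of the landscape."""
    return sum(acc_lo <= a < acc_hi and spd_lo <= s < spd_hi
               for a, s in zip(acc, spd))

def permutation_p(acc, spd, region, n_perm=2000, seed=0):
    """One-sided p-value for over-representation of tools in `region`."""
    rng = random.Random(seed)
    observed = cell_count(acc, spd, *region)
    spd = list(spd)  # local copy; the caller's list is untouched
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(spd)  # break any accuracy-speed association
        if cell_count(acc, spd, *region) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one avoids p = 0

# Synthetic, perfectly correlated ranks: the "accurate and fast" corner
# is heavily over-represented, so the p-value is small.
acc = list(range(100))
spd = list(range(100))
print(permutation_p(acc, spd, region=(50, 100, 50, 100)))
```

Under-representation is tested symmetrically, counting permutations with as few or fewer tools than observed.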

3. Results

We found that author reputation, journal impact, the number of citations, software speed and age are not reliable predictors of software accuracy.

We also found that there is an excess of “slow and inaccurate” software tools across multiple sub-disciplines of bioinformatics. Meanwhile, there is a major discrepancy between the expected and observed number of tools of “middling accuracy and speed”. Specifically, there appears to be a missing cohort of tools that compromise between speed and accuracy. Based upon this result we hypothesise that a strong publication bias unduly influences the publication and development of bioinformatic software tools.

4. Conclusions

We found that many of the common proxies researchers use to select software tools, such as journal impact, are poor predictors of software quality.

An over-abundance of early-to-publish, slow and inaccurate tools across different fields appears to mask the anticipated association between software accuracy and speed.

The under-representation of quality software that trades some accuracy for speed is concerning. The few tools that fit this category include the widely used methods BLAST and HMMER2. Such tools may be more difficult to publish due to editorial and reviewer practices, which leaves an unfortunate gap in the literature upon which future software refinements cannot be constructed.

5. Future ideas/collaborators needed to further research?

One of the email responses to this study was “like most things that have happened lately, it's either hilarious or depressing, and I can't quite make up my mind which”. This is a sentiment that captures much of how we feel about this study too. We think the results have broad implications for the field of bioinformatics.

Software developers, software users, reviewers and editors could all benefit from keeping these results in mind when evaluating the impact and likely accuracy of software.

A perceived weakness of the current manuscript is that we haven’t definitively proven that the under-representation of software of middling accuracy and speed is due to a publication bias. We are currently developing different publishing models and attempting to simulate the impact of different scenarios on the software accuracy and speed landscape.

We’re more than happy to hear further ideas about how we can improve the manuscript, results or interpretations and conclusions.

6. Please share a link to your paper


