Access and Feeds

Crisis of Reproducibility: The Need for a Reproducibility Checklist

By Dick Weisinger

‘Not reproducible’, or as some Google researchers call it, ‘underspecification’. AI papers may report spectacular and glowing results, but when other researchers attempt to replicate those results by following the methodology specified in the paper, they often fail.

Why? There are a multitude of reasons. Intentional cheating, such as manipulating data to achieve the desired result for professional gain, is likely not a major one, although it can’t be entirely dismissed.

One of the big problems is that many research results are based on a very small data set, with all of the data coming from the same location. This lack of data diversity can lead to impressive results on that narrow slice of data which then fail when applied to anything outside of that domain.

On the opposite side of the spectrum are results derived from extremely large data sets, for example data available only to large tech companies, processed on extremely powerful computers. In this case, others cannot even attempt to reproduce the results because of the cost of acquiring the resources to try.

Anna Rogers, a machine-learning researcher at the University of Massachusetts, asked in a Wired article, “Is that even research anymore? It’s not clear if you’re demonstrating the superiority of your model or your budget.”

Ben Dickson, founder of TechTalks, quoted comments from users of the Reddit r/MachineLearning community in his article on the topic: “Probably 50%-75% of all papers are unreproducible. It’s sad, but it’s true,” “Think about it, most papers are ‘optimized’ to get into a conference. More often than not the authors know that a paper they’re trying to get into a conference isn’t very good! So they don’t have to worry about reproducibility because nobody will try to reproduce them,” and “it’s easier to compile a list of reproducible papers than nonreproducible ones.”

To help combat the problem of reproducibility, Joelle Pineau, a computer science professor at McGill, has developed what she calls a “reproducibility checklist”. She suggests the checklist be required for accepted papers to provide additional guidance about how the results were achieved. The checklist covers items such as the data used, the number of models trained, the compute resources consumed, and any additional assumptions made in arriving at the results. Ideally, making the code and data sets used in the research publicly available would be a major step toward enabling reproducibility.
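To make the idea concrete, a checklist like this could be represented as structured metadata attached to a paper submission, which reviewers or submission systems could check automatically. The sketch below is purely illustrative; the field names are assumptions and are not the official items on Pineau’s checklist.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# A hypothetical sketch of a reproducibility checklist as submission
# metadata. Field names here are illustrative assumptions, not the
# official checklist items.
@dataclass
class ReproducibilityChecklist:
    dataset_described: bool = False          # data sources and preprocessing described?
    num_models_trained: Optional[int] = None # total models trained, including discarded runs
    compute_resources: Optional[str] = None  # hardware and total compute used
    assumptions: List[str] = field(default_factory=list)
    code_public: bool = False                # is the code publicly available?
    data_public: bool = False                # are the data sets publicly available?

    def missing_items(self) -> List[str]:
        """Return the checklist items a submission has not filled in."""
        missing = []
        if not self.dataset_described:
            missing.append("dataset description")
        if self.num_models_trained is None:
            missing.append("number of models trained")
        if self.compute_resources is None:
            missing.append("compute resources")
        if not self.code_public:
            missing.append("public code")
        if not self.data_public:
            missing.append("public data")
        return missing


# A submission that only describes its data would still be flagged
# for the remaining items.
incomplete = ReproducibilityChecklist(dataset_described=True)
print(incomplete.missing_items())
```

Structured like this, a conference submission system could refuse papers whose checklist still has missing items, rather than relying on reviewers to notice gaps by hand.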

