Communications of the ACM

ACM Careers

Could Machine Learning Fuel a Reproducibility Crisis in Science?

The research community risks losing public trust owing to the severity and prevalence of the reproducibility crisis, the authors say.

Credit: Getty Images

From biomedicine to political sciences, researchers increasingly use machine learning as a tool to make predictions on the basis of patterns in their data. But the claims in many such studies are likely to be overblown, according to Sayash Kapoor and Arvind Narayanan at Princeton University. They want to sound an alarm about what they call a "brewing reproducibility crisis" in machine-learning-based sciences.

Machine learning is being sold as a tool that researchers can learn and use by themselves, and many follow that advice, Kapoor says. That can create issues and lead to reproducibility failures, according to "Leakage and the Reproducibility Crisis in ML-based Science." The researchers have created guidelines for scientists to avoid such pitfalls, including an explicit checklist to submit with each paper.

Their rallying cry has struck a chord. More than 1,200 people have signed up for an online workshop, to be held July 28, designed to come up with and disseminate solutions. "Unless we do something like this, each field will continue to find these problems over and over again," Kapoor says.

From Nature