Communications of the ACM

ACM Careers

Machine Learning Is Contributing to the Reproducibility Crisis in Science

Genevera Allen of Rice University

Machine learning systems and the use of big data sets have accelerated the reproducibility crisis in science, Genevera Allen says.

Machine-learning techniques used by thousands of scientists to analyze data are contributing to the reproducibility crisis in science by producing results that are misleading and often wrong. Genevera Allen of Rice University warns scientists that if they don't improve their techniques, they will be wasting both time and money.

A growing amount of scientific research involves using machine learning software to analyze data that has already been collected. Allen says the answers this software produces are likely to be inaccurate or wrong because it identifies patterns that exist only in that data set and not in the real world.

"There is general recognition of a reproducibility crisis in science right now. I would venture to argue that a huge part of that does come from the use of machine learning techniques in science," Allen says.

From BBC News