‘Data leakage’ threatens the reliability of machine-learning use across disciplines, researchers warn.
From biomedicine to political science, researchers increasingly use machine learning as a tool to make predictions on the basis of patterns in their data. But the claims in many such studies are likely to be overblown, according to a pair of researchers at Princeton University in New Jersey. They want to sound an alarm about what they call a “brewing reproducibility crisis” in machine-learning-based science.
Machine learning is being sold as a tool that researchers can learn in a few hours and use by themselves — and many follow that advice, says Sayash Kapoor, a machine-learning researcher at Princeton. “But you wouldn’t expect a chemist to be able to learn how to run a lab using an online course,” he says. And few scientists realize that the problems they encounter when applying artificial intelligence (AI) algorithms are common to other fields, says Kapoor, who has co-authored a preprint on the ‘crisis’1. Peer reviewers do not have the time to scrutinize these models, so academia currently lacks mechanisms to root out irreproducible papers, he says. Kapoor and his co-author Arvind Narayanan created guidelines for scientists to avoid such pitfalls, including an explicit checklist to submit with each paper.
Kapoor and Narayanan’s definition of reproducibility is broad. It says that other teams should be able to replicate the results of a model, given full details of the data, code and conditions — often termed computational reproducibility, which is already a concern for machine-learning scientists. The pair also define a model as irreproducible when researchers make errors in data analysis — such as data leakage, in which information from the test set contaminates training — that mean the model is not as predictive as claimed.
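The article does not give a worked example, but one common form of data leakage can be sketched in a few lines: if a preprocessing step such as feature selection is fitted on the full dataset before the train/test split, information about the test rows leaks into the model, inflating reported accuracy. The minimal demonstration below (not from the paper; the data, classifier and numbers are illustrative assumptions) trains a simple nearest-centroid classifier on pure noise, where honest accuracy should hover near chance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pure noise: 200 samples, 2000 features, random binary labels.
# There is no real signal, so honest test accuracy should be near 0.5.
X = rng.normal(size=(200, 2000))
y = rng.integers(0, 2, size=200)
n = len(y) // 2  # 50/50 train/test split


def top_k_features(X_rows, y_rows, k=20):
    # Pick the k features most correlated with the labels in the given rows.
    corr = np.abs([np.corrcoef(X_rows[:, j], y_rows)[0, 1]
                   for j in range(X_rows.shape[1])])
    return np.argsort(corr)[-k:]


def evaluate(X_sel, y):
    # Nearest-centroid classifier: train on first half, test on second half.
    Xtr, ytr, Xte, yte = X_sel[:n], y[:n], X_sel[n:], y[n:]
    c0 = Xtr[ytr == 0].mean(axis=0)
    c1 = Xtr[ytr == 1].mean(axis=0)
    pred = (np.linalg.norm(Xte - c1, axis=1) <
            np.linalg.norm(Xte - c0, axis=1)).astype(int)
    return (pred == yte).mean()


# LEAKY: feature selection sees ALL rows, including the "held-out" test half.
leaky_acc = evaluate(X[:, top_k_features(X, y)], y)

# CLEAN: feature selection uses only the training half.
clean_acc = evaluate(X[:, top_k_features(X[:n], y[:n])], y)

print(f"leaky accuracy: {leaky_acc:.2f}, clean accuracy: {clean_acc:.2f}")
```

On noise data the leaky pipeline typically reports accuracy well above chance while the clean pipeline does not — exactly the kind of error that makes a model look more predictive than it is, and that an explicit checklist is meant to catch.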