Communications of the ACM

ACM News

I’m a Data Scientist Who is Skeptical About Data

When the numbers mean nothing.

Data doesn't say anything. Humans say things.


After millennia of relying on anecdotes, instincts, and old wives' tales as evidence for our opinions, most of us today demand that people use data to support their arguments and ideas. Whether it's curing cancer, solving workplace inequality, or winning elections, data is now perceived as being the Rosetta stone for cracking the code of pretty much all of human existence.

But in the frenzy, we've conflated data with truth. And this has dangerous implications for our ability to understand, explain, and improve the things we care about.

I have skin in this game. I am a professor of data science at NYU and a social-science consultant for companies, where I conduct quantitative research to help them understand and improve diversity. I make my living from data, yet I consistently find that whether I'm talking to students or clients, I have to remind them that data is not a perfect representation of reality: It's a fundamentally human construct, and therefore subject to biases, limitations, and other meaningful and consequential imperfections.

The clearest expression of this misunderstanding is the question heard from boardrooms to classrooms when well-meaning people try to get to the bottom of tricky issues:

"What does the data say?"

Data doesn't say anything. Humans say things. They say what they notice or look for in data—data that only exists in the first place because humans chose to collect it, and they collected it using human-made tools.


From Quartz