"Data Is" or "Data Are"?

September 7, 2012 Paul M. Davis

“Data is” or “data are”? It’s the type of sticky linguistic thicket that invites vociferous debate. Working with data and writing about that work are both vocations that presumably appeal to the meticulous and, well, the pedantic. (I’ll plead guilty to the latter trait, if not the former.) Traced back to its Latin roots, “data” is the plural of “datum”, which looks like an airtight case for advocates of “data are”.

This page on the Wake Forest University website makes the case for plural usage in characteristic fashion:

[The] word data is plural, like people, so you cannot correctly say “That satellite tracking data is nice.” You could say “That satellite tracking datum is nice,” if you’re talking about just one piece of information, because datum is singular, like person is singular. That is why you will read sentences like “These data are too hot to handle.”

If you use the word data correctly, you will be one up on people like network newscasters, Presidents of the United States, and just about everyone else, who think the word data is singular. On the other hand, if you use it incorrectly, probably no one will notice!

It’s a solid argument, but “correct” grammar changes with common usage, no matter what prescriptivists might prefer. If we strictly followed Latin grammar or the lessons of elementary school teachers, we’d ask one another “on what did you step?” in response to a foot injury, rather than “what did you step on?” And as usage of data in the singular becomes increasingly common, “data are” often sounds overly formal, even incorrect, to the ear.

In a post on The Wall Street Journal blog Real Time Economics this morning, Paul Martin, the newspaper’s Assistant Managing Editor and final word on house style, allows the singular usage of data in certain cases:

Most style guides and dictionaries have come to accept the use of the noun data with either singular or plural verbs, and we hereby join the majority.

As usage has evolved from the word’s origin as the Latin plural of datum, singular verbs now are often used to refer to collections of information: Little data is available to support the conclusions.

Otherwise, generally continue to use the plural: Data are still being collected.

On The Guardian’s Datablog, editor Simon Rogers argues in favor of the singular, pointing to this pithy ruling from the paper’s style guide:

Data takes a singular verb (like agenda), though strictly a plural; no one ever uses “agendum” or “datum”.

I generally err towards “data are”, if for no other reason than to avoid angry email from data and/or grammar geeks. This often feels overly formal, and in casual conversation, I’m far more likely to use “data is”. Even in articles, the plural usage is often awkward, as noted by O’Reilly Media’s Alex Howard in a Twitter conversation I had with him earlier:

@suzisteffen @jeffcdi @paulmdavis @pogowasright I read “data are the new oil” and cringe, though. Just. Doesn’t. Sound. Right.

— Alex Howard (@digiphile) September 7, 2012

@digiphile @suzisteffen @jeffcdi @pogowasright yeah, I read that and my mind asks, “who are these data you speak of?” ;)

— Paul M. Davis (@paulmdavis) September 7, 2012

@paulmdavis @suzisteffen @jcstearns @jeffcdi “These are not the data you are looking for” #ObiWan

— Alex Howard (@digiphile) September 7, 2012
