Monday, August 2, 2010

Data Are Not Information

by Jeff Stanger

Data and information are not synonyms. Data only have the potential to inform. They are half the equation. It is communication that transforms data into information, and in a digital age the communication landscape has been fundamentally altered. This requires using new mechanisms born natively in interactive media to effectively turn data into meaningful information.

Research is the process of gathering data — measuring something quantitatively or qualitatively, formally or informally — usually involving a question and by necessity a methodology and instrument. This process results in data broadly defined, whether numeric, textual, or audio-visual, structured or unstructured. (See, for example, Lucy Bernholz's "What Kind of Data are we Talking About?" and a video interview explaining "data" as anything that can be digitized.) All data depend on measurement, even data as common and seemingly objective as the weather. I could conduct research by walking outside and gathering the data points "hot" and "humid." Or I could collect the data "90 degrees" and "50% relative humidity" from the same set of circumstances using a thermometer and a hygrometer. Same reality, different data. Why? Different research.

Research → Data

Communication turns data into information

Data and information are not synonyms. Data only have the potential to inform. They are half the equation. Information equals data plus communication.

Information = Data + Communication

Information results from finding something out and then telling others about it. Data that go uncommunicated, or are ineffectively communicated, amount to personal knowledge or useless raw material. My measurement of "hot" and "humid" remains my personal data/knowledge until I walk back into the house and tell my wife. But what if I tell her in a language she doesn't understand? What if I don't speak loudly enough? What if I give her a written note that she ignores in favor of a richly interactive Accuweather iPad app? I have failed to inform not because of a failure of data, but because of a failure of communication. Communication turns data into information.

Because data must be communicated to be informative, our methods of communication become paramount. In a digital age, these methods have been revolutionized, fundamentally altering the landscape in which data become information. This shift started fifteen years ago with the widespread adoption of the World Wide Web, accelerated eight to ten years ago with the start of broadband proliferation and the introduction of more capable Web browsers, flew off the desktop three years ago with the release of the iPhone (followed later by the SDK and App Store) and other rich-media smartphones, and most recently hit the gas again with the unveiling of the iPad.

This rapid expansion of the possibilities in digital communication has unquestionably transformed information consumption. According to a recent survey by the Center for the Digital Future at the University of Southern California's Annenberg School for Communication & Journalism, 82% of Americans now have access to the Internet, with 78% of them rating the Net as a "very important" or "important" source of information, surpassing both television (68%) and newspapers (56%) by wide margins (view survey results). The Pew Research Center's Internet & American Life Project finds that more than eight in 10 Americans have mobile phones, and they are doing a whole lot more than making calls: 38% of them (and rising) use those devices to access the Internet (view survey results).

Whether in journalism, government, health care, philanthropy, policy research, education, advocacy, publishing or countless other fields, the conditions under which data become information look different every day.

Meanwhile, also thanks to the powerful effects of digital technology, our ability to gather and store raw data (that is, to do research) has expanded exponentially. This is well addressed in a February 2010 Economist cover story, "Data, data everywhere" (subscription required). Note that the article uses the terms data and information interchangeably, which is incorrect in my opinion:

Quoting Kenneth Cukier: Information has gone from scarce to superabundant. That brings huge new benefits — but also big headaches.

The world contains an unimaginably vast amount of digital information which is getting vaster ever more rapidly. This makes it possible to do many things that previously could not be done... But [data] are also creating a host of new problems.

Joe Hellerstein, a computer scientist at the University of California in Berkeley, calls it 'the industrial revolution of data.' The effect is being felt everywhere, from business to science, from government to the arts. Scientists and computer engineers have coined a new term for the phenomenon: 'big data'.

Citing Hal Varian, Google's chief economist: Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.

The communication platforms for converting data into information are shifting under our feet, while the data fire hose hits us full-force. As a result, to paraphrase Varian, we are awash in data, but barely wet with information.

The opportunities and strains in transforming data into information in a digital environment are evident in the open data, open government, and Gov 2.0 movements related to government data; linked data or semantic web efforts in the Web standards community; the newspaper industry's grappling with evolving digital platforms and the new commercial models that accompany them; the push for electronic health records; and elsewhere.

A fresh example comes from the health field: Paul Tarini of the Robert Wood Johnson Foundation's Pioneer Portfolio, in a recent interview with O'Reilly Radar, said that Pioneer sponsored the Open Source Convention's (OSCON) Health Track to explore how to make vast amounts of health data more "useful and meaningful" to both patients and health care providers by creating data-driven Web-based and mobile applications. In other words, how do we turn these large oceans of data into useful wells of information, and do it with effective communication in digital media?

That sort of transformation is at the heart of what I call digital information — raising the information-equals-data-plus-communication equation to the technology power.

Digital Information = (Data + Communication)^Technology

Technology has transformed both the way we collect data and the way we communicate. Producing digital information requires more than creating digital clones of constructs such as books, pages, reports, white papers, articles, and stories borrowed from a pre-digital era. That approach — digital distribution — amounts to unimaginative (read, ineffective) communication of data given today's diverse and transformative digital toolbox. Instead, successful information in a rapidly changing communication age demands new mechanisms born natively in the media of our time, utilizing their unique and powerful interactive capabilities. Interactive information graphics, data-rich Web applications, integrated multimedia, maps, smartphone and tablet applications all baked with dynamic computation, customization, and visualization — things impossible in a pre-digital era (or even six months ago) — can and should raise our narratives to the technology power. Only then will digital data become digital information.

Comments welcome...