Thursday, September 22, 2011

How can social sector information avoid the filter bubble?

by Jeff Stanger

A dispatch from the Communications Network annual conference in Boston, following a plenary by "Filter Bubble" author Eli Pariser

How can those who communicate about important social issues avoid the information "filter bubble" that Eli Pariser argues has resulted from algorithms, code, and data-based personalization of web sites? This is a significant question as more people turn to digital sources for news and information. If you haven't read up on Eli's thesis, he sees algorithms (i.e., internet code) as the new information gatekeepers, gradually filtering out material we haven't "liked" or clicked on via search engines and social media. (Read more about Pariser's book, The Filter Bubble.) The tech industry's desire, made increasingly doable by the vast amounts of online behavior data now available, is to become bespoke tailors of the online experience. This trend is of particular concern to those who are trying to raise awareness of more serious social and policy topics that don't fit neatly into this "like" economy. If your issue hasn't been "liked" or clicked on, it may eventually fade from public view.

If we grant Pariser's argument that code and algorithms are the new gatekeepers, one solution I see is to engage those gatekeepers on their digital-era terms (much as we did with newspaper editors in the old days). Rather than "search engine optimization" that tries to game the coder's creation, why not engage directly in "search engine partner-ization," transacted on data, where organizations with important information work alongside those who are writing the code? Let me explain:

This approach relies on two concepts: enhanced search results and web-friendly, structured data.

Enhanced Search Results: Bing bills itself as a "decision engine," providing more than a simple list of links. Similarly, Google has been moving aggressively to deliver more than the "Los Links" we've become so familiar with. Note Google's acquisition of travel data provider ITA Software for $700 million. They didn't spend that kind of cash in order to include more links to ITA's web site (they can do that for $0 million). They did it for the data the travel company has, for the purpose of creating a souped-up experience. The trend is toward enhanced search results. Another example: do a search for the baseball hero of your choice, say Albert Pujols. Los Links? Nope. Glance down the page. Albert's thumbnail baseball card, complete with statistics as of last night's Cards game, is resident on the results page. Google didn't suddenly start tracking baseball stats; it's pulling them over the internet as a service.

Structured Data: Enhanced search results like this depend upon web-friendly, structured data (XML, JSON, GeoJSON, etc.) delivered in raw but machine-readable format to web applications (Google is a massive web application, not a site), usually via an API (application programming interface). For the non-programmer, this means the search engine gets the data from another location on the internet and seamlessly integrates them into its results format.
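For readers who would like to see what "integrating structured data into a results format" might look like, here is a minimal Python sketch. It is only an illustration under stated assumptions: the JSON payload, its field names, and the render_result_card function are all made up for this example, and nothing here reflects how Google or Bing actually build their results.

```python
import json

# Hypothetical payload: the kind of JSON a data provider's API might return.
# The field names and values are invented for illustration only.
RAW_RESPONSE = """
{
  "player": "Example Player",
  "team": "Example Team",
  "stats": {"games": 10, "hits": 12, "at_bats": 40}
}
"""


def render_result_card(raw_json: str) -> str:
    """Parse machine-readable JSON and format it as a small text 'card'."""
    data = json.loads(raw_json)
    stats = data["stats"]
    # Derive a display value from the raw numbers instead of storing it.
    average = stats["hits"] / stats["at_bats"]
    return (
        f"{data['player']} ({data['team']})\n"
        f"Games: {stats['games']}  Hits: {stats['hits']}  AVG: {average:.3f}"
    )


if __name__ == "__main__":
    print(render_result_card(RAW_RESPONSE))
```

The point of the sketch is that the application never reads a web page or a report. It receives named fields it can compute on and lay out however it likes.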

Why don't social policy researchers and foundations get in on this? They should. Foundations and their grantees have piles of data on a wide array of important social issues. They invest millions in collecting those data, which illuminate the scope and nature of pressing social conditions. Imagine typing in "poverty rate over time" or "number of uninsured Americans" and seeing graphical or tabular displays of those figures, funded by foundations, rigorously gathered by subject matter experts, delivered directly to the search engine of your choice as structured data. Instead of spinning wheels with search engine optimization, subjecting this vital information to the filter bubble, the social sector would be giving it to the digital gatekeepers in a format they can readily use.
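On the publishing side, the idea above could be sketched roughly as follows. This is a hedged illustration, not a standard: the build_indicator_payload function, the field names, and the source label are assumptions of mine, and the numbers are placeholders rather than real statistics.

```python
import json


def build_indicator_payload(name, unit, source, observations):
    """Package (year, value) observations as a JSON document.

    The schema here is hypothetical; a real partnership would agree
    on field names with whoever consumes the data.
    """
    return json.dumps(
        {
            "indicator": name,
            "unit": unit,
            "source": source,
            "observations": [
                {"year": year, "value": value}
                # Sort by year so consumers can chart the series directly.
                for year, value in sorted(observations)
            ],
        },
        indent=2,
    )


# Placeholder values only. These are NOT real poverty statistics.
PLACEHOLDER_OBSERVATIONS = [(2003, 3.0), (2001, 1.0), (2002, 2.0)]

payload = build_indicator_payload(
    name="poverty rate over time",
    unit="percent",
    source="Example Foundation (hypothetical)",
    observations=PLACEHOLDER_OBSERVATIONS,
)

if __name__ == "__main__":
    print(payload)
```

The same figures that would otherwise sit inside a PDF report become a small document that any web application can parse, chart, or tabulate.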

But it will take a paradigm shift. Foundations and research grantees are used to thinking in terms of "data as documents" — policy reports, academic journal articles, etc. — pages and files to be somehow found by the code. The new paradigm sees "data as data" — web-friendly, structured data designed to be baked into the code.

My gut tells me that the digital gatekeepers will want to partner. Structured data are a raw material they desperately need. Social sector information will cost them $700 million less, and has far more social significance than data on flights to Boston (sorry Boston). Foundations' research investments will enjoy long digital shelf lives and prominent placement. Their issues will be "covered" by the technological gatekeepers of our time. And our dialogue on issues of public importance will be better for it.

Comments welcome...