Last Friday marked the one-year anniversary of the Obama Administration’s Open Government Initiative (OGI). The occasion was honored with a cupcake and candle on the landing page of the newly re-designed Data.gov site and a widely disseminated announcement from the White House.
For global publishers that have generated a significant portion of revenue building and selling databases, a requirement to make government data freely available is a mixed blessing. Although global access to and use of the data are expected to rise exponentially, balance sheets will take a hit.
Databases are not just part of a publisher’s portfolio; done right, they can be its most profitable part, and they have sometimes carried the less profitable and declining parts of the publishing lineup, namely books. Presses affected by this change must quickly seek new ways to recapture publishing expense and reinvent the services they provide.
Conversely, if a business has retooled to conceive of and build data services, it’s a golden egg. For publishers in adjacent spaces — CQ Press, Bloomberg, LexisNexis, Thomson Reuters, National Journal, CQ-Roll Call, the Washington Post — access to troves of free, authoritative, updated data presents a significant opportunity to create new revenue streams by developing bespoke products and services that monetize free content.
What’s It All About?
For those unfamiliar with the OGI, an excellent summary of the initiative and the role the Office of Scientific and Technical Information (OSTI) has played can be found on the OSTIblog in an article written by Walt Warnick, Director of OSTI, and Peter Lincoln, co-author of the Department of Energy (DOE) Open Government Plan:
On January 21, 2009, his first full day in office, President Barack Obama signed the Memorandum on Transparency and Open Government. The memo was addressed to the heads of all Cabinet departments and agencies, and in it, the President called for “an unprecedented level of openness in Government” and instructed the Director of the Office of Management and Budget (OMB) to prepare a directive that would serve “to ensure the public trust and establish a system of transparency, public participation and collaboration” throughout the Federal Government.
On December 8, 2009, OMB Director Peter Orszag issued the Administration’s Open Government Directive, which required agencies to take a number of steps to advance the principles of transparency, participation and collaboration, including preparation and publication of an Open Government Plan by April 7, 2010.
The Department of Energy was one of 29 agencies that posted their Open Government Plans online, and OSTI’s contributions appeared throughout the 30-page DOE document.
Data.gov includes more than 250,000 datasets, up from 47 made available at launch. The impact of the OGI is not confined to the United States. At present, six nations outside the U.S. are also developing open repositories of government data.
To date, the site has received 97.6 million hits, and following the Obama Administration’s lead, governments and institutions of all sizes are unlocking the value of data for their constituents. . . . From these datasets, citizens have developed hundreds of applications that help parents keep their children safe, let travelers find the fastest route to their destinations, and inform home buyers about the safety of their new neighborhood.
In the area of semantic Web innovation, a proposal is also in the works with Rensselaer Polytechnic Institute to provide a “new encoding of datasets converted from CSV (and other formats) to RDF.”
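The CSV-to-RDF encoding that the Rensselaer proposal describes can be sketched in a few lines. The Python below is a minimal, hypothetical illustration: the dataset, column names, and `example.org` namespace are all invented for the example, and real converters handle datatypes, URI minting, and shared vocabularies far more carefully.

```python
import csv
import io

# Hypothetical sample rows standing in for a government dataset
# (illustrative only; not an actual data.gov file).
CSV_DATA = """agency,dataset_id,records
Department of Energy,doe-001,1500
Environmental Protection Agency,epa-042,820
"""

BASE = "http://example.org/data/"  # placeholder namespace


def csv_to_ntriples(text):
    """Convert CSV rows to RDF triples in N-Triples syntax.

    Each row becomes a subject URI minted from its dataset_id;
    every other column becomes a predicate/literal pair.
    """
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = "<%s%s>" % (BASE, row["dataset_id"])
        for column, value in row.items():
            if column == "dataset_id":
                continue
            predicate = "<%svocab#%s>" % (BASE, column)
            triples.append('%s %s "%s" .' % (subject, predicate, value))
    return "\n".join(triples)


print(csv_to_ntriples(CSV_DATA))
```

Once tabular rows are recast as triples like these, records from different agencies can be linked by shared URIs rather than matched by hand, which is the point of moving from CSV to RDF.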
The message from the Obama Administration is that the OGI signals a sea change for government information that will:
- Spawn a global movement to democratize access
- Enable global linking of data
- Foster innovation and transparency via the creation of “community developed” applications
Who is Paying Attention?
This sweeping initiative presents an enticing opportunity for the technology community. The Gov 2.0 Expo crowd is already descending on Washington for a meeting this week. Referred to as “THE IT event for 21st Century Government” by UMB TechWeb and O’Reilly Conferences (the organizers), the Gov 2.0 Expo will include keynotes from Sir Tim Berners-Lee, Danah Boyd, Dave Girouard, Tim O’Reilly, and others. The premise, question, and objective the meeting proposes to address:
The rise of Government 2.0 signals the emergence of IT innovation and the Web as a platform for fostering efficiencies within government and citizen participation. How can we harness these innovations to decrease waste and increase productivity? Gov 2.0 Expo brings stakeholders together to explore transformative technologies and discover new solutions.
Sunlight Labs, an extension of the Sunlight Foundation and a member of the data.gov community (featured previously on the Scholarly Kitchen), will announce the winners of its “Design for America” contest during the event. Sunlight developed the contest to inspire the design community to create and share applications using data.gov resources.
Nancy Scola, a NY-based writer with the Personal Democracy Forum, has followed the site since its launch last February. In “The New Data.gov Sells the Idea of Gov Data,” Scola notes some interesting differences in the way that the project is being presented today compared to 2009.
[L]ast February, Data.gov had data itself front and center. . . . The new version of the flagship site of the Obama Administration’s open government push seems to have an increased interest in selling the very concept of open government data. [T]he Obama White House and CIO Vivek Kundra have a lot riding on Data.gov. There seems to be a renewed acknowledgment in the new site that the vast majority of us have a very tough time wrapping our minds around the import of raw data sets.
Will It Work?
The quasi-evangelical enthusiasm coming from fans of the program tends to steer the conversation toward future opportunities and away from present-day challenges. Stripping out the rhetoric, what data.gov and its international counterparts deliver are profoundly complex sets of expert research data via API.
APIs and data are only part of a larger equation.
While grandma flips through photo albums on her sleek iPad, government agencies (and most corporations) process mission-critical transactions on cumbersome web-based front ends that function by tricking mainframes into thinking that they are connected to CRT terminals. These systems are written in computer languages like Assembler and COBOL, and cost a fortune to maintain. . . . [OGI] provides entrepreneurs with the data and with the APIs they need to solve problems themselves. They don’t need to wait for the government to modernize its legacy systems; they can simply build their own apps.
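The point about routing around legacy systems is easy to make concrete. The sketch below is hypothetical: the payload shape and field names are invented, standing in for whatever JSON an open-data API might return, but the application logic, a few lines of grouping and averaging, is all an outside developer needs once the data itself is exposed.

```python
import json

# Hypothetical JSON payload, shaped like a response an open-data API
# might return; the dataset and field names are illustrative, not
# data.gov's actual schema.
API_RESPONSE = json.dumps({
    "dataset": "inspection-results",
    "rows": [
        {"neighborhood": "Navy Yard", "score": 92},
        {"neighborhood": "Navy Yard", "score": 88},
        {"neighborhood": "Shaw", "score": 75},
    ],
})


def average_by_neighborhood(payload):
    """Group rows by neighborhood and average their scores:
    the kind of lightweight app logic a citizen developer can build
    on published data without touching the agency's back office."""
    totals = {}
    for row in json.loads(payload)["rows"]:
        bucket = totals.setdefault(row["neighborhood"], [0, 0])
        bucket[0] += row["score"]
        bucket[1] += 1
    return {name: s / n for name, (s, n) in totals.items()}


print(average_by_neighborhood(API_RESPONSE))
```

Nothing here needs to know whether the data originated in a COBOL batch job or a modern service; the API boundary is the whole modernization story from the app builder's side.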
A post on NextGov says “the Obama administration still has its work cut out for it” and goes on to discuss potential weaknesses and areas for improvement, noting that the academic research sector can help:
[T]he information portal now needs to focus on data context and integrity to achieve true transparency. . . . Data.gov must do a better job of disclosing the methodology agencies and the White House use to collect and process the underlying information. . . . Academic research has well-established protocols and expectations for how data should be revealed in order to permit others to replicate reported results.
Even professionals face challenges.
In the first installment of a Guardian series, “Making things with data.gov.uk – Part 1,” staff developer Thorpe presents a play-by-play of what it takes to create an application from data in the beta release of the UK government data platform, data.gov.uk. (The post also includes a useful summary of the UK counterpart to the Open Government Initiative and Government 2.0 in the United States, which, in the UK, is led by Sir Tim Berners-Lee and is described in his TED2009 talk focusing on the “next Web”.)
The obstacles that Thorpe describes indicate that there is a steep learning curve, with support needed, for building even a basic app:
One of the challenges of making official government data driven apps is that only a small percentage of the people already making things in this space are fluent in SPARQL, the query language used to retrieve data from RDF stores.
SPARQL 1.0 has no support for aggregate queries such as COUNT, but fortunately SPARQL 1.1, which many of the data.gov.uk stores support, does.
Not all records are created equal . . . not all of the schools have triples corresponding to these objects. For example at the time of writing 3 schools didn’t have a name.
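Thorpe’s two complaints, aggregates and missing triples, can be illustrated together. The SPARQL below is a made-up query against a placeholder vocabulary (the `sch:` namespace is invented, not data.gov.uk’s), and the small Python stand-in for a triple store shows why an OPTIONAL clause matters when some schools lack a name triple.

```python
# An illustrative SPARQL 1.1 query for the problem Thorpe describes:
# COUNT is an aggregate (available in 1.1, absent in 1.0), and
# OPTIONAL keeps schools without a name triple in the result.
# The sch: vocabulary below is a placeholder, not data.gov.uk's.
COUNT_QUERY = """
PREFIX sch: <http://example.org/schools#>
SELECT (COUNT(?school) AS ?total)
WHERE {
  ?school a sch:School .
  OPTIONAL { ?school sch:name ?name . }
}
"""

# A toy in-memory stand-in for the triple store: the third
# school has no name triple at all.
SCHOOLS = [
    {"id": "school-1", "name": "Hilltop Primary"},
    {"id": "school-2", "name": "Riverside Academy"},
    {"id": "school-3"},  # name triple missing
]


def count_schools(records, require_name):
    """Mimic the query's COUNT: with require_name=True (no OPTIONAL),
    nameless records silently drop out of the result."""
    if require_name:
        return sum(1 for r in records if "name" in r)
    return len(records)


print(count_schools(SCHOOLS, require_name=True))   # nameless school excluded
print(count_schools(SCHOOLS, require_name=False))  # OPTIONAL keeps all three
```

A developer who joins on the name pattern without OPTIONAL gets a silently shrunken count, which is exactly the kind of trap that makes the learning curve steep for newcomers to RDF stores.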
And, in response to Thorpe’s post:
[I]t’s interesting that the Guardian are offering an introduction to SPARQL before anyone has published a dedicated handbook on the subject. I know the data.gov site has had a bit of a bashing in newspaper comments, but that’s at least partly due to the lack of a guide like this one.
The End Game?
Professionals will find or create the means to build utilities from these emerging global repositories of government data that will:
- Enable comparisons of data that has historically been unavailable, siloed, and non-standardized
- Deliver tools that surface previously hidden relationships between data points and suggest relational meanings
- Help users develop new hypotheses and research entry points
Whether this translates to empowerment of the general public — or strictly adds to the use of charts and graphs in presentations and articles by researchers and in the media, which pass by the general citizenry — is an open question.
The Obama Administration has presented its lofty vision. However, the locus of control for productizing the data currently lies outside government. As the data.gov website states, innovation will be driven by the “community.” This means that significant responsibility rests with the technology professionals and businesses equipped to deliver tools, applications, visualizations, and services from the data.
As we have seen recently in the Web 2.0 space, businesses that begin with noncommercial “do no harm” doctrines may ultimately be won over by forces pulling them in other directions.
- Will the technology community remain fiercely committed to using open data to serve the public good?
- Will commercial interests predominate?
- Will the level of commitment and interest in the objectives of a global data program continue without institutional incentives?
- Does the Administration have its own plans for making this type of information digestible for the general public?
An articulated strategy for harnessing resources to continue the process will be a primary determinant of outcome. Otherwise, it is up to independent business and nonprofit interests to embrace and expand upon the mission.