If you’re reading this blog, you likely have an opinion about open access to journal articles and research results. The White House Office of Science and Technology Policy (OSTP) has put out two formal Requests for Information: one on the subject of “Public Access to Peer-Reviewed Scholarly Publications” and the other on “Public Access to Digital Data.”
While most of us enjoy the seemingly endless back and forth discussion online (or ranting and raving, as the case may be), this is a chance for all stakeholders to have a direct influence where it matters most. The White House is crafting requirements for recipients of federal research funding and the information received here will be crucial to setting policy.
There are two separate issues here: public access to journal articles from federally-funded research, and the tricky question of how to make the most of the raw data collected in those federally-funded experiments.
The OSTP previously called for a public consultation on the subject and is now extending that process further. Here they’re specifically looking for input from “the public, universities, nonprofit and for-profit publishers, libraries, and research scientists.” Eight specific questions are asked:
- Are there steps that agencies could take to grow existing and new markets related to the access and analysis of peer-reviewed publications that result from federally funded scientific research?
- What specific steps can be taken to protect the intellectual property interests of publishers, scientists, Federal agencies, and other stakeholders involved with the publication and dissemination of peer-reviewed scholarly publications resulting from federally funded scientific research?
- What are the pros and cons of centralized and decentralized approaches to managing public access to peer-reviewed scholarly publications that result from federally funded research in terms of interoperability, search, development of analytic tools, and other scientific and commercial opportunities?
- Are there models or new ideas for public-private partnerships that take advantage of existing publisher archives and encourage innovation in accessibility and interoperability, while ensuring long-term stewardship of the results of federally funded research?
- What steps can be taken by Federal agencies, publishers, and/or scholarly and professional societies to encourage interoperable search, discovery, and analysis capacity across disciplines and archives? What are the minimum core metadata for scholarly publications that must be made available to the public to allow such capabilities?
- How can Federal agencies that fund science maximize the benefit of public access policies to U.S. taxpayers, and their investment in the peer-reviewed literature, while minimizing burden and costs for stakeholders, including awardee institutions, scientists, publishers, Federal agencies, and libraries?
- Besides scholarly journal articles, should other types of peer-reviewed publications resulting from federally funded research, such as book chapters and conference proceedings, be covered by these public access policies?
- What is the appropriate embargo period after publication before the public is granted free access to the full content of peer-reviewed scholarly publications resulting from federally funded research?
That these questions are being asked shows that the OSTP has a deep understanding of the issues involved and is looking for a nuanced solution that provides maximum benefit from research results without causing irreparable economic harm or destroying the functional system of filtering, verification, and distribution of those results.
This isn’t just a simple decision about whether papers should be free. Questions 1 and 4 spell out an agenda that seeks to create new markets and new products from the scholarly literature.
Question 3 appears directed at the seemingly nonsensical approach of PubMed Central, which for unknown reasons requires all papers to be held in one physical repository. This seems an archaic absurdity in an interconnected age. Google seems to work just fine in indexing material spread throughout the world; why must PubMed Central have everything in one box? Publishers would be much more supportive of PubMed Central if the traffic for viewing free articles went to the journals themselves rather than to a central repository.
That question, combined with question 5, points to an interest in creating a science-wide index, one that reaches beyond the bio-medical range of PubMed. The utility of such an index is obvious in an increasingly interdisciplinary world.
Question 8 is the one that likely has the most immediate importance for scholarly publishers. The NIH requires that papers from funded research be made freely available after a 12-month embargo, but it’s unclear if the same time scale that seems to work for medicine and life sciences would be appropriate for the physical or social sciences. Does the usage and citation pattern of articles in all fields match that seen in medicine and biology?
Access to Data
Providing public access to the data collected in federally-funded experiments is a great idea, but it’s an incredibly complex undertaking. There are costs in preparing the data for the use of others, costs in storing the data, and costs in providing the data.
Beyond finding a way to pay for all those extra expenses, you have the near-infinite variety of types of data collected. Data archiving and availability will require a clear set of standards, but is it realistic to consider the development of such standards for such wildly varied material?
And there’s still a deeper underlying question — for some types of data, is it worth the bother? Many experiments are designed to ask a very specific question under very specific conditions — it’s unclear if the data generated could ever be re-used for a different purpose. As technology improves, we will reach (if we haven’t already reached) a point where it’s cheaper to recreate some data than it is to store it.
The request for information here asks 13 very thoughtful questions for moving the process forward. The NIH and NSF already require “data management plans” from funded researchers, but these requirements could use considerable further refinement.
Even ignoring the potential benefits of data sharing, researchers stand to benefit greatly if funding agencies are willing to put money toward the organization and archiving of data. Having your own laboratory’s data in a clear and permanent system will save countless hours of work as one student graduates and moves on, taking with them precious knowledge that often needs to be recreated from scratch.
For libraries and publishers, there’s great opportunity here. Librarians are experts in the organization and retrieval of information. There’s an open door here for the creation of an entirely new role for the institutional library as the data archive and the librarian as the archivist.
For publishers, our authors and readers have a crying need for systems to manage and store the vast hoards of data used to generate research articles. The company that masters the technologies involved and offers services and solutions in this field will prosper by filling a tremendously valuable niche. It’s not clear if publishers are the right people to fill this role, but our expertise in content management seems a good fit.
These requests for information are of great importance for the future of academia and scholarly publishing. If you’re a traditionalist who sees open access as the downfall of civilization, an advocate who thinks information must be free, or someone who falls somewhere in between, this is your chance to create the future you’re seeking. If you’re a researcher who doesn’t want to be burdened with the storage and tracking of minutiae, a completist who wants his every action recorded for posterity or someone somewhere in between, this is your chance to determine the data policy of the future.
For publishers and librarians, the shapes of your industries and careers are on the line here. Send in your responses by January 2, 2012, or let others decide your fate.