This is a follow-up to David Crotty’s recent Scholarly Kitchen post on the two OSTP “Requests for Information” (RFI) regarding public access to scientific publications and data. As a refresher, the Publications RFI is at http://federalregister.gov/a/2011-28623 and the Data RFI is at http://federalregister.gov/a/2011-28621. Responses are due January 2.
As is typical of these sorts of requests, the RFIs are short but the issues are far-reaching. However, it is important to understand the context of these requests if one is to formulate realistic responses. This means understanding who is asking for this information, and what their needs are. Given the labyrinthine structure of federal science, this is not as simple as one might hope. Moreover, these are not new initiatives, so a certain amount of groundwork has already been laid; responses should build on this background if possible.
To begin with, two different interagency groups will be looking at the responses. The scholarly publications issues are under review by the National Science and Technology Council’s (NSTC) Task Force on Public Access to Scholarly Publications. This task force is relatively new and has not produced a report, but it has been deliberating for about a year.
The data management and access issues are being handled by the NSTC’s Interagency Working Group on Digital Data (IWGDD). The IWGDD has been around for some time and issued a report in January 2009, entitled “Harnessing the Power of Digital Data.” People wishing to respond to the data RFI would do well to look at that report. (I was involved in preparing it, although I am no longer involved with the IWGDD.)
Each of these two groups has members from most of the more than 20 federal science funding organizations. By and large, these folks are not empire builders, nor utopians; rather they are each looking at the practical aspects of what their agency can do. This is important; these are hard-headed realists. General theories of public access are of little use to them in this task. It helps if you know what the various science organizations are already doing in these areas.
Note, too, that the NSTC itself is composed primarily of the heads of all the federal departments and agencies that do science. At this point there are very few government-wide policies in these two public access domains, and I personally doubt that there will be any in the near future, but that’s just me. The agencies tend to be highly independent, and each is basically thinking about what it can do itself. Thus, this is primarily an interagency activity, not simply a White House policy enquiry.
Then there are the issues. On the publications side, the existing standard is the public access program of the National Institutes of Health, which funds about half of all federal basic research. Typically, authors submit copies of their accepted journal papers and these are made publicly available in NIH’s central repository after an embargo of up to 12 months. It’s clear from the RFI questions that the task force is considering options that are very different from this central, federal model. For example, it asks these questions:
(3) What are the pros and cons of centralized and decentralized approaches to managing public access to peer-reviewed scholarly publications that result from federally funded research in terms of interoperability, search, development of analytic tools, and other scientific and commercial opportunities?
(4) Are there models or new ideas for public-private partnerships that take advantage of existing publisher archives and encourage innovation in accessibility and interoperability, while ensuring long-term stewardship of the results of federally funded research?
In fact, the task force begins with a surprising question, namely that of developing new markets. It asks:
(1) Are there steps that agencies could take to grow existing and new markets related to the access and analysis of peer-reviewed publications that result from federally funded scientific research?
What they are looking for seems to be ways in which individual federal science organizations can work with individual publishers, not a grand federal scheme. This is a time for experimentation, not standardization, in my view anyway.
On the data management side, the big issue is cost and funding. Research is generating tremendous quantities of data, from a growing host of satellites, telescopes, sensors, analyzers, supercomputers, etc. Individual instrument repositories can cost millions of dollars. How much of this data should be maintained, for how long, and by whom? How should it be integrated, and by whom? Several agencies, including NIH and NSF, have made a start on these issues, so the IWGDD questions are somewhat more developed.
But the data issue is still ultimately one of public access, not just preservation, and access is potentially a form of scholarly publishing. Thus we might be talking about some very large new markets, including repository services, analytical capabilities, and portals. As the data RFI asks:
(6) How could funding mechanisms be improved to better address the real costs of preserving and making digital data accessible?
Clearly, there is a lot for publishers to think about in these two RFIs, as far as new approaches and markets are concerned. The important thing is to get realistic ideas into the policy pipeline.