On February 22, 2013, the White House Office of Science and Technology Policy (OSTP) released a memorandum on “Increasing Access to the Results of Federally Funded Scientific Research.” Today marks the first release of a funding agency’s plans to fulfill the requirements of that memo, as the Department of Energy has now announced their Public Access Plan.
There is much to digest here, and I suspect we’ll see a lot of detailed analysis over the next few weeks, but some initial thoughts below. I do serve on the Interim Board of Directors for CHORUS (the Clearinghouse for the Open Research of the United States), so consider this a conflict of interest statement. To be absolutely clear though, as the question has come up in recent attributions of quotes from The Scholarly Kitchen, all opinions voiced in this blog post are solely my personal thoughts and not the official position of CHORUS, the SSP, other Scholarly Kitchen authors or my employer.
In general, I think this is a strong plan, and an excellent approach to exploring the new frontier of public access for research articles. Its strength lies in its flexibility, as the DOE has crafted a plan that allows for multiple routes for compliance, including directly depositing articles in their centralized repository (PAGES, the Public Access Gateway for Energy and Science) as well as having PAGES serve as a centralized source of metadata to point outwards toward articles made available through a variety of repositories and in the journals themselves via CHORUS.
What this does is allow several different methods to prove their value and efficiency to the community. I suspect that over time we will see practices consolidate for DOE funded researchers as they settle on the preferred method for compliance that best meets their needs.
All that said, the plan, as articulated, remains somewhat ambiguous, and many statements in the plan will need further details or clarification. Some specific excerpts and a quick reaction to each:
All researchers receiving DOE funding will be required to submit metadata and a link to the full-text accepted manuscript (or the full text itself) to OSTI [Office of Scientific and Technical Information]. Publishers who participate in DOE’s public access activity will submit article metadata and links to OSTI.
It’s one thing to ask authors to submit a copy of a paper (or a link); it’s quite another to ask them to write standards-compliant article metadata. Will the DOE be providing a tool to do this for authors or will they be left to figure it out on their own?
Saving researchers time and effort is one area where CHORUS will likely shine. I spent a week recently at a biology society’s annual meeting, talking to attendees about public access policies and potential solutions. Everyone that I spoke with had a vague sense that these policies were either in place or coming, but very few knew any details. They all had a story about being confused when depositing articles, having difficulties doing so, and, in several cases, being knowingly delinquent. The real strength of CHORUS is in building compliance into the already existing publication process, essentially automating things for the researcher. For a policy like the DOE’s, an author who publishes in a CHORUS member journal is compliant with little to no effort required. This is a valuable service that publishers can provide for authors, and making authors’ lives easier should be seen as a competitive advantage for a journal.
Publishers retain their rights under copyright to their VoR.
This statement appears twice in the DOE plan and suggests that there remains some confusion regarding how copyright works. This statement implies that there are different copyright holders for different versions of the same article, rather than the author retaining copyright to all slightly varied versions of their own writing. If you copyedit and typeset this blog post, maybe change a few words, you can’t claim a new and separate copyright over it. It’s still mine.
DOE’s Office of Scientific and Technical Information (OSTI) will maintain a repository of accepted manuscripts and can make individual, unclassified and otherwise unrestricted manuscripts publicly accessible if there is no other publicly available version.
This suggests a commitment to a distributed approach, and that the DOE will only offer public access to articles themselves as a last resort, if no other version is available in a repository or the journal.
PAGES will provide metadata and abstracts for such publications in a way that is open, readable, and available for bulk download.
Something that’s missing from the DOE’s plan appears to be any sort of mechanism for text- and data-mining (TDM) of articles. Bulk download of metadata and abstracts is a good thing, to be certain, but there is increasing demand for full text TDM functionality across scholarly publishing.
Publishers, both individually and as a group through efforts like CrossRef’s TDM service, have been putting in an enormous amount of work to clarify usage terms and create technological interfaces to better enable TDM. This was assumed to be a part of the OSTP memo requirements, and I’d like to see the DOE take better advantage of the opportunities offered here.
In all cases, OSTI will maintain a dark archive of manuscripts to be used in the event links become broken or full-text access is otherwise interrupted or discontinued.
It’s unclear exactly how this archive will work: whether articles need to be deposited or will be harvested by the DOE, what the terms are for bringing an article to light, and whether the process is reversible.
To integrate data management planning into the overall research plan, the Department will ensure that all research proposals selected for funding include a Data Management Plan (DMP).
Separate from the plan for public access to research articles, the DOE has also released their requirements for the data availability portion of the OSTP memo. I’m a little surprised by the path chosen–this seems very similar to the NSF’s existing policy, basically requiring grant applicants to submit a DMP which will be reviewed and approved by the agency. This approach strikes me as something of a punt, leaving responsibility in the hands of researchers rather than setting up standards and methodologies for researchers to follow. Data availability is a complex issue, and having some rules to follow might make things easier for researchers.
Starting October 1, 2014, the Department will begin to include requirements for the submission of accepted manuscripts and publication metadata in award agreements.
From this it appears that any grants issued after October 1, 2014 will be subject to this policy, and it does not appear to be retroactively applied to existing grants. That may cause some compliance problems, as researchers, publishers and repositories will need to know the date that funding was awarded in order to know whether deposit and public access is required. It also means that the first required public access articles will appear no earlier than late 2015, more likely early 2016 (with the exclusion of Gold OA articles freed up immediately or any publishers or authors who choose to get a jump on the policy early).
Given that timescale though, it seems like the actual implementation of this policy will have some space to evolve and for the details to be worked out. As noted in the policy, the DOE has set up several channels for feedback and will be running a user focus group to continuously revise and improve its public access model.
All in all, a great start for an important policy, a significant achievement for the White House, and a major shift in the scholarly publishing landscape. We’ve all been planning for this moment for nearly a year and a half, so it’s exciting to finally be able to dig in and get to work.
53 Thoughts on "US Department of Energy Announces Public Access Plan"
I agree that there are more unanswered questions and ambiguities than answers. More information is available here:
The author, Jeff Salmon, is the DOE “owner” of OSTI and PAGES. There are however some interesting elaborations. Note for example that Jeff invokes the “best available version” principle which has been part of the PAGES design since it was first built in May 2013. Best available seems to mean that PAGES will not link to Accepted Manuscripts via CHORUS, only to Versions of Record.
But then it is unlikely that PAGES will link to SHARE AM’s, as Jeff suggests, because SHARE is not planning on systematically collecting funding data. It may get some from its repositories but it will be on a catch-as-catch-can basis. See http://www.arl.org/storage/documents/publications/SHARE-notification-service-architectural-overview-14apr2014.pdf.
In short it is going to be a complex process for PAGES to decide how to link to each article, a process that has yet to be defined.
Is DOE committing to a gold open access policy for journals? Doesn’t seem so. In that case, CHORUS becomes less important.
No, not gold open access at all. This is a “public access” policy, the type of which CHORUS was designed to help fulfill. Not sure what you’re getting at here, as CHORUS is a key part of fulfilling this policy for funded authors.
In fact it is part of the PAGES design to link immediately to all gold OA articles that are identified as appropriately based on DOE funding. In that sense there is indeed a gold aspect to the DOE policy. This would include APC articles from both hybrid and pure OA journals, as well as from subsidized journal articles. Mind you this is another significant complexity that has really not been addressed in setting up the PAGES sorting procedures. Moreover CHORUS might well provide the links to some of these articles. CHORUS is not confined to subscription articles, is it? That is not my understanding.
You are correct, CHORUS can work equally well with articles of all various access types. But to be clear, the DOE has not announced any requirements that funded authors must pay APCs for immediate gold OA, though those doing so would be in compliance with the DOE’s policy.
For the record, PAGES was operational just 90 days after the Feb 2013 OSTP memo was issued. It was built by then OSTI director Walt Warnick and his team, who specialized in rapid development. That it has taken 18 days short of 18 months for the first public access system to go public is due to the onerous federal review system, not development time.
In this context it is worth mentioning that CHORUS has just gone into production and is accepting memberships. See http://chorusaccess.org/2014/07/31/chorus-goes-live-into-production/. Public Access is finally alive.
Richard Van Noorden provides a balanced summary of the policy and some of the expressed concerns.
It’s interesting that the main complaints about the policy seem to be centered around the notion of wishing that it was something that it’s not, that the policy isn’t “open access” as defined by requiring unfettered reuse and redistribution of articles. But if you actually read the OSTP memo, the phrase “open access” doesn’t appear in it a single time, which, I assume, was deliberate. The DOE policy does seem lacking in addressing TDM issues, but these sorts of analyses can be accomplished without requiring the rights to re-sell and redistribute content.
I can’t tell if this set of talking points represents either a confusion about OSTP policy, which never once called for abdication of copyright and intellectual property, or if it’s an attempt to publicly drive policy in that direction.
The talking points don’t centre around the wish that the policy was ‘open access’ in the first place, so much as the DOE failing to fully address the White House/OSTP call that agencies “maximize the potential for…creative reuse to enhance value to all stakeholders”.
This extremely interesting phrase – never fully defined – appeared to hint to some advocates (who, admittedly, are driving for full OA) that the White House/OSTP wanted agencies to expand beyond even the NIH’s current policy on public access.
I suspect that if the White House approved the policy, then it meets their requirements. As you note, the phrase “maximize the potential” is indeed ambiguous. Assuming it meant a potentially illegal seizure of author rights and intellectual property is something of an overstep. As far as I’ve heard, no guidance requiring particular licensing terms for articles has been officially issued by the White House.
Again, it’s unclear to me whether this was a misunderstanding of OSTP policy or is a bit of gamesmanship meant to drive things further.
Question: do people not realize that papers deposited in PubMed Central retain the copyright terms under which they were originally published?
But aren’t they also under the federal use license? In principle this license can allow uses not otherwise allowed by the original copyright terms, can’t it? Isn’t making them public just such a case, when the publisher does not do so?
Can you point to where this is stated on PMC or the papers therein? I find only this:
I have no idea whether PMC discloses it, but the basic legal principle as I understand it from the lawyers is that the Feds claim a federal use license on works based on research they fund. A license by definition modifies the original terms of the copyright. The open questions are what rights are conveyed to the US Government and how much can they pass along to the public?
Interestingly, while PMC has clear statutory authority, DOE does not. But I doubt anyone will challenge them because they have the gold so can make the rules.
Statutory authority aside, I think NIH/PMC has the right interpretation: the government license can provide free access to the public, but does not then give the public unfettered use of copyrighted works. And that’s how I read DOE’s statement about publishers retaining their copyrights to the VoR, namely, “Don’t read the government’s license to abrogate publishers’ copyright.” Holdren’s memo isn’t a subtle play, where open access masquerades as public access. It’s straightforward and acknowledges all the stakeholders and what they bring to the table.
For what it’s worth, PMC states the following:
“All of the material available from the PMC site is provided by the respective publishers or authors. Almost all of it is protected by U.S. and/or foreign copyright laws, even though PMC provides free access to it. (See Public Domain Material below, for one exception.) The respective copyright holders retain rights for reproduction, redistribution and reuse. Users of PMC are directly and solely responsible for compliance with copyright restrictions and are expected to adhere to the terms and conditions defined by the copyright holder. Transmission, reproduction, or reuse of protected material, beyond that allowed by the fair use principles of the copyright laws, requires the written permission of the copyright owners. U.S. fair use guidelines are available from the Copyright Office at the Library of Congress.”
Just so. The government can’t legitimately say that private purposes are actually government purposes. Rather than being intellectually dishonest, it’s better to be transparent and admit that the public must rely on fair use to guard them against liability for infringement.
What about bibliometric analysis for the purpose of program evaluation or exploration? For example I am interested in mapping the relations between DOE’s basic research programs and their applied programs, to see the transitioning (or lack thereof). Then there is impact analysis for different programs, something that is already commonly done for universities and countries. Would this sort of use not fall within federal use?
Why would you need any sort of copyright to do that sort of analysis? Are you going to be redistributing and/or reselling the articles themselves? If not, I don’t see that copyright is relevant here.
Are you talking about text or data mining? The UK just changed their copyright law to allow for noncommercial mining without a license. I don’t think the US has this same exception. But like David says, maybe they don’t need one?
Interesting question, David. It does require bulk downloading, which I thought was a sticky wicket. Beyond that if one were to make their data available that could include all the articles, which might be a form of redistribution.
But the basic point is that until federal use is defined I do not think you can claim that it is not a potential problem. That seems like an argument of the form “I can’t think of a case so there must not be one.” Moreover, if federal use is different from fair use then PMC cannot change that just by saying it. It seems like an open legal issue to me, simply because federal use is not yet defined. I have been told that federal use is an unwritten rule and that makes me nervous.
My opinion (and I am not a lawyer) is that if you have legal access to the articles, then you can download them and read them or analyze them as you see fit. It only becomes problematic for copyright issues if you want to redistribute or re-sell them. It’s the difference between Google, which does bulk downloading of every single word in every single journal article and uses that to create search engine results, and someone downloading a copyrighted book and then trying to sell copies of that book to others.
If one needs redistribution rights, then they must be negotiated and obtained legally.
Van Noorden is wrong about repositories. There is no requirement for deposit of papers. This is just the “best available version” PAGES concept. First choice is linking to a publisher’s site for the VoR, if available. Second choice is linking to a repository AM, if available but not required. Third choice is linking to DOE’s own version.
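As a rough illustration only, the three-tier preference described in this comment could be sketched as follows. The record fields, function name, and example URLs are all hypothetical and say nothing about how PAGES is actually built; the real sorting procedure has not been published.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ArticleRecord:
    """Hypothetical record of the publicly accessible copies known for one article."""
    vor_url: Optional[str] = None            # Version of Record on the publisher's site
    repository_am_url: Optional[str] = None  # accepted manuscript in a repository
    doe_am_url: Optional[str] = None         # accepted manuscript held by DOE


def best_available_link(record: ArticleRecord) -> Optional[Tuple[str, str]]:
    """Return (label, url) for the first available copy, in order of preference."""
    candidates = (
        ("publisher VoR", record.vor_url),
        ("repository AM", record.repository_am_url),
        ("DOE AM", record.doe_am_url),
    )
    for label, url in candidates:
        if url:
            return label, url
    return None  # no public copy known yet


# Example: no public VoR, but a repository copy and a DOE copy both exist.
record = ArticleRecord(
    repository_am_url="https://repository.example.org/am/123",
    doe_am_url="https://doe.example.org/am/123",
)
print(best_available_link(record))
```

In this sketch the DOE-held copy is only surfaced when neither of the other two links exists, which matches the "last resort" reading of the plan.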
Note that DOE will collect a copy of every article for its own, partially dark, collection, so going outside is actually an added expense. If this becomes too expensive they may not do it.
How DOE expects to collect a copy of every article is unstated in the policy. Also not spelled out, but hopefully acceptable is public availability of the author’s manuscript in the journal (essentially the journal serving in the “repository” role).
DOE is using their STIP program to collect all AMs for its archive. STIP already collects all research reports. Note that the majority of papers will come from the DOE national labs, which are DOE contractors. Interesting point about the journal as repository. That might work.
Can you clarify what STIP is? Does this mean that authors have to take the additional step of depositing their manuscripts with the DOE, regardless of whether they are available in the journal or in a repository? Does the DOE harvest this material themselves? What are their compliance rates like?
STIP is a procedural program distributed throughout the DOE research complex. Many offices have STIP responsibilities. See http://www.osti.gov/stip/. There is a deep procedural problem here. DOE needs a copy of every AM for the dark archive. In principle it could first get those AM’s that are available on the web then use STIP to get the rest. But STIP has no way of knowing which AM’s those are so it is set to collect them all. DOE thus has no need to collect duplicate AM’s from the web.
In fact if it links to VoR’s this raises a big de-duping problem, because OSTI does not want to provide access to multiple versions, unless it now does. Clearly the policy is in flux.
David, you are wrong. I went through around 5 or 6 email rounds of questioning with the DOE’s Brian Hitson over the weekend, much of it focusing on just this question. Hitson told me: “In 100% of cases, DOE authors will be required to send us accepted manuscript metadata and a link to the accepted manuscript on their institutional repository”.
Now, I agree that the DOE is not making clear exactly the timetable by which it expects its authors to upload their manuscripts to institutional repositories (and this is a significant ambiguity), but the stated intent is that the DOE does want there to be *two* versions of each paper out there on the internet: an accepted manuscript in a repository *and* a VoR. The fact that its portal will ‘link’ only to the best version is a secondary point. Google will find all versions, no?
Thanks Richard–my reading is strictly from the words in the policy. Again, the question of what qualifies as a “repository” is unanswered, and what an author is expected to do if their institution has no such repository. If there are two versions of the paper out there, a publicly available AM on the publisher’s site, and a subscription access VoR, would that fulfill the requirements?
Thanks Richard. If true then this is a new policy which goes well beyond the OSTP memo, which nowhere mandates repository deposit. It is one thing to say supply a repository link if there is one and quite another to say there must be one. Note by the way that in many cases there will not be a VoR. Plus in cases where the publisher makes the AM available there will be two AM’s online under a repository mandate. But Brian does not make DOE policy.
I’d hope that Brian – as representative contact for reporters – would be well-versed on DOE policy. Meanwhile the policy itself states: “All researchers receiving DOE funding will be required to submit metadata and a link to the full-text accepted manuscript (or the full text itself) to OSTI”. Top of page 5, http://www.energy.gov/sites/prod/files/2014/08/f18/DOE_Public_Access%20Plan_FINAL.pdf .
This can be read to mean they can submit the full text to OSTI, as opposed to submitting a link. Repositories are not mentioned. I suspect we have stumbled onto an unresolved policy issue, which is normal when new rules are first issued.
There is also page 6: “To ensure long-term preservation and access, all DOE-funded authors will be required to submit accepted manuscript metadata to OSTI along with a document or link to a publicly accessible, full text version of the accepted manuscript available on an institutional repository.”
So yes, authors can either submit a manuscript to OSTI, or submit a link to a repository-manuscript. And yes, it is not clear which is expected to be the dominant mode of delivery, or if authors will be ‘required’ to send a link to a manuscript in an institutional repository. That is why when I was reporting this out David, I detected these ambiguities and took especial care to ask the DOE’s elected press representative multiple times about them. Hence my statement from Brian Hitson above – which I assume would have been checked and approved through innumerable bureaucratic levels before it got back to me. Perhaps that’s not so. It’s the best information I have, as the policy itself is ambiguous.
Consider this, Richard. DOE claims that, unlike most funding agencies, it does not have to go through a rulemaking process in order to mandate its public access program. Supposedly this is because its standing order (241.1B) already covers accepted manuscripts (even though they have not been collected). I have my doubts about this, but in no case does 241.1B include a repository deposit mandate.
So if DOE wants to now issue a repository deposit mandate it should have to go through a rulemaking to do it.
Richard, did Brian Hitson say what an author is supposed to do if their institution, like mine, does not have an institutional repository?
That is the way I understand it in the “best available version” model. A link is preferred but not required. One also wonders if an author’s personal publications page will do for the link? In any case OSTI will probably have to collect the AM via the link and archive it. Otherwise they will face the configuration management problem of authors moving, URLs changing, links disappearing, etc. Ingest is a potentially expensive challenge, just ask PMC.
Indeed, there are two big acceptance issues here. What counts as an article and what counts as a link? If arXiv is allowed then they have to be sure the link is to the AM, not a preprint.
David, regarding your discussion of the data management plan concept, I was involved in its origin with the Interagency Working Group on Digital Data. The idea is that the scope and nature of data varies tremendously from discipline to discipline, and even study to study. Thus universal standards or practices, such as you seem to want, cannot be specified in advance. Is there something in particular you think might be done? It is certainly worth considering as we go forward.
That DOE has followed NSF’s lead is good news because up until now they had no data policy and did not want one. It is also good to have consistency from agency to agency.
AIP’s Steven Corneliussen has a nice summary of some of the more prominent press coverage of the DOE plan, see here:
SPARC’s Heather Joseph has been widely quoted complaining that there is no central system for text mining. But I wonder if DOE’s proposed dark archive, containing every accepted manuscript, can really be kept dark? A little litigation might pry it loose as there are no classified documents in it.
What litigation do you foresee being successful? My first thought was FOIA, but I don’t see how FOIA can be used to annihilate third-party copyrights where the government happens to possess a copy of a work. (If that’s the case, forget Amazon and the local library: I’m going to FOIA the entire catalog of works at the Library of Congress!) The dark archive is an untapped resource, to be sure. But I wonder if the public would get to dictate the terms of access, or whether the DOE’s discretion is defendable in court?
David, you ask about a tool for submitting metadata? For grantees this will probably be OSTI’s Elink system, which is how research reports are submitted. See https://www.osti.gov/elink/241-3.jsp
It already has a document type category for journal articles. However, I do not see how a link is submitted, as opposed to an article.
The DOE National Labs, which are contractors, have internal processes for collecting and batching submissions.