This week Congress unveiled HR 3547, the “Consolidated Appropriations Act of 2014.” This bill is essentially the plan to fund the US government for the current fiscal year, and brings with it good news for the research community, as it does away with the automatic cuts made during sequestration. This will provide welcome relief, if not the increases that many had sought. HR 3547 also contains a surprise for scholarly publishers: new legislation mandating public access to research publications.
Section 527, on page 1,020 of the bill, requires federal agencies under the jurisdiction of the Labor, Health and Human Services, Education, and Related Agencies (LHHS) Committee to meet the following requirements:
SEC. 527. Each Federal agency, or in the case of an agency with multiple bureaus, each bureau (or operating division) funded under this Act that has research and development expenditures in excess of $100,000,000 per year shall develop a Federal research public access policy that provides for—
1) the submission to the agency, agency bureau, or designated entity acting on behalf of the agency, a machine-readable version of the author’s final peer-reviewed manuscripts that have been accepted for publication in peer-reviewed journals describing research supported, in whole or in part, from funding by the Federal Government;
2) free online public access to such final peer-reviewed manuscripts or published versions not later than 12 months after the official date of publication; and
3) compliance with all relevant copyright laws.
The bill, I’m told, is likely to be passed as written. This section pertains only to LHHS Agencies, a list of which can be found here.
It seems a bit odd for Congress to be moving ahead here, undercutting the White House’s own efforts in this area before they have a chance to take effect. With a few exceptions, though, the concept here is pretty similar to that put forth by the OSTP. It’s unclear that any plans from the agencies beyond what they’ve already developed for OSTP will be necessary.
While I am not a lawyer, and any interpretations should be taken with that caveat, a first glance at the language used raises a few questions. First, the legislation requires the deposit of the author’s final accepted manuscript. Many journals currently deposit the published version of record in PubMed Central, but it’s unclear if this improved version of the article would be acceptable for agency requirements.
The language also seems geared to encouraging solutions like SHARE and CHORUS. Agencies can designate entities to act on their behalf as far as repositories go. The requirement for public access does not specify that it must be the archived version of the article that is made publicly available, opening up the possibility of journals providing this service.
The flat 12 month embargo will likely cause the most consternation, particularly because it does away with the OSTP’s carefully considered plans to allow for evidence-based rational policies to be implemented in this area. But given that the vast majority of LHHS funded research is in areas of health and medical sciences, the end result is likely much the same as it would have been under the OSTP plan. With the relatively short half life of medical research papers and the precedent set by the NIH, it’s unlikely that these articles were going to see an embargo longer than 12 months anyway.
Still, the embargo is problematic in that there remains no explanation or rationale given to support this particular length. It also sets further precedent, much like the NIH has done, that may influence embargoes in other areas. The rapid turnover in usage patterns and the high levels of funding for medical research create conditions different than those seen for other fields, so what’s appropriate for medicine may not be appropriate for history or math.
The legislation does create a new level of complexity, a new set of rules that are slightly different from other rules, and to which researchers must pay attention. As more and more funding agencies and institutions set idiosyncratic policies, researchers hoping to remain in compliance will be required to negotiate complex sets of rules from multiple sources, potentially requiring the deposit of multiple versions of a paper in multiple repositories, each with a different embargo period required.
Any additional burden placed on the researcher, time and effort spent away from doing actual research, is unwelcome. The added work, combined with the complexity of the rules, may result in poor levels of compliance.
Much of PubMed Central’s success can be attributed to publisher willingness to take on these responsibilities for the researcher. Author manuscripts appear to make up less than 20% of what’s posted in PubMed Central. Feedback from authors indicates that saving this time and effort is a valuable service that journals provide when they automatically deposit on an author’s behalf. If other agencies hope to emulate the NIH’s success, then automating the process through publisher partnerships is essential for driving compliance.
Which brings us back to CHORUS. It’s one thing to set up an automated deposit system for the largest funding agency in the country; it’s quite another to set up individual systems for every funding agency and institution on earth. CHORUS offers the efficiency of a single, unified system, but one with enough flexibility to meet different agency needs.
By integrating compliance as an automated part of the regular publication process, CHORUS saves everyone time, effort and money. Authors need do nothing more than indicate their funding agency when they submit a paper–nothing further is needed on their part to ensure compliance. Funding agencies save the costs of building and maintaining their own systems (or the costs of outsourcing to another agency) and see a higher level of compliance from grantees through automation. Publishers gain a new valuable service they can offer their customers and get to retain traffic to papers in their journals. The benefits offered to all seem increasingly obvious.
UPDATE: As of last night, the bill passed in the House, on to the Senate.
UPDATE: The Senate has passed it as well, now just awaiting the President’s signature and it’s a law.
31 Thoughts on "New US Public Access Legislation Included in Government Funding Bill"
It is not obvious that the mandated approach of this bill supports CHORUS. If the agency is required to ingest the author manuscript (AM) then the simplest thing to do is to make that AM accessible. Integrating with CHORUS would be a separate process and almost redundant. It seems to me that the value of CHORUS is for the agencies not to have to ingest the AM in the first place, thus saving money. But this mandate seems to take away that option.
It is also worth noting that CHORUS really does not exist at this point as a functioning system. In particular FundRef, which CHORUS is built on, is just getting off the ground and has a long way to go.
In any case this is a bad piece of legislation, setting an overly restrictive and potentially harmful precedent. It may even rule out the way PubMed Central presently operates. It is interesting however that no timeframe or deadline is specified. Legislation like this has a way of being ignored.
My read would be that articles can be submitted to a designated entity acting on behalf of the agency (e.g. PubMed Central, SHARE). Also, since CHORUS is designed to deposit articles in dark archives, I don’t see why the agency couldn’t designate or create such an archive themselves and use CHORUS as the automated mechanism for deposit.
And there’s more to the process of access, at least in a useable fashion, than just putting something up on the web. For example, the legislation requires that the copy be “machine readable” which will require particular standards be met. Note that PMC spends the majority of its multi-million dollar budget on the 20% of articles needing transformation into a useable form.
Also I’d argue that stating that CHORUS does not exist is something of a misnomer. There is a live functioning pilot program (http://chorusaccess.org/pilotservices/) and CHORUS is now shifting into production. By the time any of these agencies take action, things should be well along.
I imagine that what you describe can be made to work but it is pretty complex. The point is that requiring the delivery of the AM changes the game. CHORUS was designed to make that delivery unnecessary by providing access via the publisher’s website. In fact the machine readability requirement suggests that it is the agency copy that is to be made public, not the publisher’s copy. This requirement speaks to data mining. But we are now in the realm of conjecture.
I did not say that CHORUS does not exist. I said it does not yet exist as a functioning system because FundRef has yet to be widely used. In fact for the agencies under this mandate FundRef does not even have a breakdown of component funding entities, such as you link to. I agree that given time all this can be worked out. My concern was your use of the present tense.
It seems to me that the delivery to the agency (or its designated entity) is about archiving, and CHORUS does provide archiving services.
Delivery can just as well be about access as archiving. If you have a mandated archive why not just make it accessible? And once again, CHORUS will provide archiving services once FundRef is widely used, but it does not do so now. The articles in the pilot archives were hand selected as far as I know. They were not based on FundRef tagging, which is not yet widely deployed.
“If you have a mandated archive why not just make it accessible?”
Because it’s complicated, time consuming and expensive, and there are acceptable alternatives that are cheaper, easier, and more effective.
The agencies were always going to have their own portals, which CHORUS could provide but has chosen not to. If they are also required by law to collect the AMs where is the avoidable expense?
Must a portal offer full text copies of the papers, or can portals instead offer links to versions hosted elsewhere? If an agency chooses to use PubMed Central as its repository, it certainly won’t be hosting and serving a second version of the paper itself. There is tremendous expense and actual work involved in presenting this material in a useable manner. Posting a directory of a hard drive with a list of filenames does not seem an adequate solution to create the desired usability. As noted, PMC needs a fulltime staff and a multimillion dollar budget to make things work.
PMC is expensive and laborious because of the XML. But the DOE PAGES system promises to offer full text access for very little expense. They basically bootstrapped it. See my post at http://scholarlykitchen.sspnet.org/2013/07/18/meet-pages-does-prototype-public-access-system/. OSTI specialized in low cost portals. The big cost is ingest.
My understanding is that PAGES was reviewed by other funding agencies and found to be inadequate for their needs.
Not that I know of. PAGES, like everything else in the program, is on hold pending OSTP review. OSTP has yet to respond to the draft plans submitted late August. Even DOE cannot implement PAGES, its own ready-to-run system. The federal program is going nowhere fast.
But PMC is unlike anything else in the government so it is not a model. NLM’s annual budget is about $350 million while DOE OSTI’s is about $9 million. NSF has basically no access budget. Anything that is done will be very cheap compared to PMC.
Hmm. I think you’re right: a legalistic reading of this language seems to say that if a federally funded researcher publishes in an OA journal like PLOS ONE, she still also has to submit the accepted manuscript to the agency or its proxy. That surely can’t have been the intention.
“Still, the embargo is problematic in that there remains no explanation or rationale given to support this particular length.”
Needless to say, I am in full agreement with you; but for the opposite reason from yours.
As things stand now for NIH authors, even if they publish in a Gold OA journal like those run by PLOS, the articles still must be deposited in PubMed Central. Doing that deposit on behalf of the author is a valuable service that PLOS performs for its authors, just as, for example, OUP deposits NIH-funded papers in PMC on behalf of authors. The difference between the subscription papers and the OA papers is the length of embargo set before they become publicly available in PMC.
Looking at this legislation, one would think the publicly available version of the paper in a PLOS journal would suffice for the public access part of this, but it still must be deposited.
Typically the legalistic reading is the only way to read a law. If there is any discussion of this section in the bill’s supporting documents it might be useful in court but not as a way to ignore what the law specifically says. All sorts of bad stuff gets hung onto big money bills like this and this poorly thought out section is an example.
And this is what happens when Congress cuts and pastes copy into a bill and doesn’t bother actually asking the industry or parties involved what it all means. Vague and ill considered language will only make it harder on the affected agencies. Machine-readable? Do they mean digitally available or available for data mining or simply able to be indexed?
Then again, Congress loves making laws about things that already exist. As mentioned, the huge majority of these papers are going to be NIH funded.
NIH may be behind this. Their big expense is creating or correcting XML but this law may make that a legal requirement on the authors instead.
I think that’s unwarranted speculation at best. Unless you have evidence that they were involved here, or that the NIH is trying to cut their spending on PMC it’s not a supportable statement. PMC’s mission is best served when researchers are required to do as little work as possible for compliance. Asking researchers to take time away from research to code XML would be contrary to the agency’s mission.
This reminds me of the damage Congress did by including a reference to “multiple copies” in the language of Section 107 (fair use) of the Copyright Act of 1976, ignoring all judicial precedent in doing so though Congress proclaimed not to be changing judge-made law. It’s little things like this that can create havoc later on.
Whatever Congress has mandated, the rubber will hit the road when it comes to research funding decisions. If you don’t make your articles freely available at all, it’s safe to assume this could affect your future funding. Gold OA or CHORUS? It’s hard to see funding organizations penalizing researchers for making their research more available in a better version!
Can anyone help me to see why the obvious solution to the emerging US open-access requirements isn’t just to put everything in PMC? It’s there, it works, it’s respected. Scaling an existing operation to handle more volume is always easier and cheaper than building a new one, no?
It is a viable solution, but it does present some problems. The first is expense. From what I understand, there is a significant startup cost, followed by a per-article charge. As none of these mandates offer any additional funds, these costs come out of other areas of an agency’s budget.
Second, as noted above, much of PMC’s success can be attributed to publishers depositing NIH-funded articles on authors’ behalf. It is unclear if publishers are willing to expand their efforts in this area for every US funding agency, and hence compliance levels would likely be much lower than the NIH sees. This creates an additional burden on researchers, who, in my experience, already have a full plate.
Third, as far as I can tell, there is no mechanism built in to PMC for monitoring and enforcing compliance. If these mandates are to be anything other than toothless suggestions, that’s an additional area of expense and effort.
If these needs can be met by CHORUS with the necessary guarantees at no expense to the agencies, it would seem a superior solution. CHORUS is built on existing infrastructure, reducing costs, and the costs involved will be covered by member organizations which in return gain the benefit of retaining reader traffic on their journal websites.
For the reader and author, it should make no difference whatsoever where the article is made available, and having it in the context of the journal offers some advantages over a separate archive (notices of corrections, retractions, related articles, letters to the editor, etc.).
Indeed, PMC’s proposed charges are very big for most agencies. There is after all a big budget crunch going on and a new public access program is way down the list of priorities. People are trying to save their research programs. Plus NIH is the rich kid and the poor do not like to give to the rich.
But at this point no agency has any plans until OSTP rules so there is nothing to implement. It is not even clear who is going to make what decisions, because there are no decisions to make. As Richard Huffine points out in my Kitchen interview (posted yesterday), what USGS does is ultimately up to Interior HQ, not USGS, plus OSTP of course. Everyone is just treading water.
Your solution may very well be what other HHS sub-agencies do to comply with the language in this Omnibus bill, but I think it would be nearly impossible for Labor and Education to use PMC or take its infrastructure and deploy it for their funded research outputs. PMC is heavily dependent on MeSH and other very specific metadata structures that are built for the biomedical sciences and maintained by the National Library of Medicine or other components of HHS specifically.
It will be interesting to see how Labor and Education address these requirements with their limited means and with the very different sets of vocabularies and concepts they cover in the research disciplines they fund.
Of course it is comforting to scholars and publishers in the humanities that the NEH is not among the agencies affected.
It appears that all (or almost all) of the comments above addressed STM. But various agencies-departments of the U.S. Government support research in HSS and to a smaller degree in LTR.
Perhaps a follow-up article on HSS (and possibly LTR) might be useful.
A lot of the commentary on this article is focused on health and medical publishing, but that’s mostly what the LHHS Committee funds (as noted in the article above; see http://www.appropriations.senate.gov/sc-labor-jurisdiction.cfm). We have written often about the issues facing HSS publishing; a few pieces that immediately spring to mind include:
I have seen and read the articles you mentioned above in your comment; and references to RCUK, etc., are useful and important.
But the article you published today addresses the legislation likely to become law. The first two agencies on the Subcommittee Jurisdiction list are: Administration on Aging; and Administration for Children and Families. These two (and there are others on the list) are likely to support HSS research. The Pension Benefit Guaranty Corporation (Labor) might support LTR research.
So all I tried to do was point out that a follow up article, as detailed and as useful as the one you published today, on the potential impact of the law on HSS research (and possibly LTR research) in the U.S. might be useful.
I wonder how the specific language chosen “submission to the agency, agency bureau, or designated entity acting on behalf of the agency, a machine-readable version of the author’s final peer-reviewed manuscripts” will affect any determination on whether these materials are official records that will ultimately need to be archived with the National Archives and Records Administration (NARA).
Typically, publications of the government are permanent records that are ultimately deposited at NARA. Early discussions were that the articles themselves would not be considered permanent records but that the metadata about what publications were produced under specific funding would be captured and archived with NARA.
In theory, it would be sufficient to include references to the articles in a records archiving practice since the articles were commercially published. Similar decisions have been made in the past in regulatory work, where commercial works are included in archives of dockets only by reference.
I don’t believe that NARA officials are on record at this time regarding how making federally funded research articles publicly accessible will be addressed in records schedules and archiving practices within Federal agencies.
For the record, whitehouse.gov says the president signed the bill into law on Jan 17