chorus group “sakura” (Photo credit: nyaa_birdies_perch)

Today, a group of publishers and publishing associations announced an initiative to address the Office of Science and Technology Policy’s (OSTP) public access memorandum.

The initiative, called CHORUS, for the ClearingHouse for the Open Research of the United States, provides what the memorandum requires in a way that saves the US government and its agencies time and money; gives the public access to publicly funded scientific findings via embargoed access to publishers’ final approved, edited, and formatted papers; creates a smart and useful set of discovery pathways by leveraging emergent digital infrastructure elements; and allows publishers to retain the traffic and customer relationships they depend on to survive in the digital age.

An interview with Susan King, a member of the CHORUS steering group, is also available via the Scholarly Kitchen podcast.

The steering group represents publishers and organizations covering a range of scientific domains:

  • Pat Kelly, Vice Chair of the PSP-AAP Executive Council, Publishing Director, John Wiley and Sons
  • Fred Dylla, CEO, American Institute of Physics
  • Susan King, American Chemical Society
  • Niko Pfund, President, Academic Publisher, Oxford University Press
  • Thane Kerner, CEO, Silverchair Information Systems
  • Ed Pentz, Executive Director, CrossRef
  • Joe Serene, Publisher/Treasurer, American Physical Society
  • John Tagler, Executive Director, PSP-AAP
  • David Weinreich, Government Affairs, PSP-AAP
  • Alicia Wise, Director of Universal Access, Elsevier
  • Fran Zappulla, Senior Director, IEEE Publishing Operations

Agency seats are set aside in the CHORUS governance framework. These will be filled at a later date.

The list of signatory organizations has been growing consistently, and it probably already exceeds those listed below (disclosure — I am a signatory on behalf of my organization):

  • Acoustical Society of America, David L. Bradley, President
  • American Association for the Advancement of Science, Beth Rosner, Publisher
  • American Association for Cancer Research, Kathleen Case, Interim Publisher
  • American Association of Physicists in Medicine, John D. Hazle, Ph.D., President
  • American Association of Physics Teachers, Beth A. Cunningham, Executive Officer
  • American Astronomical Society, Kevin B. Marvel, Executive Officer
  • American Crystallographic Association, Inc., William L. Duax, Chief Executive Officer
  • American Geophysical Union, Christine McEntee, Executive Director/CEO
  • American Institute of Aeronautics and Astronautics, Dr. Sandy Magnus, Executive Director
  • American Institute of Physics, Fred Dylla, Ph.D., Executive Director
  • American Mathematical Society, Donald E. McClure, Ph.D., Executive Director
  • American Medical Association, Thomas J. Easley, Senior Vice President & Publisher
  • American Meteorological Society, Keith Seitter, Executive Director
  • American Nuclear Society, Robert C. Fine, JD, CAE, Executive Director
  • American Physical Society, Kate P. Kirby, Ph.D., Executive Officer
  • American Physiological Society, Martin Frank, Ph.D., Executive Director
  • American Psychological Association, Susan J. A. Harris, Senior Director, APA Journals
  • American Society of Agricultural & Biological Engineers, Darrin Drollinger, Executive Director
  • American Society of Civil Engineers, Patrick Natale, Executive Director
  • American Speech-Language-Hearing Association, Arlene Pietranton, Chief Executive Officer
  • AVS: Science & Technology of Materials, Interfaces and Processing, Susan B. Sinnott, President
  • Biophysical Society, Rosalba Kampman, Executive Director
  • Columbia University Press, Jennifer Crewe, Associate Director/Editorial Director
  • Elsevier, Alicia Wise, Ph.D., Director, Universal Access
  • Entomological Society of America, C. David Gammel, CAE, Executive Director
  • Fabricators and Manufacturers Association, International, Edward Youdell, President and CEO
  • Genetics Society of America, Adam P. Fagen, Ph.D., Executive Director
  • Human Factors and Ergonomics Society, Lynn Strother, Executive Director
  • IEEE, Fran Zappulla, Staff Director, IEEE Publishing Operations
  • Institute of Physics Publishing, Steven Hall, Managing Director
  • Journal of Bone and Joint Surgery, Kent Anderson, CEO/Publisher
  • Materials Research Society, Todd Osman, Ph.D., Executive Director
  • McGraw-Hill, Scott Grillo, Vice President, Publisher, McGraw-Hill Professional
  • New England Journal of Medicine, Christopher Lynch, Vice President, Publishing
  • The Optical Society, Elizabeth A. Rogan, Chief Executive Officer
  • Oxford University Press, Niko Pfund, Academic Publisher
  • Silverchair Science+Communications, Inc., Thane Kerner, President & CEO
  • Society for the Advancement of Material and Process Engineering, Gregg Balko, Executive Director
  • Springer Science+Business Media LLC, William F. Curtis, President
  • Thieme Publishers, Brian Scanlan, President
  • University of Chicago Press, Garrett P. Kiely, Director
  • Wolters Kluwer Medical Research, Sami Hero, Vice President, Marketing

CHORUS is, to me, a much more modern and sensible response to the demand for access to published papers after a reasonable embargo period, as it doesn’t require an expensive and duplicative secondary repository like PubMed Central. Instead, it uses networked technologies in the way they were intended to be used, leveraging the Internet and the infrastructure of scientific publishing without diverting taxpayer dollars from research budgets.

The infrastructure is described visually in the figure below:

[Figure: pages from the CHORUS discussion draft, 5 24]

The solution proposed by CHORUS is simple:

  • Publishers create and support a new domain, CHORUS.gov, with agency input
  • Publishers deposit metadata via CrossRef and FundRef for papers with relevant funding
  • Users can search and discover papers directly from CHORUS.gov or via any integrated agency site
  • Users retrieve papers directly from the publishers’ sites using the version of record
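
As a rough illustration only, the four steps above could be sketched in Python. Nothing here comes from a CHORUS specification: the class names, the record fields, the per-paper embargo length, and the example DOIs (which use a made-up `10.5555` prefix) are all assumptions chosen for the sketch. The one real-world detail it relies on is that a DOI can be turned into a link through the `https://doi.org/` resolver, which sends the reader to the publisher's site.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class PaperRecord:
    """Hypothetical metadata record deposited by a publisher (assumed fields)."""
    doi: str
    title: str
    funder: str          # funding agency label, standing in for FundRef data
    published: date
    embargo_days: int    # assumed agency policy; 0 for a Gold Open Access paper

    def public_from(self) -> date:
        # First day the paper is publicly accessible under the assumed embargo.
        return self.published + timedelta(days=self.embargo_days)

    def publisher_url(self) -> str:
        # The DOI resolver redirects to the version of record on the publisher site.
        return f"https://doi.org/{self.doi}"


class ClearinghouseIndex:
    """Hypothetical index: it stores metadata only, never the article files."""

    def __init__(self) -> None:
        self._records: list[PaperRecord] = []

    def deposit(self, record: PaperRecord) -> None:
        self._records.append(record)

    def search(self, funder: str, today: date) -> list[tuple[str, Optional[str]]]:
        # Return (title, link) pairs; the link is None while the embargo is running.
        results = []
        for rec in self._records:
            if rec.funder == funder:
                url = rec.publisher_url() if today >= rec.public_from() else None
                results.append((rec.title, url))
        return results


index = ClearinghouseIndex()
index.deposit(PaperRecord("10.5555/example.1", "Embargoed paper", "Agency A",
                          date(2013, 1, 15), 365))
index.deposit(PaperRecord("10.5555/example.2", "Gold OA paper", "Agency A",
                          date(2013, 5, 1), 0))

hits = index.search("Agency A", today=date(2013, 6, 4))
for title, url in hits:
    print(title, "->", url or "still under embargo")
```

The point of the sketch is the division of labor: the index holds only metadata, and the paper itself is always served from the publisher's site.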

The timeline for development and deployment is rapid, which underscores the fact that much of what is being brought to bear already exists:

  • High-level System Architecture — Friday, June 14
  • Technical Specifications — Friday, July 26
  • Initial Proof-of-Concept — Friday, August 30

The redundant expenses this approach spares the government from incurring are significant, as publishers have spent years deploying proven solutions to issues like cross-domain discoverability, DOI accession indices, preservation and archiving protocols, and uptime and availability monitoring services.

In addition to CrossRef and FundRef, technologies contemplated in the CHORUS infrastructure include CLOCKSS, LOCKSS, and Portico for archival storage and retrieval. But there’s a more basic benefit to the CHORUS approach and architecture — namely, nobody has to handle the XML and PDF of the article again, as PubMed Central currently must. Estimates put the cost of this handling at $50-60 per article, which includes translation protocols, QA and QC, loading, and storage. These costs are unnecessary and redundant, and shouldn’t be part of an efficient government solution, as they force taxpayers to shift money away from researchers.

It’s good to see publishers being proactive about the OSTP public access memorandum. Earlier mandates left publishers in a defensive role. This time, the publishers seem to know they can’t win the talking points and PR game, so they have to ensure that the solution shows what publishers are capable of doing and how their involvement helps all stakeholders. CHORUS demonstrates in concrete terms that involving publishers in publishing actually makes the most sense financially and economically.

An alternative scenario that has been actively promoted among some of the agencies contemplating the OSTP public access memorandum has been to expand PubMed Central to become the US government’s repository for all scientific papers resulting from government funding. Comparatively, CHORUS seems far superior to this path. CHORUS saves the government money. And, by not reducing publisher revenues by decreasing direct traffic, as PMC does, the government isn’t caught in a win-lose scenario for the US economy and US taxpayers.

CHORUS also signals the possibility that scientific publishers and governmental agencies might be able to restore the public-private partnership that is more typical and admirable than the recent years of wedge-driven discord. Publishers provide objective, third-party validation and specialized technologies and expertise in content management and development, which benefits the government. CHORUS is a great example of how these same strengths can be exhibited in the digital age, all while saving the government money and trouble.

I think everyone involved should sing its praises.

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.

Discussion

83 Thoughts on "Joining a CHORUS, Publishers Offer the OSTP a Proactive, Modern, and Cost-Saving Public Access Solution"

I didn’t see any library organizations involved. Who is speaking for the end user in this CHORUS?

The more relevant university stakeholders are the provosts and the researchers. Spending to create duplicative repositories deflects money from actual research, so new repositories–which incur substantial costs, notwithstanding occasional rhetoric to the contrary–reduce the amounts available for research itself. The OSTP memo is explicit in this respect: it calls for “identification of resources within the existing agency budget to implement the plan.”

The CHORUS architecture is designed to maximize re-use of existing publishing infrastructure while delivering the public access required by the memo. This is the most responsible approach from a budgetary perspective.

Has the CHORUS group gained any input from the OSTP or other government agencies to date? Is there a sense for what the willingness is to venture down this path as opposed to attempting to build onto PMC? I agree that what is outlined by CHORUS is a highly sensible route that builds on existing investments and systems.

The CHORUS Steering Committee has had discussions with a number of relevant stakeholders, including representatives of OSTP, a number of agencies, and several university and library organizations. The Committee has documented the CHORUS architecture, and is actively encouraging input from each of these groups to ensure that CHORUS meets the needs of the community as effectively as possible.

It’s either societies (most of which are publishers) or libraries who will have to speak for researchers, who are the real end users, unless the researchers decide to speak for themselves.

I think the bigger issue here is that PMC is already built, so most of your proposed cost savings aren’t really relevant. I know you’d like PMC to be scrapped so you can keep your traffic up, but the point you keep missing is that the federal government hasn’t promised any business the right to preserve its business model.

Fear change if you must, but the smart business will adapt to the fact that publishing is a different business now and all the OSTP is doing is asking you to take the hint.

PMC is built to a certain point, but it costs millions of dollars per year to run and enhance. Do you know how much? It’s only rumored. They won’t tell us, but educated guesses are between $2.5 million and $5 million per year overall. These are the direct costs. The indirect costs are the millions in lost revenue for publishers who have traffic diverted to PMC, including OA publishers (yes, a study I did of PLoS traffic showed that more than 20% of their traffic is diverted to PMC, which reduces their advertising inventory and costs them a lot of money — there is no reason to believe this is otherwise for OA publishers elsewhere). COUNTER reports are thrown off, journals devalued, and so forth. It is a very damaging and costly approach to solving a problem that CHORUS proves can be solved more elegantly and without hurting anyone or costing the US taxpayer anything.

CHORUS seeks to save taxpayers money, and allow taxpayer dollars to be spent on research rather than on publishing. Do you not share that goal?

Do you really think there are no costs in expanding PMC by 20 times its current size? For coordinating the different agencies, each with its own complex set of policies?

“Fear change if you must, but the smart business will adapt to the fact that publishing is a different business now and all the OSTP is doing is asking you to take the hint.”

Isn’t that sort of adaptation exactly what CHORUS represents? I note your employer is participating.

Libraries will likely continue to get their access through publishers if they are not (competently) open access publications, otherwise hopefully there might be direct access (and search!) through CHORUS.

One sentence concerns me:
“Publishers create and support a new domain, CHORUS.gov, with agency input”
As far as I know, .gov is reserved for government use. Unless the plan is to make this primarily a government operation, good luck getting that .gov address.

The OSTP memorandum calls for public access to “final peer reviewed manuscripts or final published documents.” CHORUS meets this requirement, which leaves to the discretion of the publisher which artifact (manuscript or published paper) is delivered. The important innovation for the scientific record (compared to a distinct repository) is that users will retrieve the publicly accessible article in the context of its journal of origin, on the publisher site. This means users will navigate to the one location that concatenates all relevant components of the publication, including the version of record, data, addenda and errata, retractions, etc. Within the framework of the specific policies agencies develop (with which CHORUS participants will comply), publishers’ varying business models will drive how and when these elements are delivered (e.g., for “Gold Open Access” papers, no embargo would need to be observed).

CHORUS will consist of infrastructure and open APIs that are not restricted to a particular website or specific search tools. While there will be a CHORUS-run website (perhaps CHORUSaccess.org — not .gov, which is reserved exclusively for government owned websites), many agencies have expressed interest in developing their own portals, which is consistent with the CHORUS idea. While there may be costs in developing such portals, CHORUS nonetheless reduces government costs by doing the collecting, tagging, storage, and delivery rather than requiring additional investments for these duplicative tasks.

The open APIs are a really important part of this proposal. This is a new system that will allow for accurate identification of articles in a lot of ways, and so any resource that provides a discovery service (including things like PubMed or Google Scholar) can access the information and improve its results. This has far-reaching improvements for discovery that go beyond the initial website itself.

This does look potentially positive, though I think there is a big trust gap to be bridged before researchers, librarians and indeed the government will be happy entrusting all this to the very publishers who up till now have made themselves roadblocks in the path of all such initiatives.

Here’s the part I don’t understand at all: “CHORUS saves the government money. And, by not reducing publisher revenues by decreasing direct traffic, as PMC does”. Surely the solution that saves the government the most money will be one that does reduce how much it spends with publishers? It looks as though CHORUS is trying to reconcile two conflicting goals: keeping government expenditure down and publisher revenue up.

Finally: Chuck asked, “I didn’t see any library organizations involved. Who is speaking for the end user in this CHORUS?” Kent replied that libraries are not end-users, which is true but not really to the point. Chuck’s twofold question remains: who is speaking for the libraries, and who for the end users?

If a government spends taxpayer money on redundant systems that can be achieved without costing taxpayers money, that’s an inefficient government and an inefficient use of taxpayer money. If a government creates conditions that slow down the economy unnecessarily, reducing tax revenues and cutting off services for citizens as a result, that’s poor civil service.

The approach epitomized by CHORUS shows us a way to restore millions to the NIH research budget while also keeping the larger economy more vibrant through employment, growth, and vitality of an important information industry — and this is not an OA/non-OA debate, as the revenue threat of PMC cuts across both business models.

If an OA publisher (such as BioMedCentral) sells advertising, then any repository that takes traffic away from that publisher’s site reduces impressions, thus reducing ad inventory, thus reducing revenue. That’s revenue coming in from commercial sources that could be used to subsidize the APC for researchers, reducing their costs.

In this post from last year, I analyzed the situation. What’s really interesting is that PLoS has a greater drag on its traffic because of PMC than the subscription publishers that have been analyzed to date. In another post, I analyzed across the PLoS titles, and the effect was even greater for PLoS ONE — upwards of 30% of traffic lost to PMC. If 30% of PLoS ONE’s ad impressions never exist, that is inventory it can’t sell.

David’s exactly right in the logic. The net financial effect for PLoS alone is probably in the low-to-mid six-figures in revenue loss attributable to PMC diverting traffic from their publications. It would be interesting to know this across all OA publishers and publications. It might exceed $1 million once you tally it all, or even more (PLoS + BMC + SAGEOpen + etc.).

If ad revenue were such a great source of income for publishers, they wouldn’t be fighting tooth and nail to keep people away from their content – they would instead be opening up their sites and imploring people to read their papers. The fact that they don’t confirms not only that current ad revenue accounts for a trivial fraction of their total income, but also that they believe that even if there were zero barriers to accessing articles, the revenue from ads wouldn’t come anywhere close to replacing their lost subscription revenue.

Yes, journals can make a small amount of money selling ads. Correct me if I’m wrong, but aside from a few very high profile places like Nature, NEJM, etc., I suspect we’re talking a few million dollars a year for the whole industry – a drop in the bucket in a $10b industry – certainly not something on which to base important policy decisions.

Hundreds of journals make a large percentage of their revenues from advertising. You think the entire ad industry for journals is “a few million dollars”? It’s probably in the hundreds of millions. I’ll check with some experts, but when you throw together all the print and digital advertising across all science publications (and we often forget dentists and ophthalmologists in the health sciences, and equipment manufacturers for labs), it’s a pretty huge number.

PLoS itself made more than $450,000 in 2011, and this has certainly grown since then.

Advertising sales is a blend of volume, frequency, and targeting in the online space. Access controls help with targeting, and a qualified audience is more valuable than an audience of unknown people. Access controls don’t decrease traffic very dramatically, only marginally, and often that traffic is not core to a content site.

I can confirm that advertising revenues are vastly more extensive than “a few million dollars for the whole industry”. Studies over the last few years put non-subscription revenues as ranging from 20% to 40% of total revenue for STM journals, the bulk of which comes from advertising.

I’m surprised by those numbers. It’s certainly not true in the life sciences – the society publishers I’ve spoken to say they’ve struggled to get much ad revenue. So these numbers must be heavily biased towards clinical journals, which have a very different economics.

So, are PLoS’ advertising revenues skewed toward PLoS Medicine? It seems you might know something about what advertisers are buying within your family of journals.

Health care journals carry advertising, but so do many science journals. It’s a larger and more widespread business model than you think. ASM has plenty, BMC has advertising, etc. Again, it’s more common than you know.

If PLoS would be losing so much advertising revenue if readers accessed papers via PubMed Central – how come PLoS just issued a statement that supports PMC and dismisses CHORUS?
(Yes this is a rhetorical question).

According to Outsell, advertising accounts for more than $2.1 billion of revenue for STM publishers. And it’s growing.

There is no way that number is for journals only.

According to the industry’s STM report (http://www.stm-assoc.org/2012_12_11_STM_Report_2012.pdf) 4% of journal revenues come from advertising. Based on their estimate of industry size ($9b) you get total ad revenues of $360m. That’s a lot, of course, but from what I’ve been reading this is very unevenly distributed, with the bulk of that number being advertisements from pharmaceutical and equipment companies in medical journals.

I looked more closely at some numbers for PLOS, and our ad revenue last year was about $15 for every published article. About 1/3 of our page views are at PMC, so scale that up to $22 per article if ad revenue scaled linearly with page views. This is probably far more typical of life sciences journals than the industry-wide numbers, and suggests total web revenue for the industry of at most $50m. It’s not chump change, but it’s a pittance compared to other revenue streams, and would have to grow 10x or more for it to have a major impact either on APCs or on journal bottom lines.
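
The arithmetic in this exchange can be checked with a short sketch. The inputs are only the figures quoted in the comments above (4% of a $9b industry; about $15 of ad revenue per article with roughly one third of page views at PMC), and the linear scaling of ad revenue with page views is the commenter's own stated assumption, not an established fact. The function names are made up for the illustration.

```python
from fractions import Fraction


def industry_ad_revenue(industry_size_usd: int, ad_share_percent: int) -> int:
    """Total ad revenue if ad_share_percent of industry revenue comes from ads."""
    return industry_size_usd * ad_share_percent // 100


def revenue_if_all_views_on_site(rev_per_article: Fraction,
                                 share_of_views_elsewhere: Fraction) -> Fraction:
    """Scale per-article ad revenue up linearly, as if no views went elsewhere.

    Assumes revenue is proportional to on-site page views (the commenter's
    simplification), so revenue / (1 - share elsewhere).
    """
    return rev_per_article / (1 - share_of_views_elsewhere)


# 4% of an estimated $9b industry
total = industry_ad_revenue(9_000_000_000, 4)

# About $15 per article today, with about 1/3 of page views at PMC
current = Fraction(15)
scaled = revenue_if_all_views_on_site(current, Fraction(1, 3))
forgone = scaled - current

print(f"Industry ad revenue: ${total:,}")
print(f"Per-article ad revenue with all views on site: ${float(scaled):.2f}")
print(f"Per-article ad revenue forgone: ${float(forgone):.2f}")
```

Under these inputs the scaled figure is $22.50 per article, which the comment rounds to $22, leaving roughly $7 per article attributed to views at PMC.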

And answering your question about whether our ad revenues are focused in medicine, they are not because PLOS Medicine (and PLOS in general) does not take ads from drug companies who account for the bulk of advertising in medical journals.

So, if you take the STM numbers, that makes advertising a $376 million business. As for ad revenues dropping for a particular publisher, that can happen, especially if a lot of the advertising comes from pharmaceutical companies. As you note, you can still make quite a lot without taking pharma advertising.

It’s interesting to note that you’re giving up $7 per article in ad revenue to PMC. You’re basically admitting that PMC is hurting business at PLoS, which is my point. If you take that across the industry, that’s millions of dollars in suppressed economic activity. Add that to the millions it takes to run PMC, and how unnecessary it is in the Internet age to have a redundant repository, it’s kind of ridiculous. Especially when the ostensible goal was to help US taxpayers. It’s actually costing US taxpayers quite a lot, and they don’t even know it.

And according to Elsevier’s 2012 annual report, ad revenue DROPPED ~30% in 2012.

Note that the estimates of journal income percentages used in the linked report come from a 2008 study and may or may not accurately reflect current levels. The journals market has changed quite a bit over the last five years and factors like the rise of OA publishing and a shift from institutional to consortia subscriptions may have altered those percentages.

Yes. Technically PLOS is giving up revenue by allowing articles to be posted on PMC. But our goal is not to maximize revenue – it is to ensure that our authors’ research reaches the widest possible audience. That is what they are paying us to do, and that mission is served by allowing people to find and read their articles at PLOS, at PMC or anywhere else.

So, you could make the same revenue and charge your authors less if PMC weren’t taking traffic — and, by extension, ad revenues — from PLoS. As it is, your authors are subsidizing PMC by overpaying for your lost ad revenues.

If PMC did not exist, PLoS’ OA research could reach the same people via PLoS sites, your authors could pay less to publish, and the NIH would have millions of dollars more per year to devote to research. That would be a win-win-win.

“So, you could make the same revenue and charge your authors less if PMC weren’t taking traffic — and, by extension, ad revenues — from PLoS.”

This would be true if reaching readers was a zero-sum game — if every researcher who reads an article on PMC would otherwise have read it on the PLOS site. But there’s no reason to think this is the case: it’s more likely that at least some who read it on PMC would otherwise never have found it at all.

So it becomes a trade-off: the additional contribution to the progress of science versus the loss of advertising revenue. In PLOS’s judgement, the trade-off is a net win: the scientific advance is greater than, or more important than, the revenue loss.

Not only that, it’s likely that readers encountering PLOS papers for the first time on PMC will go on to read more PLOS papers at the original site. The free versions on PMC are then acting as adverts for the revenue-generating versions provided by the publisher. This is exactly the same principle by which songs are played on the radio, for free, as enticement for listeners to buy those songs (or others by the same artist).

There’s a pervasive tendency for content providers to see every access that doesn’t go through them as a lost sale. (We saw it of course when music labels wanted to shut down Napster, ignoring the boost in music sales that it provoked. To be clear, I am not arguing that such filesharing is morally right, only that industry’s kneejerk reactions to it tend to be mistaken.) More forward-thinking content providers instead use such channels as promotional tools, and seek to use them to drive growth.

Michael Eisen admitted that a view not going through PLoS is a potential lost sale. If PMC is a promotional tool, what is it promoting? At what cost?

You are defending your own status quo.

From PLOS’s perspective, what PMC promotes is PLOS. But that is only a pleasant side-effect from its true role which is to promote the progress of science.

So, taxpayers are paying to promote PLoS? I thought this whole OA thing was about taxpayers getting what they’d paid for, not paying more for publishers. Now you’re saying PMC is an explicit PLoS promotional tool, one that PLoS is happy to not only have authors pay more for (by foregoing ad revenue due to diverted traffic), but also happy to have taxpayers paying to sustain?

I’m sorry, but it makes no sense no matter how you cut it.

In several important respects discovery is a zero sum game. Searching one way or place means not searching in the other available places or ways. Plus the amount of time available for search is limited, which I call the attention budget. In this regard PMC has a serious limitation which I have been studying. In the reference list for each article the only live links are mostly to other articles in PMC, plus a few to PubMed.

This means you are being confined to NIH funded research, which is a big, strange bias indeed, in effect blocking much of the world’s research. For example, I picked an article at random and it had 75 references but only 15 live links, all but one to other PMC articles. Given that CrossRef gives live links to all references this is a truly strange limitation, basically a small closed universe and one that can be quantified. It seems that using PMC as a discovery tool is a bad plan.

So yes it is likely that someone searching on a specific scientific problem would wind up at the publisher’s site if PMC did not exist, and they would probably be better off there.

“So, taxpayers are paying to promote PLoS”

As you said in another context, it’s a win-win. When taxpayers find a PLOS paper on PMC they both have the paper and discover PLOS. Good for the taxpayers, good for PLOS.

Actually, the win-win would be to shut down PMC. That would save taxpayers money and give more money to PLoS so author fees don’t have to rise. That also saves taxpayers money, from grants. And more NIH money could go to research. Right now, for PLoS to win via PMC, taxpayers lose.

Libraries have a vested interest in how CHORUS works, since libraries currently pay the bills, for the most part, for the current system. They are deeply interested in how things could be different for end users. Libraries have a vested interest in providing information to their customers, the readers of publisher-mediated content, and in helping them identify and access such content. We spend many millions of dollars annually with such a goal, whether that’s through identification systems like bibliographic specialty databases or through retrieval systems investment and design.
So far the publishing community has been unable to solve the simple problem of identifying in a uniform manner “freely available” content with a machine readable standard notification system for such content. In the face of that simple failure, how much “trust” can anyone have in the idea that publishers working together (with the government) will solve the even larger challenges of systematic provision of content in an environment that guarantees accessibility to government funded research?
Without intense involvement of those two communities, i.e. libraries and end users, the system won’t hold as much promise as it should.

With respect to trust, you can reasonably expect that the ultimate deliverables of CHORUS will be enacted with executed Service Level Agreements (SLAs) between the relevant stakeholders (including any participating agencies), as is common in business arrangements in the commercial and NGO sectors. These SLAs would provide contractual guarantees around key provisions related to access and availability, archiving, auditing, error remediation, etc.

With respect to your second point, fortunately for publishers the preponderance of revenues under traditional models are not derived from government, but rather from thousands of private institutional and individual subscribers around the globe. The government expenditures that are being saved by CHORUS are the costs of establishing and/or expanding duplicative government repositories, and of reprocessing content that has already been processed at publisher expense. The publisher revenues that are being preserved are the subscription, membership, and advertising income that fund the legitimate and valuable services provided by publishers.

With respect to the end user, there are several in the case of CHORUS. One is the researcher/author, for whom CHORUS provides the most seamless and frictionless method of complying with public access requirements. Another is the reader, for whom free access is being provided universally, regardless of discovery source (public search, agency portal, CHORUS interface, etc.), and better still, provided in the context of the journal of origin. A third is the funding agency, for whom CHORUS will provide scalable, automated methods to catalogue and report on its research outputs.

It would be helpful if you could better explicate the nexus between the librarian community and the CHORUS initiative. What, if any, are the library concerns that you believe CHORUS is overlooking?

An interesting initiative! Can you say a bit more about the expected embargo periods before papers are made openly available? Will there be a distinction between embargo periods by papers in different subject areas?

What people may be missing is that CHORUS is a voluntary industry initiative to meet a Federal need. This is a major step forward. An analog is the North American Electric Reliability Council (NERC) formed after the great blackout in the 1960’s. CHORUS is a grand bargain, where the industry keeps the eyeballs by meeting the Federal needs, provided the latter are reasonable. Time will tell. This is a promising start, not a finish, so much work lies ahead. Stay tuned.

Agreed–I think there’s some misunderstanding out there, complaints that this doesn’t immediately and 100% address every question that funding agencies might have. I’ve seen complaints that this doesn’t address the researcher data sharing aspect of the OSTP mandate, which is well outside of the purview of publishers, for example.

This is a first step, a good faith effort by publishers to provide a solution for some parts of the mandate and to start the conversation. Though there are likely some who care more about who is doing something rather than what is being done, it’s hard to see this as anything other than a positive step in the right direction.

Interesting to see this happening. The central role of CrossRef is useful in terms of trust and functionality, and could well help to bring on board a good number of libraries and societies. We could also see further initiatives like ORCID being developed between all the key stakeholders in a manner that brings related data/research outputs together alongside published articles (perhaps even working with the likes of DataCite / Figshare). It’s up to the agencies to engage and publishers to provide proof of concept, but I’d certainly be interested in seeing whether CHORUS would go ahead anyway should OSTP et al not engage with it directly (even if without the .gov listing), which has to be one scenario.

“where the industry keeps the eyeballs by meeting the Federal needs, provided the latter are reasonable.”

That’s the kicker. Those of us who remember PRISM, the RWA and the Georgia lawsuit are not predisposed to imagine that publishers’ notions of what is “reasonable” will coincide with ours. To pick one obvious example, I’m pretty confident that the OSTP’s 12-month embargo periods will quickly become 24 or 48.

I think publishers have a lot of bridge-building to do before librarians and researchers will trust them with something as important as a PMC replacement, and the CHORUS proposal has come too soon for that to have happened.

Yet, PMC’s value is entirely based on taking published papers and putting them in a government-run repository. It seems there’s enough trust out there to leverage the final output from publishers, and this remains the best version of papers. Seems the trust level around publishers might be higher than you want to believe.

I don’t think there’s any failure of trust in publishers to be able to deliver a published product. The trust gap, as I’m sure you realise, is around business practices. The embargo period is one such, but by no means the only one. You can’t tell me that if publishers owned the OSTP’s dissemination system, that wouldn’t give them leverage to obtain longer embargoes.

Boy, that’s paranoid. Publishers keep their agreements. If publishers and the government agencies agree on an approach, that will be a good agreement, and it will be honored. The NIH public access policy has been honored for years, despite its obvious flaws, one of which is PMC.

Am I paranoid? If I am, it’s in large part because initiatives like PRISM, the RWA, various publishers’ initial responses to the OSTP call and the Georgia state lawsuit all show that publishers’ interests are not aligned with those of researchers; in many cases, they are diametrically opposed. I don’t think it’s realistic to expect researchers, librarians and the government that acts for them to forget all this history overnight.

I’d be careful before speaking of the “interests of researchers” so cavalierly. Researchers spend most of their time as information consumers, not information producers, and they really want highly filtered, high-quality content, not just more content. Study after study shows this. New unfiltered mega-journals rank poorly compared to traditional branded journals. As readers, researchers are largely satisfied with the current system. Most of them have more than enough access. In this regard, fringe cases are being over-emphasized.

As far as researchers’ role as authors, in survey after survey, OA comes in last as something researchers care about. In a recent survey, CC-BY was noted to be the least-favored licensing approach for content, with copyright ranking much better. The RCUK mandate is an initiative that threatens research dollars and academic freedom, and has been called out as such. This would seem to indicate that the “interests of researchers” are not OA and CC-BY, yet that is what is being pursued at all costs. I think the real problem might be in how OA mandates are not aligned with the interests of researchers.

You have strong opinions about how the world should be. The facts as they exist and as they are emerging often contradict your opinions.

“Researchers spend most of their time as information consumers, not information producers, and they really want highly filtered, high-quality content, not just more content. Study after study shows this. New unfiltered mega-journals rank poorly compared to traditional branded journals. As readers, researchers are largely satisfied with the current system. Most of them have more than enough access.”

This does not at all reflect the reality of the paleontological community at least (to which Mike Taylor belongs). I’ve only ever heard laments that more “minor” data be published more often (e.g. descriptions of more specimens of known species, multiple photos in different orientations for bones, field notes) and complaints about when data is highly filtered as in Nature and Science. I have yet to hear anyone express concern that e.g. PLoS ONE is publishing too much “low quality”, “unfiltered” data, and certainly don’t feel that way myself. That you think most researchers have more than enough access is laughable in paleontology at least. Just note the constant requests for pdfs on the Dinosaur Mailing List. If I had to guess what percentage of papers the average vertebrate paleontologist has to get from asking authors or colleagues as opposed to being able to download from the journal’s website, I’d say 50-70% (with the range maybe being 30-90% depending on institutional access).

Also your division of consumer vs. producer is largely useless, as a good percentage of the information we consume is necessary for us to produce. Similarly, we need near complete consumption in whatever area we study in order to produce usefully. Even if we only study a small section of diversity like theropod dinosaurs, we’d need access to most of ~90 articles in ~40 journals published in 2012 for instance. Every paleontologist I know is too busy producing and consuming enough to allow that to have the free time to consume much else they would like to.

I think misconceptions like yours about what modern researchers have and desire are a big reason publishers are content with the current system.

Generalizing as Taylor does from paleontology can lead to poor conclusions. I was speaking generally, and there are multiple data to back up what I’m saying, across all fields. There can be dramatic differences between fields, but as noted elsewhere, we are over-emphasizing fringe cases then.

I don’t see why at least related fields like biology or geology would be different from paleontology in the factors I mentioned, and this entire area is hardly small enough to be considered “fringe”. If anything, the popularity of dinosaurs might have given their internet presence a head start, and thus the situation in vertebrate paleontology may be predictive of what other fields will be like as the internet grows and online communities form.

If life and earth sciences ARE that different from e.g. humanities and mathematics in how their researchers feel and what their researchers want, surely it would be better to discuss different rules for different fields than just dismiss life scientists as a fringe case that have to live with the resources given to other researchers. Researchers who apparently have more access to pdfs than they require, want to read only a fraction of publishable work, and have enough spare time to read more than they write.

I wonder if you could cite references to some of the “multiple data” from “study after study” which you refer to as showing this is the case.

Size doesn’t make it fringe, behavior makes it fringe. Astronomy is another example — a large field that, if you based all your assumptions on how it behaves, you’d be misled about what happens generally.

My references are multiple reports from STM and Outsell. I’ve cited them in numerous posts over the years. I’m not digging them all out. Google away!

“Size doesn’t make it fringe, behavior makes it fringe.”

If a majority exhibits a certain behavior, that cannot be considered fringe by definition. Thus I disagree and indeed you contradict yourself in your next sentence- “Astronomy is another example — a large field that, if you based all your assumptions on how it behaves, you’d be misled about what happens generally.” “Generally” equaling what most often occurs and thus referring to size.

“My references are multiple reports from STM and Outsell. I’ve cited them in numerous posts over the years. I’m not digging them all out. Google away!”

Wow, that helps me come around to your point of view so well. Maybe it didn’t occur to you that I tried to search this blog for said surveys before asking you for their citations, and indeed even after this latest comment adding “STM” or “Outsell” to the search bar didn’t result in the studies you refer to. Publishers could learn a thing or two from researchers, like say… referencing your sources.

I was tired, and I feel no obligation to look up data for everyone who drops in a comment based on anecdotes. That said, just now, I took 30 seconds to find two references for you:

1. http://www.ingentaconnect.com/content/alpsp/lp/1998/00000011/00000004/art00002;jsessionid=1ert2t5ycd1is.victoria
2. http://www.emeraldinsight.com/journals.htm?articleid=863951

I guess I haven’t made my point clearly. The roles that matter aren’t domain-based (paleontologists vs. astronomers), but role-based. Authors in astronomy or paleontology or dentistry or physics act as authors; readers in those same fields act as readers. In many ways, they exhibit contradictory priorities between these roles.

In addition, you’ve only given me anecdotes, while I’m using data. Now that I found some of the data again, let’s talk after you’ve reviewed it.

Here’s a post to read on the topic:

http://scholarlykitchen.sspnet.org/2012/10/16/authors-arent-scientists-scientists-are-authors-why-catering-to-a-role-could-prevent-robust-publishing/

I think that’s another misunderstanding here. Certainly implementation of the various funding agency policies will involve legally binding agreements, rigorous methods of ensuring compliance, and backup plans, just in case. As noted elsewhere in these comments, CHORUS is not the entirety of what will be done, just a mechanism to make things easier (and less expensive) to implement.

Fortunately Mike, you are not the US government so it is not your demands that need to be met. I doubt your prediction although IEEE has asked for a 24 month embargo period for their technology stuff, which may be reasonable. The point is to have the discussion, something I doubt you want.

Not only am I not the US government, I am not even a US citizen. So I really don’t have a horse directly in this race, except in so far as science is a global undertaking. But my goals and those of the OSTP are strongly aligned: we both want the maximum possible access, as quickly as possible, with the fewest possible restrictions, to the maximum possible range of published research. I certainly don’t speak for the US Government; but I do speak in some harmony with it.

It’s strange that you would claim I don’t want to have the discussion right here on the blog where I have come precisely in order to have the discussion.

What frustrates some is that these desires–“maximum possible access, as quickly as possible, with the fewest possible restrictions, to the maximum possible range of published research”–are expressed as unilateral, without apparent consideration for the sustainability of high-quality scientific communications. All of the steps between the completion of a research project and the publication of its results require effort. At scale, effort requires financial resources. Perhaps you believe the entire scientific publishing ecosystem should be overthrown and we should start from scratch. Notwithstanding the negative impacts this will have on the quality, usability, organization, and preservation of the scientific record, it’s inconceivable that any future arrangement will produce scientific literature without costs. So looking at the equation solely from the consumption end produces unrealistic rhetoric.

I like the idea of a CHORUS-like organization establishing metadata standards for publishers to identify government articles. Why, though, is CHORUS proposing to run a search engine? If the metadata’s there, won’t Google Scholar and others do this job?

CHORUS will provide APIs to enable consumption of its aggregated metadata by any and all 3rd-party search tools (including Google, Microsoft, Scopus, PubMed, any agency portal, etc.). The CHORUS interface is conceived as a place to visualize and sort the papers resulting from federally-funded research, not as a primary discovery mechanism. As you imply–and is supported by research–information discovery is a heterogeneous function served by many existing tools and services. Expenditures to duplicate a function that is already well served would be inconsistent with the foundational principles of CHORUS. CHORUS exists to identify and provide public access to these articles.

Indeed, the agencies will probably develop their own portals, using CHORUS data. The OSTP guidance sort of calls for this and most funding agencies already have portals. There may even be an interesting competition for discovery features. The point of CHORUS is that every agency does not have to create its own article ingest system, which would be a nightmare for funded authors.

I talked about this problem here:
http://scholarlykitchen.sspnet.org/2013/02/25/confusions-in-the-ostp-oa-policy-memo-three-monsters-and-a-gorilla/

So I really like CHORUS.

CHORUS appears to be a general way to address some of the issues whilst maintaining significant control in the hands of STEM publishers and their related organisations. Like Mike Taylor, I am sceptical of the intentions of publishers in the wake of them doing everything possible to undermine open access to publicly funded research, but at first blush CHORUS itself seems a sensible solution. The proof will of course be in the eating and how STM publishers manage CHORUS.

I am somewhat concerned that the publishers aren’t up to this – there are numerous examples of them not being able to correctly label Gold OA works, falsely claiming they own the copyright or sending a user off to a clearing house to negotiate access to a Gold OA work. Whilst Service Level Agreements might imply a level of service we can expect – what’s to hold publishers to these contracts? As a consumer I see these things broken all the time or updated at the will of the company involved. Perhaps when the US Government is your “customer” companies are better behaved…?

I appreciate CrossRef is now doing more useful things metadata search-wise, though I recall a time, not that many years ago, where CrossRef would only return the first author on a publication as part of the metadata made freely available.

As a consumer of such things (a researcher) I would feel more comfortable with CHORUS if the metadata supplied by publishers went to a fully-open repository rather than just through CrossRef. It is naive to think publishers are best suited to providing useful access to their data, and whilst CrossRef is certainly improving (e.g. CrossRef Metadata Search) in this regard it still represents one API, and we are restricted to what that API allows/does. If the metadata were fully open people would be free to develop their own tools and APIs around those data. Publishers would still benefit from traffic driven to their sites and opening up these metadata more widely would buy a good deal of trust and appreciation from researchers.

Perhaps CrossRef MetaData Search and its API has gone a long way to resolving this issue of openness and providing full access for free to the metadata? But innovation could be supported by being a bit freer here.

Thanks, but that is not the same thing. CrossRef’s MetaData Search API is open but you can do only what that API allows. If the metadata were open, anyone could package that up into a resource and provide their own APIs. Or am I misunderstanding Thane’s point?

These are helpful comments about CrossRef’s access policies.

Just to clarify, the reason that CrossRef queries formerly returned only the first author was because CrossRef only had the first author. When CrossRef started, it was thought that the first author was all that was necessary to accomplish reference linking, and at the time, that was all the CrossRef metadata could be used for. So the original schema did not allow for depositing more than one author. This has not been the case for a long time.

CrossRef does have other APIs than CrossRef Metadata Search, including a free OpenURL interface. Users that have needs beyond what is available through the free interfaces are usually able to access the metadata through CrossRef’s Affiliate programs.

Feedback on existing APIs is routinely incorporated into improvements and the CrossRef board frequently reevaluates its access policies for the publisher metadata. These policies continue to evolve.

One more thing: for what it’s worth, more than 1900 worldwide libraries (including at academic institutions and government agencies) have free affiliate accounts at CrossRef to access the publisher metadata.

Thanks for your comments Carol. The history of the OpenURL single author issue is most useful. I was in two minds whether to critique CrossRef on past performance; I’ve seen some very nice things from them in various presentations recently and it seems they’ve got people working on their products that really understand openness, APIs etc. As good as they may be now, however, that still doesn’t totally remove the risk of having these data in the hands of one repository. What is to stop CrossRef from changing things or withdrawing services that may have been built upon an existing CrossRef API or free service?

The only way to truly free up this part of the ecosystem is to open up the meta data. Short of that CrossRef may not be so bad after all. So perhaps it is just my scepticism that the publishers can manage this properly when we already have PMC doing the job nicely..?

Based on the experience of seeking to confirm that Gold APC expenditure created the expected outcomes, I have to agree that the current state of affairs matches the description above “there are numerous examples of them [publishers] not being able to correctly label Gold OA works, falsely claiming they own the copyright or sending a user off to a clearing house to negotiate access to a Gold OA work.”

CHORUS looks like a great idea in theory, but I wouldn’t want the US government to suspend plans until CHORUS has been shown to work in practice. And part of that demonstration is to remove the above frustrations.

The CHORUS model is too complicated and too obviously an attempt by publishers to retain control. The diagram has been let through several editorial gateways without being spell checked.

CHORUS and PMC are both relatively complex system designs, as they must be given the intrinsic complexity of the scientific publishing system. PMC is complex on the ingest side while CHORUS is complex on the access side. Neither is too complex to work.

It is not a matter of control but of not having the government stealing eyeballs. The publishers are buying back the eyeballs by solving the ingest problem for the government. It is a fair exchange and a natural extension of the publisher’s role in the system, basically a tagging job. The government should not be publishing what has already been published.
