When the open access (OA) movement started, the goal was to provide taxpayers with greater value by making scientific reports of research funded by taxpayers freely available. As part of this, some hoped that publishers would be transformed, that some big publishers might be eliminated or at least humbled, and that publishing itself would be reimagined. Academics and universities would benefit, the public would get more for their tax dollar, and a general liberalization in scientific communication would emerge.

Fast-forward nearly 15 years, and here’s what we see:

  • Publishers have adapted, and largely no longer view OA as a financial or existential threat
  • The taxpayer is being asked to pay more for government-run repositories, enforcement procedures and related staff, and a matching bureaucracy
  • The value proposition of OA is being viewed more skeptically, and blanket statements that were once embraced are now being assessed more critically
  • Academics are realizing their academic freedoms could be curtailed with mandates like those in the UK and Europe
  • Universities are being asked to implement new, problematic, and potentially expensive management structures to deal with the demands of funded OA
  • Research funding is being reallocated, squeezed, or reduced to pay for OA publishing
  • Distractions, inefficiencies, and limitations are more regularly cited

These problems are perhaps most acute in the UK with the recently revised Research Councils UK (RCUK) policies. The problems appear to be taking over the conversation. As Martin McQuinlan wrote in a recent edition of Times Higher Education:

. . . the RCUK-government response to Finch has essentially reduced all this complexity to a pay-to-publish model that primarily benefits publishers and commercial users of datasets such as AstraZeneca, which will no longer need to subscribe to journals. The cost of academic publishing has been thrown back on to universities which will in turn inevitably be forced to make economic and strategic decisions about which academic papers they should fund. Expensive and elaborate peer-review mechanisms will have to be established to manage the process. And the costs of all this will not be recouped from university library budgets: on the contrary, libraries will still have to pay for journals from the rest of the world unless other countries implement a gold mandate, and this looks unlikely. As a result, university budgets will be further squeezed and the publishing research base squeezed too. The likely outcome of a unilateral gold open-access policy will be a contraction of research in the UK.

The RCUK policy is also being perpetuated in Europe, with a July 2013 implementation date. And, in a parallel to what happened in the UK, academics from the disciplines of the humanities and the social sciences are pushing back. In a recent announcement, French academics had this to say:

We are afraid that the embargo proposed by the Commission – delaying free distribution of material for 12 months after paper or electronic publication – will prove to be completely insufficient for maintaining a large number of academic journals in the fields of the humanities and social sciences, where publication is only economically viable over a longer period. . . . We therefore fear that these measures . . . will quickly prove to be counterproductive; that they will lead to a deterioration in the quality of publications in the humanities and social sciences and to an impoverishment of intellectual debate, undermining the diversity that is so essential to the publishing landscape and limiting the independence of authors.

Another area of increasing tension for the OA movement is around its licensing. The Creative Commons Attribution license, otherwise known as the CC-BY license, is required by the RCUK mandate, and is viewed by many as sacrosanct to the OA movement. It is described on the Creative Commons site thusly:

This license lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of licenses offered. Recommended for maximum dissemination and use of licensed materials.

However, not everyone agrees, including one persistent advocate of OA, Heather Morrison:

Many open access advocates consider CC-BY to be consistent with the strongest form of open access, libre open access, as it includes the least restrictions. I argue that the lack of restrictions leaves open access vulnerable, for example vulnerable to re-enclosure for toll access dissemination downstream. For this reason, I consider CC-BY-NC-SA to be the closest choice of the CC license options for strong or libre open access, allowing a broad range of re-uses while imposing restrictions that protect the open access status of the work for the long term.

The perception is that the CC-BY license makes remixing and reusing content free and therefore easy. It is promoted as a great license to help text- and data-mining initiatives, which could possibly lead to some discovery at some point in time. However, it is inherently quite encumbered — not financially, but academically. The potential encumbrance is attribution. The Wellcome Trust notes in its document about CC-BY that:

As a matter of good practice, we encourage text-miners to cite the dataset (including the query they used) in all publications which make use of text-mined facts.

That is, attribution is required for any piece of content included in a dataset or analysis that relies on text-mining and makes use of facts garnered in this way.

Others have also noted this, including Alex Ball at the Digital Curation Centre, associated with JISC:

. . . compiling a dataset from many others is likely to be unfeasible due to the administrative burden of crediting each individual contributor to the superset in the manner of their choosing. This problem is sometimes known as ‘attribution stacking’.

So, if the entire 2012 contents of PLoS ONE were used for text- or data-mining — content that is entirely within a CC-BY license — conceivably all 23,464 articles (just those from 2012) would have to be in the reference list on any publication resulting from such text- or data-mining. Of course, a scientist using these data could avoid this by seeking a waiver from each copyright holder. But since copyright under CC-BY resides with the author, and isn’t centralized by a publisher, that would require more than 23,000 waivers, not just 3-5 from publishers who gather and administer the copyrights. Because most publishers wouldn’t view text-mining as a copyright violation, just as fair use, CC-BY may actually be more academically restrictive than copyright, simply because of the burdens it places on scholars and the inefficiencies it creates through decentralization.

The OA bargain is becoming a bit more fraught as more people get involved and as founding concepts are stress-tested by practical matters. If the OA movement remains dogmatic about the details, and is not willing to compromise or improve on ideas established years ago and never thoroughly validated on a large scale, it runs a serious risk of losing support in the wider world.

Things change when they scale up. Things change when they’re asked to do more than just provide an alternative.

What’s the future of OA? It seems it is not the future once envisioned in the past.

(Hat tip to RS for the insight about the CC-BY licenses.)

Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


31 Thoughts on "Whoops! Are Some Current Open Access Mandates Backfiring on the Intended Beneficiaries?"

So when you say “academics” are pushing back against open access mandates, you mean “editors of toll-access journals”.

Editors are usually high-powered academics. They are not the only ones. Department chairs, deans, researchers are all growing concerned.

Both the financial and usage demands of OA will be its demise. On the one hand Government, authors, institutions and others who have to pay the bill will realize that this publishing game is just too expensive and on the other users will find the system too cumbersome to utilize the access they have gained.

I see no demise on the horizon. RCUK has done something stupid but PubMed Central is doing well. We are just sorting out a revolution.

Perhaps I should not have used the word demise, but I do see it becoming just another model for publishing a paper. Additionally, as government regulation and funding affect OA, I can see its growth being curtailed.

I see things a little differently. I think PMC is actually courting problems, and the OSTP approach isn’t going to make life easier as it steered clear of repositories. Architecturally, PMC is problematic. Data exist showing it competes with publishers. Then we have the ethical lapses around eLife and F1000 Research (a non-journal gets into PubMed via PMC). Finally, there is the expense, which is millions each year. I see PMC as being on thinner ice with each passing year. It may get sorted, too.

I have a question about CC-BY licenses.

Does a CC-BY license mean that someone can take articles about, say, sickle cell anemia or Tay-Sachs, and publish them on a white supremacy site? I don’t see any way this can be prevented with CC-BY. At least with current copyright laws, a forced take down can occur.

Is anyone in academia pondering about the unforeseen, unintended consequences of this thinking?

It’s in the nature of revolutionary thinking to discourage any discussion of unforeseen (or even merely unintended) consequences. OA evangelism is rife with this unwillingness.

You make a good point here Kent, but perhaps not the one you intended. Attributing Open Access articles is actually often far easier to do than for their paywalled-access ‘cousins’. Good OA publishers provide Open Bibliographic data and APIs (e.g. the PLOS API), which make it fairly easy to extract fully accurate bibliographic data with which to attribute analysed articles. By contrast many paywalled journals do NOT provide corresponding Open Bibliographic data for the articles behind the paywall and hence these articles are harder to individually attribute. OA generally makes attribution easier. Paywall access makes things harder (as always).

So, you’re saying you would cite an article without having read it? That’s what I get from this observation.

Why does Ross personally have to read the paper? The whole point of text mining is that a computer does the reading for you. It is easy to get the computer to record well-formed metadata whilst it is mining the data. Not so much with many of the pay-walled publishers though, as Ross himself has commented on his blog.

As long as your methods and mining code are open, well documented, and subject to scrutiny, why should a human have to read everything? – that’s the computer’s job.

I think he has to read the articles for the sake of intellectual integrity. If you’ve read the articles, you can gather the bibliographic information as you go. I was inferring that he hadn’t read the articles from the context. Can you point me to a useful example of something you’re talking about — science emerging from a computer reading papers and a human not reading the same papers?

I’m reasonably sure Heather didn’t read all 11000 papers she mined for this research: That’s just one. The problem with your comments in this part of the comment thread is you seem to purposely not understand text mining and dismiss the whole area out of hand. I’m not very familiar with the science outputs but even I was able to go and see that Heather and Peter Murray-Rust for two are publishing research using these techniques and from those I was able to see whole journal special issues devoted to doing science via text mining.

Whether you feel text mining is a fruitful area of scientific endeavour is largely irrelevant to the point Ross raised.

And this is against the backdrop of your constant complaining about CC-BY. How can you possibly claim here that CC-BY is less open/more restrictive than CC-BY-NC-SA??? The latter might protect downstream products by ensuring those are “open” and shared under similar terms as the original works, but if those downstream products never materialise because of the non-commercial clause what does that buy us? The point of CC-BY is that the original data/source remains open to everyone for all time. One could feasibly repeat what someone did using these works even if they released that work under a non-open licence. CC-BY-NC-SA just restricts access for a large swath of the research community.

To be honest, I’d rather people used and built on public research than those advances never happen at all just because of a licencing decision.

Point by point response:

The Piwowar paper contains limitations stemming from her not reading the papers, including accommodations for things she can’t explain in her data (“I attempted to estimate if the paper itself reused publicly available gene expression microarray data by looking for its inclusion in the list that GEO keeps of reuse” and “The gender of the first and last authors were estimated using the Baby Name Guesser website” and “I quantified the content of journal data-sharing policies based on the “Instruction for Authors” for the most commonly occurring journals”). There are editing problems that cause statements to be conflated, as well as some big assumptions, both of which added to the confusion in the paper. There’s a lot to admire here, but it’s not flawless. It would be stronger if there were less guesswork and more actual intellectual engagement, and that might not have been much more work.

My question was a bit insouciant, but it contains a germ of truth — that is, how can you as an author stand behind a set of findings if you’re letting a computer program (or set of them) tell you things that really require much more intellectual effort than just writing search queries and concatenating outputs? How can you trust your algorithm?

I’m actually really interested in text-mining and semantics, but I think we have to be pretty humble about what they can do. They can do really clever things, and make really clever products. But can they achieve real findings we can trust? That’s a much more loaded question.

I never said CC-BY is less restrictive than CC-BY-NC-SA. But let’s discuss that. Is it? If the article is OA, and the use is academic, there is no real difference. It also means that the resulting work is “share alike,” which cuts out the possibility of commercial cul-de-sacs forming around tools built on OA content and text-mining of same. Actually, that does seem more academic. But it still has the troublesome “BY” aspect to it. But I don’t see it as “more restrictive” except that it restricts commercial reuse, which some people have argued is preferable.

There’s a widespread misconception that copyright prevents text-mining (most publishers view this as fair use), or that paywalls prevent text-mining (most publishers are interested in this, but also don’t want to be ripped off). Paywalls do require some requests to publishers, and this requires researchers to have their acts together. But why should text-mining researchers have it easy? All research is difficult to do well. Patient consents. Data protocols. Cultures. If the barrier you’re complaining about is having to write some emails or make some phone calls to get access to large platforms of text for legitimate research, cry me a river. It’s just not that hard compared to what other scientists do on a daily basis.

Wow, that is completely not what Ross said – his point is that attribution to paywall articles is made more difficult because paywall publishers often don’t provide open bibliographic data.

Attribution isn’t hard if you’ve read the articles. You just gather the information in a database as you go. What’s so hard about that? I was inferring that since he hadn’t gathered the bibliographic information through a natural workflow, he hadn’t read the articles. That’s what it sounded like to me.

But this has nothing to do with OA and nothing to do with the CC-BY license. I do agree that OA publishers have been among the leaders in implementing new technologies, better metadata, API’s, etc. But there’s no reason these technologies can’t be implemented by a subscription access, traditionally copyrighted journal.

The OA nature of some of the journals doing this may be purely correlative, rather than causative. Other factors come into play for a publisher like PLoS. First, they have a relatively small number of publications, and it’s much easier to implement new technologies for 10 journals than it is for 100 or 1000. Second, their ownership structure is much less complex than that of many publishers. To make major changes, they don’t need to get approval from research society journal owning partners. And perhaps most significantly, their relatively high profit margins offer them available capital to invest in expensive endeavors like this in ways that smaller academic publishers who don’t bring in the same level of funds can’t easily afford.

With all due respect, high-powered academics usually provide their academic affiliation as well as their ‘editor of toll-access journal’ one. But both this and the clearly misleading title are old journo tricks for drawing attention to an otherwise ill-informed, severely biased text, not so much for its useful reflections on Open Access evolution and licensing models as for its rather inconsistent statements and/or quotations. “Publishers have adapted”? Well pardon me, I thought there was something called the Academic Spring. “AstraZeneca will no longer need to subscribe to journals”? This nearly ridiculous piece of gross propaganda -the kind of jaw-dropping statements one has grown used to listening to at publishers’ conferences- discloses a deep misunderstanding of what Open Access is and aims for.

Department chairs, deans, researchers are all growing concerned (save for the several thousand signatories of the Academic Spring initiative, I presume) because they possibly never devoted a thought before to how the commercial research publishing cycle was getting funded. “The taxpayer is being asked to pay more for government-run repositories”? One would perhaps expect a high-powered academic to have a slightly clearer view of what a public service is. But potential conflicts of interests with (quite often) government-funded publishing industry may not be that helpful at all.

I wish some constructive dialogue were possible between publishers and librarians. It’s not that there are no good opportunities out there for having it; see this “Publishers and librarians: we share the same values – why are we fighting?” talk by T. Scott Plutchak, University of Alabama at Birmingham, at the forthcoming 36th UKSG Annual Conference in Bournemouth. Only the bit about ‘sharing the same values’ may be a bit less clear after reading this post.

The Academic Spring was an empty bit of rhetoric aimed at the wrong targets, as I wrote at the time:

For there to be a true “academic spring,” academics would need to demand that administrators return to spending the same percentage on library acquisitions they spent, well, let’s say 10 years ago; require they be paid to do research and not be forced to constantly scrounge for grants; hold universities accountable for telling students what their chances of good careers in the sciences actually are, rather than just padding their enrollment numbers; and so forth.

It actually worked against librarians. And this is something Scott Plutchak has urged librarians to do — that is, wake up! These trends need smart engagement from librarians, and having your budgets drained over the years while other parts of the university become the funders of scholarly communication has consequences for librarians and libraries.

Some statements you call “ill-conceived” are actually not ill-conceived. Nature, Elsevier, Wiley, Springer, SAGE, OUP, and many other major publishers now offer OA options or completely OA initiatives. They have adapted. The concern that corporations will accidentally benefit from OA on the back of taxpayers isn’t ill-conceived — it’s real. When research budgets are diverted or cut, and large pharma companies save money for the same reason — well, that might not go down too well with researchers. When taxes go up for the same reason, that might not go down too well with taxpayers.

I think if the OA proposition had been from the outset, “We will request tax increases to fund free access repositories to the scientific literature after a one-year embargo period,” we could have assessed interest in such a matter. Or, more straightforward, “We will request tax increases to make published scientific papers free upon publication, using these funds to pay publishers,” again, we could have assessed the taxpayers’ willingness. But it’s being sneaked in the side door. More people are catching on to this.

Why don’t you tell us what OA is and what it aims for? And how, in so doing and in so being, it avoids all unintended consequences, especially those that are actually emerging.

I think that the Dryad Digital Repository avoids the attribution problem in data mining by requiring dataset contributors to agree to a CC0 (Creative Commons Zero) license. I think this rarely-used license has even fewer restrictions than the CC BY license.

I’m not sure it is quite true to say that the RCUK policy is being perpetuated in Europe – RCUK is self-professedly pro Gold OA, whereas the EC is self-professedly agnostic about Green vs Gold, with at least some EU countries almost certain to go Green. Both have their advantages and disadvantages, but one thing is certain – as others (including you, Kent) have pointed out, deciding where to publish is going to get a lot more complex depending on where you are based, who is funding you, whether you (or your funder) can afford to pay, which journals are compliant, etc.

I agree. The picture is really complicated. And then there’s the fact that most papers are multi-national and multi-author. The work and distractions for authors aren’t trivial.

Simply put, the OA movement is driving the industry into a regulatory regime. Regulation is always a complex business. With every country, or even every funder, making its own rules, the potential for complexity is huge. The US Government alone may wind up with dozens of different OA systems.

That the Internet should lead to a government takeover of scholarly communication is truly ironic but new freedoms often lead to new controls.

In fairness, what you’re anticipating isn’t actually a government takeover of scholarly communication. It’s government regulation of scholarly publication based on government-funded research. There will still be many scholars (most outside the hard sciences) working outside of that funding environment who will be free to publish wherever they wish. (Unless their institutions say otherwise, but universities aren’t governments.) I have many concerns about the scenario you describe, but it’s not the same thing as a government takeover of scholarship.

Okay let’s call it government control not a takeover. The point is that it is becoming a regulatory regime with all that implies. My understanding is that most basic scientific research is federally funded and that is my focus, not the humanities. So it is scientific communication not scholarly communication. Note that 50% of the US basic research budget goes to NIH so that much control already exists. The OSTP memo is just picking up the last half. The government takes control of the documents either physically or by order.

I do not suggest that this control applies to scholarship per se merely to communication. Control of scholarship happens via funding, a different issue.

I do not suggest that this control applies to scholarship per se merely to communication.

Even there, you’re overstating it. The regime in question doesn’t control all scholarly communication, nor even all scientific scholarly communication. Only scientific publishing based on government-funded research. That constitutes a lot of publishing, of course–but it’s nowhere near all scholarly publishing, let alone all scholarly communication.
