Recent mandates from funding agencies, including the Wellcome Trust and the RCUK, require funded journal articles to be published using a CC-BY license. Last week, OASPA and PLoS issued articles explaining the need for such licensing terms. But both articles are based on a flawed premise, confusing the rights to reuse the data behind an article with the rights to reuse the article itself.
First, to be clear on the licensing terms being discussed:
CC-BY-NC: You are free to copy, distribute and transmit the work, and to adapt the work. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). You may not use this work for commercial purposes. Any of the above conditions can be waived if you get permission from the copyright holder.
CC-BY: You are free to copy, distribute and transmit the work, to adapt the work and to make commercial use of the work. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).
In this discussion, it is vital to understand that we are talking about the licensing terms for the article — for the set of written words and images describing the research in question, not the research itself. And this distinction is where the proponents of the CC-BY license seem to be confused.
Both articles hold up the Human Genome Project as a central example. This is somewhat confounding, as neither of the articles announcing the initial draft of the genome sequence (published in Science and Nature) was published using a CC-BY license. Why, then, do the authors claim that the article licensing terms matter? Neylon writes:
The human genome project generated US $141 for every dollar spent, but this immense return is widely distributed as a result of thousands of people’s work. It is rare for research teams to have both the academic and business expertise required for commercial exploitation, and they typically require a commercial partner. Open-access publishing effectively increases our chances of finding those potential partners. If we restrict their capacity to use our research by restricting commercial use, we limit the chances of partners, commercial or otherwise, finding and contacting us.
Redhead offers similar reasoning:
The human genome project is a compelling demonstration of the power of open access to research, and reflects a well-established practice within the genome community to make research data publicly available for all reuses via resources such as GenBank.
Both have made the same mistake: confusing the genome sequence, the data behind the studies, with the articles written about those studies. The return of $141 for every dollar spent did not emerge through reuse of the Science and Nature articles; it was generated through reuse of the studies’ data.
The licensing status of a study’s data and results is not governed by the copyright terms of the journal article written about the study. Both authors are essentially right that barriers to reuse of research results block progress; a great example is the patenting of the use of the BRCA genes in detecting breast cancer.
But that had nothing to do with the copyright status of articles published on BRCA. It had everything to do with the University of Utah locking up the information behind a patent paywall.
This seems to be a key misunderstanding in the demand for the CC-BY license, and perhaps something of a hypocritical approach by many research institutions. There’s a drive toward open access for the research articles written by authors on campus, but at the same time those universities are blocking others from reusing the research itself. It’s as if they’re saying it is vitally important that you can freely read about our breakthrough in curing cancer, but if you actually want to use that breakthrough to cure cancer, you have to pay us.
Harvard University, a leader in open access mandates for faculty, made more than $13.8 million in 2011 through patent paywalls. The University of California system made over $100 million. This reeks of NIMBY thinking — advocating for progress that requires others to sacrifice, but refusing to accept any sacrifice on one’s own part.
If the goal is to promote the free reuse of research data, then the targets for change need to be the research institutions and the researchers themselves, not the journal publishers. No journal publisher claims ownership of the facts contained in a copyrighted article. That’s not how copyright works. In fact, many journals that employ standard copyright terms require authors to deposit their data in public databases, making it freely available for reuse. The CC-BY license for articles is irrelevant for this goal.
Things get even more confusing when Neylon points out how important it is for a researcher to let someone else commercially exploit their work. Isn’t this the exact opposite of the reasoning behind the intense anger toward commercial publishers like Elsevier? Aren’t they commercially exploiting research in the manner described? Is commercial exploitation of research results good or bad, or just bad when a company you don’t like does something you don’t like with it? If the latter, then is a license that allows anyone to do anything they want with your results in your best interests?
So why the push for CC-BY licenses?
We are reaching a point where technology is turning the journal article itself into a research tool. New semantic technologies and text-mining efforts represent new avenues for discovery. For a text-miner, the journal article itself becomes a data point the way a gene sequence works for a bioinformatician. Reducing restrictions on reuse of articles in this manner is thus an important goal.
It is unclear, though, what relevance the copyright status of an article has to individual text-mining experiments. If you have free access to an article, why would copyright stop you from using it (and presumably citing it) as a data point in your study? Again, it is unclear why the CC-BY license would improve things here.
Publishing under a CC-BY license does not guarantee that the publisher will make the article available via bulk download, that useful text-mining APIs will be developed, or that the article itself will have freely available, useful, organized metadata.
The CC-BY license does open up the paper for further commercial exploitation. The Wellcome Trust presents some better examples of where the CC-BY license would actually be beneficial, allowing reuse of figures in a commercially-driven blog or letting someone charge for a translation of an article.
From the viewpoint of a funding agency, particularly a government funding agency, there’s a strong argument to be made that maximizing the economic gains from funding a research project is a worthy goal. Governments want to drive the creation of new businesses, which will result in more tax revenue and employment. The CC-BY license offers free raw material for new ventures based on reusing published papers. For example, start-up company X could download every paper published under a CC-BY license, repackage those papers around a suite of semantic tools, and sell them back to the research community.
That potential for economic development is increasingly important in economic downturns. The CC-BY license, though, is something of a blunt instrument, and the approach brings up some logical conundrums and some likely unexpected consequences.
First, the logic behind much of the movement here is the idea that with the article processing charge (APC), the article has been paid for. It seems counterintuitive then to set up a system where the “already paid-for” article will then be sold back (perhaps repeatedly) to the research community. In some ways it creates a new subscription economy. The article is free, but if you want to be able to get the most out of it, you need to be able to afford a subscription to the reseller offering the tools.
There’s also the likelihood that new start-ups won’t be the result here, but instead further entrenchment of the current publishing establishment. A well-funded company like Elsevier may be better situated to invest in developing new technologies (and buying up any promising start-ups). Under CC-BY, Elsevier could potentially scoop up every paper published by everyone and put them all under its SciVerse umbrella and become the de facto monopoly source of access for scholarly literature. Every other publisher then becomes merely a feeder of content to Elsevier.
History has shown that the Internet tends toward consolidation on one monopoly source for information (see Google or Facebook as examples). Monopolies, whether startups or entrenched powers, are generally bad for progress, bad for customers, and bad for content creators. If the relatively modest efforts of PubMed Central are drawing away 14% of traffic from other publishers, imagine what could be done with the business acumen, funding base, and marketing budget of an enormous multinational corporation.
If there are specific goals the community is seeking for reuse of research papers, then a license that is clearly geared toward their achievement may be more effective than the broad strokes of the CC-BY license. Can a standard license be developed that allows certain types of reuse (particularly reuse for education and further research) but that requires fees for specific types of commercial reuse?
For example, as noted previously, the CC-BY license essentially does away with reprint revenue. If we are truly trying to move journals away from the subscription business model, particularly to other business models that lower the financial burden on the research community, it seems counterproductive to deny financial support to journals in order to provide pharmaceutical companies with free marketing opportunities.
We need to be clear about the role of CC licenses — what they do and do not provide. They are useful tools in driving business opportunities and economic development. But changing the copyright terms on an article written about an experiment does not, in any way, change the terms for use and reuse of the data behind that article.
If open availability and reuse of research results is the goal, then funders need to re-examine their position on patents, not their position on journal article copyrights. For US government funding, this is going to take an act of Congress as the Bayh-Dole Act will need to be repealed. Publishers should not be unfairly painted as scapegoats when the real targets should be institutional technology transfer offices.
These sorts of changes should be approached cautiously. The Bayh-Dole Act is generally seen as having been very successful. The Economist notes that it is:
perhaps the most inspired piece of legislation to be enacted in America over the past half-century. . . . Together with amendments in 1984 and augmentation in 1986, this unlocked all the inventions and discoveries that had been made in laboratories throughout the United States with the help of taxpayers’ money. More than anything, this single policy measure helped to reverse America’s precipitous slide into industrial irrelevance.
There are clear economic incentives to driving the exploitation of research. But removing all restrictions on that reuse removes the direct benefits the research community receives in return for discovery. Patents and financial reward provide strong incentives for researcher achievement. Technology transfer provides an enormous boost for research funding (more than $1.8 billion in 2011).
The benefits of unrestricted exploitation must be weighed against the subsequent losses in university funding and financial reward to successful researchers. Regardless, the copyright status of journal articles has no bearing on this issue. There are potential benefits to changing the copyright terms for journal articles, but let’s be clear about what they are, and let’s find licenses that are best suited to achieving those goals.
(Editor’s Note: Due to power outages from Hurricane Sandy, David Crotty can’t reply to comments. The other Chefs are going to do their best in the meantime.)