Elsevier has taken flak recently for hiring a company to find unauthorized online copies of articles for which it holds copyright. The company sent take-down notices to universities, individual website administrators, file-sharing networks, and likely cyberlockers. Elsevier is not alone in searching for content online. Publishers large and small have also been looking for unauthorized copies of books and journal articles online.
The resulting outrage was less than subtle. Researchers, who had signed copyright transfer forms and acknowledged the Elsevier policy about sharing published PDFs on the web, were distraught that Elsevier had decided to enforce these same agreements. After explosions on the Twittersphere and countless blog posts across academia, mainstream media started covering the story and mostly got it all wrong. Michael Clarke did a nice job of summarizing the issue, and the comments there mirror what was happening elsewhere.
I kept reading about this “unwritten understanding” between publishers and researchers that it was okay to post published PDFs. Another excuse for posting PDFs was that, although researchers confirm that they signed copyright transfer forms, they still feel that it is “morally right” to share their published PDF.
A clear distinction needs to be made here between the final accepted manuscript and the published PDF. Many, if not most, publishers allow authors to post the final accepted manuscript version of their paper on the internet. Many, if not most, publishers impose an embargo period for doing so. Some publishers simply ask that authors wait until the paper is published in the journal before a manuscript version is posted.
Despite arguments to the contrary, peer review does cost the publisher money. Publishers also spend a lot of time and money to format, copyedit, tag, and typeset papers in order to present the work in the best light possible. In return for facilitating peer review, many publishers do impose an embargo period, even on the manuscript version, to allow time for recouping these expenses. I wonder why this is such an unfair compromise.
With that in mind, I am still waiting for the rational argument about why researchers HAVE TO share the published version. What is the moral imperative? Do users dislike the manuscript version so much that only the published PDF will do? Is it a matter of the author preferring a cleaned-up version? If either or both of those statements hold true, then what does this say about a future where OA mandates only require accepted manuscripts to be shared?
The publishing landscape is rapidly shifting and the traditional business model is being chipped away, one little dig at a time. Universities are exerting copyright authority over manuscripts that are to be submitted to journals. Theses and dissertations are falling under the same category, which has led to questions about how to turn those into journal papers and books.
File-sharing and social networking sites that encourage users to share their papers, such as Mendeley (now owned by Elsevier), ResearchGate, and Academia.edu, are also chipping away at the traditional publishing model. These sites offer new services that publishers should take note of and try to work with or replicate.
While each sharing site has disclaimers and ways to report copyright infringement, they often very actively promote article file sharing. Despite a policy that users only share what they are allowed to share, I have not seen evidence that they are regularly policing their sites for infringing content.
Then there are the cyberlockers: huge illegal sites, mostly outside the US, that scoop up content and resell it. Publishers, particularly of ebooks, have been battling 4shared, bookos (also bookza, bookos-z1), libgen, and docin. Google cooperates by removing them from search results, and the US federal government is involved as well.
When a publisher hires an anti-piracy company to look for content, the company scans the internet for direct hits that match the title, the authors, and perhaps even the abstract and full text. It is even possible to train the system to find only published final PDFs as opposed to authorized accepted manuscripts.
Thanks to Google Scholar, it is not hard to find published PDFs online. Publishers must then decide what to do about it. For society publishers, the answer is likely to be to do nothing. Maybe this is the “unwritten understanding” to which some are referring. Ignoring this behavior does not imply permission; it simply reflects the fact that society publishers lack the resources to do anything about it. Keeping constant vigilance would require a full-time staff position, which is beyond the means of smaller, not-for-profit publishers.
Technical solutions are improving. As hackers figure out how to scoop up content and resell it via cyberlockers, the antidote is to look for those copies and assert copyright to have them removed.
After ASCE launched ebooks online, we contracted with a company to find unauthorized versions of ebooks and send take-down notices. There were a surprising number of infringements and little resistance to having the ebook files removed from sites. It made perfect sense, then, for us to explore journal content. What was found was alarming, to say the least. Vast amounts of content are living in cyberlockers. There is a smattering of papers on university sites, mostly author profile pages. And the most recent scan returned a lot of published PDFs on networking and file-sharing sites.
I know that for many in the Open Access advocacy world, the answer would be to set it all free. That is the easy answer and the one that would put many society publishers out of business. Publishers that spend millions of dollars a year facilitating peer review and producing journal content should have some right to protect the published PDF, particularly when those publishers allow the manuscript version to live openly on the web. Why isn’t that enough?