Mike Rossner has long been a leading innovator in scholarly publishing. Over the last 16 years at Rockefeller University Press (the last seven as RUP’s Executive Director), Mike has continually driven the adoption of new technologies and experimented with licensing and open access models. With Mike’s recent departure from RUP, I thought it an ideal time to look back on some of the highlights of his accomplishments in these areas.
Q: You were the first publisher, back in 2002, to institute the use of an image screening program to guard against digital manipulation. How did that come about and what level of screening does a journal really need?
A: I launched the image screening program in September 2002, when I was the Managing Editor of The Journal of Cell Biology (JCB). The journal had just moved to a completely electronic workflow, where authors submitted all figures in electronic format rather than sending in a set of glossy photographic prints. Having figure files from all authors allowed us to begin comprehensively screening all images in all accepted manuscripts for evidence of manipulation using simple contrast and brightness adjustments in Photoshop.
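The basic idea behind those Photoshop adjustments can be sketched in a few lines of code. This is only an illustration of the principle, not JCB’s actual tooling: pushing contrast and brightness far beyond normal viewing levels makes spliced-in regions or erased bands, which look uniform at normal settings, diverge visibly from their surroundings. The example below operates on a plain list of grayscale pixel values (0–255).

```python
def exaggerate(pixels, contrast=4.0, brightness=40):
    """Apply an extreme contrast/brightness adjustment to grayscale
    pixel values (0-255), clamping to the valid range. Regions pasted
    in from a different exposure tend to stand out after this transform."""
    def adjust(v):
        # Stretch values away from mid-gray (128), then lift brightness.
        stretched = (v - 128) * contrast + 128 + brightness
        return max(0, min(255, int(round(stretched))))
    return [adjust(v) for v in pixels]

# Two background values that look identical on screen (126 vs. 130)
# become clearly distinguishable once the contrast is pushed:
print(exaggerate([126, 130]))  # [160, 176]
```

In practice an editor would apply this kind of adjustment interactively to a whole figure and look for sharp seams or uniform patches, rather than comparing individual pixel values.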
RUP also instituted universal screening for its other two journals, The Journal of Experimental Medicine and The Journal of General Physiology, and I continue to believe that screening all images before publication is important. It is something journals can do to protect the published record, and it should be done. Journal editors who screen only a subset of images often argue that even partial screening will deter other authors, but in JCB’s experience even universal screening does not seem to act as a deterrent.
Q: How effective a tool is this?
A: It is difficult to quantify the effectiveness of the Press’ screening process, because that would require knowing how many problems were missed in order to determine the percentage detected. What I can provide are nearly 11 years of data on what was detected at JCB: at least 25% of authors had to remake at least one figure in their manuscript because manipulation was detected that violated the journal’s guidelines but did not affect the interpretation of the data. JCB had to revoke the acceptance of 1% of accepted manuscripts because manipulation was detected that affected the interpretation of at least one piece of data in the manuscript. Those numbers have remained remarkably consistent since the screening program was instituted.
Q: We’ve been hearing about a rise in the number of retractions of journal articles in recent years. Have you seen a similar rise in problematic images?
A: The data presented above indicate that the number of image manipulation cases at the JCB has not risen over the past decade. I can only speculate about the rise in the number of article retractions across the scientific publishing industry, but it may be due to increased awareness of the issue of data integrity and thus greater scrutiny.
Q: There’s a great deal of momentum toward making research data publicly available, whether through formal publication or deposition in an archive. What role do you see the scholarly journal playing in making data available?
A: I have stated previously that I think the future of scientific publishing lies in providing the reader with access to all of the data underlying a publication and giving the reader the ability to interact with those data. Journals do not necessarily have to host the data (as, for example, they do not host genomic data), but they must provide access (either through links to identifiers or preferably APIs) to the data stored in reliable, publicly-accessible repositories.
Q: You broke new ground in this area with the JCB DataViewer.
A: At the time we developed the JCB DataViewer, there was (and still is) no repository for original, multidimensional, microscopy image data. The idea to create such a repository for JCB image data emerged from a discussion I had in 2007 with Jason Swedlow, the head of the Open Microscopy Environment (OME) and CEO of OME’s commercialization arm, Glencoe Software. The OME Consortium had developed an open source database engine (OMERO) for storing multidimensional image data and their associated metadata, and they had created an open source library (Bio-Formats) for translating proprietary microscopy file types into a standardized format (OME-TIFF). At the time, OMERO worked through LAN-based connections to the microscope and to a desktop viewer. To make the system work for JCB, we had to develop a Java application for file upload to a central database and a browser-based viewer to present those data to readers over the internet.
The viewer itself provides limited tools for data analysis and mining. With the authors’ permission, users can also download the original image data for more complex analyses using software of their choice. Although the main purpose of developing the JCB DataViewer was to provide a facility for sharing multidimensional microscopy image data, access to the source data also provides readers with reassurance about the integrity of those data.
Q: How have authors and readers used the tool?
A: Authors have used the JCB DataViewer to share anything from just a few individual images to all of the images that went into a quantification or even all of the images underlying an entire paper (more than two thousand in one instance). Authors have also shared 3D tomogram data, very large images (including what is currently the largest image ever presented on the internet – according to Wikipedia – at 280 Gigapixels), and high-content screen data, with more than 97,000 images in just one of those datasets. A total of about 2 TB of data underlying over 300 papers published in JCB are currently accessible.
The JCB DataViewer is accessed by ~15,000 unique users in an average month. User surveys indicate that the data are used for evaluating the conclusions of a specific article, generating new hypotheses for research related to a specific article, and obtaining images for educational purposes. An average of ~10 GB of data are accessed on the site each month.
When Jason and I decided to develop the JCB DataViewer, we always considered it to be a prototype for a universal data bank for published microscopy image data, rather than as a resource for just a single journal. We believe we have successfully demonstrated the feasibility and value of such a database, and we have been working with Mike Keller, the Stanford University Librarian, to obtain funding to develop a universal resource for all publishers as part of the Stanford Digital Repository.
Q: RUP has long been a leader in expanding access to the research literature. But you chose not to follow the author-pays immediate Gold Open Access (OA) route for publication, instead making all articles freely available after 6 months. What’s the reasoning behind this decision?
A: RUP started to release the content of the journals to the public six months after publication in January 2001, in response to calls from the community for greater access to research publications (calls that also led to the formation of PLOS). RUP did not offer an immediate access option because we did not want to provide spotty immediate access to just a subset of articles; we wanted to provide uniform public access to all articles in a way that was consistent with a subscription-based business model.
I believe that hybrid access can unfairly distribute the costs of publication. The RUP journals are “selective” journals, and the cost of publication per article is higher than even the highest Gold OA publication fees charged by other publishers. If a subscription-based journal makes an article available for free immediately for less than the actual cost of publication, then the Gold OA fee is subsidized by subscribers, who may not want to subsidize immediate access to a particular subset of articles.
In my opinion, the fact that the policy of delayed public access remains compliant with even the strictest of funder public access policies, more than a dozen years after it was adopted, indicates that the research community and the funding agencies agree that it is reasonable.
Q: RUP under your stewardship took a distinctive stand on licensing and copyright. You were an early adopter of a Creative Commons license, with some modifications. Can you explain how the system works?
A: Since 2008, the content of all RUP journals has been effectively published under a CC-BY-NC license. As a subscription-based publisher, the only extra condition RUP had to place on that license is a “no mirror sites” restriction within the first six months after publication. That was to prevent someone with a subscription from downloading all of the content and making it freely available on another website when it was still under access control on the journal websites. When access control is lifted at six months, that added restriction is not necessary, and the content is fully CC-BY-NC.
Q: Have you seen any interesting reuses of content? Did you see a subsequent loss in licensing revenue?
A: I am not aware of any different types of reuse of the content under the CC-BY-NC license compared to before such licensing was in place – it continued to be primarily the repurposing of figures. The main difference was that people did not have to request permission from RUP for non-commercial reuse, removing a barrier to distribution of content for scholarly pursuits. RUP did not see a loss of revenue, because permission for non-commercial reuse was always provided free of charge.
Q: Along similar lines, despite calls for the CC-BY license that allows unfettered commercial reuse of articles, you’ve resolutely stayed with the non-commercial CC-BY-NC license. Since much of the demand for CC-BY seems to be driven by a desire to implement text and data-mining experiments, you put in place a separate set of licensing terms specifically allowing this type of reuse.
A: My recent clarification of RUP’s licensing terms was in response to policy developments in the UK at RCUK and The Wellcome Trust, which went into effect on April 1st of this year and which require CC-BY licensing if these agencies pay a fee for immediate access. One of the reasons given by Wellcome for requiring CC-BY was to open up the content for text and data mining by commercial entities.
That type of activity is probably allowed in the United States under the “fair use” limitation to copyright, but it is still unclear what is permitted internationally. To clarify RUP’s position on this issue, I explicitly stated that the content is open to anyone for text and data mining, so no one would feel inhibited by the existing CC-BY-NC license from mining the text and data to potentially advance science.
The advantage of this license is that it could be applied to all of the content published by RUP while allowing the Press to continue functioning under a subscription-based model by preventing third parties from reselling the content. The disadvantage is that the clarification related to text and data mining is not machine readable. The scholarly communication community needs to develop a tagging standard to indicate whether a particular file is available for text and data mining.
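To make the point about machine readability concrete, here is a sketch of what such a tagging standard might look like. The field names below are entirely hypothetical, invented for illustration; no such standard had been adopted. The idea is simply that a mining crawler could check a structured permissions record attached to each file instead of having to interpret prose license clarifications.

```python
import json

# Hypothetical machine-readable permissions record for one article file.
# The schema and field names are illustrative only, not any real standard.
record = json.loads("""
{
  "license": "CC-BY-NC",
  "mirror_restriction_until": "2013-12-01",
  "text_and_data_mining": {
    "permitted": true,
    "commercial_use_of_output": false
  }
}
""")

def mining_allowed(rec):
    """Return True only if the record explicitly permits text/data mining."""
    return bool(rec.get("text_and_data_mining", {}).get("permitted"))

print(mining_allowed(record))  # True
```

A crawler encountering a file without such a record would simply treat mining permission as undetermined, which is exactly the ambiguity a shared standard would remove.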
Q: You are something of a rarity, a resolutely subscription-based publisher who has been able to find common ground and work directly with OA advocates. How did your involvement in the OA movement come about?
A: JCB started to release its content to the public six months after publication back in 2001, but my own advocacy work for public access did not really begin until 2007. It was prompted by Emma Hill Ganley, who had just arrived at RUP from PLOS to take the position of Executive Editor of JCB, after I had been promoted to Director of the Press. Emma was appalled by the announcement of PRISM, a lobbying effort by the Association of American Publishers opposing public access initiatives, and she pushed me to use the Press’ position as a subscription-based publisher that favored such initiatives to take a public stand.
Peter Suber was instrumental in publicizing those efforts, which were followed by numerous letters to Congress in support of the NIH mandate and efforts to expand it. Those letters stated my opinion that access to the results of publicly funded research is a public good, and that publishers have an obligation to give something back to the public, which funds the generation of much of the content submitted to them. I also emphasized our experience at RUP showing that delayed public access, as exemplified by the NIH mandate, is compatible with the subscription-based publishing model.
That message was heard by Heather Joseph at SPARC. Heather recognized the importance of my middle-ground message, and I was awarded the SPARC Innovator Award in July 2009.
In Heather’s continuing efforts to drive the expansion of public access mandates to other federal agencies, she was able to arrange a meeting with John Holdren and Mike Stebbins of the Office of Science and Technology Policy in May of 2012. She thought that it was important for them to hear my message, so I was invited to participate in that meeting.
Although it was clear from that meeting that the OSTP already supported the development of public access mandates for other agencies, it was also clear that the public profile of the issue needed to be raised to get the administration to act. That led to the creation of the White House petition by Heather, John Wilbanks, Mike Carroll, and me. The petition garnered the required 25,000 signatures in less than two weeks, which meant that the White House was obligated to respond. They did so in February of this year, with the OSTP memorandum to all large federal funding agencies requiring them to draft policies for public access to the publications resulting from – and the data produced by – publicly funded research.
Q: That memorandum calls for stakeholders to work together, forming public-private partnerships as the most effective means of driving progress. You’re someone who has already crossed that border, and found ways to work with groups who are often (at best) skeptical of publishers. What has your experience working with advocates been like? Any advice for publishers looking to bridge that gap?
A: It has been extremely rewarding to work collaboratively with a group of dedicated and tireless OA leaders to include the opinions and experiences of a subscription-based publisher in discussions of public access policy. We hold the common belief that public access to the results of publicly funded research is genuinely a good thing, and it has been remarkable for me to be able to tap into their vast networks in pursuit of that goal.
The collaboration came about because my own advocacy was noticed by the OA leaders. The perspective I offered was unique for a couple of reasons. First, although there were numerous other non-profit publishers of biomedical research that released their content within the first 12 months after publication (and many of them had done so for a long time), they mostly chose not to advocate for public access mandates, even those with which they were already compliant.
Second, legislators and policy makers hear plenty from the extremes, but those who advocate for compromises that nevertheless might bring meaningful change provide a particularly valuable viewpoint. I think it is important to realize that decision makers in government greatly appreciate hearing the opinions of interested citizens (especially if those opinions are backed up by data), and their decisions can be influenced by those opinions. In other words, public advocacy with a reasonable message can work!
At the other end of the spectrum, publishers who oppose public access mandates have been very vocal advocates for their position. As a result of those efforts, there’s a trust barrier to overcome before those in favor of mandates will partner with them to drive progress toward public access. That said, I support any effort to expand public access to the scholarly literature and think it is beneficial to have several efforts in development in parallel, as each one may have particular advantages.