It is no small undertaking to divert the flow of a major river in a new direction. Here: The Imperial Dam diverting the Colorado River in the southwestern United States. Image via Wikipedia.

Last week, news broke about a new service called DOAI that is designed to support open access. It is not a publishing model or a repository but rather a type of infrastructure. When a user inputs a DOI, DOAI connects the user to a freely available copy of the publication. This is the latest in a series of infrastructure and service developments in support of open access.

Over the years, the Kitchen has provided extensive coverage to the discussion about open access issues. Much of the discussion has turned on publishing models. Should traditional publishers offer gold open access, and what is the likelihood that their models will flip altogether to gold? What are the rights associated with green open access, and when can that model work effectively? What risks are there from the so-called predatory publishers, and how is this affecting the marketplace overall?

While open access is surely a matter of publishing, it is no less a matter of workflows. Researchers need to encounter sources in order for them to be used and to have impact. For those who seek to pull the weight of scholarly publishing in a more open direction, discovery is no less important than availability.

Gold open access models fit perfectly into standard workflows, which is a notable if under-recognized benefit they offer. They do so by using the same “official” discovery and access channels that are maintained and recommended by scholarly publishers and libraries, such as CrossRef, link resolvers, and discovery services.

Other types of non-gold access tend to be less amenable to these “official” discovery and access channels. These include materials licensed for re-use according to open principles as well as other materials that have been made freely available online, through institutional repositories, scholars’ websites, and services like Academia and ResearchGate. They have benefited tremendously from indexing in Google, which is one reason that SEO for institutional repositories has emerged as a focus area. But the “official” channels have proved to be a barrier to the adoption of some of these non-gold open and free models.  

In this context, DOAI’s creators deserve a great deal of credit for its sophistication. The service uses a large-scale search database focusing on open access materials (more on its specifics below) to identify open and free versions of articles that one is trying to access using a standard DOI. It sends researchers to open and free versions in repositories, in ResearchGate, and in other non-publisher sources. Based on some small-scale sampling, it does not appear to privilege “official” open access channels such as the version of record on a publisher’s platform. DOAI is an interesting illustration of the importance of owning or influencing discovery and “appropriate copy” infrastructure.

The database that powers DOAI is BASE, a discovery service run by Bielefeld University that indexes millions of records from repositories and other open sources around the world, and its breadth is impressive. To take a small personal example: The vast majority of my writing is published through open access channels operated by Ithaka S+R. While we are in the process of obtaining DOIs for our publications, I was pleasantly surprised to learn that several of my recent publications are indexed in BASE. For example, a paper I wrote on print preservation was indexed by the University of North Texas (including the addition of descriptive metadata), which is one of BASE’s sources. The system is distributed and, as best I can tell, self-organizing. Judging by the breadth of sources included in BASE, it seems to achieve impressive coverage.

The Directory of Open Access Journals (DOAJ) is another discovery service focused on open access materials. Because of this focus, BASE and DOAJ make more sense embedded inside of broader discovery channels. With a somewhat similar purpose, DPLA has focused as much on building its platform and API as on serving as a presence or portal of its own. Other services can incorporate the information such a database holds about open items into discovery services and tools that may have broader purposes or fit into a traditional workflow.

And these workflow services need not be other search engines. One interesting model can be found in browser extensions. For example, Lazy Scholar released a browser extension that looks for free text online based on a highlighted reference or when a user navigates to a journal article that is inaccessible.  (It even provides a pop-up alert if you find yourself visiting a journal that is included on Beall’s List.)

DOAI has not yet found the right way to insert itself into a user’s workflow, since it currently requires that a user copy a DOI and navigate to its alternative resolver. Building on the browser extension model, however, there is an obvious prospect for DOAI. When a user navigates to a website or loads a PDF in a browser, a DOAI extension would send any DOI links found directly to the DOAI resolver.  It would thereby use this link to take the researcher seamlessly to the open or free version. Similar options could presumably be introduced into PDF management services like Mendeley. This would be an interesting example of infiltrating the standard and emerging research workflow, of co-opting it, to support not only open but apparently non-gold options as well.
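The extension idea above comes down to a small link-rewriting step. What follows is only a minimal sketch in Python, not DOAI's actual implementation: the resolver address `https://resolver.example/` is a placeholder (this piece does not give DOAI's address), the function name `rewrite_doi_links` is hypothetical, and the sketch assumes the alternative resolver accepts a DOI appended to its base URL in the same way the standard DOI resolver does.

```python
import re

# Placeholder address: an assumption for illustration, not DOAI's real resolver.
RESOLVER_BASE = "https://resolver.example/"

# Matches links to the standard DOI resolver and captures the DOI itself.
# DOIs start with "10.", a registrant code, a slash, then a suffix.
DOI_LINK = re.compile(r"https?://(?:dx\.)?doi\.org/(10\.\d{4,9}/[^\s\"'<>]+)")


def rewrite_doi_links(page_text: str, resolver_base: str = RESOLVER_BASE) -> str:
    """Point every doi.org link found in the text at the alternative resolver."""
    return DOI_LINK.sub(lambda m: resolver_base + m.group(1), page_text)


page = '<a href="https://doi.org/10.1234/abc.5678">Article</a>'
print(rewrite_doi_links(page))
```

A real browser extension would do this in the page itself rather than on a string, and would need to handle DOIs that appear as plain text or with trailing punctuation, which this pattern does not attempt.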

Other interesting infiltration opportunities can be found in the discovery landscape.  Libraries have tried to take a more prominent role in search-based discovery by licensing the discovery services that are provided by EBSCO, ProQuest (Primo and Summon), and OCLC. These services allow the library to “tune” the preferred sources when a work is available on more than one platform. I have been waiting for these services to offer a generic “prefer open” option, which I have to imagine many libraries would select to drive usage away from publisher versions and towards repositories and other open versions. It would be interesting to see how commercial publishers would react to find their own value-added metadata being used to drive discovery of the open versions of their articles.
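To make the “prefer open” idea concrete, here is a rough sketch in Python of how such a tuning rule might order the available copies of a single work. The `Copy` record and the `prefer_open` function are hypothetical names invented for illustration; no discovery vendor's actual configuration or API is being described.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    platform: str   # where this copy of the work lives
    is_open: bool   # whether it is openly or freely available


def prefer_open(copies):
    """Return the copies with open ones first, otherwise keeping library order."""
    # sorted() is stable, so copies with the same openness keep their
    # original relative order (for example, the library's existing tuning).
    return sorted(copies, key=lambda c: not c.is_open)


copies = [
    Copy("publisher platform", False),
    Copy("institutional repository", True),
    Copy("aggregator", False),
]
for c in prefer_open(copies):
    print(c.platform)
```

Because the sort only moves open copies ahead of the rest, a library's existing platform preferences would still decide the order among the non-open copies.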

Standard workflows protect the traditional roles of publishers and libraries alike. It is worth reflecting on the fact that there is apparently enough content openly or freely available online that it makes sense to consider how these standard workflows can be co-opted to prioritize open copies of content.

Roger C. Schonfeld

Roger C. Schonfeld is the vice president of organizational strategy for ITHAKA and of Ithaka S+R’s libraries, scholarly communication, and museums program. Roger leads a team of subject matter and methodological experts and analysts who conduct research and provide advisory services to drive evidence-based innovation and leadership among libraries, publishers, and museums to foster research, learning, and preservation. He serves as a Board Member for the Center for Research Libraries. Previously, Roger was a research associate at The Andrew W. Mellon Foundation.


11 Thoughts on "Co-opting “Official” Channels through Infrastructures for Openness"

Thank you for this helpful analysis, Roger. DOAI is clearly a particularly powerful tool for periodical literature. Thinking about your closing recommendation to co-opt standard workflows to expose open content, an area that clearly needs more work is discovery of OA *book* content. Since relatively few books have DOIs as yet and ISBNs are so much less web-friendly, this is a more manual burden that probably needs to fall mostly on EBSCO, ProQuest, OCLC, the aggregators and the library jobbers. While it is clear that there is some constructive thinking going on, these organizations can only fully engage if they see clear demand (and better still, a willingness to pay for the cataloguing convenience) from their customers, the libraries. The recent announcement of the unlatching of Knowledge Unlatched’s second round of academic monographs, and the persisting challenge of discovering the OA versions of first round KU and new Luminos titles via library catalogs, illustrate the need for more attention in the area of Open Access book discovery.

Green OA and other versions of articles may be useful for many purposes, but it is dangerous to promote their use for the purpose of publishing scholarly articles citing them because, not being versions of record, they may contain differences in content, and even different page numbers, from the versions of record and are therefore not suitable for accurate scholarly citation.

I agree. Green OA might be ok for reading, but I want to see the copy of record if I’m citing the work. Speaking of the copy of record, why doesn’t DOAI list it first if it’s Gold OA?

I have been testing out DOAI and I noticed that even in the case that an article is published as open access, DOAI redirects to the ResearchGate version of the paper instead of the publisher’s version. I find this somewhat disconcerting.

Agreed. I would love to see this service expanded to help get readers to freely available versions of papers in the journals themselves rather than just looking at third party repositories, some of which are illicit at best.

I think this is much less of a concern these days, isn’t it? There used to be a lot of confusion among researchers over definitions of preprint and postprint, yellow, green and blue versions but I think the definition of the green OA version has come to mean the final accepted manuscript which contains almost the same text and figures as the final version of record, except for copy-editing corrections and formatting.

Getting the correct citation information is pretty trivial these days, it’s automated in many reference management tools, and in any case, you can get it from bibliographic databases without having the version of record.

I think what bothers me is that if authors go straight to a green version of the paper, publishers don’t get to see traffic/downloads/interest in the paper. And libraries don’t see this usage either (which shouldn’t matter for an OA journal, but this could potentially affect subscriptions to hybrid journals). Basically, why is a publisher going to pay to host a paper if it’s just going to be read on ResearchGate? (This could open up a whole different debate).

That view accords very little weight to copyediting. In an experiment we did for an article for Against the Grain, a group of copyeditors compared Green OA versions with the final versions of record and found a disturbing number of mistakes that had been corrected, including misquotations of sources. Another study found that the reason so many quotations are misquoted is that scholars do not go back to the original source to check the accuracy of quotes but simply accept the version that others have used, which is sometimes inaccurate, thus perpetuating mistakes from one article to the next. Granted, though, this may be more of a problem in HSS publishing than in STEM publishing.

I think concern over “version of record” is a bit overblown—the reason people are often reading Green versions is because they CANNOT access (or find) the paid-access version. I think that DOAI is something fantastic! I would love to see more objects list associated DOIs/DOAIs. This would make our systems much more useful.
