Today’s guest post is by Niels Stern, managing director, OAPEN Foundation and co-director, DOAB, and Ronald Snijder, CTO, OAPEN Foundation. OAPEN was founded as a not-for-profit foundation in 2010 to promote and support the transition to open access for academic books by providing open infrastructure services to stakeholders in scholarly communication. In 2013 DOAB was launched to help publishers increase the discoverability of their open access books and to enable libraries to freely integrate a large collection of peer-reviewed books into their catalogues.
This year, the Directory of Open Access Books (DOAB) is celebrating its 10-year anniversary, a great opportunity to reflect on how far we have come with open infrastructures for the distribution and discoverability of open access books (monographs, edited collections, and other long-form publications). Sometimes we hear that libraries are still waiting for infrastructures to emerge for open books; sometimes we hear from publishers who don’t think open infrastructures interoperate with vendors; and sometimes we hear other voices who want more features than the current infrastructures are providing. Having spent some 15 years in open access (OA) book publishing, we think that the challenge is probably not the lack of well-functioning infrastructures (as we will describe below). A bigger issue, in our view, is the lack of global representation in DOAB, which is currently the most global infrastructure. DOAB is designed to serve the whole world, but in fact we are only serving a part of it, predominantly Europe and North America and increasingly also South America. That’s a challenge of global equity that we must face and tackle – while, of course, we also keep improving the infrastructure for all those already using it.
However, let’s begin with some good news first – it is our anniversary after all!
DOAB is growing fast, with almost 70,000 books indexed from over 600 publishers. And it keeps growing. In fact, the five-year compound annual growth rate (CAGR) is 12% higher than for OA journals according to this DeltaThink analysis (2023). When DOAB launched in 2013, as a spin-off project from the OAPEN Foundation, only a few publishers were engaged in OA book publishing. Six European university presses had developed the OAPEN Library in 2010 (read the story of OAPEN in this article) to provide hosting, distribution, and preservation services for OA books. The idea of DOAB then was to offer academic publishers that already had a hosting and preservation arrangement in place a free discoverability service for their peer-reviewed OA books. This turned out to be a good idea. In 2019 DOAB went from being a project to becoming an independent legal entity operated by OAPEN, and jointly managed and governed by OpenEdition and OAPEN.
The growth and relative success of DOAB probably derives from its alignment with three core principles for good infrastructures: (1) it connects users seamlessly, (2) it is interoperable and designed on open standards, and (3) it is community-driven. Both DOAB and OAPEN are based on community standards and fully open source, and both are not-for-profit organizations abiding by the Principles of Open Scholarly Infrastructure (see POSI self-audit here). This is particularly important as it ensures transparent interconnectivity, interoperability, reusability, and reliability. These features have made DOAB useful to third-party services like Open Syllabus, OpenBPC, Thoth, and the Library of Congress, to name a few. These integrations rely on DOAB maintaining its infrastructure at high standards both in terms of the curation process (DOAB only includes peer-reviewed, openly licensed academic books) and in terms of how it manages the transportation and translation of metadata between publishers and libraries.
To demonstrate this, let’s take the elevator down to the machine room of DOAB and OAPEN. The primary users are publishers (authors) and libraries (readers). Providing seamless connections for these users requires an understanding of the circumstances of publishers and of libraries, and what their systems can or cannot provide. Metadata is crucial to accomplishing this, and it doesn’t matter whether the organization is large or small, whether it’s non-profit or commercial – all have equal opportunities.
From publishers/authors to DOAB and OAPEN
For the ingestion of new books and chapters into DOAB or the OAPEN Library (hosting over 28,000 full-text books and chapters), the data model must work for a diverse range of publishers, so it should be straightforward. However, the description of the books should contain enough information to be useful for libraries and funders. Licensing information is – obviously – crucial in an OA collection. Another important aspect is multilingualism: currently the DOAB and OAPEN collections contain titles in more than 60 languages. And let’s not forget identifiers: ISBN, DOI, ORCID, funding information… In total there are 32 data fields, but only 10 of these are mandatory. (More information is available on uploading books and chapters, as well as on data fields.)
Knowing what metadata is needed is one thing; transporting the metadata – and the content files – into the OAPEN Library is another. Several publishers work with intermediaries such as CoreSource or BiblioVault. These organizations deliver a full ONIX XML file, which OAPEN then ‘decodes’ to extract the metadata needed to add the title to our collection. Another variation is the SciELO Books network, which aggregates titles from several publishers and combines them into a single ONIX document. Other publishers send us the ONIX metadata directly, using their internal systems to create it. Recently, the Thoth platform has been developed to help smaller OA book publishers manage their metadata. One of its export formats is an OAPEN- and DOAB-optimized ONIX.
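As a rough illustration of what ‘decoding’ an ONIX file can involve, here is a minimal Python sketch. It is not OAPEN’s actual ingestion code: the sample record, the function name `extract_records`, and the choice of fields are assumptions made for the example. The element names and identifier codes (15 for ISBN-13, 06 for DOI) follow ONIX 3.0 reference tags, but real feeds usually declare an XML namespace and carry far more detail.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed ONIX 3.0 message (reference tag names).
# Real feeds normally declare a namespace and contain many more elements.
SAMPLE_ONIX = """<ONIXMessage release="3.0">
  <Product>
    <RecordReference>example-press-0001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000002</IDValue>
    </ProductIdentifier>
    <ProductIdentifier>
      <ProductIDType>06</ProductIDType>
      <IDValue>10.0000/example.0001</IDValue>
    </ProductIdentifier>
    <DescriptiveDetail>
      <TitleDetail>
        <TitleType>01</TitleType>
        <TitleElement>
          <TitleElementLevel>01</TitleElementLevel>
          <TitleText>An Example Open Access Monograph</TitleText>
        </TitleElement>
      </TitleDetail>
      <Language>
        <LanguageRole>01</LanguageRole>
        <LanguageCode>eng</LanguageCode>
      </Language>
    </DescriptiveDetail>
  </Product>
</ONIXMessage>"""

# ONIX product identifier type codes used in this sketch.
ID_TYPES = {"15": "isbn13", "06": "doi"}


def extract_records(onix_xml):
    """Return one flat dict per <Product> with a few illustrative fields."""
    root = ET.fromstring(onix_xml)
    records = []
    for product in root.iter("Product"):
        record = {
            "reference": product.findtext("RecordReference"),
            "title": product.findtext(
                "DescriptiveDetail/TitleDetail/TitleElement/TitleText"
            ),
            "language": product.findtext(
                "DescriptiveDetail/Language/LanguageCode"
            ),
        }
        # Keep only the identifier types this sketch knows about.
        for ident in product.findall("ProductIdentifier"):
            key = ID_TYPES.get(ident.findtext("ProductIDType"))
            if key:
                record[key] = ident.findtext("IDValue")
        records.append(record)
    return records


if __name__ == "__main__":
    for rec in extract_records(SAMPLE_ONIX):
        print(rec)
```

Flattening each product into a small dictionary mirrors the idea of pulling only the necessary metadata out of a much richer ONIX document; an aggregated file of the kind SciELO Books sends would simply contain many `Product` elements.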
Nonetheless, not all publishers are willing or able to use this rather complex XML format. For many publishers, both large and small, it is more convenient to send us the metadata in a structured file. This also works best for several of our trusted network partners, such as OpenEdition and Project MUSE. We receive and ingest files in comma-separated text formats, Excel, or Word documents. These files are either transported through our FTP server or via email. Currently we are working out the technical details of our collaboration with JSTOR.
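To make the structured-file route concrete, the following Python sketch reads a comma-separated file and flags rows that lack required values. It is an illustration only: the column names and the `MANDATORY` tuple are hypothetical stand-ins, not the actual list of 10 mandatory DOAB/OAPEN fields, and `validate_rows` is a name invented for this example.

```python
import csv
import io

# Hypothetical subset of required columns, for illustration only.
# The real list of mandatory fields is not reproduced here.
MANDATORY = ("title", "isbn", "license", "language", "publisher")

SAMPLE_CSV = """title,isbn,doi,license,language,publisher
A Complete Record,9780000000002,10.0000/example.0001,CC BY 4.0,eng,Example Press
A Record Without Licence,9780000000019,,,fra,Example Press
"""


def validate_rows(csv_text):
    """Split rows into accepted rows and (line number, missing fields) pairs."""
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Data starts on line 2 because line 1 holds the column headers.
    for line_no, row in enumerate(reader, start=2):
        missing = [f for f in MANDATORY if not (row.get(f) or "").strip()]
        if missing:
            rejected.append((line_no, missing))
        else:
            accepted.append(row)
    return accepted, rejected


if __name__ == "__main__":
    ok, problems = validate_rows(SAMPLE_CSV)
    print(len(ok), "row(s) accepted")
    for line_no, missing in problems:
        print(f"line {line_no}: missing {', '.join(missing)}")
```

Reporting the line number alongside the missing fields makes it easy to send a publisher precise feedback, whatever tool they used to produce the file.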
And last, but not least, for some titles the publications and their metadata are directly harvested into the OAPEN Library from other repositories.
From DOAB and OAPEN to libraries/readers
Much of the dissemination of OA books happens via libraries, which engage in different ways with the several types of metadata DOAB and OAPEN provide. Some libraries simply add the collections as links in their lib-guides. Others prefer to use an intermediary for handling their collection management. They can activate the DOAB and OAPEN collections in the library systems provided by OCLC, EBSCO, or ExLibris (Clarivate), which drive extensive traffic, as the numbers from the ALMA/ExLibris knowledge base show below.
Using one of the knowledge bases of a library system provider has advantages, but it is not without issues. The intermediary is responsible for the ingestion and deduplication of the metadata, and when this is not optimal, it leads to problems that cannot easily be solved by the library. Additionally, while the metadata provided by the OAPEN Library and DOAB are available as public domain data, the enriched descriptions from the intermediaries cannot be freely shared, making cooperation between libraries difficult as described in this NISO article (2022).
Directly importing book descriptions into the catalogue enables libraries to sidestep those issues, as they have full control over what they ingest. To support this, we provide metadata in formats that are widely used in a library context: MARCXML and KBART.
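For readers curious what a direct import might look like on the library side, here is a minimal Python sketch that reads a MARCXML collection and pulls out three common fields: 020 $a (ISBN), 245 $a (title), and 856 $u (URL). The sample record and the helper names `first_subfield` and `catalogue_entries` are assumptions for illustration; actual DOAB and OAPEN records contain many more fields, and a production catalogue would use its own import tooling.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARC 21 XML schema published by the Library of Congress.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Hypothetical single-record collection, trimmed to three data fields.
SAMPLE_MARCXML = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="020" ind1=" " ind2=" ">
      <subfield code="a">9780000000002</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">An Example Open Access Monograph</subfield>
    </datafield>
    <datafield tag="856" ind1="4" ind2="0">
      <subfield code="u">https://example.org/book/0001</subfield>
    </datafield>
  </record>
</collection>"""


def first_subfield(record, tag, code):
    """Return the text of the first matching subfield, or None if absent."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    node = record.find(path, MARC_NS)
    return node.text if node is not None else None


def catalogue_entries(marcxml):
    """Build one small dict per MARC record in the collection."""
    root = ET.fromstring(marcxml)
    return [
        {
            "isbn": first_subfield(rec, "020", "a"),
            "title": first_subfield(rec, "245", "a"),
            "url": first_subfield(rec, "856", "u"),
        }
        for rec in root.findall("marc:record", MARC_NS)
    ]


if __name__ == "__main__":
    for entry in catalogue_entries(SAMPLE_MARCXML):
        print(entry)
```

Because the library runs this step itself, it decides which fields to keep and how to deduplicate, which is exactly the control that is lost when an intermediary handles ingestion.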
Some libraries go beyond importing the whole collection of titles that are available in the OAPEN Library or DOAB. The Statewide California Electronic Library Consortium (SCELC) is operating a project where OAPEN provides customized feeds to participating SCELC libraries. Here OAPEN provides 10 different sets of book descriptions, based on profiles that have been defined by the member libraries of the SCELC consortium. Examples are a profile selecting titles related to religion in nine languages, or a selection of English-language books on art, music, design, or business. We see an increasing demand for selections of titles based on many different criteria, with specific metadata requirements. To cater to these needs which we hear echoed from other institutions, we are currently developing a flexible export module.
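The idea of profile-based feeds can be sketched in a few lines of Python. This is not the export module under development, only a hedged illustration: the `Profile` class, the `build_feed` function, the subject labels, and the sample records are all hypothetical, and a real profile would likely rely on a controlled subject classification and richer criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A hypothetical selection profile: allowed languages and subjects."""
    name: str
    languages: frozenset
    subjects: frozenset

    def matches(self, record):
        # A record matches when its language is allowed and it shares
        # at least one subject with the profile.
        return (
            record["language"] in self.languages
            and bool(self.subjects & set(record["subjects"]))
        )


def build_feed(profile, records):
    """Return the records selected by a profile, preserving input order."""
    return [r for r in records if profile.matches(r)]


# Invented sample records, for illustration only.
RECORDS = [
    {"title": "Rituals in Context", "language": "eng", "subjects": ["religion"]},
    {"title": "Design Thinking", "language": "eng", "subjects": ["design", "business"]},
    {"title": "Glaube und Gesellschaft", "language": "deu", "subjects": ["religion"]},
    {"title": "Musique et pouvoir", "language": "fra", "subjects": ["music"]},
]

RELIGION = Profile("religion", frozenset({"eng", "deu", "fra"}), frozenset({"religion"}))
ARTS_EN = Profile(
    "arts-english",
    frozenset({"eng"}),
    frozenset({"art", "music", "design", "business"}),
)

if __name__ == "__main__":
    for profile in (RELIGION, ARTS_EN):
        titles = [r["title"] for r in build_feed(profile, RECORDS)]
        print(profile.name, "->", titles)
```

Keeping each profile as plain data (a name plus sets of languages and subjects) means new selections can be added without changing the filtering logic.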
Looking ahead – DOAB’s next 10 years
Being more globally inclusive is one of the biggest challenges we face, and one that DOAB will continue to prioritize in the coming years. While the infrastructures handle user diversity with care (supporting bibliodiversity) to ensure that publishers of any size or shape fulfilling the core DOAB requirements can benefit from the discoverability provided through library systems and web services, DOAB is still mainly populated with publishers from Europe and North America. Although we are seeing growth from South America, publishers from Africa and Asia are largely absent. On the journal side of publishing, these regions (especially Asia) seem very busy with open access, so why aren’t publishers from those regions using DOAB? Maybe there is simply a lack of awareness – they don’t know about DOAB. Or maybe there are other things at stake. We don’t know but we do want to find out.
With fewer than 10 people in our team, we are dependent on others to engage with us in this mission to enable equitable access to global distribution for book publishers/authors, in the same way that we already provide open access books to libraries/readers globally. However, as a small first step towards ensuring that OA book publishers globally know about DOAB, we are preparing a pilot outreach project in Africa that will begin this summer. Later in the year we have plans to begin exploring partnerships in Asia.
In addition to expanding our efforts to be more globally inclusive, over the next 10 years DOAB will also continue to deliver on its mission: to improve discoverability of OA books and increase trust in OA book publishing by providing an open, global, and trusted hub for all stakeholders in scholarly publishing. To be successful, we have to improve in (at least) these three interconnected areas: (1) Quality of DOAB, (2) Community engagement, (3) Financial sustainability.
Quality of DOAB
To ensure the quality of the DOAB books collection, publishers joining the directory must comply with peer review and licensing requirements for their academic books. Like DOAJ, we put considerable effort into checking publishers before we allow them to join DOAB and upload their books, which can be both time-consuming and complex. We therefore want to refine our internal guidelines and workflows to improve the evaluation process, working with our Scientific Committee. As part of our efforts to be more global, we have created an excellent collaboration with SciELO Books in Brazil to help us assess publishers from Latin America. We want to create other similar collaborations elsewhere in the world. We also hope that our recently launched Peer Review Information Service for Monographs (PRISM) will increase the transparency around the quality assurance process of books by enabling publishers to share details of their peer review processes and integrate this information in their books’ metadata. We believe that this service holds ample opportunities for further development.
Of course, quality relates not only to content, it’s also about technical stuff and metadata. As well as continuing to help (typically smaller) publishers get distributed just as well as any other publisher, our infrastructure itself also needs updating. This will include upgrading to DSpace 7, monitoring and improving our data model and integrations with third party systems, and ideally (with community support) improving the metadata quality of DOAB at the record level.
Community engagement
Community engagement will be vital to the continued growth of DOAB over the next 10 years. This includes assistance with the evaluation of publishers (like SciELO, as mentioned above), as well as help from the library community to improve metadata quality. Some things can be done automatically (e.g., checking links – a new service will be announced soon) but we also need a community of volunteers to help us ensure that DOAB can serve libraries with high-quality metadata either directly through our open API and metadata feeds or via library intermediaries.
Financial sustainability
Finally, to serve publishers and libraries or third parties who use DOAB data, the DOAB Foundation (like OAPEN) needs to generate enough revenue to sustain and develop our infrastructure. Our POSI self-audit shows how we strive to make a financial surplus so that we can both innovate for our users and also establish a contingency fund. To keep developing our services and engaging with our growing user community we probably need to double our staff numbers in the coming years. Our revenue (mainly) comes from supporting libraries and consortia, with some additional funding from publishers (OAPEN revenues also come from publisher and funder service fees). We are grateful for this support, but are worried about its sustainability.
We often think of DOAB and OAPEN as old start-ups but, with open access to books truly accelerating, the time has now come for DOAB and OAPEN to mature as organizations. We believe that the library community will continue to want open infrastructures for academic books, so we are confident that in 10 years our organizations will be thriving, alongside other open infrastructures and vendors, as we seek to serve open and equitable scholarly publishing in the best possible ways.