Editor’s Note: Today’s post is by Amy Brand. Amy is Director of the MIT Press and Co-Founder of the MIT Knowledge Futures Group. She headed business and product development at Crossref from 2001 to 2008, and currently serves on the Crossref Board of Directors.
January of 2020 marks 20 years since the incorporation of Crossref. This platinum anniversary is an opportune time to look back and take stock of how far the organization has come in the intervening decades, and ponder where its strengths and achievements might lead it in the future. The visionary publishers who formed Crossref, and the staff who have run it from the start, should feel extremely proud of the organization they created – not only for its success as technical infrastructure but also, arguably, as the scholarly information community’s most extensive, impactful, and stable consortium. That said, it is unlikely that the founding publishers envisioned at the outset the diverse, multi-stakeholder federation that the organization is today. As an early staff member myself, and now a member of the Board, I share a sense of pride in how far Crossref has come, and care deeply about its future.
Crossref now finds itself at something of a crossroads. Namely, will it realize its potential in the years ahead to become even more robust as a convening force that represents all voices in this changing community, and provides the infrastructure for the enhanced features and capabilities of tomorrow’s scholarly communication technologies? Or will it be held back, and its remit circumscribed, by legacy priorities and forces within the industry that may perceive open data and infrastructure as a threat to their own evolving business interests? As Crossref looks toward its annual member meeting and Board election next month, it is actively seeking community input to help shape its future. Indeed, the theme of this year’s meeting is “have your say”.
Crossref launched as a partnership among 12 academic journal publishers – commercial, society, and university based – with start-up loans from eight of the founding members. (See my 2001 D-LIB article, Crossref Turns One, for more on the organization’s origins.) As a non-profit trade association, exempt from federal income tax under section 501(c)(6) of the U.S. Internal Revenue Code, the organization exists to improve operating conditions for the “business league” of depositing members as a whole, rather than for the benefit of any specific class of members or any particular business model. Crossref’s stated mission, from its original certificate of incorporation, reads: “To promote the development and cooperative use of new and innovative technologies to speed and facilitate scientific and other scholarly research.”
When operations got underway, Crossref’s immediate purpose was to provide the infrastructure for persistent cross-publisher linking, from references in scholarly articles to their source documents. Crossref was the first DOI (Digital Object Identifier) registration agency, achieving persistence in linking through a Handle System registry that stores DOI-URL pairs. The DOI for a work is effectively both its identifier and its permanent address, since the URL can be updated as needed by the DOI’s owner, in a self-service, distributed manner. Participating publishers also deposit rich bibliographic metadata to the Crossref system for each content item registered, and maintaining metadata quality is a key part of their membership commitment. As one measure of how well this model works, 80% of the Crossref records deposited between 2013 and 2016 were updated in some fashion by depositing members.
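To make the persistence mechanism concrete, here is a minimal sketch of a registry of DOI-URL pairs. It is a toy in-memory illustration, not the Handle System or Crossref’s actual implementation: the class name `DoiRegistry`, its methods, the owner labels, and the example DOI and URLs are all hypothetical.

```python
class DoiRegistry:
    """Toy in-memory stand-in for a registry that stores DOI-URL pairs.

    Hypothetical illustration only; it does not reflect the real
    Handle System or any Crossref interface.
    """

    def __init__(self):
        # Maps a normalized DOI to an (owner, url) pair.
        self._records = {}

    @staticmethod
    def _key(doi):
        # DOIs are case-insensitive, so compare them in one case.
        return doi.lower()

    def register(self, doi, owner, url):
        key = self._key(doi)
        if key in self._records:
            raise ValueError(f"DOI already registered: {doi}")
        self._records[key] = (owner, url)

    def update_url(self, doi, owner, new_url):
        # Only the DOI's owner may repoint it (the self-service update).
        current_owner, _ = self._records[self._key(doi)]
        if current_owner != owner:
            raise PermissionError(f"{owner} does not own {doi}")
        self._records[self._key(doi)] = (owner, new_url)

    def resolve(self, doi):
        # The identifier never changes; only the stored address does.
        return self._records[self._key(doi)][1]


registry = DoiRegistry()
registry.register("10.5555/example.1", "publisher-a",
                  "https://old.example.org/articles/1")

# The publisher moves its content and updates the address itself.
registry.update_url("10.5555/example.1", "publisher-a",
                    "https://new.example.org/articles/1")

current_address = registry.resolve("10.5555/EXAMPLE.1")
```

A reference that cites the DOI keeps working after the move, because readers follow the identifier rather than the URL stored behind it.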
The Crossref of today, while still serving its original reference linking function, has become so much more, in terms of both the make-up of the membership and the services it provides to the stakeholder community. The number of members has increased over 900-fold since its founding, to 11,000 today. Currently, about half of Crossref’s members pay annual dues to the organization that range from US$275 to $50,000, and the other half fall under the membership of ~50 sponsoring organizations that include INASP, ABEC Brasil, the Ontario Council of University Libraries (OCUL), the National Research Foundation of Korea, NEICON (Russia, Belarus), and Relawan Journals Indonesia. Effectively, as the designation “publisher” has itself continued to evolve in the digital realm, Crossref’s membership has likewise expanded and diversified to include a global cross-section of scholarly content providers – including museums, banks, research funders, libraries, individual research groups, and multilateral organizations like the United Nations and WHO – depositing a range of content types. Because the membership elects the Board of Directors, Crossref’s Board is also diversifying (see https://www.crossref.org/board-and-governance/).
As of October 2019, identifiers and metadata for over 109 million scholarly content objects have been registered with Crossref. 73% of these relate to journal articles, 14.4% to books or book chapters, and 5.5% to conference proceedings. Other deposited records refer to datasets, technical reports, dissertations, standards, preprints, research grants, and peer reviews. Reference linking is still the core service, but Crossref now also integrates with the ORCID researcher ID; supports the deposit and retrieval of enriched or updated metadata (Crossmark); facilitates plagiarism screening for interested members (Similarity Check); enables text and data mining services to the community (TDM); is piloting Event Data to support new metrics and other services based on non-member assertions; and offers a Funder Registry. Nearly 47 million deposited works in Crossref also include references, and the organization makes its scholarly metadata available to the broader community through both free public interfaces (e.g., Crossref’s REST API) and paid high-volume access for subscribed organizations.
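As a rough illustration of the free public interface, the sketch below builds a request URL for a single work in Crossref’s REST API and pulls a few fields out of the JSON response. The helper names, the example DOI, and the hand-written sample payload are assumptions made for illustration; the payload only imitates the general shape of a response and is not real Crossref data.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_BASE = "https://api.crossref.org/works/"


def works_url(doi):
    """Build the URL for one work's metadata record."""
    # quote() leaves "/" intact by default and escapes other unsafe characters.
    return API_BASE + quote(doi)


def fetch_work(doi):
    """Fetch a record over the network (defined here but not called)."""
    with urlopen(works_url(doi)) as response:
        return json.load(response)


def summarize(payload):
    """Pick a few commonly used fields out of a response payload."""
    message = payload["message"]
    titles = message.get("title") or [""]  # titles arrive as a list
    return {
        "doi": message.get("DOI"),
        "title": titles[0],
        "type": message.get("type"),
        "reference_count": message.get("reference-count"),
    }


# Hand-written sample in the general shape of a response (not real data).
sample_payload = {
    "status": "ok",
    "message": {
        "DOI": "10.5555/example.1",
        "title": ["An Example Article"],
        "type": "journal-article",
        "reference-count": 12,
    },
}

summary = summarize(sample_payload)
url = works_url("10.5555/example.1")
```

Calling `fetch_work` requires network access, and anyone planning heavier use should check Crossref’s own API documentation for its usage guidance.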
Why does all this matter? By dint of its vast coverage of scholarly literature, along with its ability both (1) to associate ever richer metadata with any DOI-identified object and (2) to convene – or rally – the scholarly information community around new initiatives, practices, and standards, Crossref is in a truly unique position to “scaffold” enriched representations of digital scholarship. That is, Crossref is better placed than any other organization to support community-driven efforts to improve discovery and navigation, and our ability to capture and assess contributions to science and scholarship. The pressing questions at this juncture, to my mind, are: will Crossref rise to this opportunity; who gets to decide whether or not it does; and do the governance and sustainability models it started with 20 years ago still serve the organization today, and into the future?
I think it’s fair to say that our digital publishing practices today do not take full advantage of the representational enhancements and decentralized authority that the Web enables. Precisely why that may be is beyond the scope of this article, but suffice it to say that academic norms and culture play a significant role, particularly when it comes to how research contributions are evaluated and job promotions are decided. How scholars are typically evaluated today has not, in most instances, caught up with how they are actually conducting and communicating their research on a daily basis. Citation-based metrics on their own are clearly inadequate when it comes to capturing the full picture of a scholar’s impact on their field, their institution, and on society at large (even more so when individuals and institutions are prevented from getting direct access to comprehensive citation data).
What, then, is the evidence base for excellence in the production of knowledge? How best can scholarly publishers of all stripes and sizes coordinate efforts to elucidate the connections among authorship, diverse measures of attention and influence, and academic careers? What can an individual contributor rightfully claim credit for, when scholarship today is increasingly collaborative, cross-disciplinary, and co-authored? What is the rightful role of peer review – so central to academic trust and credentialing – and how can the process be made more reliable and less bias-prone? These are among the most urgent questions the scholarly publishing community is grappling with today, and I believe the best answers will arise through multi-stakeholder coordination, open infrastructure, and standards. I also believe Crossref, given the size and diversity of its members, is well poised to get us farther on these issues, faster.
Crossref’s value proposition for the research community is crystal clear. Widespread adoption has greatly improved online discovery and navigation of scholarly content. I would go so far as to say that Crossref’s success was, if indirectly, a significant forcing function for open access as well. The experience of hitting a paywall after clicking on a DOI-powered link was and is a source of significant frustration for readers and libraries, especially when also encountering high fees for access to individual articles. The lucrative “article economy” envisioned in the 2000s never quite reached publishers’ expectations.
Crossref’s value proposition for its members, though, has indisputably shifted, as scholarly publishing proceeds down the path towards default open access models. In particular, the business benefit of link-based traffic to a purchase or subscription interface is significantly diminished with open access. This raises the question of whether Crossref’s revenue models should continue to be closely pegged to deposit transactions and the volume of publisher outputs. Today, well over half of Crossref’s annual revenue derives from deposit fees – charges for registering DOIs and depositing associated metadata. Membership dues reflect a much smaller percentage of the annual income, and metadata subscriber fees an even smaller proportion.
Although it may appear that the emphasis on transactional, volume-based charges puts the onus – and sense of ownership – on the largest commercial publishers, that isn’t in fact the case. There are only six publishers today within Crossref’s top-tier membership level ($50,000 in annual dues). These are among the publishers with the most content to register, and their overall payments to Crossref accounted for 30% of the organization’s revenue in FY2019. That said, given the sheer number of small publishers, payments from the lowest fee-tier members accounted for 33% of total Crossref revenues during the same period. What is the right balance among revenue sources as the organization looks ahead? I don’t know the answer, but this is clearly the right time for Crossref to be seeking as much community input as possible as it plans for the future.
With the growth of open access, some of the larger and more progressive commercial publishers have pivoted on strategy, and are banking increasingly on their data, technology, and analytics businesses (see, for example, the 2019 SPARC Landscape Analysis authored by Claudio Aspesi). It behooves the leaders of today’s research institutions to explore fully the implications of commercial control of research data, analytics, and infrastructure, along with the potential for community-owned alternatives. The prospect of an open metadata commons for digital scholarship, and open infrastructure for computing over that data, may be less exciting for entities who intend to grow revenues from their technology and analytics products than it is for other publishers, because of how it might compete with their current and future offerings. It would be foolhardy to ignore this fact as Crossref’s membership, staff, and Board work together to help the organization realize its full promise. The Crossref of 2040 could be an even more robust, inclusive, and innovative consortium to create and sustain core infrastructures for sharing, preserving, and evaluating research information.