The Scholarly Kitchen

What’s Hot and Cooking In Scholarly Publishing

  • About
  • Archives
  • Collections
    Scholarly Publishing 101 -- The Basics
    Collections
    • Scholarly Publishing 101 -- The Basics
    • Academia
    • Business Models
    • Discovery and Access
    • Diversity, Equity, Inclusion, and Accessibility
    • Economics
    • Libraries
    • Marketing
    • Mental Health Awareness
    • Metrics and Analytics
    • Open Access
    • Organizational Management
    • Peer Review
    • Strategic Planning
    • Technology and Disruption
  • Translations
    topographic world map
    Translations
    • All Translations
    • Chinese
    • German
    • Japanese
    • Korean
    • Spanish
  • Chefs
  • Podcast
  • Follow

The State of Scholarly Metadata: 2023

  • By Jamie Carmichael, Jessica Thibodeau, Roy Kaufman
  • May 9, 2023
  • 6 Comments
  • Discovery
  • Infrastructure
  • Metrics and Analytics
  • Open Access
  • Technology
  • Tools
Share
Share
0 Shares

Editor’s Note: Today’s post is by Jamie Carmichael, Jessica Thibodeau, and Chef Roy Kaufman. Jamie brings nearly 20 years’ experience in publishing to her current role as Senior Director, Information & Content Solutions at CCC. Jessica is Senior Director, Information and Content Solutions at CCC, responsible for the strategic direction of CCC’s Ringgold portfolio and go-to-market efforts for products and services across the scholarly publishing ecosystem.

As scholarly communication rapidly adapts to seismic shifts in open science, technology, and culture, a renewed focus has emerged on metadata and persistent identifiers (PIDs) — about people, places, and objects — as an essential component of a vibrant industry. At the US policy level alone, leveraging metadata to accelerate industry transformation is a common theme across the Nelson Memo and recent Requests for Information from the NIH and the Department of Transportation.

Scholarly research is complex and interconnected; change in one area can spark improvement or deterioration throughout the ecosystem. By way of example, consider the role of PIDs in open access (OA) funding entitlements. OA management platforms rely on metadata elements, particularly organizational PIDs passed from upstream submission and peer review systems, to automate the process of matching manuscripts with potential funding sources. This typically happens at article acceptance, and increasingly at submission, eliminating manual administration for authors as well as supporting publishers, institutions, consortia, and funders in achieving OA at scale.

In order to perform a health check on organizational IDs, in 2021, we reviewed cross-publisher records of institutional affiliation and/or funder data in our OA workflow tool, RightsLink for Scientific Communications. We discovered that 82% of accepted manuscripts included such data, which was an improvement over prior years. However, these statistics masked an ugly truth; namely that in many cases those manuscripts used institutional email domains as a proxy for funding or discount eligibility instead of a PID. And within the 18% that carried no PID, missed funding opportunities created unnecessary work (and payments) for authors, institutions, and publishers to reconcile retroactively.

Even if CCC — either alone, or with its partners and publishers — were able to close these metadata gaps at acceptance of manuscripts, this is late in the process and advantages of PIDs earlier in the research lifecycle would be lost. Solving for metadata gaps would be more effective in upstream systems of record so the tail doesn’t wag the dog. This is precisely why we’ve encouraged the NIH to consider the grant application process as an early opportunity to mandate PIDs and cascade to other systems underpinning the research lifecycle, for example, Current Research Information Systems (CRISs).

But where to start? Let’s face it, PIDs are a wonky topic and we need to communicate to people who are not naturally interested in the intricacies of, e.g., ISNI and Ringgold. But these people will care if they know that lack of PIDs can lead to lack of funding. In order to break this down, we recently talked with dozens of stakeholders and mapped a range of metadata challenges through an OA lens. We built on an existing body of work to visualize the ripple effect of a fragmented metadata supply chain. The result is an interactive report of the research lifecycle designed to offer everyone a deeper understanding of the state of scholarly metadata in 2023. Though the issues are numerous, they are not insurmountable, and much infrastructure exists to support change.

About the State of Scholarly Metadata: 2023

Working with Media Growth Strategies, we interviewed representatives from institutions, publishers, funders, researchers, service providers, PID providers, and industry associations to capture a broad view of the current state of metadata and PIDs across the ecosystem. We asked questions such as:

  • Who should create and maintain metadata? Where should it originate?
  • What resources do you invest to create, curate, or maintain various types of metadata?
  • What are your biggest challenges when it comes to metadata management and/or use of PIDs?
  • What are the most critical metadata elements?
  • What’s at stake if these elements don’t persist through scholarly communications?
  • Who should own metadata quality and control?
workflow diagram from the report
The State of Scholarly Metadata: 2023 visual report depicts economic and social impact of the fragmented metadata supply chain across the ecosystem.

Here is what they said about the costly implications of metadata breakages and complexities across the research lifecycle:

  • Researchers: There was overwhelming consensus among stakeholders that researchers shoulder a significant administrative burden to assert or re-assert data (e.g., institution affiliation, funder ID), ultimately disrupting and delaying scientific discovery.
  • Institutions: Because of metadata inconsistencies throughout the research lifecycle, institutions deploy labor-intensive workarounds to manually reconcile funding eligibility and APC billing, as well as normalize unstructured data across disparate systems for comprehensive analysis.
  • Funders: Missing metadata (e.g., registered grant DOIs, institution affiliation) makes it difficult and costly to link funding to research outputs, presenting potential barriers to open access uptake, problematic impact tracking, and incomplete analysis to inform future investments.
  • Publishers: Metadata breakages interfere with business transformation initiatives, contributing to high operational and opportunity costs and complicating fulfillment of open access agreement terms and analysis of deal performance to inform future decisions.

Many stakeholders we interviewed recognize that new metadata strategies, inclusive policies, and a robust framework of interoperable systems are essential for modernizing this element of scholarly communications. It’s also clear that an ecosystem-wide commitment to improving data quality across all groups will facilitate the transition to open while helping to preserve research integrity, expand discoverability, and improve impact measurement. If the industry works collectively to shrink these gaps by reexamining metadata policy and practice, stakeholders will undoubtedly feel less pain. Or, we can continue the current system of entropy, friction, and frustration. Together, we can decide our path.

Share
Share
0 Shares
Share
Share
0 Shares

Jamie Carmichael

Jamie Carmichael brings nearly 20 years’ experience in publishing to her current role as Senior Director, Information & Content Solutions at CCC. In this position, she owns go-to-market strategy for the company’s open scholarly communications portfolio.

View All Posts by Jamie Carmichael

Jessica Thibodeau

Jessica Thibodeau is Senior Director, Information and Content Solutions at CCC, responsible for the strategic direction of CCC’s Ringgold portfolio and go-to-market efforts for products and services across the scholarly publishing ecosystem.

View All Posts by Jessica Thibodeau
Roy Kaufman

Roy Kaufman

Roy Kaufman is Managing Director of both Business Development and Government Relations for the Copyright Clearance Center (CCC). Prior to CCC, Kaufman served as Legal Director, John Wiley and Sons, Inc. He is a member of, among other things, the Bar of the State of New York, the Author’s Guild, and the editorial board of UKSG Insights. Kaufman also advises the US Government on international trade matters through membership in International Trade Advisory Committee (ITAC) 13 – Intellectual Property and the Library of Congress’s Copyright Public Modernization Committee in addition to serving on the Board of the United States Intellectual Property Alliance (USIPA).

View All Posts by Roy Kaufman

Discussion

6 Thoughts on "The State of Scholarly Metadata: 2023"

Couple of reflections:
1. Wow! Don’t APCs add significant complexity (and therefore cost) to an already complex system; no wonder it’s painful for stakeholders.
2. This model is for research that ends up published in journals. It would be super interesting to see the equivalent for research that’s published as grey literature (where the current state of metadata is, to put it politely, in a real ‘state’, aka a typical teenagers’ bedroom).

  • By Toby Green
  • May 9, 2023, 9:44 AM

Thanks for this, Roy – I look forward to digging into the report itself. In the meantime, I wanted to add that, while it’s easy to see PID mandates as part of the solution, they only work if it’s easy for everyone involved to comply – and if everyone can see the benefits. At the moment, neither of those things are the case across the board – for all sorts of legitimate reasons, including not enough integrations with key systems, lack of consistent outreach to researchers, limited understanding/support for/investment in PIDs at the leadership level, associated metadata that is poor/missing/incomplete, etc. I fully agree that registering PIDs and adding metadata as early as possible in the process is the right way to go (and I believe ARDC’s Research Activity Identifier, RAid – see https://ardc.edu.au/services/ardc-identifier-services/raid-research-activity-identifier-service/ – will be a big help there). But I think our first step should be to focus on making PIDs work better and ensuring they are easier to implement for everyone!

  • By Alice Meadows
  • May 9, 2023, 10:01 AM

So sorry- I should have said many thanks to Jamie and Jessica too!

  • By Alice Meadows
  • May 9, 2023, 10:16 AM

Thanks very much for this feedback Alice and for your contributions to the study early on. We hope this work can help support discussions focused on ways to make PIDs work more effectively and always welcome feedback to that end.

  • By Jessica Thibodeau
  • May 9, 2023, 5:08 PM

Fantastic work done by the CCC team, in highlighting the persistent bottlenecks we all need to address to make research discoverability and for the potential of the scholarly body of literature and data to actually be searched through meaningfully for knowledge building as well as transfer towards societal benefits, implementation, and product development where applicable; let alone increase efficiency in the publishing process, of course, which will liberate funds to where they are desperately needed.

Referring back to my previous TSK article on mapping OS resources (https://scholarlykitchen.sspnet.org/2023/04/19/guest-post-mapping-open-science-resources-from-around-the-world-by-discipline/), of which open infrastructure and PIDs, of course, are integral components – as is this tool mapping metadata gaps (https://kumu.io/access2perspectives/open-science#disciplines/by-os-principle/the-state-of-metadata) -, see an overview of all kinds of research-related and research-applicable PIDs at https://kumu.io/access2perspectives/open-science#disciplines/by-resource-type/persistent-identifier?focus=1

  • By Jo Havemann
  • May 9, 2023, 12:56 PM

Thanks so much Jo. We look forward to continued discussions on these important topics. And kudos on the great work you are doing to map the open science resources.

  • By Jessica Thibodeau
  • May 9, 2023, 4:43 PM

Comments are closed.

Official Blog of:

Society for Scholarly Publishing (SSP)

The Chefs

  • Rick Anderson
  • Todd A Carpenter
  • Angela Cochran
  • Lettie Y. Conrad
  • David Crotty
  • Joseph Esposito
  • Roohi Ghosh
  • Robert Harington
  • Haseeb Irfanullah
  • Lisa Janicke Hinchliffe
  • Phill Jones
  • Roy Kaufman
  • Scholarly Kitchen
  • Alice Meadows
  • Ann Michael
  • Alison Mudditt
  • Jill O'Neill
  • Charlie Rapple
  • Dianndra Roberts
  • Roger C. Schonfeld
  • Avi Staiman
  • Randy Townsend
  • Tim Vines
  • Jasmine Wallace
  • Karin Wulf
  • Hong Zhou

Interested in writing for The Scholarly Kitchen? Learn more.

Most Recent

  • Guest Post — Beyond Efficiency: Reclaiming Creativity and Wellbeing in the Age of AI and Scholarly Publishing
  • Guest Post — Public Access to the Endless Frontier
  • Language Evolves, or rather, Constantly Cooks New Ways to Pass the Vibe Check

SSP News

Thank you to our 47th Annual Meeting Sponsors!

May 19, 2025

Get Your Tickets to the EPIC Awards!

May 14, 2025

Get Ready for SSP 2025: Innovation, Swag, and Scholarly Networking!

May 13, 2025
Follow the Scholarly Kitchen Blog Follow Us

Related Articles:

  • lightbulb idea concept Better Metadata Could Help Save The World!
  • Aerial view of crowd connected by lines Making the Case for a PID-Optimized World
  • chalkboard drawing connecting time (a clock) and money (a dollar symbol) Why PID Strategies Are Having A Moment — And Why You Should Care

Next Article:

Is the Essence of a Journal Portable?
Society for Scholarly Publishing (SSP)

The mission of the Society for Scholarly Publishing (SSP) is to advance scholarly publishing and communication, and the professional development of its members through education, collaboration, and networking. SSP established The Scholarly Kitchen blog in February 2008 to keep SSP members and interested parties aware of new developments in publishing.

The Scholarly Kitchen is a moderated and independent blog. Opinions on The Scholarly Kitchen are those of the authors. They are not necessarily those held by the Society for Scholarly Publishing nor by their respective employers.

  • About
  • Archives
  • Chefs
  • Podcast
  • Follow
  • Advertising
  • Privacy Policy
  • Terms of Use
  • Website Credits
ISSN 2690-8085