Editor’s Note: Today’s post is by Christina Drummond, a data governance technology strategist with more than 20 years of experience across scholarly infrastructure, public policy, and mission-driven organizations.

Author’s note: As the successful OAEBUDT data space pilot and governance development project wraps up this spring, stakeholders are considering how data spaces could more broadly transform operations across our industry. This post summarizes how data spaces have been trialed for policy-aware data flows in news media and scholarly publishing supply chains. It serves as a primer for the May 12, 2026, BISG/SSP webinar, “Data Spaces and their Potential for Next Generation Publishing, Discovery, and AI Innovation,” which is free for BISG/SSP members to attend. Further opportunities for continued learning and engagement are noted at the end of this post.

Introduction

In the age of AI, entities across the scholarly communications landscape require ways to control and audit real-time data flows across national and organizational boundaries. Machine-actionable data governance is emerging as a requirement for legal compliance, risk reduction, and cloud cost controls. Data spaces are a way for publishers to make legal agreements and licenses machine-actionable across their supply chains, as many commercial organizations outside of our industry already do.

The concept of “data spaces” has emerged in the past decade to describe a specific type of community-based data governance infrastructure that allows networks, supply chains, and consortia to produce sensitive multi-enterprise data products and services. In this post, I introduce how emerging data space standards facilitate interoperability and federation across decentralized communities so they can securely exchange sensitive or proprietary data across supply chains, competitors, and business partners.

Text snippet image that reads: "Data spaces are equal parts participant network, community-governance, and data governance technology built upon global, industry, and data agnostic standards for interoperability. Unlike other types of data collaboratives, data spaces do not require a central copy of everyone's data. Instead, data spaces make it possible for each organization to control and monitor direct, granular data access in line with precise, machine actionable licensing."

By automating organizational handoffs while avoiding the storage of sensitive data in centralized data brokers, data spaces can expedite the dynamic generation of products and services that combine data across organizations and nations with diverse privacy, security, and data policies. When implemented across an industrial supply chain (like news media and automotive manufacturing), data spaces make it possible for digital products or services to be dynamically generated from data resting within an organization’s control.

Organizations join data space networks to directly configure, manage, and observe their live data connections with other organizations. Doing so reduces the amount of data held outside their direct control, while adding auditable data provisioning controls and new observability that can support provenance reporting.

Controls to Manage and Observe How Your Organizational Data is Reused

In the AI economy, everything is data: from the digital versions of cultural and scientific artifacts we steward to the operational data we generate during research production, publication, discovery, preservation, access, and reuse. While organizations may leverage Model Context Protocol (MCP) servers as gateways to provision internal data to AI, data spaces solve a different data governance and legal compliance challenge. They reduce risks associated with data duplication and unauthorized reuse by enabling policy-authenticated data connections and computations across networks.

By going beyond identity-based authentication, data spaces make it possible for organizations to granularly control and monitor HOW the data under their care is used over time. Contractual terms become dynamically enforced, machine-actionable, and observable. Supply chain stakeholders jointly define industry-specific “data connector” templates for operational use cases, such as distributed usage data aggregation or content fact-checking. When an organization joins a data space, it has multiple templates available to connect data, configure metadata and licensing terms, and initiate auditable data sharing. Data spaces also provide an environment for organizations to engage with data marketplaces and trusted app stores to enable autonomous “compute in the cloud” workflows that can accommodate privacy enhancements or data standardization to streamline operations further.

Data space participants must be verified organizations with legal authority over the data and services they connect, so they can be held accountable for their actions within the data space. Governance authorities steward these data space networks to vet and onboard participants, administer contracts and finances, and resolve barriers to trusted, sustainable operations. Similar to participation in other types of data collaboratives, organizations benefit when a critical mass of their partners use the same data space to systematically streamline data normalization, improve metadata, and increase data discoverability. What’s different is that data spaces provide standard data governance controls for organizations seeking machine-readable and actionable licensing terms, while generating a digital data sharing trail for provenance and audits.

From European Origins to Global Pilots

Data spaces facilitate the secure flow of sensitive and nonsensitive data across jurisdictions and systems while preserving user control. The 2020 European Data Strategy laid the foundation through research and development (R&D) of Common European Data Spaces across nine sectors, including health, agriculture, and cultural heritage, with the dual goal of increasing data availability for the economy and society while keeping the companies and individuals who generate data in control.

An initial €2 billion infrastructure investment aligned with emerging regulations like Europe’s Data Governance Act, which regulates “data intermediaries” and the reuse of public and protected data, and its Data Act, which seeks to unlock the value of growing volumes of industrial data. Global, regional, and industrial nonprofit consortia emerged to steward data space development, such as GAIA-X, which fosters pan-European interoperability, and the International Data Spaces Association (IDSA), which facilitates global standards development, alignment, and adoption across industries and enterprises large and small. The resulting data space framework now underpins emerging ISO standards that define required data space components, interoperable data connectors, and decentralized verifiable claims.

Instead of a single switchboard or hub, the R&D produced technology-neutral minimum standards and a reference architecture so that decentralized communities can build fit-for-purpose data spaces that can federate for interoperability, easing organizational participation. Over 235 data space efforts emerged to explore data connectors for specific use cases within data space participant networks across industries and sectors. Fully operational, mature industry-focused data spaces provide inspiration, like the Catena-X network that enables data products and services that combine proprietary data across car manufacturers, suppliers, and their downstream business partners in North America, Europe, and Asia.

Resources like the IDSA’s Data Space Rulebook and DSSC’s Maturity Model assessment tool help stakeholder networks ensure interoperability when setting up a data space. Programs in Europe, Australia, and Canada directly support regional data space pilots, while data space support hubs and competency centers help enterprise networks onboard across four continents. Cloud computing hyper-scalers, such as AWS and Microsoft, have technical resources for data architects, while an Interest Group at the Research Data Alliance is forming to “share lessons learned while adapting, adopting, and piloting the international dataspace protocol and its associated governance and technical building blocks for academic and industrial data sharing.”

Scholarly communications stakeholders have already piloted a data space proof of concept. The OA Book Usage Data Trust (OAEBUDT) effort developed a minimum viable, extensible open-source data space in partnership with Think-It. They piloted it with a governance authority hosted at OPERAS-EU, in line with a community-developed data space participation rulebook and accompanying legal agreement. The initial data connector, focused on OA book usage data, aimed to simplify the aggregation and benchmarking of distributed usage data. It was piloted by JSTOR, Michigan Publishing, Ubiquity Press/University Press Library Open, LibLynx, Punctum Books, and Knowledge Unlatched.

Lessons learned signaled that a sustainable data space for scholarly communications would best incentivize participants to join if data connectors made simple what otherwise was complicated, or if data spaces streamlined sought-after connections with critical information nodes. However, the focus on book usage was too narrow to sustain a data space for its stakeholders, prompting the April 2026 sunsetting of the OAEBUDT governance and brand so that a broader scholarly communications data space effort could form in its place.

Figure 2: A visualization of how the OAEBUDT Data Space pilot supported usage data flows via a proof of concept usage data connector. Image Source: Ricci, L., & Clarke, M. (2025). OAEBUDT Return on Investment Case Study Report. Pg 6. Zenodo. https://doi.org/10.5281/zenodo.17860210 

Additional European data spaces related to rich content include the Trusted European Media Data Space for news media, which piloted data connectors for content verification, audience analytics, and content marketplaces; the European Language Data Space (LDS), which supports a marketplace for language corpora; and Europeana’s Common European Cultural Heritage Data Space, which aims to provide a digital gateway to Europe’s cultural artifacts.

How Data Spaces Work

Like all great infrastructures, data spaces are meant to run in the background behind dashboard interfaces for participating organizations. But how do they work under the hood?

At their most basic, data spaces make it possible for data providers and their recipients (or consumers for monetized assets) to clear, log, and dynamically allow or deny access and use over time based on the verified identity of the requester, PLUS a verified match between the access request and the licensing terms tied to the data in a given connector.
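To make the "identity PLUS policy match" idea concrete, here is a minimal sketch in Python. It is an illustration only, not how any particular data space or connector standard implements the check: the class names, the fields (purpose and region), and the participant identifiers are all hypothetical, and real license terms would carry far more conditions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Hypothetical machine-actionable terms attached to one connector."""
    allowed_purposes: frozenset
    allowed_regions: frozenset


@dataclass(frozen=True)
class AccessRequest:
    """Hypothetical request a recipient sends to a provider's connector."""
    requester_id: str
    purpose: str
    region: str


def decide_access(request, verified_participants, terms):
    """Allow access only when BOTH the identity and the policy match."""
    # Step 1: identity. The requester must be a verified participant.
    if request.requester_id not in verified_participants:
        return False, "requester is not a verified participant"
    # Step 2: policy. The requested use must fit the license terms.
    if request.purpose not in terms.allowed_purposes:
        return False, "purpose is not covered by the license"
    if request.region not in terms.allowed_regions:
        return False, "region is not covered by the license"
    return True, "identity verified and request matches license terms"


# Illustrative values only (assumed, not drawn from any real data space).
terms = LicenseTerms(
    allowed_purposes=frozenset({"usage-benchmarking"}),
    allowed_regions=frozenset({"EU", "US"}),
)
participants = {"press-a", "library-b"}

granted = decide_access(
    AccessRequest("library-b", "usage-benchmarking", "EU"), participants, terms
)
refused = decide_access(
    AccessRequest("library-b", "model-training", "EU"), participants, terms
)
print(granted)
print(refused)
```

Note that the second request comes from the same verified organization as the first. It is refused only because the stated purpose falls outside the license, which is the difference between policy-based and identity-only access control.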

Organizations can serve in multiple roles in data spaces, such as:

  • Data Providers who configure and manage their organizational “data connectors” to provide policy-based data access tied to specific licensing terms and use cases.
  • Data Recipients, also known as Data Consumers, who configure connectors for specific use cases to manage data requests and receipt across partners in line with their licensing terms.
  • Data Space Service Providers who offer shared services in a data space to enhance functionality and introduce economies of scale – from privacy-enhancing encryption to other data processing, identity authentication, and transaction logging via clearinghouses.

Pre-configured for specific business use cases, data connectors in data spaces offer organizations machine-actionable templates to configure when sharing data under specific terms. Once activated by a data provider and data recipient, a data connector verifies that the requested access matches the licensing terms. It can also enforce a minimum level of data quality, apply processing, and log the exchange. When trusted data processors provide services through a data space, participants can run third-party apps while the data is in transit, providing further efficiencies while increasing transparency around data provenance and transformation.
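As a rough illustration of what a connector does once it is activated, the Python sketch below combines three of the steps described above: a license check, a simple data quality check, and an audit log entry for every attempt. The `DataConnector` class, its method names, and the record fields are hypothetical and are not taken from any connector specification.

```python
from datetime import datetime, timezone


class DataConnector:
    """Hypothetical connector: license check, quality check, audit log."""

    def __init__(self, licensed_uses, required_fields):
        self.licensed_uses = set(licensed_uses)
        self.required_fields = set(required_fields)
        self.audit_log = []  # every attempt is recorded, allowed or not

    def _log(self, recipient, use, outcome):
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "recipient": recipient,
            "use": use,
            "outcome": outcome,
        })

    def transfer(self, recipient, use, records):
        """Release only complete records, and only for a licensed use."""
        if use not in self.licensed_uses:
            self._log(recipient, use, "denied: use not licensed")
            return []
        # Minimum data quality: keep records that carry every required field.
        complete = [r for r in records if self.required_fields.issubset(r)]
        self._log(
            recipient, use,
            f"released {len(complete)} of {len(records)} records",
        )
        return complete


# Illustrative usage data (assumed values).
connector = DataConnector(
    licensed_uses={"usage-benchmarking"},
    required_fields={"title_id", "month", "downloads"},
)
records = [
    {"title_id": "bk-1", "month": "2026-01", "downloads": 120},
    {"title_id": "bk-2", "month": "2026-01"},  # incomplete: no downloads
]
released = connector.transfer("library-b", "usage-benchmarking", records)
denied = connector.transfer("vendor-c", "model-training", records)

for entry in connector.audit_log:
    print(entry["recipient"], "->", entry["outcome"])
```

The audit log is the part that supports provenance reporting: it records refused requests as well as completed transfers, so a provider can later show who asked for what and on what terms.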


Organizational data stewards can use their data space interface to monitor and manage the receipt of data into their organization, and configure and advertise specific data offerings in line with licenses. Verified matches between requested terms and data offerings can become automated in mature data spaces, so data is transmitted as soon as there is a verified policy match.

A Data Space Governance Authority coordinates activities, processes agreements, and manages financials for data space operations. The ecosystem-bridging organizations that play this role steward trust in the data space over time, managing sustainable operations, service providers, technical roadmaps, and disputes across data space participants. To ensure interoperability with other data spaces, such authorities leverage the IDSA Rulebook and certify their use-case-specific data connectors. They can also maintain a data space-specific participation rulebook to clarify how their particular data space functions as both a participant community and technical infrastructure (e.g., the OAEBUDT data space participation rulebook).


Data space developers across industries and nations ensure interoperability through component certification, global standards alignment activities, and a detailed reference architecture model that describes shared attributes, roles, functions, semantics, and processes. What started as Design Principles grew into the International Data Spaces Reference Architecture Model (IDS‑RAM), which has evolved over five versions to define how data spaces are implemented in practice. The IDSA and Europe’s Joint Technical Committee (JTC 25) ensure standards alignment for these assets in collaboration with Eclipse, IEEE, and W3C, and this work is introducing related ISO/IEC standards.

In sum, a data space is a trusted network of verified organizations that exchange data under shared governance rules and technical standards. Unlike centralized data platforms, a data space federates infrastructure, so each participant keeps control over its own data by configuring automation for dynamic access management. APIs and MCP servers operate at the data layer. When they are routed through a data space, access to them depends on tokens that are issued only when policy-based requests align with license terms, enabling automated, rules-based data sharing with greater assurance than identity-based trust alone. Emerging connector standards support a growing range of usage policies, allowing providers to tightly condition or restrict how data is accessed and used. When a data request meets such conditions, data flows.
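The token step can be sketched as follows. This is a simplified Python illustration that uses an HMAC-signed, expiring string as a stand-in token. It is an assumption for the sake of the example and not the token format or negotiation flow defined by any data space protocol; the function names and the in-memory secret are hypothetical.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical signing key held by the provider's connector.
SECRET = secrets.token_bytes(32)


def issue_token(requester, requested_use, licensed_uses, ttl_seconds=300, now=None):
    """Return a signed, expiring token only if the requested use is licensed."""
    if requested_use not in licensed_uses:
        return None  # no policy match, so no token and no data
    issued_at = now if now is not None else time.time()
    expires = int(issued_at) + ttl_seconds
    payload = f"{requester}|{requested_use}|{expires}"
    signature = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{signature}"


def token_is_valid(token, now=None):
    """Check signature and expiry before the underlying API serves data."""
    try:
        requester, use, expires, signature = token.split("|")
    except (AttributeError, ValueError):
        return False  # missing or malformed token
    payload = f"{requester}|{use}|{expires}"
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # token was altered after it was issued
    current = now if now is not None else time.time()
    return current < int(expires)


licensed = {"usage-benchmarking"}
token = issue_token("library-b", "usage-benchmarking", licensed)
no_token = issue_token("library-b", "model-training", licensed)

print(token_is_valid(token))
print(no_token)
```

Because the token expires, access has to be re-cleared against the license over time rather than granted once and forgotten, which mirrors the dynamic allow-or-deny behavior described above.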

Shaping Data Spaces for Scholarly Communications

Publishing is no different from other industries seeking to unlock the value of data assets, yet it remains unclear how data spaces will evolve to serve scholarly communication. Among the questions still to be answered:

  • What licensing or data sharing and use agreements should publishers make machine‑actionable and auditable?
  • What types of service innovation might emerge if data spaces unlock real-time analytics across publishing and discovery layers?
  • Which organizations are best situated to host this next-generation infrastructure?
  • How will research and discovery stakeholders sustain the multiple data spaces likely to emerge?

There are multiple opportunities to learn, contribute, and shape the emergence of data spaces in our industry.

So, how will your organization launch into the universe of data spaces?

Acknowledgements: This post would not have been possible without: 1) the many research teams, working groups, and users advancing data spaces globally, 2) the advisors continuing to shape the scholarly communications data space efforts, and 3) the Mellon Foundation for supporting the OAEBUDT effort that brought together university presses, commercial publishers, libraries, content aggregators, data space experts, and European infrastructures to begin developing governance mechanisms for a scholarly communications data space and a first data connector for distributed, open usage. I gratefully acknowledge the following peer reviewers of this post who helped me ground the complex world of data spaces for scholarly communication: Peter Potter, Kevin Hawkins, Yannick Legré, Jennifer Kemp, Sharla Lair, and Wendy Queen.

Christina Drummond

Christina Drummond is a data governance technology strategist with more than 20 years of experience across scholarly infrastructure, public policy, and mission-driven organizations. Over the past five years, she facilitated community development of governance mechanisms and MVP infrastructure for the OA Book Usage Data Trust’s data space pilot. She is a member of SSP, RDA, IDSA, IAPP, and the Mid-Ohio Regional Planning Commission’s Regional Data Roundtable, and she co-chairs the Research Data Alliance Interest Group on Data Spaces in the Research Ecosystem. Equipped with an MA in International Science and Technology Policy and IAPP certifications in Information Privacy and AI Governance, she’s actively looking for ways to continue developing data spaces for research and education networks.
