Editor’s note: Today’s post is by Chef Angela Cochran and Todd Toler, Group VP of Product & Market Strategy at Wiley.

Sometime last fall, the leadership of a mid-sized physics publisher was approached by a company that wanted to license their content for use in an AI system. The company wouldn’t say who it was. It wouldn’t say what the content would be used for, or by whom, or under what conditions. It just wanted access. The publisher had no framework for evaluating the offer: no internal policy, no precedent, no rubric. They said no, obviously. But the interaction lingered. It was like being asked to give up your copyright by someone who wouldn’t tell you if they wanted to share your story or silence it.

This was one of 14 conversations we had between December 2025 and February 2026 with publishing leaders — society executives, commercial publishing directors, consultants — about where they stood with artificial intelligence (AI). We didn’t set out to produce a survey or a framework. We set out because we were curious, and maybe a little restless. We’d both been thinking about this privately for months, accumulating observations from our respective corners of the industry, and at some point, one of us said to the other: we should just call people and ask.


The range of what we heard was striking. At one end, a senior executive at a large commercial publisher told us that AI work was literally keeping him up at night. The reason wasn’t anxiety. He had to review AI licensing contracts in the evenings because his day job was still making ScholarOne work. At the other end, a consultant described European editorial board members who wanted their scholarly association to ban AI in all its forms and issue position statements opposing it entirely, and who greeted outside strategic advisors with the warmth of “who are these wankers in suits coming in to talk business-speak to us?” 

In between were people who were thoughtful, conflicted, and almost universally under-resourced. The publishing head of a large engineering society told us his biggest fear wasn’t that AI would steal his organization’s business but that it would “do a really nice simulation of what we do” — ruining the basis of the market without actually competing in it.

We are not arm’s-length observers. Todd works at Wiley; Angela is vice president of publishing at the American Society of Clinical Oncology (ASCO). We’ve both worked on AI strategy, licensing, and editorial policy from inside publishing houses. What we brought to these conversations wasn’t neutrality. It was familiarity, and increasingly, our own confusion.

The more people we talked to, the less certain we became about the conventional wisdom. The industry’s dominant AI narrative — protect the content, do a training deal, pilot some integrity tools, wait for the courts — started to feel like it was organized around the wrong questions.

What follows is our attempt to say what we think the right questions are. We’ve organized it loosely, by theme rather than by respondent, and we’ve let people speak for themselves where their words were better than ours. That happened often.

One logistical note: this will run in two parts. The first, which you’re reading, covers the business landscape: licensing, revenue, demand, and the distinction we think matters most. The second, forthcoming, covers the people and the machines: organizational readiness, peer review, and where this is all headed.

Weights vs. Context

The distinction that matters most isn’t the one we’ve been arguing about.

For two years, the industry has been fixated on whether tech companies should pay to train their models on copyrighted content — and the flip side, whether publishers should take the check. It’s a debate that generates lawsuits and Napster analogies at board meetings, but it misses the strategic point. The mental model behind it is at once accurate and increasingly irrelevant. The machine eats the content, stores it in its weights, and regurgitates it for answers. Fine. But that’s not how these systems do their most meaningful work.

Steven Heffner at IEEE gave us the sharpest version of what’s at stake. The article, he said, persists as the basic unit of scholarship because it captures something irreducible: “an inquiry, an experiment, a report, an interpretation, and a communication of a moment in time.” When you reduce that to vectorized math in a neural network’s parameters, the ideas survive but the argument structure dissolves. 

“It’s a castle of sand you can’t build upon afterwards,” Heffner said. You can’t issue a correction to knowledge that’s been distilled into mathematics, refreshed once a year by a handful of the richest companies in the world. One publisher we spoke with called this “copywashing,” the intellectual equivalent of money laundering, where the signal survives but the source becomes untraceable.

But here’s the thing: the AI companies largely agree. Their business imperative is ramping up capacity and holding margins on inference, while still investing enough in R&D to have competitive models. Anthropic CEO Dario Amodei made this point explicitly in a recent interview: scaling laws are a business constraint, not an IP land-grab strategy. The distinction is worth maintaining.

They are training on a wide corpus so that the model can reason, not with the goal of faithfully regurgitating it. They want models that can think, not recite, and 300 billion parameters’ worth of verbatim text is a lot of cognitive baggage to host and serve. As an analogy close to home for Todd: walking around with every lyric of Billy Joel’s “Scenes from an Italian Restaurant,” the complete saga of Brenda and Eddie, encoded in your head, might actually make you a less efficient person.

The models don’t want your verbatim content in their weights any more than you want it there. Andrew Smeall at Sage put the publisher’s side of that logic plainly: “We’re pretty careful not to let them have the front list, the most valuable stuff.”

The flip side of that caution is the question he’s now actively working on: “How do we get the content delivered via things like Wiley Gateway, or should we be building our own version?” — meaning a licensed, metered pipeline that delivers content to agents at the moment they need it, rather than baking it into the model’s weights.

This is the distinction. Content in the weights is a one-time transaction with a depreciating asset: the model that ingested your corpus in 2024 will be superseded by 2026, and you’ll have no claim on its successor. Content accessed as context — retrieved at inference time through what the industry is starting to call tool use, or agent architectures, or model context protocol (MCP) and command-line interface (CLI) endpoints — is an ongoing relationship. The article can be updated, corrected, or retracted. The access can be metered, licensed, or renewed. The source can be cited. Your goal is to be the machine that the agents call back.
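
To make the distinction concrete, here is a minimal sketch of what the publisher’s side of a subscribe-to-context endpoint could look like. Everything in it is hypothetical (the DOIs, the corpus, the function names, the metering scheme), and a real implementation would sit behind an MCP server or a licensed API rather than an in-memory dictionary. The point is only that retrieval-time delivery serves the current, citable version of a record and counts every retrieval against an agreement.

```python
from dataclasses import dataclass

@dataclass
class ContextChunk:
    doi: str          # persistent identifier the agent can cite
    version: int      # the current version, not a frozen training-time snapshot
    retracted: bool   # retraction status resolved at query time
    text: str         # the passage delivered into the model's context window

# Toy stand-in for a publisher-controlled search index (DOIs are invented).
CORPUS = {
    "10.1234/demo.001": {"version": 3, "retracted": False,
                         "text": "Corrected results on receptor binding affinity."},
    "10.1234/demo.002": {"version": 1, "retracted": True,
                         "text": "Withdrawn findings; see the retraction notice."},
}

# (licensee, doi) pairs: usage metering at the retrieval layer.
USAGE_LOG: list[tuple[str, str]] = []

def retrieve_context(query: str, licensee: str) -> list[ContextChunk]:
    """Serve content at inference time instead of baking it into weights.

    Each retrieval is logged against the licensee's agreement and returns
    the current version of the record, with the metadata an agent needs
    to cite the source, or to skip it.
    """
    hits = []
    for doi, rec in CORPUS.items():
        if query.lower() in rec["text"].lower():  # stand-in for real relevance ranking
            USAGE_LOG.append((licensee, doi))
            hits.append(ContextChunk(doi, rec["version"], rec["retracted"], rec["text"]))
    return hits
```

Because the relationship lives at the retrieval layer, a correction or retraction issued tomorrow reaches the very next query, which is exactly what a one-time training ingest cannot offer.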

When an editor, a board, or an executive says that they want their organization’s content in “all the AI places,” what they mean is that they want exposure and attribution — not to be an anonymous source of knowledge. 

“The agent harness is functioning in the role of the librarian,” one publisher told us. “In the past, you would ask a librarian, which sources should I consult? And the agent should be able to do that.” It can limit its search to authoritative sources in a given field. It can check whether a paper has been retracted. It can do all the things that scholars do when they consult a body of literature rather than recite from memory, but only if the content is there to be consulted and the agent is aware of the pathways and rules of a scholarly search. The librarian of tomorrow will be serving patrons and the agents of patrons.
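
Continuing the hypothetical sketch above, the librarian behavior is a filter the agent harness applies before anything reaches the model: restrict retrieval to sources the field treats as authoritative and drop retracted work. The allowed DOI prefix is, again, invented for illustration.

```python
def librarian_filter(chunks: list[ContextChunk],
                     allowed_prefixes: tuple[str, ...] = ("10.1234/",)) -> list[ContextChunk]:
    """Consult the literature the way a librarian would, rather than recite
    from memory: keep only sources the field treats as authoritative, and
    never pass retracted work into the model's context."""
    return [c for c in chunks
            if c.doi.startswith(allowed_prefixes) and not c.retracted]

# An agent harness would call the metered endpoint, then vet the results:
vetted = librarian_filter(retrieve_context("receptor binding", licensee="acme-agent"))
```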

Not everyone we talked to was persuaded. “What is context now will be training input later,” one publishing director told us. “Is it a meaningful difference?” 

That’s a fair challenge. A licensee with access to your content for retrieval-augmented generation (RAG) today could fine-tune on those same passages tomorrow. The subscribe-to-context model that we are framing here depends on enforcement mechanisms that don’t fully exist yet and on counterparties that may not exist at all in 18 months, given current burn rates in the AI sector. And Smeall himself offered a sharp caveat: for extremely specialized research questions like which neurotoxins bind to which receptors, “you kind of can’t reason about that unless you have that content” in the model itself. The line between weights and context blurs at the edges of deep domain knowledge.

There is, however, a third scenario that complicates the hopeful version of this story, and it’s already happening. Call it “Big RAG.” Companies like Open Evidence aren’t just building AI tools; they’re assembling the context layer themselves, aggregating both the content and the audience into a single platform.

Open Evidence has licensed journal content from publishers, built a clinical AI interface around it, attracted a critical mass of physicians, and raised capital at a $12 billion valuation — more than double the market cap of Wiley and Springer Nature combined. 

Michael Clarke, a leading advisor to publishers and learned societies, let that number hang in the air for a moment before adding: “for somebody that’s been around for three years.”

This is a network play, not a technology play. 

Open Evidence’s moat isn’t its model; it’s the fact that physicians are already there, asking questions, developing habits, and increasingly being exposed to targeted pharma ads. For the publishers who did deals early (New England Journal of Medicine among them), the arrangement has upside: Open Evidence links directly to the journal rather than to PubMed, driving significant traffic back.

For publishers who didn’t do deals or whose content was scraped before anyone thought to ask, the dynamic is less friendly. As one publisher at a medical society learned first-hand, some of these companies were refreshingly candid about the terms of engagement. “One of them literally said to me,” she recounted, “‘if you had put a login in front of your guidelines, we wouldn’t have taken them.’”

The risk here is that subscribe-to-context becomes subscribe-to-someone-else’s-context. 

If the aggregator controls the audience and assembles the content graph, publishers become suppliers to a platform, which is a familiar position in this industry, and not a powerful one. 

“It’s just content to them,” a medical publisher observed. “Everything’s flat. There’s no hierarchy.” The “premium” sources are really just defined as the sources that made a deal, not necessarily the highest quality content. The question for publishers isn’t just whether to make content available at inference time. It’s whether they’ll be the ones holding the context or whether they’ll have handed that position to a new flavor of intermediary.

The legal infrastructure for the subscribe-to-context model is catching up — query-based licensing, attribution-preserving APIs, and usage metering at the retrieval layer. The publishers building for that future are building something defensible, even if imperfect. But the window matters. The alternative is to keep arguing about who owes whom for what already happened, while the machines and the aggregators move on without you.
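
As a rough illustration of what usage metering at the retrieval layer could mean, the retrieval log from the earlier sketch can be rolled up into per-licensee line items, the way an API gateway meters any other service. The flat per-query rate is purely a placeholder; as the pricing discussion later in this piece makes clear, no benchmark for these numbers exists yet.

```python
from collections import Counter

def meter_usage(usage_log: list[tuple[str, str]],
                rate_per_query: float = 0.02) -> dict[str, tuple[int, float]]:
    """Roll the retrieval log up into (query count, amount owed) per licensee.
    The flat rate is illustrative only; real deals might price RAG access,
    full-text delivery, and metadata differently."""
    counts = Counter(licensee for licensee, _doi in usage_log)
    return {licensee: (n, round(n * rate_per_query, 2))
            for licensee, n in counts.items()}
```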

Authority as Product

You can dissolve the content into math. You can’t dissolve the expertise.

As with most things in scholarly publishing, there is no one-size-fits-all with AI use or content demand. Whether there is money in licensing is largely dependent on the discipline and the adoption of AI by the community it serves. 

Computer science content is valuable for training models because there is a strong use case for AI in the development of code. Content that can inform practice is at a premium: the American Society of Civil Engineers (ASCE) publishes the standards that inform civil infrastructure and building projects globally.

Dana Compton, ASCE managing director and publisher, explained where she sees potential for the use of AI. “We see interest from some corporate subscribers to add AI rights for their internal use of content. But mostly our opportunities lie in adding AI features to our existing platforms, particularly standards,” Compton said. Standards are similar to clinical practice guidelines in medicine in that they are evidence-based and expert-developed minimum recommendations for different kinds of civil infrastructure projects in varying conditions. 

In the clinical medicine space, there is a high demand for the clinical guidelines published by societies, as these inform practice, and there is no shortage of commercial tools designed to provide clinical decision support that are now baking in AI features. 

For publishers outside these corridors, the phone is quieter. One specialized scientific society we spoke with publishes nearly 2,000 rigorously peer-reviewed articles a year and hasn’t received a single AI licensing inquiry — not for lack of rigor or volume, but because the commercial use cases driving early deals skew heavily toward clinical decision support and consumer-facing tools. The models want scale, the domain players want a practitioner angle, and not every discipline has both to offer.

These are the opportunities publishers can see and size. We wonder if the demand that will reshape the landscape isn’t coming from specialized tools, but rather from the platforms that already own the audience.

Anthropic, OpenAI, and Perplexity are all building healthcare verticals off the back of their general-purpose large language models (LLMs). They state that they want users to be able to ask detailed medical questions and even upload their medical records in order to get personalized answers. 

Before you ask yourself why anyone would want to do this, know that uploading doctor notes and lab results is already common on these sites. The United States is one of two countries that permits direct-to-consumer pharmaceutical advertising, and decades of it have trained a population to research conditions, ask about specific drugs, and seek medical information without a clinician in the room. The AI health platforms are where those consumers are going now, and the advertising dollars will follow. 

In China, Baichuan didn’t bother with the physician intermediary at all. Its medical AI app went direct to patients from the start, positioning itself not as a clinical-decision support tool but as what the company calls a “health gatekeeper,” and its latest model now rivals GPT-5 on medical benchmarks. 

If Open Evidence is “Big RAG” with the doctor still in the room, Baichuan is what happens when the aggregator decides the doctor isn’t necessary.

Either way, the large LLM providers want medical journal content — but not, so far, on context terms. The deals on offer still tend toward broad perpetual licenses for pretraining, reinforcement learning, and general use.

Subscribe-to-context may be where this is headed, but it is not where most negotiations start. Meanwhile, these same companies are building domain-specific inference workflows in which clinical literature is pulled into context and combined with the first-party data they collect to answer questions on the fly. The largest technology companies in the world are now building the same kind of intermediary platform as Open Evidence, aggregating both the content and the audience into workflows that publishers don’t control.

The difference is scale: when Anthropic or OpenAI builds a healthcare vertical, they’re not thinking about eating UpToDate’s lunch and getting all the same publishers officially onboarded. They’re a platform with hundreds of millions of users, and the question of whether your content ends up in their context layer may not be entirely up to you.

The demand landscape for AI rights has, in any case, split into two distinct worlds:

  • Pharma and corporate customers are showing up with checkbooks at the licensing door, eager to secure rights to feed content into their closed discovery systems. 
  • Academic institutions, standing at the subscription door, are largely absent from the conversation, but not for a single reason.

Some simply haven’t noticed there’s a conversation to join. Researchers are uploading PDFs to ChatGPT without any awareness that their library subscription doesn’t cover that use. “It’s so easy to use the tools,” said one Chief Publishing Officer, “that nobody is bothering to ask for the rights. Libraries won’t drive this.” Others are asking for broad AI rights as part of their subscription spend, slipping new terms into renewal tenders that publishing house lawyers easily deflect.

But underneath the posturing is a genuine strategic collision. Should a library subscription cover AI use?

The question sounds administrative; the implications are structural. Publishers who say no are telling their longest-standing customers that the most transformative application of the content they’ve been paying for requires a separate check, which, to a librarian, feels like paying twice. 

Publishers who have effectively bundled AI rights into the subscription price may end up undercutting their own licensing deals with corporate buyers paying premium rates for the same access. This isn’t a pricing problem. It’s a paradigm problem: the subscription model and the licensing model are converging, and in some cases cannibalizing each other, and most publishers haven’t reckoned with what that means for either revenue line.

And then there are those who have arrived at their position not through ignorance but through a kind of exhausted clarity, people who have watched enough cycles of technological promise to know that the demo is never the product, and that the product is never the transformation. 

One society publishing executive told us her team evaluates every opportunity through a single lens: does it drive scientific excellence for the community we serve? AI licensing, so far, doesn’t pass that test. 

“I don’t feel like we’re there,” she said — not defensively, but with the steadiness of someone who knows what her institution is for. At the further end of this spectrum, one consultant who advises learned societies across Europe encounters active resistance rooted in scholarly identity: editorial board members and librarians who view LLMs as fundamentally incompatible with the enterprise of scholarship, finding them unreliable, environmentally costly, and a kind of epistemic contamination. “This is our discipline, this is our turf.”

And in the middle of all of this, everyone is improvising on price. One publishing leader told us she had changed her pricing model four times in three months. Each potential licensee arrives with a use case you hadn’t thought of, an audience you hadn’t considered, a technical architecture you hadn’t anticipated. “And now we’re back to the drawing board,” as one publishing house leader put it. 

The grammar for these deals doesn’t exist yet: there is no standard for what a RAG license costs versus a training license, no benchmark for how to price API access versus full-text delivery, and no consensus on whether metadata and full text should be priced differently or whether commercial and non-commercial use should be distinguished. Nobody knows what anything costs, and no one wants to set the floor too low.

And yet, for all the chaos, a few publishers are discovering that their most defensible asset was never the content itself. 

Jasper Simons at the American Psychological Association described a new venture called APA Labs that doesn’t license content to AI platforms; it evaluates them. Technology companies building mental health AI tools come to APA for an independent assessment of whether their products meet psychological science standards. “AI providers don’t want their platforms encouraging someone’s child to commit suicide, or the lawsuits that follow,” Simons told us. 

Karla Soares-Weiser, the new CEO of Cochrane, described a similar reckoning: in an AI future, Cochrane’s product can’t simply be Cochrane Reviews. The organization needs to tap into what sits behind the reviews: the methodology, the expertise, the network, the brand. These aren’t content licensing plays. They’re authority plays, and they suggest that what publishers have to sell in an AI economy isn’t just text but judgment. The content can be dissolved into weights or assembled into someone else’s context. The expertise — the capacity to evaluate, certify, and set the standard — cannot.


What we’ve described so far is, in a sense, the good problem. There is demand for scholarly content in AI systems. There is a plausible model (subscribe-to-context) for how publishers can participate in that demand without giving away the store. There are real deals being done, real revenue flowing, real infrastructure being built. The business case, if you squint, is there.

But every publisher we talked to, without exception, said some version of the same thing: we don’t have the people for this. Not the technology, not the strategy. The people. The organizational capacity to evaluate these deals, build these systems, manage these relationships, and do it all while still getting the journal out on time.

In Part 2, we’ll turn to that problem: the organizational readiness gap, the peer review question, and the uncomfortable realization that the industry may need to transform faster than its institutions were designed to move.

Angela Cochran

Angela Cochran is Vice President of Publishing at the American Society of Clinical Oncology. She is past president of the Society for Scholarly Publishing and of the Council of Science Editors. Views on TSK are her own.

Discussion


This is a hugely valuable contribution. And it makes clear the rapid learning curve around AI which all of us have to undertake – every few months. Last month, seeking to focus some of the learning as well as the strategy arguments, I started The Coalition of the Curious at Patreon/davidworlock. I actively seek to build a forum there. Thank you Angela and Todd for illuminating just how confused and confusing all of this is to our businesses at present.

This is a very thoughtful piece, and the off-panel candor is particularly valuable. What comes through clearly is that publishers are not rejecting AI outright — rather, they are struggling with the absence of the infrastructure needed to participate in it with confidence.

Across the conversations described here, the same concerns surface repeatedly: visibility into how content is used, the ability to control access to that content, and a fair mechanism for sharing in the value created when it informs AI systems. These are not purely legal or policy questions; they are fundamentally technical ones.

Historically, new digital markets tend to stabilize once the right infrastructure layers emerge. Online payments, for example, did not scale until trusted authorization networks made transactions visible, auditable, and compensable for all parties involved. Something similar appears to be missing in the relationship between AI systems and the scholarly knowledge ecosystem.

At the moment, most AI systems interact with publisher content in ways that are opaque to the very organizations that curate, validate, and invest in that knowledge. Without mechanisms that allow publishers to see when and how their content is used, enforce usage policies, and participate economically in downstream value creation, it is understandable that uncertainty and caution dominate the conversation.

What may ultimately be required is a neutral technical layer that allows AI systems to interact with authoritative knowledge in a transparent, rights-aware, and economically aligned way. Such an approach would not simply protect publishers’ interests; it could also help AI developers gain reliable access to high-quality curated content — something that increasingly appears essential as models move into scientific, educational, and professional domains.

If the next phase of the AI era depends on deeper engagement with the world’s validated knowledge, then the scholarly publishing community will play an essential role. The challenge — and opportunity — is building the infrastructure that allows that participation to happen safely and sustainably.

Really great article here dealing with incredibly difficult questions of the day. Meanwhile, the open web continues to shrink, and user attention is shifting to answers from AI rather than pages read online. Unfortunately, publishers should expect that to continue and accelerate.

Thank you Todd and Angela for naming this so clearly. The weights-versus-context distinction resonates deeply in the archival world, where the stakes of irreversibility run even higher. Once provenance collapses and a source becomes untraceable, you don’t lose a revenue stream — you lose the evidentiary integrity of the historical record itself. The UVA Archival AI Protocol was built around exactly this problem. The ‘castle of sand’ problem is not just a publishing problem. It’s a memory problem. https://doi.org/10.18130/5dqf-9w86

What struck me most while reading this is the disconnect between the People Building AI Tools and the People Producing or Publishing Content that feeds the AI tools. It seems to me those two groups have totally different motivations/goals/measures of what success looks like, which means they are operating on totally different incentive systems, and that is causing challenges that will be complicated to resolve. What is obvious to one group may not be obvious to another group, and that fact doesn’t seem to always be acknowledged or recognized when the two groups are negotiating what kind of relationship they might have. What matters to one group may not matter much, if at all, to the other group. Thanks for a very thought-provoking post!
