Artificial intelligence is reshaping scholarly publishing at a pace that is challenging many organizations’ ability to respond and set strategy, generating both genuine excitement about what is possible and serious concern about potential consequences.
Todd Toler and I have been in an extended conversation about AI, publishing, and scholarly infrastructure for a long time — the kind of ongoing back-and-forth where you find yourself returning to the same hard problems from different angles and perspectives as you try to make sense of the complexities.
When Todd announced he was joining Ithaka S+R as Practice Lead for AI in Scholarly Communication beginning June 8, after almost two decades at Wiley, I wanted to understand what had crystallized for him, and why this role, at this moment.

Thanks for taking the time to do this interview. You’re joining Ithaka S+R as the inaugural Practice Lead for AI in Scholarly Communication — what will you be working on, at least initially, and why Ithaka S+R specifically rather than a publisher, a vendor, or a trade association?
At a certain point, you start thinking about how you actually want to spend your time. I know I like writing and research more than I like managing large organizations. These questions about AI and scholarly infrastructure need space and concentration that is hard to carve out inside a large commercial publisher, no matter how much freedom you’re given.
Wiley gave me considerable latitude to experiment: to learn AI workflows, to think through the infrastructure problems, to follow threads wherever they led. That eventually led me to pitch this practice concept to Ithaka S+R, because I came to see that the work requires a broader perspective than any single publisher can offer. The practice is focused on two things: advisory work with a broad spectrum of publishers, platforms, and infrastructure providers on AI strategy and product architecture, and development of agent-era infrastructure strategies and shared approaches for evaluating whether AI tools are meeting scholarly standards.
What makes Ithaka S+R specifically right for this is its ability to convene all sides of the conversation at once: publishers, libraries, funders, platforms, and infrastructure providers. If the problems I want to work on are genuinely cross-sector — and I think they are — then my position has to be genuinely neutral. Publishers and libraries both trust Ithaka S+R, and that trust is a critical resource.
That’s a really helpful framing. Let’s get more into the specific work you envision. You and I have talked about how a retrieval-augmented AI system decomposes articles into semantic chunks and vector embeddings, and that a core problem is that provenance does not automatically survive this chunking process. Can you unpack that for Scholarly Kitchen readers who aren’t familiar with what happens to scholarly content inside a retrieval-augmented generation (RAG) system?
A RAG system optimized for retrieval efficiency starts by decomposing a corpus of scholarly articles into many smaller semantic fragments — chunks, passages, paragraphs — that can be rapidly retrieved as potential answers to questions. From the system’s perspective, the article itself is no longer the primary unit of meaning. The corpus becomes, in effect, a large pool of candidate answers optimized for retrieval speed and relevance scoring.
That works well if the goal is answering questions quickly — for instance, in an AI chatbot grounded in a particular work. It works much less well when what you are actually traversing is a scholarly evidence network built around citation relationships, versioning, peer review status, retractions, corrections, and disciplinary conventions across multiple sources. Those signals live at the level of the scholarly object and its surrounding context. They do not naturally survive decomposition into isolated semantic fragments.
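To make the decomposition step concrete, here is a minimal Python sketch. It is an illustration only, not any vendor's actual pipeline: the `Article` and `Chunk` types, their field names, and the fixed-size word-window splitting are all assumptions chosen for brevity. It contrasts a naive splitter, which returns bare text, with one that copies the article-level signals onto every fragment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Article:
    """Hypothetical scholarly object; field names are illustrative assumptions."""
    doi: str             # persistent identifier of the article
    version: str         # e.g. "preprint" or "version-of-record"
    peer_reviewed: bool
    retracted: bool
    text: str


@dataclass(frozen=True)
class Chunk:
    """A fragment that still carries the signals of the article it came from."""
    text: str
    doi: str
    version: str
    peer_reviewed: bool
    retracted: bool
    position: int        # index of this fragment within the article


def naive_chunks(article: Article, size: int) -> List[str]:
    """Split into fixed-size word windows; all article-level signals are dropped."""
    words = article.text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def provenance_chunks(article: Article, size: int) -> List[Chunk]:
    """Same splitting, but each fragment keeps DOI, version, review and retraction status."""
    return [
        Chunk(
            text=text,
            doi=article.doi,
            version=article.version,
            peer_reviewed=article.peer_reviewed,
            retracted=article.retracted,
            position=position,
        )
        for position, text in enumerate(naive_chunks(article, size))
    ]
```

In the naive version, a passage from a retracted article is indistinguishable from any other string once it is in the index. In the second version, the retraction flag is at least available at retrieval time, although it is only a snapshot taken when the article was chunked.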
But the deeper issue is that provenance in an AI environment is not just a metadata problem around these “chunks.” It is really an infrastructure coordination problem. The challenge begins the moment an AI system accesses scholarly content, which can be a PDF, a figure, a supplementary file, or a semantic chunk extracted from any of these. Once that content moves through an agent pipeline, important scholarly signals can easily get detached from the material itself.
Kevin Kelly has a formulation I like: “the facts science calls true are provisional, deemed true by method until we prove otherwise.” That is why this is not just a content problem. Scholarship’s epistemic role depends on preserving the chain of custody around claims: how they were produced, reviewed, corrected, contested, and connected to other claims.
So the real question is: how do we create a trustworthy chain of custody for AI-mediated scholarly retrieval? Part of the answer may be metadata standards that travel with retrieved content. Part may be extending the persistent identifier (PID) infrastructure to do lookups and answer questions like, “Is this DOI real? Is this author real? Is this version current?” Part may be behavioral contracts between the AI agent and the source, for example, expectations that AI agents report retrieval activity or preserve attribution back to the originating source. And part of it will likely require shared evaluation frameworks so publishers, libraries, and institutions can independently verify whether AI systems are actually preserving provenance, version status, peer review signals, and retraction information.
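The lookup questions above ("Is this DOI real? Is this version current?") can be sketched as a check that runs on each retrieved fragment before it reaches a generated answer. This is a sketch under stated assumptions: the `registry` dictionary is a hypothetical in-memory stand-in for whatever PID infrastructure would answer these questions in practice, and the type names, field names, and reason strings are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RecordStatus:
    """What a (hypothetical) PID lookup reports about a DOI right now."""
    current_version: str
    retracted: bool


@dataclass(frozen=True)
class RetrievedChunk:
    """A fragment returned by retrieval, with the provenance it carried."""
    text: str
    doi: str
    version: str


@dataclass(frozen=True)
class Verdict:
    chunk: RetrievedChunk
    usable: bool
    reason: str


def check_chunk(chunk: RetrievedChunk, registry: Dict[str, RecordStatus]) -> Verdict:
    """Answer three questions: is the DOI known, is it retracted, is the version current?"""
    status: Optional[RecordStatus] = registry.get(chunk.doi)
    if status is None:
        return Verdict(chunk, False, "unknown DOI")
    if status.retracted:
        return Verdict(chunk, False, "retracted")
    if chunk.version != status.current_version:
        return Verdict(chunk, False, "superseded version")
    return Verdict(chunk, True, "ok")
```

The design choice worth noticing is that the status is looked up at retrieval time rather than trusted from the index, since a retraction or correction may have been issued after the content was chunked. Whether an agent should drop, flag, or merely annotate an unusable fragment is a policy question this sketch leaves open.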
My own view is that what is probably required is some combination of all of those approaches working together. We used to design for a world of readers. Now, I think, we are in large part designing for a world of programmers (a researcher using Claude Code or ChatGPT Codex is, in effect, a programmer), or, to use a phrase some are starting to adopt, “computational readers.” Publishers don’t just need to think about normalizing their content; they need to normalize the way their APIs behave as well.
So, that’s clearly a problem, and quite a problem indeed! How are you envisioning potential solutions?
One limitation of the current RAG stack is that it is overfitted to text: retrieve passages, summarize passages, cite passages. But scholarship is not only prose. Evidence lives in figures, tables, datasets, methods, supplementary files, code, images, and instrument data. Scholar-ready AI has to reason over that broader evidentiary environment, not just retrieve convincing text. This is going to require publishers to improve their underlying content models in ways we’ve always known needed to happen, but AI will finally be the catalyst to drive change. Some of the problems are so basic, too. AI agents struggle to access PDFs and extract information from figures and tables, and they burn tokens while traversing authentication infrastructure not built for them. These are just some of the documented failures in current systems.
The hardest problem is bringing about collective-action solutions. Major publishers and aggregators are each building proprietary solutions and evaluation frameworks and defining “quality” on their own terms. Institutional buyers cannot compare tools across platforms or verify the claims vendors make in procurement conversations. The trajectory, without intervention, is a fragmented landscape of siloed approaches, which is exactly the wrong outcome at the moment when shared infrastructure and business model conventions could still be shaped.
Ultimately, the challenge is less about any single technical fix than about establishing shared conventions for how AI systems interact with scholarly content. That includes how agents are granted rights to retrieve content, how provenance and version signals travel through retrieval pipelines, how attribution and retrieval events remain visible across the chain of custody, and how institutional buyers can independently evaluate whether tools are actually preserving scholarly integrity signals in practice.
I suspect the eventual answer will involve some combination of metadata standards, PID infrastructure, standardized retrieval behaviors, provenance payloads, and shared evaluation frameworks working together. The important thing is that these conventions become interoperable across the ecosystem rather than being reinvented independently by every platform, publisher, and AI company.
As challenging as these problems are, it is exciting to think about what’s possible if we can pull together and design this new ecosystem to serve scholars and scholarship. What role do you hope you can play in bringing potential solutions into reality?
Before Wiley, I was a qualitative researcher — leading usability studies and ethnographic fieldwork, mostly for consumer brands. That is actually where I started. In some ways, I think I am returning to that mode more than I am starting something new.
What Ithaka S+R adds to that is a set of research design methodologies that are genuinely exciting for this problem. The cohort study model, where you seek broad participation to pool resources and conduct larger studies with deeper scope, and where participating organizations can conduct their own federated research that rolls up into aggregated findings, is exactly the right structure for trying to build consensus on what “scholar-ready AI” actually requires.
You can’t produce a collective, shared standard by issuing it from the top. You build it by convening the stakeholders around shared empirical work and letting the evidence do the convincing. I’m excited to be at the nexus of bringing these communities together.
There are a lot of other groups working in this space as well. COUNTER is developing AI usage metrics, NISO is working on provenance standards, COPE is working on guidelines, and so on. What other organizations and stakeholders do you envision engaging as you do this work?
STM is one that you didn’t mention but is critically important, particularly as STM Solutions becomes a home for shared publisher technology infrastructure, a model that GetFTR and the STM Integrity Hub have established. If you want provenance signals to travel reliably across publisher platforms, publishers need a place to act collectively to build the necessary infrastructure, and STM is well-positioned for that.
There is a dense and diverse ecosystem here: organizations that register and maintain PIDs, develop and update standards, and coordinate collective action across different use cases. They were all designed for a world where content moved in relatively predictable ways, where you could trace a retrieval event back to a rights framework. In a world where AI agents traverse the scholarly record, those pathways change in ways nobody has fully mapped yet.
Which organizations become load-bearing in the agent pathway, and which need to adapt their models, are exactly the strategic questions I want to concentrate on. I don’t think anyone has a complete answer yet, and I’m skeptical of anyone who claims to have one.
If the coordination infrastructure you’re describing existed and functioned — let’s say three years from now — how would that impact publisher-library relations?
The thing I care most about is the diversity of the journal ecosystem. People focus on the large commercial publishers, and understandably so, but the scholarly publishing industry has an extraordinarily long and diverse tail of smaller and niche publishers, including library publishers. When I was on the Crossref board, the membership was something like 11,000 organizations that were originating scholarly content and minting DOIs; now there are more than 25,000. That is what scholarship actually is: the epistemology of how new knowledge gets created across thousands of disciplines, society presses, university presses, and companies. That diversity is the thing worth protecting.
What I’m hoping for is a world where the business model has been clarified. The most sustainable path for publishers isn’t the one-time value of training data licensing – it’s ongoing subscription access to authoritative content at runtime, where the AI retrieves from the live scholarly record rather than from weights trained years ago. That model requires the provenance infrastructure to work, but it also aligns with what many libraries and researchers actually want: traceable, rights-aware, and attributable access. When the business incentives and the scholarly integrity requirements point in the same direction, the relationship gets easier. That’s what winning looks like to me.
Is there anything I haven’t asked you about that you’d like to add?
Two things I want to make sure don’t get lost.
The first is that I’m not starting from scratch at Ithaka S+R. Roger Schonfeld turned it into one of the most important voices in the publisher-library-infrastructure conversation over many years. Roger is now directing his energy toward JSTOR Stewardship and new product work. These are significant shoes to step into. Roger and I became friends partly because we kept showing up trying to address the same hard problems from different angles, just as you and I have, and the practice I’m building is in real ways a continuation of work Ithaka S+R has already been doing. I’m looking forward to carrying that torch.
The second is something about this conversation itself. Lisa, you spent more than three decades in Illinois libraries building some of the most rigorous library thought leadership in the profession, especially when it comes to technology, and now you’re moving into an independent practice after you retire later this month. We are arriving here in our second careers more or less simultaneously. I see us as having a lot of important work to do together, and I think this is a particularly interesting time to be doing it. I’m genuinely looking forward to it.
It is an exciting time. I feel like so many of the dreams I had as an early-career librarian about how technology had the potential to enable discovery, advance scholarship, and personalize teaching are becoming a reality. As you mentioned earlier, there is a tremendous opportunity to shape the development pathway. I look forward to keeping up our conversations and seeing what you are able to build in this new role at Ithaka S+R.
Discussion
2 Thoughts on "Building Scholar-Ready AI: A Conversation with Todd Toler"
Thanks for sharing your insights, Todd, really valuable as always. For those interested, STM is currently running a consultation on the Responsible Use of Research Content in Generative AI: https://stm-assoc.org/genai_consult/. Feedback from all perspectives will be very welcome.
Thanks Todd for your thought provoking post. On a personal note, I think one of the most important – and urgent – collective actions all stakeholders could now get behind is to align and collaborate around shared, open and interoperable metadata standards that enable us to trace, track and connect any chunk of scholarly knowledge. This is what the Barcelona Declaration, launched in 2024, is aiming to coordinate. For those wanting more information about this initiative go to https://barcelona-declaration.org/