Scholarly publishing is under pressure. Submissions keep rising, integrity threats keep evolving, reviewer capacity hasn't kept up, and peer review struggles to keep pace with demand. Output and submissions have grown rapidly: ~897,000 more indexed articles in 2022 than in 2016 (≈5.6% average annual growth), driven by broader global participation, cross-disciplinary collaboration, faster research tooling, and a rise in low-quality or fabricated work from paper mills. At the same time, AI is moving from clever prototypes to workflow partners: more knowledgeable in narrow domains, better at reasoning, increasingly multimodal, and capable of agentic work.
The opportunity isn't to bolt AI features onto old systems. It's to rethink and redesign for human–AI collaboration, shifting from rule- and experience-based workflows to data- and agent-driven processes with guardrails that are reliable at scale. These shifts touch not only technology, products, and operations, but also vision, strategy, and culture.
What does that look like in practice? Here are some of my thoughts.

Preprints: From Repositories to Research Accelerators
Preprints are increasingly central to open science and author-centric publishing. Many services — such as arXiv and openRxiv — still rely on volunteer expertise, donations/memberships/grants, and minimal infrastructure: fine at modest volume, fragile at scale. Sustainability now depends on automation that reduces operating expenses while preserving trust. Trust also wobbles when low-quality or AI-generated submissions spike. arXiv, for example, will no longer accept non-peer-reviewed surveys or opinion pieces. Discovery feels dated when the default is “search, download, read” in a world where natural-language questions should yield instant synthesis.
The pivot: treat the preprint as the point to enrich, verify, and enable discovery, not just host. Translation and summarization widen access. Automated or community quality and integrity checks should attach to the record (not live in inboxes) and travel downstream to speed journal triage. Auto journal suggestions and structured handoffs can make “find a home” far smoother. If we capture consistent metadata and formats such as JATS XML at ingest, we stop paying the “formatting tax” later.
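To make the "formatting tax" point concrete, here is a rough sketch (in Python, with hypothetical field names) of capturing structured metadata at ingest and emitting a minimal JATS-style front-matter skeleton. It is illustrative only; a production pipeline would validate against the full JATS schema.

```python
# Minimal sketch: capture structured metadata at preprint ingest and emit a
# JATS-style front-matter skeleton. Illustrative only; the `submission` dict
# and its keys are hypothetical placeholders.
import xml.etree.ElementTree as ET

submission = {
    "title": "Example preprint title",
    "authors": [{"given": "Ada", "surname": "Lovelace", "orcid": "0000-0000-0000-0000"}],
    "abstract": "One-paragraph abstract captured at ingest.",
    "doi": "10.1234/example.12345",
}

article = ET.Element("article", {"article-type": "research-article"})
front = ET.SubElement(article, "front")
meta = ET.SubElement(front, "article-meta")

ET.SubElement(meta, "article-id", {"pub-id-type": "doi"}).text = submission["doi"]
title_group = ET.SubElement(meta, "title-group")
ET.SubElement(title_group, "article-title").text = submission["title"]

contrib_group = ET.SubElement(meta, "contrib-group")
for a in submission["authors"]:
    contrib = ET.SubElement(contrib_group, "contrib", {"contrib-type": "author"})
    ET.SubElement(contrib, "contrib-id", {"contrib-id-type": "orcid"}).text = a["orcid"]
    name = ET.SubElement(contrib, "name")
    ET.SubElement(name, "surname").text = a["surname"]
    ET.SubElement(name, "given-names").text = a["given"]

abstract = ET.SubElement(meta, "abstract")
ET.SubElement(abstract, "p").text = submission["abstract"]

print(ET.tostring(article, encoding="unicode"))
```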
Preprints are also ideal for community engagement, which is essential to open science. Soundness, novelty, and reproducibility ratings, paired with lightweight commentary, help editors and readers separate signal from noise. Add reproducibility capsules (data + code + environment) that can be executed on demand. Beyond that, platforms can host interest groups, vendor sandboxes where tools are demoed and researchers can try them, cross-disciplinary collaborator matching, and trend analytics surfacing emerging hypotheses and research objectives. Given the growth of synthetic content and the rising value of AI-generated research, clearly labelled tracks and dedicated spaces for AI-assisted or AI-generated articles help readers evaluate merits without confusion.
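As a rough illustration of a reproducibility capsule, the sketch below pins data, code, and environment in a small manifest and assembles a container command to re-run the analysis on demand. The file names, image tag, and run command are placeholders, not a recommended standard.

```python
# A minimal sketch of a reproducibility "capsule": a manifest that pins data,
# code, and environment, plus a helper that assembles (but does not execute)
# a container command to re-run the analysis on demand. All values below are
# hypothetical placeholders.
from dataclasses import dataclass, field

@dataclass
class Capsule:
    data: list[str]            # e.g., DOIs or paths for the underlying datasets
    code: str                  # directory containing the analysis entry point
    image: str                 # pinned container image describing the environment
    command: list[str] = field(default_factory=list)

def docker_run_args(capsule: Capsule, workdir: str = "/workspace") -> list[str]:
    """Build a `docker run` invocation for the capsule (not executed here)."""
    return [
        "docker", "run", "--rm",
        "-v", f"{capsule.code}:{workdir}",
        "-w", workdir,
        capsule.image,
        *capsule.command,
    ]

capsule = Capsule(
    data=["10.5061/dryad.example"],
    code="./analysis",
    image="ghcr.io/example/capsule:1.0",
    command=["python", "reproduce_figures.py"],
)
print(" ".join(docker_run_args(capsule)))
```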
All of this works better with clear policies (including AI disclosure) and resilient platforms that scale with usage, turning the preprint into a research accelerator, not merely an archive.
Submission: Beyond UX to True Throughput
Many submission systems have been modernizing UI/UX but still layer the new interface onto traditional processes, leaning on editors and reviewers to catch issues later. We can do better. Picture a 90-second pre-submit copilot: it checks scope fit, writing quality, and compliance with journal guidance so authors can strengthen submissions before editors see them. Editors then receive cleaner, better-documented manuscripts. This is a design choice, not science fiction.
What helps?
- Interactive guidance with concrete examples, especially around ethical AI use and disclosure.
- Early screening baked into submission rather than bolted on later.
- Precise, up-front determinations of transformative-agreement (TA) eligibility, so billing, waivers, and compliance don't trigger week-long email chains.
- Auto-transfer to sibling journals when rejections happen, preserving goodwill and retaining submissions that might otherwise leave the portfolio.
- Generation of standard formats (JATS XML) if not already available from a preprint, maintaining a single source of truth for downstream rendering such as web, PDF, EPUB, and assistive experiences.
- Training for authors on ethical AI use and accurate, transparent disclosure.
Together, these moves make authors feel seen and supported.
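To make the pre-submit copilot less abstract, here is a deliberately simplified sketch of the pipeline shape: a set of checks, each returning a finding the author can act on before an editor ever sees the manuscript. The check logic and field names are hypothetical; real checks would call scope classifiers, language-quality models, and TA-eligibility services.

```python
# Sketch of a pre-submit "copilot" pipeline. The checks are deliberately
# trivial stand-ins; every field name in the `ms` dict is hypothetical.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Finding:
    check: str
    passed: bool
    advice: str

def check_scope(ms: dict) -> Finding:
    in_scope = ms.get("journal_keywords_matched", 0) >= 3
    return Finding("scope_fit", in_scope,
                   "Looks in scope." if in_scope else "Consider a sibling journal; see suggestions.")

def check_ai_disclosure(ms: dict) -> Finding:
    ok = not ms.get("ai_assisted", False) or bool(ms.get("ai_disclosure"))
    return Finding("ai_disclosure", ok,
                   "Disclosure present." if ok else "Add an AI-use disclosure statement.")

def check_ta_eligibility(ms: dict) -> Finding:
    eligible = ms.get("corresponding_institution") in ms.get("ta_institutions", set())
    return Finding("ta_eligibility", eligible,
                   "Covered by a transformative agreement." if eligible
                   else "Not covered; APC and waiver options will be shown.")

CHECKS: list[Callable[[dict], Finding]] = [check_scope, check_ai_disclosure, check_ta_eligibility]

def pre_submit_report(ms: dict) -> list[Finding]:
    return [check(ms) for check in CHECKS]

for f in pre_submit_report({"journal_keywords_matched": 4, "ai_assisted": True, "ai_disclosure": ""}):
    print(f"{'PASS' if f.passed else 'FLAG'} {f.check}: {f.advice}")
```

The point is not these specific checks; it is that they run in seconds, before submission, and produce advice authors can act on.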
Screening & Integrity: From Reactive Detection to Proactive Assurance
Integrity work has matured from single checks to multi-signal, multi-stage screening. Yet it remains fragmented and reactive, and false positives — especially around “AI-generated vs. AI-polished” — waste time and erode trust. Chasing “AI content” is likely the wrong target.
A better target is assurance: do authors (and editors and reviewers) follow clear policies; is provenance captured; are required disclosures present; do artifacts line up (text, images, data, code, and video); and can we show our work?
That implies an operating model:
- Policy clarity. Allow legitimate AI use; require disclosure by authors, reviewers and editors; and provide tools for automated compliance and disclosure checks.
- Prevention beats detection. Adopt and embed industry initiatives (e.g., Google's SynthID, Microsoft's Project Origin, and ORCID's trust markers) to prevent problems at creation rather than relying on passive detection.
- Tool orchestration. Replace static checklists with a universal integrity toolkit and let AI select the next best check based on patterns and correlations in current signals (text similarity, image artifacts, data anomalies), moving from reactive detection to predictive, automated pipelines (see the sketch after this list).
- Common taxonomy and benchmarks. Establish a shared, evolving vocabulary covering the 80+ known integrity cases to improve communication and monitoring (current efforts, e.g., the Committee on Publication Ethics (COPE) and United2Act, cover only some key issues), plus independent public leaderboards that evaluate the 50+ integrity-detection tools on the market so customers know which tools solve which problems.
- Continuous improvement. Use public and internal human-verified screening reports and communications to continually enhance AI-powered detection.
- Ecosystem roles. Governments set policy and governance; institutions educate, incentivize good behaviour, and enforce consequences for misconduct; publishers prioritize quality over quantity and publish integrity metadata as first-class information.
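As a toy illustration of tool orchestration, the sketch below picks the next most informative integrity checks from the signals already observed instead of walking a fixed checklist. The signal names, weights, and checks are invented for illustration; a real orchestrator would learn these correlations from past cases.

```python
# Sketch: choose the next best integrity checks from observed signals rather
# than a static checklist. Signals, weights, and check names are hypothetical.
signals = {"text_similarity": 0.82, "image_artifacts": 0.10, "data_anomalies": 0.35}

# Which follow-up checks each signal makes more valuable (illustrative weights).
NEXT_CHECK_WEIGHTS = {
    "text_similarity": {"plagiarism_deep_scan": 0.9, "reference_integrity": 0.6},
    "image_artifacts": {"image_forensics": 0.9},
    "data_anomalies": {"statistics_audit": 0.8, "raw_data_request": 0.5},
}

def next_best_checks(signals: dict[str, float], already_run: set[str], top_k: int = 2) -> list[str]:
    scores: dict[str, float] = {}
    for signal, value in signals.items():
        for check, weight in NEXT_CHECK_WEIGHTS.get(signal, {}).items():
            if check not in already_run:
                scores[check] = scores.get(check, 0.0) + value * weight
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(next_best_checks(signals, already_run={"image_forensics"}))
# e.g. ['plagiarism_deep_scan', 'reference_integrity']
```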
Shift the center of gravity from “did we catch the bad thing?” to “did we build a system that makes the bad thing unlikely — and obvious — if it occurs?”
Peer Review: Human Judgment with AI Speed
Peer review remains the backbone of quality, but it's slow and capacity-constrained. Beyond AI-powered reviewer finders and screening, there are many places where AI can reduce overhead and speed the process while keeping people in charge.
At the start of review, AI can verify that referenced datasets exist and are cited correctly. During review, it can surface related literature and check references. After review, it can help polish feedback and synthesize reports, so editors see where opinions converge or diverge — and why.
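One narrow slice of that dataset verification, sketched below: checking that cited dataset DOIs actually resolve. A fuller pipeline would also query registries such as DataCite or Crossref and compare the returned metadata against the in-text citation; the DOIs here are placeholders, and a failed lookup should mean "needs human follow-up", not proof of a problem.

```python
# Sketch: verify that cited dataset DOIs resolve at doi.org. Some landing
# pages reject HEAD requests, so treat failures as flags for a human, not
# as evidence of misconduct. The DOIs below are placeholders.
import urllib.request

def doi_resolves(doi: str, timeout: float = 10.0) -> bool:
    req = urllib.request.Request(f"https://doi.org/{doi}", method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except OSError:
        return False

cited_dataset_dois = ["10.5061/dryad.example", "10.5281/zenodo.0000000"]  # hypothetical
for doi in cited_dataset_dois:
    print(doi, "resolves" if doi_resolves(doi) else "needs human follow-up")
```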
As AI becomes more knowledgeable and capable of reasoning, it will better assess novelty, methodology, and originality. Policies should make room for human + AI co-review: what's allowed, what must be disclosed, and how to maintain confidentiality. Training helps reviewers and editors know when AI adds value and when it doesn't, so judgments focus on the research, not the tools.
Looking ahead, agentic, reproducibility-focused review will be feasible: controlled reruns of code and analyses in sandboxes to validate claims when appropriate. Messages and emails can be personalized and automated, with automatic status tracking that triggers corresponding actions, reducing effort and improving the experience for editors, authors, and reviewers at both submission and peer-review stages. Over time, anonymized past reports can help models reason more like expert reviewers without diluting human judgment or blurring responsibility. Reviewers should be recognized and rewarded via services and technology such as the Web of Science Reviewer Recognition Service and Reviewer Credit, but also through policy changes, career-advancement criteria, and research-tracking systems that capture and credit review work. We'll see multi-stage review expand: not only traditional peer review, but also open/community review at preprint and post-publication stages.
Production: Automation with a Human Hand on the Tiller
Production vendors are automating copyediting, typesetting, and proofing with AI. Publishers are also evaluating and developing in-house solutions to reduce costs and vendor dependencies. The key is structure first, with AI taking the first pass and humans handling QA.
In practice: Review the standard formats generated at preprint/submission to maintain a single source of truth and focus human attention where model confidence is low or style choices matter. To avoid homogenizing author/brand voice through generative tools, apply style assistance with guardrails and require author sign-off on any substantive language change.
Richer outputs are now practical at scale: multilingual editions with human spot checks; accessible formats and alt text; lay summaries and highlight sections; graphical abstracts with SME review; and fine-grained nanopublications for specific claims and methods. Each becomes a trust signal for readers and a visibility signal for machines, crucial for downstream dissemination and discovery.
The north star is simple: AI handles the heavy, repetitive passes; humans make the nuanced, accountable decisions. Success is measured by lower error and rework rates, faster time-to-publication, and demonstrably accessible, machine-readable content.
Publishing Platforms: From Reader Sites to Machine-First Evidence Layers
Discovery is drifting from publisher websites to “super apps” like ChatGPT and other assistants. Serving only human readers is no longer enough; machines are now major consumers. That suggests a shift toward a multi-layer model:
- An API-first content layer as the single source of truth (versions, expressions, corrections, retractions), with each state carrying a durable identifier; a minimal sketch follows after this list.
- A shared, multi-tenant reader app handling commodity UX, content management, and discovery needs across content sets.
- A smaller number of bespoke brand sites where differentiation matters, powered by customized design and solutions.
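Here is that sketch of the content layer's core record, with illustrative field names rather than a proposed standard: every version and state change (correction, retraction) keeps a durable identifier, so the shared reader app and bespoke brand sites all render from one source of truth.

```python
# Sketch of an API-first content record: each version/state carries its own
# durable identifier. Field names and identifiers are illustrative only.
from dataclasses import dataclass, field
from enum import Enum

class State(str, Enum):
    PREPRINT = "preprint"
    VERSION_OF_RECORD = "version_of_record"
    CORRECTED = "corrected"
    RETRACTED = "retracted"

@dataclass
class Expression:
    version: int
    state: State
    doi: str                  # durable identifier for this specific version/state
    rendered_formats: list[str] = field(default_factory=lambda: ["jats-xml"])

@dataclass
class Work:
    work_id: str              # stable identifier for the work across versions
    expressions: list[Expression] = field(default_factory=list)

    def current(self) -> Expression:
        return max(self.expressions, key=lambda e: e.version)

work = Work("W-2025-000123", [
    Expression(1, State.PREPRINT, "10.1234/preprint.v1"),
    Expression(2, State.VERSION_OF_RECORD, "10.1234/journal.v1"),
])
print(work.current().state.value)   # "version_of_record"
```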
New services naturally fit the platform: post-publishing integrity checks as a final gate before public release, and on-demand content bundling (related items plus auto-generated summaries) driven by platform demand or user queries.
Success metrics will evolve. Beyond CTR and pageviews, we'll prioritize machine-readiness and AI attribution: structured data, embeddings, provenance and license clarity, reference completeness, and semantic coverage. On-site assistants are getting cheaper and easier to enable but mostly defend existing engagement; new value will come from services, licensing, and interoperability, as well as from being the place where trusted answers remain anchored.
Dissemination & Discovery: From Clicks to Visibility
Audience behaviour is changing fast. McKinsey reports ~44% of users already prefer AI search as their primary source, while AI still drives <1% of referral traffic today. Traditional channels retain credibility, but they lack speed and personalization. In the new environment, broaden formats (expert and generalist), translate generously, and listen for live impact signals from the open web. AI can also connect questions and methods across fields, boosting interdisciplinary reach.
The bigger shift is mental: from a click economy to a visibility economy. In that world, GEO (GenAI Engine Optimization) complements SEO.
- SEO earns rankings and traffic by optimizing structure and keywords.
- GEO earns inclusion in answers by improving contextual fit, semantic coverage, and machine-readable signals.
Practically, that means structuring key-point blocks and direct answers; linking entities (ORCID, ROR, grant IDs); exposing schema.org markup; making data/code/licensing explicit; and tracking AI usage metrics (mentions, answer-inclusion rate, sentiment/stance of mentions, time to first AI citation), not just human clicks. COUNTER-style reporting will eventually include AI consumption.
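As one concrete example of those machine-readable signals, the sketch below builds schema.org JSON-LD for an article and links the entities an AI system needs to ground an answer (ORCID, ROR, dataset DOI, license). The identifiers are placeholders.

```python
# Sketch of GEO-friendly structured data: schema.org JSON-LD for an article.
# All identifiers below are placeholders.
import json

article_jsonld = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "headline": "Example article title",
    "identifier": "https://doi.org/10.1234/example.12345",
    "author": [{
        "@type": "Person",
        "@id": "https://orcid.org/0000-0000-0000-0000",
        "name": "Ada Lovelace",
        "affiliation": {"@type": "Organization", "@id": "https://ror.org/00x0x0x00"},
    }],
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "isBasedOn": "https://doi.org/10.5061/dryad.example",   # underlying dataset
    "abstract": "Key-point summary exposed for both readers and machines.",
}

# Embed as <script type="application/ld+json"> in the article page,
# and expose the same record through the content API.
print(json.dumps(article_jsonld, indent=2))
```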
Trusted, unique information that answers real questions remains a goldmine in a GenAI world crowded with derivative content. Such information is invoked more often by AI systems and embedded in answers more frequently, strengthening GEO. Well-structured, trustworthy, and unique content doesn’t just get read; it gets used repeatedly. That’s the new multiplier.
The Path Forward
We're moving from human-only workflows to human-centred ones with AI support, and onward to human + AI collaboration on both publishing and research workflows. All of the changes above need to rest on a new publishing workflow and infrastructure designed to be open, interoperable, and increasingly autonomous, while staying under human governance and remaining reliable, safe, and economical at scale. The result is publishing richer, deeper, wider knowledge (not just content) that serves both readers and machines. Done well, AI will amplify human expertise with the speed, context, and consistency the system needs, without replacing the people who make scholarship work.
Disclaimer: A proprietary AI tool assisted in polishing the post, with all facts verified by the author.
Discussion
8 Thoughts on "Reimagining Scholarly Publishing Workflow: A High-Level Map of What Changes Next"
Thanks Hong, an excellent summary of a fast moving scholarly technology world.
Thanks Adrian. Indeed “fast moving scholarly tech world”.
“The opportunity isn’t to bolt AI features onto old systems. It’s to rethink and redesign for human–AI collaboration…” Amen!
Thanks Thad. It is time to rethink what the human and AI roles in these workflows should be, and what new roles might emerge.
Nice writeup! We should have improved workflows to reduce the pain (and expenses) in publishing. Many people look at AI as strictly a threat, and maybe it is for some people, since the underlying promise is that AI will reduce the need for a lot of human labor. Tools that reshape society can be very disturbing to some, but there is also opportunity to utilize AI in a way that benefits society in other ways.
One comment you made caught my eye: “If we capture consistent metadata and formats such as JATS XML at ingest, we stop paying the “formatting tax” later.” One problem with this is that authors don't create JATS (or XML). We solved this problem (in STEM) by capturing the metadata directly from the LaTeX source written by the author. That way we avoid paying the formatting tax later, because the author's LaTeX is used to encode the metadata and produce the final published version. We have a writeup about this at https://arxiv.org/abs/2504.10424
Of course it would be nice to have an automated system to automatically extract the metadata in a structured format, but the metadata world is already awash with a lot of bad data. We have to set very high standards for the output of such tools if we are to ever overcome the bad metadata problem.
Thanks, Kevin, for your comment — totally agree. I think AI-powered automatic metadata extraction and reformatting with human verification is the way to go. Preprints could be a great place to trial this first, with authors and editors verifying at submission, and production doing a final check before publication.
I scrolled to the bottom halfway through because I was sure it was written with an LLM (“the pivot:”, “This is a design choice, not science fiction”) and sure enough… The facts might be verified but it is clear that there is a risk of conformity and creeping mediocrity. Too bad the tools seem to accelerate that tendency but when the context is industrialised academic publishing, I guess that is just where we are.
Thanks for reading and for the feedback. To clarify: the ideas/thoughts, structure, and examples come from my own hands-on experience in publishing; as noted at the end, I used a proprietary tool only to polish the blog, and I verified all facts myself. You’re right that some phrases (“the pivot…”, “this is a design choice…”) can sound LLM-ish. I chose them for readability, not because a model “wrote” the post.
Ultimately, we should judge the piece by its meaning, arguments, and evidence, not its words. If the substance misses the mark or is wrong, that's on me. If the language gets in the way, I'm happy to improve it. Appreciate you holding us all to a high bar.