Editor’s Note: Today’s post is by Frauke Birkhoff. Frauke is a subject librarian at University and State Library in Düsseldorf, Germany, working on the library’s information discovery services. The article reflects her personal experience and opinions.
Twelve years ago, German librarian Anne Christensen presented eight hypotheses on why librarians don’t like discovery layers. Today, another paradigm shift in search is happening, and we can and should learn from our previous discussions about the evolution of search technology. Libraries have a long history of having to reimagine themselves and their services in the face of technological advancement. With the advent of artificial intelligence (AI) search and the use of retrieval-augmented generation (RAG) in most commercially available search engines right now, we must start the conversation on how these tools can and should be used in library search interfaces. A holistic approach to this process also includes discussing why we might feel some discomfort with their introduction.
Starting in 2022 with the release of ChatGPT and other AI tools that quickly added RAG to their tech stacks, user search behavior has undergone a rapid shift. Being able to state a search query in natural language and receive a definitive-seeming answer, citations included, seems to confirm what decades of user studies of search behavior have found: Users prioritize quick, convenient answers, ideally reliable but, if necessary, merely supportive of their argument. Access barriers, such as paywalls, remain a major frustration.
In came tools from outside library-land such as perplexity.ai, Elicit and many others that threaten to yet again make library search interfaces less relevant for our users. And then there are the commercial solutions offered by library vendors, such as Primo Research Assistant, that promise libraries the ability to evolve their existing discovery layers to satisfy user demand. Clearly, the answer to making search more attractive in 2025 seems to lie in combining information retrieval algorithms with a large language model (LLM) and an outside database. This approach promises fewer hallucinations than querying a non-RAG chatbot; however, it is still far from free of quality assurance or bias issues, as a recent evaluation of Scopus AI has demonstrated [Editor’s Note: see also this later update on Scopus AI].
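For readers who have not yet looked under the hood, the retrieve-then-generate pattern these vendors build on can be sketched in a few lines. This is a deliberately toy illustration, not any vendor’s implementation: retrieval here is plain word-overlap scoring standing in for embedding similarity against a discovery index, and the “generation” step merely assembles the grounded prompt an LLM would receive. All document texts, function names, and the corpus are made up for illustration.

```python
def tokenize(text):
    """Lowercase and split on whitespace (a crude stand-in for real tokenization)."""
    return set(text.lower().split())

def retrieve(query, corpus, k=2):
    """Rank documents by word overlap with the query (stand-in for
    vector similarity in a real system) and return the top k."""
    q = tokenize(query)
    scored = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return scored[:k]

def build_prompt(query, corpus):
    """Assemble the augmented prompt an LLM would receive: the user's
    question plus the retrieved passages the answer must be grounded in."""
    sources = retrieve(query, corpus)
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(sources))
    return (f"Question: {query}\nSources:\n{context}\n"
            f"Answer using only the sources above, citing them by number.")

# A miniature "discovery index" of invented sentences.
corpus = [
    "Discovery layers index library holdings and licensed articles.",
    "Link resolvers connect citations to licensed full-text copies.",
    "Interlibrary loan supplies items a library does not hold.",
]

prompt = build_prompt("How do discovery layers index articles?", corpus)
print(prompt)
```

In a production tool the corpus would be the licensed discovery index, `retrieve` would query a vector store, and the prompt would go to an LLM whose answer cites the numbered sources. The sketch also makes the article’s later points concrete: whatever is excluded from `corpus` (say, a publisher’s siloed collection) simply cannot appear in the answer, and the user never sees why.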
There is reason to believe that in the next few years, discovery layers as we know them could be displaced by RAG-based search interfaces. Librarians currently working on discovery layers should get acquainted with these tools and ask themselves and their institutions how they should implement them for their communities, be it by using a commercially available tool or by investing in the development of an open-source version. Crucial to the success of these efforts, however, is the acceptance of the librarians implementing, maintaining and using these tools.
Let’s look at the perception of discovery tools by librarians, as described by Christensen in 2013, and see what lessons the past holds for librarians as AI-enabled search comes to the fore.
1. They are too much work.
Discovery in its current form is still a lot of work, especially when it comes to maintenance and making sure that the delivery side works. While next-gen link resolvers make it somewhat easier to provide full-text access to users, library ecosystems still haven’t changed that much, and a considerable amount of time and work still goes into maintaining and fixing access issues.
What about AI search assistants? Library vendors market them as a plug-and-play solution, just as they did discovery systems. However, the delivery side of things, i.e., the link to the full-text article referenced in the generated answer, is just as complex as it has been in the past. And new issues appear. Content providers are starting to silo their collections in order to restrict other RAG-based tools from accessing their content.
An example: in the case of Primo Research Assistant, collections from APA (and others such as Elsevier and JSTOR) are excluded from result generation. This would need to be explained to students and faculty using the tool, which adds considerably to the communication effort needed to make these tools worth their licensing cost. It can reasonably be assumed that almost all content providers are going to invest in their own AI assistants or make licensing deals with existing ones. How many of these can and should we license and maintain? Librarians working on discovery layers should start making plans now for identifying the tools that best serve their community and how their workflows need to change.
2. They were not our idea in the first place.
RAG was developed by researchers at Facebook AI Research together with academic collaborators. Discovery layers were inspired by web search and first sold to us by library vendors, although the community quickly came around and developed great open-source solutions like VuFind and K10plus-Zentral.
Library leaders should start working on identifying strategies to educate their employees on AI and RAG right now, if they haven’t already, making sure that these efforts are focused on enabling librarians to understand, evaluate and teach AI search tools with all their pros and cons. This means we can’t solely rely on education by library vendors. No commercially available AI assistant should be licensed without buy-in from the librarians needed to maintain it and those who teach students how to use it.
3. and 4. Strange things happen to our metadata and communicating these strange happenings is hard.
Transparency is a core value that many librarians subscribe to. Many of us also favor explainability, especially when it comes to the systems we are supposed to troubleshoot for our users. When discovery layers were introduced, many librarians took issue with the fact that relevance-ranking algorithms made it harder to explain to users how search results are ranked and presented.
AI-assisted search is, as of now, rarely transparent or easily explainable. What we need, then, is the courage and skillset to play a part in the development of tools that try to solve these issues. Yet Christensen identified a major issue here that has not been resolved: Librarians and IT people have a hard time communicating about library tools. These barriers cannot be overcome by integrating IT courses into librarian undergraduate or graduate programs, since the average MLIS degree is simply too short to turn us into experts. On the other hand, recruiting IT experts into libraries is nigh impossible here in Germany, and elsewhere in Europe or the US, due to very limited pay. German universities offer two graduate degrees in Bibliotheksinformatik (library informatics), which generally focus on combining extensive computer science courses with library science coursework. While it’s too early to tell if these degrees will have a lasting impact on our communication issue, they are one possible solution.
5. They mess with the concept of the catalog.
The concept of the catalog goes out the window when implementing AI search, since it focuses on providing users with answers, not holdings overviews. Most libraries completely overhauled their search interface when they implemented a discovery layer and integrated both library collections and the discovery index into one system. Should the argument be made then that, with the change in search behavior, we should fully embrace RAG and let go of the concept of a tool that details the whole of a library’s collection?
No, we shouldn’t. Delivery and access to the information that RAG-based search tools draw upon are still not resolved and won’t be anytime soon. Quality, bias, and hallucination issues are not entirely resolved either. We still need tools that give us a complete overview of a library’s holdings, and they should be made publicly accessible. It is our job to make sure that the current hype around AI search does not lead us to abandon tried-and-true solutions for our users. Both tools have their use cases. It is on us to identify them and implement the right approach for each.
6. and 7. They are hard to use in reference interviews and they make users dumb and lazy.
Reference interviews as we knew them are hardly relevant in 2025. However, we should not underestimate one major drawback of AI search assistants: the dilemma of the direct answer. First described by Potthast et al. in 2020, it refers to “a user’s choice between convenience and diligence when using an information retrieval system.”
AI search assistants force this trade-off onto their users by default: Should a user accept the answer generated and quickly move on, working from the assumption that it is correct, or should that user invest considerable amounts of time and effort into making sure that the answer the tool has generated is correct? With how fast-paced academia is and how convenient these tools feel, we need to ask about the impact the tools we offer have on research quality.
Also, most AI assistants give us little insight into how documents are chosen for a search query or how the answer is generated. As librarians, we should make this trade-off clear to the users of our systems. We should use our information literacy efforts to present students and faculty with this dilemma and work on developing mitigation strategies. But we should be clear-eyed about the efficacy of these interventions. These tools are both inherently convenient and failure-prone, which is not optimal in academic research. We need a thorough discussion in academia on how to mitigate these drawbacks.
8. They cost us our jobs.
While I doubt that AI search will cost librarians their jobs right now, fewer available jobs in libraries are on the horizon. Or, on a more positive note, maybe we aren’t asking the right questions. What job skills are needed to provide high-quality, librarian-approved AI search tools? What does the future workday of a systems librarian or a subject librarian look like when she is developing or using AI search tools?
Christensen was right when she wrote that we needed to talk more about the discomfort of librarians with discovery layers in order to make them successful. We should start talking about it when it comes to AI search now, even before many libraries have actually implemented these tools. We could start by developing guidelines and checklists that help libraries decide if they should invest in a commercially available tool. We should coordinate the development of an open-source AI search interface for libraries. And last but not least, we should equip teaching librarians with the tools to translate the benefits and drawbacks of these tools to students and faculty.
Discussion
5 Thoughts on "Guest Post: Eight Hypotheses Why Librarians Don’t Like Retrieval Augmented Generation (RAG)"
I don’t find this post to be entirely fair or representative of the views of the librarians I know. The librarians pictured here are narrow, regressive and self-interested.
The parallel with the reception of discovery layers is interesting, but they (and web search more generally) hardly match the impact of the potentially epochal global paradigm shift of AI, particularly if future generations are conditioned to depend on AI as the source of all knowledge. There are many legitimate reasons why librarians (and educators more generally, to say nothing of citizens or individuals) might be skeptical of or even opposed to such a shift, and are concerned about how their adoption of RAG-based tools would contribute in a small way to a truly dystopian future. I wouldn’t expect this post to detail all of those, but an acknowledgement that they exist and should be taken seriously would have been nice to see included.
I completely agree with you. For many librarians, it’s simply the fact that it’s difficult to learn to work with any new tool or set of tools that is changing so rapidly when we are often short staffed. I’m sure that other libraries have done what mine has, which is to have a group of librarians who are interested in this area, work together to create presentations for the rest of us to attend to learn about these tools and also to teach them to our users. I hear complaints – I sometimes grumble myself – but it’s generally due to the stress of the moment rather than a complete refusal to learn something new.
Matthew,
thank you for your comment. It was not my intention to portray librarians as narrow, regressive or self-interested. I agree that there are legitimate reasons to be very skeptical of the tools on the market right now. Which is why I think we really need a targeted strategy for implementing them (or deciding against implementation, in some cases) in order to serve our respective communities best. However, that also involves acknowledging the possible strengths that RAG has in making library search tools more aligned with user behavior.
Here is a real-world reason why librarians shouldn’t trust RAG being used with their discovery system. The discovery tool I use has an AI search assistant and librarians found that queries asking the AI search assistant for information about the Tulsa race massacre were retrieving 0 results, yet results were being found if the AI search assistant wasn’t being used. This was replicated across different institutions and with different types of searches.
The discovery system’s response: “After diving into this report, we better understand that the [AI search assistant] may not return results for certain terms or topics. This is due to safeguard policies enforced by our AI service provider to support ethical and responsible AI use.” Here’s a link to the safeguard policies: https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/content-filter?tabs=definitions%2Cuser-prompt%2Cpython-new#content-filter-types
So essentially, the AI/RAG system is policing the searches and determining whether or not to return results. As a librarian, wouldn’t you be worried about this? Are the vendors aware of this? Do they want their content being suppressed because an AI tool thinks the topic is too controversial or inappropriate? How does this help scholarship? This alone should cause librarians and others to be cautious when using gen AI and RAG.
Liz,
thank you for your comment. To be clear: this is worrying to me and I think we badly need a more thorough discussion about the impact these tools have on scholarship. Critical librarian voices are extremely helpful in this.
However, as I said in the article, in order to make our discomfort / caution productive, we also need more technical knowledge in the field about these tools in order to be able to aid development of better AI-assisted search (since it’s not going to go away). And, as your case highlights, we need to be very careful in selecting the tools we make available to our users at this point in time. Which is why I think open source development is going to be crucial in the next couple of years, so that we can offer services that do not run afoul of our values.