Editor’s Note: Today’s post is by Hong Zhou and Sylvia Izzo Hunter. Hong is Director of Intelligent Services and Head of AI R&D for Atypon, part of Wiley Partner Solutions, where he is responsible for overseeing the implementation of artificial intelligence–driven information discovery technologies. Sylvia is Manager, Product Marketing, Community & Content, at Wiley Partner Solutions; previously she was marketing manager at Inera and community manager at Atypon, following a 20-year career in scholarly journal and ebook publishing.
More than 1 billion people around the world have some type of disability (including visual, hearing, cognitive, learning, mobility, and other disabilities) that affects how they access digital content. No wonder we spend so much time talking about accessibility tools!
Digital transformation can revolutionize the world, turning it into an inclusive place for people with and without disabilities, with accessibility powered by artificial intelligence. This post provides an overview of how AI can improve accessibility in different ways, illustrated with real-world applications and examples.
AI and accessibility: Here and now
Artificial intelligence is a broad set of technologies in which machines use computational capabilities to “think” like humans. As one of us (Hong) has explained elsewhere, AI generates value from big data and delivers it to customers via the cloud. There are many different types of AI tools, each of which solves different problems. While not all the tools listed here were designed to solve accessibility problems, they do provide accessibility solutions!
Computer vision tools, like Microsoft’s Seeing AI, allow computers to gain understanding from digital images or videos, which can help people with vision impairments and those who have difficulty understanding visual content.
Speech recognition tools, like Apple’s Siri, allow computers to understand and create human speech, which can be very useful for people with hearing impairments or mobility limitations.
Knowledge graphs, like Google’s Knowledge Graph, connect different types of knowledge, which helps machines understand the semantic meaning of content rather than just the individual words, and thus allows people with learning disabilities to better understand the content.
Natural language processing tools like ChatGPT by OpenAI let machines process and understand text content, not only to improve writing quality but also to identify and extract key information, which can be used to automate content processing and avoid human errors.
Information discovery tools like Amazon’s product recommendations target relevant content, products, and advertising to individual customers by understanding their online behavior and interests. This means that the site can display more relevant information to people directly, without additional searching, typing, and clicks.
Artificial Intelligence for Accessibility
Let’s explore some real-world AI applications that improve accessibility.
The Challenge: Access to text and other visual materials
More than 2.2 billion people globally have some type of vision impairment, which can range from difficulty seeing content to difficulty understanding certain forms of visual content.
AI can enhance or convert content to other formats that are easier to access and understand, including text alternatives, text to audio, and image enhancements.
Text Alternatives (Alt Text)
Alt text describes the appearance or function of an image and can be read aloud by screen readers to help visually impaired users understand the content of your page. (It can also improve SEO.) AI can automatically generate alt text to describe images. Microsoft Edge, Amazon, and Google Chrome can already auto-generate alt text for many simple images, but they don’t perform well on complex images. However, the latest and future large AI models, such as GPT-4 from OpenAI, can understand and describe complex scientific images much better than before, thanks to multimodal capabilities that let them process text, images, and video together.
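As a toy illustration of the plumbing involved, here is a minimal Python sketch that finds img tags missing alt text and fills them in. The caption_image function is a hypothetical stand-in for a real vision-language model call, not any specific product’s API:

```python
import re

def caption_image(src: str) -> str:
    """Stand-in for a vision-language model call that would return a
    caption for the image at `src`. Hypothetical stub for illustration."""
    return f"Figure from {src}"

def add_missing_alt_text(html: str) -> str:
    """Insert model-generated alt text into <img> tags that lack it,
    leaving existing alt text untouched."""
    def fix(match: re.Match) -> str:
        tag = match.group(0)
        if re.search(r'\balt\s*=', tag, flags=re.IGNORECASE):
            return tag  # author-supplied alt text wins
        src = re.search(r'src="([^"]*)"', tag)
        alt = caption_image(src.group(1)) if src else "image"
        return tag[:-1] + f' alt="{alt}">'
    return re.sub(r'<img\b[^>]*>', fix, html)

page = '<p>Results:</p><img src="fig1.png"><img src="fig2.png" alt="Bar chart">'
print(add_missing_alt_text(page))
```

In a production pipeline the stub would be replaced by a call to a captioning model, and a human would review the generated descriptions before publication.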
Text to Audio
Many computers and other devices have built-in text-to-speech software, and screen-reader software provides important functionality such as navigating through headings.
Microsoft, Google, and Samsung all provide free apps to assist people with impaired vision by narrating the world around them.
In addition to driving all of these tools, AI can even be used to generate audio in different languages and accents!
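For a sense of how a platform might hand text to a speech engine in a particular language, here is a minimal sketch that wraps plain text in SSML, the W3C Speech Synthesis Markup Language that many text-to-speech engines accept. Which attributes a given engine honors varies by vendor, so treat this as a shape, not a spec:

```python
from xml.sax.saxutils import escape

def to_ssml(text: str, lang: str = "en-US", rate: str = "medium") -> str:
    """Wrap plain text in minimal SSML so a TTS engine can read it
    in the requested language at the requested speaking rate."""
    return (
        f'<speak version="1.0" xml:lang="{lang}">'
        f'<prosody rate="{rate}">{escape(text)}</prosody>'
        "</speak>"
    )

# Ask for slow French narration of a short phrase:
print(to_ssml("2,2 milliards de personnes", lang="fr-FR", rate="slow"))
```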
Image Enhancement
AI can automatically increase the contrast of images or enhance their resolution and quality, improving readability for users with low vision or poor contrast sensitivity. In addition, some AI solutions can automatically extract sub-figures from a compound figure; displaying the sub-figures with their corresponding sub-captions allows readers to focus on specific images with less distraction.
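The simplest version of the contrast idea can be sketched in a few lines: linear contrast stretching rescales grayscale pixel values to span the full available range. Production tools use learned models; this standard-library sketch just illustrates the principle:

```python
def stretch_contrast(pixels: list[int], lo: int = 0, hi: int = 255) -> list[int]:
    """Linearly rescale grayscale values so the darkest pixel maps to `lo`
    and the brightest to `hi`, boosting contrast in washed-out images."""
    p_min, p_max = min(pixels), max(pixels)
    if p_min == p_max:          # flat image: nothing to stretch
        return pixels[:]
    scale = (hi - lo) / (p_max - p_min)
    return [round(lo + (p - p_min) * scale) for p in pixels]

# A low-contrast image whose values all sit between 100 and 140:
print(stretch_contrast([100, 110, 120, 140]))  # → [0, 64, 128, 255]
```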
Segment Anything is an AI solution recently released by Meta, whose main goal is to create a promptable image-segmentation model that responds to user prompts, much as ChatGPT does. The project has three pillars: Task, Model, and Data. The Segment Anything Model (SAM) can be used in applications that require identifying, segmenting, and refining objects within any image, and it has the potential to enhance both image quality and image discovery.
The Challenge: Access to audio content
Captions and Transcripts
AI can generate captions and transcripts for audio and video content to provide better access for people with impaired hearing or cognitive disabilities—as well as those listening in a second or additional language. Because transcripts are machine-readable and can be indexed by search engines, they can also boost the discoverability of audio and video content.
Speech-to-text is one of the most mature and widely used AI applications in the world right now, and in many use cases it is both faster and more accurate than human transcription.
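Once a speech-recognition service has produced timestamped segments (the data shape below is a simplifying assumption, not any particular vendor’s output), turning them into captions is straightforward. This sketch emits WebVTT, the caption format browsers understand:

```python
def to_webvtt(segments: list[tuple[float, float, str]]) -> str:
    """Render (start_seconds, end_seconds, text) segments as a WebVTT
    caption file that can be attached to audio or video in a browser."""
    def ts(seconds: float) -> str:
        h, rem = divmod(seconds, 3600)
        m, s = divmod(rem, 60)
        return f"{int(h):02d}:{int(m):02d}:{s:06.3f}"
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{ts(start)} --> {ts(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)

print(to_webvtt([(0.0, 2.5, "Welcome to the webinar."),
                 (2.5, 5.0, "Today we discuss accessibility.")]))
```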
Speech Enhancement
It can be difficult to have a conversation in a noisy environment. AI can help people hear more clearly by enhancing speech while suppressing background noise. Check out this demo by Google.
The Challenge: Access with mobility limitations
In the United States, 11% of the population has some form of mobility impairment. For users with limited mobility, dexterity, or strength, it’s important to be able to interact with content in ways that don’t require body movement.
The Solution: Voice Access and Control
Voice access and control powered by AI is an indispensable assistant for people with limited mobility. Google, Apple, and Microsoft provide a voice assistant for device control, allowing people to browse websites, news, videos, and information without touch or body movement.
Voice assistance will play a greater role in accessibility and information discovery in the future. As of 2022, more than 120 million US adults use a smart assistant at least once a month. Recent AI products such as ChatGPT Plugins and AutoGPT are becoming more self-sufficient and can plan, execute, and learn from tasks on their own. They can access up-to-date information, call other services, and aggregate results for different purposes. Such tools can make voice assistance even more intelligent: users will no longer need to give exact step-by-step instructions but can simply describe the task, leaving the AI to figure out the necessary steps and deliver the desired result. Voice assistance will become even more powerful once AI is integrated with robots that can provide a variety of assistance and support to people with disabilities.
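The older, command-style interaction that these newer agents improve on can be sketched simply: map a transcribed utterance to one of a fixed set of intents. The intent names and keyword patterns below are hypothetical; real assistants use learned intent classifiers rather than keyword rules:

```python
import re

# Hypothetical intents a reading app might expose to a voice assistant.
INTENTS = {
    "read_aloud": re.compile(r"\b(read|narrate)\b", re.IGNORECASE),
    "search":     re.compile(r"\b(find|search for|look up)\b", re.IGNORECASE),
    "next_page":  re.compile(r"\b(next page|scroll down)\b", re.IGNORECASE),
}

def parse_command(utterance: str) -> str:
    """Map a transcribed voice command to an intent name; return
    'unknown' when no pattern matches."""
    for intent, pattern in INTENTS.items():
        if pattern.search(utterance):
            return intent
    return "unknown"

print(parse_command("Please read this article to me"))  # → read_aloud
```

The point of the newer, agent-style systems is precisely that users can skip this rigid command vocabulary and describe the goal in their own words.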
The Challenge: Boosting Readability and Understanding
Information overload slows down everyone’s ability to make timely decisions. There is just too much to read and understand in our current era of mass information. People with cognitive or learning disabilities or language and memory impairments can find it difficult to read, understand, and process content. (For example, 9-12% of the world’s population is affected by dyslexia.)
The Solutions: Structured, Compelling, and Focused Content
AI can help improve learning and knowledge by automatically generating more structured, compelling, and focused content.
For example, AI can generate images based on text descriptions. Great visuals can help us communicate better, engage our attention, and improve comprehension. The two most popular AI text-to-image tools are DALL·E 2 and Stable Diffusion. (Unfortunately, AI tools can also be used to manipulate scientific images for publication.)

To help researchers understand published papers better and more quickly, Wiley (full disclosure: the authors’ employer) has developed an AI service that automatically generates a structured abstract summarizing the key points of each section of the full text. This abstract makes it easier for researchers to grasp the high-level information of the paper by reading just a short summary.
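Wiley’s service is proprietary, but the general shape of a structured, per-section summary can be illustrated with a naive extractive sketch that simply takes the first sentence of each section as its key point:

```python
import re

def structured_summary(sections: dict[str, str]) -> str:
    """Build a structured abstract by extracting the first sentence of
    each section. An illustrative stand-in for a learned summarizer."""
    parts = []
    for heading, body in sections.items():
        # Split on sentence-ending punctuation followed by whitespace.
        first = re.split(r"(?<=[.!?])\s+", body.strip())[0]
        parts.append(f"{heading}: {first}")
    return "\n".join(parts)

paper = {
    "Methods": "We surveyed 300 readers. Surveys ran for six weeks.",
    "Results": "Captions improved comprehension by 40%. Effects varied by age.",
}
print(structured_summary(paper))
```

A real service would use abstractive summarization rather than first-sentence extraction, but the output structure (one key point per section) is the same idea.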
Knowledge-mining services extract valuable information from customers’ existing content to create more structured content that is easier for people to read, helping publishers identify and generate new monetization opportunities.
The Challenge: Discoverability and Dissemination
While researchers struggle with information overload, publishers are facing content dissemination challenges.
AI can improve the discoverability and readability of content by transforming flat, legacy, image-based PDFs into indexable, searchable, and more researcher-friendly data.
AI can also translate content into different languages to improve its readability and discoverability.
Multimedia Content Search
Multimedia topic searches, image searches, video searches and funder searches are vital for content discovery, but many academic publishers don’t have the metadata available. AI tools can automatically extract relevant metadata from multimedia content to make it discoverable.
AI can make content recommendations more relevant to your audience. We’re all familiar with the “people who read this article also read these ones” model, which is one of the most common applications of AI. AI can also suggest relevant experts based on their areas of expertise and enable personalized news feeds based on individual users’ interests.
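The “people who read this also read” model can be sketched as simple co-occurrence counting over reading sessions; the session data below is invented for illustration:

```python
from collections import Counter

def also_read(sessions: list[list[str]], article: str, k: int = 3) -> list[str]:
    """Recommend the articles most often co-read with `article` — the
    simplest form of 'readers of this also read…' collaborative filtering."""
    counts: Counter[str] = Counter()
    for session in sessions:
        if article in session:
            counts.update(a for a in session if a != article)
    return [a for a, _ in counts.most_common(k)]

# Each inner list is one reader's session of article IDs:
logs = [["A", "B", "C"], ["A", "B"], ["B", "C"], ["A", "D"]]
print(also_read(logs, "A"))  # → ['B', 'C', 'D']
```

Production recommenders add matrix factorization or embedding models on top, but co-occurrence remains the intuition behind them.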
Overall, information discovery is evolving from fuzzy matching to semantic matching, and ultimately to precise information generation. Thanks to recent successes in generative AI, the focus is shifting from delivering relevant results to generating exact answers with supporting evidence. AI is not only significantly enhancing the quality of search results but also introducing new forms of information discovery, such as conversational search and personalized result generation. This will not only allow users to find answers more quickly than ever before but also enable publishers to create and publish original content based on existing materials, boosting both reach and revenue.
The Challenge: Errors
Human errors introduced during submission and onward, such as ambiguous author, affiliation, and funder information, can significantly reduce the quality of publication and negatively impact discoverability and readability.
The Solutions: Information extraction and validation
Tools that automatically extract useful information and enable automated validation and submission can reduce data-entry errors.
For example, when peer-review systems integrate AI-powered tools (e.g., ReX Submission, eXtyles Arc), authors no longer need to manually fill in a form and instead just review and verify the data auto-extracted from their uploaded article file. Tools like this not only improve the user experience but also improve accuracy by minimizing data-entry errors. These advanced AI tools can further reduce human errors or improve publication quality by automating aspects of typesetting and copyediting. They can help with formatting while also proofreading for grammar, checking consistency, and ensuring adherence to style guides. This kind of AI tool does not replace human experts, however: AI-generated suggestions must always be reviewed by experienced editors to maintain quality and accuracy.
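One concrete example of automated validation: ORCID iDs carry an ISO 7064 MOD 11-2 check digit, so a submission system can flag a mistyped author identifier the moment it is entered, before the error propagates into the published record:

```python
def orcid_checksum_ok(orcid: str) -> bool:
    """Validate the check digit of an ORCID iD (ISO 7064 MOD 11-2).
    Accepts the usual hyphenated form, e.g. 0000-0002-1825-0097."""
    digits = orcid.replace("-", "")
    if len(digits) != 16:
        return False
    total = 0
    for ch in digits[:-1]:
        if not ch.isdigit():
            return False
        total = (total + int(ch)) * 2
    check = (12 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return digits[-1] == expected

print(orcid_checksum_ok("0000-0002-1825-0097"))  # → True
print(orcid_checksum_ok("0000-0002-1825-0098"))  # → False
```

A checksum catches transcription errors but not a valid iD entered for the wrong person, which is why these tools pair validation with lookup against the ORCID registry.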
The future of accessibility and AI
We should all want to make our content accessible to anyone who’s interested in it, but policies are also being implemented to ensure that this happens. The European Accessibility Act 2019 will require a wide range of online content and services to meet accessibility requirements by 2025, and it’s now the private sector’s turn to step up. Moving from accessibility to public access, the Office of Science and Technology Policy’s recent “Nelson Memo” requires free public access to all research funded by the US government, which makes accessibility even more important: content that is open access but does not meet accessibility guidelines is not as accessible to the public as it should be.
Publishers can get started on meeting these requirements by using the W3C Web Content Accessibility Guidelines (WCAG), which provide standard metrics and guidelines for accessibility. WCAG offers a model for evaluating how accessible web content and applications are to people with a wide range of disabilities and for identifying targets for improvement. Following the W3C accessibility guidelines often improves the user experience not only for users with a disability but for non-disabled users as well.
AI-powered tools are one way to improve accessibility. But creating accessible content and systems requires a collaborative effort that includes publishers, product, tech, design, and other teams. There are many important aspects to consider in this effort, such as web design, UX, regular accessibility audits, and input from editors, authors, and typesetters. With this collaboration in place, integrating tools powered by artificial intelligence can help us create an inclusive place for everyone to access scholarly content.