Editor’s Note: Today’s post is by Alexander Naydenov, Co-founder and Head of Marketing at PaperHive, and Heather Staines, Director of Business Development at Hypothes.is. (Full disclosure: both authors work for organizations that provide annotation services for scholarly communications.)
Annotation is coming to scholarly content, but there are key choices to be made that will dramatically affect the collective outcome we achieve.
Digital annotation is not a new idea. The first conceptions of what would become the internet imagined a far more interactive experience over scholarly content: the “read-write” web that many have lamented we never achieved. Since 1993, when Mosaic first briefly experimented with native annotation, dozens of projects have tried without success. There are many reasons for this failure: a lack of standards, an unwillingness to adopt proprietary systems and centralized implementations, poor user experience, and slow browsers, among others.
Approval of open annotation by the W3C as a web standard in February 2017 changed everything by establishing a foundation upon which interoperable systems could be built. (Nearly all widely adopted technologies rely to some degree on standards: browsers, email, and cellular networks, for instance.) The existence of an annotation standard, along with the FAIR (Findable, Accessible, Interoperable, and Reusable) principles, and Interoperability in particular, will finally make the widespread uptake of this technology possible in a scholarly context in a way that protects against vendor lock-in.
Some skeptics are quick to point out that publishers and media sites have tried to implement comments on their sites, with results ranging from low use to incivility, trolls, and spam. In other words: “Haven’t we tried this already?”
Advantages of annotations over comments
There are some essential differences between comments and what is possible with the architecture of web annotations:
- Annotations can be public, private, or shared within various types of collaboration groups, while comments have only one option: public. Because we’ve only known public comments, we tend to assume that the most public form of engagement applies to annotations too. In fact, public annotations represent only 25% of annotator activity. Annotations made privately are also indications of engagement, providing metrics on which parts of documents receive attention and when. The ability to form ad-hoc groups opens annotation up to diverse use cases, from personal organization to classroom use, research teams, and many other applications.
- Comments have only one motivation: discussion. The W3C model allows users to add a motivation to their annotations. You could imagine “commenting”, “correcting”, “questioning”, “classifying”, or “tagging”, among others. While these have not yet been implemented in any systems we’re aware of, they hint at a much broader range of potential use cases.
- Comments are stranded upon the individual pages where they are created. There is no way for readers to get access to all of their comments without returning to the pages in question — comments in a sense belong to the publisher that implements them. Annotations by contrast are owned by the author, and they can be browsed, searched and shared with others, giving readers and researchers the ability to organize notes and discover feedback generated by others from across the web.
- Annotations can syndicate across formats (HTML, PDF, EPUB) and even across platforms, so readers need not worry that a key conversation is taking place on another version.
- Comments sit in a jumble at the bottom of the page. Annotations are inline, associated directly with the text, connecting reflections to the sentence or phrase in context. This means you can annotate precisely, and for more kinds of reasons, as you move through a text.
- Annotations can thus serve as direct links that take a visitor right to a passage, automatically scrolling to it however far down the document it is.
- Annotations can be created or retrieved through an API. This means that annotations can be made by machines, in specialized group layers, for all kinds of purposes, including informative tag sets, biocuration, translation, correction and retraction alerts, and more.
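The properties mentioned above, motivations, precise in-text anchoring, and machine readability, all come directly from the W3C Web Annotation Data Model. A minimal sketch of a standard-conformant annotation, built as a Python dictionary, might look like the following; the document URL, identifier, and quoted passage are hypothetical, while the field names, the motivation value, and the TextQuoteSelector come from the W3C model.

```python
import json

# A minimal W3C Web Annotation, expressed as a Python dict.
# The URLs and quoted text below are hypothetical examples; the field
# names and vocabularies follow the W3C Web Annotation Data Model.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "http://example.org/anno1",          # hypothetical identifier
    "type": "Annotation",
    "motivation": "questioning",               # one of the standard motivations
    "body": {
        "type": "TextualBody",
        "value": "Which dataset was used here?",
        "format": "text/plain",
    },
    "target": {
        "source": "http://example.org/article.html",  # hypothetical document
        "selector": {
            # Anchors the annotation to an exact passage rather than a whole
            # page, which is what makes deep links to a phrase possible.
            "type": "TextQuoteSelector",
            "exact": "the results were inconclusive",
            "prefix": "After three trials, ",
            "suffix": " and further work is needed.",
        },
    },
}

# Serialize for exchange between annotation services.
print(json.dumps(annotation, indent=2))
```

Because the selector describes the target text itself (with surrounding context), rather than a position on a rendered page, the same annotation can in principle be re-anchored across HTML, PDF, and EPUB versions of a document.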
The promise of open annotation
What about bad behavior?
If publishers or other sites deploy annotation across their content, they want assurance that the annotations created there do not detract from the quality of that content. They also want to make sure that the workload of moderation doesn’t overwhelm already burdened editorial staff.
While tool creators typically monitor the public layer, following up on any moderation flags that users click, publishers can also create groups for which they manage the moderation tools. Community and publisher guidelines detail expectations around user behavior. Improper annotations can be hidden from public view, and repeat offenders may have their accounts suspended. Tool creators are also exploring sentiment analysis to identify toxic annotations and monitoring user behavior to identify quality contributors.
There is a wide range of use cases for annotations in academia and e-learning.
Personal uses include note-taking on documents to organize thoughts and ideas, to emphasize important information for later review, and to add related information such as images and links.
Private group annotations streamline the communication and collaboration of research groups. Teams use annotations to get a better understanding of a text by asking questions about it and discussing its strengths and weaknesses. They can organize their takeaways from the literature by sharing new insights in context, pointing at relevant paragraphs, or contributing related artifacts. New manuscripts can be improved through proofreading and review.
Public annotations benefit the research community by improving the exchange between readers and authors and by keeping research information relevant and up to date. Readers can ask the community or the author of a text for clarifications, research data, or experiment protocols. They can underline the merits or limitations of research findings and add their own contributions and findings, creating a network of connected knowledge. Authors and editors can enrich literature by adding corrections, updates, and recommendations precisely where they belong.
E-learning is another field benefiting from in-document discussions. Annotations on lecture slides and textbooks are used to make university lectures more interactive, to power distance learning, and to encourage students to help their peers and contribute to the content themselves. Integration of open annotation tools with Learning Management Systems can simplify instructor workflow, course interaction, and assessment.
Lastly, peer review involving annotations is more granular and detailed, which often results in improved quality. Annotation technology can be useful in both traditional pre-publication peer review, as well as post-publication open or community peer review.
Annotations and FAIR
Scholars and research organizations are increasingly interested in the FAIR guiding principles for scientific data management and stewardship: the requirement that data be Findable, Accessible, Interoperable, and Reusable. Annotations themselves are data and should be FAIR, and annotations in turn help make scholarly content FAIR by adding searchable metadata and links.
To be Findable, annotations should have persistent unique identifiers which are included in their metadata and registered or indexed in a searchable index. To be Accessible, annotation metadata should be retrievable with a standardized communications protocol which is open, free, and universally implementable; that metadata should remain available even if the data itself is no longer available. Interoperability requires a formal, accessible, shared, and broadly applicable language; vocabularies that follow FAIR principles; and qualified references to other metadata. Finally, to be Reusable, metadata should include a plurality of accurate and relevant attributes, a clear and accessible data usage license, and detailed provenance. Even annotations that are not made for public consumption benefit from being FAIR. Robust machine-readable metadata makes annotations easier for their creator or collaboration group to find, access, and reuse. Annotations originally made for private purposes, such as in peer review or in journal clubs, may well be made visible later. Further, interoperability is a key aim of the W3C Web Annotation group, so that annotations made with one tool can be interacted with by those using other tools, irrespective of their level of visibility.
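The FAIR requirements above map naturally onto metadata fields the W3C model already defines, such as `id`, `creator`, `created`, and `rights`. As a rough sketch, one could check whether an annotation carries this minimal FAIR-relevant metadata; the example annotation, its identifier, and the choice of required fields are illustrative assumptions, not a formal FAIR compliance test.

```python
# A sketch of a minimal "FAIR check" for annotation metadata, assuming
# W3C Web Annotation JSON. The required fields echo the discussion above:
# a persistent identifier (Findable) plus provenance and an explicit
# usage license (Reusable). The field set chosen here is illustrative.
REQUIRED_FIELDS = ("id", "creator", "created", "rights")

def missing_fair_fields(annotation: dict) -> list:
    """Return the FAIR-relevant fields absent from an annotation."""
    return [f for f in REQUIRED_FIELDS if f not in annotation]

# Hypothetical example with provenance and license present.
anno = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "id": "https://example.org/annotations/1",   # persistent identifier
    "type": "Annotation",
    "creator": {"type": "Person", "name": "Jane Researcher"},
    "created": "2018-07-01T12:00:00Z",
    "rights": "https://creativecommons.org/licenses/by/4.0/",
}
print(missing_fair_fields(anno))                 # prints []
print(missing_fair_fields({"type": "Annotation"}))  # all four fields missing
```

A private journal-club annotation that already carries this metadata can later be flipped to public visibility without any cleanup, which is one practical payoff of making even private annotations FAIR.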
Steady progress is being made to make annotations FAIR. “Crossref’s newest content type in our metadata store ensures that scholarly discussions such as annotations are easy to find, cite, link, and assess — all core characteristics of the FAIR principles. For those not registered with Crossref, our Event Data service ensures that annotations which are made on content with a DOI are also included in the Crossref scholarly research map,” notes Jennifer Lin, Director of Product Management at Crossref.
Interoperability is the key
Of all aspects of FAIR, Interoperability might be the most important. Interoperable, standards-based annotation will allow researchers, students, and readers to read and respond to each other’s annotations even if they are using different platforms and clients, in much the same way that email works today. Interoperability of annotation tools should also allow users to port their data from one tool to another or to archive their annotations securely for use later in another context. Most importantly, interoperability is a safeguard against providers who would try to lock users in to a specific implementation, or worse, to a monolithic service.
The Annotating All Knowledge Coalition, free to join, was formed in 2015 to bring together interested publishers, universities, and technology organizations to realize an open interoperable annotation capability within the scholarly world. Today, members are exploring the use of multiple tools from a user experience perspective, with the goal of someday achieving true interoperability.
What will it take to get there?
A healthy, robust, interoperable annotation ecosystem, delivering upon all of the promise detailed in the use cases above, will require the participation, effort, and resources of many players. Tool creators should build in accordance with the standard to enable interoperability for users and partners, put control of annotation data in the hands of its creators through APIs, and avoid creating new proprietary silos. In turn, publishers can avoid vendor lock-in and be free to migrate from one system to another. Annotation enthusiasts should focus on the new standard and insist on FAIR annotations, in keeping with the increasing focus on openness and transparency across scholarship.
Interest in annotation continues to grow. Recently celebrating its sixth year, the I Annotate meeting gathers those interested in open annotation to explore use cases, assess industry developments, and demo new integrations. Videos from the event are available on YouTube. More panels at industry events are focusing on the possibilities around annotation in researcher or publisher workflow. It’s one of the featured topics at the upcoming Altmetric 5:AM Conference.
Annotation, whether it is public or private, can serve as a valuable metric for engagement with documents and parts of documents. If you’d like to join in the discussion about open interoperable annotation, we welcome your feedback.