Many information professionals are concerned about the loss of serendipitous discovery in research pursuits (see this 2015 Kitchen post by Roger Schonfeld). Depending upon what an individual user knows about a topic when framing their search, our sophisticated systems may either direct the person’s thinking into too narrow a groove, precluding discovery of more loosely relevant items, or inundate the user with too many content possibilities. Online information resources are tightly engineered and therefore dependent upon well-structured metadata. What’s needed may be a different approach, one that allows the user more latitude in thinking out the scope of the question without having to be too precise.
Yewno is a semantic-analysis engine that was formally launched at ALA this year, although its creators offered some low-key presentations earlier in 2016 at meetings held by SSP and AAUP. The Yewno technology is run across full-text content, with the system creating a matrix of semantic entities found in each document. Yewno uses a mix of computational semantics, graph theory, and machine learning to retrieve relevant documents without reliance on restrictive conventions imposed by external technology or data format requirements. According to Michael Keller of Stanford, this means that Yewno enables searching of ideas rather than specific expressions, such as keywords. The technology is currently in beta-testing and/or trials at eight institutions: Harvard, Stanford, MIT, the University of Michigan, University of California–Berkeley, Stonehill College, Oxford University, and the Bavarian State Library.
There are two panels in the highly visual interface. On the left is the graph or concept map, while the panel on the right is referred to as the context bar. Run a search and the system presents a graph of orange and blue nodes. An orange node represents a central concept with which the user is concerned (perhaps the proper name of an individual or an umbrella phrase representing a school of thought or a body of knowledge). It is placed at the center of the concept map. Surrounding it are blue nodes: circles representing concepts that are connected by lines to the concept in the orange node. Double-clicking on a blue node adds that idea to the user’s map and enables discovery of further correlations.
Clicking on any concept node brings up a description of the concept in the right-hand context bar. The user’s clicks move through an iterative process that allows the original point of inquiry to be either broadly expanded or narrowly refined, and ultimately the click process yields relevant content.
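For readers who like to see the idea in miniature, here is a toy sketch of a concept map in Python. It is not Yewno's method, which has not been published in detail. It simply assumes that two concepts are related when they appear in the same document, and that expanding the map means adding a chosen concept and looking at its neighbours. The documents, concept names, and function names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy data: each document reduced to the set of concepts found in it.
DOCS = {
    "doc1": {"Jane Austen", "Church of England", "Clergy"},
    "doc2": {"Jane Austen", "Regency era", "Novel"},
    "doc3": {"Church of England", "Clergy", "Georgian era"},
}

def build_concept_graph(docs):
    """Weight each pair of concepts by the number of documents containing both."""
    graph = defaultdict(lambda: defaultdict(int))
    for concepts in docs.values():
        for a, b in combinations(sorted(concepts), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph

def related(graph, concept):
    """Return the neighbours of a concept, strongest link first, ties alphabetical."""
    neighbours = graph.get(concept, {})
    return sorted(neighbours, key=lambda c: (-neighbours[c], c))

def expand_map(graph, user_map, concept):
    """Mimic double-clicking a related node: add it to the user's map and
    return the related concepts that are not yet on the map."""
    user_map.add(concept)
    return {c for c in related(graph, concept) if c not in user_map}

graph = build_concept_graph(DOCS)
user_map = {"Jane Austen"}                       # the central (orange) node
first_ring = related(graph, "Jane Austen")       # the surrounding (blue) nodes
new_ring = expand_map(graph, user_map, "Clergy") # one step of exploration
```

In this toy collection, "Jane Austen" and "Georgian era" never share a document, yet one step of exploration through "Clergy" brings the second concept into view. That is the kind of iterative broadening the interface is meant to support.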
Sitting in the Yewno booth at ALA, I input one of my standard test queries, which has to do with Jane Austen’s depiction of the clergy in the Georgian Church of England. (I first developed it when Google and Microsoft began introducing search tools specifically for academic literature.) For the record, the test bed of content used for demonstrations at the conference included content from Wikipedia, Springer Nature, and Taylor & Francis, a respectable disciplinary mix. (It’s worth noting that the benefits to publishers of allowing their content to be crawled in this fashion include not only enhanced discoverability of their content, but also access to useful referral data and metrics.)
Yewno recognized two concepts from my query, “Jane Austen” and “Church,” but did not directly connect the two as I might have expected (Austen was the daughter of an Anglican clergyman). However, Connor Shepherd, Product Lead for Yewno, explained that this might have been due to one of two factors: the specificity of my initial six-word query or the scope of the content collection over which Yewno was running. The content might not have been sufficient to allow Yewno to capture the relationship, weight its relevance, and build the related connection.
This is not to suggest that Yewno was unable to retrieve appropriate content for me, however. Clicking about a bit brought up more information in that side context bar, specifically a scholarly review of Juliette Wells’s Everybody’s Jane: Austen in the Popular Imagination (Bloomsbury Academic, 2012). The system showed me a “snippet” of that book review with a highlighted passage. Readers might shrug and note that at this point any content platform worth its salt can do that, but there’s a nuance here. The displayed highlight represents what the Yewno algorithms have identified and extracted as an important thesis statement from within the document. The output is drawn from the system’s semantic analysis, and one’s original concept phrases need not appear in the highlighted portion of the snippet.
The user can scroll down the right-hand context bar to discover the scope of material that may be immediately available in the local library. (Like all discovery systems, Yewno is not itself a host platform, but ingests content and constructs its matrices based on what the particular institution’s licensed access allows.)
The approach here is to de-emphasize search based on pre-assigned metadata and thereby minimize system influence over the direction adopted by the researcher. Instead, Ruggero Gramatica and Ruth Pickering, the founders behind this nascent start-up, hope that their technology will assist students in visualizing the shape and scope of a research topic and aid in the subsequent work of refining the focus of the investigation. Content retrieval might almost be seen as a secondary aspect of this discovery tool: it’s a needed outcome, but that outcome is kept in proper proportion.
The two librarians who spoke about Yewno during the ALA launch session noted that the product is particularly valued as a mechanism for supporting the development of critical-thinking skills. Cheryl McGrath, Director of the MacPháidín Library at Stonehill College in Easton, MA, and Jason Price, Director of Licensing Operations at the Statewide California Electronic Library Consortium (SCELC), both indicated that in early trials the system is being enthusiastically embraced by undergraduate and graduate students and some faculty advisors.
From my perspective, the central idea behind Yewno is a good one. The intent to enable more fluid forms of discovery for undergraduates need not eliminate professional-grade indexing services or discovery tools, such as those provided by the Modern Language Association, CAS (Chemical Abstracts Service), OCLC, or any commercial player that one might name. Yewno represents an alternative approach that will be suitable for some, but by no means all, user needs. I recommend that followers of the Scholarly Kitchen take the opportunity to explore the possibilities of the system when a publicly accessible demonstration site opens. According to the discussion at ALA, such a site is due before summer’s end.