At ALA this year, I had the happy experience of sitting down in the wrong session but then serendipitously hearing a worthwhile presentation by a search industry expert, Daniel M. Russell. Russell holds the title of Senior Research Scientist for Search Quality at Google, and for roughly 13 years he has actively engaged in educating users in the appropriate use of various Google search tools. (One of his recent articles appeared in Scientific American.) His appearance at ALA was to generate interest in his forthcoming book, The Joy of Search (MIT Press, September 2019). I was subsequently lucky enough to receive a set of uncorrected page proofs of that title from Amy Harris of MIT Press.
Russell’s book should be viewed as an educational resource for those about to embark on information literacy instruction. Online search is complex, and while explaining GIGO (garbage in, garbage out) seems a rudimentary place to start, it remains a foundational concept. If you don’t train users in the thought required to frame a research question, the best practices for crafting a search query, or the value of documenting the results derived from that query, they will not do well.
Despite the old warning that only librarians like to search, The Joy of Search does offer a lively means of helping users develop the thinking skills needed to approach available tools strategically when solving an information problem. The book consists of 20 chapters, each of which presents an example of a complex information-seeking task and lays out the various processes used to satisfy the inquiry. These are not just fact-finding examples. Each one required a variety of iterative searches and reliance on different Google tools. (Note: Google Street View and YouTube are used with surprising creativity and frequency.) The chapter headings suggest the variety of off-beat searches that pop into enquiring minds.
- Chapter 11: Can You Die from Apoplexy or Rose Catarrh?
- Chapter 14: What’s the Connection between “The Star Spangled Banner” and the General Who Burned the White House?
- Chapter 16: Is Abyssinia the Same as Eritrea?
Consider what the user might do in answering the question of Abyssinia and Eritrea. Google just those two terms, as one might do on impulse, and the top results are from Wikipedia. There is no Knowledge Graph card that answers the question definitively. Because user intent is unclear, Google offers alternative questions it may have previously answered; focused on saving the user time, the system presents these as perhaps being what the user truly needs to know. (The available suggestions include “What was Eritrea before?” and “How old is Eritrea?” but no immediately digestible answer to Chapter 16’s question.) I was amused by one caution included in the book, although it doesn’t appear until the closing pages:
> …search engines don’t signal that they lack the knowledge to supply an answer, yet they don’t want to look bad so they give a Web-search set of results instead. That’s a great fallback position, but it’s also an important difference between an answer and a set of search results.
Serious researchers already know that. An individual may need to formulate a new search strategy or even a series of strategies in order to satisfy the particular need.
Reinforcing that lesson is even more important when one considers the rapid expansion (just in terms of volume) of content. In his talk at ALA, Russell noted that during every minute of the day, more than 400 hours of video content gets uploaded to YouTube, and that there are over a billion viewings of learning-related videos daily. He noted as well the diversity found in content forms, referencing the photography found in Google Street View. He touched briefly on searching inside Google Books (but, interestingly, did not refer the reader to Google Scholar, a burgeoning source of reliable content).
While Russell’s accounts of his process are always fun, the bulk of his guidance should qualify as common sense. Scrutinize the content you find. Assess the source’s reliability, consistency, and credibility.
Other recommendations are specifically targeted at the younger or inexperienced user, explaining the need to factor in changes in language across time. In pursuing the death rate of soldiers from dysentery during the Civil War, he notes that a preferred term of the time was “flux” and suggests exploring archival resources (Library of Congress, HathiTrust) in order to pick up such shifts in terminology.
Another useful chapter walks the reader through Google Dataset Search, a beta-phase tool that indexes public data sets. Search for “African American population” and you get a wealth of economic data from the St. Louis Federal Reserve. Searching for something as specific as time spent reading the Bible in the U.S. from 2013 to 2017 yields reliable results from the paywalled site Statista. On the downside, while Russell notes in later chapters the need for searchers to be aware of the coverage and limitations of the resources searched, the FAQ for Google Dataset Search is remarkably vague on both counts. What data has been uploaded there? Only by experimenting with queries would one be able to tell. Questions of scope and comprehensiveness have been a long-standing point of friction between Google representatives and the information profession.
To complete information tasks, it is important that users learn how to engage with the systems that surround them. Russell’s acknowledgements in the book express appreciation to anthropologist Mimi Ito for “reminding me that most people think of online research as a pedestrian skill that shouldn’t need any teaching…She pointed out that this book needs to be intrinsically interesting.” That’s an important insight, and Russell clearly took it to heart. His book is both lively and informative. One hopes that some set of college syllabi will refer students to this text as an advisable addition to the learning experience.