I was surprised recently when a former co-worker mentioned that he’d found a hobbyist blog I’d abandoned years ago, still available and discoverable. I can no longer delete it, because the email address I used when establishing it is defunct, and I don’t recall the password. Without a major effort, one I’m not willing to exert, I am stuck with it.
The recent court ruling in Europe establishing a “right to be forgotten” brings up interesting issues for scholarly and scientific publishers, who have spent the better part of the last decade bringing vast archives of old research reports online.
Last week, in the New Yorker, Jeffrey Toobin’s excellent article detailed some of the issues involved in the ruling and its aftermath. He starts with a story of a girl who was decapitated in a car crash, pictures of which leaked out of the coroner’s office and onto the Internet. These devastating pictures have proven nearly impossible to remove and difficult to conceal. The parents have had to deal with this nightmare on top of the nightmare of their young daughter’s death.
Remember the days when editors and others would scoff at the Internet as something that was too unstable and fleeting to compete with paper, the more durable and enduring medium? Toobin’s article touches on the nice way paper tends to vanish or become very obscure around the time it suits humans, while the Internet remains stubbornly persistent and timeless:
“Back in the day, criminal records kind of faded away over time,” [Sharon] Dietrich[, director of Community Legal Services in Philadelphia,] said. “They existed, but you couldn’t find them. Nothing fades away anymore. I have a client who says he has a harder time finding a job now than he did when he got out of jail, thirty years ago.”
Viktor Mayer-Schönberger, author of “Delete: The Virtue of Forgetting in the Digital Age,” is also quoted in Toobin’s story:
. . . digitization and cheap online storage make it easier to remember than to forget, shifting our “behavioral default,” Mayer-Schönberger explained. Storage in the Cloud has made information even more durable and retrievable. . . . “We do not know what the future holds in store for us, and whether future governments will honor the trust we put in them to protect information privacy rights.”
While some portray this as a “freedom of the press” issue, I don’t see it. Are we supposed to take the blinds off our windows to serve the press? The press having the freedom to say what it wants and to operate without answering to a governmental authority is far different from the press having access to all our information forever. The press can still contact us, ask questions, verify sources, and report like they always have. We have no obligation to serve it all up on a silver platter.
For scientific and scholarly publishers, there is a corollary issue, and it relates to our archives. They are, at best, mixed bags, a fact that has become more apparent as assumptions have emerged that journals are utilitarian resources for broad, non-expert audiences rather than historical records for specialist research communities. Some of the papers are classics and have aged well. Some are forgettable but harmless. Still others are misleading or wrong and should be taken out of practical circulation, as they matter only for historical purposes. But we mostly manage our archives as if they were uniformly relevant and useful, or purely historical, as if there were no differences in what they contain.
Some journals and publishers manage the issues involved by segregating the archive, usually at a clear cutoff point, say 1990 or 1995. Others add this cutoff point to their internal search engine, but this addresses only the site’s native search, not general web search. Some mark their archive articles accordingly. Each of these approaches acknowledges the issue but addresses it in a blunt and convenient way.
When journals do dive into their archives with intellectual intent, the picture often becomes more nuanced. Anniversaries and milestones are often the occasion for this work, and editors frequently find the archive to be much different from what they’d imagined. There are funnier, stranger, more interesting, and more astounding things in there than they’d thought.
In fast-moving fields, the archive can be downright misleading unless accessed by an expert who knows the field. In medical fields, in particular, historical information can include procedures that are no longer practiced, diagnoses that are no longer used, and tests that have been proven irrelevant and misguided. A disease once treated only surgically can now be treated medically or with radiation therapy, or some combination.
And in an era with more access given to less qualified people (laypeople and an increasingly unqualified blogging corps presenting themselves as experts or journalists), not to mention to text-miners and others scouring the literature for connections, the obligation to better manage these materials seems to be growing. We can no longer depend on the scarcity of print or the difficulties of distance or barriers of professional expertise to narrow access down to experts with a true need. More and more, a simple search can unearth materials of questionable relevance, presented without condition or qualification.
An interesting issue with archival articles is what “peer review” meant in the period in question. If your journal stretches back to before 1900, chances are that peer review in those earlier eras was much less rigorous than it was after, say, 1950. In fact, journals were scrambling for articles; many emanated from groups of colleagues who published one another almost exclusively, and many were regional and therefore limited to a small group of labs or hospitals. Peer review under these conditions was highly variable and far less diversified than it is today. Should your archive provide some insights into what publication practices were like in 1895 vs. 1995?
Taking the time to sort through these vast archives and make designations of what is put in historical shadow is a daunting task. Toobin writes about how Google and Bing and other search engines are scaling up efforts to assess requests to be forgotten. Our job is less daunting — we have static archives of finite size, making it a one-time effort. But it is a major editorial effort, nonetheless.
Such work is fraught with uncomfortable and unfamiliar editorial decisions. Leeches provide an interesting example. They are no longer mainline medical therapy, but they are of great historical interest and still inform some drug development approaches. Are papers about the application of leeches to be put in the “history” bin? Or should they remain in the main retail outlet? Surely, within the category “leeches,” there are some papers of purely historical interest while others remain somewhat relevant to current research pursuits. Making these distinctions would give our users not a right to be forgotten, but the right to forget.
The Internet has proven to be almost uncomfortably persistent. Our archives are, to some unknown degree, part of the problem, adding to search results articles of variable applicability and relevance. We are already facing recurring problems with a widened funnel of journals publishing papers of dubious quality. Over the last decade, we’ve also widened the funnel at the base, by adding huge archives. Mismanagement at either end is a sort of filter failure. Do we need to do some more work here, to provide our audiences with an implicit “right to an ahistorical literature,” one that is based more purely on pragmatic relevance than on historical habits?