The mantra of the Digital Age — “Move fast and break things. Unless you are breaking stuff, you are not moving fast enough” — was coined by Mark Zuckerberg in the early days of Facebook. The quote’s underlying message had already consumed Silicon Valley and others via its shorthand equivalent, “disruption.” Disruption was deemed to be a good thing, shaking the cobwebs off the status quo and putting necessary revolutionary pressure on slow-moving enterprises and weak ideas.
What transpired from there has been akin to other disruptions of society caused by new technologies, whether the printing press or the Industrial Revolution. As with those and other convulsions, there is a beginning, a middle, and a resolution. We may be approaching the end of the beginning of this one.
After years of disruption, we have a polarized society with less scientific literacy and less cohesion, international alliances under strain, greater income inequality, tense race and sex relations, and a Doomsday Clock that has advanced a minute closer to its symbolic midnight in just over a year. We have depressed and stressed children and teens showing how “social media” is deeply antisocial, heaps of misinformation (including Facebook yet again enabling misinformation about vaccines), and a raft of other maladies that all seem somehow related to the fracturing and exploitation, in short the disruption, of our information environment.
Clearly, the disruption has not delivered a panacea as expected. Now, the tide seems to be turning, as more and more experts are saying the time has come for us to fix what has been broken.
Andrew Keen in his new book How to Fix the Future writes about this while also indicating how we can get our bearings again:
The future isn’t working. There’s a hole in it. . . . Ourselves. We are forgetting about our place, the human place, in this twenty-first century networked world. That’s where the hole is. And the future, our future, won’t be fixed until we fill it.
Yet, the momentum created by moving fast may cause some things to be broken still. One new example may be the unanticipated harm that could be created by moving fast into the realm of open citations.
Say what you will about the Impact Factor and its limitations, but one thing it doesn’t do is foster mob or swarm mentality. As currently constructed and executed, the metric is a lagging indicator: it is announced long after the citations occur, it is opaque given the sheer volume of citations across a large number of sources, and it lacks specificity at the article level. As a result, it does not encourage any particular article’s citations to “go viral.” If you look at the citation curve for nearly any journal, the Impact Factor reflects a small percentage of highly cited articles and a larger number of moderately or minimally cited articles. Because authors have no immediate data or tools with which to respond to known citation patterns, the act of citation is nearly organic, with other factors (brand, distribution, reputation) certainly contributing to some extent. Cheating does occur, but mechanisms are built in to discourage and detect it.
The approach taken with the Impact Factor can seem anachronistic when nearly all modern media is reliant on business models built on immediacy, network effects, and the particular psychological games these approaches have become known for — exploiting insecurity, groupthink, clickbait, and swarm behavior.
Recent calls to make citation data “open” could move citations into this dubious modern age, and there is a good amount of enthusiasm for the innovation. But are there potential downsides? Could open citations inadvertently foment herd mentality and swarm behavior around citations? Could it feed the dominance of top journals by reinforcing their position with fast feedback loops? Could it increasingly feed the surveillance economy that’s been built around platforms and free content? Could it entice authors and editors to find new ways to cheat their way up the ladder?
Revealing citations in more or less real time may change how articles are cited, and not in a legitimate or informative way. When metrics are so visible, accessible, and responsive, they can create feedback loops that propagate swarm behavior. Popularity becomes a self-fulfilling prophecy. Seeing that something is cited a lot might make you more likely to cite it. Algorithms and discovery services may surface such articles more often in data interfaces. It’s the availability error on a networked scale. Quality works may remain hidden under these waves of clicks and self-reinforcing awareness.
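To make the feedback loop concrete, here is a minimal toy simulation. It is only a sketch under stated assumptions, not a model of real citing behavior: the function names (`simulate`, `top_share`), the article and citation counts, and the rule that visible counts make an article proportionally more likely to be cited are all hypothetical choices made for illustration.

```python
import random


def simulate(n_articles, n_citations, visible_counts, seed=0):
    """Hand out citations one at a time among a fixed pool of articles.

    Assumption (illustrative only): when counts are visible, the chance of
    citing an article is proportional to its current citations plus one.
    When counts are hidden, every article is equally likely to be cited.
    """
    rng = random.Random(seed)
    counts = [0] * n_articles
    for _ in range(n_citations):
        if visible_counts:
            weights = [c + 1 for c in counts]  # rich-get-richer weighting
        else:
            weights = [1] * n_articles  # no feedback from existing counts
        cited = rng.choices(range(n_articles), weights=weights, k=1)[0]
        counts[cited] += 1
    return counts


def top_share(counts, fraction=0.1):
    """Share of all citations held by the most-cited `fraction` of articles."""
    k = max(1, int(len(counts) * fraction))
    return sum(sorted(counts, reverse=True)[:k]) / sum(counts)


if __name__ == "__main__":
    hidden = simulate(100, 5000, visible_counts=False)
    visible = simulate(100, 5000, visible_counts=True)
    print(f"Top 10% share, counts hidden:  {top_share(hidden):.2f}")
    print(f"Top 10% share, counts visible: {top_share(visible):.2f}")
```

In this toy setup every article is identical in quality, so any extra concentration of citations in the visible case comes from the feedback rule alone. That is the self-fulfilling prophecy in miniature.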
Digital Science’s latest initiative, Dimensions, injects disruption into the citation space. You get a hint of this in Roger Schonfeld’s recent post analyzing Dimensions’ launch:
. . . [Dimensions] collapses the product categories of citation database and analytics suite into a single new product category. . . . Dimensions is inclusive in terms of content coverage, rather than curated as is the case for Scopus and Web of Science. Of course, what reads to some as more inclusive can be seen by others as less rigorous selection, given the ways that citation databases have been policed to minimize exploitation of bibliometrics.
Instead of creating a bulwark against groupthink, Dimensions’ approach to citations may be more susceptible to it. The badging they are leveraging is an extension of the Altmetric badge, and it carries forward that badge’s limitations, as outlined in a post by Phil Davis from 2013. Now, an elaboration on the flower motif (more of a 3D hexagon) is being used to encourage display of citation data as sheer numbers based fundamentally on a competition paradigm (popularity, relative ratio, etc.), all at the article level:
[Embedded tweet: Altmetric (@altmetric), January 22, 2018]
These badges and their integration into web sites have the potential to be exploited. Imagine editing a journal when there is a paradigm that causes people to chase citations at the article level. As editor, I could prioritize papers that cite recent articles in my journal: an author cites an article, and her paper moves to the front of the queue. We know some journals already unethically demand that authors add citations to the journal’s own articles. Once this is working at scale, we may see the incentives coalesce, and citation hacking will take off like wildfire.
At a more prosaic level, this disruption also has the potential to make highly cited articles even more highly cited, while academics may be reluctant to cite less-cited articles because they may deem them as too obscure or somehow lacking in community acceptance.
What Schonfeld described as “the exploitation of bibliometrics” cuts both ways, which seems a hallmark of the “move fast and break things” philosophy — small divisions and differences become exaggerated by the sheer speed and scale inherent in these platforms.
Metrics that claim to move authority to the crowd may actually decrease individual autonomy by concealing alternatives or preying on psychological biases in real-time, while metrics that place curatorial authority over data may actually increase the autonomy of individuals in the system by buffering or thwarting the swarm effect, leaving users with more control.
. . . they [Scopus] are now willing to invest in the most costly resource in building authority – people. In a world of abundant and cheap data, there is a real and growing demand for authority.
We’ve seen the negative effects of letting crowds manage information: it’s hard to detect abuse; divisiveness and abuse seem predictable outcomes when people are given tools that add up sentiment quickly and obviously; and the economics that make it work incentivize turning users into the products, which flips the script on the relationship most of us want with information products.
Predatory publishers are already a major, poorly managed problem in an industry that has embraced perhaps more disruption than it should. A recent story in the Guardian outlined how sketchy journals can support climate change denialism, lending it a false legitimacy. This is nothing new, but the fact that it continues and seems unstoppable is worrying. Could open citations expose citation data to some sort of predation? You can probably already imagine the algorithm designed to scrape open citation data to identify highly cited papers to add to a manuscript so it’s more likely to be accepted.
Algorithms are often loopholes in disguise, which clever people can easily exploit, in ways that are difficult to detect for technical (sneaky) and psychological (arrogance) reasons.
The model has been abused before, and this abuse continues to this day, although it has morphed. Google’s PageRank, which was built on the Impact Factor model and brought to scale, has become unrecognizable, as clever people with something at stake found ways to exploit the algorithm, game the citation paths, and tilt the playing field their way. Google has spent the better part of the last decade modifying its approach, first with what seemed like integrity and more lately with what seems like commercial opportunism. A recent search for the term “hematology journals” delivered results dominated by paid search results and predatory publishers, with OMICS leading the pack. Once people saw how the system worked, they could make it work for them.
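A small sketch shows how "gaming the citation paths" can work in a link-based ranking. It uses a simplified textbook version of PageRank (power iteration with a damping factor), not Google's production algorithm, which is not public. The tiny graphs, the node names, and the five-node "farm" are hypothetical and exist only for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified textbook PageRank by power iteration.

    `links` maps each node to the list of nodes it points to (cites).
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Pass this node's rank evenly along its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node with no outgoing links spreads its rank over everyone.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


# A hypothetical honest graph: nobody points to X.
honest = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "X": ["C"]}

# The same graph plus a "farm" of five nodes that exist only to point to X.
farmed = dict(honest)
farmed.update({f"F{i}": ["X"] for i in range(1, 6)})

if __name__ == "__main__":
    print(f"X without farm: {pagerank(honest)['X']:.4f}")
    print(f"X with farm:    {pagerank(farmed)['X']:.4f}")
```

Nothing about X changes between the two graphs except the manufactured inbound links, yet its score rises. The same logic applies to any ranking that counts inbound references once the counting rule is visible.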
Yielding authority to the crowd, or to some algorithm that purports to be able to moderate the crowd, can work pretty well for a time, especially in the early days when user diversity is low and a large share of the crazy or nefarious in the world has not yet interacted with the algorithms. However, if the system reaches sufficient scale, the algorithm’s limitations show, exploits become apparent, or the algorithm simply goes rogue under the stress. Google discovered this with its numerous controversies about auto-fill search terms. Facebook and Twitter learned this with Russian poaching of their systems via fake accounts. We may learn it via shady citations generated by exploits we may not yet be able to imagine. The cookbook is already posted.
Different approaches have emerged in scholarly publishing broadly over the past decade, with some journals exhibiting tinges of the “move fast and break things” philosophy with editorial standards that shed some of the perceived barriers to publication (novelty, significance, relevance) while encouraging critiques of proprietary publishing as elitist or misguided. Preprint servers are another “move fast and break things” approach that will bear further scrutiny from those who may be more in the “slow down and fix things” camp. I’ve speculated before that their presence could allow journals to resume their roles more clearly without having to drift into supporting the speedy aspects of publication.
By its very nature, the culture of scientific and scholarly publishing imposes aspects of “slow down and fix things”: retractions, corrections, and other integrity measures don’t let publishers cut the cord or move unburdened on to the next thing, the pivot, and reinvention. The closest we’ve seen a publisher come to fully abdicating responsibility is F1000 Research, whose peer review and approval practice still defies explanation when it comes to deposit into PubMed and potential retraction in cases of later disagreements or problems. F1000 Research is a case of a publisher wishing to behave as a disruptor, shedding certain obligations by enabling the crowd. Although some have hailed it as innovative, we might now sense it is more fraught.
As an industry, we’ve specialized in “slow down and fix things” practices: curating content, selecting the best, selecting relevant content for developed audiences, and slowing things down just enough to help smart people separate the wheat from the chaff. Publishers, editors, reviewers, and librarians all developed their professional habits and reputations via this model. Over the past 10 to 15 years, we have tilted more and more toward the “move fast and break things” model, which makes editors, reviewers, publishers, and librarians understandably uncomfortable, moving their roles into uncertain positions relative to the information stream. It purports to offer benefits, using attractive vocabulary (open, free, democratic), but often what breaks only feels free or open or democratic for a brief period, until the shackles of uncertainty and unreliability bring an Orwellian twist to the words.
Perhaps a contra move is in order — instead of more information loosely managed with responsibility deferred so we can move fast and break things, it might be wiser to embrace high editorial standards and more active curation in order to get ahead of the fact that platform providers are going to undergo a decade or more of fixing what they’ve broken.
I believe we are seeing an essential tension between disruption and repair in the realm of information purveyance. How organizations strike the proper balance will inform a lot of what we talk about in the coming years. In the case of open citations injected into interfaces in ways that might skew behavior, we might want to pause a beat and think again about the benefits of separating action from reaction, adding smart, informed human circuit breakers, and letting some important measures lag. The care and adjudication of the data underlying the Impact Factor, despite its flaws, may hold clues about some essential firebreaks.
(Tomorrow, Part Two explores slowing down and fixing the business model elements in ways that bear on issues of integrity, governance, and revenues.)