“Hey scientists, how much of your publication success is due to dumb luck?”
In scientific journalism, this headline was about as close to click-bait as you can get. And yes, I clicked.
There is a scathing narrative peddled by critics of scholarly publishing that the selection and review processes behind scientific papers are elaborately disguised shams. Editors, they argue, are no better at telling good science from bad than flipping a coin, rolling dice, or buying a lottery ticket. “Dumb luck” fits neatly into this narrative.
The paper that prompted the click-bait headline tells a slightly more nuanced story. The article, “Quantifying the evolution of individual scientific impact,” by Roberta Sinatra and others (Science, 4 Nov 2016, DOI: 10.1126/science.aaf5239), tracks the changes in productivity and impact of scientists throughout their careers. From a methodological standpoint, this was not easy to do. (Remember that ORCID IDs are still very new and not universally adopted.) Disambiguating and matching tens of thousands of authors to their papers over a long period of time is a feat in itself.
Creating a career trajectory for each scientist provides us with a view that we don’t often see in bibliometrics. While some have meticulously documented and analyzed the publication histories of Nobel laureates and other notable figures, the career trajectories of 99.9%+ of scientists are largely ignored.
Contrary to the popular notion that scientists publish their most influential work when they are young, Sinatra discovered that a researcher’s highest-impact paper can be published at any time during her career. She calls this finding the random-impact rule:
Impact is randomly distributed within a scientist’s body of work, regardless of publication time or order in the sequence of publications. We call this the random-impact rule because it indicates that the highest-impact work can be, with the same probability, anywhere in the sequence of N papers published by a scientist. We find that the random-impact rule holds for scientists in different disciplines, with different career lengths, working in different decades, and publishing solo or with teams and whether credit is assigned uniformly or unevenly among collaborators.
I need to highlight that the use of random in this context means that there is no particular time in a scientist’s career when her highest-impact work is published: It could happen early. It could happen late. It could happen mid-career. A lack of impact order is not the same as stating that a paper is randomly picked to be highly cited, or that the scientific success of any given paper is a matter of “dumb luck.” I cannot be more emphatic about these differences in meaning.
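To make the distinction concrete, here is a toy simulation (mine, not the authors’ analysis) in which each paper in a synthetic career draws its citation count from the same heavy-tailed distribution. The distribution, career length, and parameters are all assumptions chosen for illustration; the point is only that the best paper lands early, mid-career, or late with roughly equal frequency, even though nothing about writing any individual paper is left to chance.

```python
import random

random.seed(1)

def career(n_papers=30):
    # Hypothetical citation counts drawn from a heavy-tailed (log-normal) distribution.
    return [int(random.lognormvariate(2, 1.5)) for _ in range(n_papers)]

positions = []
for _ in range(10_000):
    cites = career()
    positions.append(cites.index(max(cites)) + 1)  # 1-based position of the best paper

# Count how often the best paper falls in the early, middle, and late third of a career.
thirds = [sum(1 for p in positions if lo < p <= lo + 10) for lo in (0, 10, 20)]
print(thirds)  # roughly equal counts: timing is unpredictable, the process is not dice-rolling
```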
In many quantitative fields, random can describe a relationship with no predictive power — in this case, an inability to predict when a researcher will publish her most-cited paper. It can also describe the residual error that cannot be explained by a statistical model. Here, randomness simply means that the model is not capturing the variables that would predict the outcome. It does not mean that the process behind the observations is random.
In the process of writing a manuscript, an author does not spin a wheel, roll dice, or flip a coin to select references. While there are many reasons why an author may cite one paper over another (paper A is already highly cited, written by a well-respected scientist, or published in a prestigious journal), randomness is not one of them. The inability to predict which papers will become highly cited is not part of the structure of science but a failure of the predictive model. Success is not up to dumb luck.
The main contribution of this paper is that it attempts to disentangle the effects of productivity, individual ability, and luck on the success of a scientist.
In her analysis, Sinatra assigns a single value (Q) to every researcher, which remains constant over the researcher’s entire career. Intuitively, a stable Q-value makes sense. Some scientists build a reputation for publishing important work in prestigious journals, and this is reflected in a higher average citation impact per publication. A scientist with a low Q-value may publish only periodically and receive few citations for her contributions.
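As I read the paper, the Q model treats each paper’s impact as the product of a stable individual factor Q and a “luck” factor drawn from a distribution shared by all scientists. The sketch below illustrates that multiplicative structure with a naive geometric-mean estimate of Q; the distribution parameters and the estimator are my simplifications for illustration, not the authors’ exact method.

```python
import math
import random

random.seed(2)

MU_P, SIGMA_P = 0.0, 1.0  # assumed parameters of the log-luck distribution (illustrative only)

def simulate_career(Q, n_papers):
    # Each paper's impact = Q * p, where p is a luck factor shared across all scientists.
    return [Q * math.exp(random.gauss(MU_P, SIGMA_P)) for _ in range(n_papers)]

def naive_q_estimate(citations):
    # Geometric mean of impact, with the mean log-luck term divided out.
    mean_log_c = sum(math.log(c) for c in citations) / len(citations)
    return math.exp(mean_log_c - MU_P)

papers = simulate_career(Q=5.0, n_papers=50)
print(round(naive_q_estimate(papers), 2))  # hovers near the true Q of 5.0
```

Note that the estimate only stabilizes as the number of papers grows, which is one way to see why a long publication record is needed before Q says anything reliable about an individual.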
The weakness of the individual Q-value approach is that it takes a decade or more of publication history to calculate one’s Q-value, so it is largely useless for evaluating early-career researchers for tenure and promotion. And like the h-index, the Q-value reduces a researcher’s entire body of work to a single metric. It also views research and publication as a mechanistic process, like lottery balls jostling around in a wire cage or gas molecules reacting with each other in a chamber. The scientific community is somewhat more complicated.
Scientists are a highly stratified group, with the most productive and impactful scientists residing at a handful of prestigious research universities. These scientists are able to build teams of bright, hard-working graduate students, post-docs, and technicians, attract the major research funds that allow these teams to work, and produce the majority of publishable papers. In instances where research requires very specialized and rare equipment (like a radio telescope or a synchrotron), these teams are few and highly connected. This aggregation pattern even has a name, the social stratification of science, and has been well documented since the early 1970s.
It is neither novel nor surprising that a researcher who begins her career at a top research institution has a relatively stable Q-value.
In the discussion section, Sinatra hand-waves at exogenous factors (variables that were not included in the model) that may help predict career impact: education, institution, the size and dynamics of subfields, gender, and publication habits. Because these variables were left out of the model, however, their relationship with Q remains unknown.
In an interview for the Chronicle of Higher Education, the senior author of the paper, Albert-László Barabási, openly dismisses questions about explaining researchers’ Q-value:
“I’m not a policy maker, I’m a researcher,” he said, “and I don’t want to go there, because I don’t understand where this quality factor comes from.”
Discussion
The one highly variable component of research is reality itself: sometimes it’s maddeningly complicated and you can only publish impenetrable, long-form papers, and (very occasionally) you arrive at a clear insight that generates a lot of attention.
Indeed, there is a lot of serendipity, which is often a fancy word for luck. Big discoveries are often accidental. Several of mine certainly were. This may be a reason why the occurrence of high-impact papers is unpredictable. I think of it as being like prospecting. The gold has to be there before it can be found. This is why I prefer the term importance to quality. Calling important work high quality suggests that the researcher controls the outcome, but that is not so. Nature determines the outcome.