Editor’s Note: this post is based on a talk the author gave at the 2018 Society for Scholarly Publishing Annual Meeting.
The reason our governments and charities fund research, and why anybody devotes their life to it, is to make the world a better place. We benefit from the generation of new knowledge, which leads to better healthcare, powerful new technologies, and a deeper understanding of who we are as a species and a society. A better world is the end goal of research, so it’s not surprising to see funders and governments increasingly demanding proof of “real world” or “societal” impact. But while the idea of measuring real world impact makes sense, objectively measuring it is neither simple nor straightforward, and it raises some red flags about falling into the same traps that already plague research assessment.
Funders and research institutions set goals for researchers: here’s how we’re going to evaluate your performance. While these goals are meant to provide qualitative measures, they are inevitably reduced to quantitative metrics. In an ideal world, every grant officer, every tenure committee, and every person on a hiring committee would read every single one of an applicant’s papers and develop a rich understanding of the importance and meaning of their work. But we know that’s not going to happen. In practice, even six years after DORA, evaluators instead rely heavily on numerical metrics, largely for two reasons: there are too many things to evaluate, and no one has all the necessary subject knowledge.
We have an enormous number of people doing an enormous number of research projects. Grant programs and job openings routinely draw hundreds, if not thousands, of qualified applications. In 2014, the UK’s REF review assessed 191,150 research outputs and 6,975 impact case studies from 52,061 academics. The NIH alone awards around 50,000 grants to more than 300,000 researchers, which represents a small fraction of the applications it receives.
Even if you had the time to read everything, many decisions have to be made by non-experts. University administrators, grant officers, and others can’t be expected to analyze highly specialized research. If I don’t know a thing about Keynesian economics, how am I going to judge the importance of your work on the subject?
Real world impact cannot be effectively reduced to a numerical ranking for a number of reasons:
- It’s often subtle and non-obvious, requiring an explanation, rather than a yes/no tick box.
- It’s slow, in many fields better measured in decades rather than months.
- It’s widely variable, and there’s no one standard measure that makes sense for a particle physicist and an economist.
- Trying to metric-ize real world impact leads to short term thinking rather than doing what’s best for research and for society over the long term.
Probably one of the most impactful sets of experiments of the last century was performed in the early 1950s, when Salvador Luria and Giuseppe Bertani were looking at how bacteria protect themselves from viruses. They realized that the virus took hold more readily in some strains of bacteria than in others. Maybe kind of interesting, but where’s the societal impact?
Cut to the next decade. In the 1960s, Werner Arber’s and Matthew Meselson’s labs showed that the restricted growth in some strains was caused by an enzyme chopping up the viral DNA, something they called a “restriction enzyme”. Again, it’s bacteria, so where’s the “real world” impact?
A decade later, in the early 1970s, Hamilton Smith and Daniel Nathans, among others, isolated these enzymes and showed that you could use them to cut DNA in specific places and map it. This quickly led to the notion of “recombinant DNA,” and genetic engineering. Stanley Cohen and Herb Boyer filed a patent on the technique, and Boyer co-founded a company called Genentech, which began producing synthetic human insulin. Until then, the only insulin available to diabetics was purified animal insulin, which was hard to get and expensive. This made an incredible difference in the lives of people suffering from the condition. Arber, Nathans, and Smith were awarded the Nobel Prize for this work in 1978. The patent, by the way, wasn’t officially awarded until 1980.
That seems a pretty significant piece of real world impact. But going back to Luria and Bertani, how would we have measured the value of what they did? What metric would have captured that impact?
Real world impact can be subtle and slow. It took decades for the real payoff from the initial experiments. R01 grants are offered for one to five years. How does that time scale jibe with the course of these experiments? In today’s funding environment, would Luria and Bertani have kept their grants, or their jobs?
Altmetrics are often mentioned as a way to measure real world impact. The first thing we must be absolutely clear on, though, is that attention is not the same thing as impact. Just because something is popular or eye-catching doesn’t mean it’s important or of value. Cassidy Sugimoto summed up the state of Altmetrics and societal impact in a recent tweet:
> I have not seen strong evidence to suggest that #altmetrics – as presently operationalized – are valid measures of social/societal impact. Rather, they seem to measure (largely unidirectional and concentrated) attention to articles generated by publishers, researchers, and bots.
The key to understanding why Altmetrics, or any metrics, are problematic comes down to a concept so important it has been codified under three separate names (Campbell’s Law, Goodhart’s Law, the Lucas Critique): essentially, when a measurement becomes a goal, it ceases to be an effective measurement. As currently constructed, Altmetrics measure the publicity efforts of the publisher and author as much as they measure the actual attention paid to the work.
Some caveats should be considered when thinking about Altmetrics, though, as there are a couple of interesting and potentially relevant things they do capture. One important indicator is when a paper is referred to in a law or policy document, which shows the research directly affecting society’s approach to a subject. However, this is not a universal metric for all fields – there’s not a lot of policy being written about string theory or medieval poetry. Going back to recombinant DNA: in the mid-70s, a number of communities, notably Cambridge, MA, passed laws banning or strictly regulating recombinant DNA research. That’s certainly a sign of societal impact, but I’m not sure how much a funder or institution would reward a researcher for inspiring laws that essentially ban their research.
Another interesting metric with some potential is tracking connections between research papers and patents. In theory, a patent shows a practical implementation of something discovered during the research process. Turning an experiment into a product, a methodology, or a treatment would certainly qualify as real world impact.
But again, this is not universal. Not all research lends itself to patentable intellectual property. And Campbell’s Law comes into play again – if researchers get credit for patents, they will start patenting everything they can think of, regardless of its actual utility. Do we really want to lock up more research behind more paywalls? If everything that’s discovered gets patented, does that stop further research? I can’t take the next steps in knowledge gathering if replicating or expanding upon your experiments would violate your patent. That goes against the progress we’re making toward open science.
Time scales come into play here as well. It takes about 12 years to go from the laboratory bench to an FDA-approved drug. Most funders aren’t going to wait 12 years for you to prove societal impact before renewing your grant. Remember also that only about 1 in every 5,000 drugs that enter preclinical testing reaches final approval. If your drug gets tested and fails, is that real world impact?
Exerting pressure to productize research, and to chase small, short-term gains over long-term value, is bad for research.
This gets us to another root of society’s ills: short-term thinking. In the US, schools are willing to trade away properly educating a generation of citizens for a quick infusion of local tax dollars, and so they focus on the metric of standardized test scores. Kent Anderson wrote a post about this for The Scholarly Kitchen last month, calling it a “race to the bottom,” with people keen to pay lower prices now even when they know it will hurt them in the long run.
We already have some major issues getting funding for things like basic science. The technology and health gains we’re realizing today are built on the basic science of the last few decades. Luria and Bertani weren’t trying to cure diabetes; they were investigating the defense mechanisms of bacteria. You probably couldn’t get that project funded today, as researchers are increasingly asked to show practical results of their work. As noted sage Sarah Palin put it:
> You’ve heard about some of these pet projects, they really don’t make a whole lot of sense and sometimes these dollars go to projects that have little or nothing to do with the public good. Things like fruit fly research in Paris, France. I kid you not.
The idea of codifying “real world impact” as the key to career advancement and funding puts us in even more dangerous straits. Researchers will optimize for these goals, and we’ll see less and less basic science and more and more attempts at quick wins.
I’ll end this post with one crazy idea for measuring real world impact – hear me out, put down those torches and pitchforks! We know that science is slow and incremental, “standing on the shoulders of giants” and all that. How do we know that Luria and Bertani’s work led to synthetic insulin? From the citation record: at each step along the way, new research cited the previous research that led to it.
We know that citation is an imperfect metric: it lags, and the Impact Factor distorts it and remains deeply problematic. But there is valuable information in the citation record that can help us track the path of research. We shouldn’t be so quick to dismiss that value just because we don’t like the way academia has chosen to use the Impact Factor.
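As a thought experiment, tracing a chain of influence through the citation record amounts to a path search over a citation graph. The sketch below is purely illustrative – the "papers" and links are stand-ins for the Luria-and-Bertani-to-insulin story, not real bibliographic data, and `impact_path` is a hypothetical helper, not any existing tool:

```python
from collections import deque

# Toy citation graph: each work maps to later works that cite it.
# Entries are illustrative placeholders, not real citation records.
cites = {
    "Luria & Bertani 1952": ["Arber 1962", "Meselson 1968"],
    "Arber 1962": ["Smith & Nathans 1970"],
    "Meselson 1968": ["Smith & Nathans 1970"],
    "Smith & Nathans 1970": ["Cohen & Boyer patent 1980"],
    "Cohen & Boyer patent 1980": [],
}

def impact_path(start, target):
    """Breadth-first search for a chain of citations leading from
    an early work (start) to a later real-world outcome (target)."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for nxt in cites.get(path[-1], []):
            if nxt == target:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no citation chain connects the two works

print(impact_path("Luria & Bertani 1952", "Cohen & Boyer patent 1980"))
```

On real citation data the graph would be vastly larger and noisier, but the principle is the same: the chain from a basic-science experiment to a societal outcome is recoverable, step by step, from who cited whom.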