This spring has been a season of altmetrics, at least for me, as I have spent much of the last two months either moderating panel discussions on alternative and new metrics, or giving requested talks on the subject. I've spoken with librarians, representatives from funding agencies, executives from altmetrics service providers, and academics involved in hiring and career advancement decisions. The impressive level of interest in the subject of altmetrics is telling. There is great discontent with the current system for understanding and evaluating scholarly research, and in the era of "big data", an understandable desire to put some of that data to good use.
The question remains, though: what exactly should that "use" be?
When we look at altmetrics, or really any kind of metrics, we have to carefully examine the stories they tell us to understand whether they hold any significance.
Our brains like patterns. Stephen Jay Gould, in his famous 1988 essay on Joe DiMaggio's record hitting streak, talks about how evolutionary adaptations for seeing patterns continue to drive the way we see the world:
“We must have comforting answers. We see pattern, for pattern surely exists, even in a purely random world… Our error lies not in the perception of pattern but in automatically imbuing pattern with meaning, especially with meaning that can bring us comfort, or dispel confusion…we must impart meaning to a pattern—and we like meanings that tell stories about heroism, valor, and excellence.”
Humans have a natural tendency to create stories: to take available data and make sense of it by finding patterns that tell a story. The danger comes from our tendency to do this even when the meaning behind the story isn't really there.
Much of the data collected and presented in the altmetrics world revolves around measurements of attention. For many, “altmetrics” and “attention metrics” are synonymous. The very nature of online publishing, social media and the interlinked internet itself make these sorts of data readily available. But measuring attention is not the same thing as measuring quality or value, and for important decisions like career advancement and funding, a sense of quality (this is an important result, this person does excellent research) is what we are hoping to learn.
Chartbeat, a company that provides real-time analytics to major web publications, recently studied 2 billion visits across the web over the course of a month. One key finding was that common usage statistics like pageviews and clicks are largely meaningless: in 55% of cases, a reader spends less than 15 seconds on a given webpage.
Second, there is no correlation between how often an article is shared via social media and the attention readers actually pay to that article:
We looked at 10,000 socially-shared articles and found that there is no relationship whatsoever between the amount a piece of content is shared and the amount of attention an average reader will give that content…
Bottom line, measuring social sharing is great for understanding social sharing, but if you’re using that to understand which content is capturing more of someone’s attention, you’re going beyond the data. Social is not the silver bullet of the Attention Web.
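Chartbeat's claim is a statistical one: across articles, share counts and engaged reading time show essentially no correlation. As a purely illustrative sketch (using made-up numbers, not Chartbeat's data), this is the kind of check such a claim rests on, here with a hand-rolled Pearson correlation:

```python
# Illustrative only: hypothetical share and reading-time figures, not Chartbeat's data.
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical articles: share counts vs. seconds of engaged reading time.
# A heavily shared piece can still get skim-level attention, and vice versa.
shares = [5200, 90, 3100, 45, 4800, 60, 2500, 70]
seconds = [12, 200, 180, 14, 9, 150, 240, 11]

r = pearson(shares, seconds)
# An r near zero would match Chartbeat's finding; the point is that the
# relationship has to be measured, not assumed from the share counts alone.
print(f"r = {r:.2f}")
```

The snippet is a sketch of the method, not a reproduction of Chartbeat's analysis, which covered 10,000 articles and far richer engagement data.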
This should give us pause when considering the value of attention metrics. Does an HTML view of a paper really mean that someone read it, or just that the title was intriguing enough to click on in Google search results? As David Colquhoun and others have pointed out, tweets rarely show any real understanding of the content of the articles being shared.
Even if perfect measuring tools were available, we would still need to decide where measurements of attention should matter. There are clear use cases: knowing what your colleagues are reading can help you filter the literature; knowing which research projects have piqued the public's interest can help an institution's development officer solicit donations. Serious questions remain, though, about whether these types of measurements should play any role whatsoever in assessing a researcher's work or the value of a research publication.
If you set up a metric to be used in performance assessment, researchers will change their behavior to maximize their performance on that metric (a version of Goodhart's Law: when a measure becomes a target, it ceases to be a good measure).
Right now we ask researchers to do a certain amount of self-promotion, but that promotion is a means to an end. Promoting their work by publishing, giving talks at meetings, and doing interviews helps raise awareness of the work and hopefully increases its impact. And impactful research is the goal. But by rewarding researchers for doing the things attention metrics measure, we make attention itself the goal, rather than the impact we hoped to gain through attention. By focusing on the means rather than the end, we may end up favoring drawing attention to oneself over producing meaningful research results.
Metrics must line up with goals and offer incentives for meeting those goals. How do we really want researchers spending their time? How much of the job is about research, and how much should be about other things? The PeerJ blog suggests that funding agencies should ask researchers to prioritize things like "inspiring citizen scientists" as much as generating results, but I'm not so sure. If publicity is the goal, why not hire an expert? Wouldn't it be more cost-effective for a funding agency to hire a professional publicity firm, rather than offering a research grant and expecting something other than research?
(As an aside, that PeerJ blog post, particularly the now struck-through retracted text, is a perfect example of why peer review should be done anonymously: a clear case of how doing a review could lead to retaliation against a researcher.)
The skeptical part of my brain lights up whenever I hear proposals to reward researchers for doing things other than research. Is what's being discussed an important part of a researcher's job, or an attempt to change the game to better fit someone else's skillset? If you're not so great at doing groundbreaking research, but you're really good at communicating and building community, and you enjoy arguing online all day, could this be a way to shift the criteria for career success toward something that favors you? I worry that altmetrics reward effort, participation, and talking about science, rather than actually doing science.
Measurements of the quality of work and fuzzy concepts like "impact" remain at the heart of our decision-making needs in areas like funding and career advancement. Attention metrics offer us, at best, correlative information, and those correlations are unlikely to hold up once attention becomes an official measure of success.
If attention is going to factor into how we judge the work of researchers, this will change the way that researchers plan their experiments and write their papers. If we set up a system that rewards popularity and sensationalism, researchers will understandably start to chase those goals, planning experiments like those at the top of altmetric scoring lists like this and this. We'll see a rise in flashy, sensationalistic work: research on fad diets, Facebook, and Sudoku, rather than meaningful but more mundane work that advances human health and society.
Don't get me wrong: there's great value in measuring attention, interest in science, and the communication of science. These are fascinating subjects, but they are not a replacement for measurements of quality and importance. We know the Impact Factor is flawed, but we must be careful not to replace it with something even more flawed.
The good news, from all of these panel discussions, is that there seems to be little traction for the serious use of attention metrics in researcher assessment. Funding agency representatives stated that while they were very interested in altmetrics, they are not using them (or any metric) in funding decisions. Researchers found attention metrics intriguing but far off their radar as far as having any real impact on their careers. Even those from metrics companies suggested that attention metrics are just a small part of the picture.
It may be that the overemphasis on attention metrics is slowing the growth and acceptance of altmetrics. The very term "altmetrics" may be so intertwined with attention measures that it is no longer useful for the very real need to find better ways to measure the quality of researcher performance. To many, "altmetrics" means "how many Facebook likes did the article get?" The altmetrics toolset needs great refinement, and attention metrics may be best set off to one side and used only where strictly appropriate.
As one of the better metaphors I heard puts it, measuring attention tells us how well the movie did at the box office, when what we really want to know is whether it is any good.