Yesterday’s release of Altmetric’s 2017 Top 100 Articles list continues the year-end ritual of ranking scholarly articles by online attention scores compiled from numerous sources. Given the excitement the list generates every year, its release feels akin to a wrapped gift for journal editors, staff, and readers, many of whom open it seeking validation or dreading disappointment.
But what does the list actually tell us? What does it represent?
Altmetric is a scoring system, with weighted scores based on the type of online source mentioning an article. To tally up an Altmetric score, you just add the weighted scores based on event monitoring. Altmetric has added a distinctive design by arranging the sources into a flower with the synthetic metric at the middle.
However, how these weightings are determined is puzzling and opaque. For example, an article appearing in a news outlet receives a weighted score of 8 in the Altmetric system. Why 8? Why not 3 or 10 or 15?
Two things are worth checking with metrics — validity and completeness.
First, is the approach validated? That is, is a Twitter posting (score of 1) really worth 4x a Facebook post (score of 0.25)? Is a Google+ post (score of 1) worth the same as a Twitter post and 4x more than a Facebook post? Why are blogs in general given a score of 5, yet Sina Weibo, a Chinese blogging platform, only given a score of 1? Why is Wikipedia worth 37.5% of News, 12x more than Facebook, and 3x more than Twitter?
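To make the arithmetic concrete, here is a minimal sketch of the additive scoring described above, using only the weights quoted in this post. The Wikipedia weight of 3 is not stated directly; it is inferred from the ratios given (37.5% of News, 12x Facebook, 3x Twitter). The mention counts are hypothetical, and the actual Altmetric algorithm may apply adjustments that are not covered here.

```python
# Per-source weights as quoted in this post. The Wikipedia value (3) is an
# inference from the stated ratios, not a figure given directly.
WEIGHTS = {
    "news": 8.0,
    "blogs": 5.0,
    "wikipedia": 3.0,
    "twitter": 1.0,
    "google_plus": 1.0,
    "sina_weibo": 1.0,
    "facebook": 0.25,
}


def weighted_score(mentions):
    """Add up mention counts, each multiplied by its source weight."""
    return sum(WEIGHTS[source] * count for source, count in mentions.items())


# Hypothetical mention counts for a single article (not real data).
example = {"news": 10, "twitter": 200, "facebook": 40, "blogs": 2}

print(weighted_score(example))  # 10*8 + 200*1 + 40*0.25 + 2*5 = 300.0

# The ratios questioned in the text fall straight out of the weights.
print(WEIGHTS["twitter"] / WEIGHTS["facebook"])    # 4.0
print(WEIGHTS["wikipedia"] / WEIGHTS["news"])      # 0.375
print(WEIGHTS["wikipedia"] / WEIGHTS["facebook"])  # 12.0
```

Changing any single weight shifts every score that includes that source, which is why the question of how the weights were chosen matters so much.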
There doesn’t seem to be any validation supporting these weighting assignments, and there’s no clear indication that they are validated on a rolling basis. For example, as Facebook has gained prominence and Google+ has faded, the Altmetric weightings don’t appear to have shifted to follow.
Then we get into the question of completeness. There have been numerous changes in media outlets, especially social media outlets, in the past few years. Where are WeChat, Instagram, Snapchat, Qzone, and WhatsApp? It turns out that data from social media platforms, especially those in China, aren’t available to Altmetric, according to Euan Adie, founder of Altmetric. Beyond social media, there are other sources that surveil the literature for professionals in the field and might be worthy of including in Altmetric measures — for instance, publications like Journal Watch and Retraction Watch.
Altmetric finishes gathering data for the Top 100 list in mid-November. This schedule means articles from the prior year carry over as candidates for the next year. In the 2017 list, there are 5 articles from December 2016.
The average age of articles on the Top 100 list is 197 days from publication, with the oldest being 352 days old, the newest a mere 7 days old. The 7-day-old article ranks #50 for 2017 despite accumulating data for only a week before time was up — as we’ll see, most of the effect documented by Altmetric might occur early in most cases, meaning the Altmetric flower may denote a splash rather than a long tail.
The downloadable dataset for the 2017 Top 100 articles parses the data slightly differently than the table here would suggest:
- A category titled “Number of Mendeley Readers” is now included in the raw data behind the Top 100, but according to Adie, these numbers are not factored into the scoring as they are not auditable. They are in the dataset “for extra information.”
- Publons and PubPeer data are apparently captured in a column titled “Number of peer reviews” (the total is 10 across all 100 articles, with one article alone accounting for half of that number).
- F1000 posts are included separately (they hardly contribute at all to the outcomes).
- The number of LinkedIn posts is uniformly zero, which Adie said is due to the fact that LinkedIn data are not available to them. ResearchGate data are another source Altmetric would like to include, but cannot.
Work through the weightings and raw data, and you find that 96% of the results are attributable to news stories (54%) and Twitter (42%), with blogs contributing 3%, and everything else adding up to less than 1%.
Given that Twitter is imprinted heavily with news links, it seems Altmetric is essentially measuring news coverage from its huge list of news outlets, with little activity coming from the other sources. Of the 17 total sources, 13 of them account for just 0.8% of the total for the Top 100.
Remove the weightings, and Twitter accounts for 82% of the total effect, with news accounting for just 13%, and Facebook for 3%. Everything else accounts for 1% or less.
As a raw data measure, Altmetric is mostly measuring Twitter.
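The weighted versus unweighted comparison above can be sketched as follows. The counts are made-up numbers chosen only to illustrate how weighting moves the balance between news and Twitter; they are not the Top 100 dataset. The function name `shares` is likewise just an illustration.

```python
def shares(counts, weights=None):
    """Return each source's fraction of the total, optionally weighted."""
    if weights is None:
        # Unweighted case: every mention counts as 1.
        weights = {source: 1.0 for source in counts}
    contributions = {s: counts[s] * weights[s] for s in counts}
    total = sum(contributions.values())
    return {s: value / total for s, value in contributions.items()}


# Hypothetical aggregate mention counts (NOT the real Top 100 data).
counts = {"news": 150, "twitter": 1000, "facebook": 40, "blogs": 10}

# Weights as quoted earlier in this post.
weights = {"news": 8.0, "twitter": 1.0, "facebook": 0.25, "blogs": 5.0}

weighted = shares(counts, weights)
unweighted = shares(counts)

for source in counts:
    print(f"{source:10s} weighted {weighted[source]:6.1%}   "
          f"unweighted {unweighted[source]:6.1%}")
```

With these illustrative counts, news holds the largest share once the weights are applied, while Twitter dominates when every mention counts equally. That is the same reversal described for the real dataset.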
Given the high prevalence of Twitter in either case, it’s worth reflecting that Altmetric’s calculations exist in the context of other algorithms, or what Clifford Lynch mentioned in his recent paper about the stewardship of algorithms as “complex socio-technical systems.” Twitter is a combination of social and technical interactions, with recommendations surfaced based on followers, followers of followers, likes, replies, time of day, and other factors. What you see is not a direct list but a list curated by algorithms that, as Lynch writes:
. . . cannot stand alone: they operate in a very complex and extensive (and often proprietary, unrecordable or even un-reproducible and unknowable) context.
Here are a few socio-technical systems in play with the news and social media, especially Twitter in this case, that may have a bearing on the Altmetric Top 100:
- Journal brands, which are related to prestige, awareness, social media connections, and media profile, as well as number of followers, tweets and retweets, likes, and more
- Media outreach capabilities, which include cultivation of outlets, press releases, coordination with author institutions and their media outreach, and so forth
- Social media marketing techniques, which include dedicated staff, automated placement and measurement tools, expertise, and experience
- Author prominence, which is often related to high profile publication events, since more experienced authors often have larger labs, more resources overall, and greater experience getting grant funding
- Institutional prominence, which includes the size of the organization, its alumni, its location (major metropolitan area with strong local news coverage, or not), and more
All of these factors and more contribute to the social context in which the news and Twitter operate. Getting media coverage for the Lancet or JAMA or NEJM or Nature or Science is much easier than it is for the Journal of Psoriasis. Given the outsized effect of news and Twitter on the Altmetric Top 100 list, savvy media operations, big brands, good social media practices, and prominent authors from large institutions in major metropolitan areas all factor into the scores.
Altmetric is curating its list more carefully this year, eliminating opinion pieces from the list, another change that led to some consternation expressed via Twitter (of course). For example, President Obama’s opinion piece published in Science in January has a higher Altmetric score than anything on the Top 100 list, and would have led to another year of Obama leading the list. It wasn’t the only article to suffer this fate with the editorial change.
In short, given the data behind the Altmetric Top 100 for 2017, it appears Altmetric is basically a measure of news coverage and news coverage amplification via Twitter. The absence of LinkedIn data (all zeroes) and the lack of impact from Facebook does make me wonder about the quality of the data queries of these popular sources for information sharing and news mentions. There is a data availability problem plaguing Altmetric scoring — we don’t know what we don’t know.
Overall, the Top 100 list remains interesting, and perhaps data availability and other elements will improve over time. But the strength of the variables that lead to news coverage and amplification suggests to me that the Top 100 list could well remain a predictable mix of high profile articles from major journals from major researchers from major institutions in major cities with one major platform a major factor. Those seem to be the socio-technical elements driving this particular set of rankings.
13 Thoughts on "All the News That Fits — What’s Really Driving Altmetric’s Top 100 Articles List?"
I suppose one thing that struck me about the list is something that you don’t mention here — the almost complete dominance of biomedical and life-science papers, and the virtual absence of physical science (except for two planetary studies and some Earth science, much of it tied to global warming).
That merely reflects, I guess, the well-known obsession of the public with medical news, and the consequent biases of news desks. But it does make one wonder about the utility of a measure in which (for example) the papers on the detection of a neutron-star merger by both gravitational-wave and astronomical observatories — which were widely covered and which, in the view of some, mark the birth of a new kind of astronomy — don’t even register.
That is certainly a secondary effect of Twitter and news coverage driving most of the results. In addition to journalists preferring to cover health and medical stories, I’m sure those get more play on Twitter, as well. They have health, policy, personal, and financial implications, making them great fodder for social media. Neutron stars don’t hit those notes.
The gravitational wave detection paper did make it in last year, although last year the results were also heavily biased towards medical stories. As Kent says, this is mostly driven by the non-specialist news media. Gravitational wave detection was a big breakthrough last year, but now they’ve lost interest.
Very nice piece, Kent. I would add a related point. It is one thing to measure ALL impacts, as Altmetric attempts to do (and more power to them for that), quite another to measure the impact on researchers. For researchers the meaningful metric is citations (not JIF, but citations), as citations identify when one person’s work builds on another’s. If Twitter, etc. (and I love Twitter) picks up a reference, it may increase awareness. For a researcher, that’s great, but the crucial test is when the researcher then reads the article and decides to cite it. No citation, limited value. Citations, in other words, are a special test, which Altmetrics does not fully acknowledge. Nothing wrong with the other influences an article might have, but when the question is the value for researchers, the citation reigns. Altmetrics should continue to do what it does, but its success does not make traditional metrics any less valuable for the purpose for which they were designed.
I like your way of separating the questions of importance for the researcher and importance for other people; these are two very different things. I think that Altmetric is of interest for informing yourself about what kind of research has been most discussed during the year, i.e. biomedicine and very little else. From the point of view of a researcher it is rather pointless, and as a tool for evaluating said researcher it offers absolutely nothing. It’s just a sort of answer to the question “What’s the buzz?”. It is of course fun to be included in an answer to that question (well… usually anyway, did they say if PubPeer discussions add a positive or negative number to the score?), which I guess is why people care.
It could of course be that a high score might give you something to put in the ‘societal relevance’ section of your next application (I would certainly use a high score in that way), but it isn’t really a very sensible measure of that either since coverage is mostly a function of how hard a small number of people arbitrarily decide to ram your story into the media machinery.
To your point about the growth / impact of Facebook, it’s also worth noting that Altmetric is only able to track posts on *public* Facebook pages (rather than on people’s personal profiles). Our data at Kudos suggests people are much more likely to post on personal Facebook pages than on public ones; Facebook is the most common sharing channel that researchers are using Kudos to track, yet articles showing substantial Facebook activity in Kudos (author posts, reader click throughs) still have a low or non-existent Facebook contribution to their Altmetric score. I love altmetrics / Altmetric and more power to them, as Joe says, but they can basically only track what is publicly available to them. Kudos data shows that authors are most likely to share (and most likely to drive readership / citations of their work) via email, Facebook, LinkedIn or scholarly collaboration networks – none of which activity can be tracked by Altmetric – so as ever the important thing is to be clear what question you’re trying to answer, and use the appropriate data / metrics / services.
This is something I was thinking about as well. If Twitter has only 300+ million users, and Facebook has more than 2 billion users, why would a Facebook mention be worth less than a Twitter mention? Doesn’t a Facebook mention have the potential to reach a vastly wider audience?
Another thing missing here is activity on Scholarly Collaboration Networks, ResearchGate, Academia.edu and the like. These for-profit companies are notoriously guarded about letting others behind their walls as the user data they collect is one of the few assets they own. This seems to me another hole in the picture.
And the third question that comes to mind is around the exclusion of “non-research” material. Doesn’t this open up all of the same criticisms that the Impact Factor sees when it makes subjective decisions about what counts as a “citable item” (https://scholarlykitchen.sspnet.org/2016/02/10/citable-items-the-contested-impact-factor-denominator/)?
One thing we found with our Current Biology paper (it just missed the 2013 list) is that the coverage in Nature had almost double the altmetric score compared to the original paper (https://www.altmetric.com/details/1992940, the original paper was about 450 before Elsevier switched to Plum). It makes sense to take out editorial pieces, but perhaps tweets etc for the secondary coverage should somehow filter through to the original – this would be akin to Carl Bergstrom et al’s network approach to citation analysis (http://octavia.zoology.washington.edu/publications/WesleySmithEtAl16.pdf)
Hello! Great to see this kind of discussion. 🙂 We do debate this stuff a lot internally. We’ve given our data to a lot of different bibliometricians at this point so there are quite a lot of relevant independent studies in the literature too.
I’m not sure there’s a correct answer re: weightings. With any complex indicator the math gives way to heuristics pretty quickly. In our case the complexity comes from trying to derive a single indicator of attention in a bunch of heterogeneous contexts. Our weightings are unquestionably subjective.
What kind of attention is “worth more” – something that gets people to read the actual paper, or that gets the synopsis out more widely? Something from people engaging with the content? If a university comms office answers this differently to a journal press officer then how should that be taken into account?
Pragmatically, beyond basic modelling we rely on user feedback to figure out what feels right. We support a lot of authors and do a lot of training for our publisher clients so end up with a pretty good idea of what they think is important. We spend a lot of time investigating new sources: we’d love to add ResearchGate (if anybody has an in let me know :)) and LinkedIn, though sometimes there are barriers outside of our control.
Great point about the curating & the other factors at play in the Top 100: I think that’s absolutely right. We started it as a simple, catchy way of showing that people really did talk about research articles online, and it’s been great to see how interest in the list has grown. We’ve received a lot of great feedback and suggestions this year in particular, and will be discussing if and how we might evolve things next year. Definitely interested in any further feedback, feel free to tweet me (@stew) or drop the support team a note with any comments or questions.
I think the curation is a really important aspect. Over the years, the top 100 list has gone from “here’s a bunch of weird and wacky stuff” (https://scholarlykitchen.sspnet.org/2014/12/17/altmetrics-top-100-what-does-it-all-mean/) to a more serious look at serious research that people paid attention to. This is a positive development.
But it does speak to the Altmetrics score as being an editorially curated metric. It is not an objective look at the numbers, as each number is weighted based on the opinion of how much it should “count”. That’s one reason why the data behind the score is often much more interesting and informative than the score itself.
Describing scholarly impact (citation based) has historically been a focus for most quality metrics; however, disseminative impact (i.e. Altmetrics, etc.) is probably as important. In medicine, for instance, there is very little correlation between views and citations of papers. Some papers have disseminative value (important to change clinical practice) more than scholarly value (important to advance science), and vice versa. I guess the main problem with any disseminative metric is that it is challenging to capture the data required to render it representative on the same level as scholarly metrics. I suspect that one will find that HIC papers tend to be more scholarly and LMIC papers tend to be more disseminative, reflecting the development stage of the setting. It may also be a way of representing the different impact of journals, which may help authors decide where to publish (based on the perceived impact, scholarly or disseminative, of papers). Lots more work to do. Great post.
Here’s another question: how well do Altmetrics capture what you call “disseminative impact”? As Kent points out in the article here, these metrics mostly measure what’s easy to measure, rather than what is most meaningful. A recent study found, “that about 2% of the overall population of scholars in the Web of Science is active on Twitter.” (https://arxiv.org/abs/1712.05667). If this is indeed the case, then how useful is a measure of Twitter activity in understanding dissemination?