A PLoS ONE article recently went viral, hitting the front page of Reddit and garnering an amazing amount of reader interest. This was great news for the journal and the paper’s authors, but it raises questions about the notion of post-publication peer review.
As Kent Anderson recently discussed, the idea of post-publication peer review is nothing new — it’s called “science”. Publication of a paper is the end of one process but the beginning of another. The paper is read and discussed: in private conversations, in lab meetings, at journal clubs, at the bench, in the bar, and beyond. It’s how we analyze and understand what has been accomplished and plan the next set of experiments.
The proposed revolution, then, is not in the concept but in the tools available: ways to open that conversation worldwide and to track the life of a paper after it has been published, the better to measure its true impact. Despite initial momentum, movement toward implementing these new technologies seems to have stalled.
Article commenting is increasingly seen as a futile pursuit. Nearly every publisher has tried to drive commenting in one way or another, with little success. There are fairly obvious reasons for this failure, particularly the question of why someone would spend time commenting on another researcher’s article when they could be doing their own research instead.
Doing away with pre-publication peer review entirely seems to have garnered little support in the research community. F1000 Research will be the biggest test of whether the idea has any viability. Their approach seems more a strategy meant to increase publisher revenue than one meant to benefit researchers. By collecting fees from authors with incomplete or unpublishable results, F1000 Research does away with the costs of rejection. All authors will be expected to pay for publication, whether or not their articles ever pass a peer review process (which conveniently happens after payment is made).
It’s a clever strategy, but one that seems to benefit private investors more than readers or authors. Most researchers I speak to want less to read, not more. The idea of slogging through an enormous slush pile of bits and pieces of some strangers’ lab notebooks does not hold great appeal. Further, if peer review is to be practiced on this near-unlimited degree of salami-slicing, won’t that overload the system? How much time will researchers need to devote to performing peer review on a massive influx of leftovers and half-baked ideas?
That leaves the search for new metrics (“altmetrics”) as perhaps the greatest hope for near-term improvement in our post-publication understanding of a paper’s value. The Impact Factor is a reasonable, if flawed, measurement of a journal, but a terrible method for measuring the quality of work in individual papers or from individual researchers. A move away from journal-level metrics to article-level metrics is certainly welcome. A researcher’s work should be judged on its own merits, not necessarily on the company it keeps within a journal’s pages.
The ideal rating system would employ a deep and nuanced understanding of a researcher’s work. But we don’t live in an ideal world, and people seem intent on having quantitative ranking systems for decision making. As a species, we seem to like ordered lists. If we want to replace the Impact Factor, then we need to offer something that does a better job of measuring quality. Unfortunately, many of the new proposed metrics measure something different altogether. They seem chosen because they’re easy to determine, rather than because they’re important.
Metrics based on social media coverage of an article tell us more about the author’s ability to network than about their actual experiments. Metrics based on article usage are even harder to interpret, as they offer information on reader interest and subject popularity rather than on the quality of the article itself.
These questions become evident when looking at PLoS’ treasure trove of article-level metrics. Two of the most-read articles in the history of all PLoS journals are about antidepressant medications. Does this mean that these two articles are significant, important studies about depression, or does it instead indicate a tremendous level of reader interest in the subject of depression? Both are valuable pieces of information, but they hold different meanings and offer different value to different parties.
The presence of quirky, oddball articles in the most-read list makes a strong case against usage as an indicator of quality or impact. A paper on sexual activity among fruit bats has long been one of the top 5 most-read papers from PLoS. In March, a group of Japanese researchers published a study on a line of fruit flies that had been maintained in constant dark conditions for 57 years. The study hit the front page of Reddit and within a week it became one of the top 5 most-read articles in the publisher’s history.
The fruit fly study is interesting, to be sure: the authors maintained a colony of flies in total darkness for more than half a century. How cool is that? But the results are not terribly significant scientifically. The take-home message from the study is that evolution takes a really long time. Those 57 years meant 1,400 generations of flies, the equivalent of roughly 30,000 years for humans. No, the flies didn’t lose their eyes like blind cave fish. All that was found were some inconclusive changes in gene sequence and an observation that the flies seemed to breed a bit better in the dark than in the light.
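As a rough back-of-envelope check of that arithmetic (assuming a human generation time of about 21 years, which is my assumption rather than a figure from the paper), a quick sketch:

```python
# Back-of-envelope check of the generational arithmetic above.
# Assumptions: 57 years of continuous dark culture, ~1,400 fly generations,
# and a human generation time of roughly 21 years (assumed, not from the paper).

years_in_dark = 57
fly_generations = 1_400
human_generation_years = 21  # assumption

fly_generation_days = years_in_dark * 365 / fly_generations
human_equivalent_years = fly_generations * human_generation_years

print(f"~{fly_generation_days:.0f} days per fly generation")   # ~15 days
print(f"~{human_equivalent_years:,} human-equivalent years")   # ~29,400 years
```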
Still, the study drew over 180,000 readers in a very short period. But would this level of interest mean much to you if you were making funding or career advancement decisions? Really, all we learn here is that people are interested in weird things. The usage metrics don’t seem to immediately correlate with scientific impact. And that’s a problem if you’re looking to replace the Impact Factor as the key metric in these sorts of decisions.
That said, there are places where usage metrics are likely very useful. Librarian purchasing decisions are largely based on usage numbers. This seems a perfectly reasonable way of allocating subscription dollars, putting them toward the publications that one’s institution wants to read.
And there are types of publications and fields of research where our standard method for measuring impact, citation, doesn’t really work all that well. Clinical practice journals offer tremendous value to the medical community. There’s much to be learned from the experiences of others. But the value offered translates into treatment, not further research that yields citations.
Engineering journals can suffer the same fate. Publications in some of these fields are often more about solving specific problems than hypothesis-driven basic research. If the problem is solved, then there may not be many future projects based in the same area, and hence few citations.
For both of these areas, usage may offer a better measure of impact than citation. Do we then need to think about different classes of journals and apply different sets of criteria and metrics to the value of the research published?
For the mainstream of science journals, usage-based metrics don’t seem to offer the much-desired replacement for the Impact Factor. There is value in understanding the interest drawn by research, but that value is not the same as measuring the quality of that research.
So far we’re mining all the easy and obvious metrics we can find. But they don’t offer us the information we really need. Until better metrics that truly deliver meaningful data on impact are offered, the altmetrics approach is in danger of stalling out. This points to a major crossroads for the field.
Like so many new technologies, there’s an initial rush of enthusiasm as we think about how it could fit with scholarly publishing. But then we hit a point where the easy and obvious approaches are exhausted without much return. Now the hard work begins.
Discussion
32 Thoughts on "Post-Publication Peer Review: What Value Do Usage-Based Metrics Offer?"
Excellent piece, setting the whole record straight, pointing out the many seductions and traps to be avoided with usage data and also hinting at solutions. Couldn’t what you wrote be extended to Google and PageRank (i.e., quantitative measures becoming indicators of quality)?
Good summary. In our brief experience with running a portal based solely on post-publication peer review, I can say that there is a need to separate data suggesting popularity (which could be good or adverse) from data indicating quality.
I’m afraid quality will have to depend on the reviewer’s or commentator’s score. There is no other way to objectively assess it. Popularity (good or bad), on the other hand, is easily assessed.
Citation tracking, in my view, is also basically a measure of an article’s popularity and hence fundamentally flawed. Its days as the sole metric of an article’s (or journal’s or author’s) impact are hence numbered.
Kamal Mahawar
CEO, WebmedCentral
Great insights (as always). One comment on this:
“Still, the study drew over 180,000 readers in a very short period.”
I’m not sure I would call them “readers” — “visitors”, maybe, or even (perhaps better) “page viewers”. Looking at the stats by medium for the article on the PLoS ONE site:
Full-text HTML (the Reddit folk): 188,274
Full-text PDF: 1,139
XML: 36
One suspects that the PDF figure might be a better reflection of the actual “reader” interest in the paper.
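For perspective, a quick bit of arithmetic on the figures above (a sketch using only the numbers quoted in this thread):

```python
# Share of each access type among the recorded views quoted above.
html_views = 188_274
pdf_views = 1_139
xml_views = 36

total_views = html_views + pdf_views + xml_views
pdf_share = pdf_views / total_views

print(f"Total recorded views: {total_views:,}")     # 189,449
print(f"PDF share of all views: {pdf_share:.2%}")   # ~0.60%
```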
It would be interesting to have added to these metrics some of the measures of user engagement, such as average time on page and bounce rate, that are increasingly important to site managers and editors in this fragmented, highly networked environment. Often a piece of content (especially, I suspect, a scholarly article) that looks hugely popular based on page views might fare considerably worse than average on these kinds of measurements. Anyway, it could be one useful corrective to the focus on “usage”.
I heard this from Pete Binfield personally at one point, so unfortunately I can’t cite actual data here, but PLoS (and, to the best of my knowledge, PLoS alone) actually tends to have significantly higher HTML views than PDF across the board. Attributing these numbers to “the Reddit folk” is partially correct insofar as the HTML full text is PLoS’ equivalent of the article abstract “splash page” that you normally need to access before downloading a PDF from non-OA journals, but still worth keeping in mind.
Sorry; I didn’t mean to suggest that HTML views aren’t important. I was only suggesting that in this specific case, given the huge number of HTML views and the much smaller number for PDFs, it seemed plausible that the majority of the users logged by that usage number might have been people coming from that specific viral channel, and that one can’t assume that they are actually reading the study. Apologies if calling these visitors “the Reddit folk” seemed unduly flip . . .
The fastest way to ensure the success of post-publication peer review is to set up a system whereby authors of new articles gain credits by serving as reviewers of others’ articles such that their own articles are tagged to be given higher priority for review when submitted. This system should include not only a quantitative component (how many articles one has reviewed) but also a qualitative one (how useful the reviewers’ comments have been), though of course the latter would be more controversial and difficult to implement. If one’s own speed to publication depended on one’s level of contribution to the post-publication review system, you can be sure there would be plenty of incentive for scholars to devote the time and effort to participating. (Some waiver would have to be extended to very junior faculty, of course, to make this kind of system operate fairly.)
Sandy, this sounds very much like PubCred, a proposal to fix peer review, by Jeremy Fox and Owen Petchey. http://scholarlykitchen.sspnet.org/2010/09/16/privatizing-peer-review/
I’m hesitant to endorse programs like this because they don’t necessarily pair expertise with review. One of the key points of our current peer review system, and an essential job for a journal editor, is to find the right reviewers for a given paper. My opinion on a paper well outside of my field is less valuable than one that’s directly related to my own research. The deliberate pairing of this expertise with submitted papers seems a much better system than relying on serendipity, hoping someone with a clue will happen to volunteer on a given paper.
As such, it can’t serve as a replacement for our current system. You have no guarantee that any paper submitted would receive any reviews at all. What if all researchers volunteered for the same 10 papers and the rest went un-reviewed?
A system like this puts an additional burden on already heavily-burdened researchers. If I ask someone to peer review a paper and they decline, I don’t hold it against them. I respect their busy schedules and know that they will likely take on any papers that they can. Here you’re forcing them to review a specific number of papers, which may be more or less than they would normally do under the current system. So you’ll lose effort from some of our best reviewers once they find out they only have to do X number of reviews, and you’ll gain a huge quantity of reviews from people who are unqualified.
And the qualitative measurements are sure to be a mess. If you think people hate the Impact Factor, just wait till you try something like this.
Good points, David, but remember that I emphasized a qualitative as well as a quantitative factor, so that those who offer useless comments will not get as much credit as those who offer useful ones. (This of course adds the complicating factor of reviewing the reviews.) Also, the current system relies on a very uneven distribution of the burden of peer review. A few scholars review a great many articles, while some review none or very few. Isn’t equity a factor to be used in judging the utility of a system as well?
Let me attempt to address a few points raised here.
Find the right reviewers: I think in an ideal world, you could leave that to the authors. Authors know best who their contemporaries are. Of course, this could be abused if authors only selectively invite people. But what benefit would one get by publishing anything straightaway in a “no pre-publication peer review” journal if you don’t even tell your peers/competitors about it? To further counteract this practice, one could reveal, for every review, whether the reviewer was invited by the author. This should enable any casual reader to decide what is happening beneath the surface. Of course, the opinion of somebody outside his/her field of interest matters less. That is why reviewers should openly declare their experience in the area of the paper.
Also, there is no evidence that the same “deliberate pairing” could not be done post-publication. It will take effort, but it should be possible. It should be possible to have reviewers review every article post-publication, just as every article is seen by reviewers in a pre-publication system. The whole system of post-publication peer review needs to be taken seriously and organised carefully. It should not be left to “volunteers” if we are to give it a real try.
The last issue is that of rewarding and evaluating the reviewers. There are multiple ways it can be done, and people providing more quantity will not necessarily win out over those providing better quality.
I agree science is a “meritocracy”, but why should that meritocracy be hostage to what your peers know and think? The biases of the “pre-publication peer review” system are fairly well known. It only survives because we haven’t been able to provide a better alternative. I certainly feel a “post-publication peer review”-only method could answer some of the questions our current systems pose. I would finally concede, however, that any qualitative impact factor based on these metrics will not be easy to design and will pose its own unique challenges.
Kamal Mahawar
CEO WebmedCentral
Also, there is no evidence that the same “deliberate pairing” could not be done post-publication. It will take effort, but it should be possible. It should be possible to have reviewers review every article post-publication, just as every article is seen by reviewers in a pre-publication system.
This is what F1000 Research is planning to do. What do you think are the advantages of this over pre-publication peer review?
Every single (100%) researcher I’ve spoken with in the medical/life sciences arena has told me they absolutely will not read any journal article that has not been peer reviewed. If we assume no one is going to look at these posted but unreviewed papers until they’ve been reviewed, then what has been gained by posting them?
Furthermore, there’s an enormous amount of winnowing done by editorial rejection (http://scholarlykitchen.sspnet.org/2012/04/17/editorial-rejection-increasingly-important-yet-often-overlooked-or-dismissed/). If you’re getting rid of this pre-publication review step and posting everything that comes in, then you’re massively increasing the peer review burden on the community.
And for an author, what happens if your published paper fails to pass peer review? You now have a publicly available failure with your name on it for all to see. And if it’s in an author-pays journal, you’ve paid for that privilege.
I find your arguments here to be quite contorted. As I understand it, you are confusing media hype with post-publication peer review (PPPR) and trying to condemn altmetrics by association. No one in the altmetrics community is saying PPPR should replace peer review, only that it can add substantial scholarly perspective to published work and can also act as a source of data for article-level scientometrics.
Moreover, as someone who has left a substantive PPPR comment on the paper that leads your piece (prior to your blog post), I would say that viral media plus post-publication peer review is working perfectly, as it should, in this case. A paper was published that garnered substantial attention in the scientific and lay community, it came across my Twitter feed multiple times, I read the paper, noticed scholarly flaws, and made an informed scientific comment about the work. What else can be asked for in terms of PPPR?
I suggest before you make any further general attacks on altmetrics (which is clearly the future of scientometrics), it would be worth clarifying and quantitatively substantiating your views on this topic instead of making meandering posts with unfounded statements such as: “usage metrics don’t seem to immediately correlate with scientific impact”. Last I checked, “seemed” wouldn’t cut it as evidence in any scholarly context, and weak arguments such as those presented here unfortunately undermine your authority to make any claims about the role of altmetrics and PPPR in the future of scholarly publishing.
Regards,
CMB
Wow, I’m sorry if that’s the message you took from this article. I’ve deliberately tried to separate out altmetrics from other PPPR efforts, and called it our greatest hope for improvement in our understanding of a paper’s impact. If you took that as an attack then I’m not quite sure what to say.
PPPR means many different things to many different people. Some think of it as article commenting, some as a star-rating system on articles, some as social media-based discussion of an article, others as incorporating new metrics to follow the life of an article after it is released. As noted above, I’m dubious about most of these but think there’s much to be gained from new, better metrics. But I don’t think basing things solely on usage is particularly meaningful for measuring quality or importance.
If PPPR and commenting is working “perfectly” in this case, what about the thousands and thousands of other cases where it’s not working at all? How many other papers this month in PLoS ONE received a substantive comment? As has been written elsewhere, PPPR tends to lead to a small number of papers receiving a lot of attention and the majority being ignored. That’s particularly evident for a paper that goes viral like this, or the arsenic life paper as another example.
As this is an opinion piece and not a work of scholarly research, you’ll have to settle for my opinion on the subject. If you have evidence that usage metrics directly correlate with scholarly impact, I’d love to see it. Considering that you haven’t refuted or even attempted to refute anything I actually claimed in the piece, we would seem to be in general agreement, and I’m a bit confused by your hostile tone.
While the irony is not lost on me (or, I trust, your other readers) that I am providing scholarly context to your “publication” and therefore improving it via PPPR, I am happy to point you to several studies that measure a positive correlation between various altmetrics and citation rates:
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0019917
http://arxiv.org/abs/1202.2461
http://arxiv.org/html/1203.4745v1
http://www.jmir.org/2011/4/e123/
This is a new and active field (i.e., more papers will emerge soon) and I have no doubt this type of research will lead to a radical transition in how we scientometrically evaluate “impact” in the coming years.
Not sure I’d describe it as “ironic”, just that behavior on an informal business blog is different from that seen in the formal scholarly literature. But it’s good to see we’re still in agreement. Altmetrics do indeed offer great potential for redefining how we understand a paper’s impact, though this is going to require further development. Simple metrics alone, like usage statistics, are not of immediate value without this further ability to transform them into something meaningful. From a cursory reading, none of your cited papers makes a claim for downloads as directly correlating with citation and one even goes so far as to call usage numbers “orthogonal” to citation. Thanks for providing further evidence for my main point.
If PPPR is not supposed to replace prepub review then it adds to the existing burden of review, which is already millions of hours a year. Why would we want to add millions more? This seems completely unrealistic.
There has always been a role for post-publication peer review, in the form of letters to the editors, retractions, etc. In some cases, it will make sense for authors to add a note about further work that they have done, or are doing.
Having said that, there is a blurring of the timing between peer review and publication happening anyway. The best peer review starts long before publication – indeed, at the very beginning of the research process. Ideally, this should start with figuring out the research questions.
Great post, David. I wonder whether the current crop of altmetrics are too focused on the needs of authors: they want to see that their work is important and relevant, and maybe get feedback on a much shorter timescale than the 18-month citation cycle. Maybe we should ask, “What would readers actually want to be able to do?”
For example, many researchers I know cover the PDFs of the papers they read with notes and comments. These actually are the individual grains of post-publication peer review. Is there a way to free these onto the wider web so that others reading the same PDF can choose to see them as well?
Of course, these raw comments are very hard to boil down into ‘like’ and ‘dislike’ or any other overall metric, so maybe 20-30 different tags with within-PDF coordinates could be made available. These tags would say things like ‘poorly supported result’, ‘statistical error’ or ‘brilliant idea’, and would also contain a box for a more detailed comment.
Some algorithm could then distill all the tags into two different scores, e.g. for reliability and importance. Alternatively, it could produce a shape indicating the overall reliability and importance of the paper; I’m thinking here of the visual representation of soccer players’ abilities that Worldwide Soccer Manager uses (see here).
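To make the idea concrete, here is a minimal sketch of the kind of distillation step I have in mind; the tag names, weights, and two-score scheme are purely hypothetical illustrations, not an existing system:

```python
# Hypothetical sketch: aggregate reader-applied tags into two scores.
from collections import Counter

# Each hypothetical tag contributes a signed weight to (reliability, importance).
TAG_WEIGHTS = {
    "statistical error":       (-2.0,  0.0),
    "poorly supported result": (-1.0,  0.0),
    "solid methodology":       (+1.0,  0.0),
    "brilliant idea":          ( 0.0, +2.0),
    "incremental result":      ( 0.0, -0.5),
}

def distill(tags):
    """Sum the weights of all tags applied to one paper."""
    counts = Counter(tags)
    reliability = sum(TAG_WEIGHTS[t][0] * n for t, n in counts.items())
    importance = sum(TAG_WEIGHTS[t][1] * n for t, n in counts.items())
    return reliability, importance

# Example: tags harvested from annotations on a single paper
tags = ["brilliant idea", "statistical error", "solid methodology", "solid methodology"]
print(distill(tags))  # (0.0, 2.0)
```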
Tim,
You say
“For example, many researchers I know cover the PDFs of the papers they read with notes and comments. These actually are the individual grains of post-publication peer review. Is there a way to free these onto the wider web so that others reading the same PDF can choose to see them as well?”
Interestingly, there is. It is called Utopia Documents. It is a scientific PDF viewer that offers exactly the functionality you mention. If you comment in a PDF opened with Utopia Documents, your comments can be seen by anyone else opening the same PDF. Extremely easy and convenient. Utopia Documents is completely free. Download it from http://utopiadocs.com (Windows, Mac and Linux versions)
Great post. I am glad that you point out that “The ideal rating system would employ a deep and nuanced understanding of a researcher’s work. But we don’t live in an ideal world, and people seem intent on having quantitative ranking systems for decision making”. I think that we need to challenge this focus on quantitative metrics. Any metrics-based system has the potential to distort scholarship, even without considering gaming or obviously irrelevant metrics. The Impact Factor entrenches a scholarly publishing system that is still largely based on print and that contains significant market dysfunction, such as a few players enjoying an inelastic market while other essential players, such as society publishers and university presses, struggle to survive.
Some of the approaches to challenging metrics include:
– critical analysis. I have done just a bit of this in my book chapter, “The implications of usage statistics as an economic factor in scholarly communication” (https://circle.ubc.ca/handle/2429/954); in brief, usage stats are likely to have significant negative impacts, from discouraging use to discouraging entire fields of research that are important but not necessarily popular. See also the second chapter of my draft thesis, http://pages.cmns.sfu.ca/heather-morrison/chapter-two-scholarly-communication-in-crisis/ (search for “irrational rationalization”).
– research, both quantitative (of the type you describe above, David) and qualitative, which could be very helpful in illustrating how problematic quantitative metrics are. One example, from an interview study of scholarly monograph publishers I did recently, is the impact of pushing scholars to publish more books to obtain tenure. This pressure is not consistent with the time it takes to write books that are really worth publishing and reading; so in this instance we have a quantitative metric intended to improve the quality and productivity of our academic staff which appears to be lowering quality (more mediocre books, and more book production when the problem for all of us is not a lack of reading material but a lack of time to read).
Nice article, David, as usual. I like your assessment that we need to start delivering the goods for altmetrics–not just fancy claims and wild ideas, but hard evidence showing this is a data source that meaningfully adds to our knowledge.
I’m happy to report that a number of us have been working on such projects, and getting interesting stuff back. An article we recently submitted to PLoS shows results of clustering, factor analysis, and loads of correlations for many altmetrics on ~25k PLoS articles. Mendeley really shines as a data source here; around 80% of articles are in Mendeley libraries, and total Mendeley bookmarks correlate around .5 with citation. We found strikingly similar results in another study just submitted to the STI conference, examining a corpus of scientometrics papers across multiple journals. A similar examination of Nature and Science papers [1] came out looking about the same, too.
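To give a flavor of what these correlation analyses look like, here is a minimal sketch of a bookmark-versus-citation comparison; the numbers are invented for illustration, and the real studies use tens of thousands of articles:

```python
# Sketch: rank correlation between Mendeley bookmarks and citation counts.
from scipy.stats import spearmanr

# (bookmarks, citations) per article -- illustrative values only
articles = [(12, 30), (0, 2), (45, 110), (3, 8), (7, 25), (1, 1), (20, 40)]

bookmarks = [a[0] for a in articles]
citations = [a[1] for a in articles]

rho, p_value = spearmanr(bookmarks, citations)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```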
I’ll spare you a list of other findings on Wikipedia, blogs and Twitter. Suffice it to say that there’s more hard data being rolled out. We’re not anywhere close, though, to really understanding the potential of these approaches. We need to be looking at things like the interaction effects between say, Facebook Likes and Mendeley bookmarks, or PDF and HTML downloads as suggested above. We also need to be looking at *who* is doing the usage…this is hard for downloads, but pretty easy for Tweets, for example. In principle, we could train algorithms to recognize which tweeps are really good at identifying good (highly-cited later) articles.
As you say, all this takes work. But I’m optimistic, given the results that we’re getting, and the growing interest in the potential of this approach. With recent $25k and $125k grants to the total-impact altmetrics-gathering tool, a second altmetrics workshop scheduled for June (CFP coming…:), a PLoS altmetrics collection soon to launch, and loads of interest from publishers and working scholars, my own sense is that altmetrics is gaining momentum, not losing it.
But I totally agree that it’ll be a challenge to keep that momentum as we push to reach the potential of mining scholarly conversations on the social Web. I think that effort is going to take a decade or more…but there’s certainly plenty to work on right now!
[1] Li, X., Thelwall, M., & Giustini, D. (2011). Validating online reference managers for scholarly impact measurement. Scientometrics. doi:10.1007/s11192-011-0580-x
Thanks Jason, and in particular, thanks for the links for those interested in digging further. I think most of us encounter altmetrics at a much shallower level, in particular in the things that various journals are trying to sell to us as useful features. Most of these, though, are low-hanging fruit, things that are being measured because they’re easy to measure (rather than because they provide valuable analytic tools). Translating the really meaningful metrics, which as you note are often very complex, into an easily understandable form for the end user will likely provide another challenge.
But I think the level of tracking and understanding that’s potentially offered here is of great value.
One quick question (well, five really). If the goal is to find an alternate metric that merely correlates with citation, why not just use citation? Is the benefit here one of faster analysis? If so, what sort of timesavings are we talking? How long does it take a paper to achieve a level of Mendeley usage that is meaningful? If the new metric merely recreates citation, doesn’t it then have the same flaws?
Good question(s). I generally argue that there are three big potential advantages of altmetrics:
1. we can track more diverse kinds (or “flavors”) of impact, making many of the invisible college’s mechanisms visible,
2. we can track impact on more diverse audiences, including clinicians, practitioners, and the general public, and
3. we can track impact of more diverse products, including datasets or software–things that citation practices aren’t serving well right now.
Note that none of these require perfect or even particularly good correlation with citation. As you say, perfect correlation might be useful for prediction (since altmetrics are indeed way faster), but this misses altmetrics’ biggest potential value–a chance to really open the hood and peer into the guts of scholarly conversations. As you’ve pointed out elsewhere, right now only a small percentage of scholarly conversation happens online, where we can track it. But it’s an exciting bit, because we couldn’t see it before, and it’s likely to grow over time as scholarly networks move online.
All that said, partial correlation with citation is pretty cool, because it suggests that while we’re tracking a different kind of impact, on a different population, it’s not altogether unrelated to what we’re familiar with. A lot of folks weaned on citation impact measures will find that comforting, and rightly so. It also offers the ability to maybe finally offer customized article recommendations in something approaching real time (it’s been tried using the citation graph, but by the time you’ve got enough data, it’s old news).
This has pretty far-reaching implications…if we can really deliver good, fast recommendations, the curation function of the journal becomes a lot less compelling. As you point out, scientists mostly want less to read, not more…adaptive, personalized recommendations informed by activity in scholars’ individual networks could in theory be pretty good at this.
So the correlation with citation is not an end goal, but instead a signifier along the way, a sign that the analysis has some validity and is worth further investigation. Makes sense.
One thought that also occurred to me is a worry about relying on a privately held, for-profit company like Mendeley as the basis for any evaluation system. If it eventually gets widely adopted, then you’re in the same boat as you are with Thomson Reuters and the Impact Factor, where there’s one commercial gatekeeper for the information that academia needs.
I’d also hesitate to separate out the curation by journal title from the social behavior around an article, as it is likely having an influence on how readers approach each paper. Take that out of the picture and social activity would change.
Thank you for the insightful piece! On a personal level, it’s very timely because my lab published a new paper in PLoS ONE last week. As I’ve logged my paper’s daily page view totals, I’ve been grappling with the daunting question of how to interpret article-level metrics. To wit, I had 2,009 total page views spanning the first 7 days post-publication, which breaks down as follows: 1,939 HTML views + 64 PDF downloads + 6 XML downloads. Half of the total page views were generated within the first 48 hours, which I’m told is “blog-typical,” which I assume is a compliment. (As an aside: does “blog-atypical” then mean no early exponential growth?)
As a “control,” I compared my paper’s metrics to those of 7 other “Pharmacology” papers that were published on the same day. My total page view count exceeded theirs by approximately an order of magnitude (10-fold) on each day of the first-week observation period. I attribute the initial spike to the 200-word press release my university press office wrote up for my paper. This press release spawned a EurekAlert posting and attendant syndication. Three formal 700-800 word research summaries were published earlier this week, and I assume they are driving the present rate of 100+ page views per day, though presumably this amplification will fade quickly.
As a second control, I tried to assess the significance of the page views from a longitudinal perspective. I scanned the search returns for all “Pharmacology” papers, going back from the present day, to see when papers on average registered total view counts comparable to mine. There were definitely >10X outliers along the way, but on average I had to go back 3-4 months to find articles with as many aggregate views as mine. So I think that’s another reflection of amplification, which again I attribute to the inclusion of my press release in the weekly PLoS ONE press pack.
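For anyone curious, here is a rough sketch of the kind of daily comparison I have been doing; all of the numbers below are invented for illustration rather than taken from my actual logs:

```python
# Sketch: compare one article's daily page views against same-day controls.
my_daily_views = [1005, 400, 210, 150, 110, 80, 54]  # 7 days, summing to ~2,009

control_daily_views = [
    [110, 45, 25, 18, 14, 10, 8],   # control paper 1
    [95, 38, 20, 15, 11, 8, 6],     # control paper 2
    [120, 50, 28, 20, 15, 12, 9],   # control paper 3
]

for day, mine in enumerate(my_daily_views, start=1):
    control_mean = sum(c[day - 1] for c in control_daily_views) / len(control_daily_views)
    print(f"Day {day}: {mine} views vs. control mean {control_mean:.0f} "
          f"({mine / control_mean:.1f}x)")
```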
However, it may come as little surprise that my paper has only 4 comments. The first comment was auto-generated by PLoS ONE on account of the aforementioned press release — hey, you have to break the ice somehow, I guess. The second comment was from my Dad, who’s a very smart non-scientist. He asked several big-picture questions from the perspective of an interested layperson with above-average scientific literacy. The third comment was an honest-to-God peer review from an academic colleague with whom I’ve had sporadic email correspondence over the last few years. He is an expert in my field, and I value his opinion. The fourth comment was generated by me today: a selection of tweets linking to my paper from real living and breathing people, some of whom I know, most of whom I don’t.
Obviously, the comment most germane to your article is #3. I have to say that the commenter was initially hesitant to post anything at all. My plan going forward is to amass as many thoughtful, well-reasoned comments about my paper as possible. Now, I’m no dope. I realize most comment threads do not pass the smell test for learned, civil colloquy. So my approach thus far has been to revive past email correspondence with academic peers from my general and specific areas of biology, and then invite each person, one at a time, to relocate our email conversation to the online comment setting. I’m actually cautiously optimistic.
If you or anyone else out there has feedback, advice or any other ideas, I’m all ears.
-Ethan