(Editor’s Note: Recent weeks have seen considerable movement in the Altmetrics world. We’ve just seen the acquisition of Plum Analytics by EBSCO and a lengthy argument from David Colquhoun suggesting we ignore altmetrics and
“other bibliometric nightmares”. Today NISO is hosting the third meeting in their “Alternative Assessment Metrics (Altmetrics) Project” and a live stream is available. With this in mind, I wanted to revisit a 2012 post from Todd Carpenter that discusses the value of altmetrics beyond serving as a mere replacement for the Impact Factor.)
There are important measures of value beyond the strength of a journal. This might come as a shock to some STEM publishers, who have flourished or floundered based on the impact factor rankings published each June. While the value of the journal as a container is an important metric, and one that needs to continue, the rapidly evolving alternative metrics (altmetrics) movement is concerned with more than replacing traditional journal assessment metrics.
Like much of life these days, a key focus of our community has been on those qualities that are measured to ensure one “passes the test.” The coin of the realm in scholarly communications has been citations, in particular journal citations, and that is the primary metric against which scholarly output has been measured. Another “coin” for scholarly monographs has been the imprimatur of the publisher of one’s book. The impact factor, which is based on citations, and overall publisher reputation provide the reading community signals of quality and vetting. They also provide a benchmark against which journals have been compared and, by extension, the content that comprises those titles. There are other metrics, but for the past 40+ years these indicators have driven much of the discussion and many of the advancement decisions. Given the overwhelming amount of potential content, journal and publisher brands provide useful quick filters for quality, focus, interests, or bias. There is no doubt that these trust metrics will continue. But they are increasingly being questioned, especially in a world of expanding content forms, blogs, and social media.
At the Charleston Conference last week, I had the opportunity to talk with fellow Scholarly Kitchen Chef Joe Esposito about altmetrics and its potential role in our community. Joe made the point, with which I agree, that part of the motivation of some who are driving new forms of measurement is an interest in displacing the traditional metrics, i.e., the impact factor. Some elements of our industry are trying to break the monopoly that the impact factor has held on metrics in our community so that newer publications might more easily flourish. Perhaps the choice of the term “altmetrics” to describe the movement implies the question, “Alternative to what?” — which leads one back to the impact factor and its shortcomings. The impact factor is an imperfect measure, a point even Eugene Garfield acknowledges. We needn’t discuss its flaws here; that is well-worn territory.
The inherent problem with that focus is that it misses a key point of assessment: the actual impact of a particular piece of research (or ultimately its contributor), which is represented by one or more individual articles that may have been published in multiple journals. Our current metrics in scholarly publishing have been averages or proxies of impact across a collection (the journal), not the item itself, or the impact across the work of a particular scholar or particular research project. The container might be highly regarded, and the bar of entry might have been surpassed, but that doesn’t mean that any particular paper in a prestigious journal is significantly more valuable within its own context than another paper published in a less prestigious (i.e., lower impact factor) title. The fact that a growing number of papers are retracted, even within the most highly regarded titles, is a signal of this.
Assessment is increasingly important to communities directly related to, but not part of, the publishing community. Yes, libraries have been applying usage-based assessment and the impact factor to acquisitions decisions for some time, but there is more they could, and should, do to contribute to scholarly assessment. Beyond that, academia and the administrators of research institutions rely on publication metrics for researcher assessment and for promotion and tenure decisions. Grant funding organizations have used the publication system as a proxy for assessing grant applicants. However, these are only proxy measures of a researcher’s impact, not directly tied to the output of the individual researcher.
Scholarly communications is also expanding in its breadth of accepted communications forms. Scholars are producing content that gets published in repositories and archives, blogs, and social media — separately from or in addition to journals. Some researchers are publishing and contributing their data to repositories such as ChemBank and GenBank. Others, such as in the creative arts, are capturing performances or music in digital video and audio files that can be shared just like journal articles. Traditional citation measures are not well-suited to assessing the impact of these non-traditional content forms. If we want to have a full view of a scholar’s impact, we need to find a way to measure the usage and impact of these newer forms of content distribution. Addressing these issues is the broader goal of the altmetrics community and it is perhaps clouded by the focus on replacing the impact factor.
In order to make alternative metrics a reality, there must be a culture and an infrastructure for sharing the requisite data. Just as there was opposition by segments in the publishing community to the creation of standards for gathering and sharing of online usage data in the late 1990s, there exists opposition to providing data at the more granular level needed for article-level metrics. It is instructive to reflect on the experience of our community with COUNTER and journal usage data. When Project COUNTER began, usage data reporting was all over the map. In addition to wide inconsistencies in how data was reported between publishers, there were many technical issues — such as repeated re-loading of files by users, double-counting the download of a PDF from an HTML page, and downloads by indexing crawlers — that skewed the statistics, making them unreliable.
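To make the double-counting problem concrete, the kind of filtering COUNTER eventually standardized can be sketched as a simple deduplication pass over raw download events. This is an illustrative sketch, not COUNTER’s actual algorithm; the 30-second window mirrors the spirit of its double-click rule, and all names here are hypothetical:

```python
from datetime import datetime, timedelta

# Hypothetical threshold in the spirit of COUNTER's double-click rule.
DOUBLE_CLICK_WINDOW = timedelta(seconds=30)

def count_downloads(events):
    """Count downloads, collapsing repeat requests for the same item
    by the same user within the double-click window into one."""
    last_seen = {}  # (user, item) -> time of the most recent request
    total = 0
    for user, item, when in sorted(events, key=lambda e: e[2]):
        key = (user, item)
        prev = last_seen.get(key)
        if prev is None or when - prev > DOUBLE_CLICK_WINDOW:
            total += 1  # a genuinely new use
        last_seen[key] = when
    return total

t0 = datetime(2012, 11, 1, 9, 0, 0)
events = [
    ("u1", "doi:10.1000/x", t0),
    ("u1", "doi:10.1000/x", t0 + timedelta(seconds=5)),  # double click: filtered
    ("u1", "doi:10.1000/x", t0 + timedelta(minutes=5)),  # later revisit: counted
    ("u2", "doi:10.1000/x", t0 + timedelta(seconds=5)),  # different user: counted
]
print(count_downloads(events))  # 3, not the raw 4
```

Crawler traffic would be handled similarly, by excluding events from known robot user agents before counting; the point is simply that comparable statistics require everyone to apply the same filters.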
Fortunately, tools are quickly falling into place to provide metrics on the things that the larger scholarly community really needs: individual items and individual scholars. The broader scholarly community has been aware of these needs for some time, and we are making considerable progress on implementation. There has been a great deal of progress in the past decade in increasing the granularity of assessment metrics, such as the h-index and the Eigenfactor. These metrics are gaining traction because they focus on the impact of a particular researcher or on particular articles, but they are still limited by dependence solely on citations. Newer usage-based article-level metrics are being explored such as the UsageFactor, Publisher and Institutional Repository Usage Statistics (PIRUS2), both led by COUNTER, as well as other applications of usage data such as the PageRank and the Y-Factor. Additionally, infrastructure elements are becoming available that will aid in new methods of assessment, like an individual researcher ID through the ORCID system that officially launched in October. Other projects described in Judy Luther’s posting on altmetrics earlier this year are examples of infrastructure elements necessary to push forward the development of new measures.
Our community needs to move these pilots and discussions of alternative metrics into the stage of common definitions and standards. We need to come to consensus on what should be included in and excluded from calculations. For example, we need clear definitions of what constitutes a “use” of a dataset or software code and how to quantify the applications of data drawn from within a larger collection. Some of these determinations might be domain specific, but many of these issues can be generalized to broader communities. Because these resources can and often do reside in multiple repositories or resources, thought needs to be given to how metrics can be consistently reported so that they can be universally aggregated. Additionally, we will need a commitment to a level of openness about the automated sharing of usage data so that network-wide measures can be calculated. Efficient methods of real-time (or near-real-time) data collection should also be considered an important element of the infrastructure that we will come to expect. While a central repository of data for anonymization purposes or for more robust analytics might be valuable, it probably isn’t a necessity if we can reach agreement on data supply streams, open data-sharing tools, and policy guidance.
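To illustrate why consistent reporting matters, here is a minimal sketch of aggregating usage reports from multiple repositories. It works only because every repository is assumed to key items by a shared identifier (a DOI here) and to use the same metric names, which is exactly the kind of agreement the standards discussion is about. All names are hypothetical:

```python
from collections import defaultdict

def aggregate(reports):
    """Merge per-repository usage reports into network-wide totals.

    Each report maps a shared item identifier (e.g. a DOI) to a dict of
    agreed-upon metric names and counts. Without shared identifiers and
    metric definitions, sums like these would be meaningless.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for report in reports:
        for item_id, counts in report.items():
            for metric, n in counts.items():
                totals[item_id][metric] += n
    return {item_id: dict(counts) for item_id, counts in totals.items()}

repo_a = {"doi:10.1000/x": {"downloads": 120, "views": 300}}
repo_b = {"doi:10.1000/x": {"downloads": 40}, "doi:10.1000/y": {"views": 15}}
print(aggregate([repo_a, repo_b]))
```

If one repository reported “hits” where another reported “downloads,” or identified items by local accession numbers rather than DOIs, no amount of clever code could recover a comparable total; the hard work is in the definitions, not the arithmetic.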
By any measure, these are early days in developing alternative metrics and the technical and cultural support structures they require. Some have argued that it might be too early even to begin establishing best practices. However, if alternative assessment is to really gain traction, agreeing at the outset on the needed components of those metrics will solve many downstream issues and lead to more rapid adoption by everyone who needs better assessment of the value of scholarly communications.
42 Thoughts on "Stick to Your Ribs: Altmetrics — Replacing the Impact Factor Is Not the Only Point"
Yes, I think altmetrics open new ways to understand how science works. Phil Davis’s analysis of article usage half-lives based on COUNTER data is a fine example. Moreover, the success of the original impact factor shows that even flawed methods can be very useful. This goes for altmetrics as well.
Fortunately, for monographs, there has always been a vital metric that doesn’t exist for journal articles, viz., the book review. This is not to say that every monograph gets equal treatment in book reviews; some monographs get many more reviews than others. But at least a book review is an independent and reasonably reliable measure of a monograph’s worth to a community of scholars.
It is hard to see a review as a metric in the quantitative sense, unless you generate a score from it. Reviews typically say both good and bad things about the book, do they not? Thus there could be a sliding scale but it would be a subjective judgement. Some of the people analyzing tweets and blog posts have the same problem, sorting the stuff into positive and negative for example. Some are experimenting with semantic techniques for this. Unfortunately a single little word can reverse the meaning, as in “this is good” versus “this is no good.”
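The negation problem mentioned above is easy to demonstrate. A minimal, hypothetical sketch of keyword-based sentiment scoring shows how a naive counter misreads “this is no good,” and how even a crude negation check changes the answer (real sentiment analysis is far more involved):

```python
POSITIVE = {"good", "great", "excellent"}  # toy lexicon, purely illustrative
NEGATORS = {"no", "not", "never"}

def naive_sentiment(text):
    """Count positive keywords, ignoring context entirely."""
    return sum(w in POSITIVE for w in text.lower().split())

def negation_aware_sentiment(text):
    """Flip the polarity of a positive word immediately preceded by a negator."""
    words = text.lower().split()
    score = 0
    for i, w in enumerate(words):
        if w in POSITIVE:
            negated = i > 0 and words[i - 1] in NEGATORS
            score += -1 if negated else 1
    return score

print(naive_sentiment("this is good"))              # 1
print(naive_sentiment("this is no good"))           # 1 (misses the reversal)
print(negation_aware_sentiment("this is no good"))  # -1
```

Even this patched version fails on longer-range negation (“not at all good”) or irony, which is why sorting tweets and reviews into positive and negative remains a genuinely hard problem.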
Here is a recent example of research on positive versus negative tweets. The results suggest that tweets do not provide that information.
“Tweeting Links to Academic Articles”
Mike Thelwall, Andrew Tsou, Scott Weingart, Kim Holmberg, Stefanie Haustein
Cybermetrics, 17(2013), Issue 1, Paper 1
“Academic articles are now frequently tweeted and so Twitter seems to be a useful tool for scholars to use to help keep up with publications and discussions in their fields. Perhaps as a result of this, tweet counts are increasingly used by digital libraries and journal websites as indicators of an article’s interest or impact. Nevertheless, it is not known whether tweets are typically positive, neutral or critical, or how articles are normally tweeted. These are problems for those wishing to tweet articles effectively and for those wishing to know whether tweet counts in digital libraries should be taken seriously. In response, a pilot study content analysis was conducted of 270 tweets linking to articles in four journals, four digital libraries and two DOI URLs, collected over a period of eight months in 2012. The vast majority of the tweets echoed an article title (42%) or a brief summary (41%). One reason for summarising an article seemed to be to translate it for a general audience. Few tweets explicitly praised an article and none were critical. Most tweets did not directly refer to the article author, but some did and others were clearly self-citations. In summary, tweets containing links to scholarly articles generally provide little more than publicity, and so whilst tweet counts may provide evidence of the popularity of an article, the contents of the tweets themselves are unlikely to give deep insights into scientists’ reactions to publications, except perhaps in special cases.”
This is not a new finding – I wrote about this already in 2011: “the vast majority of tweets simply contained variants of the article title or the key conclusion, and rarely contained explicit positive sentiments (such as “Great article!”) or—even less common—negative sentiments (such as “questionable methods”—I have not seen any examples of the latter). This may be because the mere act of (re)tweeting an article is often an implicit endorsement or recommendation with which readers express their interest in and enthusiasm about a specific topic, support the research question and/or conclusion, or simply want to bring the article to the attention of their followers. Additional comments are not necessarily required to express this implicit endorsement. Also, with most tweets occurring on the day of publication, few readers will actually have had time to carefully read and appraise the entire paper beyond the title and perhaps abstract. While we originally thought of doing an automated sentiment analysis, the sparse nature of comments did not make this approach seem promising to elicit more specific data.”
And some words have very different implications in context. I remember a non-English mother tongue colleague being confused by a book review that called some artwork ‘pretty ugly’.
While I recognize that people long to get out from under the “Tyranny of the Impact Factor”, what is the likelihood of this happening in practice? Are any of the altmetrics currently available gaining any traction in the granting agencies, job search committees, and tenure review committees (thinking specifically of the STEM fields)? Because, if all the metrics (including JIF, altmetrics, and article-level metrics) are gameable and have equal levels of downsides, I doubt that there will be a big move to embrace the new metrics. Why not just deal with the devil you know and avoid the switching costs?
I’m afraid it is very obvious that this article and comments are not written by practising scientists. Once one considers individual papers one sees that altmetrics is even sillier than impact factors. The examples in my post, to which you kindly link, make that very obvious: http://www.dcscience.net/?p=6369
Some bits of what you write I don’t recognise at all. You say “to break the monopoly that the impact factor has held on metrics.” In any decent lab or university, impact factors have never been taken seriously. The monopoly of which you speak is restricted to bibliometricians, and others who are incapable of reading the papers and judging their content.
On the contrary, David, I am a practicing cognitive scientist, focusing for the last ten years on the science of scientific communication. Your claim that universities do not take impact factors seriously is an empirical one, which I doubt is correct. Your blog article actually complains about universities that do take the IF seriously. Perhaps you are confusing what you want with what is.
Nor are altmetrics confined to the social media you list, although that is a common confusion because that is what some firms are selling. We still do not know how to measure the importance of research and it is a worthwhile goal to do so. But in the near term what these metrics may provide is ways to improve scientific communication. In any case ignoring new data is not a scientific approach.
I said any decent lab or university doesn’t use impact factors to evaluate individuals. No doubt some second rate institutions do.
You say “We still do not know how to measure the importance of research.” That’s quite untrue. You have somebody who knows the subject well read the papers. Of course that is fallible too, because the real importance of a piece of work will not be apparent until 10 or 20 years after the work is published. But at least you can check whether the methods are sound and whether the conclusions are justified by the data. We looked at some examples of papers with high altmetric attention in http://www.dcscience.net/?p=6369 and in http://www.dcscience.net/?p=156#follow . In those cases the papers were shown to be very dodgy when you looked at their contents.
Any attempt to measure the value of work without looking at the contents is doomed to failure. In fact it’s worse than that. It corrupts science by encouraging gaming and sensationalism (i.e., dishonesty). You should think very carefully about the unintended consequences of bibliometry.
Your use of “decent” suggests that you are making a value judgement, probably a self-fulfilling one. But if you have data to back it up, let’s see it. Beyond that, I said importance, not quality. I find the term “quality” to be probably meaningless in this context. Importance means the paper causes things to happen. This covers a lot of different phenomena, some immediate and some distant in time. Thus there are a lot of different things to measure, possibly to predict as well. But while expert opinion may be valuable in assessing or predicting importance, I consider it less useful than scientific analysis.
For example, science, like all human systems, is prone to fads, in which work judged to be important turns out not to be. Conversely, as you point out, work that is ignored sometimes turns out to be very important. In short, importance is a set of phenomena, not a matter of expert judgement, so judgement is no substitute for science.
A few points related to your blog post, which I can appreciate in some respects.
I do not for a moment suggest that any metric should be used in place of actually reading and considering a particular content object. No algorithm or analysis can replace the nuanced reading of a paper by a specialist in the field. That is not the point of bibliometrics, or of assessment generally. The point is to compare related objects quickly and in relative terms. It would be foolhardy to presume the value of any work, publication or paper could be summed up in a single number, whatever its source. However, metrics are valuable tools in a variety of contexts.
And contrary to your protestations, people do use metrics of all different kinds. They might not rely on the impact factor as gospel to determine what they read, but it is one component of a suite of information about a publication (say, the editor, the editorial board, the publisher, the author, the opinions of others) that people take into account when considering a publication or its value. Some of these metrics are qualitative and some are quantitative. One isn’t inherently better or worse than the other; again, context is important.
In an era when the volume of scholarship far exceeds any single person’s ability to read and absorb all that is published, we all rely on filtering mechanisms to help identify and prioritize our content consumption. Some of these are historical, some qualitative, while others are based on personal relationships or experience. I firmly believe there is also great value in a quantitative approach to sorting and prioritizing–indeed assessing–content based on various metrics. That is not to say that any quantitative assessment is the only metric, but it is one of many that play into decisions about what to read or the relative value of what was read.
The vast majority of researchers are already relying heavily on various metrics, even if they are unaware of it. Whoever uses Google is using an “altmetric” based on bibliometric analysis of webpages, and a significant percentage of researchers use Google as their first stop when seeking information on the web. Even those who turn to discovery tools specific to their own field have internal metrics to highlight specific articles over others. Library acquisition decisions at most institutions include some form of assessment, with many metrics playing a role: not just the IF, but usage, cost per use, etc. As the amount of available information (publications, data, etc.) continues to grow, people will rely ever more on these tools to bring information to their attention.
This notion that scholarly assessment based on something other than citation studies must therefore be “Facebook Likes” or counting “tweets” is to misunderstand the potential value in the secondary data being generated by online systems. Yesterday at the NISO meeting on altmetrics, Mike Buschman described five types of metrics that can be captured for assessment purposes: usage, captures, mentions, social media references, and citations. User behavior is a great example of a new metric that can highlight interesting connections or correlations between content forms. As a real-world example, how many of us have bought something Amazon recommended because “people who bought this also bought X, Y or Z”? Similarly, there is value in presenting to users “people who read this also read X, Y or Z.”
Furthermore, scholarly communication is taking place in a variety of forms and formats. As a perfect example of this, your paper was published not in a traditional journal (as you note in your introduction about not publishing it in eLife), but on your blog. Simply because it is posted there does not mean that it is somehow outside the scope of literature that “counts” and that it is value-less. Scholars are creating a variety of content (data, software, theater productions, etc.) that is not captured in the traditional scholarly publications ecosystem. Part of the value of these new metrics is that they can more fully represent the outputs of a researcher’s work that are not captured by traditional assessment.
Before you respond saying that any scholarship worth reading should be part of the scholarly record, you might consider both the time you invest in writing on your blog and the time the community spends reading this blog (and others). Those contributions to the scholarly conversation needn’t only take place in journals or in a traditional article format to be worthy of recognition.
All I can say is that we must talk to very different people. I never came across a scientist who used any sort of metric to decide what they should read. It simply isn’t how science works. That’s just a fancy of bibliometricians, a way to justify their existence.
I can imagine that a librarian might want to use citations to decide which journals to buy. If, that is, they are lucky enough to have the choice, rather than being forced to pay millions to predators like Elsevier for a bundle of journals, many of which are junk (that’s not an exaggeration; you’ll find some numbers at http://www.dcscience.net/?p=4873 ).
They may also be interesting to journal editors who are trying to think of ways of gaming the system, but that is not an activity to be encouraged.
I can’t comment on Mike Buschman’s talk because I wasn’t there, but your description makes it sound like more bibliometric games that are unrelated to how research is really done. Well, apart from research in bibliometrics, which is rapidly becoming a self-referential discipline with little relationship to the real world of research.
I’m a single ion channel biophysicist. My blog is a post-retirement hobby. It gives me the chance to say things that younger scientists dare not say because they are terrified of repercussions. If you want to see my serious science, check my Google Scholar profile. We posted that piece on my blog because I’m at an age when I don’t buy green bananas and I couldn’t be bothered to spend another year haggling with reviewers. Post-publication peer review is the way we are heading anyway (please feel free to comment on the blog; one problem is that post-publication peer review gets split across many sites).
In fact some of my posts on the blog seem to have attracted readers (about 3.5 million page views since I started). That, oddly enough, doesn’t seem to get counted by any bibliometricians and I can’t get DOIs for any of them (much too expensive).
The role of metrics in reading decisions is itself an interesting and potentially important scientific question. But “I don’t know any” is not a lucid analysis. The flow of knowledge in science is a very complex diffusion process which is poorly understood, so your “that is not how science works” is also facile.
One could start with a reading decision model which might be useful in its own right. For example Phil Davis’s finding of long half-lives suggests that most reading is need driven not curiosity driven or keeping up with the field. Then too the influence of metrics may well be indirect. For example it is frequently said on these pages that scientists use IFs to help decide which journals to submit to.
It is amusing that you cite the altmetric of page views as evidence for the importance of your writing that altmetrics are useless.
My point was that altmetrics don’t include page views of my blog, as far as I know. Even if they did, it would say very little about my real science which doesn’t appear there. As I said, the blog is a hobby, in which I try to explain ideas about evidence to a wider audience. Here is some real science http://www.onemol.org.uk/Colquhoun%20&%20Hawkes-1982-ocr.pdf
If I were to blog about that I’d have very few readers.
Blogs are more than a hobby in some fields. In law, for example, some of the most influential publications in the field now are coming from senior law professors who blog and prefer this method of communication to writing articles for law journals–which, of course, are unique among scholarly journals in NOT being peer reviewed!
Indeed Sandy, and my new blog on horse cognition (http://horsecognition.blogspot.com) is far from a hobby. I am introducing a new theory of animal behavior, which may be my next research field, so it is commercial in nature. Then too, many blogs are policy advocacy sources, including David Colquhoun’s.
Actually, page views of your blog are an example of altmetrics, which are not confined to scholarly articles. See http://en.wikipedia.org/wiki/Altmetrics where you are cited, by the way; another altmetric.
Yes, I’d noticed that I was cited in Wikipedia. It’s a good example of failing to distinguish between positive and negative citations!
Aha horse cognition. That’s rather different from my sort of science!
When I read your piece, I generally agreed with the core statement–altmetrics have yet to (and may never) demonstrate an effective measurement for the quality of the work done by a researcher. I’ve referred to it elsewhere as the “hole in the altmetrics donut.”
But that assumes the only value that comes from measurements of scholarly publication is in assessing quality for things like career or funding decisions. While those are important, there are other uses for papers, and other stakeholders with other needs. Those stakeholders may not have the depth of subject area knowledge to judge the work fully on their own just from reading it.
If I’m a Development Officer at a research institution, my job is to secure donations from patrons to support research efforts. It may be helpful for me to know which research projects at my institution are capturing public interest through social and traditional media channels. I can then use these as a way to pique donor interest. If I’m a beginning graduate student struggling to keep up with the literature, seeing what my colleagues are bookmarking on a site like Mendeley can help me filter the literature and prioritize what I should be reading first.
So yes, I’m fully with you in suspecting that attempts to reduce what is essentially a qualitative judgement to a quantitative algorithm will never be completely satisfactory, but there are other areas where metrics may be useful.
On a semi-related matter, there’s a strange correlation in the minds of many between altmetrics and open access. Your article continues this confusion, with its many asides about OA and such. Realistically, any new system for measuring researcher work will apply across all types of publication, regardless of the business model of the outlet. It would perhaps reduce confusion to separate out unrelated (though interesting) issues.
Indeed David, I saw that confusion in the NISO discussions yesterday. The overlap may be social media.
I can see that measures of public attention could be useful for raising funds (as long as the donors don’t spot the fact that public attention is often focussed on the sensational stuff that is most likely to be wrong).
But if a beginning graduate student has to rely on Mendeley to decide what to read, I’d recommend them to find a better supervisor.
It’s true that I feel strongly about open access (e.g. http://www.dcscience.net/?p=4873 ), but the main reason it gets mentioned in our altmetrics post (http://www.dcscience.net/?p=6369 ) is because so many of the papers that get publicity, at least on Twitter, were hidden behind paywalls, which is one reason that many tweeters hadn’t read them. It’s one of my huge disappointments that Jeffrey Beall, who did such a service in listing predatory journals, is actually against open access.
The fact is that most scientists are far too busy to be on twitter, so intellectual standards are not always very high. It’s fun for politics and news but it’s been very rare for anything useful about my day job to come up.
Another example might be a publications planner from a pharma company–knowing the usage levels of the articles they’ve put together might be a valuable way to judge their performance.
But if a beginning graduate student has to rely on Mendeley to decide what to read, I’d recommend them to find a better supervisor.
I can’t say that my PhD advisor recommended any more than a small percentage of the articles I read as a graduate student. More often I was bringing articles to her attention. I did my PhD in a wet bench lab in a good-sized department which was part of a larger research institution. There was a constant conversation going on about the world of research in which we worked, discussions of papers, journal clubs, etc. I would also scan “Current Contents” for certain keywords, and the table of contents of new journals as they hit the conference room table.
That’s pretty anachronistic these days, and given the enormous growth in the literature, maybe not the best way to keep up with things. Perhaps more importantly, the inherently social nature of the laboratory and the type of work I was doing is not automatically in place for every type of research. A computer scientist may do all his/her work in isolation on a laptop. A comparative literature researcher similarly has no “lab” to hang out in. So altmetrics can potentially offer a proxy of those social cues for researchers who are isolated or disenfranchised.
“Another example might be a publications planner from a pharma company”
The main aim of publications planners for pharma companies seems to be to publish positive research as often as possible, and to suppress any results that might harm sales, even if they harm patients.
I already commented on how I find things to read at http://www.dcscience.net/?p=6369#comment-12258
Usually I find more than I can read properly from lab discussions alone. Occasionally citations are useful for the reason the SCI was originally introduced. Looking for people who have cited a good paper is a way of looking forward in time. Of course this has nothing whatsoever to do with counting citations of a paper as a proxy for its quality.
In the world of clinical trials there has been huge condemnation of the use of surrogate outcomes, because they so often mislead one about how well treatments work. Metrics are 100 percent about surrogate outcomes. That is one reason that they are a corrupting influence.
“Usually I find more than I can read properly from lab discussions alone.”
As stated above, there are many types of research and many environments in which research takes place. Not all offer laboratories or social groups of peers for such discussions.
I can speak only of science. All science depends on experimental data, even for theoreticians. If they don’t talk to other people in the field, they aren’t doing their job properly.
A lot of science is based on observation not experiment, and both are based on theory. And research indicates that more of what researchers read is found via discovery than social contact. It is not useful to make broad unsubstantiated claims about how science works. How science works is something we are still trying to understand.
Regarding this supposed donut hole and the failure of altmetrics to measure quality, note that the “I” in IF refers to impact not quality. No one I know of is trying to measure quality, probably because the concept is too vague to operationalize. Altmetrics have arisen because the network of human thinking is now visible in many new ways, so we can now explore impact in new ways.
It’s a good start if we can agree that altmetrics have nothing to do with quality.
A publication can have “impact” because it is very good, or because it is terrible. The fact that no distinction is made between these cases is one reason why altmetrics is a corrupting influence.
As far as Twitter etc. go, an even more common case seems to be that papers have impact because they have buzzwords in the title. Anything about diet gets lots of retweets from people who haven’t even bothered to read the paper. So it acts as a medium to propagate the spin and hype in the title or abstract. This may be a service to the authors, the journals and university PR people, but it is a disservice to science.
You have missed my point, which is that the concept of quality is uselessly vague. If you give me an operational definition of quality that is significantly different from impact, then I will tell you how altmetrics relate to that. Beyond that, some altmetrics do distinguish positive from negative impact while others do not. For that matter, citation-based measures usually do not, even though a citation may well be negative.
It is true that many more titles are read than abstracts, many more abstracts are read than articles are skimmed, and many more articles are skimmed than read fully. This is important when it comes to modeling scientific communication, but it is not an argument against altmetrics per se, quite the contrary. These important differences can now be measured to a significant new degree.
Note that I am using a broad concept of altmetrics while you seem to be using something like the narrowest possible. Every specific metric has its limitations. That is not an argument against its utility, just a limit to it which needs to be recognized. And it is certainly not an argument against seeking metrics, given our vast new data resources. Every science is based on measurement and the science of science is no exception. Do you object to science itself being studied?
I can give you plenty of examples of individual papers that have, in retrospect, turned out to be of very high quality. Sometimes that was obvious immediately. In other cases it took decades. The mistake made by all bibliometricians is to neglect to look at individual papers.
It should be obvious from the paper I cited ( http://www.onemol.org.uk/Colquhoun%20&%20Hawkes-1982-ocr.pdf ) that I’m totally in favour of measuring things. But only if they are measurable. One of my interests is the estimability of parameters, particularly in over-parameterised problems. That has the consequence that I’m not in favour of attempts to measure the immeasurable, proposed by the semi-numerate.
I do not see your point. I am well aware of cases of delayed impact, as many are famous. So what? I can easily measure that delay. But if you are claiming that quality is immeasurable then I agree, but it is because the concept is hopelessly vague as I have said, hence scientifically useless. That is why we are trying to measure impact.
But you still have not said how quality differs from impact. Can a paper that is never read be high quality? If so then why do we care about quality?
Regarding the time profile of impact you might look at my team’s work here: http://www.osti.gov/home/innovation/research/diffusion/. We used a disease model to explore the potential effects of improved communication. But the relevant point is that impact is not a matter of reading an individual paper, rather it is what that paper causes to happen downstream over time. That can be measured by counting a great many papers, not by reading them, although it may involve analyzing the texts semantically.
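The disease-model idea mentioned above can be illustrated with a minimal sketch. This is a generic discrete-time SIR-style diffusion model, not the OSTI team’s actual model, and every parameter value here is hypothetical: a paper’s “impact” is treated as awareness spreading through a research community by contact, then fading as spreaders move on.

```python
# Illustrative SIR-style sketch of idea diffusion among researchers.
# All names and parameter values are hypothetical, chosen only to show
# how downstream impact over time can be modeled rather than read off
# a single paper.

def diffuse(population=10_000, seeds=10, contact_rate=0.3,
            loss_rate=0.1, steps=52):
    """Discrete-time SIR dynamics: susceptible (unaware) ->
    'infected' (aware and spreading the paper) ->
    'recovered' (aware but no longer spreading)."""
    s, i, r = population - seeds, seeds, 0.0
    history = []
    for _ in range(steps):
        new_aware = contact_rate * i * s / population  # spread by contact
        stopped = loss_rate * i                        # spreaders moving on
        s -= new_aware
        i += new_aware - stopped
        r += stopped
        history.append((s, i, r))
    return history

history = diffuse()
peak_step = max(range(len(history)), key=lambda t: history[t][1])
print(f"Active spreading peaks at step {peak_step}")
```

The time profile of active spreaders (the middle compartment) is the “time profile of impact”: it rises, peaks, and decays, and a delayed-impact paper simply has a late or second peak under different parameters.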
The discussion above suffers from a specific confusion, which is that of the vague concept. In this case the concept is altmetrics, and the confusion is quite natural since this is an emerging concept, hence poorly defined. In particular, sometimes the term is being used to refer to something like a specific set of approaches, perhaps even specific products, especially those that focus on social media. Other times it is used to refer to a broad emerging field where many approaches are being explored, some of which have nothing to do with social media. I imagine that careful analysis would reveal several intermediate uses of the word as well. Under these circumstances the phenomenon of talking past one another is almost inevitable, and so we see it here. This sort of talking past is characteristic of emerging fields, which altmetrics certainly is. Confusion is the price of progress.
“I am well aware of cases of delayed impact, as many are famous. So what? I can easily measure that delay.”
If you did that, you’d say nothing about altmetrics for another 10 or 20 years. I see no sign of any such restraint!
“Can a paper that is never read be high quality?”
It’s certainly the case that very mathematical papers are less read, and often less cited, than easy ones. How many people have read Higgs (1964)? In my own case, the fundamental results that were needed to implement maximum likelihood estimation of rate constants are mathematically difficult (eg http://www.onemol.org.uk/?page_id=175#hjc90 ) and they have been cited much less than the papers that use them.