Sun Kissed Orcids (Photo credit: Naiyaru)

“Many of the things you can count, don’t count. Many of the things you can’t count really count.” — Albert Einstein

Measuring the value of science is difficult. A recent report in the Chronicle of Higher Education outlined the difficulty policy-makers and scientists are having calculating the return on investment (ROI) for science funding — and the potential pitfalls in making the measurements, including the cost and labor involved, the huge opportunities for inaccuracies, and the classic problem of creating incentives for the wrong things. Another recent essay on Inside Higher Ed critiqued CourseSmart’s analytics, which purport to estimate student preparedness by measuring their engagement with an e-textbook. The essayist, an experienced teacher, knows he has a better way of evaluating preparedness — grades for work completed.

CourseSmart’s motivation is a classic marketing attitude, as their SVP of marketing says:

The big buzz in higher ed is analytics. Based on what we had and what issues there are with institutions around improving the return they’re getting on their investment in course materials, we realized we had a valuable data set that we could package up.

John Warner, the author of the essay, has this response:

CourseSmart is peddling a product for which there is “buzz,” but no actual need. This strikes me as a particularly 21st Century mindset.

Still, science has a strong quantitative bent, and we’ll always be trying to measure whatever we can. But are the important things available for reliable measurement? And can we accept that they may not be?

Thinking about the goals of altmetrics — identifying content that’s more relevant, more interesting, novel, or important, and doing so as quickly as possible after publication or, better yet, helping authors to find the best match for their works — made me wonder if we’re missing some obvious alternatives to metrics, ones based on words rather than numbers. Not alternative metrics (or altmetrics), but alternatives to metrics (which I’ll call alt2metrics).

In my experience with authors and editors, metrics are rarely definitive, but can add a little bit of information to an already information-rich landscape. Most authors know where they want to publish, and these preferences can be based on all sorts of factors, from the journal’s reputation to the editor’s reputation to the perceived audience the journal reaches to publication policies (speed, review process, perceived rigor). More often than not, academics and researchers are dismissive of metrics, as they’ve seen how, once you poke at them, they end up being relatively blunt measures with little nuance or depth.

I recently came across a blog post from 2011 along these lines, which has more of an organizational behavior perspective, but contains a lot of truth:

Metrics are one dimensional, human beings are not. . . . Prioritizing things that can be measured over these kinds of things has been very, very costly to business. We have, in the name of metrics, hollowed out our organizations, our organizational cultures and the employee-employer relationship. Assuming that we have got to be able to measure something to acknowledge its existence seems reckless to me and leads us down a very inauthentic and unproductive road.  It is a false constraint.  It is a false constraint supported by antiquated archetypes of the organization, of management, and the value creation process . . . and those that sell us metrics.  We need not to struggle for measurement of things that cannot be measured, but help our organizations better understand the intangibles that are so valuable today, and that we can still pay attention to and even prioritize things that cannot be directly measured

It’s worth unpacking some parts of this quote.

First, there is a fetishization going on with metrics, and one that’s not necessarily helpful. The belief in metrics fosters a feeling that if you can’t quantify something, it’s illegitimate or lesser in some way. However, if you can express a dimension as a whole number, or better yet, with decimals, it’s somehow more real or considered. We may own powerful brands, but if our PageRank is only a 6, well, it’s nights crying in the pillow. At the same time, the criticism of the impact factor is that it’s too simple, too widely embraced, and an illegitimate substitute for a qualitative evaluation of a scholar’s or researcher’s work, which drives the fetishization to another level of dubious: the next number will be better, we just know it!

Second, those who are pushing metrics are usually those wanting to sell us metrics, either directly or indirectly; that is, they’re salespeople at some level. Whether they’re in it for the money, the academic novelty, or a bit of both, metricians have vested interests in making everyone uneasy about not having metrics for things, even those things for which metrics are essentially misleading or inadequate, and likely always will be. Putting metrics into perspective, as useful adjuncts in specific circumstances, helps you to not overinvest in systems and tools that can look amazing, but usually end up with an inadequate return on investment.

Third, intangibles matter, and may be more valuable than things we can measure. Even those who embrace altmetrics know this, and operate accordingly. After all, they have no metrics about how effective their altmetrics might be; they are basing their enthusiasm on intangibles, like hope, ambition, novelty, and rebelliousness. These are all important intangibles we should applaud, but they are not themselves measurable motivators. Authors and readers in scientific publishing possess some of the same motivations, to which we can add fear, resentment, bitterness, hubris, pride, vanity, ego, passion, curiosity, commitment, professionalism, civic duty, and many other unmeasurable attributes that propel science forward.

Perhaps key to all this is the fact that metrics take time to assemble — they are delayed, and secondary to activity. Non-measurement-dependent signals are more important, anticipatory, and upstream from metrics. And they are what scientists rely on every day to guide them and their searches for information. It would be a shame to spend all our time on secondary, derivative measurements while primary, original signals of value are ignored or downplayed inappropriately.

What are some of the primary, root signals of value that scientists pay attention to?

Brand. A journal’s brand is a major signal of various qualities and aspects: the focus, the reputation, the likelihood of the information inside being important, the editorial process, the longevity, the culture, the position in the overall market. To measure all these things numerically would be impossible. Yet, they have incredible value as guides to both authors and readers. Brands have predictive value.

Authorship. Authors also have predictive value for fellow professionals, and most scientists read within their spheres. This is something no metric in the world can change — for instance, does it matter to an engineer that a cancer review journal has a high Eigenfactor? Relevance and interest are trump cards for readers, and when the right author publishes an article on an interesting topic, it will grab more attention. With the implementation of ORCID, author names will become more like data, but not quantitative data. Rather, ORCID turns author names into structured data. The value is increased by reliability and disambiguation, but no numbers are needed.

Results. Studies also provide strong signals that aren’t quantifiable. Usually, in research communities, there are large-scale trials underway that people know about — the funding is significant, the labs involved are numerous, and the results are highly anticipated. Once they’re published, word spreads like wildfire, and downloads spike. Studies with catchy names can do even better, and this non-metric trend of coming up with acronyms has gripped science for almost two decades now.

Sponsorship. Society, funding, and academic affiliations also send signals readers and researchers can use to assess relevance and quality. The orbit into which a study falls is an important differentiator. Is it a mechanism study? A therapeutic study? Preliminary? Clinical? Who funded it? NIH? Or Pfizer? Did it come from a major physics center? Or a community college?

Altmetrics may be all the rage, and it will be interesting to see if its advocates are able to come up with a few measures that are meaningful, robust, and durable. Currently, there are plenty of non-quantitative signals at work, ones that are useful, enduring, rich, addressable, human-readable, and well-understood; they just aren’t measurable.

Publishers and editors know how to use these signals — build stronger brands; get better authors; grab the best studies; put them into the most appropriate outlets; get them out as soon as feasible; and ensure discoverability. Follow this non-quantitative formula, and the metrics can measure the particulates in the air while you leave them in the dust.


Kent Anderson

Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


12 Thoughts on "Metrics and Meaning — Can We Find Relevance and Quality Without Measurements?"

I’m not sure metrics are avoidable. Simply put, time is money, and metrics provide necessary shortcuts, particularly for those without the academic backgrounds necessary to fully comprehend and judge the work in question. When a job posting gets 500 seemingly qualified applicants, or a funding agency takes in thousands of grant requests, there needs to be a way of narrowing down the pool of applicants rapidly to allow for in-depth analysis of the top tier. What’s really interesting about the study of new metrics is that we’ve reached a technological point where we can track the life of an article, unlike the black box that the print era offered. The point now is to sift through the haystacks of data that can be measured to see if any needles exist. And like all data, a small percentage is useful, most is meaningless, and some can be misleading.

Your point about journal brand is important though. When an article has been published in a high quality journal, it has been through a rigorous peer review process, with an experienced, knowledgeable editor driving that process, finding absolute top experts to perform the review to a set of high standards. Because that process happens behind the scenes, it seems to be ignored by much of the altmetrics movement: it’s harder to quantitate, and favoring anything related to publishers is currently unfashionable. Instead there’s focus on post-publication comments or rankings, which are far less worthy. Which is more meaningful, a thorough review by a set of carefully chosen experts, or a stochastically driven comment left by some random reader? The Impact Factor currently stands in as the measurement of journal brand, and it’s imperfect, but not that bad a metric for doing so. But brand alone (and Impact Factor alone) don’t give the full story. I think the movement to article level metrics is important here. Something as simple as an article’s citation count tells us much more about the individual work than the Impact Factor alone.

The state of things reminds me quite a bit of where we were with science blogs a few years back. Those financially backing blogging companies and those hoping to make their reputations and careers by blogging gave us a nonstop stream of hype: the future of science is blogging, all scientists will blog and read blogs and comment on blogs constantly. Over time it quickly became evident that this is a niche activity, but not one without value. Science blogs provide community for the disenfranchised, and have often offered up useful analysis and criticisms that have failed to materialize through other channels like article comments. They’re useful for driving social agendas in science, but have not become the substitute for journals or the repositories for open science data that they were suggested to be. On the down side, they’ve also become a source of free labor for old media looking to cut costs and get rid of professional science writers. Why pay someone to write an article when so many less adequate writers are willing to do so for free?

We need to follow a similar path here, and let things play out, let the hype be tested and prove itself worthy or unworthy. Something will emerge from altmetrics, I have no doubt, but what that something is, well, we’ll have to wait and see.

It seems to me Kent that you have hit the nail on the head! I found that the number of hits on the web site did not result in increased sales or any other measure of value to my company or to the journals.

I wondered, as I received these reports that showed number of hits, time of hits, frequency of hits, what was hit, etc., just how I would use the information. Should I go to my journal editor and say: In your next call for papers or talks with colleagues who are penning a paper, be sure to mention that at 1 a.m. some 3,000 people looked at the web site and 100 at your Journal’s TOC? Personally, I have looked at a document online, found it to be worthless, yet my hit is counted in some metric as if it has value.

I am reminded of the oft said statement: The web moves in dog years! An interesting altmetric at best.

This discussion of metrics, like most of what appears in TSK, focuses on journals. If you think quantitative metrics are not very good at measuring quality for journals, or journal articles, consider how little measurement of any kind exists for scholarly books. Sales are a very crude measure of quality because we all know that some bad but controversial books, like The Bell Curve, can sell extremely well. Book prizes are a better measure, especially those awarded by scholarly associations that use panels of experts to select the winners, but they are very unevenly distributed across disciplines. E.g., the field of political science has a great many prizes, offered not only through the American Political Science Association but through other affiliated organizations too, whereas the field of philosophy has hardly any prizes; the American Philosophical Association sponsors only one prize, the Matchette Prize, and it is awarded only to philosophers who are not over 40 years old. Book reviews are helpful, but many appear long after a book’s publication and not every book gets reviewed in the top journals. Building a brand as a scholarly book publisher can be a real challenge!

Good article! No editor can afford to be unaware of the Impact Factor and downloads. But, that’s not the whole story! I also judge my journal’s success by the comments I get attending our conferences. You have to give your readers what they want and need.

I think the issue is that we are forcing metrics to measure usage and subsequently value. In the first instance, the issue is that “usage,” as one of the commenters above has noted, may simply be someone coming to a web page to view an abstract. No metric really shows what mental evaluation of the abstract may be tied to that view. It’s just that: a viewing. But I, as the researcher, may still have “used” the content. I evaluated whether it was relevant to my work. If I don’t choose to download or print the associated paper, does that negate the value of the paper, or does it simply mean that the paper wasn’t useful to my particular purpose? The metrics offered up (whether traditional, altmetrics, alt2metrics, etc.) will never capture that activity. Even citation data is doubtful as a metric because sometimes citations refer to work that is being dismissed as erroneous by later researchers. The flaw in our system is the constant desire to force quantitative metrics on to what is essentially a creative endeavor.

Anyone who works with these various metrics for any length of time eventually recognizes that no metric is useful on its own. We need multiple metrics (both quantitative and qualitative) to make worthwhile judgments. The metrics Kent references are generally judgment calls in and of themselves. A brand is only useful insofar as it delivers a short-hand message about what that brand encompasses; similarly, the other ones named deliver their own meaning. But if I don’t trust the sponsor or the author or the brand (journal), then those are insufficient. The issue (in my own view) is insisting on use of metrics to make assumptions regarding use and value that must necessarily differ according to who is evaluating and *using* the content.

I think that eventually there will be a crossing of the two paths: qualitative and quantitative measures. After all, in hard economic times like the present, the public will demand an accounting of publicly funded research. But also, industry will demand an accounting, and higher education institutions will demand it too. Gone are the days when you can spend a lifetime pursuing your research passions without anything to show for them. We (the public, industry, your parent institution) will not let you. Sorry.

But I believe the discipline of bibliometrics itself will provide solutions. You talk about brand, authorship, results and sponsorship. But bibliometrics is currently working out ways to quantify all of those things. Just take results. Altmetrics is now tracking impact beyond citations. And there are stirrings of developing research quality indices, citation sentiment indices, and even prediction indices. One day soon, metrics will be able to quantify the qualitative.

It’s like saying you can’t quantify love. But then you count on all the words and actions you expect from someone who claims they love you. It’s almost unscientific to make a blanket statement that something cannot be measured. Someone might prove you wrong.

The quantitative and qualitative are always mixing, but the proper mix isn’t certain, and it likely isn’t completely quantitative. Even scientific principles aren’t completely quantitative (Koch’s postulates come to mind). One aspect of some current and reliable qualitative measures is that they are anticipatory, or provide very rapid signals about quality and relevance. Metrics (alt or not) take time to accumulate, sort, analyze, and present. It may get faster, but there’s also a Worm Ouroboros thing going on here: let’s see if we can make an alternative to the impact factor that we’ll know we’ve done when it’s as good as the impact factor; let’s measure brand and we’ll know we’ve done it when it’s as useful as branding; let’s measure author effects which we’ll test against qualitative author effects. What have you gained? The standard hasn’t changed, just your way of arriving at it, and the new way of achieving the same effect may take more time and be less addressable in reality.

Five centuries ago there was a great philosophical debate about the quantity of motion, which led to the discovery of dynamics. Then concerns about the quantity of heat led to thermodynamics. The quantity of work issue spawned scientific management a century ago. Today it is the quantity of thought, especially the quantity of importance. There is yet another new science in here somewhere.

Metrics are not about one dimension, and metrics are not about selling metrics (this was an exaggeration of an already biased opinion).

Metrics are required for repeatability and for a better understanding of cause and effect.

Metrics are required for training and improving everyone, so that the so-called intangibles do not remain with a few, and good intangibles do not remain restricted, so that society as a whole can improve.

Metrics help us to set a goal, achieve it, and exceed it.

Please do not be myopic when discussing metrics; look at the human, social element of metrics. You simply can’t avoid, and should not avoid, having metrics. You should therefore keep doing research to find more appropriate ones.

I agree with David that intangibles are only those things which are not yet measured, but they will be measured eventually by yet another science. This helps in understanding intangibles better, recreating them, distributing them to all if they are good, and protecting ourselves from them if they are bad.
