
You probably remember reading about it — the experiment allegedly demonstrating human precognition, something akin to telepathy. It was published in a reputable journal, the results were statistically significant, and the media coverage was plentiful, if a little sporadic. There was something a little hesitant about it, as if even the reporters were having trouble accepting the basic premise, but the New York Times, New Scientist, the Guardian, the Telegraph, and New York magazine all covered it.

But it was a positive result, it was statistically significant, so it was published.

The study used cards, some of which had erotic pictures on the back, to test whether subjects could look into the future and discern which card they would soon discover had a risqué image on its reverse side. (Some wondered if the test was for telepathy or pornotelepathy, but that’s beside the point.)

Ben Goldacre, a prominent voice of skepticism in science, was, like most of us, nonplussed by the results and wary of the paper’s extraordinary claims, which needed to be backed up by extraordinary evidence.

And what are extraordinary claims? It seems some psychologists enjoy double-speak, judging from a recent article in Psychology Today about this same precognition paper. The author, Satoshi Kanazawa (himself an evolutionary psychologist with a PhD), quibbles with Carl Sagan’s quote that “Extraordinary claims require extraordinary evidence,” parrying the Sagan quote with Elayne Boosler’s “Popcorn is magic if you don’t know how it happens.”

Aside from the inanity of attempting to repudiate Sagan with Boosler, Kanazawa is being disingenuous — we can experience popcorn together, watch it pop together, buy it readily, and eat it endlessly. Popcorn is nothing extraordinary — not making it, not buying it, not eating it. How it happens isn’t a mystery — water in the kernel boils when heated, and the expanding steam causes the kernel to explode. No strange mechanism of action, nothing extraordinary about the results, nothing hard to observe or experience. And pretty tasty.

Claiming to be able to predict the future is not an ordinary claim. Seeing into the future breaks the physical laws of the universe as we understand them, and is currently considered impossible. It would be absolutely amazing if someone could do it. Therefore, it requires a lot of evidence to show that someone can actually predict the future. Not only that, but you’d better have a theory you’re pursuing, a framework in which it might be conceivable.
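One way to make "a lot of evidence" concrete is Bayes' rule. The sketch below is purely illustrative: the function name and every number in it are assumptions chosen for the example, not figures from the precognition paper. It shows how the same positive result moves belief very differently depending on how plausible the claim was beforehand.

```python
def posterior_probability(prior, p_result_if_true, p_result_if_false):
    """Bayes' rule for a yes/no hypothesis after one positive result.

    prior             -- belief that the claim is true before the study
    p_result_if_true  -- chance of a positive result if the claim is true
    p_result_if_false -- chance of a positive result if the claim is false
    """
    numerator = prior * p_result_if_true
    denominator = numerator + (1.0 - prior) * p_result_if_false
    return numerator / denominator


# Hypothetical inputs: a study with 80% power and a 5% false-positive rate.
# The priors are made up to contrast an everyday claim with precognition.
ordinary = posterior_probability(0.5, 0.8, 0.05)
extraordinary = posterior_probability(1e-6, 0.8, 0.05)

print(f"ordinary claim, after one positive study:      {ordinary:.4f}")
print(f"extraordinary claim, after one positive study: {extraordinary:.6f}")
```

Under these assumed numbers, a single significant result makes an ordinary claim quite credible but leaves an extraordinary one overwhelmingly unlikely, which is one reading of what Sagan's line asks for.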

Back to Ben Goldacre. Whatever deeper objections he may have to the underlying study, his complaint in his recent post is about the aftermath of the media coverage around this paper: when other scientists replicated the experiment and found no effect, those negative results received nowhere near the media attention the initial findings did.

Not that it didn’t receive any attention — the blogosphere tackled the paper immediately, with one psychologist who blogs noting that “I don’t believe a word of it because a) let’s face it, it’s about precognition and b) there’s simply no effort to propose a mechanism that might support such an outrageous claim.”

Goldacre’s narrow complaint is about the mainstream media and how it continues to miss opportunities to provide more than merely superficial coverage. With niche coverage emerging all around, this flaw is more and more apparent. He ends his piece with the following observation:

What’s interesting is that the information architectures of medicine, academia and popular culture are all broken in the exact same way.

The information architecture of popular culture has been broken for a while now, but may be healing thanks to blogs, search engines, smartphones, and the like, all of which take control out of the hands of the few and put it in the hands of the many. We’ll see if the accumulated wisdom of niche experts and endless monitoring from users can compensate for superficial coverage from marginal journalists.

But the mainstream media problem may be the least of the concerns this study raises. While an idealist might think it was a good thing to have the precognition study published because this allowed for attempts at replication and ultimately refutation, there is perhaps a greater cost to science from misinformation: the costly distraction of doing needless experiments to disprove claims based on statistics sans theory. How empty of theory was this paper? Let the authors explain:

The term psi denotes anomalous processes of information or energy transfer that are currently unexplained in terms of known physical or biological mechanisms. The term is purely descriptive; it neither implies that such phenomena are paranormal nor connotes anything about their underlying mechanisms.

Psi . . .

Theory is important here: it’s the “so what?” part of the experiment. So what if there is a slightly better-than-random chance of someone guessing there’s a dirty picture on the back of a card? What’s the theory? What does it portend? That people’s minds can travel forward in time? What’s the proposed mechanism for this? Can we control it? Are the brains of those who performed better observably different?

If extraordinary claims are being made, it’s a waste of everyone’s time if there’s no theory supporting them and if the evidence supporting them isn’t extraordinary.

Many of the studies refuting the initial experiment will be published, and the instigating study will likely be debunked. The problem isn’t just getting psychologists to know the study didn’t hold up. The problem isn’t only that a distracted public spent a few minutes rolling its eyes over a convoluted precognition study in a scientific literature it already thinks is pretty unreliable. The more pragmatic problem is the time wasted by scientists having to replicate and refute the wild assertion that was statistically significant but not theoretically meaningful.

Things like this make me reflexively feel we should publish less, be more selective, and stop polluting the literature if something just doesn’t make sense or isn’t likely to make a whit of difference in anyone’s life.

Niche publications in the sciences break their bond with their audience, but not in the same way the mainstream media does. The mainstream media breaks it by abandoning the audience after titillating coverage. The journals world does it by publishing too much baseless research and thereby wasting scientists’ time.

Negative results will become known in the sciences, but the effort taken to produce them in the face of strange outcomes from pointless experiments is much more costly than a throwaway newspaper story.

Kent Anderson


Kent Anderson is the CEO of RedLink and RedLink Network, a past-President of SSP, and the founder of the Scholarly Kitchen. He has worked as Publisher at AAAS/Science, CEO/Publisher of JBJS, Inc., a publishing executive at the Massachusetts Medical Society, Publishing Director of the New England Journal of Medicine, and Director of Medical Journals at the American Academy of Pediatrics. Opinions on social media or blogs are his own.


25 Thoughts on "Positively Predictable — The Multiple Costs of Mindless Studies"

I think you are on thin ice here. Didn’t that Newton guy claim the existence of gravity without any theory of how it worked?

No, it was quite theoretical: “every point mass in the universe attracts every other point mass with a force that is directly proportional to the product of their masses and inversely proportional to the square of the distance between them.” It could be expressed mathematically. There was disagreement about whether it was centripetal or centrifugal force. While Newton never explained “action at a distance,” once his theory was in place, we could do new things like calculate the escape trajectory necessary to go to the moon and return.

There was empirical observation that led to the theory, but Newton didn’t publish his evidence until he had the theory as clearly articulated as he could. It was also a widely held theoretical position at the time, with Hooke and others contributing.

The authors of the precognition paper make no attempt at a theoretical mechanism, from what I can see. They just report that an apple fell from a tree in a slightly non-random way.

Perhaps you are aware that Newton’s critics were correct and gravity does not exist. There was no mechanism. We just had the geometry wrong. Let’s let the scientists fight it out. That is where progress lies.

This isn’t a debate about Newton, so I’m going to stop this line of quibbling. The theory of gravity persists, even if some people like making bold statements that it doesn’t. Gravity is a common, everyday, consistent, describable experience that has been reliably mastered mathematically to the point that we can send probes away from our planet and into deep space through mathematical calculations. To say “it doesn’t exist” is mere hyperbole. String theorists and others looking to make science news with radical new theories have a major uphill battle. But at least they’re being theoretical first. I’m all for letting the scientists fight it out, but have a theory at least. And if the data don’t bear out your theory, move on.

To publish data without a theory, especially on a pseudoscience topic like telepathy, seems designed to shed heat, not light, and the less of that we have in the literature, the better the scientific literature will be.

I’m not sure I agree about the importance of theory. At times observations precede a good framework for interpretation. Take the spectral lines that baffled physicists at the end of the 19th century. They were observed time and again and needed explanation – publishing the results meant that a global effort could be made to explain them.

Theory becomes important when selling your hypothesis and when explaining your results. Pure empiricism can do without it and still warrant publication.

Of course, data without an explanatory framework needs to be well established to be published. Replication is key here, and the editors who allowed the psi study failed hard on this count (though adopting this loose standard did head off conspiracy thinking).

Spectral lines were uncovered after a long line of solid experiments revealed a surprising hole in the working theoretical framework, requiring a re-examination of the theoretical framework. The comparison is weak. The precognition study isn’t the surprising result of thousands of scientists working collaboratively on effective physics experiments and, through hard work and keen observation, noticing an interesting anomaly. There was no “claim” that led to their discovery, but true empiricism. The precognition study was an attempt, I think, to take a radical claim that people can see into the future (telepathy and its ilk) without a foundational reason why you might think this, and throw spaghetti at the wall until some stuck. That’s completely different.

VT: Replication (or not) by others normally follows publication. It is not a condition of publication in the first place.

That’s my point, to some extent — if more junk science is published and promoted, more junk replications have to occur, and we spin our collective wheels. I’m all for scientific breakthroughs, but couldn’t we have predicted (without resorting to precognition) that a study proposing telepathy would be proven false?

In many areas of science, raw statistics are very important. The people who collect them (empiricists) are very different from the people who explain them (theorists). Cosmology and particle physics come to mind.

Global warming is perhaps the most prominent case. Here we have statistically significant warming, with an entire community trying to untangle the mechanisms (called “attribution” studies).

The point is that if you want to find some way to keep some class of studies from being published (I don’t) you need to sharply narrow your focus. I am particularly amused by your blanket statement that “Claiming to be able to predict the future is not an ordinary claim.” One of the first things I ordinarily do each day is check the weather forecast (of the future).

Censorship normally fails because the precision it requires is impossible.

In global warming, there was a starting theory — that CO2 was creating a greenhouse effect. Now, the science is trying to see if that’s the cause, or if there are multiple things going on, and just how much of what’s going on is within our control. Raw statistics are important, data is crucial, but without a theoretical framework, even a start, you’re just putting in noise. With theory, noise can start to edge toward signal, and others can expand on what you were trying to show. How can you expand on the precognition study? What’s the theory to test in another setting? How does it lend itself to anything other than replication? How is it extended? What mechanism are we testing?

Observation typically precedes theory and is never “noise.” You seem to be calling for a rule that says no observation shall be published without accompanying speculation. I see no reason for such a rule.

Climate science is full of statistical patterns that still need explanation. Cycles of 11 years, 60 years, 1,500 years: they are everywhere we look, and they get published. It is an entire genre, as it should be. Even the ice ages still have no specific mechanism connecting the purported Milankovitch cycles to ice sheet development.

In the card case remember that statistically significant correlations often occur by chance. They are still worth reporting, because science never knows where it is going.
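The chance-significance point can be sketched with a small simulation. Everything here is an illustrative assumption (the function names, 2,000 imagined studies, 100 guesses per study, a one-sided binomial test at the 0.05 level); none of it comes from the card paper. Every simulated subject is guessing at random, so any "significant" study is a fluke.

```python
import random
from math import comb


def p_value_at_least(hits, trials, p=0.5):
    """One-sided binomial p-value: chance of at least `hits` successes by luck."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(hits, trials + 1)
    )


def count_chance_findings(studies=2000, trials=100, alpha=0.05, seed=0):
    """Count studies of pure guessers that still come out 'significant'."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    significant = 0
    for _ in range(studies):
        # Each guess is a fair coin flip: there is no real effect to find.
        hits = sum(rng.random() < 0.5 for _ in range(trials))
        if p_value_at_least(hits, trials) < alpha:
            significant += 1
    return significant


chance_findings = count_chance_findings()
chance_rate = chance_findings / 2000
print(f"{chance_findings} of 2000 no-effect studies were 'significant' "
      f"({chance_rate:.1%})")
```

With these settings a few percent of the no-effect studies should clear the threshold, which is the sense in which a lone significant result, absent replication, says little by itself.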

You’re overgeneralizing. What I’m saying is that this study shows an extreme claim (telepathy) but middling evidence and no theoretical framework, so it’s pretty weak. Yet, it garnered a lot of attention, leading to three bad trends — superficial media coverage, increased public skepticism of what the heck scientists are doing with their funding, and scientists wasting time replicating something that was, on the face of it, flawed.

Theory exists for a reason. If something falls outside of it, you’d better have damn good data. If, in this case, something attempts to exist without it, you’d better have data that blows everyone away. This study didn’t have that, so there was no excuse for atheoretical science.

You are the one making wild general claims, which is actually my point. What makes data “damn good” other than agreeing with your theories? Why don’t you just say you are against parapsychology (to which I am indifferent) and let it go at that? Leave science alone.

“Things like this make me reflexively feel we should publish less, be more selective, and stop polluting the literature if something just doesn’t make sense or isn’t likely to make a whit of difference in anyone’s life.”

Unfortunately, this would explain a lot of scientific fields. String theory comes to mind as an iconic example.

Your fear in publishing these kinds of studies seems more rooted in how these studies are used to debase (even ridicule) the scientific establishment in the eyes of the public, and I share your concern.

From a scientific standpoint however, the studies you cite have little influence on professional scientists and the direction of scientific research. Tightening standards for acceptability will only serve to make the system even more conservative and slow scientific progress.

Indeed. I have been doing staff work for the US Inter-agency Working Group for Digital Data (IWGDD). We are trying to figure out new rules that will promote data sharing, not restrict it. Reducing public stupidity is a noble goal, but restricting science is not the best approach.

The Popper falsificationists might even argue that the best data is that which does not fit any existing theory.

I think there is a potentially dangerous assumption of status quo in that idealistic view, which is that public sentiment doesn’t impact funding availability. If we aren’t careful about managing the balance of what we publish and promote, we could slowly erode the willingness to fund exciting science. If that happens, scientific progress could be slowed even more than any extra filtering ever would.

I see where you’re coming from here Kent, regarding the idea of publishing raw data, and it’s an argument that often comes up as people talk about “open science” or data repositories, or publishing datasets as part of the literature.

As I’ve argued elsewhere, collecting data makes one a technician. Understanding data makes one a scientist. I’m not against the idea of making data available, I just think a chunk of raw data is a much lesser achievement than a completed study that fully explains that data. It’s also much less interesting for the reader. Data is a starting point for understanding, not an end unto itself.

Maybe a good analogy could be drawn with the release of the human genome. Sequencing the genome was not an actual “experiment”, it had no hypothesis other than a self-reflexive “it is possible to sequence the genome,” and no theory other than “this may be useful for later analysis.”

The data has indeed proven tremendously useful for further analysis, but in and of itself, it’s a meaningless string of letters. It raises an interesting question of making editorial judgment calls on when it is reasonable to publish this type of material. The genome was clearly a huge technological achievement and offered great potential for further research. I’m not sure the same can be said about this card guessing paper.

I don’t think the great experimentalists would share your view that observation is not science. Nor would the Popperians, given that their goal of science is to falsify theories (not that I agree). Are you seriously suggesting that the genomes are not worth publishing, because they have no theoretical content? Perhaps we can do away with observation and experiment, given that they are not science, as it would save a lot of money.

No one is arguing that this card experiment is important. What I am arguing is that it is not known that it is not important, so it is worth publishing. Or rather that its kind is worth publishing.

I believe every sound result is worth publishing, precisely because we cannot know what is and is not important. At OSTI we publish everything DOE funds, no questions asked. On the other hand, NSF publishes almost nothing it funds, leaving it up to the journals, which I regard as a colossal mistake. NSF gets thousands of detailed final research reports every year, which it refuses to publish. It is a big issue, because a 6-page journal article is no substitute for a 60-page project report.

That’s not what I’m arguing at all. Collecting data is a vital part of the process. But it’s a means to an end, not an end unto itself. You’ll note that science papers end with a “Conclusions” or “Discussion” section, not the “Results” section.

I never suggested that genomes are not worth publishing; in fact, I suggested just the opposite (“I’m not against the idea of making data available”). But the fact of the matter is that those genomes are just strings of letters until they’ve been analyzed. Raw data has value, but it’s of less interest to the reader and less value to a field than data understood and explained. Each journal’s editor must make value decisions about the level of what they choose to publish.

For example, if Darwin had merely published a list of beak lengths among finches he encountered in the Galapagos Islands, rather than “Origin of Species,” would he be as revered a figure as he is today? Rosalind Franklin took the X-ray photograph that enabled Watson and Crick to make the intellectual leap explaining the structure of DNA (a leap she failed to make herself). If you were the editor of Nature in the early 1950s, would you have published her photograph, by itself, with no accompanying explanation of its significance or meaning?

If the research is any good, it will be reproducible.

The weak point in ALL science is the interpretation. Setting up and running (and reproducing) any experiment is the easy part in that human interaction is limited. Taking data and drawing any conclusion is the weak link because it involves human intellect (or lack thereof!).

That’s what we have here: a pile of data and people drawing a conclusion from it. No one is arguing with the data, only the conclusion.

And all conclusions, even if they are correct, are not equal. Empiricism is the weakest, while a full-blown, qualitative theory is the ideal, but that usually takes a while to achieve.

On the contrary, as David C. points out, interpretation is one of the most important aspects of science. Explanation is the goal, for without it science is pointless. But data is also essential. It is the combination that makes the glory.

It’s what fascinates me about the “open science” movement that calls for scientists to publicly post their data as they collect it. You could become the greatest scientist of your generation without doing a single experiment, just sitting around with an internet connection poaching other people’s data and figuring it out before they are able to. You’d be fair and cite them as having collected the data, but you’d get credit for the discovery and they’d be a footnote.

Think of it, no grants, no overhead other than a computer and an internet connection.

Which is why, if I were still at the bench, there’s no way I’d put my data out for public consumption until I was sure I’d thoroughly understood it and exploited it.

On the contrary, we agree. I don’t see how you can interpret my ideas any other way, but ironically, that supports my point, doesn’t it?

Kent, you’ve explained why you wonder about the value of empirical data with no theory to provide a framework for interpreting them. How do you feel about the reverse, i.e., a theory with no empirical data to support it, as I understand is the case with superstring theory?

Wasn’t that also true of Einstein’s Special and General Theories of Relativity?

It seems to me that if a scientist has something of value, be it a theory without data, or data without a theory, then science is, in general, benefited by publication.
