Over the past few months, a major scandal has been unfolding in economics. It involves a paper by two Harvard economists who claimed that when the debt load of a country reaches 90% of that country’s gross domestic product (GDP), growth slows significantly.
The paper became a cudgel in the hands of political forces urging governments to adopt austerity measures in response to the global financial meltdown of 2008, leading to major cuts to outlays across Europe, where the ideas found their most centralized and immediate application, and to the budgetary stinginess of the US Republican party, led by their champion, Paul Ryan.
The problem is that the paper is very likely wrong, its conclusions based on selective data and mathematical errors. In fact, it may be exactly wrong — that is, when the math is corrected and all the data included, debt load doesn’t seem to slow economic growth. In fact, it may actually help economies grow while reducing debt.
Carmen Reinhart and Kenneth Rogoff (who is also a chess grandmaster) published their controversial paper in 2010 in the American Economic Review. It was not peer-reviewed.
While the paper had been used in numerous venues to drive austerity policy, it wasn’t until April 2013 that Thomas Herndon, a graduate student at the University of Massachusetts Amherst, released a paper refuting the findings of Reinhart and Rogoff, stating that:
. . . coding errors, selective exclusion of available data, and unconventional weighting of summary statistics lead to serious errors that inaccurately represent the relationship between public debt and GDP growth among 20 advanced economies in the post-war period.
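To make the weighting and exclusion points in that passage concrete, here is a minimal Python sketch. The countries, growth figures, and function names are all invented for illustration and come from neither paper. It only shows how the same set of country-years can produce very different averages depending on how the data is weighted and which rows are kept.

```python
from statistics import mean

# Hypothetical growth rates (percent) for country-years in a high-debt bucket.
# These numbers are invented for illustration; they are not from either paper.
growth_by_country = {
    "A": [-6.0],                # one high-debt year
    "B": [2.0, 3.0, 2.5, 2.5],  # four high-debt years
}


def country_weighted_mean(data):
    """Average each country first, then average the country means equally."""
    return mean(mean(years) for years in data.values())


def country_year_weighted_mean(data):
    """Pool every country-year and give each one equal weight."""
    return mean(value for years in data.values() for value in years)


def mean_excluding(data, excluded):
    """Pooled mean after dropping some countries (selective exclusion)."""
    kept = {name: years for name, years in data.items() if name not in excluded}
    return country_year_weighted_mean(kept)


if __name__ == "__main__":
    print("country-weighted:     ", country_weighted_mean(growth_by_country))
    print("country-year-weighted:", country_year_weighted_mean(growth_by_country))
    print("excluding country A:  ", mean_excluding(growth_by_country, {"A"}))
```

In this toy data, a single bad year in country A counts as much as four ordinary years in country B under country weighting, so the average turns negative, while pooling the country-years keeps it positive. Neither choice is automatically wrong, but the choice needs to be stated and defended.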
There are so many interesting angles to this story it’s hard to know where to begin. So, let’s try listing them.
- The variability and role of peer-review. While it has been rightly pointed out that peer-review may not have caught mathematical errors or selective use of data, it may have caught the fact that all four references in the paper are to prior works from Reinhart and Rogoff. But there is a more subtle problem, which comes down to the fundamental branding of a journal as a “journal.” That is, a “journal” is widely interpreted as containing peer-reviewed content. American Economic Review is a journal (“a general-interest economics journal”). However, the paper in question was published in a section called “Papers and Proceedings,” which is not peer-reviewed. It is misleading to have variable levels of peer-review within a single journal, not to mention non-peer-reviewed content in an information form predicated on peer-review. We are now in a publishing environment where how a paper has been peer-reviewed is becoming increasingly important, as variable and experimental forms of peer-review are emerging all around us. We can no longer assume that publication in a journal equates to peer-review, or when peer-review occurs that the term means what we think it means. This is a problem.
- The post-publication review process. As David Crotty stated in a wise comment yesterday, the stochastic process of post-publication review is no substitute for structured pre-publication peer-review. This is a case in point. Not only did the post-publication peer-review consist mainly of mindlessly accepting the paper and trying to score political points with it, but the problems were not revealed for years — years during which unemployment in Europe skyrocketed, US public policy became tied in knots over austerity politics, and fiscal policy in many other nations remained frozen in fear of debt.
- The power of incentives. Scientists are busy. Social scientists are busy. If there is no incentive to do work, it’s unlikely the work will get done. In the case of this paper, post-publication peer-review only occurred when it was assigned to a graduate student. That is, it was assigned review. There was an incentive to get it done. There was no assigned pre-publication review, and organic post-publication review raised none of the problems later identified. It was assigned, incentivized post-publication review that caught the problems. If we continue to believe that post-publication review will occur without incentives, that’s our mistake.
- The damage one paper can do. The entire vaccines and autism debate was sparked by a single fraudulent paper in the UK. We had a decade of fear-mongering and social upheaval (and, very likely, a number of unnecessary deaths from preventable diseases like measles and mumps when parents refused to get their children vaccinated). Now we have high unemployment and increasing debt in Europe because a paper seemed to indicate this was the direction to move. We have long-term unemployment and crumbling infrastructure in the US for the same reason. As we publish more papers, maybe the “single paper” event will diminish. Or, perhaps there will be more entry points for flawed papers that might misdirect us and cause significant damage. It’s hard to know, but the risk is real either way. One bad paper can do a lot of damage, yet we’re emphasizing convenience for authors, reduced levels of peer-review, and rapid publication.
- The lack of skepticism and accountability in an era of polemics. There is a core and perhaps false dichotomy of opinion in economics, between Keynesian intervention and growth vs. Hayek’s free market and austerity approaches. Readers of their actual writings believe the two disagreed less often than is commonly portrayed, but in our age of polemics, every proponent needs an opponent, so Keynes and Hayek have become mutual foils, deservedly or not. This means there are now two ideologies in a battle, a state of affairs that is all too common. The paper by Reinhart and Rogoff tapped into the austerity proponents’ point of view, and was embraced. However, the polemical attitude didn’t allow for a response probing the facts, only affording a more emotional response. This happens far too often. Once a position is staked out, it is defended to the point of rhetorical exhaustion. Apparently, this also makes it less likely for the facts to be the focus of intellectual activity.
It is this last problem that seems to be the most important and pernicious. When criticized for their mathematical and data-inclusion errors, Reinhart and Rogoff have repeatedly responded emotionally, the first time on Reinhart’s blog, as if the only issue is that they are being attacked:
. . . it has been with deep disappointment that we have experienced your spectacularly uncivil behavior the past few weeks. You have attacked us in very personal terms, virtually non-stop, in your New York Times column and blog posts. Now you have doubled down in the New York Review of Books, adding the accusation we didn’t share our data. Your characterization of our work and of our policy impact is selective and shallow. It is deeply misleading about where we stand on the issues. And we would respectfully submit, your logic and evidence on the policy substance is not nearly as compelling as you imply.
This is standard defensive behavior, but it seems to me that it’s unacceptable for scientists. If you’re right, you’re right; if you’re wrong, you’re wrong. Making math errors and cherry-picking data should be impossible for a scientist to defend.
I’ve seen this sort of socially pervasive attitude many times recently, in situations when a person’s arguments or facts (or both) are pointed out to be simply and purely wrong, yet they persist — not by defending their arguments or facts, but by claiming to now be attacked, misportrayed, or misunderstood.
We see this behavior far too often in the age of polemicism and ideology, which has seeped into the sciences in many ways — the attempt to fight off facts with emotion and fend off critics with cries of form over statements of substance. It brings to mind what Daniel Patrick Moynihan famously said:*
Everyone is entitled to his own opinion, but not to his own facts.
How we establish facts, validate them, communicate them, and interpret them remain fundamental challenges. If we succumb to pride, hubris, and vanity rather than to the humility of scientific proofs and processes, we will make troubling and multi-year mistakes again and again.
* Hat tip to JI for reminding me of this appropriate quotation.
17 Thoughts on "Austerity Research — When Ideology and Polemicism Overwhelm Facts and Logic"
Predating Daniel Moynihan, CP Scott of the (then Manchester) Guardian said it more pithily:
“Comment is free, facts are sacred” (1921).
The origin of the Guardian’s current ‘Comment is Free…’ blog title.
And you can see the polemical approach being propagated in the comments section there (and in other newspaper comment sites). It’s not evidence-based opinion but opinion-based evidence that so often takes centre stage.
The problem, Martin, is that the weight of evidence is a personal matter. This is reasoning in action. Some people find disagreement distasteful but it is central in science, as in life. Reasonable people of good will looking at the same evidence can come to contradictory conclusions. Some people cannot accept this fact, so they accuse their opponents of irrationality, as you seem to be doing. I call that the Lockean fallacy.
The weight of evidence might be a personal matter. But of course if the weight of evidence points the other way from the opinion, and it is just ignored, or the providers of the contra-opinion evidence are attacked rather than the evidence challenged, then we don’t get very far.
Good reasoning might also occur when both parties use the same weight of evidence but come to differing views. They might then agree on what further information (or experiments in science) is needed to verify one or the other (or a yet unseen) viewpoint.
I’m not sure why you say I’m accusing anyone of irrationality. Of not using the fully available evidence to back an opinion yes. But I didn’t say that was irrational.
Can we call that the Wojickean fallacy? 🙂
Your tone strikes me as polemical. But in any case as one who studies the logic of complex issues I tend to disagree with your point #5, which you say is most important. When a paper achieves iconic status in a public debate it typically gets very close scrutiny, just as this one did. This happens within peer review as well as without it, but it takes time for such status to be achieved. If you are suggesting that every paper should be so scrutinized prior to publication that is impossible. Peer review is not an arbiter of truth. (That it is is a common confusion, but perhaps that is not your point.)
Nor is the distinction between opinions and facts that simple. All we have are our beliefs and the tests thereof. If the authors do not accept the purported refutation then that refutation remains controversial. This situation is quite common and not new, as science is an ongoing struggle not a piling up of facts. You have taken a side but your argument that your opponent is dishonest, if that is your argument, is not a good one.
In particular I am troubled by the emerging popularity of the term “cherry picking” as that term is scientifically meaningless. It is thus highly polemical. No matter what data one uses there is always more. Science is often about finding patterns in selected data. That is a strength not a flaw. Pointing out that other data gives other results is important but it is not a condemnation. At the frontier science is a debate.
I’m intolerant of intolerance, and polemical about polemics.
I realize how one path to scrutiny is public acceptance first, scrutiny second. My concern is the length of time it took, the fact that other economists were apparently accepting of the premise and paper, and that vast swaths of public policy — entire national systems — fell prey to a single paper’s premise. Also, how a paper gets into a journal without peer-review and only drinking its authors’ own bathwater is questionable. That’s leveraging the power of “journal” to fool people into believing the paper is more than it deserves to be.
Opinions can be useful when they are based more closely on facts than otherwise. If authors don’t accept a refutation (not a disputation, mind you, but a factual refutation), then those authors are just being stubborn. That’s wrong in science. Imagine the lack of controversy if the authors had simply said, “Wow, you’re right. We forgot to include entire economies in our model, made formula errors we failed to catch, and our conclusions would have been very different had we not made these errors. We apologize.” Apologies have a place in science, as in all human affairs. We make mistakes. This paper contained mistakes, and some doozies.
“Cherry-picking” is an interesting and useful term. While you have a point about selecting data to pay attention to vs. data to ignore, the connotation of cherry-picking is well-understood and is not merely selectivity in the pure scientific sense, but suppression of contradictory data that would challenge a foregone conclusion. I have no problem with how it is being used. It is a logical term related to confirmation bias. It’s just common parlance now for those ideas. Synonyms don’t bother me.
I doubt the widespread existence of confirmation bias, which is itself a controversial concept in psychology. Like cherry picking the term is commonly used to denigrate one’s opponents.
I also doubt that the national budget cutting is due to this paper.
I agree on the first point. I don’t believe confirmation bias is widespread, but it is widely understood, and when it seems to occur, “cherry-picking” is nice vernacular for it.
This paper wasn’t the only factor leading to austerity measures, but it provided a backboard for a lot of decisions over the past few years. This paper has been a centerpiece of Paul Ryan’s budget approach, which has been the fulcrum of the austerity politics of the Republican party.
There is one other issue that is worth mentioning. This journal publishes the data with each article. I expect that was the key that allowed Thomas Herndon to expose the glaring errors in the paper.
Actually, they did not publish the data with this article. Herndon had to request it from them. Perhaps that’s another side-effect of no peer-review?
And/or perhaps why the journal should publish the data…
Agree completely. I was going to go into that, but skipped it for the sake of brevity. The main issues seem to be around skepticism and incentives in any event. Without those two, we could have the “tree falls in the woods” phenomenon around published data. If bad data exists, and nobody examines it, does it still have flaws?
Another advantage of traditional peer review is that it takes place pretty much in private, where mistakes can be corrected before publication with a minimum of embarrassment and an editor is there to oversee any debate. This takes some of the edge off a confrontation. I wonder if Reinhart and Rogoff would have responded as they did had Thomas Herndon been the assigned reviewer.
This is a great point. Most of us learn at a relatively young age that private and constructive criticisms delivered with sincerity work far better than public reprobations. People get defensive, and their pride is wounded more easily, if they are publicly shamed. Perhaps the era of public comment is feeding the age of polemicism and pride for exactly that reason.
I wonder if we’d have less of this if people could file private comments to writers of blog posts, etc.? God, that’s a really interesting idea. Hey, WordPress, get those developers on this!
Outstanding post and comments, particularly on the eve of the Society for Scholarly Publishing Annual Meeting. . . . 😉
Another piece very close to my heart. I’m finishing up my Masters of Publishing and my thesis deals with peer review, how lack of review can have long reaching negative impact, and how not all peer review is “created equal.” This is a perfect example of why this is important.
Nice post Kent. I have to admit I was tripped up by the variable application of peer review that you have correctly identified.
By some strange coincidence I also blogged on this topic today. However I focused on another angle: that the peer review process needs to encompass not just data but also the software algorithms that are being used to process the data.
My post is at http://www.semantico.com/2013/06/should-peer-review-include-software/ should you wish to read it.