Reprinted from http://www.realclimate.org/index.php/archives/2005/01/peer-review-a-necessary-but-not-sufficient-condition/
Peer Review: A Necessary But Not Sufficient Condition
by Michael Mann and Gavin Schmidt
On this site we emphasize conclusions that are supported by “peer-reviewed” climate research. That is, research that has been published by one or more scientists in a scholarly scientific journal after review for accuracy and validity by one or more experts in the same field (‘peers’). What is so important about “Peer Review”? As Chris Mooney has lucidly put it:
[Peer Review] is an undisputed cornerstone of modern science. Central to the competitive clash of ideas that moves knowledge forward, peer review enjoys so much renown in the scientific community that studies lacking its imprimatur meet with automatic skepticism. Academic reputations hinge on an ability to get work through peer review and into leading journals; university presses employ peer review to decide which books they’re willing to publish; and federal agencies like the National Institutes of Health use peer review to weigh the merits of applications for federal research grants.
Put simply, peer review is supposed to weed out poor science. However, it is not foolproof — a deeply flawed paper can end up being published under a number of different potential circumstances: (i) the work is submitted to a journal outside the relevant field (e.g. a paper on paleoclimate submitted to a social science journal) where the reviewers are likely to be chosen from a pool of individuals lacking the expertise to properly review the paper, (ii) too few or too unqualified a set of reviewers are chosen by the editor, (iii) the reviewers or editor (or both) have agendas, and overlook flaws that invalidate the paper’s conclusions, and (iv) the journal may process and publish so many papers that individual manuscripts occasionally do not get the editorial attention they deserve.
Thus, while un-peer-reviewed claims should not be given much credence, just because a particular paper has passed through peer review does not absolutely ensure that the conclusions are correct or scientifically valid. The “leaks” in the system outlined above unfortunately allow some less-than-ideal work to be published in peer-reviewed journals. This should therefore be a concern when the results of any one particular study are promoted over the conclusions of a larger body of past published work (especially if it is a new study that has not been fully absorbed or assessed by the community). Indeed, this is why scientific assessments such as the Arctic Climate Impact Assessment (ACIA), or the Intergovernmental Panel on Climate Change (IPCC) reports, and the independent reports by the National Academy of Sciences, are so important in giving a balanced overview of the state of knowledge in the scientific research community.
There have been several recent cases of putatively peer-reviewed studies in the scientific literature that produced unjustified or invalid conclusions. Curiously, many of these publications have been accompanied by heavy publicity campaigns, often declaring that this one paper completely refutes the scientific consensus. An excellent account of some of these examples is provided here by Dr. Stephen Schneider (Stanford University).
Perhaps the most publicized recent example was the publication of a study by astronomer Willie Soon of the Harvard University-affiliated Harvard-Smithsonian Center for Astrophysics and co-authors, claiming to demonstrate that 20th century global warmth was not unusual in comparison with conditions during Medieval times. Indeed, this study serves as a prime example of one of the "myths" that we have debunked elsewhere on this site. The study was summarily discredited in articles by teams of climate scientists (including several of the scientists here at RealClimate), in the American Geophysical Union (AGU) journal Eos and in Science. However, it took some time for the rebuttals to work their way through the slow process of scientific peer review. In the meantime the study was quickly seized upon by those seeking to sow doubt about the validity of the scientific consensus concerning the evidence for human-induced climate change (see news articles in the New York Times, and Wall Street Journal). The publication of the study had wider reverberations throughout the academic and scientific institutions connected with it. The association of the study with the “Harvard” name caused some notable unease among members of the Harvard University community (see here and here) and the reputation of the journal publishing the study was seriously tarnished in the process. The editor at Climate Research who handled the Soon et al paper, Dr. Chris de Freitas, has a controversial record of past editorial practices (see this 'sidebar' to an article in Scientific American by science journalist David Appell). In an unprecedented (to our knowledge) act of protest, chief editor Hans von Storch and three additional editors subsequently resigned from Climate Research in response to the fundamental documented failures of the editorial process at the journal.
Detailed accounts of these events are provided by Chris Mooney in the Skeptical Inquirer and The American Prospect, by David Appell in Scientific American, and in a news brief in Nature. The journal’s publisher himself (Otto Kinne) eventually stated that “[the conclusions drawn] cannot be concluded convincingly from the evidence provided in the paper”.
Another journal which (quite oddly) also published the Soon et al study, “Energy and Environment”, is not actually a scientific journal at all but a social science journal. The editor, Sonja Boehmer-Christiansen, in defending the publication of the Soon et al study, was quoted by science journalist Richard Monastersky in the Chronicle of Higher Education somewhat remarkably confessing “I’m following my political agenda — a bit, anyway. But isn’t that the right of the editor?”
Shaviv and Veizer (2003) published a paper in the journal GSA Today, where the authors claimed to establish a correlation between cosmic ray flux (CRF) and temperature evolution over hundreds of millions of years, concluding that climate sensitivity to carbon dioxide was much smaller than currently accepted. The paper was accompanied by a press release entitled "Global Warming not a Man-made Phenomenon", in which Shaviv was quoted as stating, “The operative significance of our research is that a significant reduction of the release of greenhouse gases will not significantly lower the global temperature, since only about a third of the warming over the past century should be attributed to man”. However, in the paper the authors actually stated that “our conclusion about the dominance of the CRF over climate variability is valid only on multimillion-year time scales”. Unsurprisingly, there was a public relations offensive using the seriously flawed conclusions expressed in the press release to once again try to cast doubt on the scientific consensus that humans are influencing climate. These claims were subsequently disputed in an article in Eos (Rahmstorf et al, 2004) by an international team of scientists and geologists (including some of us here at RealClimate), who suggested that Shaviv and Veizer’s analyses were based on unreliable and poorly replicated estimates, selective adjustments of the data (shifting the data, in one case by 40 million years) and drew untenable conclusions, particularly with regard to the influence of anthropogenic greenhouse gas concentrations on recent warming (see for example the exchange between the two sets of authors). However, by the time this came out the misleading conclusions had already been publicized widely.
Next, we discuss the first of three so-called “bombshell” papers that supposedly "knock the stuffing out of" the findings of the IPCC. Patrick Michaels and associates billed his own paper (McKitrick and Michaels, 2004), co-authored with Ross McKitrick, this way:
After four years of one of the most rigorous peer reviews ever, Canadian Ross McKitrick and another of us (Michaels) published a paper searching for “economic” signals in the temperature record. … The research showed that somewhere around one-half of the warming in the U.N. surface record was explained by economic factors, which can be changes in land use, quality of instrumentation, or upkeep of records.
It strikes us as odd, to say the least, that, after one of the “most rigorous peer reviews ever”, nobody involved (neither editor, nor reviewers, nor authors) seems to have caught the egregious basic error that the authors mistakenly used degrees rather than the required radians in calculating the cosine functions used to spatially weight their estimates**. This mistake rendered every calculation in the paper incorrect, and the conclusions invalid. To our knowledge, however, the paper has not yet been retracted. Remarkably, there were still other independent and equally fundamental errors in the paper that would have rendered it entirely invalid anyway. To the journal’s credit, they published a criticism of the paper by Benestad (2004) to this effect. It may come as no surprise that McKitrick and Michaels (2004) was published in Climate Research and was handled by none other than Chris de Freitas.
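To illustrate why a degrees/radians mix-up is so damaging, here is a minimal sketch in Python. It is not the authors' code, which is not reproduced here; the function names and sample latitudes are hypothetical, and we simply assume a weight proportional to the cosine of latitude.

```python
import math

def area_weight(lat_deg):
    """Cosine-of-latitude weight, with the latitude converted to radians first."""
    return math.cos(math.radians(lat_deg))

def mistaken_weight(lat_deg):
    """Same formula, but the latitude in degrees is passed straight to cos().

    math.cos() always interprets its argument as radians, so this is the
    kind of error described above (hypothetical reconstruction).
    """
    return math.cos(lat_deg)

# Hypothetical sample latitudes, in degrees north.
for lat in (0, 30, 45, 60, 85):
    print(f"lat={lat:>2}  correct={area_weight(lat):+.3f}  mistaken={mistaken_weight(lat):+.3f}")
```

The correct weights fall smoothly from 1 at the equator toward 0 at the poles. The mistaken weights oscillate between -1 and 1 with no physical meaning; at 60 degrees the weight should be 0.5 but comes out negative.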
The other two “bombshell” papers were published in the AGU journal Geophysical Research Letters (GRL) which publishes over 1500 papers per year. It can be conservatively estimated that they publish no more than 70% of the papers received, and thus probably process over 2000 papers per year. That gives each of the typically 8 or so editors of the journal almost a paper per day to evaluate. While GRL publishes many excellent papers and provides an important forum to the research community for rapid publication of important results, occasionally, poor papers slip through the net. These two papers were authored by Douglass and collaborators (Douglass et al, 2004a; 2004b), the first with Fred Singer as a co-author and the second with both Singer and Michaels. Both papers*** argue that recent atmospheric temperatures have been cooling, rather than warming, based on the analysis of data over a selective (1979-1996) time interval that eliminates periods of significant warming both before and after, and using a controversial satellite-derived temperature record whose robustness has been called into question by other teams analysing the data. An excellent discussion of both papers is provided by Tim Lambert.
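The editorial workload estimate above can be checked with a few lines of Python. The published count, acceptance rate and number of editors are the rough figures quoted in the text; the count of 250 working days per year is our own assumption, added only to compare calendar days with working days.

```python
papers_published = 1500      # "over 1500 papers per year" (figure from the text)
acceptance_rate = 0.70       # "no more than 70%" (figure from the text)
editors = 8                  # "typically 8 or so editors" (figure from the text)
working_days = 250           # assumption: not stated in the text

# Papers received = papers published / fraction accepted.
papers_received = papers_published / acceptance_rate
per_editor_per_year = papers_received / editors

print(f"received per year:        {papers_received:.0f}")
print(f"per editor per year:      {per_editor_per_year:.0f}")
print(f"per editor, calendar day: {per_editor_per_year / 365:.2f}")
print(f"per editor, working day:  {per_editor_per_year / working_days:.2f}")
```

With these inputs the journal handles roughly 2,140 submissions a year, or about 270 per editor, which is a little under one per calendar day and about one per working day.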
Another relevant GRL paper was the article by Legates and Davis (1997) which criticized the use of “centered correlations” common to numerous "Detection and Attribution" studies supporting the detection of human influence on recent climate change. They argued that correlations could increase while observed and simulated global means diverge. However, as pointed out in the chapter on Detection and Attribution in IPCC (2001)*, centered correlations were introduced for precisely this reason: to provide an indicator that was statistically independent of global mean temperature changes. As noted by the IPCC, “if both global mean changes and centered pattern correlations point towards the same explanation of observed temperature changes, it provides more compelling evidence than either of these indicators in isolation”. Again, a basic logical flaw in the authors’ criticism of past work was not caught in peer review.
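The statistical independence mentioned above is easy to demonstrate. The sketch below uses made-up numbers and, as a simplifying assumption, an unweighted mean over four grid points (real studies use area weighting and far larger fields). It computes a centered pattern correlation, in which the spatial mean is removed from each field before correlating, and shows that it is unchanged when the simulated field is shifted by a uniform offset.

```python
import math

def centered_correlation(obs, sim):
    """Pattern correlation after removing each field's own spatial mean."""
    n = len(obs)
    obs_mean = sum(obs) / n
    sim_mean = sum(sim) / n
    obs_anom = [x - obs_mean for x in obs]
    sim_anom = [x - sim_mean for x in sim]
    covariance = sum(a * b for a, b in zip(obs_anom, sim_anom))
    norm = math.sqrt(sum(a * a for a in obs_anom) * sum(b * b for b in sim_anom))
    return covariance / norm

# Hypothetical temperature changes at four grid points (illustrative values only).
observed = [0.2, 0.5, 0.1, 0.8]
simulated = [0.1, 0.6, 0.3, 0.7]

# Shift the simulation uniformly so that its global mean diverges from the observations.
offset = 2.0
simulated_shifted = [x + offset for x in simulated]

r_original = centered_correlation(observed, simulated)
r_shifted = centered_correlation(observed, simulated_shifted)

print(f"centered correlation, original: {r_original:.4f}")
print(f"centered correlation, shifted:  {r_shifted:.4f}")
```

The two correlations agree to within rounding even though the global means now differ by two degrees. The centered correlation therefore measures agreement in spatial pattern only, which is why it is informative to report it alongside the global mean change rather than in place of it.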
Next, we consider the paper by Soon et al (2004) published in GRL which criticized the way temperature data series had been smoothed in the IPCC report and elsewhere. True to form, contrarians immediately sold the results as ‘invalidating’ the conclusions of the IPCC, with the lead author Willie Soon himself writing an opinion piece to this effect. Once again, a few short months later, a followup article was published by one of us (Mann, 2004) that invalidated the Soon et al (2004) conclusions, demonstrating (with links to supporting Matlab source code and data) how (A) the authors had, in an undisclosed manner, inappropriately compared trends calculated over differing time intervals and (B) had not used standard, objective statistical criteria to determine how data series should be treated near the beginning and end of the data. It is unfortunate that a followup paper even had to be published, as the flaws in the original study were so severe as to have rendered the study of essentially no scientific value.
There are other examples of studies that have even been published in high quality venues that were heavily publicized at the time, but in retrospect were flawed (though not as egregiously as the examples above). For instance, Fan et al (1998), on the size of the carbon sink in the continental US, rebutted by Schimel et al. (2000). Or the solar-cycle length/climate correlation described by Friis-Christensen and Lassen (1991) whose seemingly impressive correlation for the latter half of the 20th Century disappears if you don’t change the averaging scheme halfway along (Laut, 2003; Damon and Laut, 2004).
The current thinking of scientists on climate change is based on thousands of studies (Google Scholar gives 19,000 scientific articles for the full search phrase “global climate change”). Any new study will be one small grain of evidence that adds to this big pile, and it will shift the thinking of scientists slightly. Science proceeds like this in a slow, incremental way. It is extremely unlikely that any new study will immediately overthrow all the past knowledge. So even if the conclusions of the Shaviv and Veizer (2003) study discussed earlier, for instance, had been correct, this would be one small piece of evidence pitted against hundreds of others which contradict it. Scientists would find the apparent contradiction interesting and worthy of further investigation, and would devote further study to isolating the source of the contradiction. They would not suddenly throw out all previous results. Yet, one often gets the impression that scientific progress consists of a series of revolutions where scientists discard all their past thinking each time a new result gets published. This is often because only a small handful of high-profile studies in a given field are known by the wider public and media, and thus unrealistic weight is attached to those studies. New results are often over-emphasised (sometimes by the authors, sometimes by lobby groups) to make them sound important enough to have news value. Thus “bombshells” usually end up being duds.
However, as demonstrated above, even when it initially breaks down, the process of peer-review does usually work in the end. But sometimes it can take a while. Observers would thus be well advised to be extremely skeptical of any claims in the media or elsewhere of some new “bombshell” or “revolution” that has not yet been fully vetted by the scientific community.
*Note added 1/21/05: It has come to our attention that Legates and Davis (1997) were similarly rebutted in a separate publication by Wigley et al (2000).
**Note added 1/21/05: McKitrick and Michaels have published an erratum correcting the degrees/radians error in CR 27, 265-268 which now shows that latitude correlates much better with temperature trends than any economic statistic.
***Note added 1/25/05: Chip Knappenberger correctly points out that the second Douglass et al paper doesn’t actually make the claim that the atmosphere is cooling. We therefore withdraw that specific comment, but note that the comment concerning the selective use of data series and time periods stands.