Friday, May 30, 2014

Review of Chapter 3 (part 1), The Hockey Stick Illusion by A. W. Montford


In an attempt to see both sides of the debate over human-caused climate change, I am starting to read this book: The Hockey Stick Illusion: Climategate and the Corruption of Science (Independent Minds) by Andrew Montford.

As I read it, I will give a chapter-by-chapter review comprising quotes and commentary. So, if this is an area of interest, keep watch.

All material quoted from the book is in italics. My comments are in plain text in brackets.


There had been a great deal of excitement on the forum in recent months. A new study by two Harvard astrophysicists, Willie Soon and Sallie Baliunas, had just been published and its appearance had caused a huge furore in the world of paleoclimate [Soon W, Baliunas S. Proxy climatic and environmental changes of the past 1000 years. Climate Research. 2003; 23: 89–110]. Soon and Baliunas had reviewed a large dataset of paleoclimate proxies to see how many showed the Medieval Warm Period, the Little Ice Age and the modern warming. They had concluded that the Medieval Warm Period was in fact a real, significant feature of climate history. The paper had been extremely controversial, contradicting the mainstream consensus that the Medieval Warm Period was probably only a regional phenomenon. Climatologists from around the world had fallen over themselves to attack the Soon and Baliunas paper, mainly on the grounds that many of the proxies used in the study were precipitation proxies rather than temperature proxies. So great was the uproar, in fact, that several scientists resigned from the editorial board of Climate Research, the journal which had published the paper in the first place. In the face of all this opposition, the paper had gained little traction in terms of changing mainstream scientific opinion on the existence of the Medieval Warm Period. It had been a huge disappointment for the sceptic community.

[There are reasons the Soon-Baliunas paper was controversial, but they have nothing to do with the Medieval Warm Period (MWP) itself, which all climate researchers accept as having occurred; rather, the controversy stemmed from the poor quality of the analysis and the claim that the MWP saw higher mean temperatures than the recent warming trend. Mann characterizes the problems well in The Hockey Stick and the Climate Wars: Dispatches from the Front Lines:

“The Soon and Baliunas study claimed to contradict previous work—including our own—that suggested that the average warmth of the Northern Hemisphere in recent decades was unprecedented over a time frame of at least the past millennium. It claimed to do so not by performing any quantitative analysis itself, but through what the authors referred to as a “meta-analysis”—that is to say, a review and characterization of other past published work.

“A fundamental problem with the paper was that its authors’ definition of a climatic event was so loose as to be meaningless. As Richard Monastersky summarized it in his article, “under their method, warmth in China in A.D. 850, drought in Africa in A.D. 1000, and wet conditions in England in A.D. 1200 all would qualify as part of the Medieval Warm Period, even though they happened centuries apart.” In other words, their characterization didn’t take into account whether climate trends in different regions were synchronous. The authors therefore hadn’t accounted for likely offsetting fluctuations—the typical sort of seesaw patterns one often encounters with the climate, where certain regions warm while others cool.

“An additional problem with the study is readily evident from Monastersky’s characterization above. Rather than assessing whether there was overall evidence for widespread warmth, the authors were asking a completely different, practically tautological question: Was there evidence that a given region was either unusually warm, or wet, or dry? The addition of these two latter criteria undermined the credibility of the authors’ claim of assessing the relative unusualness of warmth during the medieval period. These two criteria—were there regions that were either wet or dry—could just as easily be satisfied during a global cool period!

“A third problem is that the authors used an inadequate definition of modern conditions. It is only for the past couple of decades that the hockey stick and other reconstructions showed warmth to be clearly anomalous. Many of the records included in the Soon and Baliunas meta-analysis either end in the mid-twentieth century or had such poor temporal resolution that they could not capture the trends over the key interval of the past few decades, and hence cannot, at least nominally, be used to compare past and present.

“There was yet a fourth serious problem with the Soon and Baliunas study. The authors in many cases had mischaracterized or misrepresented the past studies they claimed to be assessing in their meta-analysis, according to Monastersky. Paleoclimatologist Peter de Menocal of Columbia University/LDEO, for example, who had developed a proxy record of ocean surface temperature from sediments off the coast of Africa, indicated that “Mr. Soon and his colleagues could not justify their conclusions that the African record showed the 20th century as being unexceptional,” and told Monastersky, “My record has no business being used to address that question.” To cite another instance, David Black of the University of Akron, a paleoclimatologist who had developed a proxy record of wind strength from sediments off the coast of Venezuela, indicated that “Mr. Soon’s group did not use his data properly”; he told Monastersky pointedly: “I think they stretched the data to fit what they wanted to see.””

and from the same source:

“John Holdren, the Heinz professor of environmental policy who went on to become president of the American Association for the Advancement of Science (AAAS) and presidential science adviser in the Obama administration, voiced the opinion that “The critics are right. It’s unfortunate that so much attention is paid to a flawed analysis, but that’s what happens when something happens to support the political climate in Washington.””

Another critique of the controversy can be found on Wikipedia: http://en.wikipedia.org/wiki/Soon_and_Baliunas_controversy]
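Before moving on, Mann's synchronicity point quoted above can be made concrete with a toy calculation of my own (entirely hypothetical numbers in Python, not anything from a published analysis). Three regions each get a 50-year warm spell starting centuries apart; each spell looks anomalous locally, but the hemispheric average never comes close to what synchronous warmth would produce:

```python
import numpy as np

years = np.arange(800, 1300)

# Three hypothetical regions, each with a +1.0 C warm spell lasting 50
# years but starting at a different time (dates loosely echo Monastersky's
# China/Africa/England example; the values are invented).
starts = {"China": 850, "Africa": 1000, "England": 1200}
regional = []
for start in starts.values():
    series = np.zeros(years.size)
    series[(years >= start) & (years < start + 50)] = 1.0
    regional.append(series)

mean_anomaly = np.mean(regional, axis=0)
print("Peak regional anomaly:    1.00 C")
print(f"Peak hemispheric average: {mean_anomaly.max():.2f} C")
# Asynchronous spells dilute to about 0.33 C; synchronous spells would
# push the hemispheric average to the full 1.00 C.
```

That dilution is precisely why a meta-analysis that ignores timing cannot say anything about hemispheric-scale medieval warmth.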

McIntyre was one of the mainstays of the Climate Skeptics site, posting comments on a wide array of subjects. In recent weeks he’d spent a great deal of time discussing radiative physics, trying to understand how the IPCC came up with an expected temperature rise of 2.5°C every time atmospheric carbon dioxide doubled. He’d not really got anywhere with it so far (and in fact it remains a mystery to this day), but he was far from giving up hope. If there was an explanation to be had, he fully expected to find it.

[Is it possible that the retired geologist McIntyre and Montford simply don’t accept the explanations commonly found in the literature? For example, consider this, from Global Temperature Change (Hansen et al., 2006):

“In assessing the level of global warming that constitutes DAI [dangerous anthropogenic interference], we must bear in mind that estimated climate sensitivity of 3 ± 1°C for doubled CO2, based mainly on paleoclimate data but consistent with models, refers to a case in which sea ice, snow, water vapor, and clouds are included as feedbacks, but ice sheet area, vegetation cover, and non-H2O GHGs are treated as forcings or fixed boundary conditions. On long time scales, and as the present global warming increases, these latter quantities can change and thus they need to be included as feedbacks. Indeed, climate becomes very sensitive on the ice-age time scale, as feedbacks, specifically ice sheet area and GHGs, account for practically the entire global temperature change.”

The above quote is by no means the only place where such explanations are to be found. Also notice that it is not accurate to state a specific value, such as 2.5°C, without also giving the degree of uncertainty. According to Hansen et al. (2006) the range is 2–4°C, and if you read the paper in more detail the authors also acknowledge that, although it is much less likely, the sensitivity could fall outside this range. Part of how Montford mischaracterizes the debate is to always lowball the numbers, seeming unwilling to see much, if any, change in temperature due to increasing CO2. If he would at least engage in an honest debate it would be better.]
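To see why quoting a bare 2.5°C is misleading, it helps to separate the radiative forcing from the sensitivity. Here is a back-of-the-envelope sketch of my own (not from either book), using the standard simplified logarithmic CO2 forcing expression of Myhre et al. (1998); the 280 ppm preindustrial baseline is an assumption for illustration:

```python
import math

# Simplified CO2 radiative forcing (Myhre et al., 1998): dF = 5.35 ln(C/C0) W/m^2
def co2_forcing(c_ppm, c0_ppm=280.0):
    return 5.35 * math.log(c_ppm / c0_ppm)

# Equilibrium warming, scaling a per-doubling sensitivity S by the forcing.
def warming(c_ppm, s_per_doubling, c0_ppm=280.0):
    f2x = co2_forcing(2 * c0_ppm, c0_ppm)  # forcing of one doubling, ~3.7 W/m^2
    return s_per_doubling * co2_forcing(c_ppm, c0_ppm) / f2x

# Hansen et al.'s 3 +/- 1 C range maps to a range of outcomes, not one number.
for s in (2.0, 3.0, 4.0):
    print(f"S = {s:.0f} C/doubling: "
          f"{warming(400, s):.1f} C at 400 ppm, {warming(560, s):.1f} C at 560 ppm")
```

The point is simply that the projected warming scales with the assumed sensitivity, so any honest statement carries the whole 2–4°C range with it rather than a single number.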

While McIntyre’s readings in climatology broadened, he also began discussing the IPCC’s claims of unprecedented warmth with friends and acquaintances. His contacts in the mining industry were particularly interesting on the subject. Familiar as they were with the long-term history of the Earth, many of the geologists McIntyre spoke to had strong opinions on claims that recent temperatures were unprecedented and most were highly sceptical of the idea. When it came to the Hockey Stick itself, mining people – geologists, lawyers and accountants – were openly contemptuous. Hockey sticks were a well known phenomenon in the business world, and McIntyre’s contacts had seen far too many dubious mining promotions and dotcom revenue projections to take such a thing seriously. The contrasting reactions to the Hockey Stick of politicians and business people – on the one hand doom-laden predictions of catastrophe and on the other open ridicule – acted as a spur to McIntyre, who flung himself headlong into the world of climatology.

[Two things here: First, geologists and geochemists as a group tend to be automatically skeptical of climate science, both because they are not typically trained in the disciplines needed to fully understand climate modeling and because they are typically employed by an industry that sees signs of environmental problems as a threat to their profession. This is especially true of geologists employed by the fossil fuel industry, which stands to lose a lot of revenue if politicians decide we must respond to climate change by reducing carbon emissions. Second, to question the hockey stick just because it resembles something you have seen in a boardroom presentation seems like a poor reason to question the validity of the data supporting its construction. Of course, this is only what led McIntyre to approach Mann’s work with skepticism, not his stated reason for rejecting Mann’s work later, but it does suggest an interesting, non-scientific bias.]

Within a matter of days of his announcement, McIntyre was posting findings to the Climate Skeptics forum. He had now worked through Mann’s explanation of his methodology and he had soldiered his way through the matrix algebra. It was still very strange. The use of PC analysis was new in the realm of paleoclimate and Mann had made no attempt to prove the validity of the technique in the field, instead relying on a bold assertion that it was better than the alternatives. In view of this and given the surprising results–with no Medieval Warm Period or Little Ice Age visible in the reconstruction–one might have expected that experts in the field would have questioned whether Mann’s novel procedures might have been a factor in his anomalous results. But despite a thorough search of the literature, there was no sign that anyone else had seen fit to probe the issue further. Nor had any other researchers adopted Mann’s methodology in the five years since his paper had been published. Given how often the Hockey Stick had been cited in the scientific literature, these were very surprising observations, which seemed to suggest that paleoclimatologists liked Mann’s results rather more than they liked his methodology.

[Is Montford just being dishonest, or did he not even bother to check the MBH99 paper (Mann et al., Northern hemisphere temperatures during the past millennium: Inferences, uncertainties, and limitations)? All one has to do is read that paper to realize two things: First, the original hockey stick of MBH98 estimated temperatures back only to 1400, just after the MWP, because, as Mann explains, the earlier data were inadequate; MBH99 then extended the reconstruction to AD 1000, with expanded uncertainties acknowledged for those earlier centuries. Mann even mentions the MWP, but because he suggests that the peak temperatures during that period only “approached” mean 20th century levels, Montford (and McIntyre) interpret that to mean that Mann has done away with the MWP. He has not done away with it; he is just saying that its peak temperatures were not as high as Montford would like them to be, i.e. higher than any time in the recent past or the present.

Second, Mann does include the Little Ice Age (LIA) in the hockey stick figure and even mentions it, just not by name, which Montford apparently interprets as its being absent. If you don’t agree with my two statements here, read the paper for yourself. Here is what the authors say in their conclusions:

“Although NH reconstructions prior to about AD 1400 exhibit expanded uncertainties, several important conclusions are still possible. While warmth early in the millennium approaches mean 20th century levels, the late 20th century still appears anomalous: the 1990s are likely the warmest decade, and 1998 the warmest year, in at least a millennium. More widespread high-resolution data which can resolve millennial-scale variability are needed before more confident conclusions can be reached with regard to the spatial and temporal details of climate change in the past millennium and beyond.”

To top it off, note the tentative nature of the statements in the conclusion, in contrast to Montford, who is just plain certain that Mann is wrong.

Montford also claims that no other researchers were using Mann’s methods in the five years following MBH99. Apparently Montford either did not check the literature very thoroughly, or his definition of researchers having “adopted Mann’s methodology” is so narrow that unless they did exactly what Mann did, Montford would conclude they had not adopted it. The odd thing about this is that the techniques Mann used were, for the most part, standard, accepted methods used by most climate researchers. In just a brief search of the literature from 1999–2004 I found several papers that appear to have used Mann’s general approach. Besides, there are also studies from the same period that used alternative methods and found largely the same results as Mann.]
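Since so much of the dispute turns on what “Mann’s methodology” actually involves, here is a minimal sketch of conventional principal component analysis applied to a network of proxy series. Everything below is synthetic and purely illustrative; it is not Mann’s code or data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic proxy network: 150 "years" x 13 proxy series (13 echoes the
# small size of the MBH99 network; everything here is made up).
climate_signal = np.cumsum(rng.normal(size=150))       # shared low-frequency signal
loadings = rng.uniform(0.2, 1.0, size=13)              # each proxy's sensitivity
proxies = np.outer(climate_signal, loadings)
proxies += rng.normal(scale=2.0, size=proxies.shape)   # proxy-specific noise

# Conventional PCA: center each series on its full-period mean, then take
# the singular value decomposition; PC1 summarizes the common variance.
centered = proxies - proxies.mean(axis=0)
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
pc1 = U[:, 0] * s[0]

explained = s**2 / np.sum(s**2)
corr = np.corrcoef(pc1, climate_signal)[0, 1]  # PC1 sign is arbitrary
print(f"PC1 captures {explained[0]:.0%} of the variance; "
      f"|correlation with the common signal| = {abs(corr):.2f}")
```

Note that the sketch centers each series on its full-period mean, the textbook convention, and that PCA itself is an entirely standard dimension-reduction tool, not some exotic invention of Mann’s.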

Another issue was also attracting McIntyre’s attention. During his calibration exercise, Mann had assessed how well the temperature data matched up against the proxies by calculating various statistical measures–in other words, numbers that acted as a score of how good the match was. The main way he did this was using a measure that he called the beta (β), which he described as being ‘a quite rigorous measure of the similarity between two variables’.

This was a somewhat surprising choice since the beta statistic was virtually unheard of outside climatology circles. (It also goes by the names of the ‘resolved variance statistic’ or the ‘reduction of error (RE) statistic’–the latter being the term we will use to refer to it henceforward.) With his experience in statistics, McIntyre was aware that there was great danger in using novel measures like these, whose mathematical behaviour hadn’t been thoroughly researched and documented by statisticians. The statistical literature was littered with examples where particular statistical measures gave results which misled in certain circumstances. Mann had left no clue as to why he had preferred the RE rather than the more normal measures of correlation, such as the correlation (r), the correlation squared (R2) or the CE statistic. The behaviour of all of these measures under a wide range of scenarios was well documented, so McIntyre was surprised not to see an explanation.

[There is a mixture of truth and deception in these two paragraphs. It is true that RE is used within climatology circles and has in fact been in use as far back as the 1950s, and although I cannot confirm Montford’s claim that RE is used ONLY in climatology circles and nowhere else, I am not sure why that would make any difference anyway. The statistic was developed by Lorenz (1956, Empirical orthogonal functions and statistical weather prediction) and is described in standard climatology textbooks and scholarly books; for example, Methods of Dendrochronology: Applications in the Environmental Sciences by E.R. Cook and L.A. Kairiukstis (1990) says this:

4.3.4. Reduction of error


“The reduction of error (RE) statistic provides a highly sensitive measure of reliability. It has useful diagnostic capabilities (Gordon, 1980) and is similar, but not equivalent, to the explained variance statistic obtained with the calibration of the dependent data (Lorenz, 1956; 1977). Therefore, RE should assume a central role in the verification procedure. The equation used to calculate the RE can be expressed in terms of the ŷᵢ estimates and the yᵢ predictions that are expressed as departures from the dependent period mean value:

RE = 1 − [ Σᵢ (yᵢ − ŷᵢ)² ] / [ Σᵢ yᵢ² ]    (4.39)

“The term on the right of (4.39) is the ratio of the total squared error obtained with the regression estimates and the total squared error obtained using the dependent period mean as the only estimate (Lorenz, 1956, 1977; Kutzbach and Guetter, 1980). This average estimation becomes a standard against which the regression estimation is compared. If the reconstruction does a better job at estimating the independent data than the average of the dependent period, then the total error of the regression estimates would be less, the ratio would be less than one, and the RE statistic would be positive.”


“Two verification statistics are presented here that were common to all of the reconstructions: the product-moment correlation coefficient and the reduction of error statistic. Each statistic is commonly used in dendroclimatic reconstructions. The product-moment correlation coefficient (r) is a parametric measure of association between two samples. Its use in testing for hypothesized relationships between variables is described in virtually all basic statistics texts and in Fritts (1976). The reduction of error (RE) statistic is less well known. It was developed in meteorology by Lorenz (1956) for the purpose of assessing the predictive skill of meteorological forecasts. The RE has no formal significance test, but an RE > 0 is an indicator of forecast skill that exceeds that of climatology (i.e. extrapolating the climatic mean as the forecast or prediction). See Fritts (1976), Gordon and LeDuc (1981), and Fritts and Guiot (1990) for full descriptions of this statistic, its small sample properties, and other verification tests as well.”

So why is Montford surprised that Mann would use a statistic that is widely used by climate scientists? Additionally, Montford faults him for using RE rather than r or R2, which are much more widely used. In most papers by Mann, including follow-ups to MBH99, he reports not just RE but also one or both of the latter. When both RE and r2 are reported side by side, as they are in Zhang, Mann and Cook (2003, Alternative methods of proxy-based climate field reconstruction: application to summer drought over the conterminous United States back to AD 1700 from tree-ring data), RE is the more conservative statistic, causing the rejection of more data than r2. This runs counter to Montford’s apparent concern (although his exact concern is never made clear) that Mann is including bad data in his analyses. Since RE is more conservative, Mann is more likely to have left out some potentially good data, and would certainly have excluded any bad data that r2 would also have excluded.]
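To make the comparison concrete, here is a minimal sketch of my own (synthetic data, nothing from MBH98/99) computing both the RE of equation (4.39) above and the squared correlation over a verification period:

```python
import numpy as np

def verification_stats(obs, est):
    """RE (eq. 4.39 above) and squared correlation for a verification period.

    obs and est are observed and reconstructed values, both expressed as
    departures from the calibration-period mean.
    """
    obs = np.asarray(obs, dtype=float)
    est = np.asarray(est, dtype=float)
    re = 1.0 - np.sum((obs - est) ** 2) / np.sum(obs ** 2)
    r2 = np.corrcoef(obs, est)[0, 1] ** 2
    return re, r2

# Synthetic illustration: a reconstruction that tracks the observations
# imperfectly. RE > 0 indicates more skill than just predicting the mean.
rng = np.random.default_rng(1)
obs = rng.normal(size=100)
est = 0.5 * obs + rng.normal(scale=0.8, size=100)
re, r2 = verification_stats(obs, est)
print(f"RE = {re:.2f}, r2 = {r2:.2f}")
```

One reason RE tends to be the more conservative score is that it penalizes bias and amplitude errors in the reconstruction, which a pure correlation measure ignores entirely.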

Mann indicated in the paper that the r and R2 had also been calculated, which might have provided some reassurance to McIntyre but for the fact that the results of these calculations were not presented for the calibration step anywhere in the paper or in the online supplementary information. However, by now McIntyre had got hold of the data for the second Hockey Stick paper, MBH99–the extension back to the year 1000–so he was able to start to make some significant progress in answering some of these questions. Because the number of proxies used in MBH99 was so small (there being very few proxies that extended so far into the past) it was a relatively straightforward task for McIntyre to recreate Mann’s calibration and to calculate some of the correlation statistics for himself. The results were eye-opening, to say the least. As he reported to the climate sceptics:

The R2 . . . ranges from –0.006 to 0.454; on this basis, only 2 of 13 proxies have R2 adjusted over 0.25, and 7 of 13 have values under 0.1 . . .

To put this in perspective, R2 will normally vary between 0 and 1. A score of 0 indicates that there is no correlation at all, and 1 indicates perfect correlation. So what McIntyre was seeing was that the proxies and the temperature PCs didn’t really match up very well, according to a standard measure of correlation. The best among them were not even halfway good, and some simply showed no correlation at all. Could this explain why Mann was so enthusiastic about the RE statistic, the climatologists’ own measure of correlation?

[This would be pretty damning stuff, and pretty surprising to find in data used in a peer-reviewed scientific paper. How could this happen? Well, what is never done in this book is to update McIntyre’s work in light of later findings. First, why should we trust McIntyre, who does not do these kinds of statistics routinely the way most climate scientists do, to produce more accurate results than Mann? If there is a discrepancy between Mann’s and McIntyre’s results, shouldn’t we suspect that McIntyre is the one making the mistakes?

Mann, in his book The Hockey Stick and the Climate Wars, says this in reference to the above criticisms leveled by McIntyre:

“To be specific, they claimed that the hockey stick was an artifact of four supposed “categories of errors”: “collation errors,” “unjustified truncation and extrapolation,” “obsolete data,” and “calculation mistakes.” As we noted in a reply to a McIntyre and McKitrick comment on MBH98 that had been submitted to and rejected by Nature (because their comment was rejected anyway, our reply would not appear there either), those claims were false, resulting from their misunderstanding of the format of a spreadsheet version of the dataset they had specifically requested from my associate, Scott Rutherford. None of the problems they cited were present in the raw, publicly available version of our dataset, which was available at that time at ftp://holocene.evsc.virginia.edu/pub/MBH98/.”

What Montford also overlooks is that if you take Mann’s correct data, and the RE and r2 values found in his papers that report both statistics, you find that for the vast majority of his data both measures agree on what constitutes good and bad data. Where data are rejected by one measure and not the other, the rejections are due to the RE statistic. Considering that this is a revised edition of the book, republished in 2011, it is dishonest of Montford not to have corrected these problems, or at least addressed them, rather than perpetuating false and incorrect critiques of MBH98 and MBH99.]

[Further Examples of Grammatically Incorrect Use of the Word “Data”:]

He could see that Mann had used a network of 112 proxy series, and in fact behind the scenes there was even more data than this.

The data that Mann used was the CRU’s best stab at what the actual temperatures had been for the previous 150-odd years, and as we’ve noted, CRU’s data was reckoned to be the best.

He also tried regressing nineteenth century proxy data against twentieth century temperatures and found no great difference in the R2 score to those achieved when the correct proxy data was used.

On an even simpler level, there was a great deal about the data used in the MBH99 reconstruction that was peculiar.

[It looks as if Montford’s grammatically incorrect use of the word “data” is consistent, given that four more cases occur in the first half of chapter 3. He needs more than just a better editor; he needs someone versed in scientific writing to help him.]
