Review of Chapter 3 (part 1), The Hockey Stick Illusion by A. W. Montford
In an attempt to see both sides of the debate over human-caused climate change, I am starting to read this book: The Hockey Stick Illusion: Climategate and the Corruption of Science (Independent Minds) by Andrew Montford.
As I read it, I will give a chapter-by-chapter review comprising quotes and commentary. So, if this is an area of interest to you, keep watch.
All material quoted from the book is in italics. My comments are in plain text in brackets.
There had been a great deal of
excitement on the forum in recent months. A new study by two Harvard
astrophysicists, Willie Soon and Sallie Baliunas, had just been published and
its appearance had caused a huge furore in the world of paleoclimate [Soon W,
Baliunas S. Proxy climatic and environmental changes of the past 1000 years.
Climate Research. 2003; 23: 89–110]. Soon and Baliunas had reviewed a large
dataset of paleoclimate proxies to see how many showed the Medieval Warm
Period, the Little Ice Age and the modern warming. They had concluded that the
Medieval Warm Period was in fact a real, significant feature of climate
history. The paper had been extremely controversial, contradicting the
mainstream consensus that the Medieval Warm Period was probably only a regional
phenomenon. Climatologists from around the world had fallen over themselves to
attack the Soon and Baliunas paper, mainly on the grounds that many of the
proxies used in the study were precipitation proxies rather than temperature
proxies. So great was the uproar, in fact, that several scientists resigned
from the editorial board of Climate Research, the journal which had published the
paper in the first place. In the face of all this opposition, the paper had
gained little traction in terms of changing mainstream scientific opinion on
the existence of the Medieval Warm Period. It had been a huge disappointment
for the sceptic community.
[There
are reasons the Soon-Baliunas paper was controversial, but not because of the
Medieval Warm Period (MWP), which all climate researchers accept as having
occurred; rather, it was the poor quality of their analysis and their claim
that the MWP saw higher mean temperatures than the recent warming trend. Mann
characterizes the problems well in The
Hockey Stick and the Climate Wars: Dispatches from the Front Lines:
“The
Soon and Baliunas study claimed to contradict previous work—including our
own—that suggested that the average warmth of the Northern Hemisphere in recent
decades was unprecedented over a time frame of at least the past millennium. It
claimed to do so not by performing any quantitative analysis itself, but through
what the authors referred to as a “meta-analysis”—that is to say, a review and
characterization of other past published work.
“A
fundamental problem with the paper was that its authors’ definition of a
climatic event was so loose as to be meaningless. As Richard Monastersky
summarized it in his article, “under their method, warmth in China in A.D. 850,
drought in Africa in A.D. 1000, and wet conditions in England in A.D. 1200 all
would qualify as part of the Medieval Warm Period, even though they happened centuries
apart.” In other words, their characterization didn’t take into account whether
climate trends in different regions were synchronous. The authors therefore
hadn’t accounted for likely offsetting fluctuations—the typical sort of seesaw
patterns one often encounters with the climate, where certain regions warm
while others cool.
“An
additional problem with the study is readily evident from Monastersky’s
characterization above. Rather than assessing whether there was overall
evidence for widespread warmth, the authors were asking a completely different,
practically tautological question: Was there evidence that a given region was
either unusually warm, or wet, or dry? The addition of these two latter
criteria undermined the credibility of the authors’ claim of assessing the
relative unusualness of warmth during the medieval period. These two
criteria—were there regions that were either wet or dry—could just as easily be
satisfied during a global cool period!
“A
third problem is that the authors used an inadequate definition of modern
conditions. It is only for the past couple of decades that the hockey stick and
other reconstructions showed warmth to be clearly anomalous. Many of the
records included in the Soon and Baliunas meta-analysis either end in the
mid-twentieth century or had such poor temporal resolution that they could not
capture the trends over the key interval of the past few decades, and hence
cannot, at least nominally, be used to compare past and present.
“There
was yet a fourth serious problem with the Soon and Baliunas study. The authors
in many cases had mischaracterized or misrepresented the past studies they
claimed to be assessing in their meta-analysis, according to Monastersky.
Paleoclimatologist Peter de Menocal of Columbia University/LDEO, for example,
who had developed a proxy record of ocean surface temperature from sediments
off the coast of Africa, indicated that “Mr. Soon and his colleagues could not
justify their conclusions that the African record showed the 20th century as being
unexceptional,” and told Monastersky, “My record has no business being used to
address that question.” To cite another instance, David Black of the University
of Akron, a paleoclimatologist who had developed a proxy record of wind
strength from sediments off the coast of Venezuela, indicated that “Mr. Soon’s
group did not use his data properly”; he told Monastersky pointedly: “I think
they stretched the data to fit what they wanted to see.””
and
from the same source:
“John
Holdren, the Heinz professor of environmental policy who went on to become
president of the American Association for the Advancement of Science (AAAS) and
presidential science adviser in the Obama administration, voiced the opinion
that “The critics are right. It’s unfortunate that so much attention is paid to
a flawed analysis, but that’s what happens when something happens to support
the political climate in Washington.””
Another
critique of the controversy is in Wikipedia: http://en.wikipedia.org/wiki/Soon_and_Baliunas_controversy]
McIntyre was one of the mainstays
of the Climate Skeptics site, posting comments on a wide array of subjects. In
recent weeks he’d spent a great deal of time discussing radiative physics,
trying to understand how the IPCC came up with an expected temperature rise of
2.5°C every time atmospheric carbon dioxide doubled. He’d not really got
anywhere with it so far (and in fact it remains a mystery to this day), but he
was far from giving up hope. If there was an explanation to be had, he fully
expected to find it.
[Is
it possible that the retired geologist (McIntyre) and Montford simply don’t
accept the explanations commonly found in the literature? For example, this,
from Global Temperature Change, Hansen, et al. (2006):
“In
assessing the level of global warming that constitutes DAI [dangerous
anthropogenic interference], we must bear in mind that estimated climate
sensitivity of 3 ± 1°C for doubled CO2, based mainly on paleoclimate
data but consistent with models, refers to a case in which sea ice, snow, water
vapor, and clouds are included as feedbacks, but ice sheet area, vegetation cover,
and non-H2O GHGs are treated as forcings or fixed boundary conditions. On long
time scales, and as the present global warming increases, these latter
quantities can change and thus they need to be included as feedbacks. Indeed,
climate becomes very sensitive on the ice-age time scale, as feedbacks,
specifically ice sheet area and GHGs, account for practically the entire global
temperature change.”
The
above quote is by no means the only place where such explanations are to be
found. Notice also that it is not accurate to cite a specific value, such as
2.5°C, without also giving the degree of uncertainty. According to
Hansen, et al. (2006) the range is 2–4°C, and if you read the paper in
more detail the authors also acknowledge that, although much less likely, it
could be lower or higher than this range. Part of how Montford mischaracterizes
the debate is to always lowball the numbers, seeming unwilling to see much, if
any, change in temperature due to increasing CO2. If he would at least
engage in an honest debate it would be better.]
While McIntyre’s readings in
climatology broadened, he also began discussing the IPCC’s claims of
unprecedented warmth with friends and acquaintances. His contacts in the mining
industry were particularly interesting on the subject. Familiar as they were
with the long-term history of the Earth, many of the geologists McIntyre spoke
to had strong opinions on claims that recent temperatures were unprecedented
and most were highly sceptical of the idea. When it came to the Hockey Stick
itself, mining people – geologists, lawyers and accountants – were openly
contemptuous. Hockey sticks were a well known phenomenon in the business world,
and McIntyre’s contacts had seen far too many dubious mining promotions and
dotcom revenue projections to take such a thing seriously. The contrasting
reactions to the Hockey Stick of politicians and business people – on the one
hand doom-laden predictions of catastrophe and on the other open ridicule – acted
as a spur to McIntyre, who flung himself headlong into the world of
climatology.
[Two
things here: First, geologists and geochemists, as a group, tend to be
automatically skeptical of climate science because they are not typically
trained in the disciplines needed to fully understand climate modeling, and
they are typically employed by an industry that sees signs of environmental
problems as a threat to their profession. This is especially true of those
geologists employed by the fossil fuel industry, which stands to lose a lot of
revenue if politicians decide we must respond to climate change by reducing
carbon emissions. Secondly, to question the hockey stick just because it
resembles something you have seen in a boardroom presentation seems a poor
reason to doubt the validity of the data supporting its construction. Of
course, this is only what led McIntyre to approach Mann’s work with
skepticism, not his stated reason for rejecting Mann’s work later, but it does
suggest an interesting, non-scientific bias.]
Within a matter of days of his
announcement, McIntyre was posting findings to the Climate Skeptics forum. He
had now worked through Mann’s explanation of his methodology and he had
soldiered his way through the matrix algebra. It was still very strange. The
use of PC analysis was new in the realm of paleoclimate and Mann had made no
attempt to prove the validity of the technique in the field, instead relying on
a bold assertion that it was better than the alternatives. In view of this and
given the surprising results – with no Medieval Warm Period or Little Ice Age
visible in the reconstruction – one might have expected that experts in the field
would have questioned whether Mann’s novel procedures might have been a factor
in his anomalous results. But despite a thorough search of the literature,
there was no sign that anyone else had seen fit to probe the issue further. Nor
had any other researchers adopted Mann’s methodology in the five years since
his paper had been published. Given how often the Hockey Stick had been cited
in the scientific literature, these were very surprising observations, which
seemed to suggest that paleoclimatologists liked Mann’s results rather more
than they liked his methodology.
[Is
Montford just being dishonest, or did he not even bother to check the MBH99
paper (Mann, et al., Northern hemisphere temperatures during the past
millennium: Inferences, uncertainties, and limitations)? All one has to do is
read that paper to realize two things: First, Mann only presented confident
temperature estimates back to 1400, which is just after the MWP, so of course
it does not feature in the hockey stick figure. He even explains in the paper
why earlier temperatures were treated with caution, i.e. because the data were
inadequate. He even mentions the MWP, but because he suggests that the peak
temperatures during that period only “approached” mean 20th century levels,
Montford (and McIntyre) interpret that to mean that Mann has done away with
the MWP. He has not done away with it; he is just saying that its peak
temperatures were not as high as Montford would like them to be, i.e. higher
than at any time in the recent past or the present.
Secondly,
Mann does include the Little Ice Age (LIA) in the hockey stick figure and even
mentions it, just not by name, which Montford apparently interprets as its
being absent. If you don’t agree with my two statements here, read the paper
for yourself; here is what the authors say in their conclusions:
“Although
NH reconstructions prior to about AD 1400 exhibit expanded uncertainties,
several important conclusions are still possible. While warmth early in the
millennium approaches mean 20th century levels, the late 20th century still
appears anomalous: the 1990s are likely the warmest decade, and 1998 the
warmest year, in at least a millennium. More widespread high-resolution data
which can resolve millennial-scale variability are needed before more confident
conclusions can be reached with regard to the spatial and temporal details of
climate change in the past millennium and beyond.”
To
top it off, note the tentative nature of the statements in this conclusion, in
contrast to Montford, who is just plain certain that Mann is wrong.
Montford
also claims that no other researchers were using Mann’s methods in the
five years following MBH99. Apparently Montford either did not check the
literature very thoroughly, or his definition of researchers having “adopted
Mann’s methodology” is so narrow that unless they did exactly what Mann did,
Montford would assume they had not adopted it. The odd thing
about this is that the techniques used by Mann were, for the most part,
standard, accepted methods used by most climate researchers. In just a brief
search of the literature from 1999–2004 I found several papers that appear to
have used the general approach used by Mann. There are also studies
using alternative methods that found largely the same results as Mann during
this same time period.]
Another issue was also attracting
McIntyre’s attention. During his calibration exercise, Mann had assessed how
well the temperature data matched up against the proxies by calculating various
statistical measures–in other words, numbers that acted as a score of how good
the match was. The main way he did this was using a measure that he called the
beta (β), which he described as being ‘a quite rigorous measure of the similarity
between two variables’.
This was a somewhat surprising
choice since the beta statistic was virtually unheard of outside climatology
circles. (It also goes by the names of the ‘resolved variance statistic’ or the
‘reduction of error (RE) statistic’–the latter being the term we will use to
refer to it henceforward.) With his experience in statistics, McIntyre was
aware that there was great danger in using novel measures like these, whose
mathematical behaviour hadn’t been thoroughly researched and documented by
statisticians. The statistical literature was littered with examples where
particular statistical measures gave results which misled in certain
circumstances. Mann had left no clue as to why he had preferred the RE rather
than the more normal measures of correlation, such as the correlation (r), the
correlation squared (R2) or the CE statistic. The behaviour of all of these
measures under a wide range of scenarios was well documented, so McIntyre was
surprised not to see an explanation.
[There
is a mixture of truth and deception in these two paragraphs. It is true that RE
is used within climatology circles and has in fact been in use since the
1950s, and although I cannot confirm Montford’s claim that RE is ONLY used
in climatology circles and not elsewhere, I am not sure why this would make any
difference anyway. This statistical method was developed by Lorenz (1956, Empirical orthogonal
functions and statistical weather prediction) and is referred to in
standard climatology textbooks and scholarly books, for example, Methods of Dendrochronology:
Applications in the Environmental Sciences by E.R. Cook and L.A. Kairiukstis
(1990) says this:
4.3.4.
Reduction of error
“The
reduction of error (RE) statistic provides a highly sensitive measure of
reliability. It has useful diagnostic capabilities (Gordon, 1980) and is
similar, but not equivalent, to the explained variance statistic obtained with
the calibration of the dependent data (Lorenz, 1956; 1977). Therefore, RE
should assume a central role in the verification procedure. The equation used
to calculate the RE can be expressed in terms of the ŷi estimates
and the yi predictions that are expressed as departures from the
dependent period mean value:

RE = 1 − [ Σ (yi − ŷi)² / Σ yi² ]        (4.39)
“The
term on the right of (4.39) is the ratio of the total squared error obtained
with the regression estimates and the total squared error obtained using the
dependent period mean as the only estimate (Lorenz, 1956, 1977; Kutzbach and
Guetter, 1980). This average estimation becomes a standard against which the
regression estimation is compared. If the reconstruction does a better job at
estimating the independent data than the average of the dependent period, then
the total error of the regression estimates would be less, the ratio would be
less than one, and the RE statistic would be positive.”
Similarly,
Climate Since A.D. 1500
edited by Raymond S. Bradley and Philip D. Jones has this to say:
“Two verification statistics are presented here that
were common to all of the reconstructions: the product-moment correlation
coefficient and the reduction of error statistic. Each statistic is commonly
used in dendroclimatic reconstructions. The product-moment correlation
coefficient (r) is a parametric measure of association between two samples. Its
use in testing for hypothesized relationships between variables is described in
virtually all basic statistics texts and in Fritts (1976). The reduction of
error (RE) statistic is less well known. It was developed in meteorology by
Lorenz (1956) for the purpose of assessing the predictive skill of
meteorological forecasts. The RE has no formal significance test, but an
RE > 0 is an indicator of forecast skill that exceeds that of climatology
(i.e. extrapolating the climatic mean as the forecast or prediction). See
Fritts (1976), Gordon and LeDuc (1981), and Fritts and Guiot (1990) for full
descriptions of this statistic, its small sample properties, and other
verification tests as well.”
So, why is Montford surprised that Mann would use a
statistic that is widely used by climate scientists? Additionally, Montford
faults him for using RE and not r or R2, which are much more widely
used. In most papers by Mann, and follow-ups to MBH99, he includes not just RE,
but also one or both of the latter. When both RE and r2 are reported
side by side as they are in Zhang, Mann and Cook (2003, Alternative
methods of proxy-based climate field reconstruction: application to summer
drought over the conterminous United States back to AD 1700 from tree-ring data),
RE is the more conservative statistic, causing the rejection of more data than
r2. This runs counter to the apparent concerns of Montford (although
his exact concerns are not clear) that Mann is including bad data in his
analyses. Since RE is more conservative, Mann is more likely to have left out
some potentially good data, and most certainly to have excluded any bad data
that would also have been excluded by r2.]
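[To make the RE/r2 comparison concrete, here is a minimal sketch of how the two statistics are computed. The data are entirely synthetic (no actual proxy series are involved), and the function and variable names are my own, not Mann's:

```python
import numpy as np

def reduction_of_error(y, y_hat):
    """RE statistic (Lorenz, 1956): 1 minus the ratio of the squared error of
    the reconstruction to the squared error of using the calibration-period
    mean as the only estimate. Both y and y_hat are anomalies (departures from
    the dependent-period mean), so the 'mean forecast' is simply zero."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum(y ** 2)

def r_squared(y, y_hat):
    """Square of the product-moment correlation coefficient."""
    return np.corrcoef(y, y_hat)[0, 1] ** 2

# Synthetic "instrumental" anomalies and a noisy "reconstruction" of them
rng = np.random.default_rng(0)
temps = rng.normal(size=200)
recon = 0.6 * temps + rng.normal(scale=0.8, size=200)
temps -= temps.mean()  # express both series as departures from their means
recon -= recon.mean()

print("RE =", round(reduction_of_error(temps, recon), 3))
print("r2 =", round(r_squared(temps, recon), 3))
```

Note that RE > 0 simply means the reconstruction beats the climatological mean as a forecast, which is why zero, not one, is the skill threshold: a perfect reconstruction scores RE = 1, while forecasting the mean (all-zero anomalies) gives RE = 0 exactly.]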
Mann indicated in the paper that
the r and R2 had also been calculated, which might have provided
some reassurance to McIntyre but for the fact that the results of these
calculations were not presented for the calibration step anywhere in the paper
or in the online supplementary information. However, by now McIntyre had got
hold of the data for the second Hockey Stick paper, MBH99–the extension back to
the year 1000–so he was able to start to make some significant progress in
answering some of these questions. Because the number of proxies used in MBH99
was so small (there being very few proxies that extended so far into the past)
it was a relatively straightforward task for McIntyre to recreate Mann’s
calibration and to calculate some of the correlation statistics for himself.
The results were eye-opening, to say the least. As he reported to the climate
sceptics:
The R2 . . . ranges from –0.006
to 0.454; on this basis, only 2 of 13 proxies have R2 adjusted over 0.25, and 7
of 13 have values under 0.1 . . .
To put this in perspective, R2
will normally vary between 0 and 1. A score of 0 indicates that there is no
correlation at all, and 1 indicates perfect correlation. So what McIntyre was
seeing was that the proxies and the temperature PCs didn’t really match up very
well, according to a standard measure of correlation. The best among them were
not even halfway good, and some simply showed no correlation at all. Could this
explain why Mann was so enthusiastic about the RE statistic, the climatologists’
own measure of correlation?
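[As a quick illustration of what numbers like 0.1 or 0.45 mean in practice, the sketch below (entirely synthetic data, unrelated to the actual proxies) shows how r2 falls off as noise swamps the common signal between two series:

```python
import numpy as np

rng = np.random.default_rng(42)
signal = rng.normal(size=500)  # a common underlying "climate" signal

# Proxies that track the signal with increasing amounts of unrelated noise
for noise_sd in (0.5, 1.0, 3.0):
    proxy = signal + rng.normal(scale=noise_sd, size=500)
    r2 = np.corrcoef(signal, proxy)[0, 1] ** 2
    print(f"noise sd {noise_sd}: r2 = {r2:.2f}")
```

With these settings the three r2 values come out roughly around 0.8, 0.5, and 0.1: an r2 below 0.1, in other words, corresponds to a proxy whose variance is overwhelmingly unrelated to the signal it is supposed to record.]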
[This
would be pretty damning stuff, and pretty surprising to find in data used in
a peer-reviewed scientific paper. How could this happen? Well, what is not done
in this book is any update of McIntyre’s work in light of later findings.
First, why should we trust McIntyre, who does not do these kinds of
statistics routinely the way most climate scientists do, to produce more
accurate results than Mann? If there is a discrepancy between Mann’s and
McIntyre’s results, shouldn’t we suspect that McIntyre is the one making the
mistakes?
In
reference to the above claims, Mann, in his book The
Hockey Stick and the Climate Wars, says this about the
criticisms leveled by McIntyre on Montford’s climate blog:
“To
be specific, they claimed that the hockey stick was an artifact of four
supposed “categories of errors”: “collation errors,” “unjustified truncation
and extrapolation,” “obsolete data,” and “calculation mistakes.” As we noted in
a reply to a McIntyre and McKitrick comment on MBH98 that had been submitted to
and rejected by Nature (because their comment was rejected anyway, our reply
would not appear there either), those claims were false, resulting from their
misunderstanding of the format of a spreadsheet version of the dataset they had
specifically requested from my associate, Scott Rutherford. None of the
problems they cited were present in the raw, publicly available version of our
dataset, which was available at that time at ftp://holocene.evsc.virginia.edu/pub/MBH98/.”
What
Montford also overlooks is that if you take Mann’s correct
data, and the RE and r2 values found in his papers that contain both
statistics, then for the vast majority of his data both values agree on what
constitutes good and bad data. Where data are rejected by one measure and not
the other, the rejections are due to the RE statistic. Considering that this is
a revised edition of the book, republished in 2011, it is dishonest of Montford
not to have corrected these problems, or at least addressed them, rather than
perpetuating false and incorrect critiques of MBH98 and MBH99.]
[Further
Examples of Grammatically Incorrect Use of the Word “Data”:]
He could see that Mann had used a
network of 112 proxy series, and in fact behind the scenes there was even more
data than this.
The data that Mann used was the
CRU’s best stab at what the actual temperatures had been for the previous
150-odd years, and as we’ve noted, CRU’s data was reckoned to be the best.
He also tried regressing
nineteenth century proxy data against twentieth century temperatures and found
no great difference in the R2 score to those achieved when the correct proxy
data was used.
On an even simpler level, there
was a great deal about the data used in the MBH99 reconstruction that was
peculiar.
[It
looks as if Montford’s grammatically incorrect use of the word “data” is
consistent, given that four more cases occur in the first half of chapter 3. He
needs more than just a better editor; he needs someone versed in scientific
writing to help him.]