## Keenan's response to the BEST paper

*Doug Keenan has posted up his correspondence with the Economist and Richard Muller about the BEST paper. I reproduce it here with permission.*

*The Economist* asked me to comment on four research papers from the Berkeley Earth Surface Temperature (BEST) project. The four papers were as follows.

- Decadal variations in the global atmospheric land temperatures
- Influence of urban heating on the global temperature land average using rural sites identified from MODIS classifications
- Berkeley Earth temperature averaging process
- Earth atmospheric land surface temperature and station quality

Below is some of the correspondence that we had. (Note: my comments were written under time pressure, and are unpolished.)

From: D.J. Keenan

To: Richard Muller [BEST Scientific Director]; Charlotte Wickham [BEST Statistical Scientist]

Cc: James Astill; Elizabeth Muller

Sent: 17 October 2011, 17:16

Subject: BEST papers

Attach: Roe_FeedbacksRev_08.pdf; Cowpertwait & Metcalfe, 2009, sect 2-6-3.pdf; EmailtoDKeenan12Aug2011.pdf

Charlotte and Richard,

James Astill, Energy & Environment Editor of *The Economist*, asked Liz Muller if it would be okay to show me your BEST papers, and Liz agreed. Thus far, I have looked at two of the papers.

- Decadal Variations in the Global Atmospheric Land Temperatures
- Influence of Urban Heating on the Global Temperature Land Average Using Rural Sites Identified from MODIS Classifications

Following are some comments on those.

In the first paper, various series are compared and analyzed. The series, however, have sometimes been smoothed via a moving average. Smoothed time series cannot be used in most statistical analyses. For some comments on this, which require only a little statistical background, see these blog posts by Matt Briggs (who is a statistician):

Do not smooth times series, you hockey puck!

Do NOT smooth time series before computing forecast skill

Here is a quote from those (formatting in original).

Unless the data is measured with error, you never, ever, for no reason, under no threat, SMOOTH the series! And if for some bizarre reason you do smooth it, you absolutely on pain of death do NOT use the smoothed series as input for other analyses! If the data is measured with error, you might attempt to model it (which means smooth it) in an attempt to estimate the measurement error, but even in these rare cases you have to have an *outside* (the learned word is “exogenous”) estimate of that error, that is, one not based on your current data.

If, in a moment of insanity, you *do* smooth time series data and you *do* use it as input to other analyses, **you dramatically increase the probability of fooling yourself**! This is because smoothing induces spurious signals—signals that look real to other analytical methods.

This problem seems to invalidate much of the statistical analysis in your paper.

There is another, larger, problem with your papers. In statistical analyses, an inference is not drawn directly from data. Rather, a statistical model is fit to the data, and inferences are drawn from the model. We sometimes see statements such as “the data are significantly increasing”, but this is loose phrasing. Strictly, data cannot be significantly increasing, only the trend in a statistical model can be.

A statistical model should be plausible on both statistical and scientific grounds. Statistical grounds typically involve comparing the model with other plausible models or comparing the observed values with the corresponding values that are predicted from the model. Discussion of scientific grounds is largely omitted from texts in statistics (because the texts are instructing in statistics), but it is nonetheless crucial that a model be scientifically plausible. If statistical and scientific grounds for a model are not given in an analysis and are not clear from the context, then inferences drawn from the model should be regarded as unfounded.

The statistical model adopted in most analyses of climatic time series is a straight line (usually trending upward) with noise (i.e. residuals) that is AR(1). AR(1) is short for “first-order autoregressive”, which means, roughly, that this year (only) has a direct effect on next year; for example, if this year is extremely cold, then next year will have a tendency to be cooler than average.
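To make the model concrete, here is a minimal Python sketch (an illustration added here, not part of the correspondence) that simulates a straight line plus AR(1) noise and measures how strongly one year resembles the next. The function names and all parameter values (`phi`, `sigma`, `slope`, the series length) are assumptions chosen for illustration, not values fitted to any temperature record.

```python
import random

def simulate_trend_ar1(n, slope, phi, sigma, seed=0):
    """Return n values of a straight line plus AR(1) noise.

    noise[t] = phi * noise[t-1] + e[t], with e[t] ~ Normal(0, sigma).
    All parameter values are illustrative assumptions, not fitted to data.
    """
    rng = random.Random(seed)
    noise = 0.0
    series = []
    for t in range(n):
        noise = phi * noise + rng.gauss(0.0, sigma)
        series.append(slope * t + noise)
    return series

def lag1_autocorrelation(x):
    """Sample correlation between each value and the one that follows it."""
    mean = sum(x) / len(x)
    dev = [v - mean for v in x]
    num = sum(a * b for a, b in zip(dev[:-1], dev[1:]))
    den = sum(d * d for d in dev)
    return num / den

# Trend-free noise, so the autocorrelation reflects the noise model alone.
ar1_noise = simulate_trend_ar1(n=5000, slope=0.0, phi=0.6, sigma=1.0, seed=1)
white_noise = simulate_trend_ar1(n=5000, slope=0.0, phi=0.0, sigma=1.0, seed=1)
r_ar1 = lag1_autocorrelation(ar1_noise)
r_white = lag1_autocorrelation(white_noise)
```

With `phi = 0.6` the lag-1 autocorrelation should come out close to 0.6, and with `phi = 0` it should be close to zero, which is the "this year has no effect on next year" case.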

That model—a straight line with AR(1) noise—is the model adopted by the IPCC (see AR4: §I.3.A). It is also the model that was adopted by the U.S. Climate Change Science Program (which reports to Congress) in its analysis of “Statistical Issues Regarding Trends”. Etc. An AR(1)-based model has additionally been adopted for several climatic time series other than global surface temperatures. For instance, it has been adopted for the Pacific Decadal Oscillation, studied in your work: see the review paper by Roe [2008], attached.

Although an AR(1)-based model has been widely adopted, it nonetheless has serious problems. The problems are actually so basic that they are discussed in some recent introductory (undergraduate) texts on time series—for example, in *Time Series Analysis and Its Applications* (third edition, 2011) by R.H. Shumway & D.S. Stoffer (see Example 2.5; set exercises 3.33 and 5.3 elaborate).

In Australia, the government commissioned the Garnaut Review to report on climate change. The Garnaut Review asked specialists in the analysis of time series to analyze the global temperature series. The report from those specialists considered and, like Shumway & Stoffer, effectively rejected the AR(1)-based statistical model. Statistical analysis shows that the model is too simplistic to cope with the complexity in the series of global temperatures.

Additionally, some leading climatologists have strongly argued on scientific grounds that the AR(1)-based model is unrealistic and too simplistic [Foster et al., *GRL*, 2008].

To summarize, most research on global warming relies on a statistical model that should not be used. This invalidates much of the analysis done on global warming. I published an op-ed piece in the *Wall Street Journal* to explain these issues, in plain English, this year.

The largest center for global-warming research in the UK is the Hadley Centre. The Hadley Centre employs a statistician, Doug McNeall. After my op-ed piece appeared, Doug McNeall and I had an e-mail discussion about it. A copy of one of his messages is attached. In the message, he states that the statistical model—a straight line with AR(1) noise—is “simply inadequate”. (He still believes that the world is warming, primarily due to computer simulations of the global climate system.)

Although the AR(1)-based model is known to be inadequate, no one knows what statistical model should be used. There have been various papers in the peer-reviewed literature that suggest possible resolutions, but so far no alternative model has found much acceptance.

When I heard about the Berkeley Earth Surface Temperature project, I got the impression that it was going to address the statistical issues. So I was extremely curious to see what statistical model would be adopted. I assumed that strong statistical expertise would be brought to the project, and I was trusting that, at a minimum, there would be a big improvement on the AR(1)-based model. Indeed, I said this in an interview with *The Register* last June.

BEST did not adopt the AR(1)-based model; nor, however, did it adopt a model that deals with some of the complexity that AR(1) fails to capture. Instead, BEST chose a model that is much more simplistic than even AR(1), a model which allows essentially no structure in the time series. In particular, the model that BEST adopted assumes that this year has no effect on next year. That assumption is clearly invalid on climatological grounds. It is also easily seen to be invalid on statistical grounds. Hence the conclusions of the statistical analysis done by BEST are unfounded.

All this occurred even though understanding the crucial question—what statistical model should be used?—requires only an introductory level of understanding in time series. The question is so basic that it is discussed by the introductory text of Shumway & Stoffer, cited above. Another text that does similarly is *Introductory Time Series with R* by P.S.P. Cowpertwait & A.V. Metcalfe (2009); a section from that text is attached. (The section argues that, from a statistical perspective, a pure AR(4) model is appropriate for global temperatures.) Neither Shumway & Stoffer nor Cowpertwait & Metcalfe have an agenda on global warming, to my knowledge. Rather, they are just writing introductory texts on time series and giving students practical examples; each text includes the series of global temperatures as one of those examples.

There are also textbooks that are devoted to the statistical analysis of climatic data and that discuss time-series modeling in detail. My bookshelf includes the following.

*Climate Time Series Analysis* (Mudelsee, 2010)

*Statistical Analysis in Climate Research* (von Storch & Zwiers, 2003)

*Statistical Methods in the Atmospheric Sciences* (Wilks, 2005)

*Univariate Time Series in Geosciences* (Gilgen, 2006)

Considering the second paper, on Urban Heat Islands, the conclusion there is that there has been some urban *cooling*. That conclusion contradicts over a century of research as well as common experience. It is almost certainly incorrect. And if such an unexpected conclusion is correct, then every feasible effort should be made to show the reader that it must be correct.

I suggest an alternative explanation. First note that the stations that your analysis describes as “very rural” are in fact simply “places that are not dominated by the built environment”. In other words, there might well be, and probably is, substantial urbanization at those stations. Second, note that Roy Spencer has presented evidence that the effects of urbanization on temperature grow logarithmically with population size.

The Global Average Urban Heat Island Effect in 2000 Estimated from Station Temperatures and Population Density Data

Putting those two notes together, we might expect that the UHI effect will be larger at the sites classified as “very rural” than at the sites classified as urban. And that is indeed what your analysis shows. Of course, if this alternative explanation is correct, then we cannot draw any inferences about the size of UHI effects on the average temperature measurements, using the approach taken in your paper.

There are other, smaller, problems with your paper. In particular, the Discussion section states the following.

We observe the opposite of an urban heating effect over the period 1950 to 2010, with a slope of -0.19 ± 0.19 °C/100yr. This is not statistically consistent with prior estimates, but it does verify that the effect is very small....

If the two estimates are not consistent, then they contradict each other. In other words, at least one of them must be wrong. Hence one estimate cannot be used to “verify” an inference drawn from the other. This has nothing to do with statistics. It is logic.

Sincerely, Doug

* * * * * * * * * * * *

Douglas J. Keenan

http://www.informath.org

From: Richard Muller

To: James Astill

Cc: Elizabeth Muller

Sent: 17 October 2011, 23:33

Subject: Re: BEST papers

Dear James,

You've received a copy of an email that DJ Keenan wrote to me and Charlotte. He raises lots of issues that require addressing, some that reflect misunderstanding, and some of which just reflect disagreements among experts in the field of statistics. Since these issues are bound to arise again and again, we are preparing an FAQ that we will put on our web site.

Keenan states that he had not yet read our long paper on statistical methods. I think if he reads this he is more likely to appreciate the sophistication and care that we took in the analysis. David Brillinger, our chief advisor on statistics, warned us that by avoiding the jargon of statistics, we would mislead statisticians to think we had a naive approach. But we decided to write in a more casual style, specifically to be able to reach the wider world of geophysicists and climate scientists who don't understand the jargon. Again, if Keenan reads the methods paper, he will have a deeper appreciation of what we have done.

It is also important to recognize that we are not creating a new field of science, but are adding to one that has a long history. In the past I've discovered that if you avoid using the methods of the past, the key scientists in the field don't understand what you have done. As my favorite example, I cite a paper I wrote in which the data were unevenly spaced in time, so I did a Lomb periodogram; the paper was rejected by referees who argued that I was using an "obscure" approach and should have simply done the traditional interpolation followed by Blackman-Tukey analysis. In the future I did it their way, always being careful however to also do a Lomb analysis to make sure there were no differences.

His initial comment is on the smoothing of data. There are certainly statisticians who vigorously oppose this approach, but there have been top statisticians who support it. Included in that list are David Brillinger, and his mentor, the great John Tukey. Tukey revolutionized the field of data analysis for science and his methods dominate many fields of physical science.

Tukey argued that smoothing was a version of "pre-whitening", a valuable way to remove from the data behavior that was real but not of primary interest. Another of his methods was sequential analysis, in which the low frequency variations were identified, fit using a maximum likelihood method, and then subtracted from the data using a filter prior to the analysis of the frequencies of interest. He showed that this pre-whitening would lead to a more robust result. This is effectively what we did in the Decadal variations paper. The long time scale changes were not the focus of our study, so we did a maximum-likelihood fit, removed them, and examined the residuals.

Keenan quotes: "If, in a moment of insanity, you *do* smooth time series data and you *do* use it as input to other analyses, **you dramatically increase the probability of fooling yourself**! This is because smoothing induces spurious signals—signals that look real to other analytical methods." Then he draws a conclusion that does not follow from this quote; he says: "This problem seems to invalidate much of the statistical analysis in your paper."

He is, of course, being illogical. Just because smoothing can increase the probability of our fooling ourselves doesn't mean that we did. There is real value to smoothing data, and yes, you have to beware of the traps, but if you are then there is a real advantage to doing that. I wrote about this in detail in my technical book on the subject, "Ice Ages and Astronomical Causes." Much of this book is devoted to pointing out the traps and pitfalls that others in the field fell into.

Keenan goes on to say, "In statistical analyses, an inference is not drawn directly from data. Rather, a statistical model is fit to the data, and inferences are drawn from the model." I agree wholeheartedly! He may be confused because we adopted the language of physics and geophysics rather than that of statistics. He goes on to say that "This invalidates much of the analysis done on global warming." If we are to move ahead, it does no good simply to denigrate most of the previous work. So we do our work with more care, using valid statistical methods, but write our papers in such a way that the prior workers in the field will understand what we say. Our hope, in part, is to advance the methods of the field.

Unfortunately, Keenan's conclusion is that there has been virtually no valid work in the climate field, that what is needed is a better model, and he does not know what that model should be. He says, "To summarize, most research on global warming relies on a statistical model that should not be used. This invalidates much of the analysis done on global warming. I published an op-ed piece in the *Wall Street Journal* to explain these issues, in plain English, this year."

Here is his quote basically concluding that no analysis of global warming is valid under his statistical standards: "Although the AR(1)-based model is known to be inadequate, no one knows what statistical model should be used. There have been various papers in the peer-reviewed literature that suggest possible resolutions, but so far no alternative model has found much acceptance."

What he is saying is that statistical methods are unable to be used to show that there is global warming or cooling or anything else. That is a very strong conclusion, and it reflects, in my mind, his exaggerated pedantry for statistical methods. He can and will criticize every paper published in the past and the future on the same grounds. We might as well give up in our attempts to evaluate global warming until we find a "model" that Keenan will approve -- but he offers no help in doing that.

In fact, a quick survey of his website shows that his list of publications consists almost exclusively of analysis that shows other papers are wrong. I strongly suspect that Keenan would have rejected any model we had used.

He gives some specific complaints. He quotes our paper, where we say, "We observe the opposite of an urban heating effect over the period 1950 to 2010, with a slope of -0.19 ± 0.19 °C/100yr. This is not statistically consistent with prior estimates, but it does verify that the effect is very small...."

He then complains,

If the two estimates are not consistent, then they contradict each other. In other words, at least one of them must be wrong. Hence one estimate cannot be used to “verify” an inference drawn from the other. This has nothing to do with statistics. It is logic.

He is misinterpreting our statement. Our conclusion is based on our analysis. We believe it is correct. The fact that it is inconsistent with prior estimates does imply that one is wrong. Of course, we think it is the prior estimates. We do not believe that the prior estimates were more than back-of-the-envelope "guestimates", and so there is no "statistical" contradiction.

He complains,

Considering the second paper, on Urban Heat Islands, the conclusion there is that there has been some urban *cooling*. That conclusion contradicts over a century of research as well as common experience. It is almost certainly incorrect. And if such an unexpected conclusion is correct, then every feasible effort should be made to show the reader that it must be correct.

He is drawing a strong conclusion for an effect that is only significant to one standard deviation! He never would have let us claim that -0.19 ± 0.19 °C/100yr indicates urban cooling. I am surprised that a statistician would argue that such a statistically insignificant effect indicates cooling.
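The arithmetic behind "significant to one standard deviation" can be checked in a few lines. This sketch is an added illustration, not part of the correspondence. It assumes that the quoted ± 0.19 is a one-sigma uncertainty and that the slope estimate is approximately normally distributed; neither assumption is stated explicitly in the emails.

```python
import math

slope = -0.19      # reported slope, in degrees C per century
std_error = 0.19   # assumption: the quoted "± 0.19" is one standard error

# Number of standard errors separating the estimate from zero.
z = slope / std_error

# Two-sided p-value for a normally distributed estimate: P(|Z| >= |z|).
p_value = math.erfc(abs(z) / math.sqrt(2))

# Approximate 95% interval under the same normality assumption.
interval = (slope - 1.96 * std_error, slope + 1.96 * std_error)
```

Under these assumptions `z` is -1 and `p_value` is about 0.32, and the 95% interval runs from roughly -0.56 to +0.18 °C per century, so it includes zero.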

Please be careful whom you share this email with. We are truly interested in winning over the other analysts in the field, and I worry that if they were to read portions of this email out of context that they might interpret it the wrong way.

Rich

From: D.J. Keenan

To: James Astill

Sent: 18 October 2011, 17:53

Subject: Re: BEST papers

James,

On the most crucial point, it seems that Rich and I are in agreement. Here is a quote from his reply.

Keenan goes on to say, "In statistical analyses, an inference is not drawn directly from data. Rather, a statistical model is fit to the data, and inferences are drawn from the model." I agree wholeheartedly!

And so the question is this: was the statistical model that was adopted for their analysis a reasonable choice? If not, then--since their conclusions are based upon that model--their conclusions must be unfounded.

In fact, the statistical model that they adopted has been rejected by essentially everyone. In particular, it has been rejected by both the IPCC and the CCSP, as cited in my previous message. I know of no work that presents argumentation to support their choice of model: they have just adopted the model without any attempt at justification, which is clearly wrong.

(It has been known for decades that the statistical model that they adopted should not be used. Although the statistical problems with the model were clear, for a long time, no one knew the physical reason. Then in 1976, Klaus Hasselmann published a paper that explained the reason. The paper is famous and has since been cited more than 1000 times.)

We could have a discussion about what statistical model should be adopted. It is certain, though, that the model BEST adopted should be rejected. Ergo, their conclusions are unfounded.

Regarding smoothing, the situation here requires only a little statistics to understand. Consider the example given by Matt Briggs at

Do NOT smooth time series before computing forecast skill

We take two series, each entirely random. We compute the correlation of the two series: that will tend to be around 0. Then we smooth each series, and we compute the correlation of the two smoothed series: that will tend to be greater than before. The more we smooth the two series, the greater the correlation. Yet we started out with purely random series. This is not a matter of opinion; it is factual. Yet the BEST work computes the correlation of smoothed series.
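The experiment described above is easy to try. The following Python sketch is an added illustration, not Briggs's or Keenan's own code: it draws pairs of independent random series, smooths both with a simple moving average, and averages the magnitude of the resulting correlation over many trials. The series length, window sizes, and trial count are arbitrary assumptions, and the sketch reports the absolute value of the correlation, since the sign of a spurious correlation is itself random.

```python
import random

def moving_average(x, window):
    """Simple moving average; the output has len(x) - window + 1 points."""
    total = sum(x[:window])
    out = [total / window]
    for i in range(window, len(x)):
        total += x[i] - x[i - window]
        out.append(total / window)
    return out

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def mean_abs_correlation(window, n=100, trials=300, seed=0):
    """Average |correlation| between two independent, smoothed noise series."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if window > 1:
            a, b = moving_average(a, window), moving_average(b, window)
        total += abs(pearson(a, b))
    return total / trials

raw = mean_abs_correlation(window=1)     # no smoothing
light = mean_abs_correlation(window=5)   # light smoothing
heavy = mean_abs_correlation(window=20)  # heavy smoothing
```

The typical correlation magnitude should grow from `raw` to `light` to `heavy`, even though every pair of series is independent by construction.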

The reply uses rhetorical techniques to avoid that, stating "Just because smoothing can increase the probability of our fooling ourselves doesn't mean that we did". The statement is true, but it does not rebut the above point.

Considering the UHI paper, my message included the following.

There are other, smaller, problems with your paper. In particular, the Discussion section states the following.

We observe the opposite of an urban heating effect over the period 1950 to 2010, with a slope of -0.19 ± 0.19 °C/100yr. This is not statistically consistent with prior estimates, but it does verify that the effect is very small....

If the two estimates are not consistent, then they contradict each other. In other words, at least one of them must be wrong. Hence one estimate cannot be used to “verify” an inference drawn from the other. This has nothing to do with statistics. It is logic.

The reply claims "The fact that [their paper's conclusion] is inconsistent with prior estimates does imply that one is wrong". The claim is obviously absurd.

The reply also criticizes me for "drawing a strong conclusion for an effect that is only significant to one standard deviation". I did not draw that conclusion; their paper suggested it, saying that the effect was "opposite in sign to that expected if the urban heat island effect was adding anomalous warming" and that "natural explanations might require some recent form of “urban cooling”", and then describing possible causes, such as "For example, if an asphalt surface is replaced by concrete, we might expect the solar absorption to decrease, leading to a net cooling effect".

Note that the reply does not address the alternative explanation that my message proposed for their UHI results. That explanation, which is based on the analysis of Roy Spencer (cited in my message), implies that we cannot draw any inferences about the size of UHI effects on the average temperature measurements, using the approach taken in their paper.

I had a quick look at their Methods paper. It affects none of my criticisms.

Rich also cites his book on the causes of the ice ages. Kindly read my op-ed piece in the *Wall Street Journal*, and especially consider the discussion of Figures 6 and 7. His book claims to analyze the data in Figure 6: the book's purpose is to propose a mechanism to explain why the similarity of the two lines is so weak. In fact, to understand the mechanism, it is only necessary to do a simple subtraction--as my piece explains. In short, the analysis in his book is extraordinarily incompetent--and it takes only an understanding of subtraction to see this.

The person who did the data analysis in that book is the person in charge of data analysis at BEST. The data analysis in the BEST papers would not pass in a third-year undergraduate course in statistical time series.

Lastly, a general comment on the surface temperature records might be appropriate. We have satellite records for the last few decades, and they closely agree with the surface records. We also have good evidence that the world was cooler 100-150 years ago than it is today. Primarily for those reasons, I think that the surface temperature records--from NASA, NOAA, Hadley/CRU, and now BEST--are probably roughly right.

Cheers, Doug

From: James Astill

To: D.J. Keenan

Sent: 18 October 2011, 17:57

Subject: Re: BEST papers

Dear Doug

Many thanks. Are you saying that, though you mistrust the BEST methodology to a great degree, you agree with their most important conclusion, re the surface temperature record?

best

James

James Astill

Energy & Environment Editor

From: D.J. Keenan

To: James Astill

Sent: 18 October 2011, 18:41

Subject: Re: BEST papers

James,

Yes, I agree that the BEST surface temperature record is very probably roughly right, at least over the last 120 years or so. This is for the general shape of their curve, not their estimates of uncertainties.

Cheers, Doug

From: D.J. Keenan

To: James Astill

Sent: 20 October 2011, 13:11

Subject: Re: BEST papers

James,

Someone just sent me the BEST press release, and asked for my comments on it. The press release begins with the following statement.

Global warming is real, according to a major study released today. Despite issues raised by climate change skeptics, the Berkeley Earth Surface Temperature study finds reliable evidence of a rise in the average world land temperature of approximately 1°C since the mid-1950s.

The second sentence may be true. The first sentence, however, is not implied by the second sentence, nor does it follow from the analyses in the research papers.

Demonstrating that "global warming is real" requires much more than demonstrating that average world land temperature rose by 1°C since the mid-1950s. As an illustration, the temperature in 2010 was higher than the temperature in 2009, but that on its own does not provide evidence for global warming: the increase in temperatures could obviously be due to random fluctuations. Similarly, the increase in temperatures since the mid 1950s could be due to random fluctuations.

In order to demonstrate that the increase in temperatures since the mid 1950s is not due to random fluctuations, it is necessary to do valid statistical analysis of the temperatures. The BEST team has not done such.
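Why the choice of statistical model matters for such a claim can be illustrated with a small simulation. The sketch below is an added illustration and does not reproduce BEST's analysis. It generates series that contain no trend at all, fits a straight line by ordinary least squares, and counts how often a test that assumes independent residuals declares the slope significant. The series length, the AR(1) coefficient `phi = 0.6`, and the threshold `|t| > 2` are all assumptions chosen for illustration.

```python
import random

def ar1_series(n, phi, rng):
    """Trend-free AR(1) noise: x[t] = phi * x[t-1] + e[t], e[t] ~ Normal(0, 1)."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out

def naive_trend_t(y):
    """t-statistic of the OLS slope, computed as if residuals were independent."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    stt = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y)) / stt
    intercept = y_mean - slope * t_mean
    rss = sum((v - intercept - slope * t) ** 2 for t, v in enumerate(y))
    std_error = (rss / (n - 2) / stt) ** 0.5
    return slope / std_error

def false_alarm_rate(phi, n=60, trials=2000, seed=0):
    """Fraction of trend-free series whose slope looks 'significant' (|t| > 2)."""
    rng = random.Random(seed)
    hits = sum(abs(naive_trend_t(ar1_series(n, phi, rng))) > 2.0
               for _ in range(trials))
    return hits / trials

rate_independent = false_alarm_rate(phi=0.0)  # noise model matches the test
rate_ar1 = false_alarm_rate(phi=0.6)          # noise is autocorrelated
```

When the noise really is independent, the false-alarm rate should sit near the nominal 5%. When the noise is autocorrelated but the test assumes independence, the rate should be several times higher, which is one concrete sense in which conclusions depend on the model adopted.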

I want to emphasize something. Suppose someone says "2+2=5". Then it is not merely my opinion that what they have said is wrong; rather, what they have said is wrong. Similarly, it is not merely my opinion that the BEST statistical analysis is seriously invalid; rather, the BEST statistical analysis is seriously invalid.

Cheers, Doug

From: James Astill

To: D.J. Keenan

Sent: 20 October 2011, 13:19

Subject: Re: BEST papers

Dear Doug

Many thanks for all your thoughts on this. It'll be interesting to see how the BEST papers fare in the review process. Please keep in touch.

best

james

James Astill

Energy & Environment Editor

A story about BEST was published in the October 22nd edition of *The Economist*. The story, authored by James Astill, makes no mention of the above points. It is subheaded “A new analysis of the temperature record leaves little room for the doubters. The world is warming”. Its opening sentence is “For those who question whether global warming is really happening, it is necessary to believe that the instrumental temperature record is wrong”.

## Reader Comments (79)

It was posted on Doug's website. I asked Doug for permission to reproduce it.

From soup to nuts, the BEST project has been an exercise in PR rather than science. SOP: tell the press what you are going to discover, then go off and discover it, then tell the press again what you have discovered.

Surely the point that Keenan is trying to make is that the linear models are expecting TSI to have a linear trend so the satellite data is adjusted to fit that trend.

In reality the Sun is cyclical, so its effects must be cyclical, which differs from what the models are expecting because they require a linear trend.

The data stipulate cyclical and the models stipulate linear, so what happens? The data is adjusted to fit the model requirements, instead of the models being adjusted to fit the cyclical data.

A quick search gives some detail:

The present state of a thermodynamic system, such as the climate, is evidently a function of the present and past values of the forcings, not of the future ones. Thus, it is evident that Lockwood and Frolich are "anticipating" what eventually might be happening in the future.

I find the UHI conclusions somewhat puzzling.

I am highly suspicious of heavy weight statistical processing of lousy data.

We do not know how well the majority of urban sites reflect the actual temperature around each site, i.e. the integral of the temperature over a radius of, say, 0.5 km, divided by the area.

Watts has done a partial survey of temperature stations, but the sampling is undoubtedly biased. Nevertheless, some of these stations are definitely mis-sited. There are no, as far as I know, reliable estimates of bias determined by actual measurement.

Normally, one would construct a hypothesis, design an experiment to test that hypothesis, gather data designed to test the hypothesis and then analyse it.

All that has been done is statistical analysis of data that has unknown errors.

Foxgoose

Anthony appears to agree that he was duped-

The rush to judgment they fomented before science had a chance to speak is worse than anything I’ve ever seen, and from my early dealings with them, I can say that I had no idea they would do this, otherwise I would not have embraced them so openly. A lie of omission is still a lie, and I feel that I was not given the true intentions of the BEST group when I met with them. http://wattsupwiththat.com/2011/10/21/best-what-i-agree-with-and-what-i-disagree-with-plus-a-call-for-additional-transparency-to-preven-pal-review/

I read Anthony's comment and he may be right; perhaps I was just in a charitable mood when I wrote my last comment.

But the only other explanation is that this was a set-up from the start and intended to undermine the credibility of someone who has been a thorn in the side by enlisting the aid of rank amateurs to go out and pick holes in the reliability of the US weather reporting system.

It's possible but I'd be a bit reluctant to go where that road leads.

I'm not given to conspiracy theories by nature - and a couple of years ago I would have agreed with you.

Now, though, there have been just too many pieces of evidence like this, all pointing in the same direction, and I fear we're all being taken down that road - reluctant or not

rc saumarez: "Normally, one would construct a hypothesis, design an experiment to test that hypothesis, gather data designed to test the hypothesis and then analyse it."

Might such an experiment be the erection of 20 or 30 properly constructed and sited thermometer stations near existing questionable stations, and then the collection and comparison of data from both sets?

On the face of it, one might think that such a comparison could be done with results sufficient for validation in a year or two. In the meantime, so much of this seems like arguing the number of teeth in a horse where the horse's mouth might be surveyed but isn't.

Oct 21, 2011 at 1:59 PM | rc saumarez writes "I do not agree with the comment about applying statistics to smoothed data"

The consequence of importance to me is that the calculations of error or uncertainty change (often to look better) as more averaging is used. In theory you can smooth harder and harder until you get a straight line with almost no error envelope. Surely the uncertainty in a time series should be calculated on the smallest available time increment, unfiltered, unsmoothed. I suspect that when you do this, you find that the last 150 years of global climate change is all within such error bounds, but I would like to be shown wrong.

BTW, I have a cleaned-up data set for Australia (Tmax, Tmin, daily) which has added factors such as population at census, distance of weather station from ocean, distance of station from nearest suburb, plus a number of selected "pristine" sites, many of which I am familiar with through having visited. This was made available to the BEST project. In time I shall do some reanalysis and see how they compare. If anyone wishes to rework the data, several hundred stations, please ask me for it. Note that Australia is unusual for the large number of "pristine sites" that have been that way since establishment. Thus, a correlation of trends at these sites with factors such as ocean proximity (which keeps cropping up), might provide a window of different opacity to the much-worked USA data. I'm also finding that negative to essentially zero temperature trends are fairly common in the overall data set.

I'm with BBD, the UHI effect is a non-issue. It is indisputable that there has been warming; whether this warming has been exaggerated by having weather stations go from rural to urban over a period of time won't make a jot of difference to the warmist position. Even if proved, they'd still go to their models, make the required adjustments and lo! more catastrophe on the way.

What we have is a religious movement and people will believe what they want to believe regardless of the facts in front of them. And before anyone tells me that scepticism is a religion, it lacks the first two necessary requirements of a religion. 1. Humans are basically evil; 2. Something rather nasty will happen to them if they don't change their ways to those prescribed by the religious leaders.

BBD. O/T I know, but I've been struggling to understand the feedback theories of climate science. Many years ago I studied control systems and two things were clear (the only two I remember). One is that any system prone to positive feedback is unstable. As described in the IPCC documents, the earth is extremely sensitive to slight changes in temperature (although given that we have had most of the 1C rise predicted to bring on the catastrophes, it doesn't appear to be that sensitive); if it is that sensitive, then the catastrophes would have happened before. Question: how did we recover from previous catastrophes? Do you know of any non-paywalled literature? Q2. As you're aware, positive feedback will result in the total failure of the system without some limiting factors. Are you aware what factors will limit the temperatures at their maximum stated level in AR4?

Foxgoose

You may well be right!

Over at WUWT, legatus argues that the BEST project (and what Watts calls a "rush to judgment") might be to do with Muller's business interests.

http://www.mullerandassociates.com/index.php certainly does not look as if "pure science" is high on the agenda!

Go here for the comment: http://tinyurl.com/65accu4. Comment is at 8.31pm Oct 21.

@ Geoff Sherrington.

What does smoothing mean? It means that you have applied a filter to the data. I would comment that I have never understood why people use an averaging "filter", because it has got a rotten frequency response. If you want to "smooth" data, you are making assumptions about which frequencies in the data are important and which are not. Faced with this, I would design a filter to emphasize the components that are of interest and reject those that are not.

Continuing the filter argument, you know which components have been attenuated, and hence you can predict the change in variance. You also influence the number of degrees of freedom in the same way. For example, if you take n points in a sequence and they are normally distributed, you can calculate the number of degrees of freedom from the correlation structure of the data so that if you want, say, to compare means, you use the correct number of DoFs. Filtering, provided you know the frequency response, will allow you to calculate the modified DoFs and also the variance of data as this is directly related to the power spectrum.
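As a hedged illustration of that argument, here is a small sketch assuming the input is white noise and the filter is a plain FIR filter with known coefficients `h` (the function names and the 12-point example are mine, chosen for illustration). It uses two standard results: the output variance is the input variance times the sum of squared coefficients, and the effective number of independent points for a mean follows from the autocorrelation the filter induces.

```python
def output_variance(h, sigma2=1.0):
    """Variance of white noise (variance sigma2) after FIR filtering with h."""
    return sigma2 * sum(c * c for c in h)

def autocorrelation(h, lag):
    """Autocorrelation of filtered white noise at the given lag."""
    energy = sum(c * c for c in h)
    return sum(h[j] * h[j + lag] for j in range(len(h) - lag)) / energy

def effective_n(h, n):
    """Effective number of independent points among n filtered samples,
    for the purpose of estimating a mean."""
    inflation = 1.0 + 2.0 * sum(
        (1.0 - k / n) * autocorrelation(h, k)
        for k in range(1, min(len(h), n))
    )
    return n / inflation

m = 12                      # hypothetical 12-point moving average
h = [1.0 / m] * m
n = 1200                    # number of filtered points being analysed

print("variance after filtering:", output_variance(h))
print("effective sample size:   ", effective_n(h, n))
```

For the moving average the variance drops to 1/m of its original value and the effective sample size comes out close to n/m, so a test that plugs in n as the number of degrees of freedom overstates the information in the filtered series by about a factor of m.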

There are tons of rather turgid mathematics devoted to these types of problems in statistical signal processing theory. The key is to form a hypothesis about the data, based on the question you are asking, and apply the correct signal processing procedure to the data. If the noise is Gaussian, the statistical calculations are relatively simple as the power spectrum is chi-squared, with 2DoF at each harmonic.

What I get from the paper is that UHI is a small if any part of the temperature trend, that the temperature trend correlates closely to natural cycles and that any correlation between increasing atmospheric CO2 and increasing temperature may well be entirely coincidental. Where I do not find any correlation is between the headline reporting of the paper and its conclusions. Notwithstanding DK's objections to methodology in analysis, the paper reads as a strong endorsement of the sceptical stance: anthropogenic effects on climate so far are insignificant and may well remain so.

The side story, that climate scientists for 30 years failed to understand that ice melts more when the sun shines more, is so absurd that I simply refuse to accept that such incompetence is possible.

Someone please tell me that Doug is grossly misrepresenting this.

Perfect - so the "reformed sceptic" who finally "proved global warming" (according to the world's media) was actually plugging his GreenGov consultancy!

I hope this gets the coverage it deserves.

Meanwhile, I think the final comment on the BEST farrago should be left to Mr Wagner, the late unlamented editor of Remote Sensing who said, in his resignation suicide note:-

> With this step I would also like to personally protest against how the authors and like-minded climate sceptics have much exaggerated the paper’s conclusions in public statements

What's good for the "warmist" goose is obviously not at all acceptable for the "sceptic" gander.

Bishop Hill,

Thank you for your reply:

> It was posted on Doug's website. I asked Doug for permission to reproduce it.

Do you know if Doug Keenan asked for permission to post this correspondence on his website?

Nothing is indicated in that regard on his site. Nor is there a way to contact him.

Could it be possible to contact him and have an answer to that question?

Many thanks!

bbd

science is not about "buying into" and "believing" etc

as long as there are no good arguments and proofs against UHI except pal reviewed articles, it's 100% open for discussion and analysis, and not a geography headmaster neglecting his pupils that is gonna change that.

willard

go and have a look in the bin outside in the shed if the papers and permissions are not there

they are at the bottom I believe.

please close the lid while you're looking.

my

some people really live in another age

tutu

beware...Willard is really sharp...all the CAGW team think he is the stoat's nuts. Do not mess with someone like him.

Willard sounds like a lawyer.

No comment is the correct answer.

Willard sounds curious. What's up with that? ;-)

willard's probably thinking a source has some obligations toward his journalist.

Having just spent a couple of hours looking over the four beastly BEST papers, I think that Matt Briggs is 110% correct. You do not smooth data and then plug it into the linear regression model, which is what BEST did. The data are fictional.

In simple English, statistics is based on looking at the error that surrounds your data -- called the noise, sampling error or a dozen other things -- and comparing it to the difference between two sets of data. One set is almost always experimental data such as the measured height of 1000 people. The other can be either experimental data -- for example the weight of 1000 people, or it can be calculated from a mathematical model -- for example the function y= f(x) +c where (x) is the age of the person.

You then compare the two sets of data mathematically, estimate the noise around it and see if the noise is more than enough to explain any differences (or lack of difference depending on your test). If the effect is less than the noise, then you have no effect -- it is noise.

When you "smooth" the data, you are merely artificially reducing the error or noise surrounding your data and spuriously increasing the "probability", because the mathematics used assumes that you are using real error and not fictional error. The whole idea is to compare your supposed data to the real error.
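A rough simulation of that effect, under assumptions of my own choosing: two series of independent Gaussian noise that are unrelated by construction, a 12-point moving average as the smoother, and a naive 5% significance test on the correlation that treats every point as independent. None of this reproduces the BEST calculations; it only illustrates the mechanism.

```python
import math
import random

def moving_average(xs, window):
    """Simple moving average; returns len(xs) - window + 1 points."""
    return [sum(xs[i:i + window]) / window
            for i in range(len(xs) - window + 1)]

def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def false_positive_rate(window, trials=400, n=200, seed=1):
    """Fraction of unrelated series pairs declared 'significantly' correlated
    by a test that assumes all n points are independent."""
    rng = random.Random(seed)
    threshold = 1.96 / math.sqrt(n)   # approximate 5% critical value for r
    hits = 0
    for _ in range(trials):
        a = moving_average([rng.gauss(0, 1) for _ in range(n + window - 1)], window)
        b = moving_average([rng.gauss(0, 1) for _ in range(n + window - 1)], window)
        if abs(pearson_r(a, b)) > threshold:
            hits += 1
    return hits / trials

raw_rate = false_positive_rate(window=1)        # no smoothing
smoothed_rate = false_positive_rate(window=12)  # 12-point moving average
print(f"raw: {raw_rate:.2f}   smoothed: {smoothed_rate:.2f}")
```

With no smoothing the false-alarm rate should sit near the nominal 5%. After smoothing it rises well above that, because the test is still using the unsmoothed count of points while the smoothed series contain far fewer independent values.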

From what I see just looking at the raw data of BEST, it is noise with a slight cyclic trend in it. It certainly does not look to have a linear trend in it at all. Matt does an excellent job in explaining all this in his two papers cited above.

This is such a basic mistake it is enough to make one cry. However, we will have "scientists" and wannabe statisticians doing this sort of crap forever, just as they were for the past 100 or so years.

Hockey puck is too polite a term to use in my view, but I suspect that the Bishop would merely snip what my true feelings are if I were to express them.

rc saumarez

CORRECT! You clearly understand what you are doing. You also clearly understand the difference between smoothing and filtering.

What they did in BEST is brain dead smoothing -- that is running average. And I agree with your earlier remark:

> Just try to explain Nyquist–Shannon sampling theorem to them.

@Don Pablo

I have tried. See my recent post on Judith Curry's blog: "Does the aliasing beast feed the uncertainty Monster?" I didn't go into statistics much except for a little simulation.

I am coming round to the point of view that the statistical methods in BEST may be wrong in DSP terms. It needs a lot of thought though.

rc saumarez

Consider this -- you are dealing with a continuous waveform that pretty much runs "forever"; it is sinusoidal (certainly as far as diurnal, seasonal, and longer period cyclic effects are concerned). Plotted, the temperature curve over time is a very noisy signal but you can make out all the effects listed above. So why would DSP be wrong?

I caught your post on Judith Curry's blog a few days ago and thought that it was to the point but it clearly went COMPLETELY over the head of all the self-appointed experts, including Judith, who haven't a clue, and that is the reason why I made my comment above about trying to explain the Nyquist–Shannon sampling theorem.

Reading the interchange between D.J. Keenan and others made me wonder where the f### they learned statistics? Clearly they do not understand the fundamental basics. All they know how to do is how to plug numbers into a computer program.

Post modern statistics at its finest.

As I have commented many times before, all they have to do is run the data through a polynomial regression -- which you can do on Excel if you don't have the money to buy a proper program -- to see how easily it would be fitted. Meaningless to be sure, but a damn fine fit.
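A sketch of that exercise in plain Python rather than Excel. The input here is a simulated random walk, which is my stand-in for a noisy series with no deterministic trend built into it (it is not the BEST data), and the `polyfit` and `r_squared` helpers are written out for illustration so the example needs no extra libraries.

```python
import random

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first),
    solved from the normal equations by Gaussian elimination."""
    m = degree + 1
    # Augmented matrix [X^T X | X^T y] for the monomial basis 1, x, x^2, ...
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))  # partial pivoting
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m + 1):
                A[r][c] -= f * A[col][c]
    coeffs = [0.0] * m
    for i in reversed(range(m)):                                  # back substitution
        s = A[i][m] - sum(A[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = s / A[i][i]
    return coeffs

def r_squared(xs, ys, coeffs):
    """Fraction of the variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    ss_res = sum((y - sum(c * x ** k for k, c in enumerate(coeffs))) ** 2
                 for x, y in zip(xs, ys))
    return 1.0 - ss_res / ss_tot

rng = random.Random(2)                       # fixed seed, arbitrary
n = 150
walk, level = [], 0.0
for _ in range(n):
    level += rng.gauss(0.0, 1.0)             # accumulate independent steps
    walk.append(level)
xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]   # rescale time to [-1, 1]

r2 = {d: r_squared(xs, walk, polyfit(xs, walk, d)) for d in (1, 3, 5, 8)}
for d, value in r2.items():
    print(f"degree {d}: R^2 = {value:.3f}")
```

Because each higher-degree polynomial contains the lower-degree ones as special cases, the fit can only improve as the degree goes up, whether or not the polynomial means anything physically.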

Judith is a (old) woman..what more can I say ?

The problem with combinatorial maths, probability and stats is that in most university/polytech curriculums it is squeezed into a 30h course or something.

This, when it is one of the most fundamental canons in the sciences.

Human thinking (I know, I know, this is o/t in a climate debate) is best modelled with probabilistic graph theory and critical phenomena; Paul Erdős, the most prolific mathematician of the 2nd half of the 20th century, spent a lot of time on that.

Shannon–Nyquist: if you are tired of that theorem, you can improve on it by sampling on the go. When you go over rough terrain you sample faster (zack-zack-zack-zack); when you go over smooth terrain you sample slower (zack....zack). This way you save on the zacks. You just need a little diary with you to jot down when you sampled at which rate. This way you can outdo Shannon, but don't tell anyone; I am patenting this and hitting it BIG with this, soon.

Bishop Hill just sent me an email telling me he forwarded my question to Doug Keenan.

Many thanks!
