Monday, Apr 22, 2013

SkS quietly withdraws allegation

Last week I ribbed Dana Nuccitelli and Gavin Schmidt over the former's comparing the mean of the Aldrin paper to the mode of Lewis's. Here's the quote:

One significant issue in Lewis' paper (in his abstract, in fact) is that in trying to show that his result is not an outlier, he claims that Aldrin et al. (2012) arrived at the same most likely [i.e. the mode] climate sensitivity estimate of 1.6°C, calling his result "identical to those from Aldrin et al. (2012)."  However, this is simply a misrepresentation of their paper.

The authors of Aldrin et al. report a climate sensitivity value of 2.0°C [per the paper, the mean] under certain assumptions that they caution are not directly comparable to climate model-based estimates. When Aldrin et al. include a term for the influences of indirect aerosols and clouds, which they consider to be a more appropriate comparison to estimates such as the IPCC's model-based estimate of ~3°C, they report a sensitivity that increases up to 3.3°C. Their reported value is thus in good agreement with the full body of evidence as detailed in the IPCC report.

I was somewhat taken aback when Nuccitelli subsequently denied having done this:

Me: @dana1981 And you can't really duck the fact that you compared mean to mode. @ClimateOfGavin @wattsupwiththat

Nuccitelli: @aDissentient You have a strange definition of the word "fact", but that's not news.

Me: @dana1981 You are denying comparing mean to mode?

Nuccitelli: @aDissentient Sure. While we're at it, I'm also denying that the moon is made of cheese.

In the comments, Tom Curtis remonstrated about Nuccitelli accusing Lewis of misrepresenting the match between his PDF and Aldrin's:

Dana correctly describes Lewis as claiming that the mode (most likely climate sensitivity) of his result is identical to the mode of Aldrin et al, but then incorrectly calls that claim a simple misrepresentation.  It is not a misrepresentation.  The modes of the two studies are identical to the first decimal point. 

Now it has all changed. Look at the Skeptical Science page again (bold emphasis added):

One significant issue in Lewis' paper (in his abstract, in fact) is that in trying to show that his result is not an outlier, he claims that Aldrin et al. (2012) arrived at the same most likely climate sensitivity estimate of 1.6°C, calling his result "identical to those from Aldrin et al. (2012)."  However, this is not an accurate of their paper.

The authors of Aldrin et al. report a mean climate sensitivity value of 2.0°C under certain assumptions that they caution are not directly comparable to climate model-based estimates. When Aldrin et al. include a term for the influences of indirect aerosols and clouds, which they consider to be a more appropriate comparison to estimates such as the IPCC's model-based estimate of ~3°C, they report a sensitivity that increases up to 3.3°C. Their reported value is thus in good agreement with the full body of evidence as detailed in the IPCC report.

This seems to be a result for Tom Curtis. However, he then goes on to make a very strange point:

[Lewis's claim] is...misleading in that it is an apples and oranges comparison.  Given that other studies report the mean, in comparing with other studies the mean should be reported, or it should be made absolutely clear that not only are you reporting the mode, but that the authors you are reporting on reported the mean.

The idea that comparing mode to mode is "apples to oranges" is pretty strange. To say it is "misleading" is again absolutely extraordinary when one notes that the IPCC doesn't consider means either: it reports medians and modes. This is only natural when considering skewed distributions, since the mean is strongly influenced by outliers.

The other reason for using the mode is that it is largely unaffected by choice of prior, so by using it one can better understand what the Lewis paper means, namely that the Lewis and Aldrin approaches give the same best estimate of climate sensitivity, but the adoption of the objective Bayesian approach gives a more constrained estimate.


Reader Comments (53)

When in a hole stop digging ;)

Will they ever learn, I hope not !!!!

Apr 22, 2013 at 10:30 AM | Registered Commenter Breath of Fresh Air

Well done for plugging away at these matters. Sensitivity has a central role in the IPCC framework and argument. Although use of mode rather than mean may seem a small detail it isn't. As we focus in on such things it's getting harder to paint sceptics as ignorant bigots - largely because of Nic's excellent work.

Apr 22, 2013 at 10:38 AM | Registered Commenter Richard Drake

Speaking of SkS, John Cook in the Conversation has today provided a graph of a hockey stick extraordinaire. Whilst lamenting the increasing public apathy in Climate Science, he gives us proof of the astonishing increase in the number of Climate Science papers.

I commented, probably incorrectly, that I wondered if it had anything to do with the increase in funding.

Apr 22, 2013 at 11:08 AM | Registered Commenter GrantB

GrantB,

I guess his last point is kind of correct, of course all the evidence is going to pile up to support their point of view...but only because anything and everything that contradicts their religious beliefs is suppressed.

Regards

Mailman

Apr 22, 2013 at 11:32 AM | Unregistered Commenter Mailman

It is amazing to witness the intellectual heights of a climate science discussion. As weather, and its statistical construct over a period of time, climate, are driven by (partly) known physical processes of which CO2 constitutes only a tiny amount and hence influence, we had better put our energy into understanding such processes than into throwing sensitivity values around to impress each other.

Apr 22, 2013 at 11:52 AM | Unregistered Commenter oebele bruinsma

Anyone ask Dr. Nuccitelli how that cheese tastes?

This conversation reminded me of Anscombe's Quartet, which reminds you to always look graphically at the data before assigning value to analysis.

Apr 22, 2013 at 11:56 AM | Unregistered Commenter Bob MacInnes

John Cook's claim of '97% consensus' is fraudulent.

Apr 22, 2013 at 12:05 PM | Registered Commenter shub

I've examined what Tom Curtis says at other junctures. Just as here, he makes careless, unsubstantiated statements. These are mixed in with more well-founded ones.

This is primarily due to belief along the lines of 'Well, Dana1981 may be wrong in this one point, but he is right in the larger scheme'.

Apr 22, 2013 at 12:22 PM | Registered Commenter shub

Richard Drake +1

Apr 22, 2013 at 12:26 PM | Unregistered Commenter not banned yet

The warmista rewriting history - never!

Apr 22, 2013 at 12:31 PM | Unregistered Commenter ConfusedPhoton

Being Green means never having to say you're sorry, and never admitting you are wrong.

Apr 22, 2013 at 12:34 PM | Unregistered Commenter Rick Bradford

The Truro lass appears to have dyslexia. Or is it an incredibly brilliant ploy?

Apr 22, 2013 at 12:39 PM | Registered Commenter GrantB

Indeed comparing mean values makes no sense in this context. If you don't like the mode (like me, actually), then compare medians. The median is both robust and meaningful.

Apr 22, 2013 at 1:22 PM | Unregistered Commenter Cees de Valk

Cees de Valk -
While I would agree that the median is generally a better guide to central tendency than mode, in this case I would differ. The mode is far less sensitive to the choice of prior than median. The mean, as you say, is not an appropriate metric in this context.

Apr 22, 2013 at 1:56 PM | Registered Commenter HaroldW

eeny meany miny mode

Dana can't man up and admit his mistake ...just like Mann.

Will he sport a Van Dyke next?

Apr 22, 2013 at 2:07 PM | Unregistered Commenter Anthony Watts

Can someone explain the terms "median" and "mode" for me, and say why they're useful. (I understand "mean", the arithmetic average). For example, what is the mode and what the median of the following numbers whose mean is 27?

18, 24, 24, 25, 29, 33, 36.

Apr 22, 2013 at 2:13 PM | Unregistered Commenter simon abingdon

Simon,

Your example comprises only 7 discrete values and is not a good comparison with the continuous PDF which is the subject of the discussion. Notwithstanding this, the median of your numbers is 25 (because it is the middle value) and the mode would be 24 (the most frequently occurring).

Note also that, for example, making the first value much smaller or the last value much larger would change the mean but would not affect either the mode or the median.

Apr 22, 2013 at 2:20 PM | Registered Commenter thinkingscientist
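
For readers who want to check the arithmetic in the exchange above, here is a minimal sketch using Python's standard `statistics` module. The seven values are the ones from the question; the stretched list (largest value changed to 360) is a hypothetical variation added purely to illustrate the point about extreme values.

```python
import statistics

# The seven values from the question above.
values = [18, 24, 24, 25, 29, 33, 36]

mean = statistics.mean(values)      # arithmetic average
median = statistics.median(values)  # middle value of the sorted list
mode = statistics.mode(values)      # most frequently occurring value
print("original: ", mean, median, mode)

# Hypothetical variation: make the last value ten times larger.
stretched = values[:-1] + [360]
stretched_mean = statistics.mean(stretched)
stretched_median = statistics.median(stretched)
stretched_mode = statistics.mode(stretched)
print("stretched:", stretched_mean, stretched_median, stretched_mode)
```

The mean moves a long way when one value is stretched, while the median and mode stay where they were.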

Since Skeptical Science mentions another (second, Norwegian) study several times as matching Nic Lewis's study, would this not make Nic Lewis's study an example of second study syndrome rather than single study syndrome?
(Taken from "Climate Sensitivity Single Study Syndrome, Nic Lewis Edition".)

Apr 22, 2013 at 2:46 PM | Unregistered Commenter angech

angech, no it wouldn't. This is clearly the one and only "Climate Sensitivity Single Study Syndrome, Nic Lewis Edition"

Apr 22, 2013 at 2:52 PM | Registered Commenter steve ta

A perfect example of intellectual dishonesty. See Curry's blog on the topic.

Apr 22, 2013 at 2:54 PM | Unregistered Commenter JPC Lindstrom

@thinking scientist

Thanks. I was rather hoping for an explanation in words of what sort of an average the terms "mode" and "median" suggest. If my journey time to work takes 30 minutes typically, but sometimes as little as 25 minutes and occasionally as much as 60 I can see that I might not base my day to day expectation on the mean. But would I choose the mode or the median? Maybe in my journey time example they might give identical results if measured in minutes but if I timed my journeys to the second I'd expect few identical readings and the idea of the most frequently occurring value would be inappropriate, making the concept of the mode misleading.

So what is it that makes the mode or the median relevant in a particular application? What are the essential characteristics of what is being studied that guide the statistician?

Apr 22, 2013 at 3:15 PM | Unregistered Commenter simon abingdon

Simon Abingdon,

The median is the journey time you would expect on a "typical" day, and works extremely well with things like journey times timed to the second. If you make a large number of journeys then half of them will be shorter than the median time, and half will be longer. (For small numbers of journeys this gets messed up slightly by journeys that take exactly the median time, but for large numbers of journey times recorded to silly precision this doesn't matter, and even for small numbers of journeys the median is perfectly well defined and still has the "typical" interpretation).

The mode is "the most probable" journey time (the time which is more probable than any other) and only really makes sense if either you have a large number of journey times which you have binned (say to the nearest minute or nearest five minutes) or if you have replaced the discrete data by some smoothly varying model.

Modes are messy in general, and really only make sense if the distribution is well behaved (smooth, single peaked, etc.); personally I prefer medians for most purposes.

Apr 22, 2013 at 3:32 PM | Registered Commenter Jonathan Jones

"However, this is not an accurate of their paper"

An accurate what? I assume that's not the Bish's typo...

Apr 22, 2013 at 3:56 PM | Unregistered Commenter James P

Nuccitelli has clearly taken lessons from the master of deception.
Michael E. Mann.

Apr 22, 2013 at 4:10 PM | Unregistered Commenter Don Keiller

James P, a single click of the SkS link provided could have confirmed that this isn't a Bish error. Just too lazy?

Apr 22, 2013 at 4:10 PM | Registered Commenter steve ta

For shub: John Cook is fraudulent!

Apr 22, 2013 at 4:12 PM | Unregistered Commenter stan stendera

@James P

"However, this is not an accurate of their paper"

Presumably they meant to alter "this is a misrepresentation ..." to "this is not an accurate representation ...", but their proofreading skills appear to be the equal of their science skills.

Apr 22, 2013 at 4:25 PM | Unregistered Commenter Turning Tide

Most people find themselves needing to say "I made a mistake" from time to time. But it does not seem to be in the vocabulary of these guys.

___________________________________________________________________________________________

Basil Fawlty: "What am I going to say to them?"

Mrs Fawlty: "Just say 'I am sorry, I made a mistake' ."

~~~~~~~~~

Basil Fawlty: "I am sorry, my wife made a mistake."

Apr 22, 2013 at 4:28 PM | Registered Commenter Martin A

Stan, I can't decide if John Cook is more scary or creepy.

Apr 22, 2013 at 4:29 PM | Registered Commenter steve ta

Simon,

I share your confusion. With discrete values, the mean, mode and median are clear. With a continuous distribution, the mean and median are still clear, but the mode becomes murky in my mind. I think in this context, the meaning of the mode is merely the peak of the pdf.

However, using the phrase "most likely" strikes me as very odd since any specific value of a continuous distribution occurs with probability zero, which does not strike me as very likely at all! Again, I think it is all just a way of identifying the peak of the density.

James

Apr 22, 2013 at 4:36 PM | Unregistered Commenter James

Jonathan, thank you for your explanation.

I'm now pondering "If you make a large number of journeys then half of them will be shorter than the median time, and half will be longer". Maybe this is just the definition of median, but on the other hand the journey time I expect on a "typical" day can be slightly lessened if circumstances turn out to be particularly favourable (not very often) or considerably worsened (which in comparison seems quite often) if things go wrong. Perhaps it's just that my expected "typical" journey time is too optimistic from the outset.

But I do find it interesting, having started with the definition "The median is the journey time you would expect on a 'typical' day" that you then draw that 50/50 inference which I am pondering. With luck it will soon dawn on me why this must be so.

Apr 22, 2013 at 4:39 PM | Unregistered Commenter simon abingdon

Simon, what's the average wage of all the people in the bar of your local pub? Probably close to the national average wage, whether mean, median, or mode.

And what happens to that average if Bill Gates drops in for a drink?

The median and mode remain about the same - the mean goes off the scale.

Apr 22, 2013 at 4:48 PM | Registered Commenter steve ta

Simon abingdon: I suggest you make a note of all your journey times over a year, put them into, say, 2 minute bins and then enter the number in each bin and plot them in Excel (you aren't Phil Jones coming here on the sly for a bit of knowledge, are you?). If you mark the mode, mean and median, I'm sure it will then all become clear to you and you will understand the worth of each metric. It will be useful because you will know all sorts of useful facts, such as how long you will need to allow so that you have a 50% or 95% or 99% probability of arriving at work on time (assuming you have a large enough sample and circumstances don't change).

Apr 22, 2013 at 5:08 PM | Registered Commenter Phillip Bratby
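
The binning exercise suggested above can also be sketched in a few lines of Python rather than a spreadsheet. Everything about the data here is an assumption made for illustration: the journey times are simulated (a 20 minute floor plus a log-normal delay, with a fixed seed), not real measurements. Only the 2 minute bin width and the 50%, 95% and 99% levels come from the comment.

```python
import random
import statistics
from collections import Counter

# Simulated journey times in minutes (purely illustrative, not real data):
# a 20 minute minimum plus a right-skewed random delay.
rng = random.Random(0)
times = [20 + rng.lognormvariate(2.3, 0.5) for _ in range(250)]

# Times recorded to silly precision rarely repeat, so the mode is only
# meaningful after binning. Each key is the start of a 2 minute bin.
BIN_WIDTH = 2
bins = Counter(int(t // BIN_WIDTH) * BIN_WIDTH for t in times)
modal_bin, modal_count = bins.most_common(1)[0]

# quantiles(..., n=100) returns the 99 percentile cut points.
percentiles = statistics.quantiles(times, n=100)
p50, p95, p99 = percentiles[49], percentiles[94], percentiles[98]

print(f"modal bin: {modal_bin}-{modal_bin + BIN_WIDTH} min ({modal_count} journeys)")
print(f"mean:      {statistics.mean(times):.1f} min")
print(f"allow {p50:.1f} min for a 50% chance of arriving on time")
print(f"allow {p95:.1f} min for 95%, {p99:.1f} min for 99%")
```

The 50% figure is the median by another name; the 95% and 99% figures show how far the long tail of bad days stretches beyond it.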

"But I do find it interesting, having started with the definition "The median is the journey time you would expect on a 'typical' day" that you then draw that 50/50 inference which I am pondering. With luck it will soon dawn on me why this must be so."

The Median literally is the number in the middle of your list of values. It is a very good estimate for your 'typical day'. I would say the mode is the best value for your typical day, but as pointed out you have to define the list of values properly before it makes any sense. If you recorded your journey time to the microsecond then every single time would be unique and you wouldn't have a mode.

Apr 22, 2013 at 5:09 PM | Unregistered Commenter Rob Burton

Simon Abingdon,

By a "typical day" I just mean something like take all the days you had over a year, order them from best to worst, and then a typical day is the one in the middle. Typical here simply means that half the days are better and half the days are worse: it's not a good day or a bad day but a typical day. Up to a few technical details this is precisely the definition of the median.

You might, however, choose a different definition, which I will call an "ordinary" day, which is the sort of day that happens more often than any other sort of day. That's the mode.

Of course in general a typical day need not be an ordinary day: you seem to be suggesting that on an ordinary day nothing special happens to you, but when something special happens it is more likely to go wrong than go right. In such a case one might expect the typical day (median) to be worse than the ordinary day (mode).

As you also point out, when something goes wrong it can go very badly wrong, while there are limits on how well things can go (your journey time home is probably lower bounded by speed limits, or at least the maximum speed of your car, or at very least the speed of light, but the upper bound is essentially indefinite if you get stuck in a snowdrift for example). This is a very common situation, and in such cases the average day (mean) is worse than the typical day (median).

Which definition is most useful depends very much on what you are trying to achieve by summarising your distribution of travel experiences. No one number can capture the whole range of life. On the whole I prefer the typical day (median) or the average day (mean), but it's very much horses for courses. One popular approach is to quote the median and mean, with the ratio between these providing a handy summary of how "skewed" your life is.

Apr 22, 2013 at 5:16 PM | Registered Commenter Jonathan Jones

Jonathan "only really makes sense if either you have a large number of journey times which you have binned".

Ah, a technical term. At first I thought you were talking about the wpb.

Apr 22, 2013 at 5:24 PM | Unregistered Commenter simon abingdon

Speaking as a geologist who analyzes grain sizes: we sometimes run into situations where we have two grain sizes that show up more than others. For example, if a grain size analysis shows that we have a large number of samples that occur at 2mm and also at 0.5mm, we would refer to that as a sample with a bi-modal distribution. Thinkingscientist is right. The mode is the value that shows up the most times in a given population of values, be they commute times, grain sizes, whatever. Let's say in this case the modal value is 32 min. The median is simply the value of the sample at the mid-point of your sample population. Let's say you kept track of your commute times for 31 days. The median is the value of the commute time on the 16th day. Let's say: 35 min. Now, if during that 31 day period your fastest commute time was 25 min and your longest was 60 min, the mean value would be 42.5 min. To answer your question about commuting times, I would use the modal value of 32 min because that time occurs the most often in your commute. The reason median and mode are used is that they are unaffected by extreme values of the population, as steveta points out.

Apr 22, 2013 at 5:42 PM | Unregistered Commenter Gilbert K. Arnold
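
One caution on the figures in the comment above: 42.5 min is the midpoint between the fastest (25 min) and slowest (60 min) commutes, which is the midrange rather than the mean. The mean depends on all 31 values. A minimal sketch, using an invented set of 31 commute times chosen only so that the mode is 32, the median is 35, and the extremes are 25 and 60:

```python
import statistics

# Hypothetical 31 commute times in minutes, invented to match the
# comment's figures: mode 32, median 35, fastest 25, slowest 60.
commutes = (
    [25] + [28] * 2 + [30] * 3 + [32] * 6 + [33] * 2 + [34]
    + [35] * 3 + [36] * 3 + [38] * 3 + [40] * 2 + [45] * 2
    + [50, 55, 60]
)

commute_mode = statistics.mode(commutes)
commute_median = statistics.median(commutes)  # the 16th of 31 sorted values
commute_mean = statistics.mean(commutes)
midrange = (min(commutes) + max(commutes)) / 2

print("mode:    ", commute_mode)
print("median:  ", commute_median)
print("mean:    ", round(commute_mean, 1))
print("midrange:", midrange)
```

With this particular invented data the mean comes out well below the midrange, because most journeys cluster near the fast end and only a few are slow.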

Jonathan, please disregard my frivolous 5:24 which I sent before I'd read your follow-up 5:16 which I now much appreciate.

Your answer to my journey time question, making the difference between median and mode equivalent to the subtle difference between a typical day and an ordinary day is I think a very persuasive and informative analogy. Thank you again .

Apr 22, 2013 at 5:42 PM | Unregistered Commenter simon abingdon

Simon and the others here: Check any elementary statistics text book. The median is the middle number of a set of ordered numbers (smallest to largest). See here: http://gwydir.demon.co.uk/jo/numbers/pictogram/box.htm

Apr 22, 2013 at 6:04 PM | Unregistered Commenter Gilbert K. Arnold

I did a webcite query to see what I'd get. Here is the earliest version of that page on April 18, at 05:42: http://www.webcitation.org/query?id=1366321331471189&date=%400&fromform=1

That's interesting because I read that page AFTER that time and date and it didn't look like that (it had the mode and mean comparison). Even the "However, this is not an accurate of their paper" error is correct in that earliest version.

I don't know the ins and outs of webcite or Wayback or whatever but that seems a little strange.

Apr 22, 2013 at 6:25 PM | Unregistered Commenter MikeC

Gilbert
A quick clarification, re your commute times example. If we record commute times for a month and arrange the times in order, the median would be the mid-point of the series.

The median itself may occur on any given day of the month, including the mid-month day (the 16th) (!) ;)

Apr 22, 2013 at 6:30 PM | Registered Commenter shub

Re steveta

And what happens to that average if Bill Gates drops in for a drink?

Hockey Sticks again.

Apr 22, 2013 at 7:00 PM | Unregistered Commenter Atomic Hairdryer

However, using the phrase "most likely" strikes me as very odd since any specific value of a continuous distribution occurs with probability zero, which does not strike me as very likely at all! Again, I think it is all just a way of identifying the peak of the density.

James
Apr 22, 2013 at 4:36 PM James

Well, although I've messed around with various aspects of probability theory, I've never had reason to use the mode, so take what I say with appropriate sanity warnings.

My understanding is that the mode, m, is the value at which the probability density function has its maximum value (assuming this is a unique value). So, as you say, the probability of getting precisely this value is zero (assuming that the probability density function contains no delta functions).

The random variable is more likely to fall in the range m ± D than any other range of extent 2D (where D is a small value).

So I can see that it has the attraction of answering the question "What value are we most likely to get?" with the answer "it's more likely to be close to m than close to any other value".

I can see that some listeners could then mistakenly think "Ah, they said it was going to be pretty close to m".

[corrections of any misconceptions on my part welcomed]

Apr 22, 2013 at 7:27 PM | Registered Commenter Martin A
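
The "m ± D" reading of the mode described above can be checked numerically. The sketch below assumes a log-normal distribution with parameters chosen arbitrarily (mu = 0, sigma = 0.5) as a stand-in for a skewed, single-peaked PDF. It is not the distribution from either paper, and it compares only three candidate centres (mode, median, mean) rather than every possible interval.

```python
import math

# Assumed log-normal parameters, chosen only for illustration.
MU, SIGMA = 0.0, 0.5


def lognormal_cdf(x):
    """Probability that the log-normal variable is at most x."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - MU) / (SIGMA * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))


def prob_within(centre, d):
    """Probability of landing in the interval centre - d to centre + d."""
    return lognormal_cdf(centre + d) - lognormal_cdf(centre - d)


# Closed-form summaries of a log-normal distribution.
mode = math.exp(MU - SIGMA ** 2)
median = math.exp(MU)
mean = math.exp(MU + SIGMA ** 2 / 2)

D = 0.05  # small half-width
p_mode = prob_within(mode, D)
p_median = prob_within(median, D)
p_mean = prob_within(mean, D)

print(f"mode   {mode:.3f}: P(within ±{D}) = {p_mode:.4f}")
print(f"median {median:.3f}: P(within ±{D}) = {p_median:.4f}")
print(f"mean   {mean:.3f}: P(within ±{D}) = {p_mean:.4f}")
```

For this right-skewed example the three summaries are ordered mode, median, mean, and a narrow window around the mode captures more probability than the same window around either of the others, even though the probability of hitting any exact value is zero.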

@ shub: I stand corrected. I stated the problem wrong. I forgot that the series was an "ordered series" The median actually does not have a "particular" day of the month associated with it. It is simply the mid-point of the series; as you so correctly state.

Apr 22, 2013 at 7:53 PM | Unregistered Commenter Gilbert K. Arnold

Suppose you have a set of voters with preferences arranged along a single axis, and two political parties occupying the two ends, such that anyone to the left of a certain point prefers party A and anyone to the right of that point prefers party B. Who wins the election?

If the split is to the right of the median voter, then more than half the voters favour party A, as does the median voter. If the split is to the left of the median voter, then more than half the voters favour party B, including the median voter. Therefore whichever party the median voter favours will win the election. That's why politicians fight over the middle ground, and why their policies are almost identical.

The median voter may be thought of as a sort of absolute dictator - they control every decision in a two-party system, and they determine the policies of both parties. The median-man-on-the-street (or woman) is a very important person... :-)

Apr 22, 2013 at 8:37 PM | Unregistered Commenter Nullius in Verba
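
The argument above is easy to try out on a toy electorate. In this sketch the voter positions and split points are invented, the function names are hypothetical, and the number of voters is odd so that there is a single median voter and no tied election.

```python
import statistics


def election_winner(positions, split):
    """Voters left of the split back party A; everyone else backs party B."""
    votes_a = sum(1 for p in positions if p < split)
    votes_b = len(positions) - votes_a
    return "A" if votes_a > votes_b else "B"


def median_voter_choice(positions, split):
    """The party preferred by the voter in the middle of the axis."""
    return "A" if statistics.median(positions) < split else "B"


# Invented positions on a single left-right axis (nine voters).
voters = [0.05, 0.12, 0.30, 0.41, 0.48, 0.55, 0.70, 0.83, 0.97]

results = []
for split in (0.2, 0.45, 0.5, 0.6, 0.9):
    winner = election_winner(voters, split)
    choice = median_voter_choice(voters, split)
    results.append((split, winner, choice))
    print(f"split at {split}: winner {winner}, median voter chose {choice}")
```

Wherever the split falls, the winning party is the one the median voter chose, which is the point being made about the middle ground.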

People interested in the concepts Nullius is outlining might want to take a look at Hotelling's law and the Median voter theorem.

Apr 22, 2013 at 10:18 PM | Registered Commenter Jonathan Jones

NiV:
"The median voter may be thought of as a sort of absolute dictator - they control every decision in a two-party system, and they determine the policies of both parties. The median-man-on-the-street (or woman) is a very important person... :-)"

That reminds me of an Alan Ramsey (journo/columnist for the Sydney Morning Herald) article on 'swinging voters' from way back. I couldn't find the whole piece on the web but I could locate this bit:

Back in the 1970s, when qualitative voter research was just starting to get a stranglehold on the political process in this country, the Labor Party … circulated a confidential paper entitled “Some Respectful but Blunt Suggestions on TV Appearances by Labor Spokespersons”. It offered a series of political ground rules, all of them designed to influence what it saw as the 15 per cent of swinging voters “who show any willingness to change,” including that such voters were “basically ignorant and indifferent” to politics and voted “on instinct for superficial, ill-informed and generally selfish reasons”. The only communication pattern that offered “the slightest chance of success” in reaching these voters was “the minimum number of thoughts repeated the maximum number of times, and when you are sick of saying it, they’ll be just starting to notice it”. While no politician would openly acknowledge such rank cynicism, millions of dollars are now spent every election by all parties on the very premise that these are exactly the voters who determine electoral outcomes. It is also exactly the sort of disciplined if mindless attitude of all parliamentary behaviour these days.

Apr 22, 2013 at 10:47 PM | Unregistered Commenter sHx

"the minimum number of thoughts repeated the maximum number of times"

How true. And don't mention uncertainties..

Apr 23, 2013 at 9:07 AM | Registered Commenter jamesp

Here's a real nail in the coffin of AGW and sea level rise
http://people.duke.edu/~ns2002/pdf/10.1007_s00382-013-1771-3.pdf
(H/T to Tallbloke)

Maybe a real climate (pun intended) psientist can explain why this paper is flawed/doesn't matter?

Apr 23, 2013 at 10:00 AM | Unregistered Commenter Don Keiller

"the minimum number of thoughts repeated the maximum number of times"
sounds like a variation of Goebbels' notorious maxim:

“If you tell a lie big enough and keep repeating it, people will eventually come to believe it. The lie can be maintained only for such time as the State can shield the people from the political, economic and/or military consequences of the lie. It thus becomes vitally important for the State to use all of its powers to repress dissent, for the truth is the mortal enemy of the lie, and thus by extension, the truth is the greatest enemy of the State.”

For "state" substitute "advocates of climate change"

Apr 23, 2013 at 10:14 AM | Unregistered Commenter Don Keiller
