- Bishop Hill blog - Some oddities in HadSST


Bishop Hill

Some oddities in HadSST

Mar 25, 2016

Climate: Oceans

Climate: Surface

Reader John McLean emails with details of some surprising finds he has made in the Hadley Centre's sea-surface temperature record, HadSST. John is wondering whether others might like to take a look and confirm what he is seeing. Here's what he has found:

1 - Files HadSST3-nh.dat and HadSST3-sh.dat are the wrong way around.

About 35% down web page https://crudata.uea.ac.uk/cru/data/temperature/ there's a section for HadSST3. Click on the 'NH' label and you go to https://crudata.uea.ac.uk/cru/data/temperature/HadSST3-nh.dat, which has 'nh' in the file name. But based on the complete gridded dataset that data file is for the Southern Hemisphere, not the Northern. The two sets are swapped. The links to named files are correct but the content of those files is wrong, likely due to errors in the program that created these summary files from the SST3 gridded data.
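One way for readers to check which hemisphere a summary file describes is to recompute hemispheric means from a gridded field and compare them with the file. The following is a minimal Python sketch under stated assumptions: the field is a 36 x 72 array on a 5-degree grid, missing cells are NaN, and cells are weighted by the cosine of latitude. The function name `hemispheric_means` and the `north_first` switch are illustrative, and this is not necessarily the averaging method CRU uses.

```python
import numpy as np

def hemispheric_means(grid, north_first=True):
    """Return (nh_mean, sh_mean), area-weighted by cos(latitude).

    grid: 2D array (n_lat, n_lon) of anomalies, NaN where there is no data.
    north_first: True if row 0 is the northernmost band (an assumption
    the caller must supply; it is exactly what is in dispute here).
    """
    grid = np.asarray(grid, dtype=float)
    n_lat = grid.shape[0]
    step = 180.0 / n_lat
    # Latitude of each band centre, listed from north to south.
    lats = 90.0 - step * (np.arange(n_lat) + 0.5)
    if not north_first:
        lats = lats[::-1]
    weights = np.cos(np.radians(lats))[:, None] * np.ones_like(grid)
    # Missing cells get zero weight so they do not affect the mean.
    weights = np.where(np.isnan(grid), 0.0, weights)
    values = np.where(np.isnan(grid), 0.0, grid)

    def weighted_mean(row_mask):
        w = weights[row_mask]
        total = w.sum()
        if total == 0:
            return float("nan")
        return float((values[row_mask] * w).sum() / total)

    return weighted_mean(lats > 0), weighted_mean(lats < 0)
```

Running it twice, once with `north_first=True` and once with `north_first=False`, shows how the two hemispheric series trade places if the row order is read the wrong way round.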

2 - The ASCII file containing observation counts per grid cell has records in the wrong order.

On the above page click on the "HadSST3" link and go to the Hadley Centre page, then from there to the "download page" (http://hadobs.metoffice.com/hadsst3/data/download.html) and you'll see mention of an ASCII file of observation counts for each grid cell.

The data in that file is in the wrong sequence. The HadSST3 gridded data has records for each month in sequence from 90N to 90S, but the observation count file runs from 90S to 90N.

(I found this when I discovered lots of SST data with no corresponding observation count and then lots of grid cells with observation counts but no SST data. I've created a crude map of cells that contained SST data in January 2000 and it displays with the NH at the top, so it's not the SST data file that's wrong; it's the counts. When I flipped the data in each month into 90N to 90S order the SST data always had corresponding observation counts and there were no cells with an observation count but no data.)
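The flip test described above can be sketched in Python as follows. The assumptions are that one month of SST data and one month of counts have each been read into a 2D array, that missing SST cells are NaN and that a count of zero means no observations; the real files may mark missing values differently. The function names are illustrative.

```python
import numpy as np

def coverage_mismatches(sst, counts):
    """Count cells where exactly one of the two fields reports data.

    sst: 2D array of anomalies, NaN where missing (assumed convention).
    counts: 2D array of observation counts, 0 where there are none.
    """
    has_sst = ~np.isnan(np.asarray(sst, dtype=float))
    has_obs = np.asarray(counts, dtype=float) > 0
    return int(np.sum(has_sst != has_obs))

def compare_orientations(sst, counts):
    """Mismatches with the counts as given and with their rows reversed."""
    return {
        "as_given": coverage_mismatches(sst, counts),
        "flipped": coverage_mismatches(sst, np.flipud(counts)),
    }
```

If the two files share a row order, `as_given` should be at or near zero. A large `as_given` alongside a zero `flipped` would point to one file being stored upside down relative to the other, though it cannot say which one.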

3 - The ASCII observation count file contains unreadable fields because they overflowed.

Since about 2002 it's not unusual to find cells for which the observation count is '*******', meaning that the count is greater than 9999.00. There's no way for the user to know what value should be in that field.

I suspect the problem arises because the file was written by a Fortran program; that language fills a field with *'s when the data doesn't fit. Using a real number (i.e. with decimal places) makes no sense, because one can't make half an observation or 0.19 of an observation. I don't know why the fields can't be 7-digit integers, but they're not. (Could it be to cater for the language R?)
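To illustrate the overflow behaviour, here is a simplified Python imitation of a Fortran fixed-width real field. It assumes a field 7 characters wide with 2 decimal places (inferred from '9999.00' and '*******', not confirmed from the source program) and ignores finer points of Fortran output such as optional leading zeros. The name `fortran_f` is made up for this sketch.

```python
def fortran_f(value, width=7, decimals=2):
    """Mimic Fortran Fw.d output: right-justified, or all '*' on overflow."""
    text = f"{value:.{decimals}f}"
    if len(text) > width:
        # The value does not fit, so the whole field is filled with asterisks.
        return "*" * width
    return text.rjust(width)
```

With these defaults, `fortran_f(9999.0)` gives `'9999.00'` while `fortran_f(10000.0)` gives `'*******'`, and the original value cannot be recovered from the output. A 7-character integer field, by contrast, would hold counts up to 9,999,999.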

(I'm a bit suspicious of the values in excess of 9999.00, because that implies an average of roughly 14 observations per hour, or about one every 4.25 minutes! Probably 75% of cells with these observation counts are along the western or eastern US coast, but the other 25% aren't. Is it a cluster of Argo buoys?)
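The arithmetic behind that rate can be checked with a few lines of Python. The sketch assumes the smallest count that overflows the field (10,000) and treats the observations as spread evenly through the month, which is a simplification.

```python
def observation_rate(count, days_in_month):
    """Return (observations per hour, minutes between observations)."""
    hours = days_in_month * 24
    per_hour = count / hours
    return per_hour, 60.0 / per_hour

for days in (28, 30, 31):
    per_hour, gap = observation_rate(10000, days)
    print(f"{days} days: {per_hour:.2f} obs/hour, one every {gap:.2f} minutes")
```

Depending on the length of the month, this works out at about 13.4 to 14.9 observations per hour in a single 5-degree cell, sustained around the clock.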

He adds:

I see from my notes that, from 2002 onwards, the instances of '*******' in a field were as follows:
2002: 2
2003: 1
2004: 5
2005: 14
2006: 17
2007: 103
2008: 143
2009: 178
2010: 177
2011: 111
2012: 127
2013: 153
2014: 147
2015: 136
(or at least that's how the files downloaded after the January 2016 update had things).

This list might help anyone confirm the existence of these overflowed fields.
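For anyone wanting to reproduce such a tally, a minimal Python sketch follows. The file layout is an assumption and has not been checked against the real download: one header line per month whose first two whitespace-separated tokens are the month and the year, followed by 36 data lines of 72 fixed-width fields, each 7 characters wide. Adjust the parameters if the actual file differs. The function name is illustrative.

```python
from collections import Counter

def overflow_counts_by_year(lines, n_rows=36, n_cols=72, width=7):
    """Tally fields made up entirely of '*' for each year.

    Assumed layout (not confirmed against the real file): a header line per
    month starting with "month year", then n_rows lines of n_cols fields.
    """
    totals = Counter()
    lines = iter(lines)
    for header in lines:
        if not header.strip():
            continue  # skip blank lines between monthly blocks
        year = int(header.split()[1])
        for _ in range(n_rows):
            row = next(lines, "").rstrip("\n")
            # Slice by position: adjacent fixed-width fields may have no
            # whitespace between them, so str.split() is not reliable.
            fields = [row[i * width:(i + 1) * width] for i in range(n_cols)]
            totals[year] += sum(1 for f in fields if f and set(f) == {"*"})
    return totals
```

The function accepts any iterable of lines, so an open file handle can be passed directly. Years with no overflowed fields appear in the result with a count of zero.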

The HadSST3 observation count files won't be used by many people; maybe I'm even the first to use them, if no-one else has hit these problems. I found them because I was investigating the temperature and coverage impact in each month of grid cells with few observations.

I think a fair question is whether Hadley Centre publishes other flawed data on SST or anything else because it looks like there's no in-house verification that software does what it's supposed to do.


Reader Comments (107)

What's all this mean in English?

Mar 25, 2016 at 12:47 PM |

mailman

In English, they have mixed up the Northern and Southern hemispheres, or in Anglo-Saxon, they do not know their arse from their elbow. And it's twice over, as they also think up is down ;)

Mar 25, 2016 at 1:06 PM |

Breath of Fresh Air

There was never a drill hole that went from surface to sky in all my mineral exploration years.
What is wrong with the standard of "easy" science these days?
Geoff

Mar 25, 2016 at 1:06 PM |

Geoff Sherrington

Come back Harry. All is forgiven.

Mar 25, 2016 at 1:08 PM |

Martin A

In the software industry, mistakes happen frequently, and sometimes a minor error can look horrendous to the user (customer) but be an easy fix.
What matters is not the error; it's how fast it is fixed and how honest the explanation is.

Mar 25, 2016 at 1:58 PM |

EternalOptimist

Did this happen because they were trying to fudge the results and hope no one noticed?

Mar 25, 2016 at 2:16 PM |

ivan

Has anyone thought of just asking John Kennedy about this? Maybe it is a mistake, but maybe not (I've had a quick look at the data files and it's not obvious to me what the problem is)

Mar 25, 2016 at 2:51 PM |

...and Then There's Physics

Upside down data? No problem! Ask Mikey Mann!

Mar 25, 2016 at 2:57 PM |

Jimmy Haigh

Nonsense! This data is all provided by peer-reviewed scientists - so it can't be wrong!

What are 'Reader John McLean's' qualifications? If he has no peer reviewed papers, or is not a practising approved climate scientist, who is he to say that 97% of real scientists are wrong? This is obviously a smear campaign being started by highly-paid Big Oil disinformation specialists... and, in any case, no matter what the data says, it doesn't affect our conclusions....

/sarc - in case anyone needs it... :)

Mar 25, 2016 at 3:02 PM |

Dodgy Geezer

I do hope that Anthony picks up on this. I'd love to read Willis's take on it. The thing is, had the error not been spotted how many government diktats would have been born on the back of it?

Mar 25, 2016 at 3:10 PM |

Harry Passfield

Make one kinda proud of a great British institution! Trust me, I'm a Climate Scientist!

This is the body that has repeatedly and consistently told us through the UN IPCC that the Sun has no significant effect upon Earth's climate, yet recently had the gall to announce that we could be getting colder NH winters as a result of significantly reduced solar activity! I do wish they'd make up their minds which it is! No credibility!

Mar 25, 2016 at 3:20 PM |

Alan the Brit

Mailman. What's it mean in English?

CRU and co. still can't code.

Mar 25, 2016 at 3:32 PM |

Alan Kendall

C'mon, it's just through the looking glass. We already know where we are.
===================

Mar 25, 2016 at 3:52 PM |

kim

The https://crudata.uea.ac.uk/cru/data/temperature/HadSST3-nh.dat file looks fine to me. Either it's already fixed or the author of the post is mistaken. I verified this by checking the netCDF gridded fields; NH anomalies are warmer in recent years than SH anomalies, which is correct.

Mar 25, 2016 at 3:52 PM |

Zeke Hausfather

I think that EternalOptimist and ATTP have probably got it right. Whether the available download data is the same as that used for HadSST is not clear, so the first step is to clarify what the situation is. If it is an error then it won't be the first time something like this has escaped.

Mar 25, 2016 at 3:58 PM |

SandyS

DOES IT REALLY MATTER ANYWAY when you average all the numbers together in the end to formulate the so-called index? It's not a real number anyway; it's all relative and all the parameters are at the discretion of the expert. After all, they have all the experience and everyone else is a heretic.

Mar 25, 2016 at 4:09 PM |

Pete J

Alan: CRU and co. still can't code.

Considering how much it takes to be a good software engineer, I'm not in the least surprised by missing output quality checks. If no-one is using those files, it is no surprise they are broken.

Mar 25, 2016 at 4:25 PM |

wert

GIGO^2

Mar 25, 2016 at 4:34 PM |

NCC 1701E

Azimuth - from North or South Pole?
A possible source of such confusion is that sometimes Azimuth has been reckoned from the South Pole in astronomy and satellite observations, instead of from the North Pole as in navigation.
Stanford defines Azimuth:

Azimuth, in astronomical measurement, is the number of degrees clockwise from due south (usually) to the object's vertical circle (i.e. a great circle through the object and the zenith). For nonastronomical purposes, azimuth (or bearing) is generally measured clockwise from due north.

Altitude, Azimuth, and Line of Position Comprising Tables for Working Sight ...Table IV page 155
Azimuth Wikipedia

Azimuth (Az), that is the angle of the object around the horizon, usually measured from the north increasing towards the east. Exceptions are, for example, ESO's FITS convention where it is measured from the south increasing towards the west, or the FITS convention of the SDSS where it is measured from the south increasing towards the east.

NOAA has historically inverted longitude and time zone definitions:

Please note that this web page is the old version of the NOAA Solar Calculator. Back when this calculator was first created, we decided to use a non-standard definition of longitude and time zone, to make coordinate entry less awkward. So on this page, both longitude and time zone are defined as positive to the west, instead of the international standard of positive to the east of the Prime Meridian.

Mar 25, 2016 at 5:15 PM |

David L. Hagen

oh cmon !

97% of scientists have taken a deep look at this or they would never have come to any conclusions

Mar 25, 2016 at 5:56 PM |

venus

Actually, since most of the data analysis programs were probably written by CRU they know how the data is stored and just leave it in that format. It probably made the arithmetic easier to program. But anyway, they should at least have info in the header of the file describing the layout for future historians if nothing else and for current users out of courtesy. Most likely the usual bumbling. Good programmers are hard to find.

As far as the infilled fields, a programmer guru I remember said: never write a procedure that has to handle any outside data that doesn't check that everything coming in is in the proper format so it doesn't cause problems, all calculations and results that go outside the procedure have correctly formatted results, all counters, registers, flags, files, and codes are properly set, and any errors that might occur are flagged.

Those kinds of programming errors have been all over the place and are why there have been so many, many hacker exploits via buffer overflows, range errors, improper program error recoveries, etc.

Mar 25, 2016 at 6:35 PM |

Phil Cartier

We know that quality procedures are alien to "climate science".

Mar 25, 2016 at 7:50 PM |

Phillip Bratby

I note Geoff Sherrington's comment that he never had an exploration drill hole go into the sky. Drill underground and there are plenty of drill holes going up as well as down. +ve dips are as significant as -ve dips.

Mar 25, 2016 at 8:31 PM |

nvw

adulterating data is a warmista's 2nd nature

Mar 25, 2016 at 8:43 PM |

venus

When the climate scientists say it looks OK to me....... Start to worry.

Mar 25, 2016 at 9:05 PM |

JustAnotherPoster

@John McLean

I just did a quick look at the number of '*******' in the had.SST3.1.1.0. number of observations zip.

They are there, no doubt. But I get different numbers to you.
These were produced by a VB script checking each of the 72 fields by year and month; any '*******' added 1 to the count.

myYear SumOfMissingCount
1855 13
1858 8
1863 19
1869 5
1876 1
1877 13
1878 5
1884 19
1890 15
1891 9
1893 6
1897 10
1909 10
1913 11
1920 7
1925 14
1935 9
1938 8
1939 1
1941 3
1942 6
1945 2
1951 21
1956 18
1968 10
1969 1
1975 11
1980 9
1981 15
1982 1
1985 1
1990 12
1997 12
2002 1
2003 1
2004 15
2005 13
2006 10
2007 91
2008 109
2009 130
2010 148
2011 67
2012 88
2013 94
2014 135
2015 107
2016 20

Mar 25, 2016 at 9:59 PM |

EternalOptimist

I can't see any problem with NH and SH. HadSST3-nh.dat (and -sh) is just a file of monthly averages. The numbers in the file correspond to the familiar graphs shown. NH (-nh) temperatures are higher, as expected. E.g. the 2015 average for NH was 0.737; for SH it was 0.425. The files were last updated 8 March, so I don't think there is a recent change. It looks to me as if John McLean may have been reading the netCDF gridded file wrongly.

The CRU NH data seems to agree entirely with the Met Office data here.

Mar 25, 2016 at 10:13 PM |

nick stokes

@ nvw

> ... Drill underground and there are plenty of drill holes going up as well as down

Yes, sure, all mining geologists know that, including Geoff. It's trivially obvious.

But Geoff's point, as I read it, is that such drillholes as you refer to tend to stop once they surface.

The only drillhole I've ever observed that did indeed continue into thin air for a few metres was a supposedly horizontal hole that control survey was lost on and it finished literally drilling into a wombat hole - much to the inhabitant's intense irritation

Mar 25, 2016 at 11:07 PM |

ianl8888

Nvw,
With respect you made a misleading comment about my drill hole example.
I did specify a collar at the surface, to exclude underground.
Surface is reasonably where land meets sky.
Why did you go to the effort of misrepresenting me?
Geoff

Mar 26, 2016 at 12:04 AM |

Geoff Sherrington

OK. But aside from the fact that the data is totally wrong...

Mar 26, 2016 at 1:17 AM |

Keith L

can't drill up where the surface meets the sky....hmmm

https://www.youtube.com/watch?v=qW28i9SkUlg

Mar 26, 2016 at 5:26 AM |

nvw

This article got me curious, so I went and grabbed the 3 ASCII files that were mentioned at the HadSST website ('...number_of_observations.zip', '...median.zip', and '...measurement_and_sampling_uncertainty.zip'). I did not bother (yet) with any of the NetCDF files.

After a very cursory overview of those files (I have no 'R' chops), I came to the following conclusions:

- the meta-data for these files appears to be very minimal (may be more available if I were to read ALL the support docs/papers, but ...)

- the 'median.zip' and 'measurement....zip' files didn't look obviously out of scope

- the 'number_of_observations.zip' file was indeed curious:

- - Yeah, it did look like the file format was legacy FORTRAN-sourced, and nobody has bothered to update it for integer-scale content (one wonders where this stuff is going to get used).

- - The positioning of the overflow markers (field content '*******') seemed to have some significance relative to the particular grid-cells that they represent (beyond their meaning of integer value > 9999.00). At least in the first several years that the overflow-markers occurred, they appeared against the same few grid-cells (I'm still mucking about in the data). Additionally, the neighboring grid-cell values for the same latitude also contained 'large' counts (upwards of 40%-80% of max value). The fact that these overflow values didn't start until late 2002 makes me wonder if the source data for these grid-cells was due either to Argo float transmitters that were improperly nattering as they phoned home, or if there was some particular problems in (what appears to be ICOADS if I followed the HadSST website documentation correctly) source data, such as certain data sets being entered multiple times. I have seen multiple-entry happen before in other data systems, so it wouldn't overly surprise me, but I'm open to other explanations. One of these days, I'm gonna learn enough R to build a grid-cell-count-histogram over a world map to see if there is some way to simplify chasing down these kinds of data problems.

- - Depending on how this file gets used in the HadSST computations (or is it just produced for 'historical reasons'?), I could see the overflow markers (and any other associated data issues) impacting any number of statistical inferences in the SST products. As Steve MacIntyre and Willis E have said on any number of occasions, intimate knowledge of one's detail data can be very useful.
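The per-cell tally described above could be sketched as follows, in Python rather than R. It assumes each month has already been parsed into a 2D boolean mask that is True wherever the field read '*******'; the parsing itself is left out because the file layout is not documented here. Function names are illustrative.

```python
import numpy as np

def overflow_months_per_cell(monthly_masks):
    """Sum boolean overflow masks over time.

    monthly_masks: iterable of 2D boolean arrays of identical shape,
    True where that month's field was an overflow marker.
    Returns a 2D integer array of overflow months per cell.
    """
    stacked = np.stack([np.asarray(m, dtype=bool) for m in monthly_masks])
    return stacked.sum(axis=0)

def top_cells(per_cell, n=5):
    """Return the n busiest cells as (row, col, months), 1-based indices."""
    flat_order = np.argsort(per_cell, axis=None)[::-1][:n]
    rows, cols = np.unravel_index(flat_order, per_cell.shape)
    return [(int(r) + 1, int(c) + 1, int(per_cell[r, c]))
            for r, c in zip(rows, cols)]
```

The resulting array has the same shape as the grid, so it can be handed to any plotting routine to draw the map-style histogram, and `top_cells` gives a quick list of the cells worth inspecting first.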

Mar 26, 2016 at 5:34 AM |

OldUnixHead

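# Packages for gridded data, netCDF access and map outlines
library(raster)
library(ncdf4)
library(maps)

# Download the HadSST3 median netCDF file
hadsst <- "https://crudata.uea.ac.uk/cru/data/temperature/HadSST.3.1.1.0.median.nc"
download.file(hadsst, destfile = "hadsst.nc", mode = "wb")

# Read the "sst" variable as a multi-layer brick and plot one month
SST <- brick("hadsst.nc", varname = "sst")
plot(raster(SST, layer = 1000))
map("world", add = TRUE)  # overlay coastlines to check the orientation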

data is fine.

How about this.

Some guy says he read the data and it was upside down

I am skeptical of his claim. I want to see his code.

Above you can see how to get the netcdf which looks just fine.

netCDF is typically used in CDRs. It's a standard, self-documenting format.

Mar 26, 2016 at 6:39 AM |

Steven Mosher

@nvw, did you read the comments in the youtube video you linked to? Consensus is that it was a planned drill. Why else would there just happen to be a group of men standing around. What were they waiting for? Godot or the drill to come out of the ground?

Mar 26, 2016 at 8:33 AM |

SadButMadLad

Meanwhile at WUWT, Anthony Watts wrote his article (archived here), as if he was certain the data were wrong. Of course Anthony didn't bother checking for himself, he wouldn't know how. As well as his headline of "Friday Funny: more upside down data", he wrote:

I wonder what CRU will have to say about this one that has been discovered? It’s bigger than just a single point on Earth.

Anthony wrote his article at least an hour after ATTP's comment, so he should have suspected that it was John McLean who was wrong, not CRU. He probably thought: why let a potential denier meme go to waste?

Source

There will now be a flurry of retracted accusations and associated apologies.

Not.

Mar 26, 2016 at 9:24 AM |

Phil Clarke

This 'mix-up' is not as serious as you think; after all, they have the 'result they want' before any data is considered.
And they have shown great skill in 'making the data' fit their needs.

But frankly, given the constantly poor personal standards and awful professional practice, with a near-total lack of good scientific practice seen within climate 'science', such mistakes are not news at all.

Mar 26, 2016 at 9:54 AM |

knr

Steven Mosher: You referenced <- "https://crudata.uea.ac.uk/cru/data/temperature/HadSST.3.1.1.0.median.nc"

But McLean referenced https://crudata.uea.ac.uk/cru/data/temperature/HadSST3-nh.dat

Are you saying they are one and the same?

Mar 26, 2016 at 10:48 AM |

Harry Passfield

Phil C

You (quite rightly) expect to see a retraction when something is wrong.
Can you tell me when James Hansen will be making retractions of his hopelessly wrong predictions such as "the west side highway underwater" and his laughable ABC scenarios?
When will UNEP be making an official retraction of their ridiculous "50 million climate refugees by 2010" statement?
Will 'Climate Czar' John Holdren be retracting his 1985 prediction that there will be 1 billion climate-related deaths by the end of this decade?
I could quote loads more busted alarmist predictions, but quite frankly I can't be bothered as there are almost too many to mention.

Mind you, warmists such as Stephen Schneider seem to think it's okay to exaggerate when you're fighting for the cause:

we have to offer up scary scenarios, make simplified, dramatic statements, and make little mention of any doubts we might have.
Phil, would you agree with me that Schneider's statement is unethical?

Mar 26, 2016 at 12:23 PM |

david smith

Can you tell me when James Hansen will be making retractions of his hopelessly wrong predictions such as "the west side highway underwater"

Because he didn't make any such prediction. He was asked what would happen in 40 years if CO2 doubled. Good to see you promoting standard science denier memes though.

Mar 26, 2016 at 12:27 PM |

...and Then There's Physics

As far as Stephen Schneider's quote is concerned, you should read this. However, don't let me stop you from promoting these standard science denial memes. I'd hate to take them away from you; you might have nothing left.

Mar 26, 2016 at 12:35 PM |

...and Then There's Physics

Heh, the Mad, the Bad, and now, the Ugly.
==============

Mar 26, 2016 at 1:21 PM |

kim

I do not think John McLean owes anyone any apologies.

The number of observations file interests me, for two reasons. First, it is not adjusted, what you see is what you get. Second, it is possible to look at the grid on the map, to see where the observations were made.

John focussed on the ******* fields. The cell with most of these is 13,25 or 32.5 S, 117.5 W.

This cell is smack bang in the middle of the Pacific, close to Easter Island. It has had 137 *******
plus 1324 observations going back to 1850, cf. the English Channel, which has had 376.
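The conversion from grid indices to coordinates can be sketched in Python as follows. It reads "13,25" as column 13, row 25 on a 5-degree grid with columns starting at 180W, which is an assumption chosen because it reproduces the stated 32.5 S, 117.5 W. The `north_first` switch and the function name are illustrative.

```python
def cell_centre(row, col, step=5.0, north_first=True):
    """Return (lat, lon) of a grid cell centre for 1-based row and column.

    Columns are assumed to start at 180W. Rows start at 90N if north_first
    is True, otherwise at 90S. South and west are negative.
    """
    lat = 90.0 - (row - 0.5) * step
    if not north_first:
        lat = -lat
    lon = -180.0 + (col - 0.5) * step
    return lat, lon
```

`cell_centre(25, 13)` returns `(-32.5, -117.5)`, i.e. 32.5 S, 117.5 W. With `north_first=False` the same indices give 32.5 N, 117.5 W, so the assumed row order decides which hemisphere the cell lands in.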

Mar 26, 2016 at 1:42 PM |

EternalOptimist

Ken,
Jim and his CO2 levels. Yep, Jim was wrong - it's nearly 30 years later and CO2 levels are nowhere near to doubled. Also, current sea level rise is refusing to play nicely as it's still extremely unalarming despite rising CO2 levels. 3mm/year (at most) - pathetically small. I think that West Side Highway will be just fine.

I've already seen the page you link to about Stephen Schneider and the full quote. As commenters at the bottom of that page say, the full quote doesn't change Schneider's unethical stance one bit.
Nice try Ken, keep playing.

Mar 26, 2016 at 1:45 PM |

David Smith

Ken, you are Wirthless.
=======

Mar 26, 2016 at 1:54 PM |

kim

clarification !!!

32.5 S, 117.5 W

This cell is smack bang in the middle of the Pacific, close to Easter Island. It has had 137 months containing *******
plus 1324 months containing observations going back to 1850, cf. the English Channel, which has had 376 months containing observations.

I need to double-check this, because it seems to be barmy.

Mar 26, 2016 at 2:22 PM |

EternalOptimist

EternalOptimist: Just for the hell of it, if NH and SH had been switched then 32.5N, 117.5W is just off the coast of San Diego. If the Westing was swapped to an Easting as well, it's in the heart of China. (According to Google Maps)

Mar 26, 2016 at 2:31 PM |

Harry Passfield

@Harry, I'll check out the old reverso later
in the meantime - how about this

row 36 - the Antarctic
1124 months with obs, 71,008 obs in total.

Mar 26, 2016 at 2:38 PM |

EternalOptimist

Ooh, yes, aTTP: a wonderfully balanced point of view:

Excellent story! Thanks for presenting it for us. It’s important to deconstruct the fossil-fueled lies out there. [Nanci]

While “climatesight” did correct her (him?) somewhat, he (she?) later writes:

The presence of skeptics is a complex issue that I’ll be writing a LOT more about soon – keep your eyes open for new posts. Internet searches make me depressed too sometimes, but keep in mind that the presence of skeptics is hugely over-represented online and even in the popular media. There really aren’t that many of them. We also have the government on our side :)

Does climatesight not like the concept of scepticism in science? If so, how do you consider her (him?) as a serious defender of science?

Charles presents the best comment:

If you’re honest you do tell the truth. Even if you don’t want to, you tell the truth.

A policy we should all try and adhere to.

I hope that means being both. [S. Schneider]

But which would he choose, if there is a conflict? The evidence suggests the “being effective” choice.

Mar 26, 2016 at 3:08 PM |

Radical Rodent

David,
Given that it is patently obvious that responding to a question about what would happen if CO2 doubled in 40 years is not the same as predicting that it will, I have no suitable response to your comment.

Mar 26, 2016 at 3:13 PM |

...and Then There's Physics

Is Gavin honest? Time tells, will tell, and has told. As for Ken Rice's moving hand, well!
=============

Mar 26, 2016 at 3:27 PM |

kim
