Schmidt and Sherwood on climate models
Dec 27, 2014
Bishop Hill in Climate: Models

Over the last week or so I've been spending a bit of time with a new paper from Gavin Schmidt and Steven Sherwood. Gavin needs no introduction of course, and Sherwood is also well known to BH readers, having come to prominence when he attempted a rebuttal of the Lewis and Crok report on climate sensitivity, apparently without actually having read it.

The paper is a preprint that will eventually appear in the European Journal for Philosophy of Science and can be downloaded here. It is a contribution to an ongoing debate in philosophy of science circles over how computer simulations fit into the normal blueprint of science, with some claiming that they are something other than either a hypothesis or an experiment.

I'm not sure whether this is a particularly productive discussion as far as the climate debate is concerned. If a computer simulation is to be policy-relevant, its output must be capable of approximating the real world, and it must be validated to show that this is the case. If climate modellers want to make the case that their virtual worlds are neither hypothesis nor experiment, or want to use them to address otherwise intractable questions, as Schmidt and Sherwood note happens, then that's fine, so long as climate models remain firmly under lock and key in the ivory tower.

Unfortunately, Schmidt and Sherwood seem overconfident in GCMs:

...climate models, while imperfect, work well in many respects (that is to say, they provide useful skill over and above simpler methods for making predictions).

Following on from this, the authors examine climate model development and testing, and both sections are interesting. For example, the section on tuning models includes this:

Once put together, a climate model typically has a handful of loosely-constrained parameters that can in practice be used to calibrate a few key emergent properties of the resulting simulations. In principle there may be a large number of such parameters that could potentially be tuned if one wanted to compare a very large ensemble of simulations (e.g. Stainforth et al 2005), but this cumbersome exercise is rarely done operationally. The tuning or calibration effort seeks to minimise errors in key properties which would usually include the top-of-the-atmosphere radiative balance, mean surface temperature, and/or mean zonal wind speeds in the main atmospheric jets (Schmidt et al 2014b; Mauritsen et al 2012). In our experience however tuning parameters provide remarkably little leverage in improving overall model skill once a reasonable part of parameter space has been identified. Improvements in one field are usually accompanied by degradation in others, and the final choice of parameter involves judgments about the relative importance of different aspects of the simulations...

This tallies with what Richard Betts has said in the past, namely that modellers use the "known unknowns" to get the model into the right climatic ballpark rather than to wiggle-match. Even so, I'm not sure that users of climate models can place much reliance on them when there is this clear admission that the models are nudged, or fudged, until they look "reasonable".
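
To make that calibration step concrete, here is a toy sketch of what tuning a single loosely-constrained parameter against a couple of emergent targets might look like. This is emphatically not the authors' or the GISS code, and every number in it is invented for illustration; the interesting part is the weights, which encode exactly the judgment about "the relative importance of different aspects of the simulations" that the quote describes.

    # Toy illustration only (not the authors' code); all values are invented.
    # One loosely-constrained parameter is adjusted to minimise a weighted
    # error in two "emergent" targets; improving one tends to degrade the
    # other, so the chosen weights decide where the compromise lands.
    from scipy.optimize import minimize_scalar

    targets = {"toa_balance": 0.0, "mean_surface_temp": 14.0}   # W/m2, deg C
    weights = {"toa_balance": 1.0, "mean_surface_temp": 1.0}    # the modeller's judgment

    def toy_model(p):
        """Stand-in for a GCM: maps one hypothetical cloud parameter to two diagnostics."""
        return {"toa_balance": 2.0 - 3.0 * p,          # improves as p increases
                "mean_surface_temp": 13.5 + 1.5 * p}   # degrades as p increases

    def cost(p):
        out = toy_model(p)
        return sum(weights[k] * (out[k] - targets[k]) ** 2 for k in targets)

    best = minimize_scalar(cost, bounds=(0.0, 1.0), method="bounded")
    print(best.x, toy_model(best.x))

Change the weights and the "best" parameter moves, which is rather the point: the tuned model is a compromise chosen by judgment, not a unique answer.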

The section on model evaluation is also interesting:

The most important measure of model skill is of course its ability to predict previously unmeasured (or unnoticed) phenomena or connections in ways that are more accurate than some simpler heuristic. Many examples exist, from straightforward predictions (ahead of time) of the likely impact of the Pinatubo eruption (Hansen et al 1992), the skillful projection of the last three decades of warming (Hansen et al 1988; Hargreaves 2010) and correctly predicting the resolution of disagreements between different sources of observation data e.g., between ocean and land temperature reconstructions in the last glacial period (Rind and Peteet 1985), or the satellite and surface temperature records in the 1990s (Mears et al 2003; Thorne et al 2011). Against this must be balanced predictions that did not match subsequent observations—for instance the underestimate of the rate of Arctic sea ice loss in CMIP3 (Stroeve et al 2007).

I was rewatching Earth: The Climate Wars the other day and laughed at the section on the credibility of climate models, which essentially argued that because Hansen got the global response to Pinatubo right, we should believe what climate models tell us about the climate at the end of the next century. Of course, we would have shouted it from the rooftops if Hansen's model had got it wrong, but some recognition is due of what a low hurdle this was.

Similarly, how much confidence should climate modellers have in Hansen's 1988 prediction? As the Hargreaves paper cited notes, Hansen's GCM overpredicted warming by some 40% when assessed over its first 20 years. That is still better than a naive prediction of no warming, but it is a long way out nonetheless. Moreover, it should now be possible to redo Hargreaves' assessment at the 25-year mark, and it is more than likely that the naive prediction will now outperform the GCM.
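
For anyone who wants to redo the comparison, a Hargreaves-style skill check is simple enough to sketch. The series below are placeholders, with an invented starting anomaly and a forecast trend set 40% above the assumed observed one; swap in the real temperature record and Hansen's scenario B values to see whether the naive no-further-warming forecast has caught up at the 25-year mark.

    # Toy sketch of a Hargreaves-style skill comparison, not her actual data
    # or method.  The series below are invented placeholders; substitute the
    # real observations and Hansen's scenario B to redo the test.
    import numpy as np

    years = np.arange(1988, 2014)
    t = years - years[0]

    observed = 0.30 + 0.017 * t            # placeholder: ~0.17 C/decade observed trend
    forecast = 0.30 + 0.017 * 1.4 * t      # placeholder: trend overpredicted by ~40%
    baseline = np.full(t.shape, 0.30)      # naive forecast: no further warming

    def rmse(pred, obs):
        return np.sqrt(np.mean((pred - obs) ** 2))

    skill = 1.0 - rmse(forecast, observed) / rmse(baseline, observed)
    print(f"forecast RMSE {rmse(forecast, observed):.3f} C, "
          f"baseline RMSE {rmse(baseline, observed):.3f} C, skill {skill:.2f}")
    # Positive skill means the GCM beats the naive forecast; whether that
    # still holds at 25 years depends on how much warming actually occurred.

A positive skill score only tells you the model beat a trivial benchmark; it says nothing about how far out it was in absolute terms.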

And what about the Arctic sea ice predictions? You have to laugh at the authors' shamelessness in picking Arctic sea ice here. Look, it's worse than we thought! Nevertheless, Stroeve et al 2007 makes for an interesting read, with computer model simulations presented alongside observational data going back to 1950. The early figures in this dataset were apparently based on a paper from the Met Office, which on inspection reveals that they were derived by interpolation from other data points. The paper also contains these words of caution:

Care must be taken when using HadISST1 for studies of observed climatic variability, particularly in some data sparse regions, because of the limitations of the interpolation techniques, although it has been done successfully...

Data-sparse regions like the Arctic, then?

I think I'm right in saying that there has been another paper published recently which reconstructed sea ice levels from old satellite photos and showed that the Met Office figures were too high, but I can't lay my hands on it at the moment.

So, Schmidt and Sherwood is an interesting read, but I'm not sure that the poor policymaker will draw much comfort from it.

 
