Figure caption: These maps show the observed (left) and model-predicted (right) air temperature trend from 1970 to 1999. The climate model developed by the National Center for Atmospheric Research (NCAR) is used here as an example. More than 50 such simulations were analyzed in the published study. (Illustration: Koichi Sakaguchi)
I was alerted to a new paper that examines the predictive skill of multi-decadal global climate predictions; h/t to Anthony Watts in his post
The article also informs us about their value for even longer time periods. The article is
Sakaguchi, K., X. Zeng, and M. A. Brunke (2012), The hindcast skill of the CMIP ensembles for the surface air temperature trend, J. Geophys. Res., 117, D16113, doi:10.1029/2012JD017765.
[as a side comment, Xubin Zeng was one of my Ph.D. students (and an outstanding one!) with whom I have published, and I have also published with Mike Brunke].
The abstract reads [highlight added]
Linear trends of the surface air temperature (SAT) simulated by selected models from the Coupled Model Intercomparison Project (CMIP3 and CMIP5) historical experiments are evaluated using observations to document (1) the expected range and characteristics of the errors in hindcasting the ‘change’ in SAT at different spatiotemporal scales, (2) if there are ‘threshold’ spatiotemporal scales across which the models show substantially improved performance, and (3) how they differ between CMIP3 and CMIP5. Root Mean Square Error, linear correlation, and Brier score show better agreement with the observations as spatiotemporal scale increases but the skill for the regional (5° × 5° – 20° × 20° grid) and decadal (10 – ∼30-year trends) scales is rather limited. Rapid improvements are seen across 30° × 30° grid to zonal average and around 30 years, although they depend on the performance statistics. Rather abrupt change in the performance from 30° × 30° grid to zonal average implies that averaging out longitudinal features, such as land-ocean contrast, might significantly improve the reliability of the simulated SAT trend. The mean bias and ensemble spread relative to the observed variability, which are crucial to the reliability of the ensemble distribution, are not necessarily improved with increasing scales and may impact probabilistic predictions more at longer temporal scales. No significant differences are found in the performance of CMIP3 and CMIP5 at the large spatiotemporal scales, but at smaller scales the CMIP5 ensemble often shows better correlation and Brier score, indicating improvements in the CMIP5 on the temporal dynamics of SAT at regional and decadal scales.
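The three skill measures named in the abstract (Root Mean Square Error, linear correlation, and Brier score) can be illustrated with a short sketch. The data below are synthetic and purely illustrative, and the application of the Brier score to the ensemble probability of a positive (warming) trend is my assumption about how such a score would typically be computed; it is not taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: observed and ensemble-simulated 30-year SAT trends
# (deg C / decade) on a coarse grid; values are synthetic, for illustration.
n_grid, n_members = 100, 20
obs_trend = rng.normal(0.2, 0.15, n_grid)
sim_trends = obs_trend + rng.normal(0.0, 0.2, (n_members, n_grid))

ens_mean = sim_trends.mean(axis=0)

# Root Mean Square Error of the ensemble-mean trend against observations
rmse = np.sqrt(np.mean((ens_mean - obs_trend) ** 2))

# Linear (Pearson) correlation between simulated and observed trends
corr = np.corrcoef(ens_mean, obs_trend)[0, 1]

# Brier score for the ensemble probability of a positive (warming) trend
p_warm = (sim_trends > 0).mean(axis=0)   # forecast probability per grid cell
o_warm = (obs_trend > 0).astype(float)   # observed outcome (0 or 1)
brier = np.mean((p_warm - o_warm) ** 2)

print(f"RMSE = {rmse:.3f}, r = {corr:.3f}, Brier = {brier:.3f}")
```

Lower RMSE and Brier score and higher correlation indicate better agreement; the paper's finding is that all three improve as the averaging scale grows toward zonal averages and ~30-year trends.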
The conclusions contain the informative caution
The spatiotemporal scales with more reliable model skills as identified in this study are consistent with previous studies [Randall et al., 2007] and suggest caution in directly using the outputs of long-term simulations for regional and decadal studies.
This is reminiscent of the statement by Kevin Trenberth, who wrote in Nature
“…..we do not have reliable or regional predictions of climate.”
Clearly, the CMIP5 model results do not have the skill needed by the impacts communities, whether used directly from the global model or dynamically or statistically downscaled, on any multi-decadal time scale, as we summarized in our article
Pielke Sr., R.A., and R.L. Wilby, 2012: Regional climate downscaling – what’s the point? Eos Forum, 93, No. 5, 52-53, doi:10.1029/2012EO050008.
If the models do not have sufficient skill for time periods of less than 30 years for surface temperature, and longer time periods are made up of 30-year periods, they certainly will not have added skill at any multi-decadal time period. Moreover, since other climate metrics (e.g. precipitation) are even more difficult to predict, the lack of value of the CMIP5 model runs for the impacts communities is actually well (although subtly) documented in the Sakaguchi et al 2012 paper.
There is a major oversight, however, in the Sakaguchi et al 2012 paper. It neglected to include available peer-reviewed papers that document a serious lack of skill in the CMIP5 model runs. I have summarized these in my posts
By neglecting the peer-reviewed papers I listed in those posts [most of which were available to the authors], the Sakaguchi et al 2012 paper, even with its critical assessment of CMIP3 and CMIP5 model predictive skill, has still not completely assessed the actual skill of the CMIP5 and CMIP3 models.