Monday, 16 January 2017

Cranberry picking short-term temperature trends

Photo of cranberry fields

Monckton is a heavy user of this disingenuous "technique" and should thus know better: you cannot get just any trend this way, but people like Monckton unfortunately do have much leeway to deceive the public. This post will show that political activists can nearly always pick a politically correct period to get a short-term trend that is smaller than the long-term trend. After this careful selection they can pretend to be shocked that scientists did not tell them about this apparent slowdown in warming.

Traditionally this strategy of picking only the data you like is called "cherry picking". It is such a deplorably deceptive strategy that "cherry picking" sounds too nice to me. I would suggest calling it "cranberry picking", under the assumption that people only eat cranberries when the burning pain while peeing is worse. Another good new name could be "wishful picking."

In a previous post, I showed that the uncertainty of short-term trends is huge, probably much larger than you think; the uncertainty monster eats short-term trends for breakfast. Because of this large uncertainty the influence of cranberry picking is probably also larger than you think. Even I was surprised by the calculations. I hope the uncertainty monster does not get an upset stomach; otherwise he does not get the uncertainties he needs to thrive.

Uncertainty monster made of papers

Size of short-term temperature fluctuations

To get some realistic numbers we first need to know how large the fluctuations around the long-term trend are. Thus let's first have a look at the size of these fluctuations in two surface temperature and two tropospheric temperature datasets:
  • the surface temperature of Berkeley Earth (formerly known as BEST),
  • the surface temperature of NASA-GISS: GISTEMP,
  • the satellite Temperature of the Total Troposphere (TTT) of Remote Sensing Systems (RSS),
  • the satellite Temperature of the Lower Troposphere (TLT version 6 beta) of the University of Alabama in Huntsville (UAH).
Each of the four graphs below has two panels. The top panel shows the yearly average temperature anomalies over time as red dots. The Berkeley Earth data series starts earlier, but I only use data starting in 1880 because earlier data is too sparse and may thus not show actual climatic changes in the global mean temperature. For both surface temperature datasets the Second World War years are removed because their values are not reliable. The long-term trend is estimated using a [[LOESS]] smoother and shown as a blue line.

The lower panel shows the deviations from the long-term trend as red dots. The standard deviation of these fluctuations over the full period is written in red. The graphs for the surface temperatures also give the standard deviation of the deviations over the shorter satellite period, written in blue, for comparison with the satellite data. The period does not make much difference.

Both tropospheric datasets have fluctuations with a typical size (standard deviation) of 0.14 °C. The standard deviation of the surface datasets varies a little depending on the dataset or period. For the rest of this post I will use 0.086 °C as a typical value for the surface temperature.

The tropospheric temperature clearly shows more short-term variability. This mainly comes from El Nino, which has a stronger influence on the temperature high up in the air than on the surface temperature. This larger noise level gives the impression that the trend in the tropospheric temperature is smaller, but the trend in the RSS dataset is actually about the same as the surface trend; see below.

The trend in the preliminary UAHv6 temperature is currently lower than all the others. Please note that the changes from the previous version of UAH to the recent one are large and that the previous version of UAH showed more (recent) warming* and about the same trend as the other datasets.

Uncertainty of short-term trends

Even without cranberry picking, short-term trends are problematic because of the strong influence of short-term fluctuations. While an average computed over 10 years of data is only about 3 times as uncertain as a 100-year average, the uncertainty of a 10-year trend is about 32 times as large as that of a 100-year trend.**
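For Gaussian white noise (the case footnote ** assumes) these two scaling laws follow directly from the standard formulas for the uncertainty of a mean and of an ordinary-least-squares trend. A small Python sketch to check the numbers:

```python
import math

def sd_of_mean(n, sigma=1.0):
    """Standard deviation of the mean of n independent values."""
    return sigma / math.sqrt(n)

def sd_of_trend(n, sigma=1.0):
    """Standard deviation of an OLS slope fitted to n equally spaced
    values of white noise: sigma * sqrt(12 / (n * (n**2 - 1)))."""
    return sigma * math.sqrt(12.0 / (n * (n * n - 1)))

# A 10-year average is only ~3.2 times as uncertain as a 100-year average,
print(sd_of_mean(10) / sd_of_mean(100))
# but a 10-year trend is ~32 times as uncertain as a 100-year trend.
print(sd_of_trend(10) / sd_of_trend(100))
```

The mean uncertainty shrinks with the square root of the period, while the trend uncertainty shrinks with the period to the power of three halves, which is why trends are punished so much harder for short periods.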

To study how accurate a trend is you can generate random numbers and compute their trend. On average this trend will be zero, but due to the short-term fluctuations any individual realization will have some trend. By repeating this procedure often you can study how much the trend varies due to the short-term fluctuations, how uncertain the trend is, or more positively formulated: what the confidence interval of the trend is. See my previous post for details. I have done this for the graph below; for the satellite temperatures the random numbers have a standard deviation of 0.14 °C, for the surface temperatures 0.086 °C.

The graph below shows the confidence interval of the trends, which is two times the standard deviation of 10,000 trends computed from 10,000 series of random numbers. A 10-year trend of the satellite temperatures, which may sound like a decent period, has a whopping uncertainty of 3 °C per century.*** This means that even with no long-term trend the short-term trend will vary between -3 °C and +3 °C per century in 95% of the cases, and in the other 5% by even more. That is the uncertainty from the fluctuations alone; there are additional uncertainties due to changes in the orbit, the local time at which the satellite observes, calibration and so on.
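This Monte Carlo experiment is easy to redo. The sketch below (Python, not the original code linked in the footnotes) uses yearly white noise with the tropospheric standard deviation of 0.14 °C and reproduces an uncertainty of roughly 3 °C per century for 10-year trends:

```python
import numpy as np

rng = np.random.default_rng(42)
n_years, n_sim, sigma = 10, 10_000, 0.14  # tropospheric noise level in degC

years = np.arange(n_years)
# 10,000 series of pure noise; one OLS trend per series (degC per year)
noise = rng.normal(0.0, sigma, size=(n_sim, n_years))
trends = np.polyfit(years, noise.T, 1)[0]

# Two standard deviations, expressed in degC per century
ci = 2 * trends.std() * 100
print(f"95% range of a 10-year trend: +/- {ci:.1f} degC/century")
```

Even though every input series is trendless by construction, the fitted 10-year trends scatter over several degrees per century.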

Cherry picking the begin year

To look at the influence of cranberry picking, I generated series of 30 values, computed all possible trends between 10 and 30 years long and selected the smallest trend. The confidence intervals of these cranberry-picked satellite temperature trends are shown below in red. For comparison the intervals for trends without cranberry picking, like above, are shown in blue. To show both cases clearly in the same graph, I have shifted the two sets of bars a little away from each other.
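A minimal version of this cranberry-picking experiment can be sketched as follows; I assume here that all candidate periods end in the last year of the series (a Python sketch, not the original code, which also tabulates the results per trend length):

```python
import numpy as np

rng = np.random.default_rng(0)
n_years, n_sim, sigma = 30, 10_000, 0.14  # tropospheric noise level in degC

def trend(y):
    """OLS slope of a yearly series, in degC per century."""
    x = np.arange(len(y), dtype=float)
    x -= x.mean()
    return float(x @ (y - y.mean()) / (x @ x)) * 100

picked = np.empty(n_sim)
for i in range(n_sim):
    series = rng.normal(0.0, sigma, n_years)
    # all trends of length 10 to 30 years ending in the last year;
    # the cranberry picker reports the smallest one
    picked[i] = min(trend(series[start:]) for start in range(n_years - 9))

print(f"mean picked trend: {picked.mean():.2f} degC/century")
print(f"fraction negative: {(picked < 0).mean():.0%}")
```

Although every underlying series has no trend at all, the selected trends are negative in the vast majority of cases.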

The situation is similar for the surface temperature trends. However, because the data is less noisy, the confidence intervals of the trends are smaller; see below.

While the short-term trends without cranberry picking have a huge uncertainty, on average they are zero. With cranberry picking the average trends are clearly negative, especially for shorter trends, showing the strong influence of selecting a specific period. Without cranberry picking half of the trends are below zero; with cranberry picking 88% of the trends are negative.

Cherry picking the period

For some, the record temperatures of the last two years are not a sign that they were wrong to see a "hiatus". Some claim that there was something like a "pause" or a "slowdown" since 1998, but that it recently stopped. This claim gives even more freedom for cranberry picking: now the end year is cranberry picked as well. To see how bad this is, I again generated noise and selected the period lasting at least 10 years with the lowest trend, ending in the final year of the series or one or two years earlier.

The graphs below compare the range of trends you can get with cranberry picking the begin and end year in green with "only" cranberry picking the begin year like before in red. With double cranberry picking 96% of the trends are negative and the trends are going down even more. (Mitigation skeptics often use this "technique" by showing an older plot, when the newer plot would not be as "effective".)
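The double cranberry picking can be sketched the same way; here the picker may also choose among the last three end years (again a Python sketch of the idea, not the original code):

```python
import numpy as np

rng = np.random.default_rng(1)
n_years, n_sim, sigma = 30, 5_000, 0.14  # tropospheric noise level in degC

def trend(y):
    """OLS slope of a yearly series, in degC per century."""
    x = np.arange(len(y), dtype=float)
    x -= x.mean()
    return float(x @ (y - y.mean()) / (x @ x)) * 100

negative = 0
for _ in range(n_sim):
    series = rng.normal(0.0, sigma, n_years)
    smallest = min(
        trend(series[start:end])
        for end in (n_years, n_years - 1, n_years - 2)  # pick the end year too
        for start in range(end - 9)                     # lengths of 10+ years
    )
    negative += smallest < 0

print(f"fraction negative with double picking: {negative / n_sim:.0%}")
```

Adding a choice of end year enlarges the set of candidate windows, so the picked trends become negative even more often than with begin-year picking alone.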

A negative trend in the above examples of random numbers without any trend would be comparable to a real dataset where a short-term trend is below the long-term trend. Thus by selecting the "right" period, political activists can nearly always claim that scientists talking about the long-term trend are exaggerating because they do not look at this highly interesting short period.

In US political practice the cranberry picking will be even worse. Activists will not only pick a period of their political liking, but also the dataset, variable, region, depth, season, or resolution that produces a graph that can be misinterpreted. The more degrees of freedom, the stronger the influence of cranberry picking.


There are a few things you can do to protect yourself against making spurious judgements.

1. Use large datasets. You can see in the plots above that the influence of cranberry picking is much smaller for the longer trends. For a 30-year period the difference between the blue confidence intervals for a typical 30-year period and the red confidence intervals for a cranberry picked 30-year period is small. Had I generated series of 50 random numbers rather than 30 numbers, this would likely have shown a larger effect of cranberry picking on 30-year trends, but still a lot smaller than on 10-year trends.

2. Only make statistical tests for relationships you expect to exist. This limits your freedom and the chance that one of the many possible statistical tests is spuriously significant. If you make 100 statistical tests of pure noise, 5 of them will on average be spuriously significant.
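The 5-in-100 rule of thumb is easy to verify with simulated noise. The sketch below uses simple two-sided z-tests of "the mean differs from zero" at the 5% level on pure noise:

```python
import numpy as np

rng = np.random.default_rng(7)
n_tests, n = 20_000, 100

# 20,000 tests on samples of pure noise with true mean zero
samples = rng.normal(0.0, 1.0, size=(n_tests, n))
z = samples.mean(axis=1) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
spurious = np.abs(z) > 1.96

print(f"spuriously significant: {spurious.mean():.1%}")  # close to 5%
```

None of these null hypotheses are false, yet about one in twenty tests comes out "significant"; the more tests your eye or your software runs, the more such false alarms you collect.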

There was no physical reason for global warming to stop or slow down after 1998. No one computed the trend since 1998 because they had a reason to expect a change. They computed it because their eyes had seen something; that makes the trend test cranberry picking by definition. The absence of a reason should have made people very careful. The more so because there was a good reason to expect spurious results starting in a large El Nino year.

3. Study the reasons for the relationship you found. Even if I had wrongly seen the statistical evidence for a trend decrease as credible, I would not have made a big point of it before I understood the reason for this trend change. In the "hiatus" case the situation was even reversed: it was clear from the beginning that most of the fluctuations that gave the appearance of a "hiatus" in the eyes of some were due to El Nino. Thus there was a perfectly fine physical reason not to claim that there was a change in the trend.

There is currently a strong decline in global sea ice extent. Before I cry wolf, accuse scientists of fraud and understating the seriousness of climate change, I would like to understand why this decline happened.

4. Use the right statistical test. People have compared the trends before and after 1998 and their uncertainties. These trend uncertainties are not valid for cherry-picked periods. In this case, the right test would have been one for a trend change at an unknown position/year. There was no physical reason to expect a real trend change in 1998; thus the statistical test should take into account that the actual reason you perform the test is that your eye sampled all possible years.
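A test for a trend change at an unknown year can be calibrated by Monte Carlo: compute the largest trend change over all candidate break years, and compare it against the null distribution of that same maximum. The sketch below is my own illustration of the idea, not a standard named test; by construction it keeps the false-alarm rate on pure noise near 5%, which a test calibrated for a single, eyeballed year would not:

```python
import numpy as np

rng = np.random.default_rng(3)
n_years, min_len, n_sim = 40, 10, 2000

def slope(y):
    """OLS slope of an equally spaced series."""
    x = np.arange(len(y), dtype=float)
    x -= x.mean()
    return float(x @ (y - y.mean()) / (x @ x))

def max_trend_change(y):
    """Largest |trend after - trend before| over all candidate break
    years, with each segment at least min_len years long."""
    return max(
        abs(slope(y[k:]) - slope(y[:k]))
        for k in range(min_len, len(y) - min_len + 1)
    )

# Null distribution of the statistic on pure noise without a trend change
null = np.array([max_trend_change(rng.normal(0, 1, n_years))
                 for _ in range(n_sim)])
crit = np.quantile(null, 0.95)

# A test calibrated on the maximum keeps its false-alarm rate near 5%
fresh = np.array([max_trend_change(rng.normal(0, 1, n_years))
                  for _ in range(n_sim)])
print(f"false-alarm rate: {(fresh > crit).mean():.1%}")
```

The key design choice is that the null distribution is built for the maximum over all break years, so the freedom your eye had in choosing the year is part of the test.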

Against activists doing this kind of thing we cannot do much, except try to inform their readers how deceptive this strategy is. For example, by linking to this post. Hint, hint.

Let me leave you with a classic Potholer54 video delicately mocking Monckton's cranberry picking to get politically convenient global cooling and melting ice trends.

Related reading

Richard Telford on the Monckton/McKitrick definition of a "hiatus", which nearly always gives you one: Recipe for a hiatus

Tamino: Cherry p

Statistically significant trends - Short-term temperature trend are more uncertain than you probably think

How can the pause be both ‘false’ and caused by something?

Atmospheric warming hiatus: The peculiar debate about the 2% of the 2%

Temperature trend over last 15 years is twice as large as previously thought because much warming was over Arctic where we have few measurements

Why raw temperatures show too little global warming

* The common baseline period of UAH5.6 and UAH6.0 is 1981-2010.

** These uncertainties are for Gaussian white noise.

*** I like the unit °C per century for trends even if the period of the trend is shorter. You get rounder numbers and it is easier to compare the trends to the warming we have seen in the last century and expect to see in the next one.

**** The code to compute the graphs of this post can be downloaded here.

***** Photo of cranberry field by mrbanjo1138 used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

Sunday, 8 January 2017

Much ado about NOAAthing

I know NOAAthing.

This post is about nothing. Nearly nothing. But when I found this title I had to write it.

Once upon a time in America there were some political activists who claimed that global warming had stopped. These were the moderate voices, with many people in this movement saying that an ice age is just around the corner. Others said global warming paused, hiatused or slowed down. I feel that good statistics has always shown this idea to be complete rubbish (Foster and Abraham, 2015; Lewandowsky et al., 2016), but at least in 2017 it should be clear that it is nothing, nothing whatsoever. It is interpreting noise. More kindly: interpreting variability, mostly El Nino variability.

Even if you disingenuously cherry-pick the hot El Nino year 1998 as the first year of your trend to get a smaller trend, the short-term trend is about the same size as the long-term trend now that 2016 is another hot El Nino year to balance out the first crime. Zeke Hausfather tweeted the graph below with: "You keep using that word, "pause". I do not think it means what you think it means." #CulturalReference

In 2015 Boyin Huang of NOAA and his colleagues created an improved sea surface temperature dataset called ERSST.v4. No one cared about this new analysis. Normal good science.

Thomas Karl of NOAA and his colleagues showed what the update means for the global temperature (ocean and land). The interesting part is the lower panel. It shows that the adjustments make the estimate of global warming smaller by about 0.2°C. Climate data scientists naturally knew this and I blogged about this before, but I think the Karl paper was the first time this was shown in the scientific literature. (The adjustments are normally shown for the individual land or ocean datasets.)

But this post is unfortunately about nearly nothing: about the minimal changes in the top panel of the graph below. I made the graph extra large, so that you can see the differences. The thick black line shows the new assessment (ERSST.v4) and the thin red line the previously estimated global temperature signal (ERSST.v3). Differences are mostly less than 0.05°C, both warmer and cooler. The "problem" is the minute change at the right end of the curves.

The new paper by Zeke Hausfather and colleagues now shows evidence that the updated dataset (ERSSTv4) is indeed better than the previous version (ERSSTv3b). It is a beautifully done study of high technical quality. They do so by comparing the ERSST dataset, which combines a large number of data sources, with data that comes from only one source (buoys, satellites (CCI) or ARGO). These single-source datasets are shorter, but without trend uncertainties due to the combination of sources.

The recent trend of HadSST also seems to be too small, and to a lesser extent that of COBE-SST. This problem with HadSST was known, but not yet published. The warm bias of ships that measure SST at their engine-room intake has been getting smaller over the last decade. The reason for this is not yet clear. The main contender seems to be that the fleet has become more actively managed and (typically warm) bad measurements have been discontinued.

Also ERSST uses ship data, but it gives them a much smaller weight compared to the buoy data. That makes this problem less visible in ERSST. Prepare for a small warming update for recent temperatures once this problem is better understood and corrected for. And prepare for the predictable cries of the mitigation skeptical movement and their political puppets.

Karl and colleagues showed that, as a consequence of the minimal changes in ERSST, a trend starting in 1998 is statistically significant. In the graph below you can see in the left global panel that the old version of ERSST (circles) had a 90% confidence interval (vertical line) that includes zero (not statistically significantly different from zero), while the confidence interval of the updated dataset did not (statistically significant).

Did I mention that such a cherry-picked begin year is a very bad idea? The right statistical test is one for a trend change at an unknown year. This test provides no evidence whatsoever for a recent trend change.

That the trend in Karl and colleagues was statistically significant should thus not have mattered: nothing could be worse than defining a "hiatus" period as one where the confidence interval of a trend includes zero. However, this is the definition public speaker Christopher Monckton uses for his blog posts at Watts Up With That, a large blog of the mitigation skeptical movement. Short-term trends are very uncertain, and their uncertainty increases very fast the shorter the period is. Thus if your period is short enough, you will find a trend whose confidence interval includes zero.
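How short is short enough? A back-of-the-envelope calculation, using the surface noise level of 0.086 °C from the cranberry-picking post above and an assumed warming rate of 1.8 °C per century (my round number; white noise only, ignoring autocorrelation and observational uncertainty), shows that any period up to about a decade gives a 2-sigma trend interval that includes zero even when the trend estimate equals the true warming rate:

```python
import math

sigma = 0.086    # degC, yearly surface fluctuations (from the post above)
warming = 0.018  # degC per year: an assumed rate of 1.8 degC per century

def trend_se(n, sigma):
    """Standard error of an OLS trend over n yearly values of white noise."""
    return sigma * math.sqrt(12.0 / (n * (n * n - 1)))

# The longest period whose 2-sigma trend interval still includes zero,
# even though the underlying warming continues at the assumed rate
longest = max(n for n in range(3, 100) if 2 * trend_se(n, sigma) > warming)
print(longest)
```

Under these assumptions a Monckton-style "pause" is guaranteed for any period of about ten years or less, no matter how steadily the world warms.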

You should not do this kind of statistical test in the first place because of the inevitable cherry picking of the period, but if you want to statistically test whether the long-term trend suddenly dropped, the test should have the long-term trend as the null hypothesis. This is the 21st century; we understand the physics of man-made global warming and we know it should be warming. It would be enormously surprising, and without any explanation, if "global warming had stopped". Thus continued warming is the thing that should be disproven, not a flat trend line. Good luck doing so for such short periods given how enormously uncertain short-term trends are.

The large uncertainty also means that cherry picking a specific period to get a low trend has a large impact. I will show this numerically in an upcoming post. The methods to compute a confidence interval are for a randomly selected period, not for a period that was selected to have a low trend.

Concluding, we have something that does not exist, but which was made into a major talking point of the mitigation skeptical movement. This movement staked its credibility on fluctuations that produced a minor short-term trend change that was not statistically significant. The claimed deviation was also so small that it put an unfounded confidence in the perfection of the data.

The inevitable happened and small corrections needed to be made to the data. After this, even disingenuous cherry-picking and bad statistics were no longer enough to support the talking point. As a consequence Lamar Smith of TX21 abused his Washington power to punish politically inconvenient science. Science that was confirmed this week. This should all have been politically irrelevant because the statistics were wrong all along. It is politically irrelevant by now because the new El Nino produced record temperatures in 2016 and even cherry-picking 1998 as the begin year is no longer enough.

"Much Ado About Nothing is generally considered one of Shakespeare's best comedies because it combines elements of mistaken identities, love, robust hilarity with more serious meditations on honour, shame, and court politics."
(Yes, I get my culture from Wikipedia.)

To end on a positive note, if you are interested in sea surface temperature and its uncertainties, we just published a review paper in the Bulletin of the American Meteorological Society: "A call for new approaches to quantifying biases in observations of sea-surface temperature." It focuses on ideas for future research and on how the SST community can make it easier for others to join the field and work on improving the data.

Another good review paper on the quality of SST observations is: "Effects of instrumentation changes on sea surface temperature measured in situ" and also the homepage of HadSST is quite informative. For more information on the three main sea surface temperature datasets follow these links: ERSSTv4, HadSST3 and COBE-SST. Thanks to John Kennedy for suggesting the links in this paragraph.

Do watch the clear video below where Zeke Hausfather explains the study and why he thinks recent ocean warming used to be underestimated.

Related reading

The op-ed by the authors Kevin Cowtan and Zeke Hausfather is probably the best article on the study: Political Investigation Is Not the Way to Scientific Truth. Independent replication is the key to verification; trolling through scientists' emails looking for out-of-context "gotcha" statements isn't.

Scott K. Johnson in Ars Technica (a reading recommendation for science geeks by itself): New analysis shows Lamar Smith’s accusations on climate data are wrong. It wasn't a political plot—temperatures really did get warmer.

Phil Plait (Bad Astronomy) naturally has a clear explanation of the study and the ensuing political harassment: New Study Confirms Sea Surface Temperatures Are Warming Faster Than Previously Thought

The take of the UK MetOffice, producers of HadSST, on the new study and the differences found for HadSST: The challenge of taking the temperature of the world’s oceans

Hotwhopper is your explainer if you like your stories with a little snark: The winner is NOAA - for global sea surface temperature

Hotwhopper follow-up: Dumb as: Anthony Watts complains Hausfather17 authors didn't use FUTURE data. With such a response to the study it is unreasonable to complain about snark in the response.

The Christian Science Monitor gives a good non-technical summary: Debunking the myth of climate change 'hiatus': Where did it come from?

I guess it is hard for a journalist to not write that the topic is not important. Chris Mooney at the Washington Post claims Karl and colleagues is important: NOAA challenged the global warming ‘pause.’ Now new research says the agency was right.

Climate Denial Crock of the Week with Peter Sinclair: New Study Shows (Again): Deniers Wrong, NOAA Scientists Right. Quotes from several articles and has good explainer videos.

Global Warming ‘Hiatus’ Wasn’t, Second Study Confirms

The guardian blog by John Abraham: New study confirms NOAA finding of faster global warming

Atmospheric warming hiatus: The peculiar debate about the 2% of the 2%

No! Ah! Part II. The return of the uncertainty monster

How can the pause be both ‘false’ and caused by something?


Grant Foster and John Abraham, 2015: Lack of evidence for a slowdown in global temperature. US CLIVAR Variations, Summer 2015, 13, No. 3.

Zeke Hausfather, Kevin Cowtan, David C. Clarke, Peter Jacobs, Mark Richardson, Robert Rohde, 2017: Assessing recent warming using instrumentally homogeneous sea surface temperature records. Science Advances, 04 Jan 2017.

Boyin Huang, Viva F. Banzon, Eric Freeman, Jay Lawrimore, Wei Liu, Thomas C. Peterson, Thomas M. Smith, Peter W. Thorne, Scott D. Woodruff, and Huai-Min Zhang, 2015: Extended Reconstructed Sea Surface Temperature Version 4 (ERSST.v4). Part I: Upgrades and Intercomparisons. Journal of Climate, 28, pp. 911–930, doi: 10.1175/JCLI-D-14-00006.1.

Thomas R. Karl, Anthony Arguez, Boyin Huang, Jay H. Lawrimore, James R. McMahon, Matthew J. Menne, Thomas C. Peterson, Russell S. Vose, Huai-Min Zhang, 2015: Possible artifacts of data biases in the recent global surface warming hiatus. Science. doi: 10.1126/science.aaa5632.

Lewandowsky, S., J. Risbey, and N. Oreskes, 2016: The “Pause” in Global Warming: Turning a Routine Fluctuation into a Problem for Science. Bull. Amer. Meteor. Soc., 97, 723–733, doi: 10.1175/BAMS-D-14-00106.1.

Tuesday, 3 January 2017

Budapest Seminar on Homogenization and Quality Control

3 – 7 April 2017
Organized by the Hungarian Meteorological Service (OMSZ)
Supported by WMO

Photo Budapest parliament

The first eight Seminars for Homogenization and Quality Control in Climatological Databases as well as the first three Conferences on Spatial Interpolation Techniques in Climatology and Meteorology were held in Budapest and hosted by the Hungarian Meteorological Service.

The 7th Seminar in 2011 was organized together with the final meeting of the COST Action ES0601: Advances in Homogenization Methods of Climate Series: an integrated approach (HOME), while the first Conference on Spatial Interpolation was organized in 2004 in the frame of the COST Action 719: The Use of Geographic Information Systems in Climatology and Meteorology. Both series were supported by WMO.

In 2014 the 8th Homogenization Seminar and the 3rd Interpolation Conference were organized together, for both theoretical and practical reasons. Theoretically there is a strong connection between the topics, since homogenization and quality control procedures need spatial statistics and interpolation techniques for the spatial comparison of data. On the other hand, spatial interpolation procedures (e.g. gridding) require homogeneous, high-quality data series to obtain good results.

The WMO Commission for Climatology (CCl) set up a task team to support quality control and homogenization activities at NMHSs. The main task of the Task Team on Homogenization (TT HOM) is to provide guidance to Members on the methodologies, standards and software required for quality control and homogenization of long-term climate time series. The results of the homogenization sessions can improve the content of this guidance, which is under preparation.

Marx and Engels at the museum for communist-era statues in Szobor park.

Communist-era statues in Szobor park.

Thermal bath.

To go to the Hungarian weather service, you probably need to take a tram or metro to [[Széll Kálmán tér]] (Széll Kálmán Square).

The post office at Széll Kálmán Square.

"Szent Gellért tér" station of Budapest Metro.
Highlights and Call for Papers
Homogeneous, high-quality data series and spatial interpolation are indispensable for climatological and meteorological analyses. The primary goal of this meeting is to promote discussion about the methodological and theoretical aspects.

The main topics of homogenization and quality control are intended to be the following:
  • Methods for homogenization and quality control of monthly data series
  • Spatial comparison of series, inhomogeneity detection, correction of series
  • Methods for homogenization and quality control of daily data series, examination of parallel measurements
  • Relation of monthly and daily homogenization, mathematical formulation of homogenization for climate data series generally
  • Theoretical evaluation and benchmark for methods, validation statistics
  • Applications of different homogenization and quality control methods, experiences with different meteorological variables

The main topics of spatial interpolation are the following:
  • Temporal scales: from synoptic situations to climatological mean values
  • Interpolation formulas and loss functions depending on the spatial probability distribution of climate variables
  • Estimation and modelling of statistical parameters (e.g.: spatial trend, covariance or variogram) for interpolation formulas using spatiotemporal sample and auxiliary model variables (topography)
  • Use of auxiliary co-variables, background information (e.g.: dynamical model results, satellite, radar data) for spatial interpolation (data assimilation, reanalysis)
  • Applications of different interpolation methods for the meteorological and climatological fields
  • Gridded databases, digital climate atlases, results of the DanubeClim project

Organizational Details
Persons intending to participate in the meeting are required to pre-register by filling in the enclosed form. To give a presentation, please also send us a short abstract (max. 1 page). The pre-registration and abstract submission deadline is 20 February 2017. Publication of the papers in proceedings in the serial WMO/WCP/WCDMP is foreseen after the meeting. Paper format information will be provided in the second circular. The registration fee (including book of abstracts, coffee breaks, social event, proceedings) is 120 EUR. The second circular letter with accommodation information will be sent to the pre-registered people by 28 February 2017.

Location and Dates
The meeting will be held 3-7 April 2017 in Budapest, Hungary, at the headquarters of the Hungarian Meteorological Service (1. Kitaibel P. Street, Budapest, 1024).

The official language of the meeting is English.

For further information, please contact:
Hungarian Meteorological Service
P.O.Box 38, Budapest, H-1525, Hungary

The pre-registration form can be downloaded here.

* Photo Budapest Parliament by Never House used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.
* Budapest-Szobor park by Simon used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.
* Budapest-Exterior thermal baths by Simon used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

* Szobor park by bjornsphoto used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.
* Photo "Szent Gellért tér" station of Budapest Metro by Marco Fieber used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

Saturday, 31 December 2016

Clickbait articles dividing an already divided country even more

Yougov and The Economist just published a poll on political conspiracy theories that was designed to produce outrage and clicks on stories about how stupid the other tribe is and how we rational people are above that. Secular Talk, a popular high-quality YouTube pundit, made two stories out of it: "50% Of Dems Think 'Russia Tampered With Vote Tallies' To Elect Trump" and "Nearly Half Of Trump Voters Believe Hillary Is Pimping Kids."

In the poll Americans were asked: "Do you think the following statements are true or not true?". One of the conspiracies was: "54. Conspiracy Theories – Russia tampered with vote tallies in order to get Donald Trump elected President."

The problem was that people could only choose between "Definitely true", "Probably true", "Probably not true", and "Definitely not true". There was no option "I do not know". "I do not know" would have been the rational answer. "Definitely no evidence" would be another fine option that was not available.

Especially given the lack of audits of the votes and the undemocratic active resistance of some Republican politicians against audits, "Probably not true" is as wrong as "Probably true". As an aside, in a democracy there should be no doubt that votes are counted correctly: every citizen should be able to follow the trail from the filled-in ballot the voter put into the ballot box to the final count, and audits should actually count paper ballots by hand and naturally be free/automatic.

The strangest people are those that said "Definitely true" or "Definitely not true".

The article made the story more juicy by combining the 17% of Democrats answering "Definitely true" with the 35% answering "Probably true" to "50% of Dems".

Conspiracies do exist. If there was more than one person involved in making this poll, my personal conspiracy theory would be that the question was crafted to produce artificial outrage, clicks and revenue.

Another conspiracy question was: "52. Conspiracy Theories – Leaked email from some of Hillary Clinton’s campaign staffers contained code words for pedophilia, human trafficking and satanic ritual abuse - what some people refer to as ’Pizzagate’."

Most of the Pizzagate Republicans (40%) said "probably true" rather than "definitely true" (9%) and again had no way to say "I do not know". Most Americans should likely have answered "I do not know" because they do not follow the conspiracy media that closely.

Such pedophilia conspiracies naturally exist, but in case of Pizzagate there is no evidence for it. "Definitely no evidence" would be the right answer, but that could again not be answered.

The question is also badly phrased. The hacked emails did contain the word "pizza", which is claimed to be a code word used by pedophiles. Thus if you take the question too literally, you could even answer that the statement is true.

These answers are sad, but in no way as bad as the headline "Nearly Half Of Trump Voters Believe Hillary Is Pimping Kids" suggests. The poll is the saddest part of this story.

1. If you see a poll, check the exact formulation of the question and the answers, especially when it is not a standard question that is regularly asked and the poll is thus more likely intended to generate clicks.
2. Even reputable sources, like The Economist and Secular Talk, can be wrong.
3. Large parts of the media make money manufacturing outrage. To reduce this, do your due diligence before you spread an emotional story. If that means spreading fewer stories: fine. News is no longer scarce, quality is.

This poll was a way to produce clicks and to divide an already divided country even more.

Related reading

5 things the media does to manufacture outrage.

The BBC will continue fake debates on climate science on false balance ("due weight") and fake public debates.

Believe me, the GOP needs to open itself to rational debate.

Saturday, 24 December 2016

Can Trump fiddle with climate observations?

Some people worry about the Trump administration fiddling with climate data to get politically correct trends. There are many things to worry about. This is not one of them.

Raw data

A Trump stooge could not fiddle with the raw data because many other organisations also have copies. Old data can be found in the annual reports of the weather services; new data in their databases and in the many archives that collect the observations weather services share with each other: the so-called CLIMAT messages every month (for climate purposes) and GTS messages every day (for meteorology).

Nick Stokes checked how station data moves from the Australian Bureau of Meteorology (BOM) to NOAA's Global Historical Climate Network (GHCN). Spoiler: it fits. The marine observations by voluntary observing vessels are less open to the public due to piracy concerns, but this is just a small part of the marine data nowadays and regional data managers can check whether everything fits (Freeman et al., 2016).

Because climate data needs to be internally consistent, a lot of data would need to be changed. If only a few stations were changed, these would differ from their neighbours and as such be identified as faulty. Thus, to detect any fiddling with the raw data, only a small number of stations needs to be sampled.

Who fiddles the fiddlers?

The raw data is processed to estimate the global (and regional) climatic changes. The temperature change in the raw data and the actual estimate of the temperature increase are shown in the graph below. The actual temperature increase was smaller than that in the raw data. The main reason is that old sea surface temperature measurements were made with buckets, and the water in the bucket would cool a little due to evaporation before the thermometer was read.

Theoretically a Trump stooge could mess with this data processing. However, the software is open. Thus everyone can check whether the code produces the claimed results when applied to the raw data. The changes would thus have to be made in the open and justified.

The Trump stooge could naturally openly make changes to the code and claim that this "improves" the data processing. Whether the new software is actually an improvement is, however, something we can check. For the land station data we have a validation dataset where we know the climate signal we put in and the measurement artefacts we put in and can thus see how well the software removes the artefacts. The current homogenization software of NOAA removes these measurement artefacts well. If the software is fiddled with for political reasons, it will perform worse.

If that happens I am sure someone will be willing to apply the better original code to the raw data and publish these results. That only requires modest software skills.

Signs of clear fiddling

Apart from such audits larger changes would also be obvious because data needs to be consistent with each other. Land surface temperature, sea surface temperature and upper air temperature, for example, need to fit together. Marine temperatures from ships, drifting buoys, moored coastal buoys and [[ARGO]] need to fit together. Pressure will need to fit to wind, the circulation to precipitation, precipitation to snow cover, snow cover to reflectance, reflectance to incoming radiation and absorption. The changes in the physical climate would need to fit to the changes observed by biologists and bird spotters in nature, to changes noticed by agricultural scientists, economists and farmers in yields, to changes seen by glaciologists in glaciers and ice caps, to changes measured by hydrologists in stream flows.

It is easier to go to the moon than to fake the moon landing in Hollywood. It is easier to fake the moon landing than to make significant changes to climate data without being caught.

Destruction of data

Thus with some vigilance the data we have will be okay. What is worrying is the possible destruction of datasets and the discontinuing of measurements. Trump's election has shown that catastrophes with less than 50% chance do happen. Climate data is part of our cultural and scientific heritage and important to protect communities. Thus we should not take any risks with them.

Destroying data would put American communities in more danger, but the Trump administration may not care. For instance, Florida’s Republican government banned state employees from discussing global warming. That hinders adaptation to climate change. Republican North Carolina legislators voted to ignore sea-level rise projections, putting citizens at a higher risk of drowning, endangering infrastructure and leading to higher adaptation costs later on. Several Republican politicians have wasted taxpayer money to harass climate scientists in return for campaign contributions.

Dumpster in Quebec with hundreds of carelessly discarded historic books and documents.
The conservative Harper government in Canada committed libricide and destroyed seven environmental libraries and threw the books on the trash heap.

Also what has not happened before can happen. The radicalised Congress has shown disregard of the American public by shutting down the government. In the election campaign Trump called for violence to quell protest and to lock up his opponent. An alt-nazi will be advisor in the White House. Never before have so many banks and oil companies had a seat at the tables of power. This is the first time that a foreign power was forced to move a celebration to the hotel of the president-elect. Presidents normally do not have hotels in Washington DC that all diplomats will use to gain favours. Trump will be the first president with a 300 million dollar loan from a foreign bank he is supposed to regulate. This list could be longer than this post. Do not be fooled that this is normal.

If a Trump stooge were to order the deletion of a dataset, the backups would be deleted as well. Thus it is good that independent initiatives have sprung up to preserve digital archives. I hope and trust that all American scientists will make sure that there are copies of their data and code on private disks and in foreign countries.

Unfortunately not all data is digitised or digitisable, many documents still need to be scanned, proxy sources such as (ice) cores and tree rings contain information that has not been measured yet or needs future technologies to measure. Some of these ice cores come from glaciers that no longer exist.

Observations could be stopped. Even if they would be continued again after four years, the gap would limit our ability to see changes and thus to adapt to climate change and limit the damages. Looking at the proposed members of the Trump cabinet, I fear that such damages and costs for American citizens will not stop them. I hope that the blue states and Europe will be willing to pick up the tab until decency is restored and is prepared to move fast when needed. At a scientific conference in San Francisco Jerry Brown, Governor of California, promised earlier this month that "if Trump turns off the earth monitoring satellites California will launch its own damn satellites." A hopeful sign in the face of Washington fundamentalism.

Related reading

The Center for Science and Democracy at the Union of Concerned Scientists has established a hotline for National Oceanic and Atmospheric Administration (NOAA) employees to report political meddling

How Trump’s White House Could Mess With Government Data. 538 on how the Trump administration could fiddle with other (economic) datasets and especially affect how the information is communicated

A chat with Gavin Schmidt of NASA-GISS on why climate data is mostly safe and on the legal protections for federal scientists communicating science

Just the facts, homogenization adjustments reduce global warming

Statistical homogenisation for dummies

Benchmarking homogenisation algorithms for monthly data

Brady Dennis for The Washington Post: Scientists are frantically copying U.S. climate data, fearing it might vanish under Trump

Canadian CBC radio on Harper's carbon government attack on science: Science Under Siege

On the cuts to Canadian science and observational capabilities under Harper. Academic Matters: Harper’s attack on science: No science, no evidence, no truth, no democracy

On Harper's destruction of libraries: The Harper Government Has Trashed and Destroyed Environmental Books and Documents

In Florida, officials ban term 'climate change'

New Law in North Carolina Bans Latest Scientific Predictions of Sea-Level Rise


Freeman, E., Woodruff, S. D., Worley, S. J., Lubker, S. J., Kent, E. C., Angel, W. E., Berry, D. I., Brohan, P., Eastman, R., Gates, L., Gloeden, W., Ji, Z., Lawrimore, J., Rayner, N. A., Rosenhagen, G. and Smith, S. R., 2016: ICOADS Release 3.0: a major update to the historical marine climate record. Int. J. Climatol., doi: 10.1002/joc.4775

Tuesday, 6 December 2016

Scott Adams: The Non-Expert Problem and Climate Change Science

Scott Adams, the creator of Dilbert, wrote today about how difficult it is for a non-expert to judge science and especially climate science. He argues that it is normally a good idea for a non-expert to follow the majority of scientists. I agree. Even as a scientist I do this for topics where I am not an expert and do not have the time to go into detail. You cannot live without placing trust and you should place your trust wisely.

While it is clear to Scott Adams that a majority of scientists agree on the basics of climate change, he worries that they still could all be wrong. He lists the below six signals that this could be the case and sees them in climate science. If you get your framing from the mitigation sceptical movement and only read the replies to their nonsense you may easily get his impression. So I thought it would be good to reply. It would be better to first understand the scientific basis, before venturing into the wild.

The terms Global Warming and Climate Change have both been used for decades

Scott Adams assertion: It seems to me that a majority of experts could be wrong whenever you have a pattern that looks like this:

1. A theory has been “adjusted” in the past to maintain the conclusion even though the data has changed. For example, “Global warming” evolved to “climate change” because the models didn’t show universal warming.

This is a meme spread by the mitigation sceptics that is not based in reality. From the beginning both terms were used. One hint is the name of the Intergovernmental Panel on Climate Change, a global group of scientists which synthesises the state of climate research and was created in 1988.

The irony of this strange meme is that it was the PR gurus of the US Republicans who told their politicians to use the term "climate change" rather than "global warming", because "global warming" was more scary. The video below shows the historical use of both terms.

Global warming was called global warming because the global average temperature is increasing. Especially in the beginning there were still many regions where warming was not yet observed, while it was clear that the global average temperature was increasing. I use the term "global warming" if I want to emphasise the temperature change and the term "climate change" when I want to include all the other changes in the water cycle and circulation. These colleagues do the same and provide more history.

Talking about "adjusted", mitigation sceptics like to claim that temperature observations have been adjusted to show more warming. The truth is that the adjustments reduce global warming.

Climate models are not essential for basic understanding

Scott Adams assertion: 2. Prediction models are complicated. When things are complicated you have more room for error. Climate science models are complicated.

Yes, climate models are complicated. They synthesise a large part of our understanding of the climate system and thus play a large role in the synthesis of the IPCC. They are also the weakest part of climate science and thus a focus of the propaganda of the mitigation sceptical movement.

However, when it comes to the basics, climate models are not important. We have known about the greenhouse effect for well over a century, long before we had any numerical climate models. That increasing the carbon dioxide concentration of the atmosphere leads to warming is clear; that this warming is amplified because warm air can contain more water vapour, which is also a greenhouse gas, is also clear without any complicated climate model. This is very simple physics, already used by Svante Arrhenius in the 19th century.

The warming effect of carbon dioxide can also be observed in the deep past. There are many reasons why the climate changes, but without carbon dioxide we can, for example, not understand the temperature swings of the past ice ages or why the Earth was able to escape from being completely frozen (Snowball Earth) at a time the sun was much dimmer.

The main role of climate models is to find reasons why the climate may respond differently this time than in the past, or whether there are mechanisms beyond the simple physics that are important. The average climate sensitivity from climate models is about the same as for all the other lines of evidence. Furthermore, climate models add regional detail, especially when it comes to precipitation, evaporation and storms. These details are helpful to better plan adaptation and estimate the impacts and costs, but they are not central to the main claim that there is a problem.

Model tuning not important for basic understanding

Scott Adams assertion: 3. The models require human judgement to decide how variables should be treated. This allows humans to “tune” the output to a desired end. This is the case with climate science models.

Yes, models are tuned. Mostly not for the climatic changes, but to get the state of the atmosphere right: the global maps of clouds and precipitation, for example. In the light of my answer to point 2, this is not important for the question of whether climate change is real.

The consensus is a result of the evidence

Scott Adams assertion: 4. There is a severe social or economic penalty for having the “wrong” opinion in the field. As I already said, I agree with the consensus of climate scientists because saying otherwise in public would be social and career suicide for me even as a cartoonist. Imagine how much worse the pressure would be if science was my career.

It is clearly not career suicide for a cartoonist. If you claim that you only accept the evidence because of social pressure, you are saying you do not really accept the evidence.

Scott Adams sounds as if he would like scientists to first freely pick a position and only then look for evidence. In science it should go the other way around.

This seems to be the main argument and shows that Scott Adams knows more about office workers than about the scientific community. If science was your career and you would peddle the typical nonsense that comes from the mitigation sceptical movement that would indeed be bad for your career. In science you have to back up your claims with evidence. Cherry picking and making rookie errors to get the result you would like to get are not helpful.

However, if you present credible evidence that something is different, that is wonderful; that is why you become a scientist. I have been very critical of the quality of climate data and of our methods to remove data problems. Contrary to Adams' expectation this has helped my career. Thus I cannot complain about how climatology treats real skeptics. On the contrary, a lot of people supported me.

Another climate scientist, Eric Steig, strongly criticized the IPCC. He wrote about his experience:
I was highly critical of IPCC AR4 Chapter 6, so much so that the [mitigation skeptical] Heartland Institute repeatedly quotes me as evidence that the IPCC is flawed. Indeed, I have been unable to find any other review as critical as mine. I know (because they told me) that my reviews annoyed many of my colleagues, including some of my [RealClimate] colleagues, but I have felt no pressure or backlash whatsoever from it. Indeed, one of the Chapter 6 lead authors said: "Eric, your criticism was really harsh, but helpful, thank you!"
If you have the evidence, there is nothing better than challenging the consensus. It is also the reason to become a scientist. As a scientist wrote on Slashdot:
Look, I'm a scientist. I know scientists. I know scientists at NOAA, NCAR, NIST, the Labs, in academia, in industry, at biotechs, at agri-science companies, at space exploration companies, and at oil and gas companies. I know conservative scientists, liberal scientists, agnostic scientists, religious scientists, and hedonistic scientists.

You know what motivates scientists? Science. And to a lesser extent, their ego. If someone doesn't love science, there's no way they can cut it as a scientist. There are no political or monetary rewards available to scientists in the same way they're available to lawyers and lobbyists.

Scientists consider and weigh all the evidence

Scott Adams assertion: 5. There are so many variables that can be measured – and so many that can be ignored – that you can produce any result you want by choosing what to measure and what to ignore. Our measurement sensors do not cover all locations on earth, from the upper atmosphere to the bottom of the ocean, so we have the option to use the measurements that fit our predictions while discounting the rest.

No, a scientist cannot produce any result they "want" and an average scientist would want to do good science and not get a certain result. The scientific mainstream is based on all the evidence we have. The mitigation sceptical movement behaves in the way Scott Adams expects and likes to cherry pick and mistreat data to get the results they want.

Arguments from the other side only look credible

Scott Adams assertion: 6. The argument from the other side looks disturbingly credible.

I do not know which arguments Adams is talking about, but the typical nonsense on WUWT, Breitbart, Daily Mail & Co. is made to look credible on the surface. But put on your thinking cap and it crumbles. At least check the sources. That reveals most of the problems very quickly.

For a scientist it is generally clear which arguments are valid, but it is indeed a real problem that to the public even the most utter nonsense may look "disturbingly credible". To help the public assess the credibility of claims and sources several groups are active.

Most of the zombie myths are debunked on RealClimate or Skeptical Science. If it is a recent WUWT post and you do not mind some snark you can often find a rebuttal the next day on HotWhopper. Media articles are regularly reviewed by Climate Feedback, a group of climate scientists, including me. They can only review a small portion of the articles, but it should be enough to determine which of the "sides" is "credible". If you claim you are sceptical, do use these resources and look at all sides of the argument and put in a little work to go in depth. If you do not do your due diligence to decide where to place your trust, you will get conned.

While political nonsense can be made to look credible, the truth is often complicated and sometimes difficult to convey. There is a big difference between qualified critique and uninformed nonsense. Valuing the strength of the evidence is part of the scientific culture. My critique of the quality of climate data has credible evidence behind it. There are also real scientific problems in understanding changes of clouds, as well as the land and vegetation. These are important for how much the Earth will respond, although in the long run the largest source of uncertainty is how much we will do to stop the problem.

There are real scientific problems when it comes to assessing the impacts of climate change. That often requires local or regional information, which is a lot more difficult than the global average. Many impacts will come from changes in severe weather, which are by definition rare and thus hard to study. For many impacts we need to know several changes at the same time. For droughts, for example, precipitation, temperature, humidity of the air and of the soil, and insolation are all important. Getting them all right is hard.

How humans and societies will respond to the challenges posed by climate change is an even more difficult problem and beyond the realm of natural science. Not only the benefits, but also the costs of reducing greenhouse gas emissions are hard to predict. That would require predicting future technological, economic and social development.

When it comes to how big climate change itself and its impacts will be, I am sure we will see surprises. What I do not understand is why some argue that this uncertainty is a reason to wait and see. The surprises will not only be nice, they will also be bad, and overall they increase the risks of climate change and make the case for solving this solvable problem stronger.

Related reading

Older post by a Dutch colleague on Adams' main problem: Who to believe?

How climatology treats sceptics

What's in a Name? Global Warming vs. Climate Change

Fans of Judith Curry: the uncertainty monster is not your friend

Video medal lecture Richard B. Alley at AGU: The biggest control knob: Carbon Dioxide in Earth's climate history

Just the facts, homogenization adjustments reduce global warming

Climate model ensembles of opportunity and tuning

Journalist Potholer makes excellent videos on climate change and true scepticism: Climate change explained, and the myths debunked

* Photo Arctic Sea Ice by NASA Goddard Space Flight Center used under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license.
* Cloud photo by Bill Dickinson used under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0) license.

Wednesday, 30 November 2016

Statistically significant trends - Short-term temperature trends are more uncertain than you probably think

Yellowknife, Canada, where the annual mean temperature is zero degrees Celsius.

In times of publish or perish, it can be tempting to put "hiatus" in your title and publish an average article on climate variability in one of the prestigious Nature journals. But my impression is that this does not explain all of the enthusiasm for short-term trends. Humans are greedy pattern detectors: it is better to see a tiger, a conspiracy or a trend change one time too many than one time too few. Thus maybe humans have a tendency to see significant trends where statistics keeps a cooler head.

Whatever the case, I expect that many scientists, too, will be surprised to see how large the difference in uncertainty is between long-term and short-term trends. However, I will start with the basics, hoping that everyone can follow the argument.

Statistically significant

That something is statistically significant means that it is unlikely to have happened due to chance alone. When we call a trend statistically significant, it means that it is unlikely that a trend of that size would appear by chance alone if there were in fact no trend. Thus to study whether a trend is statistically significant, we need to study how large a trend can be when we draw random numbers.

For each of the four plots below, I drew ten random numbers and then computed the trend. This could be 10 years of the yearly average temperature in [[Yellowknife]]*. Random numbers do not have a trend, but as you can see, a realisation of 10 random numbers appears to have one. These trends may be non-zero, but they are not significant.

If you draw 10 numbers and compute their trends many times, you can see the range of trends that are possible in the left panel below. On average these trends are zero, but a single realisation can easily have a trend of 0.2. Even higher values are possible with a very small probability. The statistical uncertainty is typically expressed as a confidence interval that contains 95% of all points. Thus even when there is no trend, there is a 5% chance that the data has a trend that is wrongly seen as significant.**

If you draw 20 numbers, 20 years of data, the right panel shows that those trends are already quite a lot more accurate, there is much less scatter.
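This Monte Carlo experiment is easy to reproduce. The original post links to R code; the Python sketch below, with a seed and sample sizes of my own choosing, just illustrates the idea:

```python
import numpy as np

rng = np.random.default_rng(42)

def trend(y):
    # Ordinary least-squares slope of y against time 0, 1, ..., n-1.
    x = np.arange(len(y))
    return np.polyfit(x, y, 1)[0]

# 10,000 series of 10 "years" of white noise with mean 0 and standard
# deviation 1, roughly like Yellowknife's annual mean temperatures.
trends = [trend(rng.standard_normal(10)) for _ in range(10_000)]

# The 95% range of trends that pure chance alone can produce.
lo, hi = np.percentile(trends, [2.5, 97.5])
print(round(lo, 2), round(hi, 2))  # roughly -0.2 and 0.2 degrees per year
```

Even though the true trend is zero by construction, about one in twenty of these ten-year series has a trend outside this range, which is exactly the 5% false-alarm rate mentioned above.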

To have a look at the trend errors for a range of different lengths of the series, the above procedure was repeated for lengths between 5 and 140 random numbers (or years) in steps of 5 years. The confidence interval of the trend for each of these lengths is plotted below. For short periods the uncertainty in the trend is enormous. It shoots up.

In fact, the confidence range for short periods shoots up so fast that it is hard to read the plot. Thus let's show the same data with different (double-logarithmic) axes in the graph below. Then the relationship looks like a line, which shows that the size of the confidence interval is a power-law function of the number of years.

The exponent is -1.5. As an example, that means that the confidence interval of a ten-year trend is 32 (10^1.5) times as large as that of a hundred-year trend.
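The -1.5 exponent also follows from the standard closed-form expression for the standard error of a least-squares trend fitted to white noise, sigma * sqrt(12 / (n(n^2 - 1))), which behaves like n^-1.5 for large n. A short Python check (the helper name is my own):

```python
import math

def slope_se(n, sigma=1.0):
    # Standard error of an OLS trend fitted to n independent points with
    # noise standard deviation sigma, one point per year.
    return sigma * math.sqrt(12.0 / (n * (n**2 - 1)))

# The ratio of the trend uncertainty for 10 versus 100 years of data:
ratio = slope_se(10) / slope_se(100)
print(round(ratio, 1))  # close to 32, i.e. about 10**1.5
```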

Some people looking at the global mean temperature increase plotted below claim to see a hiatus between the years 1998 and 2013. A few years ago I could imagine people thinking: that looks funny, let's make a statistical test whether there is a change in the trend. But when the answer then clearly is "No, no way", and the evidence shows it is "mostly just short-term fluctuations from El Nino", I find it hard to understand why people believe in this idea so strongly that they defend it against this evidence.

Especially now it is so clear, without any need for statistics, that there never was anything like a "hiatus". But still some people claim there was one, but that it stopped. I have no words. Really, I am not faking this, dear colleagues. I am at a loss.

Maybe people look at the graph below and think: well, that "hiatus" is ten percent of the data, and intuit that the uncertainty of the trend is only 10 times as large, not realising that it is 32 times as large.

Maybe people use their intuition from computing averages; the uncertainty of a ten-year average is only 3 times as large as that of a 100-year average. That is a completely different game.

The plots below for the uncertainty in the average are made in the same way as the above plots for the trend uncertainty. Here too, more data is better, but the function is much less steep. Plots of power laws always look very similar; you need to compare the axes or the computed exponent, which in this case is only -0.5.
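The gentler scaling for averages can be checked the same way (again a Python sketch rather than the post's R code):

```python
import math

# The standard error of an average of n independent values scales as
# n**-0.5, much more gently than the n**-1.5 scaling of a trend.
se_mean_10 = 1.0 / math.sqrt(10)    # 10-year average
se_mean_100 = 1.0 / math.sqrt(100)  # 100-year average
print(round(se_mean_10 / se_mean_100, 2))  # about 3.16, i.e. sqrt(10)
```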

It is typical to use 30-year periods to study the climate. These so-called climate normals were introduced around 1900, at a time when the climate was more or less stable and needed to be described for agriculture, geography and the like. Sometimes it is argued that to compute climate trends you need at least 30 years of data. That is not a bad rule of thumb and would avoid a lot of nonsense, but the 30-year periods were not intended as periods over which to compute trends. Given how bad people's intuition apparently is, there seems to be no alternative to formally computing the confidence interval.

That short-term trends have such a large uncertainty also provides some insight into the importance of homogenisation. The typical time between two inhomogeneities is 15 to 20 years for temperature. The trend over the homogeneous subperiods between two inhomogeneities is thus very uncertain and not that important for the long-term trend. What counts is the trend of the averages of the homogeneous subperiods.

That insight makes you want to be sure you do a good job when homogenising your data, rather than mindlessly assuming everything will be alright and the raw data are good enough. Neville Nicholls wrote about how he started working on homogenisation:
When this work began 25 years or more ago, not even our scientist colleagues were very interested. At the first seminar I presented about our attempts to identify the biases in Australian weather data, one colleague told me I was wasting my time. He reckoned that the raw weather data were sufficiently accurate for any possible use people might make of them.

Related reading

How can the pause be both ‘false’ and caused by something?

Atmospheric warming hiatus: The peculiar debate about the 2% of the 2%

Sad that for Lamar Smith the "hiatus" has far-reaching policy implications

Temperature trend over last 15 years is twice as large as previously thought

Why raw temperatures show too little global warming


* In Yellowknife the annual mean temperature is about zero degrees Celsius. Locally the standard deviation of annual temperatures is about 1°C. Thus I could conveniently use the normal distribution with zero mean and standard deviation one. The global mean temperature has a much smaller standard deviation of its fluctuations around the long-term trend.
** Rather than calling something statistically significant and thus only communicating whether the probability was below 5% or not, it fortunately becomes more common to simply give the probability (p-value). In the past this was hard to compute and people compared their computation to the 5% levels given in statistical tables in books. With modern numerical software it is easy to compute the p-value itself.
*** Here is the cleaned R code used to generate the plots of this post.

The photo of Yellowknife at the top is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.