Data Splices

Splicing data sets is a virtual necessity in climate research.  Let’s think about how I might get a 500,000 year temperature record.  For the first 499,000 years I probably would use a proxy such as ice core data to infer a temperature record.  From 150-1000 years ago I might switch to tree ring data as a proxy.  From 30-150 years ago I probably would use the surface temperature record.  And over the last 30 years I might switch to the satellite temperature measurement record.  That’s four data sets, with three splices.

But there is, obviously, a danger in splices.  It is sometimes hard to ensure that the zero values are calibrated between two records (typically we look at some overlap time period to do this).  One record may have a bias the other does not have.  One record may suppress or cap extreme measurements in some way (example – there is some biological limit to tree ring growth, no matter how warm or cold or wet or dry it is).  We may think one proxy record is linear when in fact it may not be linear, or may be linear over only a narrow range.
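To make the overlap idea concrete, here is a minimal sketch of how one might match the zero of an older record to a newer one using the years they share. The function name, the year-to-value dictionary layout, and the numbers are all made up for illustration; this is not how any particular reconstruction was actually calibrated.

```python
from statistics import mean

def align_on_overlap(older, newer):
    """Shift `older` so its mean over the shared years matches `newer`.

    Both arguments are dicts mapping year -> value (a hypothetical layout).
    Returns the shifted copy of `older` and the offset that was applied.
    """
    overlap = sorted(set(older) & set(newer))
    if not overlap:
        raise ValueError("series share no years, so no offset can be estimated")
    offset = mean(newer[y] for y in overlap) - mean(older[y] for y in overlap)
    return {year: value + offset for year, value in older.items()}, offset

# Made-up numbers, purely for illustration
proxy = {1900: 0.1, 1901: 0.2, 1902: 0.3, 1903: 0.2}
instrument = {1902: 0.8, 1903: 0.7, 1904: 0.9}

shifted_proxy, offset = align_on_overlap(proxy, instrument)
print(f"offset applied to the older record: {offset:.2f}")
```

Note that this only lines up the averages over the overlap.  It does nothing about a proxy that is capped, biased, or nonlinear, so the two records can still differ in scale and volatility after the zeros are matched.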

We have to be particularly careful about what conclusions we draw around the splices.  In particular, one would expect scientists to be very, very skeptical of inflections or radical changes in the slope or other characteristics of the data that occur right at a splice.  Occam’s Razor might suggest the more logical explanation is that such changes are related to incompatibilities between the two data sets being spliced, rather than any particular change in the physical phenomena being measured.

Ah, but not so in climate.  A number of the more famous recent findings in climate have coincided with splices in data sets.  The most famous is in Michael Mann’s hockey stick, where the upward slope at the end of the hockey stick occurs exactly at the point where tree ring proxy data is spliced to instrument temperature measurements.  If one looks only at the tree ring data brought to the present, no hockey stick occurs (in fact, the opposite occurs in many of the data sets he uses).   The obvious conclusion would have been that the tree ring proxy data might be flawed, and that it was not directly comparable with instrumental temperature records.  Instead, Al Gore built a movie around it.  If you are interested, the splice issue with the Mann hockey stick is discussed in detail here.

Another example that I have not spent as much time with is the ocean heat content data, discussed at the end of this post.  Heat content data from the ARGO buoy network is spliced onto older data.  The ARGO network has shown flat to declining heat content every year of its operation, except for a jump in year one from the old data to the new data.  One might come to the conclusion that the two data sets did not have their zeros matched well, such that the one year jump is a calibration issue in joining the data sets, and not the result of an actual huge increase in ocean heat content of a magnitude that has not been observed before or since.  Instead, headlines read that the ARGO network has detected huge increases in ocean heat content!

So this brings us to today’s example, probably the most stark and obvious of the bunch, and we have our friend Michael Mann to thank for that.  Mr. Mann wanted to look at 1000 years of hurricanes, the way he did for temperatures.  He found some proxy for hurricanes in years 100-1000, basically looking at sediment layers.  He uses actual observations for the last 100 years or so as reported by a researcher named Landsea  (one has to adjust hurricane numbers for observation technology bias — we don’t miss any hurricanes nowadays, but hurricanes in 1900 may have gone completely unrecorded depending on their duration and track).  Lots of people argue about these adjustments, but we are not going to get into that today.

Here are his results, with the proxy data in blue and the Landsea adjusted observations in red.  Again you can see the splice of two very different measurement technologies.

[Figure: Mann et al. proxy estimates (blue) and Landsea adjusted observations (red), unsmoothed]

Now, you be the scientist.  To help you analyze the data, Roger Pielke via Anthony Watt has calculated to basic statistics for the blue and red lines:

The Mann et al. historical predictions [blue] range from a minimum of 9 to a maximum of 14 storms in any given year (rounding to nearest integer), with an average of 11.6 storms and a standard deviation of 1.0 storms. The Landsea observational record [red] has a minimum of 4 storms and a maximum of 28 with an average of 11.7 and a standard deviation of 3.75.

The two series have almost dead-on the same mean but wildly different standard deviations.  So, junior climate scientists, what did you conclude?  Perhaps:

  • The hurricane frequency over the last 1000 years does not appear to have increased appreciably over the last 100, as shown by comparing the two means.  or…
  • We couldn’t conclude much from the data because there is something about our proxy that is suppressing the underlying volatility, making it difficult to draw conclusions.

Well, if you came up with either of these, you lose your climate merit badge.  In fact, here is one sample headline:

Atlantic hurricanes have developed more frequently during the last decade than at any point in at least 1,000 years, a new analysis of historical storm activity suggests.

Who would have thought it?  A data set with a standard deviation of 3.75 produces higher maximum values than a data set with the same mean but with the standard deviation suppressed down to 1.0.  Unless, of course, you actually believe that the volatility of the underlying natural process suddenly increased several-fold, coincidentally in the exact same year as the data splice.
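If you want to put rough numbers on this, here is a small sketch using only the means and standard deviations quoted above.  The normal-distribution tail probability is my own simplifying assumption (storm counts are discrete and can't go negative), and the 15-storm year is just an illustrative value, so treat the output as a back-of-the-envelope comparison and not a result from either paper.

```python
from statistics import NormalDist

# Summary statistics quoted above (Pielke's figures)
PROXY = {"mean": 11.6, "std": 1.0}      # Mann et al. proxy estimates (blue)
OBSERVED = {"mean": 11.7, "std": 3.75}  # Landsea adjusted observations (red)

def z_score(value, mean, std):
    """How many standard deviations `value` sits above the series mean."""
    return (value - mean) / std

def chance_of_exceeding(value, mean, std):
    """P(X > value) if the series were normal with this mean and std (an assumption)."""
    return 1.0 - NormalDist(mu=mean, sigma=std).cdf(value)

storms = 15  # illustrative busy year, just above the proxy's stated maximum of 14

z_proxy = z_score(storms, **PROXY)
z_observed = z_score(storms, **OBSERVED)
p_proxy = chance_of_exceeding(storms, **PROXY)
p_observed = chance_of_exceeding(storms, **OBSERVED)

print(f"proxy:    z = {z_proxy:.2f}, P(> {storms}) = {p_proxy:.4f}")
print(f"observed: z = {z_observed:.2f}, P(> {storms}) = {p_observed:.4f}")
```

A 15-storm year sits 3.4 standard deviations above the proxy mean but less than one standard deviation above the observed mean.  In other words, a year that would look like a once-in-a-millennium event by the proxy's yardstick is fairly routine by the yardstick of the observations.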

As Pielke concluded:

Mann et al.’s bottom-line results say nothing about climate or hurricanes, but what happens when you connect two time series with dramatically different statistical properties. If Michael Mann did not exist, the skeptics would have to invent him.

Postscript #1: By the way, hurricane counts are a horrible way to measure hurricane activity (hurricane landfalls are even worse).  The size and strength and duration of hurricanes are also important.  Researchers attempt to factor these all together into a measure of accumulated cyclone energy.  This metric of world hurricanes and cyclones has actually been falling over the last several years.

[Figure: global running accumulated cyclone energy]
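For readers who have not run into accumulated cyclone energy before, here is a sketch of the usual definition as I understand it: square each six-hourly maximum sustained wind speed (in knots) while the system is at tropical storm strength or above, add them up, and scale by 10^-4.  The 34 knot cutoff is my assumption for "tropical storm strength," and the wind readings below are invented for illustration.

```python
def storm_ace(winds_kt, threshold_kt=34):
    """ACE for one storm, in units of 10^4 kt^2.

    `winds_kt` holds six-hourly maximum sustained winds in knots.
    Readings below `threshold_kt` (assumed tropical storm cutoff) are ignored.
    """
    return 1e-4 * sum(v * v for v in winds_kt if v >= threshold_kt)

def season_ace(storms):
    """Total ACE for a season given as a list of per-storm wind lists."""
    return sum(storm_ace(winds) for winds in storms)

# Two invented seasons with the same storm count but very different energy
quiet_season = [[35, 40, 35], [35, 35]]
active_season = [[35, 50, 65, 80, 60, 40], [40, 70, 100, 90, 50]]

for name, season in (("quiet", quiet_season), ("active", active_season)):
    print(f"{name}: {len(season)} storms, ACE = {season_ace(season):.3f}")
```

Both made-up seasons have two storms, yet the second carries roughly seven times the energy of the first, which is why a simple count can be so misleading.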

Postscript #2: Just as another note on Michael Mann, he is the guy who made the ridiculously overconfident statement that “there is a 95 to 99% certainty that 1998 was the hottest year in the last one thousand years.”   By the way, Mann now denies he ever made this claim, despite the fact that he was recorded on video doing so.  The movie Global Warming:  Doomsday Called Off has the clip.  It is about 20 seconds into the 2nd of the 5 YouTube videos at the link.

  • Tim Davis

    Hmm….you really have to wonder about the competence of these guys don’t you – the more elementary the errors they make the more attention they get. You would think that by now someone in the climate science community would have taken Mann aside and asked him quietly and politely to stop embarrassing all of them – instead they just close ranks and refuse to admit any errors or problems whatsoever. Pathetic!

  • chico sajovic

    Very nice overview. I am also curious about the splices for SST. Recently there was a report that this year’s SST was the highest on record. I believe the record they are talking about is a splice of 4 different datasets: (1) bucket temperature measurements (2) engine intake temperature measurements (3) XBT (4) Argo buoys.

  • clazy

    Clear discussion nicely illustrated by Mann’s example. If he didn’t exist, you would have invented him.

    By the way, you’ve tagged this plice rather than splice.

  • AnonyMoose

    Probably a typo in “Anthony Watt has calculated to basic statistics”.

  • gt

    One thing that irks me a lot is the loose definition of “data”. According to mine (admittedly I can be wrong), the red line is comprised of real data. Every point on the line is based on actual observations. Is the blue line, which is obtained based on analysis of proxies, “data”? Maybe “estimation” is a more appropriate word?

  • Pete S

    I agree entirely with Tim Davis. How on earth can the AGW community let Mann open his mouth on any climate matter after so many embarrassing mistakes? One can only assume that they are so arrogant because they seem to have the world’s politicians on their side.

  • Alex Llewelyn

    What do you think of Tamino’s noncomputer model of temperature at open mind?

    http://tamino.wordpress.com/2009/08/17/not-computer-models/#more-1788

  • Fred from Canuckistan . . .

    “and a standard deviation of 3.75.”

    ’nuff said.

  • An unkempt man with his face covered in ashes and wearing a gunny sack robe walks toward us with a sign saying “Repent. The end is near.” When asked for the date of the end, he says next week. When seen again, three weeks later, his appearance and sign are the same. Again, when asked for the date of the end, he says next week. We conclude the man is most likely a psychotic and no longer pay attention to him.

    A person stands in front of us, clean shaven, and wearing a sharp looking three piece suit says essentially the same thing week, after week, after week. The politicians embrace his message, plan to enslave us, want to control every aspect of our lives, and expect us to stand by as they steal us blind. Now who is psychotic? The messenger of doom, the politicians, or we who voted the politicians into office?

    The King is naked. There are increasing numbers of clear voices saying “The King is naked.” Let’s hope it’s enough, soon enough; otherwise things are going to get very ugly. Keep up the good work.

  • Alex Llewelyn: What do you think of Tamino’s noncomputer model of temperature at open mind?

    His intro to the NO GHG plot: “If we use all climate forcings except greenhouse gases we get this:”

    His presumption is that the climate forcings listed by NASA GISS are exhaustive.

    The headings of the NASA GISS data table he used are W-M_GHGs, O3, StratH2O, Solar, LandUse, SnowAlb, StratAer, BC, ReflAer, AIE. Apparently that is all NASA could think of but is it an exhaustive list? Is it even close to accurate to represent each with a single parameter? Are the appropriate factors taken into account to represent each so called forcing? Are they taking the interactions between the forcings into account? None of this is obvious or even addressed in the article. It is simply accepted by Argument of Authority to be adequate. Since the curves give the result desired by the author he does not question it.

    Consider: The sun has multiple impacts: radiance, magnetic flux, solar wind, other?. The radiance is directly related to the heat flux between the sun and the earth and is overwhelmingly the earth’s main source of heat. However, the magnetic flux and solar wind effects have been shown to markedly affect the nature and quantity of cloud cover. Cloud cover is NOT included as one of the forcings. This alone establishes the incompleteness and inadequacy of the so called simulation.

    So the argument essentially consists of Argument from Authority, Begging the Question, and Argument from Ignorance. Any one of these logical fallacies is sufficient to discredit the argument. All the rest is simply a pseudo-scientific window dressing so that he can pretend to be doing science. He assumes what he is trying to prove and then acts surprised when he finds it.

    What do I think of the “noncomputer model”? It is scientific BS from start to finish.

  • Mitchel44

    “The last few decades have been marked by a special cultivation of the romance of the future. We seem to have made up our minds to misunderstand what has happened; and we turn, with a sort of relief, to stating what will happen—which is (apparently) much easier.”

    G.K. Chesterton, What’s Wrong With The World

    So, we don’t fully understand what caused the extreme climate changes of the past, but we are confident we can “model” the future, and that it’s all our fault.

    Jeez… I should feel more guilt I guess.

  • stan

    Mann’s “proxy” record for hurricanes is even more bizarre than his proxy record for temperature. He screwed up tree rings badly. His sediment indicator is an even bigger joke.

    I don’t think there’s much point any more in showing all the errors in Mann’s work. In basketball he would be called a “self-check” (i.e. someone incapable of scoring, even if left undefended). The focus should move onto those who publish him or collaborate with him. By choosing to associate with Mann and embracing his “results”, they have placed into question their own scientific judgment.

  • Dale

    As I read this and listened to the YouTube videos (I’m multi-tasking by trying to get some work done! ha!) I began to think about why Gore and Hansen are always in such a rush to declare the science as settled. Cynically, I thought it was because this was necessary to turn the money spigot on for their pockets. That may be true. It may also be true that these guys knew that their science was suspect and if they didn’t try to overwhelm the public with their “science”, that eventually respected scientists and the scientific community in general would start to blow large holes in their assumptions and models.

    A little off topic but, hey…

  • Billy

    I asked this question over at Air Vent too: what kind of research is put into verifying proxies are accurate in the first place before they are used as the basis for some paleoclimate reconstruction? It seems you’d want to spend a LOT of time doing this otherwise everything else is bunk, right?

    Instead, it almost seems like somebody just kind of says, hmmm, I bet sediment layers would be a good proxy for past storm activity let’s use that and off they go. I mean really. Tree rings, mussel shells and sediment layers? Bah! I’m not saying these records don’t contain interesting information or that they shouldn’t be studied. But the faith paleoclimatologists seem to have in the accuracy of the proxies always astonishes me.

    I mean look at that media quote again: “Atlantic hurricanes have developed more frequently during the last decade than at any point in at least 1,000 years”. I admit, I’m no scientist but common sense tells me that you cannot make that bold of a claim using sediment layers. You just can’t. I feel the same way about tree rings and temperature.

  • hunter

    Billy,
    Your point is very valid. But it is the torture and abuse of innocent proxies by the likes of Mann that is the real issue.
    The sediment proxies, until Mann’s creative genius was applied, clearly showed that storm frequency and strength during this era is not unusual, and is likely low.
    Mann, a creative genius like Michelangelo, looks at a pile of data and visualizes the hockey stick hidden in it. Michelangelo did this to marble. He would remove marble to find the work of art hidden within it.
    Mann, on the other hand, carves away all data that is not a hockey stick, and publishes the result. Michelangelo created sublime art. Mann fabricates untruths.
    Mann abuses a process that makes for great art, but lousy science.

  • Billy,

    Each proxy is local, confounded, noisy, and approximate. At best, it’s an indicator of a mushy average loosely tied to time and not a specific, hard, pinned in exact time, value. This is true no matter what proxy is used for any parameter you wish to consider.

    The statement “more frequently during the last decade than at any point in at least 1,000 years” cannot validly be made without specific, hard, pinned in exact time data. We have, at best, high quality data for little more than 10% of that period. Most of the rest is hit or miss anecdotal and mushy proxy data.

    The most one can validly say is the mushy average from proxy appears to be or not be inside the variability of the current specific, hard, pinned in exact time data. Since the proxy result for storm frequency currently appears to be well inside the variability of current data, it would be exceedingly risky to act upon the presumption that it is not.

    The tell is “no data has been found that….” Was the data looked for? Did they look in the right place? Did they look in the right way? Did they pay attention when they were looking? Did they use all the data in full context? Will they give you the details of time, place, method, raw data, and intermediate results? In almost all cases for the AGW (aka climate change) argument the answer to ALL these questions is from a qualified maybe to an absolutely not. Add to that the logical fallacies of Argument by Consensus, Argument from Authority, Argument from Ignorance, Begging the Question, Argument Ad Hominem, and the argument collapses into a very ugly pile of garbage. It is irrelevant to the AGWer that his argument is right, wrong, valid, or invalid. What is important is that you believe it is both right and valid and will follow him into the abyss.

    My suggestion is to look harder, think independently, and make your own choices in the matter. You are fully capable of knowing what is best for you. It simply takes enough time, effort, thought, logic, and discipline. There are no shortcuts.

  • Mark McKinnon

    The infamous Mann et al Hockey Stick controversy has eerie parallels to “How to Publish a Scientific Comment in 1 2 3 Easy Steps”. Follow the link to an amazing, hilarious and somewhat disturbing article.

    http://www.scribd.com/doc/18773744/null

  • Another Pete S.

    Attention is paid to what makes headlines. One that reads, “Climate Scientists Find No Reason to Panic!” might work once, but continued non-news doesn’t sell. CRISIS!! is all that matters to the media, right, wrong, or manufactured.

    We don’t hear about the absolute fraud that Mann is. We have to dig for such information. The media and political class will do what is necessary to keep the dead body breathing.

    But the Internet is something they don’t yet control, and the truth is getting out there. Also, the climate isn’t moving in the direction that Mann, Hansen, and the Goracle have predicted.

    On the other hand, the magnetic activity of the sun seems to be a significant factor in climate. I’m anxious to see further results from Svensmark. The research by him and his colleagues could finally pull the plug on the corpse of Anthropogenic climate change.

  • Andrew

    That’s one hairy hockey stick. Needs a shave. Hm…You know, Gavin and Michael aren’t exactly the most clean shaven guys. Maybe they thought that hurricane Goatee would look cool? They let it get out of control though, and Gandalf and Santa will be jealous.

  • j ferguson

    The thing about splicing that perplexes me is that if the proxies are effective reporters for any period, why not all periods in which they can be found and measured? Do the proxies show hockey stick blades?

  • Andrew

    j ferguson-That’s a good question. And the answer is…Well, many proxies seem to just end before the most recent decades! And the ones that don’t? They show…”divergence”.

  • Junior Samples

    I find it unfortunate that both the climate skeptic blogs and the global warming blogs are full of comments from cheerleaders. Is there any real debate going on between these two camps?

  • Steve

    I know that ice core sampling is one of the means of determining temperature estimates and I can understand how this is accomplished, but I do have my doubts about determining the atmospheric composition, in particular the CO2 levels. My doubt revolves around the fact that CO2 is absorbed by water (increased acidity in the Antarctic ocean surface). So, is it possible that some of the CO2 is absorbed into the snow and/or ice during the formation of the bubbles from which the atmospheric composition has been determined? If so, have climatologists estimated the absorption rate, given that this rate can increase or decrease with the surface temperature at the time the snow fell? I’m just saying?