Numbers Divorced from Reality

This article on Climate Audit really gets at an issue that bothers many skeptics about the state of climate science:  the profession seems to spend so much time manipulating numbers in models and computer systems that they start to forget that those numbers are supposed to have physical meaning.

I discussed the phenomenon once before.  Scientists are trying to reconstruct past climate variables like temperature and precipitation from proxies such as tree rings.  They begin with a relationship they believe exists based on an understanding of a particular system (e.g., for tree rings, trees grow faster when it’s warm, so tree rings are wider in warm years).  But as they manipulate the data over and over in their computers, they start to lose touch with this physical reality.

In this particular example, Steve McIntyre shows how, in one temperature reconstruction, scientists opportunistically changed the sign of the relationship between a proxy and temperature.  Doing so reversed their own physical understanding of the process and contradicted how similar proxies are handled in the same study, all in order to get the result they wanted.

McIntyre’s discussion may be too arcane for some, so let me give you an example.  As a graduate student, I have been tasked with proving that people are getting taller over time and estimating by how much.  As it turns out, I don’t have access to good historic height data, but by a fluke I inherited a hundred years of sales records from about 10 different shoe companies.  After talking to some medical experts, I gain some confidence that shoe size is positively correlated to height.  I therefore start collating my 10 series of shoe sales data, pursuing the original theory that the average size of the shoe sold should correlate to the average height of the target population.

It turns out that for four of my data sets, I find a nice pattern of steadily rising shoe sizes over time, reflecting my intuition that people’s height and shoe size should be increasing over time.  In three of the data sets I find the results to be equivocal: there is no long-term trend in the sizes of shoes sold and the average size jumps around a lot.  In the final three data sets, there is actually a fairly clear negative trend, with shoe sizes decreasing over time.

So what would you say if I did the following:

  • Kept the four positive data sets and used them as-is
  • Threw out the three equivocal data sets
  • Kept the three negative data sets, but inverted them
  • Built a model for historic human heights based on seven data sets – four with positive coefficients between shoe size and height and three with negative coefficients.

My correlation coefficients are going to be really good, in part because I have flipped some of the data sets and in part because I have thrown out the ones that don’t fit my initial bias as to what the answer should be.  Have I done good science?  Would you trust my output?  No?

Well, what I describe is essentially how many of the historical temperature reconstruction studies have been executed (well, not quite: I have left out a number of other mistakes, like smoothing before coefficients are derived and using de-trended data).
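To see why the flipping and screening steps in the shoe example guarantee a flattering fit, here is a minimal Python sketch.  It is only an illustration under my own assumptions, not a reproduction of any published reconstruction: the "shoe" series are random walks with no real connection to height, the target is a simple rising trend, and the 0.3 screening threshold and all function names are hypothetical choices.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant series; treated as 0 here
    return cov / (var_x * var_y) ** 0.5


def average(series_list):
    """Point-by-point mean of several equal-length series."""
    n = len(series_list)
    return [sum(vals) / n for vals in zip(*series_list)]


def honest_composite(series_list):
    """Use every series as-is, with the sign the physical theory predicts."""
    return average(series_list)


def cherry_picked_composite(series_list, target, threshold=0.3):
    """Drop weakly correlated series and invert the negative ones.

    The threshold is an arbitrary assumption for illustration.
    Returns None if no series survives the screening.
    """
    kept = []
    for s in series_list:
        r = pearson(s, target)
        if abs(r) < threshold:
            continue  # "equivocal" series are thrown out
        kept.append(s if r > 0 else [-v for v in s])  # flip the "wrong" sign
    return average(kept) if kept else None


def random_walk(rng, n):
    """Pure noise: cumulative sum of Gaussian steps, unrelated to any target."""
    level, out = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        out.append(level)
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    years = 100
    target = [float(t) for t in range(years)]  # hypothetical rising "height"
    shoes = [random_walk(rng, years) for _ in range(10)]  # ten fake records

    print("honest r:", pearson(honest_composite(shoes), target))
    picked = cherry_picked_composite(shoes, target)
    if picked is not None:
        print("cherry-picked r:", pearson(picked, target))
```

Every series that survives the screening has, after flipping, a positive correlation with the target, so the cherry-picked composite is positively correlated with the target by construction, even though the inputs are noise.  That is why a good fit produced this way says nothing about whether the proxy means anything.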

Mann once wrote that multivariate regression methods don’t care about the orientation of the proxy. This is strictly true – the math does not care. But people who recognize that there is an underlying physical reality that makes a proxy a proxy do care.

It makes no physical sense to change the sign of the relationship for our final three shoe databases.  There is no anatomical theory that would predict declining shoe sizes with increasing heights.  But this seems to happen all the time in climate research.  Financial modellers who try this go bankrupt.  Climate modellers who try this to reinforce an alarmist conclusion get more funding.  Go figure.

  • Nice analogy,

    This represents exactly my own experience with Mann08. Looking at what happened to the data to support a conclusion, it was amazing what lengths they went to.

    You could add the option of chopping the ends off the divergent shrinking shoe size data and pasting on the ends from the increasing shoe size sets, with methods so confusing even the experts can’t figure out the meaning. Mann08 did that to 90% of the data series.

  • I think your (and your co-bloggers’) messages are starting to hit home. I’m a liberal Democrat and read the major liberal blogs, and it’s eerie that for the past 5 days nobody is writing about climate change or global catastrophe. Believe me, that hasn’t happened for a while. If it’s a coincidence, it’s a strange one.

  • George Tobin

    1) Is there a difference between men’s and women’s shoe size trends? Manufacturers probably adjusted size definitions for women over time because it is pleasing for women to think they have a more petite shoe size. At least the shoe companies did not go back and Hansenize historical sales records to make current average shoes seem larger than the past…

    2) I don’t know why anybody bothers to defend Mann. All the smoke and mirrors just to arbitrarily weight the ‘right’ data and wipe out the ‘wrong’ data and then declare a mathematically established trend. It’s like a butcher who uses a long stick instead of his thumb on the scales. What a waste. And as you already eloquently explained in an earlier post, if the earth’s climate were really that flat and stable for that long, it kinda contradicts the absurdly high sensitivity measure required to support AGW Doomsday scenarios, so why defend the hockey stick at all?

  • Ed Fargler

    Slightly off topic, but perhaps someone here can help me out. My questions are in regard to data and modeling. From what I know about weather phenomena, they are directly observable and yet not 100% predictable. Climate is described through data gathered daily and from the geological record.

    So this leads to the questions: If weather is a non-linear chaotic phenomenon, does it follow that climate is non-linear? If so, aren’t the predictive capabilities of a climate model severely limited even if its output is range bound?

  • Jim

    Is Monckton’s chart of long wave radiation measurement vs. model mislabeled? Shouldn’t the model be the red line and the measurement black? (This is on page 18.) Gotta love the paper, BTW!!