Forecasting

One of the defenses often used by climate modelers against charges that climate is simply too complex to model accurately is that “they do it all the time in finance and economics.”  This comes today from Megan McArdle on economic forecasting:

I find this pretty underwhelming, since private forecasters also unanimously think they can make forecasts, a belief which turns out to be not very well supported.  More than one analysis of these sorts of forecasts has found them not much better than random chance, and especially prone to miss major structural changes in the economy.   Just because toggling a given variable in their model means that you produce a given outcome, does not mean you can assume that these results will be replicated in the real world.  The poor history of forecasting definitionally means that these models are missing a lot of information, and poorly understood feedback effects.

Sounds familiar, huh?  I echoed these sentiments in a comparison of economic and climate forecasting here.

  • Alex

    And, last post, Waldo, please answer my question, which you avoided answering at least twice now:

    If you won’t be able to get the exact list of stations participating in HadCRUT3, will you change your opinion and concede that your original statement that “it is all out there” is false?

    Yes or no.

  • Russ R.

    Alex:

    “Russ, you scored a point prematurely. Waldo did not provide a link to the exact list of stations participating in HadCRUT3.”

    Sorry, Alex. I failed to notice the discrepancies.

  • Alex

    No problem.

    The history of the list linked by Waldo is illustrative as well. If we were to follow publishing practices common to other branches of science, the exact list of stations (as well as their data) should have been made available December 19, 2005 (the date of the main publication). The first FOI request (not a simple request, mind you) was filed on September 28, 2006. After much hassle, CRU posted the incomplete and otherwise flawed version of the list in Waldo’s link in September 2007, that is, a year after the first FOI. The original list is still nowhere to be seen. It is the end of 2010, so we are 5 years in.

  • Tony B.

    This is actually the closest I have ever been to the data. Previously I have not been able to download the “absolute” data, but this time I was. It appears to be the average for the particular grid (not raw data), but other than being averaged, they are claiming it is the raw results. I haven’t had the time to graph the data yet.

    As far as the station list, I am pretty sure that it is not “just one small file” as Alex states. I get the point that it is a simpler thing to produce than the raw data, but it has its problems as well.

    For one thing, the list is a moving target over time as stations get added and others get taken off. So the list would have to state the time periods that a particular station was used.

    Additionally, I do not expect that the data is very clean. The multiple stations with the same number, for instance. Maybe my expectations are not as high as they should be for science of this caliber, but I am not going to get too hung up on the reuse of several station numbers, as long as there is a reasonable explanation as to why the data would still be valid. In other words, as long as the stations are real and distinct, I couldn’t care less what the assigned number is.

    Further, I expect that instruments are going to be failing and being replaced. This should be documented, but not unexpected.

    As far as the number of stations, I agree that what is needed to verify the science (in order for it to be truly science) is the complete list of stations that have ever been used. Not the list of those being used this month, as I think (just my suspicion) the above list is.

    Finally, (at least as far as I can think of at the moment) the commercially available data should be listed as to source, and which were used. I don’t have a problem with scientists using commercial data in their studies, as long as they don’t hide their data behind the fact. If the person verifying the data has to purchase the data, then so be it (regardless of whether the original researcher had to or not), as long as my tax dollars didn’t pay for it in the first place. However, if the data is really licensed then I expect to be able to obtain the exact same data. If someone wants to profit from the distribution of the data, then I really do expect that the data will be available. Otherwise, I can see no reason for the license in the first place.

    I do want the raw data, by station, but don’t want to put too many restrictions on the condition of the data other than it be reasonably enough documented to be able to repeat the study. I am looking for repeatability, not perfection. I agree that the first level should have already been provided, but I don’t think that the second is ever going to happen, because it is likely it doesn’t exist. I also agree that the station list is the place to start, and that what we have been given so far is not complete.

  • WaldoNo

    Alex, I will repeat:

    “So no, my brother, I am not ready to concede anything yet.”

  • Russ R.

    Waldo,

    “So no, my brother, I am not ready to concede anything yet.”

    What evidence would actually convince you that certain important data have not been made available to skeptical researchers?

    You know… apart from the various private emails that admit to obstructing skeptics’ requests, or CRU’s public claims that the raw data were first confidential and subsequently lost, or perhaps the fact that your own attempts to obtain the data have come up short?