No Warming for 15 Years!

"There's been no global warming for fifteen years!" This is the latest cry of the global warming deniers. It's totally spurious, of course, because you need 30 years to show a climate trend, not 15. But just to prove my point, I'll show in detail how sample size affects the conclusions you can come to on temperature trends.

Here are the Hadley CRUTEMP3 annual land-sea temperature anomalies for the past 30 years:

dT (K)	Year
0.017	1982
0.316	1983
-0.058	1984
-0.008	1985
0.117	1986
0.287	1987
0.348	1988
0.199	1989
0.431	1990
0.343	1991
0.092	1992
0.175	1993
0.333	1994
0.468	1995
0.204	1996
0.463	1997
0.820	1998
0.489	1999
0.361	2000
0.552	2001
0.664	2002
0.646	2003
0.611	2004
0.747	2005
0.669	2006
0.678	2007
0.528	2008
0.642	2009
0.713	2010
0.538	2011

Now, let's start regressing the temperature values on the calendar year. For the non-statisticians in the room, "regression" or "least-squares analysis" is how you relate one data set to another. Using a sample size of two years, you will always have a perfect correlation, because two points are all you need for a line, so that figure is "trivially significant." Using more, you're doing actual regression using the least-squares line. When this is against time as the X variable, as it is here, you are determining the trend. That's what trend means in statistics.
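To make that concrete, here is a minimal Python sketch of the least-squares slope, applied to the last four rows of the table above. The function name `trend` is just an illustrative choice, and the formula is the standard textbook one, not necessarily what was used to produce the figures in this article.

```python
def trend(years, temps):
    """Least-squares slope of temps regressed on years (K per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(temps) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, temps))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Last four rows of the anomaly table above.
years = [2008, 2009, 2010, 2011]
temps = [0.528, 0.642, 0.713, 0.538]

# Two points: the "trend" is just the line through them.
print(round(trend(years[-2:], temps[-2:]), 4))
# Four points: a genuine least-squares fit.
print(round(trend(years, temps), 4))
```

With only two points the slope is simply 0.538 - 0.713 = -0.175 K per year, which is why that case tells you nothing about how well the line fits.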

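Here's what we get with different sample sizes, where N is the number of most recent years included (so N = 2 means 2010 and 2011) and the trend is in K per year: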

Trend	p	N
0.0234	0.0007	20
0.0198	0.0038	19
0.0161	0.0188	18
0.0141	0.056	17
0.0142	0.087	16
0.0064	0.387	15
0.0032	0.701	14
0.0117	0.142	13
0.0097	0.287	12
-0.0011	0.881	11
-0.007	0.39	10
-0.008	0.437	9
-0.0109	0.41	8
-0.0205	0.22	7
-0.0125	0.56	6
-0.0095	0.771	5
0.0101	0.853	4
-0.052	0.598	3
-0.175	0	2
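For anyone who wants to check figures like these, the following is a rough Python sketch using only the standard library. It assumes each row uses the most recent N years (which the smallest rows suggest but the article does not state), fits an ordinary least-squares line, and gets a two-sided p value from Student's t distribution by numerically integrating its density. The names `trend_and_p` and `t_two_sided_p` are hypothetical helpers, not part of any library.

```python
import math

# Anomalies (K) for 1982-2011, copied from the table above.
ANOMALIES = [
    0.017, 0.316, -0.058, -0.008, 0.117, 0.287, 0.348, 0.199, 0.431, 0.343,
    0.092, 0.175, 0.333, 0.468, 0.204, 0.463, 0.820, 0.489, 0.361, 0.552,
    0.664, 0.646, 0.611, 0.747, 0.669, 0.678, 0.528, 0.642, 0.713, 0.538,
]
YEARS = list(range(1982, 2012))


def t_two_sided_p(t, df, steps=2000):
    """Two-sided p value for Student's t, via Simpson's rule on the density."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # 1 minus the probability mass lying between -|t| and +|t|.
    return 1 - 2 * total * h / 3


def trend_and_p(n):
    """OLS slope and its two-sided p value for the most recent n years."""
    xs, ys = YEARS[-n:], ANOMALIES[-n:]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    # Residual sum of squares, then the standard error of the slope.
    ssr = sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return slope, t_two_sided_p(slope / std_err, n - 2)


# n = 2 is skipped: it leaves zero residual degrees of freedom.
for n in range(20, 2, -1):
    slope, p = trend_and_p(n)
    print(f"{slope:8.4f}  {p:6.4f}  {n}")
```

Worked by hand, the N = 3 and N = 4 rows come out at about -0.052 (p near 0.598) and 0.0101 (p near 0.853), matching the table. Other rows could differ in the last digit depending on the software and rounding behind the original figures.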

With small samples, p is large (except for the trivial 2-value case), meaning the fit is no more convincing than flipping a coin, and even the sign of the effect changes rapidly. Statisticians usually consider a regression useful only if p is less than 0.1 (the 90% level of confidence), 0.05 (the 95% level), or 0.01 (the 99% level). The p value is the probability of getting a trend at least this strong by chance alone if there were no real trend; the confidence level is one minus that threshold.

Note that 15 years, the deniers' favorite period, is the longest span for which you can claim there's no significant warming. If we extend the sample size to 16 years, the relation is significant at the 90% level; if we extend it to 18 years, it's significant at the 95% level; and with 19 years, at the 99% level. Note, too, that the trend has stabilized and no longer changes sign. It's up. Warming. The level of confidence for the full sample size of N = 30 is left as an exercise for the student.
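The thresholds just quoted can be read straight off the table. As a small sketch, the Python below copies the (N, p) pairs for N = 11 to 20 and finds, for each significance level, the shortest span from which the trend is significant at that level and stays so for every longer span listed. The function name `shortest_significant_span` is purely illustrative.

```python
# (N, p) pairs copied from the table above, for the most recent N years.
P_BY_N = {
    20: 0.0007, 19: 0.0038, 18: 0.0188, 17: 0.056, 16: 0.087,
    15: 0.387, 14: 0.701, 13: 0.142, 12: 0.287, 11: 0.881,
}


def shortest_significant_span(p_by_n, alpha):
    """Smallest N with p < alpha for that N and for every longer span listed."""
    best = None
    for n in sorted(p_by_n, reverse=True):
        if p_by_n[n] >= alpha:
            break  # significance is lost here, so stop shortening the span
        best = n
    return best


for alpha in (0.10, 0.05, 0.01):
    n = shortest_significant_span(P_BY_N, alpha)
    print(f"{100 * (1 - alpha):.0f}% level: significant from N = {n} years")
```

This gives 16, 18, and 19 years for the 90%, 95%, and 99% levels respectively, the same figures quoted above.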

This is why sample size is such an important consideration. The smaller your sample, the greater the chance that your results are unreliable, contaminated by noise. If you tried to estimate the mean height of Americans from the first two people you ran across, your estimate would probably be well off the actual mean. Even with 15 people, it likely wouldn't be very good. As a rule of thumb, with a sample size of about N = 30, even with what's called a "non-normal distribution," the sample mean behaves enough like a normally distributed quantity that the usual tests at the 95% confidence level can generally be trusted. Note that pollsters usually want a sample of 1000 to 3000 Americans in election years. There's a good reason for that: all else being equal, larger sample sizes are better, and yet you don't have to poll the whole population to get reliable results.
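To put rough numbers on the polling example, here is a short Python sketch of the usual margin-of-error approximation for a polled proportion. It assumes a simple random sample, a worst-case 50/50 split, and the normal approximation (z = 1.96 for 95%); real polls apply weighting and other adjustments, so treat this as an illustration only. The function name `margin_of_error` is an illustrative choice.

```python
import math


def margin_of_error(n, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n respondents."""
    return z * math.sqrt(proportion * (1 - proportion) / n)


for n in (30, 1000, 3000):
    moe = 100 * margin_of_error(n)
    print(f"N = {n:4d}: about +/- {moe:.1f} percentage points")
```

Under these assumptions, a sample of 1000 gives roughly 3 percentage points either way, and 3000 gives a bit under 2, while 30 respondents would leave an uncertainty of nearly 18 points. The gain shrinks with the square root of N, which is why pollsters stop at a few thousand rather than polling everyone.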

Page created:	04/10/2013
Last modified:	04/10/2013
Author:	BPL