The Customer Service Survey

More on Correlation

by Peter Leppik on Wed, 2014-05-14 14:39

Just in case you need more convincing that correlation is not causation, spend some time browsing the Spurious Correlation Generator.

Every day it finds a new (but almost invariably bogus) statistical correlation between two unrelated data sets. Today, for example, we learn that U.S. spending on science and technology is very strongly correlated (0.992) with suicides by hanging.

In the past we've learned that the divorce rate in Maine is correlated with per-capita margarine consumption, and that the number of beehives is inversely correlated with arrests for juvenile pot possession.

These correlations are completely bogus, of course. The point is to illustrate the fact that if you look at enough different data points you will find lots of spurious statistical relationships. With computers and big data, it's trivially simple to generate thousands of correlations with very high statistical significance which also happen to be utterly meaningless.
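
If you want to see this effect for yourself, here is a minimal sketch in Python. It is not how the Spurious Correlation Generator works (I don't know its internals); it simply invents 200 unrelated "yearly" series as random walks, compares every pair, and reports the strongest correlation it finds. The function names, the series length of 10, and the random seed are all assumptions chosen for illustration.

```python
import itertools
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def random_walk(rng, length):
    """Made-up data: pure noise accumulated over time, related to nothing."""
    value, series = 0.0, []
    for _ in range(length):
        value += rng.gauss(0, 1)
        series.append(value)
    return series


def strongest_spurious_pair(n_series=200, length=10, seed=0):
    """Compare every pair of unrelated series and return the best match."""
    rng = random.Random(seed)
    data = [random_walk(rng, length) for _ in range(n_series)]
    pairs = list(itertools.combinations(range(n_series), 2))
    best = max(pairs, key=lambda p: abs(pearson(data[p[0]], data[p[1]])))
    return len(pairs), best, pearson(data[best[0]], data[best[1]])


if __name__ == "__main__":
    n_pairs, (i, j), r = strongest_spurious_pair()
    print(f"Compared {n_pairs} pairs; strongest: series {i} vs {j}, r = {r:.3f}")
```

With 200 series there are 19,900 pairs to compare, so even though every series is pure noise, the best-matching pair will typically show a very strong correlation. Nothing connects the two series; the search just had enough chances to get lucky.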
