The Customer Service Survey

Uncorrelated Data

by Peter Leppik on Wed, 2014-07-23 15:22

A few months ago I wrote about the Spurious Correlation Generator, a fun web page where you could discover pointless facts like the divorce rate in Maine being correlated with per-capita margarine consumption (who knew!).

The other side of the correlation coin is when there's a complete lack of any correlation whatsoever. Today, for example, I learned that in a sample of 200 large corporations, there is zero correlation between the relative CEO pay and the relative stock market return. None, nada, zippo.

(The statistician in me insists that I restate that as, "any correlation in the data is much smaller than the margin of error and is statistically indistinguishable from zero." But that's why I don't let my inner statistician go to any of the fun parties.)
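For the curious, here is a minimal sketch of what my inner statistician means by "margin of error." It uses the standard Fisher z-transformation to put an approximate 95% confidence interval around a correlation measured on 200 data points. The value r = 0.05 is made up for illustration and is not a figure from the CEO pay analysis; the function name is my own.

```python
import math

def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r
    measured on n pairs, using the Fisher z-transformation."""
    z = math.atanh(r)               # transform r to a roughly normal scale
    se = 1.0 / math.sqrt(n - 3)     # standard error on that scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r = 0.05 is a hypothetical value, chosen only to illustrate the idea.
low, high = correlation_interval(r=0.05, n=200)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

With 200 companies the interval comes out to roughly -0.09 to 0.19, so a small measured correlation like this one can't be told apart from zero.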

Presumably, though, the boards of directors of these companies must believe there's some relationship between stock performance and CEO pay. Otherwise why on Earth would they pay, for example, Larry Ellison of Oracle $78 million? Or $12 million to Ken Frazier, CEO of Merck? What's more, since CEOs are often paid mostly in stock, the lack of any correlation between stock price and pay is surprising.

It's easy to conclude that these big companies are being very foolish and paying huge amounts of money to get no more value than they would have gotten had they hired a competent chief executive who didn't happen to be a rock star. And this explanation could well be right.

On the other hand, the data doesn't prove it. Just as a strong correlation doesn't prove that two things are related to each other, the lack of a correlation doesn't prove they aren't related.

It's also possible that the analysis was flawed, or that CEO pay and stock performance are related, but in some more complicated way than a simple linear correlation can capture.
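As a toy illustration of a relationship that a simple correlation misses entirely, consider made-up data where y is completely determined by x (y is x squared), yet the linear correlation is essentially zero. This is a sketch with invented numbers, not anything drawn from the CEO data, and the `pearson` helper is written out by hand so nothing beyond the standard library is assumed.

```python
import math

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# y depends entirely on x, but the relationship is a symmetric curve.
xs = list(range(-10, 11))
ys = [x * x for x in xs]

r = pearson(xs, ys)
print(f"correlation between x and x squared: {abs(r):.3f}")
```

The two variables could hardly be more closely related, but because the curve goes down and then back up, the upward and downward halves cancel out in the correlation.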

In this case, here are a few things I'd examine about the data and the analysis before concluding that CEO pay isn't related to stock performance:

  1. Sample Bias: The data for this analysis consists of 200 large public companies in the U.S. Since there are thousands of public companies, and easily 500 which could be considered "large," it's important to ask how these 200 companies were chosen and what happens if you include a larger sample. It appears that the people who did the analysis chose the 200 companies with the highest CEO pay, which is a clearly biased sample. So the analysis needs to be re-done with a larger sample including companies with low CEO pay, or ideally, all public companies above some size (for example, all companies in the S&P 500).
  2. Analysis Choices: In addition to choosing a biased sample, the people who did the analysis also chose a weird way to try to correlate the variables. Rather than the obvious analysis correlating CEO pay in dollars against stock performance in percent, this analysis used each company's rank in CEO pay (i.e., 1 to 200) and its rank in stock performance. Ranking flattens any bell-curve distribution and eliminates any clustering, which, depending on the details of the source data, could either eliminate or enhance any linear correlation.
  3. Input Data: Finally, there's the question of what input data is being used for the analysis. Big public companies usually pay their CEOs mostly in stock, so you would normally expect a very strong relationship between stock price and CEO pay. But there's a quirk in how CEO compensation is reported to shareholders: in any given year, the reported CEO pay includes only what the CEO got for actually selling shares in that year. A chief executive could hang on to his (or, too rarely, her) stock for many years and then sell it all in one big block. So in reality the CEO is collecting many years' worth of pay all at once, but the stock performance data used in this analysis probably only includes the last year. The analysis really should include CEO pay and stock performance for multiple years, possibly the CEO's entire tenure.
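To see how the first two issues can play out, here is a small simulation using entirely made-up data. It is a sketch under stated assumptions, not a reanalysis of the real numbers: I invent 2,000 hypothetical companies where pay really does depend partly on stock return, then compare the correlation across all of them with the correlation among only the 200 highest-paid, both on the raw values and on ranks. The 0.5 weighting, the noise levels, and the population size are all arbitrary choices.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def ranks(values):
    """Rank each value from 1 (smallest) to n (largest); assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical companies: pay depends partly on stock return, plus noise.
# Units are arbitrary; these are not real dollars or percentages.
returns = [random.gauss(0, 1) for _ in range(2000)]
pay = [0.5 * r + random.gauss(0, 1) for r in returns]

# Keep only the 200 highest-paid CEOs, mimicking the biased sample.
top = sorted(range(len(pay)), key=lambda i: pay[i], reverse=True)[:200]
top_pay = [pay[i] for i in top]
top_returns = [returns[i] for i in top]

r_all = pearson(pay, returns)
r_top = pearson(top_pay, top_returns)
r_top_ranked = pearson(ranks(top_pay), ranks(top_returns))

print(f"all 2000 companies:        r = {r_all:.2f}")
print(f"top 200 by pay:            r = {r_top:.2f}")
print(f"top 200 by pay, on ranks:  r = {r_top_ranked:.2f}")
```

In a setup like this, the correlation in the top-200 slice should come out well below the correlation in the full population, even though the underlying relationship is exactly the same for every company. Picking only the highest-paid CEOs squeezes out most of the variation in pay, and with it most of the visible correlation.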

So the lack of correlation in a data analysis doesn't mean there's no relationship in the data. It might just mean you need to look harder or in a different place.
