Explanation of College Football and Men's Basketball Rankings Correlations


There are numerous systems for rating and ranking college sports teams, and the web, including comparison pages such as Kenneth Massey's, makes it easy to find and compare quite a number of them.

How can we know which system is best? There are various ways to test them, and these correlations represent one approach, which I've applied to the rankings listed by Kenneth's rankings comparison pages. The approach here is motivated by an assumption: that as the season progresses, eventually the very latest average (mean) ranking on Kenneth's page, which he derives from the mean of all the individual rankings listed, is pretty close to the "correct" ranking, i.e., pretty close to the best possible ranking based upon games played so far. This assumption matches my intuition, but I don't know any way to prove it. However, it is easily observed that as the season progresses, the rankings produced by various systems grow more consistent with each other and grow closer to this mean ranking. The question I've addressed is this: among all the individual ranking systems, which one produced rankings closest to this "nearly best" ranking in previous weeks?

It would be nice, during the middle of the season, to know how close some particular ranking is to rankings produced later in the season. For example, it would be nice if during "week 8" of the season, you knew which system was producing rankings as close as possible to what will generally be seen in rankings during, say, "week 10". But naturally we can't see into the future.

Yet we can do the same thing after the fact. After "week 10", we can take a generally accepted ranking and look back to see which system was getting close to it back during "week 8". That's what these correlations do: they correlate the rankings produced by systems in earlier weeks with the most recent average (mean) ranking, which we assume to be pretty close to the best ranking available.
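As a rough illustration of this look-back idea, here is a minimal Python sketch. It is not the program used to build these pages: the exact correlation formula is not stated here, so the sketch assumes a plain Pearson correlation computed on the rank numbers (which is the Spearman rank correlation when both inputs are rankings). The function name, team labels, and sample ranks are hypothetical, and the scaling by 1000 is only a guess at how a value such as 921 might be displayed.

```python
from math import sqrt

def rank_correlation(earlier, consensus):
    """Pearson correlation of rank numbers for teams present in both rankings.

    Assumption: each argument maps a team name to its rank (1 = best).
    """
    teams = sorted(set(earlier) & set(consensus))
    xs = [earlier[t] for t in teams]
    ys = [consensus[t] for t in teams]
    n = len(teams)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: one system's "week 8" ranking and the current mean ranking.
week8 = {"A": 1, "B": 2, "C": 3, "D": 4}
current = {"A": 2, "B": 1, "C": 3, "D": 4}

# Scaled to three digits (assumed display convention).
score = round(1000 * rank_correlation(week8, current))
```

Running the same function for every system's "week 8" ranking against the same current mean ranking would give one number per system, which is what the weekly listings compare.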

Seeing that a ranking system produced the results closest to the "week 10" rankings back in "week 8" suggests that the system has done the best job so far. Thus we might conclude that that system's final season rankings are the best of the lot. However, there remains the possibility that some ranking systems don't work so well with just a few weeks' data yet produce excellent results near and at the end of the season. Indeed, these correlations have turned up pairs of rankings in which one has the closer correlation to the final ranking during the early weeks and the other has the closer correlation during the later weeks. Thus this is no foolproof test of ranking quality.


Specifically, these are correlations of various college football and men's basketball rankings, comparing early weeks' individual rankings with the most recent average (mean) rankings on Kenneth's comparison pages. The intent is to show which individual rankings most accurately predict later consensus as represented by the average rankings.

The input data is taken from Kenneth's comparison pages:


Key to the Columns

These correlation pages are based on Kenneth's comparison pages, which serve as their key.

Example of lines in the correlation pages:


DEN        Week10  36%  921
DEN        Week11  48%  938

DEN designates the ranking, using the abbreviation used on Kenneth's comparison page.

Week10 is the week within the season, the number matching those in Kenneth's historical comparison-page URLs.

921 is the correlation between "DEN" for week 10 and the current "consensus", i.e. the ranking listed on Kenneth's current comparison page derived from the average (mean) ranking.

36% is a percentile, indicating that "DEN" had a higher correlation than 36 percent of the rankings included on the week-10 comparison page. The number is the percent of rankings this one beat, so the highest number is typically in the 95%-98% range (since a ranking can't beat itself) and the lowest is 0%.
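The percentile described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the ranking abbreviations, and the correlation values are hypothetical, and rounding to a whole percent is assumed from the way the example lines are printed.

```python
def percentile_beaten(correlations, name):
    """Percent of all listed rankings whose correlation is strictly lower.

    The ranking itself is counted in the total but can never beat itself,
    so the top ranking always falls short of 100%.
    """
    mine = correlations[name]
    beaten = sum(1 for value in correlations.values() if value < mine)
    return round(100 * beaten / len(correlations))

# Hypothetical correlations for one week's comparison page.
sample = {"AAA": 0.95, "BBB": 0.90, "CCC": 0.85, "DDD": 0.80}

top = percentile_beaten(sample, "AAA")     # beats 3 of the 4 listed rankings
bottom = percentile_beaten(sample, "DDD")  # beats none of them
```

With only four rankings the top percentile is 75; with the several dozen rankings on a real comparison page it approaches the 95%-98% range mentioned above.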


Added "Special" Rankings

Besides comparing all the individual rankings from Kenneth's page, I also included a couple of other rankings:

Consensus refers to the average (mean) ranking of the week (I probably should have labeled it "mean"; oops!). Thus these lines correlate previous weeks' mean rankings with the current one. You can see how various individual rankings' predictive ability compares with the predictive ability of the mean of all rankings. In a typical week the mean beats almost all individual rankings.

Con2001 (2002, or whatever year) represents the previous year's final mean ranking. It does NOT use each week from the previous year: only the previous year's final mean ranking. However, it is listed against each week to show what percentile that ranking achieves as compared to that week's individual rankings. At one time I had this thought: "I'll bet the same teams are typically on top every year and last year's consensus ranking might stack up very well against the rankings we all produce, especially in the early weeks". That proved to be false since it shows a low percentile even the first week. I don't know what folks use to initialize their data but it appears to be better than simply taking the previous year's rankings.


Rankings that don't rank all teams

When I do the correlations, I fill in missing ranks: any individual ranking that doesn't have a ranking number for each team is extended as if the unranked team were in last place or, if more than one team is missing, as if there were a multi-way tie for last place. For example, the "WAJL10" football ranking ranks 50 teams, so in this calculation, we produce the correlation for a "modified WAJL10" with each non-ranked team assigned a rank of 51.
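The fill-in rule can be sketched as follows. This is a minimal Python illustration, not the original program; the function name and the five-team example are hypothetical, and it assumes a partial ranking uses the ranks 1 through its length, as in the 50-team example above.

```python
def fill_missing(ranking, all_teams):
    """Extend a partial ranking so every team has a rank.

    Every unranked team gets the same rank, one past the worst rank
    the ranking actually assigns (a multi-way tie for last place).
    """
    last_place = max(ranking.values()) + 1
    return {team: ranking.get(team, last_place) for team in all_teams}

# Hypothetical example: a ranking that lists only 3 of 5 teams.
all_teams = ["A", "B", "C", "D", "E"]
partial = {"A": 1, "B": 2, "C": 3}
filled = fill_missing(partial, all_teams)
# "D" and "E" are both assigned rank 4, tied for last.
```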

I also do a "top 25 only" correlation that ignores all ranks beyond 25. Any unranked team, or any team ranked below 25th (a rank number greater than 25), is given a rank of 26 under all the different rankings. This allows a limited-but-level comparison between rankings that only cover 25 teams, such as AP and USA, and the other rankings.
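A minimal Python sketch of the "top 25 only" adjustment follows. The function name and sample data are hypothetical, and the cutoff is passed as a parameter only so that the example can use a small number of teams.

```python
def top_n_only(ranking, all_teams, n=25):
    """Collapse every rank beyond n, and every missing rank, to n + 1."""
    cutoff = n + 1
    return {team: min(ranking.get(team, cutoff), cutoff) for team in all_teams}

# Hypothetical example with n=2 standing in for 25.
teams = ["A", "B", "C", "D", "E"]
full = {"A": 1, "B": 2, "C": 3, "D": 4}  # "E" is unranked
trimmed = top_n_only(full, teams, n=2)
# "C", "D", and "E" all end up with rank 3.
```

Applying the same function to every ranking before correlating is what puts a 25-team poll and a full ranking on the same footing.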


What's in each correlation page

Within each correlation page there are four sections.

  1. The current week's correlations. This is an attempt to duplicate the numbers across the bottom of Kenneth's comparison page. These don't match Kenneth's though I've checked and rechecked my formulas and programming. The numbers I produce look generally plausible, e.g. generally the same rankings have high and low correlations, but there are occasions when on Kenneth's page, "A" has a higher correlation than "B", whereas on my page "B" has the higher correlation, etc.
  2. Week by week listings of the earlier weeks.
  3. Same data in a single table, ordered from highest correlation to lowest. This will sometimes show you that one individual ranking's predictive ability is weeks behind another's, e.g. one's Week8 correlates less than another's Week7.
  4. Same data grouped by individual ranking, e.g. all the AP rankings are together. This will show you how a particular ranking's correlation has risen over time and how its percentile has changed over time.

A note on the calculations

As I said above, when I apply my correlation formula in a manner similar to that on Kenneth's comparison page, I am unable to reproduce his numbers, though mine are generally close to his.

Also, I list a "concordance" for each week. This formula also assumes a ranking from 1 to N, but it has been applied to data that doesn't meet this requirement, so the result is a little off.

-John Wobus, 11/26/03



Wobus Sports: www.vaporia.com/sports