- Ch = change in ranking since previous update
- Rating = overall assessment of the team's performance to this point in the season
- Power = estimated team strength going forward (used to make predictions)
- Offense = ability to score points
- Defense = ability to prevent the opponent from scoring
- HA = estimated home advantage
- SchP = strength of schedule for games played so far
- SchF = strength of schedule for all games, including those currently scheduled
- E[W]-E[L] = expected wins and losses for remaining schedule

The current version (3.0) of the Massey Ratings is described informally below.

- Introduction
- Goals
- Inputs
- Predictions
- Rating
- Power
- Offense / Defense
- Home Advantage
- Schedule Strength
- Standard Deviation
- Conference Ratings
- Preseason Ratings
- How the algorithm works
- Output

I believe that the Massey Ratings are the most scientific and full-featured system available for analyzing the performance of members of a competitive league. Future improvements are likely in an effort to further refine the model to accurately reflect the nature of sports.

The second goal is to account for the differences in schedule. When there is a large disparity in schedule strength, win-loss records lose their significance. The computer must evaluate games involving mismatched opponents, as well as contests between well matched teams.

It is necessary to achieve a reasonable balance between rewarding teams for wins, convincing wins, and playing a tough schedule. This issue is difficult to resolve, and rating systems exist that are based on each of the extremes.

Similarly, a team's
**Defense** power rating reflects the ability to prevent its opponent from scoring.
An average defense will be rated at zero. Positive or negative defensive ratings
would respectively lower or raise the opponent's expected score accordingly.
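As a rough illustration of the sign convention, the sketch below assumes that an offensive rating is expressed in points scored against an average defense and that the defensive rating is simply subtracted from it. Both the additive form and the name `expected_points` are assumptions made for this example, not a statement of the actual model.

```python
def expected_points(offense_a, defense_b):
    """Hypothetical additive model: A's expected score against B's defense.

    offense_a: points A is assumed to score against an average defense.
    defense_b: B's defensive rating (0 = average, positive = better).
    """
    return offense_a - defense_b


# An average defense (rated 0) leaves the opponent's expected score unchanged.
against_average = expected_points(28.0, 0.0)
# A positive defensive rating lowers the opponent's expected score.
against_good = expected_points(28.0, 3.0)
# A negative defensive rating raises it.
against_poor = expected_points(28.0, -3.0)
```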

It should be emphasized that the Off/Def breakdown is simply a post-processing step, and as such has no bearing on the overall rating. A consequence of this is that the Off/Def ratings may not always match actual production numbers. A team that routinely wins close games may have somewhat inflated Off/Def ratings to reflect the fact that they are likely to play well when they have to. Winning games requires more than just the ability to score points, but also teamwork, mental strength, and consistency. The Off/Def breakdown is simply an estimate of how much of a team's strength can be attributed to good offensive and defensive play respectively.

The **Parity** column measures how well matched the teams in a conference are.
A value of 1 indicates perfect parity - there is complete balance from top to
bottom. In contrast, a parity near 0 indicates that there is great disparity
between the good and bad teams in the conference.

The ratings are totally interdependent, so that a team's rating is affected by games in which it didn't even play. The solution therefore effectively depends on an infinite chain of opponents, opponents' opponents, opponents' opponents' opponents, etc. The final ratings represent a state of equilibrium in which each team's rating is exactly balanced by its good and bad performances.

| A's points (pA) | B's points (pB) | GOF(pA,pB) |
| --- | --- | --- |
| 30 | 29 | 0.5270 |
| 10 | 9 | 0.5359 |
| 27 | 24 | 0.5836 |
| 27 | 20 | 0.6924 |
| 50 | 40 | 0.7292 |
| 10 | 0 | 0.8548 |
| 30 | 14 | 0.8786 |
| 45 | 21 | 0.9433 |
| 45 | 14 | 0.9823 |
| 30 | 0 | 0.9920 |
| 56 | 3 | 0.9998 |

Each game score is plugged into a GOF that outputs the estimated probability that team A would win if the game were played again under the same conditions. This is independent of any other information since it involves only that one game in isolation. For example, it may be determined that the winner of a 30-14 game has an 88% chance of winning a rematch, while a 27-24 winner has only a 58% chance of winning again.

Notice that a diminishing returns principle is manifested in this GOF. There is some advantage to winning "comfortably," but limited benefit to running up the score. A team will not be penalized just for playing a weak opponent (although it becomes much harder to improve its rating by blowing someone out).
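The diminishing returns can be checked directly from the table values. The sketch below does not implement the GOF itself (its formula is not given here). It only copies three rows in which the winner scored 30 points and compares the average GOF gain per extra point of margin. The helper name `gain_per_point` is made up for this example.

```python
# GOF values copied from the table, keyed by (A's points, B's points).
GOF_TABLE = {
    (30, 29): 0.5270,
    (30, 14): 0.8786,
    (30, 0): 0.9920,
}


def gain_per_point(low, high):
    """Average GOF increase per extra point of margin between two scores."""
    margin_low = low[0] - low[1]
    margin_high = high[0] - high[1]
    return (GOF_TABLE[high] - GOF_TABLE[low]) / (margin_high - margin_low)


# Widening the margin from 1 to 16 points.
early = gain_per_point((30, 29), (30, 14))
# Widening the margin from 16 to 30 points.
late = gain_per_point((30, 14), (30, 0))
```

With these rows, each extra point is worth roughly 0.023 in the first range but only about 0.008 in the second, which is the "limited benefit to running up the score" described above.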

Let p = Prob(A beats B) = F(rA,rB,hA,hB), where rA,hA and rB,hB are ratings and home advantages of teams A and B respectively. F is a function of rA,rB,hA,hB that is based on the CDF of a normal random variable.
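The exact form of F is not given, so the following is only a minimal sketch of one function "based on the CDF of a normal random variable." It assumes that the home team's advantage is added to its rating and that the rating difference is divided by a hypothetical `scale` parameter. The `home` argument and the name `win_prob` are likewise assumptions for illustration.

```python
import math


def normal_cdf(x):
    """CDF of a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def win_prob(r_a, r_b, h_a=0.0, h_b=0.0, home=None, scale=1.0):
    """Assumed form of F: Prob(A beats B) from ratings and home advantages.

    home is "A", "B", or None for a neutral site (an assumed convention).
    """
    diff = r_a - r_b
    if home == "A":
        diff += h_a  # A plays at home, so A's home advantage applies
    elif home == "B":
        diff -= h_b  # B plays at home, so B's home advantage applies
    return normal_cdf(diff / scale)


even = win_prob(0.0, 0.0)                          # equal teams, neutral site
favored = win_prob(1.0, 0.0)                       # A rated one unit higher
favored_home = win_prob(1.0, 0.0, h_a=0.2, home="A")
```

Under these assumptions, evenly matched teams on a neutral site each get probability 0.5, and the two teams' probabilities always sum to 1.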

All the game scores are translated to a scale from 0 to 1 by the GOF. Let g = GOF(pA,pB), where pA and pB are the points actually scored by teams A and B in a particular game.

A nonlinear function of the teams' ratings is formed by multiplying terms that look like:

p^g * (1-p)^(1-g)

Here ^ denotes an exponent. Also note that 0 <= p,g <= 1. By maximizing the resulting function, maximum likelihood estimates (MLE) are obtained for the ratings and home advantages. The optimization problem may be solved with standard techniques such as Newton's method.
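Assuming each game contributes a factor of the form p^g * (1-p)^(1-g), taking logarithms turns the product into a sum that is easier to work with numerically. The sketch below evaluates that log-likelihood for a toy set of games. It leaves out home advantages, uses a plain normal CDF of the rating difference as a stand-in for F, and does not perform the optimization itself. The team names and ratings are invented for the example.

```python
import math


def normal_cdf(x):
    """CDF of a standard normal random variable."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def log_likelihood(ratings, games):
    """Sum of g*log(p) + (1-g)*log(1-p) over all games.

    ratings: dict mapping team name to rating.
    games: list of (team_a, team_b, g), where g is the GOF value for A.
    """
    total = 0.0
    for team_a, team_b, g in games:
        p = normal_cdf(ratings[team_a] - ratings[team_b])
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
        total += g * math.log(p) + (1.0 - g) * math.log(1.0 - p)
    return total


# Toy data: X beat Y and Z convincingly, Y edged Z.
games = [("X", "Y", 0.8786), ("Y", "Z", 0.5836), ("X", "Z", 0.9433)]
flat = {"X": 0.0, "Y": 0.0, "Z": 0.0}
ordered = {"X": 1.0, "Y": 0.0, "Z": -0.3}

ll_flat = log_likelihood(flat, games)
ll_ordered = log_likelihood(ordered, games)
```

Ratings that agree with the observed results (`ordered`) score a higher log-likelihood than equal ratings (`flat`). A routine such as Newton's method would search for the ratings that make this value as large as possible.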

Preseason ratings may be implemented via prior distribution factors in the optimization function. Their importance diminishes as the season progresses, and they are negligible by the end of the year. A strong prior distribution must be used to compensate for the lack of enough single-season data for the home advantages.

Time weighting is a debatable practice; however, I believe that more recent games are generally better indications of a team's true strength. An exponential-decay time weighting is applied by premultiplying g by some weight w.
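The decay rate is not specified, so the sketch below assumes a hypothetical half-life of 60 days, meaning a game played 60 days ago counts half as much as one played today. Only the weight w is computed here. How it enters the optimization follows the description above and is not reproduced.

```python
def time_weight(days_ago, half_life_days=60.0):
    """Exponential-decay weight for a game played days_ago days in the past.

    half_life_days is an assumed tuning parameter, not a documented value.
    """
    return 0.5 ** (days_ago / half_life_days)


w_today = time_weight(0)     # most recent game gets full weight
w_old = time_weight(60)      # one half-life ago
w_older = time_weight(120)   # two half-lives ago
```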

The MLE ratings are used to create a prior distribution, which encodes the estimate of a team's strength based on looking at its game scores alone. The Bayesian correction is computed as an expected value using the actual wins and losses (and who they were against), combined with the prior distribution. This helps account for the possibility of correlated performances (a team playing up or down to its opponent).

The advantage of the Bayesian approach is that it rewards teams that win consistently, no matter how they do it. The more games a team wins, the more confident the computer can be that scores are not so important. Ratings are less likely to be negatively impacted by beating a poor team. Furthermore, games involving well-matched opponents will naturally be given priority in determining the overall ratings.

- Offensive and Defensive power ratings are computed.
- Schedule strength is measured.
- Conference ratings are determined.
- The ratings are scaled to be centered at zero with standard deviation of one.
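The final scaling step can be sketched as an ordinary standardization. The example below assumes the population standard deviation is used (the text does not say which convention applies), and the team names and raw ratings are invented.

```python
import statistics


def standardize(ratings):
    """Shift and scale ratings to mean 0 and (population) standard deviation 1."""
    values = list(ratings.values())
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("all ratings are equal; cannot scale to unit deviation")
    return {team: (r - mean) / sd for team, r in ratings.items()}


raw = {"X": 2.0, "Y": 1.0, "Z": 0.0}
scaled = standardize(raw)
```

The ordering of teams and the relative gaps between them are unchanged by this step. Only the units change.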
