Wednesday, November 9, 2011

Adjusting Strength of Schedule

by Michael Beuoy

One of the more interesting components (to me) of Brian Burke's efficiency model is the strength of schedule adjustment. I found it interesting the way you could bootstrap yourself into a self-consistent opponent adjustment. Here is the description (link):

"To adjust for opponent strength, I could adjust each team efficiency stat according to the average opponents’ corresponding stat. In other words, I could adjust the Cardinals’ passing efficiency according to their opponents’ average defensive efficiency. I’d have to do that for all the stats in the model, which would be insanely complex. But I have a simpler method that produces the same results.

For each team, I average its to-date opponents’ GWP to measure strength of schedule. This season Arizona’s average opponent GWP was 0.51—essentially average. I can compute the average logit of Arizona’s opponents by reversing the process I’ve used so far.

The odds ratio for the Cardinals’ average opponent is 0.51/(1-0.51) = 1.03. The log of the odds ratio, or logit, is log(1.03) = 0.034. I can add that adjustment into the logit equation we used to get their original GWP.

Logit = Logit(ARI) – Logit(Avg) + 0.034
= 0.11

This makes the odds ratio e^0.11 = 1.12. Their GWP now becomes 0.53. If you think about it intuitively, this makes sense. Their unadjusted GWP was 0.51. They (apparently) had a slightly tougher schedule than average. So their true, underlying team strength should be slightly higher than we originally estimated.

I said ‘apparently’ because now that we’ve adjusted each team’s GWP, that makes each team’s average opponent GWP different. So we have to repeat the process of averaging each team’s opponent GWP and redoing the logistic adjustment. I iterate this (usually 4 or 5 times) until the adjusted GWPs converge. In other words, they stop changing because each successive adjustment gets smaller as it zeroes in on the true value."
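The odds-ratio and logit arithmetic in the quoted passage is easy to check with a few lines of Python. This is only a sketch of the conversions: the function names are mine, and the 0.11 logit is simply copied from the quote rather than recomputed from Brian's model.

```python
import math

def logit(p):
    """Log of the odds ratio p / (1 - p)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Convert a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

adjusted_logit = 0.11                      # value quoted in the passage
odds_ratio = math.exp(adjusted_logit)      # e^0.11
adjusted_gwp = inv_logit(adjusted_logit)   # back to a win probability

print(round(odds_ratio, 2))    # 1.12
print(round(adjusted_gwp, 2))  # 0.53

# The two conversions undo each other.
assert abs(inv_logit(logit(0.51)) - 0.51) < 1e-12
```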

The purpose of this post is to show how this iterative, multi-step adjustment can be translated into a single matrix, and how that matrix is actually independent of the ranking system being used. This matrix can then be used more generally as a measure of each team's "interconnectedness", based on the season-to-date games played so far.

First, to define some terms:

R0 = A 32 row vector whose ith row contains the "raw" ranking of team i. This ranking can be anything you want: Brian's logit model, win percentage, average margin, number of Tebows on the team, etc.

S = A 32 by 32 matrix where the ith row jth column entry contains the percentage of team i's games that have been played against team j.
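As a rough illustration of how S could be built, here is a sketch using NumPy. The four-team schedule and the name schedule_matrix are made up for the example, and it assumes every team has played at least one game (otherwise a row would be divided by zero).

```python
import numpy as np

def schedule_matrix(games, n_teams):
    """Return S, where S[i, j] is the fraction of team i's games played against team j."""
    counts = np.zeros((n_teams, n_teams))
    for a, b in games:
        counts[a, b] += 1   # one more game for team a against team b
        counts[b, a] += 1   # and the same game seen from team b's side
    games_played = counts.sum(axis=1, keepdims=True)
    return counts / games_played

# Hypothetical 4-team league; each pair is one game between two team indices.
toy_games = [(0, 1), (1, 2), (0, 2), (2, 3)]
S = schedule_matrix(toy_games, n_teams=4)
print(S)
```

Each row of S sums to 1, so multiplying S by a ranking vector gives each team's average opponent ranking.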

In Brian's description of the model, he states that "For each team, I average its to-date opponents’ GWP to measure strength of schedule". This is equivalent to multiplying the S matrix by the R0 ranking vector. That adjustment is then added to the original ranking. This creates a new ranking R1 = R0 + S * R0 = (1 + S) * R0, where "1" stands for the 32 by 32 identity matrix. Then Brian just keeps applying that same step to each new adjusted ranking.

R2 = R0 + S*R1 = R0 + S*(1+S)*R0 = (1 + S + S^2)*R0

R3 = R0 + S*R2 = R0 + S*(1+S+S^2)*R0 = (1 + S + S^2 + S^3)*R0

and so on.....

RFinal = (1 + S + S^2 + S^3 + .....) * R0

If you're familiar with infinite series, you know that the sum in the parentheses would be equal to 1/(1-S) if S were an ordinary number with absolute value less than 1. Fortunately, this also works for matrices, as long as we interpret 1/(1-S) as the matrix inverse of (1-S) (I'm glossing over some mathematical niceties here, but trust me that the math works in this case, with one caveat below).

Let's define this inverted matrix as the SOS matrix -> SOS = (1 + S + S^2 + S^3 + ...) = 1/(1-S).

So, your final strength of schedule adjusted ranking is just:

RFinal = SOS * R0

Where, as mentioned earlier, the SOS matrix is solely a function of which teams have played each other and has no dependence on the ranking system used.

I have compiled the details of the calculation (for the 2011 NFL Season through Week 9) in a Google Docs Spreadsheet. One technical note: In order to get the matrix inverse to exist, I had to tack on a "normalizer" matrix to S. This matrix takes a 32 row vector and subtracts out the mean of the vector. The net effect is to convert any ranking vector to one where the average ranking is zero.
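For readers who would rather see this in code than in a spreadsheet, here is a small NumPy sketch on a made-up four-team schedule. The exact way the normalizer is "tacked on" is my assumption for the sketch (it uses S times N in place of S, where N subtracts the mean), and the raw rankings are invented numbers.

```python
import numpy as np

# Hypothetical 4-team schedule (pairs of team indices), not real NFL data.
games = [(0, 1), (1, 2), (0, 2), (2, 3)]
n = 4

counts = np.zeros((n, n))
for a, b in games:
    counts[a, b] += 1
    counts[b, a] += 1
S = counts / counts.sum(axis=1, keepdims=True)   # rows sum to 1

I = np.eye(n)
N = I - np.ones((n, n)) / n   # normalizer: subtracts the mean of a vector

# Assumed construction: normalize first, then apply S, i.e. use S @ N in place of S.
SOS = np.linalg.inv(I - S @ N)

R0 = np.array([3.0, -1.0, 0.5, -2.5])   # made-up raw rankings

# Step-by-step version of the same adjustment: R_next = R0 + S * (R - mean(R)).
R = R0.copy()
for _ in range(200):
    R = R0 + S @ N @ R

R_final = SOS @ R0
print(np.round(R_final, 4))
print(np.allclose(R, R_final))   # True for this toy schedule
```

Note that SOS is computed before R0 is even defined, which is the point: it depends only on who has played whom. Without the normalizer, (1-S) has no inverse here because every row of S sums to 1.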

One of the nice things about the SOS matrix is that it tells you how interconnected each team is with every other team, based on the games they've played so far. For example, Green Bay has played both Denver and Chicago once, but the matrix shows that GB is far more connected to Chicago than to Denver (a 0.20 factor vs. a 0.08 factor). This makes sense given that Green Bay and Chicago are in the same division and so have had more common opponents (their season-to-date performance is far more intertwined). So, to get a measure of Green Bay's schedule-adjusted strength, you would add 0.20 of Chicago's strength, but only 0.08 of Denver's strength. Or, to get Indy's true strength, you would have to subtract 0.11 of PHI's strength. Indy hasn't played Philly, and they don't have many common opponents (the AFC South is not slated to play the NFC East this year). The +/- nature of the matrix is intuitive in that it penalizes teams that are either connected to poor teams (positive SOS times negative ranking) or disconnected from strong teams (negative SOS times positive ranking). Note that this only works when your ranking system is normalized to an average of zero, such that a below-average team has a negative ranking.

To get a bit more technical, I think this adjustment is mathematically equivalent to a regression model with 32 variables (a dummy variable for each team) where each game outcome is modeled as:

game outcome = Team A - Team B

Where a positive outcome is one that favors Team A (but I haven't done a full derivation).
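Short of a full derivation, the claim can at least be checked numerically. The sketch below fits the dummy-variable regression by least squares on a made-up four-team schedule and compares it with the matrix route, using average margin as the raw ranking. The game results are invented, and the normalizer construction (S times N in place of S) is my assumption.

```python
import numpy as np

# Hypothetical results: (team_a, team_b, margin), margin > 0 favors team_a.
results = [(0, 1, 7.0), (1, 2, -3.0), (0, 2, 10.0), (2, 3, 4.0)]
n = 4

# One row per game, one dummy column per team: +1 for Team A, -1 for Team B.
X = np.zeros((len(results), n))
y = np.zeros(len(results))
for g, (a, b, margin) in enumerate(results):
    X[g, a] = 1.0
    X[g, b] = -1.0
    y[g] = margin

# lstsq returns the minimum-norm solution, which here has mean zero
# (adding a constant to every rating changes no predicted outcome).
ratings, *_ = np.linalg.lstsq(X, y, rcond=None)

# Matrix route on the same schedule, with R0 = average margin per team.
games_played = np.abs(X).sum(axis=0)
counts = np.abs(X).T @ np.abs(X)     # off-diagonal: games between i and j
np.fill_diagonal(counts, 0.0)
S = counts / games_played[:, None]
R0 = (X.T @ y) / games_played        # net margin divided by games played

I = np.eye(n)
N = I - np.ones((n, n)) / n          # normalizer: subtracts the mean
sos_ratings = np.linalg.inv(I - S @ N) @ R0

print(np.round(ratings, 2))          # about 7.25, -1.75, -0.75, -4.75
print(np.allclose(ratings, sos_ratings))
```

On this toy schedule the two sets of ratings agree, which is consistent with the equivalence but is not a proof of it.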

In the Google Docs spreadsheet, I used "Margin" as the game outcome dependent variable, but you could also use wins, difference in logit based on efficiency stats, etc. In addition, I think this is also equivalent to the very simple ranking system published at

Note: After typing this up, I realized that the calculation I've described differs somewhat from Brian's adjustment. It appears as if Brian averages GWP, converts that to logit, makes the adjustment, and then converts back to GWP for the next step. Mathematically, I think that introduces a non-linearity that my matrix approach can't handle. Still, I think the approach I've outlined has merit, even if it's not mathematically identical to Brian's approach.


Jim Glass said...

That's interesting, thanks.

I use an array for SOS adjustments, though a bit different from yours: 32x17, using iteration. Very simple. No matrix math. It basically works for any stat, to "adjust for opponent strength, [by] adjust[ing] each team efficiency stat according to the average opponents’ corresponding stat."

There are 32 rows for teams (Ariz, Atl, etc.) and 17 columns for the weeks of the season (plus more columns for playoff weeks if wanted). Each entry in a row has a number for the opposing team of that week. The array is set up at the start of the season. Each week, as game results come in, the spreadsheet fills in the numbers on the corresponding column. Read across the values in any row to date, take the average, and you have the given team's "average opponent strength" for whatever stat one is looking at. Use that to iterate.

As the most basic example take W-L% adjusted by SOS. When entering each week's game scores the spreadsheet tallies each team's W-L%. A second value for each team called "iterated WL%" (I-WL%(0), "0" for the initial iteration) is initially set to equal this, and is transferred into the array in all the appropriate spots. The result is that the row for each team includes the WL% for each of its opponents to date. Average the values, you have each team's average level of opposition by WL%. As simple as can be.

Now the spreadsheet figures each team's "true strength" WL%, as adjusted by its opponents' average WL%, using the Log5 formula. (For instance, if a team has a 60% WL% in games versus opponents with a 55% record, its recalculated WL% becomes approximately 65%.)
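For reference, the 60%/55% example can be reproduced with the standard Log5 relation solved for the team's own strength. This is a sketch assuming that is the form used in the spreadsheet; the function name is made up.

```python
def log5_true_strength(observed_wl, opponent_wl):
    """Strength p whose Log5 win probability against opponent_wl equals observed_wl."""
    num = observed_wl * opponent_wl
    den = num + (1.0 - observed_wl) * (1.0 - opponent_wl)
    return num / den

# A 60% record against opponents averaging 55%.
print(round(log5_true_strength(0.60, 0.55), 3))  # 0.647
```

In odds terms this multiplies the team's observed odds by its average opponent's odds, which is the same thing as adding logits.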

This adjusted WL% becomes its new iterated WL%, I-WL%(1). This number for each team now jumps into all the columns in the array, replacing the I-WL%(0) numbers, filling it with revised values. Rinse and repeat.

Each team's numbers will differ between iterations, I-WL%(0), (1), (2), etc., but with the differences getting smaller. When the numbers stabilize, there you are. It usually takes about a dozen iterations.

I set up a page with the array doing this at the beginning of the season; one can do it as soon as the league schedule is known. Once the page is set it can be used as a template to calculate SOS adjustments for practically any stat (with tweaking as the particular stat may require -- for instance replacing Log5 with whatever function is more appropriate). Just enter whatever the game result was for the stat, everything else is automatic. It can also calculate "future" SOS for the unplayed part of the season. There can be any number of these for different stats, they can interact, feed into a master stat page ... do all the things that can happen via a spreadsheet.

There's a video on the blog about how they calculate SOS adjustments. They seem to do it in a way entirely different from Brian, you, and me; I barely grasped it. It looks like there really are a lot of ways to crack a nut.

Andrew Foland said...

Peter Mucha at UNC has done some interesting work on adjacency graphs and strength of schedule; see also his updated college pages

Michael Beuoy said...

Thanks Jim. I'm still thinking through your methodology, but I think it may be equivalent to mine, if you replace the Log5 function with the identity function. It is interesting how many disparate ways there are to approach this problem. Another approach I've seen is a variant of Google's PageRank algorithm. Much in the same way that Google uses inbound links to a webpage as a "vote" for a good webpage, the method treats Team A losing to Team B as Team A "voting" for Team B, where Team A's vote counts for more if other teams have lost to (i.e. voted for) it.
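To make the "voting" idea concrete, here is a bare-bones power-iteration sketch. It is not the specific variant I've seen: the 0.85 damping factor, the even spreading of an unbeaten team's weight, the function name, and the game results are all assumptions for illustration.

```python
import numpy as np

def loss_vote_rank(results, n_teams, damping=0.85, iterations=100):
    """Power iteration where each loss is a 'vote' from the loser to the winner."""
    votes = np.zeros((n_teams, n_teams))   # votes[winner, loser]
    for winner, loser in results:
        votes[winner, loser] += 1.0
    losses = votes.sum(axis=0)             # total losses per team
    safe_losses = np.where(losses > 0, losses, 1.0)
    # Each column splits a loser's vote among the teams that beat it;
    # an unbeaten team casts no votes, so its weight is spread evenly.
    M = np.where(losses > 0, votes / safe_losses, 1.0 / n_teams)
    score = np.full(n_teams, 1.0 / n_teams)
    for _ in range(iterations):
        score = (1.0 - damping) / n_teams + damping * (M @ score)
    return score

# Hypothetical results as (winner, loser) pairs; team 0 is unbeaten, team 3 winless.
toy_results = [(0, 1), (0, 2), (1, 2), (2, 3), (0, 3)]
scores = loss_vote_rank(toy_results, n_teams=4)
print(np.round(scores, 3))
```

The scores sum to 1, and on this toy schedule the unbeaten team comes out on top and the winless team at the bottom.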

Andrew - Thanks. I'll take a look.

Anonymous said...

I think you forgot Oakland...

Michael Beuoy said...

Oakland is "RAI".
