## Wednesday, January 12, 2011

### Final "BigWin%" Ratings for 2010, Looking to the Playoffs

by Jim Glass
Two earlier posts here covered the subject of how team records in one-sided games – "Big Wins" and "Big Losses" – can be a much better predictor of future performance than overall won-lost record, or even Pythagorean expectation – particularly when predicting performance in the playoffs.
Define a Big Win/Big Loss as being by 10+ points, treat all other games as ties (half a win), and compute each team's "BigWin%". The concept is that the result will be a better indicator of true team strength than regular W-L percentage. The inspiration for this idea was a post on this site showing that nearly half of all NFL game outcomes are determined by luck. It is reasonable to assume that most of those games are the closest games where a few chance events can tip the outcome – so if those close games are treated as ties (the median point differential in NFL games is about 10 points) the "noise" injected by chance into regular W-L records will be largely eliminated, giving a truer picture of team strength. The statistical record backs this up.
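The definition above can be sketched in a few lines of Python. This is only an illustration of the counting rule as described, not the spreadsheet behind the ratings; the function name `big_win_pct`, the `threshold` parameter, and the sample margins are made up for the example.

```python
def big_win_pct(margins, threshold=10):
    """Return BigWin% for one team.

    margins: point differentials (team score minus opponent score), one per game.
    threshold: minimum margin that counts as "Big" (10 in the post).
    """
    if not margins:
        raise ValueError("need at least one game")
    big_wins = sum(1 for m in margins if m >= threshold)
    big_losses = sum(1 for m in margins if m <= -threshold)
    close_games = len(margins) - big_wins - big_losses  # treated as ties
    return (big_wins + 0.5 * close_games) / len(margins)


# Made-up four-game sample: one Big Win, two close games, one Big Loss.
sample_margins = [14, 3, -7, -21]
print(big_win_pct(sample_margins))  # 0.5
```

Keeping the cutoff as a parameter makes it easy to try break points other than 10.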

The playoff record
To recap what's been reported here earlier at greater length...
In the 15 years from 1995 through 2009, exactly 100 teams made the playoffs with a record of 11-5 or better. Of these:
* The top 37 teams by normal regular season W-L, with records of 13-3 or better (total: 497-95, 84%) produced in the playoffs a record of 43-31, a 58% winning percentage (with six Super Bowl winners, 16% of the 37 teams).
* The top 30 teams by Pythagorean expectation – based on points for-against ratio, and well-proven as predicting future W-L more accurately than past W-L – won the same 43 playoff games and lost only 23, a better 65% winning record (with seven Super Bowl winners, 23% of the 30).
* It took only the top 24 teams by BigWin% to collect the same 43 playoff victories, against only 15 losses, an even better 74% winning percentage (with nine Super Bowl winners, 38% of the 24).
What about the teams that showed superior "clutch" character by winning the most close games during the regular season (as in "great teams win close games")?
* It took the 45 teams with the most "close wins" (by 10 points or less) during the regular season to win 43 playoff games, while losing 42, only a 50.6% winning percentage. (With three Super Bowl winners, 7% of the 45 teams). The nine best teams at winning close games, each with nine or more close wins during the regular season, went 8-9 in the playoffs.
OK, so much for recapping. (Anyone interested in more details can see the original post.)

Final 2010 ratings
I should have sent this in last week, but family emergencies prevented it. Life is full of conflicting priorities and hard decisions ("Should I take the wife to the hospital or finish the ratings? ... take the wife to the hospital or finish the ratings? ... aw, if I don't take her I'll never hear the end of it ...")
So, belatedly, here are the final BigW% ratings for 2010. These are computed using a strength-of-schedule adjustment based on BigW%, so the "BigW" number for each team is the number expected from playing the same league-average-strength schedule. The last two columns give the average strength of each team's opposition and its rank among all.

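| Rank | Team | BigW | BigL | PCT | SoS | SoS Rank |
|---|---|---|---|---|---|---|
| 1 | NEWE | 11.8 | 4.2 | 0.737 | 0.562 | 3 |
| 2 | PITT | 10.9 | 5.1 | 0.683 | 0.531 | 9 |
| 3 | GBAY | 10.9 | 5.1 | 0.682 | 0.530 | 10 |
| 4 | NYJT | 10.3 | 5.7 | 0.643 | 0.553 | 5 |
| 5 | BALT | 9.9 | 6.1 | 0.617 | 0.525 | 11 |
| 6 | SAND | 9.3 | 6.7 | 0.584 | 0.458 | 24 |
| 7 | ATLA | 9.2 | 6.8 | 0.577 | 0.451 | 25 |
| 8 | CHIC | 9.2 | 6.8 | 0.574 | 0.512 | 14 |
| 9 | INDY | 9.1 | 6.9 | 0.570 | 0.508 | 17 |
| 9 | PHIL | 9.1 | 6.9 | 0.570 | 0.508 | 18 |
| 11 | KCTY | 8.7 | 7.3 | 0.542 | 0.448 | 26 |
| 11 | CLEV | 8.7 | 7.3 | 0.542 | 0.542 | 8 |
| 11 | NYGS | 8.7 | 7.3 | 0.542 | 0.479 | 22 |
| 14 | TENN | 8.6 | 7.4 | 0.540 | 0.510 | 16 |
| 15 | NORL | 8.6 | 7.4 | 0.535 | 0.441 | 27 |
| 16 | DALL | 8.5 | 7.5 | 0.532 | 0.501 | 19 |
| 17 | OAKL | 8.4 | 7.6 | 0.525 | 0.463 | 23 |
| 18 | MIAM | 8.3 | 7.7 | 0.521 | 0.583 | 2 |
| 19 | DETR | 8.2 | 7.8 | 0.515 | 0.547 | 6 |
| 20 | TAMP | 8.0 | 8.0 | 0.501 | 0.439 | 28 |
| 21 | CINC | 8.0 | 8.0 | 0.497 | 0.560 | 4 |
| 22 | MINN | 7.7 | 8.3 | 0.481 | 0.544 | 7 |
| 23 | HOUS | 7.4 | 8.6 | 0.460 | 0.523 | 13 |
| 24 | BUFF | 7.3 | 8.7 | 0.457 | 0.585 | 1 |
| 25 | JAXS | 6.7 | 9.3 | 0.418 | 0.512 | 14 |
| 26 | WASH | 6.4 | 9.6 | 0.398 | 0.525 | 11 |
| 27 | STLO | 6.0 | 10.0 | 0.372 | 0.402 | 32 |
| 28 | SANF | 5.8 | 10.2 | 0.364 | 0.424 | 31 |
| 29 | DENV | 5.8 | 10.2 | 0.363 | 0.487 | 21 |
| 30 | SEAT | 4.5 | 11.5 | 0.280 | 0.426 | 30 |
| 31 | ARIZ | 3.6 | 12.4 | 0.225 | 0.427 | 29 |
| 32 | CARO | 2.5 | 13.5 | 0.154 | 0.495 | 20 |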

Some quick observations. By BigW%...
* The AFC East is the strongest division by far and its teams faced by far the toughest schedule, playing each other and a very tough out-of-division slate. The three toughest schedules were faced by Miami, Buffalo and New England, while the Jets played the fifth-toughest. (Buffalo and Miami aren't as bad as their regular W-L records look.)
* New England's W-L record this year is extra impressive because top-winning teams usually have easier-than-average schedules if only because they get out of having to play themselves twice. (New England gets to play Buffalo twice, Buffalo has to play New England twice.) But the Pats have compiled their 14-2 against very near the toughest schedule in the league.
* Green Bay is the most powerful team in the NFC, by a bunch.
* San Diego is the best team to miss the playoffs (and would be the second-strongest team in the NFC). Go Norv! Go!
* Atlanta, everybody's favorite team to talk about on this site, is genuinely good but not 13-3 good. The Falcons are an impressive 8-2 in close games – if you are impressed by close games, which BigW% isn't. However their +4 in Big Wins is very good and makes them #2 (if just a bit ahead of Chicago) in the NFC.
* It's an AFC year, with five of the top six and seven of the top eleven teams in the AFC.

"For entertainment purposes only"
What does all this mean for the playoffs? At this point I have no idea how differences in BigW% translate into point spreads. (And unless you are in Nevada or Atlantic City or somewhere else where sports gambling is legal, of course you wouldn't be interested in that anyhow.)
This means the only real "calls" to be made using BigW% are when the public (as evidenced by, say, the betting line) believes one team is best and will win outright, while BigW% says the other team is best and will win outright, so the spread isn't an issue.
How did BigW% perform by that standard, for entertainment purposes, over the wild-card game weekend?
* BigW% said Green Bay was better than Philadelphia outright, while the spread had the Eagles better outright. I had no qualms about entertaining my friends by telling them to, um, root for the Packers. 1-0.
* Seattle has as dreadful a record by BigW% as it does under any other rating system. I'd never have bet, er, expected the Seahawks to beat the Saints outright. But by BigW% New Orleans was a barely-over-.500 team, so BigW% scores this as only an ordinary "bad team beats mediocre team" upset, not a huge "bad team defeats the defending Super Bowl champion" upset.
* Kansas City got its 10 wins playing against the easiest schedule in the AFC according to BigW%, so it getting crushed by a clearly better team that played a much tougher schedule is hardly a shock.
* The Jets were somewhat stronger than Indianapolis by BigW%, but the Colts had offsetting home field advantage making the odds close. That's pretty much the same as the point spread indicated -- the Colts being 2-point favorites in Indianapolis indicates New York would have been about 4-point favorites in New Jersey. As I don't know how much home field advantage is worth by BigW%, I'd call this game for practical purposes a push, which is pretty much how it ended.
So I'd tally this up as BigW% being successfully entertaining to the extent of 1-0 plus leaning the right way twice and being neutral once in the three no-call games.
And that's the end of my last item on BigW% for this season.
Happy New Year to all, and may the gods of the playoff pools entertain you well, to the extent it is legal.

Bruce D. said...

Jim,

Nice follow-up to your previous post.

Very "out of the box" and innovative. I plan on stealing, um, borrowing your ideas here.

SoS seems to add a lot of credibility.

"Awesome job. I wonder though if 10 points is too many to be called a close game. If the cutoff is reduced to 7 points or less, would the results be robust? Do the samples get too small? Perhaps it would be even more condemning for the clutch narrative."

+1 for that (barring family emergencies, of course)

Jim Glass said...

Bruce, thanks for the kind words.

I definitely plan to check numbers other than 10 as the break point, but it will have to wait until the off season due to (1) lack of time, and (2) call it "legacy problems" with my spreadsheet.

This started merely as a very quick-and-dirty look at the playoff records of teams that win "close games" in the regular season, just for my own interest. I picked 10 points to define "close" because of Brian's post a while back saying half of games are determined by luck, and the median point margin in games last year was 10 points.

"The value of winning close games" was settled for me when I found that over the past 15 years the 15 teams with the best records in close games during the regular season -- a combined 103-11 in them, pretty dang impressive! -- in the playoffs then went 15-14, 52%. That was what I was looking for. Done, case closed. Then I noticed the Big Wins business.

I wasn't thinking of looking at "Big" wins and losses at all, sure wasn't thinking of examining them to write about. So the spreadsheet is a big kludge I keep adding to ad hoc, with the 10 points sort of hard-coded in, not manipulation friendly.

This year the median point margin is less, so even by the initial simple logic 10 points is too many: too many games are being treated as BigTies, not enough as BWs and BLs. And now Brian says it's closer to 40% of games determined by luck. So clearly the calibration is a long way from right -- but still, for a crude version 0.1 attempt, I find the results interesting.

My intuition is that to seriously reduce luck in results requires a point differential of more than one score, I'm thinking 8.5 points. But intuition is only a starting point, from there it takes test test test to find what produces the best result. I plan to re-write the program code and use much more robust data to do that in the offseason, real-life permitting.

And hey, feel free to steal all you want, and I'll steal from you. That's what a community is for!

Bruce D. said...

Jim,

I've written a program using your BigWin formula that connects to a database.

It'll be easy to check the various point differences as stated above, the only thing missing is the SoS factor.

If you can give a quick example of how you factored SoS in, I'll rerun the program for various point differentials to see which is the most "entertaining".

Jim Glass said...

Bruce: My 15-year playoff database is a kludge, the 2010 BigW-L% one runs easy.

Here are the post-wild card games BigW% ratings for the remaining playoff teams:

(I know this can't line up right)

_________BW____BL____pct
NEWE___11.8___4.2___0.739
PITT___11.0___5.0___0.686
GBAY___11.5___5.5___0.676
NYJT___10.9___6.1___0.641
BALT___10.9___6.1___0.640
ATLA____9.3___6.7___0.578
CHIC____9.2___6.8___0.574
SEAT____4.9__12.1___0.291

Heroically assuming the thing still to be proven (but which seems plausible so far), that these numbers are an improved realistic estimate of relative team strength, then applying Log5 gives these probable winning percentages for neutral field games between these teams...

NEWE__0.613
NYJT__0.387

PITT__0.551
BALT__0.449

CHIC__0.767
SEAT__0.233

GBAY__0.604
ATLA__0.396
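For anyone who wants to check the arithmetic, here is a small sketch of the standard Log5 formula applied to the ratings listed above. The function name `log5` and the dictionary layout are just choices made for this example.

```python
def log5(a, b):
    """Probability that a team of strength a beats a team of strength b
    on a neutral field, by the Log5 formula."""
    if not (0 < a < 1 and 0 < b < 1):
        raise ValueError("strengths must be strictly between 0 and 1")
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


# Post-wild-card BigW% ratings, copied from the table above.
ratings = {
    "NEWE": 0.739, "PITT": 0.686, "GBAY": 0.676, "NYJT": 0.641,
    "BALT": 0.640, "ATLA": 0.578, "CHIC": 0.574, "SEAT": 0.291,
}

matchups = [("NEWE", "NYJT"), ("PITT", "BALT"), ("CHIC", "SEAT"), ("GBAY", "ATLA")]
for first, second in matchups:
    p = log5(ratings[first], ratings[second])
    print(f"{first} {p:.3f}   {second} {1 - p:.3f}")
```

Rounded to three places, this gives the same four pairs of numbers as listed above.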

If you can convert these to point spreads and adjust for home field advantage, well, feel free to "entertain" yourself with all the games all weekend.

Disclaimer: The data and analysis going into it all are guaranteed fully sound and error-free up to the amount of cash you have paid me for it, minus a \$100 deductible.

"If you can give a quick example of how you factored SoS in, I'll rerun the program for various point differentials to see which is the most 'entertaining'."

If you are proposing to test break points (other than 10) as in my 15-year playoff run, the SoS adjustment is easy and standard, I think.

At the risk of telling what you already know (in an exercise like describing a spiral staircase without using my hands)...

Count BigWs and BigLs with close games as ties, to give each team a normal W-L%. Call that its WinPct(0). Then its opponents' combined average win% (their average WinPct(0)) is computed. Figure the original team's adjusted win% against that strength of schedule, and the result is its WinPct(1), and do for all teams. Then substitute all the WinPct(1)s for the WinPct(0)s in figuring average opponent strength for each team, etc. Rinse and repeat until the numbers stabilize.

For instance, using the simple arithmetic method, say Team A has a 60% winning% against opponents with an average 46% winning%. As 46% is 4 below 50% (league average) drop Team A's adjusted winning% by 4 points to 56% -- winning 60% against .460 teams is equivalent to winning 56% against .500 teams. Now Team A's 56% winning% is used instead of its original 60% in the next iteration of figuring every other team's strength of schedule. Repeat until the numbers stop changing. Hope all the combined team strengths and SoSes then average out to .500.

The arithmetic method is very simple and fine as long as team win%s are clustered around 50%, say from 35% to 65% (as in MLB). But since it is a linear approximation of a non-linear reality, at the extremes errors become significant -- it can give teams like NE and CAR winning over 100%/less than 0% of their games. So instead I use Log5 in the iterations, which is complex for pencil-and-paper but easy for a spreadsheet. The little extra coding is a cheap price to pay to get out of having to convert a >100% chance of winning into a point spread.

Have fun!

Jim Glass said...

Bruce, dang it, I haven't been able to get a comment posted in response to yours -- and then when I did get one up here, after a while it disappeared. So let me break it up, rewrite to make the gods of Blogger and comment moderation (?) happy, and try again...
* * *

Bruce: My 15-year playoff database is a kludge, but the 2010 BigW-L% one runs easy. Here are the post-wild card games BigW% ratings for the remaining playoff teams:

(I know this can't line up right)

_________BW____BL____pct
NEWE___11.8___4.2___0.739
PITT___11.0___5.0___0.686
GBAY___11.5___5.5___0.676
NYJT___10.9___6.1___0.641
BALT___10.9___6.1___0.640
ATLA____9.3___6.7___0.578
CHIC____9.2___6.8___0.574
SEAT____4.9__12.1___0.291

Heroically assuming the thing still to be proven (but which seems plausible so far), that these numbers are a realistic estimate of relative team strength, then applying Log5 gives these probable winning percentages for neutral-field games between these teams...

NEWE__0.613
NYJT__0.387

PITT__0.551
BALT__0.449

CHIC__0.767
SEAT__0.233

GBAY__0.604
ATLA__0.396

If you can convert these to point spreads and adjust for home field advantage, well, feel free to "entertain" yourself with all the games all weekend.

Disclaimer: The data and analysis going into it all are guaranteed fully sound and error-free up to the amount of cash you have paid me for it, minus a \$100 deductible. [tbc]

Jim Glass said...

"If you can give a quick example of how you factored SoS in, I'll rerun the program for various point differentials to see which is the most 'entertaining'."

If you are proposing to test break points (other than 10) as in my 15-year playoff run, the SoS adjustment is easy and standard, I think. At the risk of telling what you already know (in an exercise like describing a spiral staircase without using my hands)...

Count BigWs and BigLs with close games as ties, to give each team a normal W-L%. Call that its WinPct(0). Then all its opponents' combined average win% (their average WinPct(0)) is computed. Figure the original team's adjusted win% against that strength of schedule, and the result is its WinPct(1), and do for all teams. Then substitute all the WinPct(1)s for the WinPct(0)s in figuring average opponent strength for each team, etc. Rinse and repeat until the numbers stabilize.

For instance, using the simple arithmetic method, say Team A has a 60% winning% against opponents with an average 46% winning%. As 46% is 4 below 50% (league average) drop Team A's adjusted winning% by 4 points to 56% -- winning 60% against .460 teams is equivalent to winning 56% against .500 teams. If Team B has a 48% winning% against teams with an average 65% winning%, its adjusted winning% becomes 63% (higher than Team A's even though it has a worse regular W-L record). That's the first cut at expected performance against a league-average .500 schedule for Teams A, B, C, etc. Except that now each has a changed strength of opposition, so the process has to be repeated until the numbers stop changing. (Then hope all the combined team strengths and SoSes still average out to .500.)

At the end you get the expected record for each team against league-average .500 opposition, so they are all on the same scale, plus a list of opponents for each team by final adjusted "BigW-L" strength, which (purportedly) gives an improved SoS measure for the opposition against which, for instance, KC got its regular 10 wins.
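The simple arithmetic method described above might look something like this in Python. It is a rough sketch under stated assumptions, not the actual spreadsheet: the names `arithmetic_adjust` and `iterate_sos`, the tolerance, the iteration cap, and the four-team round-robin league are all invented for illustration, and convergence is not guaranteed for every possible schedule.

```python
def arithmetic_adjust(win_pct, opp_avg):
    """Shift a team's win% by how far its opponents' average is from .500."""
    return win_pct + (opp_avg - 0.5)


def iterate_sos(raw, schedule, tol=1e-9, max_iter=1000):
    """Repeat the adjustment until the ratings stop changing.

    raw: team -> unadjusted win% (close games counted as ties).
    schedule: team -> list of opponents faced (repeat a name for repeat games).
    """
    ratings = dict(raw)
    for _ in range(max_iter):
        new = {}
        for team, opps in schedule.items():
            # Opponent strength uses the latest adjusted ratings ...
            sos = sum(ratings[o] for o in opps) / len(opps)
            # ... but the adjustment is always applied to the raw win%.
            new[team] = arithmetic_adjust(raw[team], sos)
        if max(abs(new[t] - ratings[t]) for t in new) < tol:
            return new
        ratings = new
    return ratings  # hit the iteration cap without stabilizing


# The two worked examples from the text.
print(round(arithmetic_adjust(0.60, 0.46), 2))  # 0.56
print(round(arithmetic_adjust(0.48, 0.65), 2))  # 0.63

# Invented four-team league in which everybody plays everybody once.
raw = {"A": 0.75, "B": 0.50, "C": 0.50, "D": 0.25}
schedule = {t: [o for o in raw if o != t] for t in raw}
print(iterate_sos(raw, schedule))
```

In the toy league the best team ends up rated a bit below its raw 75%, because it is the only team that never had to play the best team.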

Now ... the simple arithmetic method is very easy and fine as long as team win%s are clustered around 50%, say from 35% to 65% (as in MLB). But it is a linear approximation of a non-linear reality, so at the extremes errors become significant -- NE projects to win 124% of games against CAR.

So instead I use Log5 in the iterations, which is complex for pencil-and-paper but easy for a spreadsheet. The little extra coding is a cheap price to pay to get out of having to convert a 124% chance of winning into a point spread.
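Exactly how Log5 enters the iterations is not spelled out here, so the following is only one plausible reading, offered as a sketch: invert Log5 to ask what strength a team must have to post a given win% against opponents of a given average strength. The function names `log5` and `strength_from_record` are made up for this example.

```python
def log5(a, b):
    """Probability that a team of strength a beats a team of strength b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def strength_from_record(win_pct, opp_strength):
    """Inverse of Log5 (an assumed reading of the adjustment step):
    the strength t for which log5(t, opp_strength) equals win_pct."""
    num = win_pct * opp_strength
    return num / (num + (1 - win_pct) * (1 - opp_strength))


# Team A from the example: 60% against .460 opposition.
print(round(strength_from_record(0.60, 0.46), 3))  # 0.561, close to the arithmetic 56%

# A made-up extreme case: 90% against .750 opposition.
# The arithmetic method would give 0.90 + 0.25 = 1.15, i.e. 115%.
print(round(strength_from_record(0.90, 0.75), 3))  # 0.964, still below 100%
```

This version would replace the add-and-subtract step inside the same repeat-until-stable loop; the result can never leave the 0-to-1 range as long as the inputs are strictly inside it.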

Have fun!

Jim Glass said...

Just for the record and the heck of it (and to procrastinate away from some real-world work) I estimated point spreads for this week's games using "reverse Pythagorean" -- going from win probability to point differential, instead of the normal other way around, using the Pythagorean formula.

FWIW, for three of the games the total net point spreads by this method and from Vegas were exactly the same, 22 (with slight differences per game). So for this tiny example they approximate each other surprisingly well. The fourth game with the biggest difference of course was the Atlanta game. People are taking Atlanta's close wins seriously, looks like.

Using 2.5 pts for home field advantage, the "BigW% spread" (rough, chunky style) and smooth Vegas spread (as per today's NY Post) are ...

____________BigW___Vegas

NEWE__0.613 -6 ... -8.5
NYJT__0.387

PITT__0.551 -4 ... -3.5
BALT__0.449

CHIC__0.767 -12 ... -10
SEAT__0.233

GBAY__0.604 -1 ... +2.5
ATLA__0.396
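The "reverse Pythagorean" step can be sketched as below. This is not the calculation behind the table above and will not necessarily reproduce its numbers: the exponent of 2.37 (a commonly quoted NFL value) and the 44 total points per game are assumptions chosen for illustration, and the function names are hypothetical. Only the 2.5-point home field figure comes from the text.

```python
def neutral_margin(p, exponent=2.37, total_points=44.0):
    """Expected point margin on a neutral field for a team with win probability p.

    Pythagorean says p = PF**x / (PF**x + PA**x), so PF/PA = (p / (1 - p)) ** (1 / x).
    The exponent and total_points defaults are assumptions, not from the post.
    """
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    ratio = (p / (1 - p)) ** (1 / exponent)  # implied points-for / points-against
    points_for = total_points * ratio / (1 + ratio)
    points_against = total_points - points_for
    return points_for - points_against


def home_spread(p_home_neutral, hfa=2.5, exponent=2.37, total_points=44.0):
    """Spread from the home team's side; negative means the home team is favored."""
    return -(neutral_margin(p_home_neutral, exponent, total_points) + hfa)


# Neutral-field win probabilities for the home teams, taken from the Log5 numbers.
for home, p in [("NEWE", 0.613), ("PITT", 0.551), ("CHIC", 0.767), ("ATLA", 0.396)]:
    print(f"{home} {home_spread(p):+.1f}")
```

Changing `exponent` or `total_points` moves every spread, which is one reason a "rough, chunky" version and the Vegas line can differ by a point or two.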

I'm rushing out now to bet this month's mortgage payment on the Packers. For the entertainment of it only.

Jim Glass said...

Hey, through two rounds this is picking pretty darn well for entertainment purposes -- plus a free mortgage payment!

Just for the record, this week via BigW%, rating and spread giving 2.5 to the home team...

___rating_____BigW___Vegas

NYJ .648
PIT .680 __ -3.5 __ -3.5

GBY .699 __ -1.5 __ -3.5
CHI .591

Bruce D. said...

Jim,

You've rendered multiple man-hours of programming and analysis moot, now what do I do?

(Been lazy, haven't finished checking the various point differentials YET.)

Jim Glass said...

And for the big game, BigW% says it's as even a match as one could hope for: