by Denis O'Regan
Here's a breakdown of the main offensive and defensive stats for Sunday's Super Bowl teams. Inevitably the post's heavy on numbers, so a brief summary may be in order.
The most useful and relevant stat to future performance is a team's yards per carry/attempt on offense or defense compared to those of the opponents they've faced throughout the season. A team that gets 4.5 yards per carry and does it over the season against defenses that only allow 4 yards per carry, can confidently be assumed to be above average when it comes to moving the ball on the ground.
These numbers are the first ones quoted in the subsequent post.
The supplementary stats that break down the numbers by play direction and field depth merely add colour. Sample sizes on these plays are much smaller, so any conclusions drawn will inevitably come with a degree of caution. They can, however, highlight where a team has had considerable success or major problems during the season.
Matching each team's offensive stats against their opponent's defensive stats shows areas where each team may or may not be successful on Sunday. Judging by the matchups, both teams look like having difficulty moving the ball on the ground. Pittsburgh should pass the ball well, whilst Arizona, despite good passing figures, will struggle because they match up badly with Pittsburgh's excellent pass defense.
The primary stats also form the basis for a prediction model that has consistently outperformed the Vegas benchmark over multiple seasons and on this occasion predicts Pittsburgh having a 67% winning chance on Sunday. Various subsets, for example using matchups involving playoff calibre teams only, increase Arizona's chances by a percentage point or so. So, not surprisingly, the game appears to be Pittsburgh's to lose, although Arizona should stay within a touchdown.
The model predicts around 40 points to be scored, but the one-off nature of this game can see predictions made to look rather silly very quickly.
Enjoy the game.
Pittsburgh's Run Defense.
Pittsburgh allowed only 3.3 yards per carry through the regular season. Their opponents over that period played a combined 256 games, made over 7,000 rushing plays and gained an average of 4.12 yards per carry. If we assume that Pittsburgh's opponents' opponents represent a broad cross section of the NFL, that allows us to put the Steelers' raw 3.3 ypc figure into some sort of context. They allow 0.82 ypc less than their opponents habitually gained.
If we further break down the Pittsburgh run defense by looking at yards allowed depending on which direction the play was run, we find that they excel all the way along the line.
When opponents ran behind their own right end they averaged 5 ypc. When they ran in that direction against Pittsburgh they gained just 3.4 ypc.
Running behind right tackle, opponents gained 4.15 ypc on average, but just 3.28 against Pittsburgh.
Right guard figures were 4.06 ypc overall compared to 3 ypc against Pittsburgh.
Runs up the middle were 4.1 to 3.24 ypc.
Runs to left guard were 4.2 to 3.45 ypc.
Runs behind left tackle were the only direction where opponents did better against Pittsburgh than their league average. They gained 4 ypc overall and 4.21 ypc against Pittsburgh.
Opponents averaged 5.3 ypc on runs designated as left end, but just 3.79 ypc against Pittsburgh.
Arizona's Run Offense.
NFL games are all about matchups and Arizona look like struggling badly if they try to move the ball on the ground.
They gain 3.46 ypc against teams that allowed 4.11 ypc. So already they are gaining just 84% of their opponents' average rushing yardage allowed per play. When you match those numbers up against a Pittsburgh run defense that has allowed teams just 80% of their usual yards per carry, it begins to look likely that Arizona will struggle to get even 3 yards per carry.
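The percentages above can be reproduced, and loosely combined into a projection, with a few lines of Python. This is only a sketch: the post does not spell out how the two ratios are combined, so the multiplicative `project_ypc` function and the `LEAGUE_YPC` baseline of 4.1 are assumptions made for illustration.

```python
def relative_rate(team_ypc: float, opponents_ypc: float) -> float:
    """Team's yards per carry as a fraction of what its opponents usually gain or allow."""
    return team_ypc / opponents_ypc


def project_ypc(offense_ratio: float, defense_ratio: float, league_ypc: float) -> float:
    """Assumed multiplicative matchup: scale a league baseline by both ratios."""
    return league_ypc * offense_ratio * defense_ratio


# Figures quoted in the post.
arizona_offense = relative_rate(3.46, 4.11)     # roughly 0.84
pittsburgh_defense = relative_rate(3.30, 4.12)  # roughly 0.80

# LEAGUE_YPC is an assumed baseline, not a number taken from the post.
LEAGUE_YPC = 4.1
projection = project_ypc(arizona_offense, pittsburgh_defense, LEAGUE_YPC)
print(f"Arizona rushing projection: {projection:.2f} ypc")
```

Under these assumptions the projection comes out a little under 3 ypc, in line with the expectation stated above.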
Split by direction, the Cardinals are strongest running to their right side. They gain 4.23 ypc against defenses that allow 3.82 ypc when running behind their right guard. They're around average running to right end (gain 4.6 ypc against 4.8 ypc defenses), but tail off at right tackle (gain 3.8 ypc against 4.32 ypc defenses).
Running up the middle is a real struggle (gain 2.6 ypc against 4.14 ypc defenses), as is left guard (2.44 ypc versus 3.72 ypc). There's an improved, but still below average, effort behind left tackle (gain 4 ypc against 4.25 ypc defenses) and they're around league average when stretching it out to the left end (5.06 ypc against 5.07 ypc defenses).
Arizona's Run Defense.
Overall Arizona allow teams that average 4.34 ypc to get just 3.96 ypc. So they're good, but not in the same league as the Steelers. They are also patchy along the line.
Teams running behind their own right end gain 5.76 ypc against Arizona compared to an overall, combined season-long figure of just 5.1 ypc.
Behind right tackle they gain just 2.33 ypc against Arizona compared to 4.49 ypc overall.
Behind right guard they gain 4.23 ypc against Arizona compared to 4.11 ypc.
Up the middle they gain 4.4 ypc against Arizona compared to 4.5 ypc.
Behind left guard they gain 5.2 ypc against Arizona compared to 3.93 ypc.
Behind left tackle they gain 4.02 ypc against Arizona compared to 4.42 ypc.
Behind left end opponents gain 5.3 ypc against Arizona compared to 5.3 ypc overall.
So unlike Pittsburgh's run defense, which is virtually bombproof wherever you try to attack it, Arizona does have areas of vulnerability that don't show up in the fairly impressive average yards per carry number.
Pittsburgh's Run Offense.
Overall Pittsburgh gain 3.68 yards per carry against opponents who allow on average 4.03 ypc. That makes them below average, but not to the same extent as Arizona. They struggle most running the ball up the middle, but progressively improve when running out towards the edges and are actually above average when running in the direction of left end.
Running behind right end they gain 4.27 ypc against defenses who allow 4.8 ypc.
Behind right tackle they gain 3.62 ypc against 4.03 defenses.
Behind right guard they gain 3.84 ypc against 4.24 ypc defenses.
Up the middle they gain 3.2 ypc against 3.73 ypc defenses.
Behind left guard they gain 3.34 ypc against 4.0 ypc defenses.
Behind left tackle they gain 3.84 ypc against 3.92 ypc defenses.
Behind left end they gain 5.97 ypc against 5.15 ypc defenses.
Again we have a below average running attack matched up with an above average run defense; however, Pittsburgh should be able to run the ball better than Arizona on the day. We've already seen that there are areas of weakness in the Arizona defensive line that can be exploited. Around 3.5 ypc looks a reasonable upside for the Steelers on Sunday.
Now for the aerial matchups.
Pittsburgh's pass Defense.
The Steelers are a very, very good pass defense. They allow teams who averaged 6.18 yards per pass attempt to pass for only 4.6 yards per attempt. Those are exceptional figures.
I further break these numbers down by looking at which areas of the field are best defended. Passes that are caught within twenty yards of the line of scrimmage are designated as short; anything longer is deep. Passes are further split as being caught to the right side of the field from the offense's viewpoint, middle or left. Again a team's raw yards per attempt figure is compared to the average ypa allowed or gained by their seasonal opponents when defending or attacking these same areas.
As with their run defense, the Steelers do not have to hide any of their players and they defend all parts of the field equally well.
They allow 4.9 ypa on short left passes against teams who average 6.11 ypa over the season.
They allow 5.4 ypa on short middle passes against teams who average 6.32 ypa.
They allow 4.8 ypa on short right passes against teams who average 5.39 ypa.
Deeper passes are defended even better.
They allow 6.87 ypa on deep left passes against teams who average 10.24 ypa.
They allow 9.38 ypa on deep middle passes against teams who average 11.44 ypa.
They allow 8.71 ypa on deep right passes against teams who average 9.94 ypa.
Well above average right across the board.
Arizona's pass Offense.
The Cardinals pass the ball very well: they gain 7.38 ypa against defenses who allow just 6.54 ypa. However, that still compares unfavourably with the Pittsburgh defense. Overall Pittsburgh's pass defense allows 1.6 ypa less than their opponents gain over the season, whilst Arizona only gain 0.8 ypa more than their opponents allow. That still gives Pittsburgh's pass defense the upper hand by a fairly large margin.
Broken down by field position.
Arizona gain 6.52 ypa on short left passes against defenses who allow 5.94 ypa.
They gain 6.34 ypa on short middle passes against defenses who allow 6.8 ypa.
They gain 5.68 ypa on short right passes against defenses who allow 5.43 ypa.
These short passes match up particularly badly for Arizona against Pittsburgh's short range passing defense. If they are going to have any success aerially it's going to come on the deep ball, where they potentially have the upper hand when throwing deep middle and deep right. But these are high risk/high reward plays.
Arizona gain 13.5 ypa on deep left passes against defenses who allow 11.24 ypa.
They gain 22.4 ypa on deep middle passes against defenses who allow 13.39 ypa.
They gain 17.07 ypa on deep right passes against defenses who allow 11.64 ypa.
These are great figures, but bear in mind that in a normal game plan the Cardinals will only be throwing around half a dozen such passes. That will make the yardage susceptible to small sample errors and also limit their impact on the game compared to the more numerous shorter passes.
Arizona's pass Defense.
Arizona's pass defense is almost as bad as their passing offense is good. They allow teams who average 6.29 ypa to get 6.77 ypa when they play Arizona.
They allow 7.83 ypa on short left passes against teams who average 6.01 ypa over the season.
They allow 7.38 ypa on short middle passes against teams who average 6.79 ypa.
They allow 5.09 ypa on short right passes against teams who average 5.38 ypa.
They're above average defending short right passes,but well below par elsewhere.
They allow 12.19 ypa on deep left passes against teams who average 10.91 ypa.
They allow 12.1 ypa on deep middle passes against teams who average 12.6 ypa.
They allow 7.99 ypa on deep right passes against teams who average 9.84 ypa.
Pittsburgh's pass Offense.
The Steelers are marginally above average passing the ball. They gain 6.48 ypa against defenses who allow 6.31 ypa. The bulk of their gains come when they're connecting with short passes, and they match up well against Arizona's pass defense in this area. They project to have success wherever they throw the ball short, and in a game that could see offenses struggling this will be the key area.
They gain 5.41 ypa on short left passes against defenses who allow 5.67 ypa.
They gain 7.80 ypa on short middle passes against defenses who allow 6.53 ypa.
They gain 6.26 ypa on short right passes against defenses who allow 5.38 ypa.
Pittsburgh gain 10.8 ypa on deep left passes against defenses who allow 11.34 ypa.
They gain 10.45 ypa on deep middle passes against defenses who allow 12.29 ypa.
They gain 11.40 ypa on deep right passes against defenses who allow 10.91 ypa.
Even though Arizona passes the ball better than Pittsburgh, the offensive/defensive matchups make it likely that it will be Pittsburgh who have greater success through the air on Sunday.
Friday, January 30, 2009
Superbowl Matchups
Wednesday, January 28, 2009
Super Bowl XLIII- Some Food for Statistical Thought
by Josh Fryman
Instead of drawing up an entire thesis or going into a detailed analysis of Sunday’s Super matchup, I decided to take the lazy way out. Below are four statistical insights that may shed some light on what will unfold in Super Bowl XLIII.
1. Blitz Kurt Warner at your own risk- Warner had a very impressive QB rating of 96.9 during the regular season, but in blitz situations, his rating rose to 103.8. Blitzes are gambles, and it is normal to expect a QB’s rating to increase in such situations, but Warner’s rating in these situations is quite high. This year, the Cardinals faced 197 blitzes on pass plays, so for some reason, teams seem to believe that the Cardinals offense is susceptible to them. Not a good idea. Not only does Warner’s QB rating improve, but when blitzed, his yards per attempt increase to 7.86.
A lot of Warner’s success comes from his quick release, but the plethora of options is a tremendous factor, as well. In the playoffs, the Cardinals have moved Larry Fitzgerald onto both sides of the field and also into the slot, which gives Warner his favorite target anywhere he wants him in case of a secondary mismatch. Much speculation has said that the Steelers will blitz a lot in order to diminish Warner’s opportunity to throw to his deep targets.
2. There’s a Hole in the Pittsburgh O-Line- And it’s at the center and left guard positions. Justin Hartwig and Chris Kemoeatu have not been doing Ben Roethlisberger many favors this year. Whereas most quarterbacks improve their passer rating during blitzes, Roethlisberger’s dropped this year from 80.3 in normal situations to 70.3 in blitzes. Considering that the average quarterback improves 5.6 rating points with a standard deviation of 4.4, Roethlisberger’s drop during blitzes is alarming. When you watch the film, you see that teams send blitzing defenders primarily into the gap between Hartwig and Kemoeatu. As if allowing sacks weren’t enough, the Steelers’ running backs’ yards per carry drop from 3.7 overall to 3.3 when running through the left side of the offensive line.
3. The Arizona DBs can be exploited- A lot has been made of the breakout performances of Antrel Rolle and Dominique Rodgers-Cromartie this postseason, especially given some of the interceptions that the secondary has come up with. Don’t be fooled by interceptions. As discussed before on this site and at my own blog, takeaways have a lot to do with chance, and they rely far more on an offense’s propensity to commit turnovers than on a defense’s ability to create them.
Furthermore, when one looks beyond the takeaways, one will see that wide receivers have had field days against Arizona of late. In the postseason alone, DeSean Jackson, Muhsin Muhammad, and Roddy White all had better statistical days (receptions, yards) than their season averages. Only Steve Smith had a subpar performance. During their current 4-game win streak, the Cardinals have allowed 275.5 yards passing per game. Super Bowl XL MVP Hines Ward should be licking his chops.
4. Big Ben will need to get into the shotgun- Aside from the O-Line woes, Roethlisberger has turnover problems in the form of both fumbles and interceptions. Furthermore, possibly the best linebacker in the Super Bowl will be lining up for Arizona. Karlos Dansby finished the season with 119 tackles and 9 stuffs behind the line. When the quarterback is under center with only a lone back, Dansby’s tackles per play nearly double. Furthermore, Arizona has registered 7 sacks and 9 stuffs in the postseason so far. Without a serious deep threat, Pittsburgh cannot afford to leave two men in the backfield to guard against these threats. Big Ben, however, cannot be trusted under center against a reinvigorated pass rush. Therefore, the gun may be the best call.
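To put the blitz numbers from point 2 on a common scale, the drop in Roethlisberger's rating can be expressed as a number of standard deviations from the average quarterback's change. The sketch below uses only the figures quoted in the post; reading the result as a conventional z-score assumes the rating changes are roughly normally distributed, which the post does not state.

```python
# Figures quoted in point 2 of the post.
mean_change = 5.6                     # average rating change when blitzed
sd_change = 4.4                       # standard deviation of that change
roethlisberger_change = 70.3 - 80.3   # his rating when blitzed minus his normal rating

# Distance from the average quarterback, in standard deviations.
z = (roethlisberger_change - mean_change) / sd_change
print(f"Roethlisberger's blitz change is {z:.2f} standard deviations from the mean")
```

A value of about 3.5 standard deviations below the mean is what makes the drop look alarming rather than routine.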
Wednesday, January 21, 2009
Scoring environment
Recently, Brian has made his win probability (WinEx) calculator available for all to use. This is a really powerful toy, and I plan on using it for some stuff I'll post about later. For now, however, I want to point out a flaw.
One of the first things one notices when looking at NFL stats is that they lack context. How valuable is a 3 yard run? Is it a three yard run on third and two? Is it from the 2 yard line, when down by 6 with 20 seconds left? Is it in the middle of the field, as time expires, down by 27? Obviously, context is important. It only gets worse as the stats get bigger. How valuable is a 1000 yard rushing season? Well, if that's 15 rushes with 13 touchdowns, it's the best season by anyone, ever (and the question is "why did he not get more touches?"). If it's 400 rushes with 2 TDs, the question is "why is he not on the practice squad?" It's all about context -- in this case, the context of the event: what were you trying to achieve?
This extends further, however. How good is throwing for 6000 yards in a season? Well, it'd set a lot of records. Unless, of course, defensive players are on strike and teams are running amateurs out there every weekend, causing 12 quarterbacks to throw for 6000 yards. How about scoring 200 points across the regular season? Kinda crap...unless you have the best defense of the last decade, and are 14-2 when the dust settles. Again, it's all about context -- this time about the context of the achievement: how did everyone else do?
WinEx calculations are an attempt to solve the first series of questions I posed. The second is a little trickier. How does the system know that being up by 12 with 5 minutes left is good? Because those teams usually win. Note, however, that this is "those teams," not that team. In this case, the league scoring context is being used to determine how often teams tend to score.
The question is, is there another context each game is played in? I wouldn't have written all this unless the answer is "yes," of course. There are actually a couple different contexts: team-level and game-level.
A team-level context is simply which teams are playing. Let's say the Steelers and Raiders are playing. Just before the opening kick-off, you're asked who's going to win. If you're looking at things in the "NFL" context, it's 50/50. But if you know who the teams are, it's certainly not. This becomes less and less true over the course of a game -- if the Raiders are up by 14 with 20 seconds left, they've probably won, whether or not they're the Raiders -- but before the opening kickoff, it is not "anybody's ballgame".
There's another context which is important for calculating WinEx, though, and that's game context. Let's say your team is up by 10 with 8 minutes remaining. That's a pretty good lead, right? Well, if it's a 13-3 game, it's a VERY good lead -- if their poor opponents couldn't muster more than a field goal in 52 minutes, they're unlikely to close a 10 point gap now. On the other hand, if it's a 42-32 shoot-out, it's still anybody's ballgame -- the hometown heroes haven't stopped them yet, and they're unlikely to start now. One piece of good fortune on an onside kick, and you could be drinking away the evening, trying to forget how the WinEx calculator told you the game was in the bag.
I know that Brian's WinEx calculator doesn't take these things into account -- it doesn't ask what teams are playing, and it asks for score differential, not current scores. It may never take them into account, and that won't stop it from being a useful tool. But there's more to think about when analyzing in-game scenarios: that 3 yard rush is meaningless without context.
Tuesday, January 20, 2009
In running model for the NFL.
by Denis O'Regan
(Before I start this, can I just say that Brian's online "in running" calculator is a fantastic bit of kit and I want to thank him for making it available to everyone. My approach is going to be different to Brian's... my calculator wouldn't know a first down if it got blindsided by one...)
Modeling in running soccer matches is a relatively easy undertaking and usually involves estimating in game scoring rates for both teams and using these averages to calculate the probability of individual scoring events occurring in the remainder of the game by way of the Poisson distribution.
I decided to try to apply the approach to the NFL by following the methods I use for soccer games and trying to work around problems caused by the differences of the two sports when they arose.
At first glance American football is a much higher scoring sport than soccer. The average total in a soccer game hovers somewhere around 2.7 goals, compared to 40 points in gridiron. However, the 40 NFL points are scored as a result of only 8 or 9 scoring events, broken down as touchdowns and field goals with the odd safety thrown in.
Therefore I used scoring events instead of points to create a scoring expectancy for the NFL teams. For example, Team A's offense averages A scoring events per game in a league where the league average is L scoring events per game. They play Team B, whose defense allows B scoring events per game. It should be possible to work out how many scoring events Team A should be able to manage on average when they host Team B.
Team A scores at A/L times the league average.
Team B concedes at B/L times the league average.
Multiplying these two rates together gives you a good idea of the scoring rate Team A will achieve against Team B's defense at a neutral venue. If we want to make Team A the home side we further need to divide the average scoring rate of all home teams (call this H) by the league average and incorporate this.
The scoring rate for Team A at home to Team B can be calculated as
Team A = A/L * B/L * H/L
Lastly,to convert this scoring rate to actual scoring events we multiply this rate by the league's average scoring events per game,namely L.
If we repeat this process for Team B, using the average scoring rate for away sides this time, we now have a scoring expectancy for each team. If you'd gone through this process for the NFC Championship game, you would have had Philly in for just over 5 scoring events and Arizona in for just over 4.
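The calculation described above is short enough to write down directly. The function below follows the A/L * B/L * H/L formula and the final multiplication by L; the input numbers in the example are invented purely for illustration and are not taken from any real team.

```python
def expected_scoring_events(offense_rate: float, defense_rate: float,
                            venue_rate: float, league_rate: float) -> float:
    """Expected scoring events for one team in one game.

    offense_rate: scoring events per game the team's offense averages (A)
    defense_rate: scoring events per game the opposing defense allows (B)
    venue_rate:   average scoring events for all home (or all away) teams (H)
    league_rate:  league average scoring events per game (L)
    """
    relative = (offense_rate / league_rate) \
        * (defense_rate / league_rate) \
        * (venue_rate / league_rate)
    return relative * league_rate  # convert the rate back into scoring events


# Hypothetical inputs, for illustration only.
home_expectancy = expected_scoring_events(
    offense_rate=4.8, defense_rate=4.5, venue_rate=4.5, league_rate=4.3)
print(f"Home team expectancy: {home_expectancy:.2f} scoring events")
```

For the away side the same function is called with the away-team average passed as `venue_rate`.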
Armed with these team averages we can now use the Poisson distribution to calculate the probability that each team will achieve exactly zero scoring events,1,2,3 etc.
That further allows us to calculate the probability that, for example, the game will end with Team A scoring twice and Team B scoring just once... and here's where the problems start.
In soccer winning 2-1 is definitive: you win the game. In the NFL it merely gives you a very good chance to win the game. Even if we throw out safeties as a rarity and assume all touchdowns are followed by single point conversions, you can still score two field goals and lose to one touchdown. Whereas in soccer you could safely add the probability of a team winning the whole game 2-1 to its overall win probability, you have to keep some back in the NFL.
Breaking down the 2-1 scoring events into different combinations of TDs and FGs quickly becomes unwieldy if you include 2 point conversions, safeties and missed extra points, as do the more common, higher scoring combinations. So I've tried various fudges. These range from doing nothing (what you gain in terms of win probability on the 2-1 you lose when it comes to a 1-2 scoreline) to incorporating a points per score factor and using real life data based on scoring events.
Putting aside these problems for a moment, we do now have a way to attach a probability to every scoring event combination from, say, 0-0 all the way to 12-12, which should cover most eventualities in the NFL.
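As a sketch of that grid, the code below builds the 0-0 to 12-12 table of probabilities from two scoring expectancies. It assumes the two teams' scoring events are independent, which is what multiplying the two Poisson probabilities implies; the means of 5.0 and 4.0 are rounded stand-ins for the "just over 5" and "just over 4" quoted for the NFC Championship game.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k scoring events when `mean` are expected."""
    return exp(-mean) * mean ** k / factorial(k)


def score_grid(mean_a: float, mean_b: float, max_events: int = 12):
    """grid[i][j] = P(Team A has i scoring events and Team B has j).

    Assumes the two teams score independently of each other.
    """
    pa = [poisson_pmf(i, mean_a) for i in range(max_events + 1)]
    pb = [poisson_pmf(j, mean_b) for j in range(max_events + 1)]
    return [[x * y for y in pb] for x in pa]


grid = score_grid(5.0, 4.0)
coverage = sum(map(sum, grid))  # share of all outcomes captured by the grid
print(f"P(A scores twice, B once) = {grid[2][1]:.4f}")
print(f"Grid covers {coverage:.2%} of outcomes")
```

With expectancies of this size the 13 by 13 grid leaves only a fraction of a percent of outcomes uncovered.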
So far all we've got is a clunky pre match predictor. To turn it into a serviceable "in play" predictor we first need to predict how each team's pre game scoring expectancy decays with time. Again soccer is quite easy. Scoring rate increases as the game goes on, and you can calculate a team's remaining goal expectancy by multiplying the pre game number by the proportion of time remaining raised to the power of 0.84. The NFL also looks straightforward. After an initial lull due to kick off field position, the scoring rate remains relatively steady until peaking inside each two minute warning.
For example, 10 minutes into a game a team will still have 88% of its pregame scoring expectancy "left". By half time it has declined so that only 47% remains.
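The two decay rules can be sketched as follows. The soccer function is the power rule quoted above. For the NFL the post gives only two data points (88% left after 10 minutes, 47% at half time), so the straight-line interpolation between them, and the 100% and 0% endpoints, are assumptions; they ignore the peaks the post mentions inside each two minute warning.

```python
def soccer_remaining(pre_game: float, fraction_remaining: float) -> float:
    """Remaining goal expectancy: pre game number times time remaining ** 0.84."""
    return pre_game * fraction_remaining ** 0.84


# (minutes played, fraction of pregame expectancy left).
# The 10 and 30 minute values are from the post; the endpoints are assumed.
NFL_POINTS = [(0.0, 1.00), (10.0, 0.88), (30.0, 0.47), (60.0, 0.00)]


def nfl_fraction_left(minutes_played: float) -> float:
    """Assumed linear interpolation between the known decay points."""
    for (t0, f0), (t1, f1) in zip(NFL_POINTS, NFL_POINTS[1:]):
        if t0 <= minutes_played <= t1:
            return f0 + (f1 - f0) * (minutes_played - t0) / (t1 - t0)
    raise ValueError("minutes_played must be between 0 and 60")


print(f"Soccer, 1.5 goals expected, half time: {soccer_remaining(1.5, 0.5):.2f} left")
print(f"NFL, 20 minutes played: {nfl_fraction_left(20):.1%} of expectancy left")
```

A fuller version would replace `NFL_POINTS` with a curve measured from play-by-play data.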
We can now see what each team's pre game scoring expectation has decayed to at any point in a game. By entering these revised numbers into a Poisson calculation we can calculate scoring event combinations and their probabilities for the remainder of any game. Together with the current score and the average points per score for each team, this is valuable information for determining the final outcome of the game.
A team that currently leads by 7 can now be assured of winning if they "win" the remaining mini match 2 scores to one.So for this particular combination of current score and predicted outcome the team can be assigned the full probability.Other combinations still require "interpretation".
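One crude way to automate the step above is to value every remaining scoring event at a single average number of points and add up the probabilities of the combinations that leave the leader in front. This is a sketch of one possible fudge, not the model's actual rules: the 4.7 points per score is an assumption (about 40 points divided by 8.5 scoring events), ties are split evenly, and the example inputs are a rough reading of the 14-16 Ravens/Steelers position discussed later in the post.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k scoring events when `mean` are expected."""
    return exp(-mean) * mean ** k / factorial(k)


def in_play_win_probability(lead: float, remaining_a: float, remaining_b: float,
                            points_per_score: float = 4.7,
                            max_events: int = 12) -> float:
    """Chance that Team A wins from the current position.

    lead: Team A's current lead in points (negative if trailing)
    remaining_a, remaining_b: decayed scoring event expectancies
    points_per_score: assumed average value of one scoring event
    """
    win = 0.0
    for i in range(max_events + 1):
        for j in range(max_events + 1):
            p = poisson_pmf(i, remaining_a) * poisson_pmf(j, remaining_b)
            margin = lead + points_per_score * (i - j)
            if margin > 0:
                win += p
            elif margin == 0:
                win += 0.5 * p  # crude: split tied outcomes evenly
    return win


# Leader up by 2 with about 0.6 scoring events left, trailer with about 0.5.
leader = in_play_win_probability(lead=2, remaining_a=0.6, remaining_b=0.5)
print(f"Leader's win probability: {leader:.0%}")
```

With these inputs the sketch lands in the mid-70s percent range. It gives the leader every "one score each" outcome, which is more generous than a version that separates touchdowns from field goals.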
Here's the current version in action. I've stuck to win probability updates after each score, firstly to keep it brief, but also to avoid the need to add much of an additional field position correction.
I've also added the in running probabilities from a UK betting site for comparison.
Philly @ Arizona.
Pre game Philly were favoured at most places by about 4 points. To make Arizona the favourites I think you had to take a very positive view about their home field advantage. The in running model favoured Philly pre game. I'll list the win probability of the current favourite and suffix it with a letter to denote who that favourite was.
Score (Philly first) | Model | UK betting site |
---|---|---|
Pre game | 60% (P) | 62% (P) |
0-7 | 53% (A) | 54% (A) |
3-7 | 51% (A) | 53% (A) |
3-14 | 72% (A) | 67% (A) |
6-14 | 60% (A) | 62% (A) |
6-21 | 79% (A) | 80% (A) |
6-24 | 91% (A) | 87% (A) |
13-24 | 89% (A) | 87% (A) |
19-24 | 67% (A) | 65% (A) |
25-24 | 69% (P) | 62% (P) |
25-32 | 93% (A) | 85% (A) |
And now for Baltimore @ Pittsburgh.
Pre game the UK site was very bullish about Pittsburgh's chances and they were favoured by around 6 points. The model also favoured the Steelers, but only by about 4.5 points.
Score (Balti first) | Model | UK betting site |
---|---|---|
Pre game | 65% (P) | 69% (P) |
0-3 | 74% (P) | 73% (P) |
0-6 | 81% (P) | 78% (P) |
0-13 | 95% (P) | 89% (P) |
7-13 | 83% (P) | 81% (P) |
7-16 | 93% (P) | 89% (P) |
14-16 | 74% (P) | 77% (P) |
14-23 | 99% (P) | 98% (P) |
The fascinating game for me is the Ravens/Steelers one. Despite getting within 2 points once the scoring started, the Ravens were still only a 26% chance to win immediately after that score. One of the strengths of this type of model is that it takes your pre game, long-term opinion of each team and sticks with it regardless. By the time they made it 14-16 just ten minutes remained. Baltimore's pre game scoring expectation had decayed to around half a score; Pittsburgh's was about a tenth higher. Plugging these new expectations into a Poisson, you find that the chance of Baltimore scoring and Pittsburgh not scoring in what remained of the game was around 17%. Both teams scoring once each was about a 20% chance, but most "one score each" permutations still gave the Steelers an overall win. Most betting men seemed to agree that, despite the closeness of the scores, Pittsburgh were still big favourites... although I'll bet both sets of fans didn't feel quite so sure.
Saturday, January 17, 2009
A Scoring Efficiency Model v.2
By Cyril Smith
In reviewing the model (which itself was based on data from the 2005-07 regular season and playoff games) it seemed that the weak point was the assumption that each team was roughly equal and that its scoring potential was based on 30 minutes of possession. While this approach had worked reasonably well on past data as well as the wild card round, it did not perform well in the divisional round. There is in fact a great deal of volatility in each team's points per minute and time of possession - volatility which is obscured by using averages. I therefore looked for a way to incorporate this volatility into the model.
At first blush it is apparent that volatility is inversely related to wins. Teams that showed greater volatility in points per minute and time of possession tended to do worse in results. A reasonable hypothesis is that volatility represents weaknesses which a team may not be able to overcome in the playoffs because the opposing team will zero in on those weaknesses. Accordingly I took a first cut at combining volatility of points per minute and time of possession, in each case measured by the standard deviation of the series, into the model. The preliminary results look promising. Using the additional input resulted in two changes for the divisional round: Pittsburgh was now favored over San Diego and Baltimore over Tennessee.
Looking ahead the revised model has Philadelphia over Arizona by a point and Pittsburgh over Baltimore by 4 points.
Wednesday, January 14, 2009
Punting and Field Goals
by Dean Jens
Field Goals
One of the things I often wondered before discovering The Football Project was how the probability of a kicker making a field goal varied as a function of distance. After eyeballing the distributions for a few kickers for the 2005 season, I figured I could try raising a logistic function to some power. For the first several kickers I tried, I found that that power was statistically indistinguishable from 1, so I set about fitting the probabilities to a simple logistic function, i.e. (1/2)(1+tanh((m-x)/w)).†
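The fitted curve and the likelihood criterion described in the footnote can be sketched as below. The curve is the formula given above, and it is the same as an ordinary logistic function, since (1/2)(1+tanh(z)) = 1/(1+exp(-2z)). The grid search in `fit_kicker` is an assumption, as the post does not say which optimiser was used, and the kick data are invented for illustration.

```python
from math import log, tanh


def make_probability(distance: float, m: float, w: float) -> float:
    """Fitted chance of making a kick: (1/2)(1 + tanh((m - x) / w))."""
    return 0.5 * (1.0 + tanh((m - distance) / w))


def log_likelihood(kicks, m: float, w: float) -> float:
    """Sum of log fitted probabilities of the outcomes that actually happened.

    kicks: iterable of (distance_in_yards, made) pairs
    """
    total = 0.0
    for distance, made in kicks:
        p = make_probability(distance, m, w)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += log(p if made else 1.0 - p)
    return total


def fit_kicker(kicks, m_values, w_values):
    """Assumed crude grid search for the (m, w) pair with the highest likelihood."""
    return max(((m, w) for m in m_values for w in w_values),
               key=lambda mw: log_likelihood(kicks, mw[0], mw[1]))


# Hypothetical kicks for one kicker: (distance, made?).
kicks = [(25, True), (32, True), (38, True), (44, True),
         (47, False), (51, True), (53, False), (56, False)]
m_hat, w_hat = fit_kicker(kicks, m_values=range(35, 61), w_values=[2, 4, 6, 8, 10, 12])
print(f"fitted m = {m_hat} yards, w = {w_hat} yards")
```

Here `m` is the distance at which the kicker is a 50/50 proposition and `w` controls how sharply the percentage falls away around it.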
I had imagined, in the absence of data, that w might be independent of the kicker, and that kickers could be characterized by m, i.e. how far away they are when their percentages drop. This is not the case; w depends on the kicker, with larger values to kickers who tend to miss easy ones and make longer ones, with lower values to more consistent kickers. Olindo Mare missed a few short ones, so his percentages didn't drop off very quickly. Matt Bryant actually had a slight improvement as distances got longer; this would surely change if more statistics were taken at a normal range of distances. On the other hand, John Kasay had a much higher tendency to hit field goals shorter than 50 than if they were longer than 50; of the 8 he missed, the shortest was 42 (he made 24 shorter than that). Jeff Reed had an even sharper drop around 45 yards, missing nothing shorter than 41 and making nothing longer than 47. While I was unable to fairly characterize the best kicker in terms of a drop-off length, I was able to generate a different metric that adjusts for length. By using my logistic fits, I predicted the percentage of field goals a kicker would make if they kicked from a given distance; I then took the 1006 field goal attempts for the season and calculated the percentage of those 1006 field goals that each kicker would have made. I've only included those kickers who attempted more than 4 kicks; the kickers who were dropped were all notably worse than the ones listed.
kicker | normalized score | percentage | number of kicks |
---|---|---|---|
racken001 | 0.963 | 0.952 | 42 |
nednej001 | 0.917 | 0.9 | 30 |
wilkij001 | 0.889 | 0.871 | 31 |
dawsop001 | 0.889 | 0.933 | 30 |
kaedin001 | 0.866 | 0.875 | 24 |
kasayj001 | 0.86 | 0.805 | 41 |
vandem003 | 0.857 | 0.889 | 27 |
stovem001 | 0.851 | 0.882 | 34 |
grahas002 | 0.837 | 0.879 | 33 |
hansoj001 | 0.836 | 0.792 | 24 |
bryanm001 | 0.836 | 0.846 | 26 |
bironr001 | 0.835 | 0.793 | 29 |
feelyj001 | 0.832 | 0.833 | 42 |
linder001 | 0.819 | 0.829 | 35 |
mareo001 | 0.815 | 0.833 | 30 |
hallj006 | 0.81 | 0.824 | 17 |
elamj001 | 0.806 | 0.771 | 35 |
tynesl001 | 0.803 | 0.818 | 33 |
akersd001 | 0.802 | 0.727 | 22 |
brownj018 | 0.796 | 0.697 | 33 |
petert005 | 0.794 | 0.885 | 26 |
reedj005 | 0.785 | 0.844 | 32 |
nugenm001 | 0.773 | 0.786 | 28 |
vinata001 | 0.773 | 0.786 | 28 |
carnej001 | 0.762 | 0.781 | 32 |
longwr001 | 0.751 | 0.741 | 27 |
gouldr001 | 0.749 | 0.786 | 28 |
brownk008 | 0.745 | 0.765 | 34 |
scobej001 | 0.743 | 0.75 | 32 |
edingp001 | 0.736 | 0.735 | 34 |
janiks001 | 0.704 | 0.667 | 30 |
franct001 | 0.686 | 0.778 | 9 |
cortej002 | 0.671 | 0.706 | 17 |
novakn001 | 0.608 | 0.8 | 10 |
cundib001 | 0.541 | 0.556 | 9 |
This obviously does not adjust for wind, and the linemen on both the kicking and defending sides will have some influence on these statistics, but this at least tells which unit is doing better than which other with the confounding variable of distance removed. The average length for a field goal attempt was 36.3 yards; the average for Nick Novak was 33.7, while for Josh Brown it was 41.2. Accordingly the "scores" for these kickers find themselves lower and higher, respectively, than the raw percentage. The scores and the actual percentages have a correlation of 0.8. The means and variances are very similar, though the variance of the raw percentages is a little bit smaller; while the difference isn't statistically significant*, it is what would be expected from coaches deciding to attempt longer field goals with better kickers, and punting or going for the first down with worse kickers. Perhaps looking at all fourth down plays from around the thirty yard line would be a good step for further research.
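The normalized score in the table can be sketched as the average fitted probability over every attempt in the league, so that each kicker is judged on the same mix of distances. The distances and the (m, w) values below are invented for illustration; the real calculation used the 1006 attempts from the 2005 season.

```python
from math import tanh


def make_probability(distance: float, m: float, w: float) -> float:
    """Fitted chance of making a kick from `distance` yards."""
    return 0.5 * (1.0 + tanh((m - distance) / w))


def normalized_score(m: float, w: float, league_distances) -> float:
    """Share of the league's attempts this kicker would be expected to make."""
    distances = list(league_distances)
    return sum(make_probability(d, m, w) for d in distances) / len(distances)


# Hypothetical league-wide attempt distances (the post used all 1006 attempts).
league_distances = [22, 27, 31, 34, 36, 38, 41, 44, 47, 52]

# Hypothetical fitted parameters for two kickers.
long_range = normalized_score(m=52.0, w=8.0, league_distances=league_distances)
short_range = normalized_score(m=45.0, w=8.0, league_distances=league_distances)
print(f"long range kicker: {long_range:.3f}, short range kicker: {short_range:.3f}")
```

Because both kickers are scored against the same list of distances, a kicker who was mostly asked to attempt short kicks gets no credit for the easier workload.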
† This isn't a least-squares fit; I try to maximize the sum of the logarithm of the fitted probability of the actual outcome: for kicks that the kicker makes, P is the fitted probability that the kicker would make the kick, while for those the kicker missed (or were blocked or whatever), it is the fitted probability that the kicker would miss the kick.
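As a rough illustration of the fitting criterion described in this footnote, here is a minimal Python sketch of the quantity being maximized. The function name and the sample probabilities are made up for illustration; the model that actually produces the fitted probabilities is not reproduced here.

```python
import math

def log_likelihood(make_probs, made):
    """Sum of the log fitted probabilities of what actually happened.

    make_probs: fitted probability that each kick is made (strictly between 0 and 1)
    made: True if the kick was good, False if it was missed or blocked
    """
    total = 0.0
    for p, good in zip(make_probs, made):
        # Use P(make) for made kicks and P(miss) = 1 - P(make) otherwise.
        total += math.log(p if good else 1.0 - p)
    return total

# Hypothetical outcomes and two hypothetical sets of fitted probabilities.
outcomes = [True, True, False, True]
sharper_fit = [0.9, 0.8, 0.3, 0.7]
coin_flip_fit = [0.5, 0.5, 0.5, 0.5]

# The fit that puts more probability on what actually happened scores higher.
print(log_likelihood(sharper_fit, outcomes) > log_likelihood(coin_flip_fit, outcomes))
```

A fitting routine would adjust the model's parameters until this sum stops improving, rather than minimizing squared errors.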
* It would be significant at the 25% level on a two-tailed test; arguably a one-tailed test could be used here, but even that isn't going to pass a common significance test.
When a team prepares to punt, the punter's statistics are often cited, typically the average length of his punts and the number of times he has left teams behind their own 20 yard line. These seem like kind of strange statistics to me; if I were to take the line of scrimmage and the end position of the ball and plot one against the other, what I would likely expect to see, as a first approximation, would be a 45 degree line† up to a point, and then a horizontal line from there on out. Behind a certain point on the field, a punter would be expected to net a certain length; ahead of that, he would be expected to average a certain level of field position. Grabbing every punt the Packers made that year, I found that the break-point from a least squares fit was very near midfield. Accordingly, it seems to me we ought to characterize the net length of punts from one's own half of the field, and the average final field position for punts from the fifty yard line and beyond.
Punting
Taking the data from The Football Project for 2005, I calculated these statistics for each player who punted. Every player who punted more than twice had at least one punt from each half of the field, so the figures for them are well defined. Remember, the "length" is only calculated for those punts from the punter's own end of the field; the "depth", the name of which is probably more poetically than logically motivated, is the average ensuing field position of the receiving team after punts from the fifty and beyond. I use results net of the return, though using results before the return leaves a lot of what follows more or less unchanged. The players are ordered by length-depth/4, due to the fact that about 4/5 of punts originated from the punting team's side of the fifty.
punter | length | depth | number of punts |
---|---|---|---|
moormb001 | 41.51 | 13.85 | 74 |
jonesd018 | 41.04 | 13.35 | 88 |
johnsd022 | 39.78 | 10.5 | 42 |
bergem001 | 39.69 | 10.88 | 75 |
sauert001 | 39.38 | 11.05 | 83 |
grahab001 | 38.84 | 9.82 | 75 |
scifrm001 | 39.76 | 14 | 74 |
bakerj001 | 39.55 | 14.18 | 88 |
hentrc001 | 39.48 | 14.22 | 79 |
mcbrim001 | 39.16 | 13.09 | 85 |
bidwej001 | 39.45 | 15.68 | 97 |
hansoc001 | 38.72 | 14.52 | 92 |
koenem001 | 38.8 | 14.84 | 78 |
feaglj001 | 38.02 | 13.35 | 78 |
frostd001 | 38.21 | 14.71 | 91 |
grooma001 | 38.89 | 17.67 | 12 |
playes001 | 36.95 | 10.5 | 76 |
colqud001 | 37.8 | 14.14 | 66 |
harrin002 | 36.05 | 9.71 | 89 |
landes001 | 38.41 | 20 | 34 |
edingp001 | 35 | 7 | 2 |
maynab001 | 37.78 | 18.48 | 106 |
leea003 | 36.44 | 13.11 | 110 |
barkeb001 | 36.7 | 14.27 | 51 |
gardoc001 | 36.48 | 13.95 | 86 |
aragul001 | 37.08 | 16.4 | 18 |
lechls001 | 36.18 | 13.08 | 84 |
smithh009 | 35.28 | 11.08 | 59 |
larsok002 | 36.47 | 17.73 | 66 |
stanlc002 | 34.81 | 11.5 | 79 |
benned001 | 34.57 | 11 | 8 |
millej012 | 36.46 | 19.1 | 88 |
kluwec001 | 35.21 | 14.43 | 75 |
richak003 | 34.81 | 13.17 | 81 |
rouent001 | 34.96 | 14 | 76 |
murphn001 | 33.5 | 15 | 7 |
sandeb002 | 33.37 | 15.4 | 64 |
hodger001 | 32.31 | 13.92 | 44 |
flinnr001 | 31 | 23 | 6 |
brownj018 | NA | 11.5 | 2 |
cundib001 | NA | 20 | 1 |
dawsop001 | NA | 6.5 | 2 |
ellina001 | NA | 2 | 1 |
gouldr001 | NA | 24 | 1 |
kasayj001 | NA | 20 | 1 |
mareo001 | NA | 27 | 1 |
nugenm001 | NA | 17 | 1 |
roethb001 | NA | 10.5 | 2 |
vinata001 | NA | 4 | 1 |
wilkij001 | NA | 20 | 1 |
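The "length" and "depth" columns above, and the length-depth/4 ordering, can be sketched roughly as follows. This is only an illustration: the function names and the way a punt is represented (two yard-line numbers) are assumptions, and the sample punts are invented rather than taken from the play-by-play data.

```python
def punter_stats(punts):
    """Return (length, depth) for one punter.

    punts: list of (start, receiver_spot) pairs, using an assumed convention:
      start         = line of scrimmage, in yards from the punting team's own goal line
      receiver_spot = where the receiving team takes over after the return,
                      in yards from the receiving team's own goal line
    """
    # "Length": average net yards on punts from the punter's own side of the fifty.
    net_lengths = [(100 - spot) - start for start, spot in punts if start < 50]
    # "Depth": average ensuing field position on punts from the fifty and beyond.
    depths = [spot for start, spot in punts if start >= 50]
    length = sum(net_lengths) / len(net_lengths) if net_lengths else None
    depth = sum(depths) / len(depths) if depths else None
    return length, depth

def ordering_score(length, depth):
    """Ordering used for the table: length minus a quarter of depth."""
    return length - depth / 4

# Hypothetical punts, not taken from the play-by-play data.
sample = [(30, 28), (20, 40), (55, 10), (60, 14)]
length, depth = punter_stats(sample)
print(length, depth, ordering_score(length, depth))
```

A punter with no punts from his own side of the fifty gets `None` for length, which corresponds to the NA entries at the bottom of the table.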
Number one is Brian Moorman, of the Buffalo Bills; second is Donnie Jones. They are the only two punters to average more than 40 net yards from their own end of the field; of punters who punted more than twice, the two who left the ball inside the ten yard line when they punted from midfield or closer were Ben Graham, who had pretty good length as well, and Nick Harris, whose length was more mediocre.
Adding the length and depth for each player with more than two punts, I get a surprisingly narrow distribution. It is centered around 51.4 or 51.5 — 50.5 would be ideal for the use of these statistics — and has a standard deviation of only 3.5 yards. Most punters, then, seem to punt for distance behind their own 49 or so, and for field position beyond there. If I exclude Ryan Flinn, who had six punts (the fewest among those with more than two) for the worst result in both statistics (among those with more than two punts), the correlation between length and depth is 0 to two decimals.* Accordingly, a punter with better length will tend to be affected by the endzone further into his own territory, while a punter who is particularly good at pinning the opposing team against its goal line is more likely to still be punting for length a bit beyond the fifty; there is no unambiguous connection, independent of one's measure of "skill", between a punter's "breakpoint" and the skill of the punter.
It won't come as a great surprise that the length as I measure it and the average length of all punts have a correlation greater than 0.9. It might not be a big surprise either that the percentage of punts to end up inside the twenty has a correlation of -0.4 with "depth", but, interestingly, either length measurement has a correlation of 0.4 with the inside-the-twenty statistic. From a linear regression standpoint, it looks as though the inside-the-twenty statistic is including some length information; 1/3 of the variance can be explained from the two numbers in my table. The median punt to end up inside the 20 starts from 2 yards behind midfield, but 20% come from behind the punter's own 40; some of what is being recorded in that figure is not any deftness in terms of avoiding the touchback or letting one's teammates get downfield, but is simply the ability to kick to the red zone from farther away. This is a nice skill, of course, but it is fully incorporated into the length statistic; the frequency of leaving a punt inside the twenty is a hybrid of skills, and is not the best measure for any of them.
† There is some attempt here to keep the statistics simple. In fact, this line is slightly flatter than 45 degrees because the endpoint is bounded both above and below; punts from behind midfield give a slope of 0.95 that is statistically distinct from 1 at the 5% significance level.
* This actually is less true without the return; punters who punt the ball farther before the return also tend to punt it closer to the endzone, but not dramatically so. The distribution of punters' depth+length is similar to the results with the return, with several yards simply moved from depth to length.
Tuesday, January 13, 2009
Is 3rd and 6 a running down in the NFL?
by jjbtnw
This post probably has more to do with an interesting query result than suggesting an alternative strategy for coaches. I was taking a peek at the 2005 play-by-play data that is available on this site (thanks Brian!). I had just separated everything into drives and series. I was wondering what the most efficient pass/run mix was, historically. One of the funny looking query results that I turned up was the following matrix of down-and-distance and conversion percentages. This data is limited to the 2005 data and does not include first downs gained by penalty. Nor does it exclude garbage-time drives or time-pressure drives. It does include playoff games and goal-to-go series.
Down | Distance | Pass Conv% | Run Conv% |
---|---|---|---|
3 | 1 | 63.5 | 76.3 |
3 | 2 | 50.8 | 60.8 |
3 | 3 | 50.8 | 57.9 |
3 | 4 | 46.9 | 53.6 |
3 | 5 | 43.1 | 47.4 |
3 | 6 | 43.2 | 55.2 |
3 | 7 | 38.7 | 36.8 |
3 | 8 | 33.9 | 26.6 |
3 | 9 | 31.1 | 32.9 |
3 | 10 | 32.2 | 27.6 |
To me it is interesting to see that running is a better strategy on 3rd down for distances of 6 yards and less. One of the reasons that running is so effective is that it is apparently unexpected. The next table shows the same down-and-distance situations, and how many of each type of play occurred in the league.
Down | Distance | Pass Plays | Run Plays |
---|---|---|---|
3 | 1 | 203 | 548 |
3 | 2 | 364 | 260 |
3 | 3 | 423 | 133 |
3 | 4 | 467 | 110 |
3 | 5 | 531 | 95 |
3 | 6 | 526 | 67 |
3 | 7 | 507 | 76 |
3 | 8 | 501 | 64 |
3 | 9 | 396 | 76 |
3 | 10 | 603 | 87 |
On 3rd down and 3, 76% of teams pass instead of run. And the percentage of passes on 3rd down with more than 3 yards needed for a first down is higher still. So most defenses on 3rd and 3 or longer will be implementing a pass-oriented scheme and personnel package. And 3 out of 4 times (or even more often) they will be making the correct call. Given those circumstances a run play would be expected to be more successful. And apparently, it is. According to game theory, if coaches were calling the correct run/pass mix, the conversion percentages would be nearly identical. For 3rd and 6 or less, they aren't. So the obvious assumption is that more running plays should be called on 3rd and 6 than are currently being called in the NFL. Of course, this post would need to be backed up by data from more than one year to validate any assumptions made.
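For anyone who wants to play with the two tables, here is a small Python sketch that recomputes the pass share and the run-minus-pass conversion gap for each distance. The numbers are copied from the tables above; the variable and function names are just for illustration.

```python
# Distance -> (pass conv %, run conv %, pass plays, run plays), from the tables above.
THIRD_DOWN = {
    1: (63.5, 76.3, 203, 548),
    2: (50.8, 60.8, 364, 260),
    3: (50.8, 57.9, 423, 133),
    4: (46.9, 53.6, 467, 110),
    5: (43.1, 47.4, 531, 95),
    6: (43.2, 55.2, 526, 67),
    7: (38.7, 36.8, 507, 76),
    8: (33.9, 26.6, 501, 64),
    9: (31.1, 32.9, 396, 76),
    10: (32.2, 27.6, 603, 87),
}

def pass_share(pass_plays, run_plays):
    """Fraction of plays that were passes."""
    return pass_plays / (pass_plays + run_plays)

# Distances at which runs converted more often than passes in this sample.
run_better = [d for d, (pass_conv, run_conv, _, _) in sorted(THIRD_DOWN.items())
              if run_conv > pass_conv]

for distance, (pass_conv, run_conv, n_pass, n_run) in sorted(THIRD_DOWN.items()):
    share = 100 * pass_share(n_pass, n_run)
    gap = run_conv - pass_conv
    print(f"3rd and {distance}: {share:.1f}% passes, run minus pass = {gap:+.1f} points")
```

Note how small the run samples get at longer distances (67 runs on 3rd and 6), which is one more reason to want additional seasons of data.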
Friday, January 9, 2009
Quarterbacks: Starters versus Backups
By Denis O'Regan.
This is a tentative attempt to find a ballpark figure for how much a team's chances are compromised when the backup quarterback plays.
Firstly, I collected and combined data for every regular season play since 2007 made by a quarterback who, at the start of the season, was considered to be the team's number one quarterback.
I then repeated the exercise for every other quarterback who threw a pass over the same time scale. As I'm initially just looking for a general figure, I didn't take any account of how the backup came to be playing. Backups can be under centre for a variety of reasons. They can be mopping up at the end of a large victory or playing out time after a heavy defeat. The best comparison between backups and starters is obviously when the latter is injured and the former starts, but for the moment I've looked at all plays.
I did eliminate passes thrown by non-quarterbacks, such as punters or running backs, but I'll present those numbers at the end because they do make interesting reading.
I looked at various passing stats, but I'll be using yards per attempt as the main tool to compare an average backup with an average starter.
Here are the results.
Interceptions. As you'd expect, the average starting QB looks after the ball better than his understudy. Backups are intercepted on 3.6% of their attempts since 2007 compared to only 2.8% for starters.
Touchdowns. Starters throw a touchdown on 4.3% of their attempts compared to just 3.3% for backups.
Sacks. Once again starters are better, going down on 5.6% of their total dropbacks compared to 7.3% for backups. However, backups do slightly better by only losing 6.3 yards per sack compared to 6.5 yards for starters. So a small, if insignificant, victory, although as we're only looking at two seasons' worth of results, this could simply be a sample size issue.
Quarterback rating. I know it's not perfect, but I'm including it anyway. If you take every play made by a starting QB since 2007 and compile a combined QB rating you get 86.0. For backups it's only 72.4.
Completion percentage. Not surprisingly, backups do poorly compared to starters; not only do they lack the starter's talent, they also get much less time to practise with their receivers. The backup averages a completion rate of 57.9% compared to 62.1%. Breaching 60% completion rates seems to be a factor that defines quality in a quarterback. Eli Manning only hit 56% of his passes in the Giants' Superbowl winning season, but on the road, where the trophy was won, he connected on 61% of his throws. Prior to Manning you have to go back to the Ravens' Trent Dilfer (59%) in 2000 to find a SB winning QB whose regular season completion rate was sub 60%.
Yards/Attempt. This is the stat I'm going to use to estimate a starter's worth. Backups pass for 6.26 yards per attempt, very nearly a yard per attempt less than starters, who averaged 7.11 ypa over the two seasons. These figures don't count sacks as failed pass attempts, nor do they subtract sack yards or interception yards, but if you do correct for these occurrences, then the discrepancy between the two figures remains fairly constant.
If you patch all these stats together you find that the 2008 quarterback who comes closest to matching the average stats for a starting quarterback is Denver's Jay Cutler. The pin-up guy for the backups based on 2008 is Marc Bulger, which probably says as much about the Rams as it does about Bulger. If you want a player who comes closest to replicating the average backup's stats who actually is a backup, try a combination of Brian Griese from 2007 in Chicago and 2008 in Tampa.
Armed with the comparison between yards/attempt for each type of quarterback, I next constructed a predictive model based around yards per pass and yards per rush.
I calculated the yards per pass and yards per rush each team had achieved on offense and defense prior to each game from week 4 onwards for the last 7 seasons. I corrected these figures for strength of opponents faced. I then matched each team's offensive numbers with their opponent's defensive numbers, and vice versa, for every game played and regressed those numbers against the actual result.
For example, if team A was averaging 6.5 yards/pass against defenses who were allowing 6.8 yards/pass, then they were considered below average to the tune of 0.3 yards/pass. If they were matched up against a defense that had allowed 6 yards/pass against offenses that were averaging 6.4 yards/pass, then I considered that the defense they were facing that day was 0.4 yards/pass above average. Combining these two figures gives an overall projected passing capability for team A in this particular game of 0.7 yards/pass below the league average.
I also did this for the projected rushing offense of team A and repeated the process for team A's opponents on that day.
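The team A example can be written out as a few lines of Python. This is only a sketch of the arithmetic in the example; the function names are invented for illustration and the full model (the regression and the exact strength-of-opponent correction) is not shown.

```python
def offense_edge(team_yards_per_pass, opponents_usually_allow):
    """How far an offense is above (+) or below (-) what its opponents usually allow."""
    return team_yards_per_pass - opponents_usually_allow

def defense_edge(yards_per_pass_allowed, opponents_usually_gain):
    """How far a defense is above (+) or below (-) average: allowing less is better."""
    return opponents_usually_gain - yards_per_pass_allowed

def projected_passing(off_edge, opposing_def_edge):
    """Projected passing capability relative to league average for one matchup."""
    return off_edge - opposing_def_edge

# Team A: 6.5 yards/pass against defenses that allow 6.8.
team_a_offense = offense_edge(6.5, 6.8)
# Today's opponent: allows 6.0 yards/pass to offenses that average 6.4.
opponent_defense = defense_edge(6.0, 6.4)

print(round(team_a_offense, 2), round(opponent_defense, 2),
      round(projected_passing(team_a_offense, opponent_defense), 2))
```

The same three functions would be reused with yards per rush, and again with the two teams swapped, to give the four pregame inputs.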
Both teams, therefore, had a pregame projected rating for passing ability (containing information about their game day opponent's pass defense) and rushing ability (containing information about their game day opponent's run defense). These four pregame inputs turn out to be statistically significant in predicting the actual game outcome, and tests on out-of-sample games perform to a level similar to the Vegas line. This limited model also suggests that it is much more important to be able to pass the ball than it is to run the ball, which backs the intuitive knowledge that losing your starting passer is a really big deal.
Having produced a serviceable prediction model based on passing yardage, all we need to do now to predict the impact of a backup QB starting is to reduce the expected passing ability by a similar drop to the one seen in the two sets of aggregate stats for starters and backups.
If we do this for every matchup over a series of seasons, we find that on average the presence of a backup for the whole game decreases the win probability of that team by about 8 percentage points. In other words, a team with a win probability of 58% would turn into a coin toss if they played their backup in that game and their yards/pass numbers were reduced by just under a yard/pass.
So to sum up, in terms of points on the scoreboard a backup QB on average seems to cost a team about a field goal. Individual teams will of course see differences within these averages. Swapping between Manning and Sorgi would, you would expect, cost the Colts more than the average, whereas the choice between Orton and Grossman may result in little difference.
For completeness, here are the 2007-2008 stats for non-QBs passing the ball. They completed 55% of their passes and were sacked on 14% of their drop backs. They threw for 10 yards/attempt and 23% of their passes went for TDs! They had a combined QB rating of 116.
Thursday, January 8, 2009
A Scoring Efficiency Model And The Playoffs
by Cyril Smith
This is the time of year when the NFL really gets interesting. The ups and downs of the regular season have ended and the playoffs hold the promise of competitive matchups for every game. I have put together a simple scoring efficiency model for predicting the outcome of playoff games. I look at the ability of a team to score based on its time of possession, in other words points per minute (ppm). PPM measures not only offense but defense and special teams as well. Both good defenses and good special teams give the offense good field position, which means less time is needed to score. A defensive touchdown or a kick return for a touchdown represents scoring with minimal time expenditure.
The model takes the average points per minute for the last nine games. Each playoff team's ppm is calculated by dividing its aggregate nine game score by its aggregate nine game time of possession. An assumption of the model is that the playoff teams are roughly equal; accordingly each team's basic score is calculated by multiplying its average ppm by 30. I then make three adjustments: home team gets 3 points; a team whose quarterback has never started a playoff game is docked 3 points; and a team whose average ppm has increased significantly over the past three games compared to games four through six is given 3 points for trend.
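The calculation just described can be sketched in Python as below. The function names and the sample totals are hypothetical, and because the text does not give a threshold for a "significant" increase in ppm, the trend adjustment is simply passed in as a yes/no flag.

```python
def points_per_minute(total_points, total_possession_minutes):
    """Aggregate points divided by aggregate time of possession (last nine games)."""
    return total_points / total_possession_minutes

def projected_score(ppm, is_home, qb_has_playoff_start, trending_up):
    """Basic score (ppm times 30 minutes) plus the three adjustments."""
    score = ppm * 30
    if is_home:
        score += 3  # home team gets 3 points
    if not qb_has_playoff_start:
        score -= 3  # first-time playoff starter at quarterback is docked 3 points
    if trending_up:
        score += 3  # ppm up significantly over the past three games
    return score

# Hypothetical nine-game totals, not real team data.
home_team = projected_score(points_per_minute(225, 270), True, True, False)
road_team = projected_score(points_per_minute(198, 264), False, False, True)
print(round(home_team, 1), round(road_team, 1), round(home_team - road_team, 1))
```

The predicted margin is just the difference between the two projected scores.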
How did the model do on the first round of playoffs? It had Arizona over Atlanta by 6 points; San Diego over Indianapolis by 3 points; Miami over Baltimore by 1 point; and Philadelphia over Minnesota by 1 point.
For next weekend the model has Carolina over Arizona by 11; San Diego over Pittsburgh by 4; Tennessee over Baltimore by 3 1/2; and New York over Philadelphia by 4 1/2.
Wednesday, January 7, 2009
Does Baltimore's Defense Travel?
by Denis O'Regan
This was supposed to be a short piece on home field advantage, but then I caught Boomer Esiason previewing the AFC wildcard weekend on NASN and it turned into something more.
In discussing the Baltimore-Miami game, Boomer's view was that the Ravens wouldn't be inconvenienced too much by being on the road because "defense travels".
So I decided to see if defense generally, and Baltimore's in particular, does travel.
Firstly, I compared Baltimore's home and away record since week one 2004, a decent sample size and a period during which their defense has been their dominant asset.
They are 29-12 at home and 15-25 away. If defense does travel, I would expect their home and away records to be a lot closer than they actually are.
So next I used a method that gives a reasonable estimation of a team's home field advantage (HFA). I'll outline the ideal (but wholly impractical) methodology and then try to show how you can get reasonable approximations using readily available data.
Say you've got two teams; we'll call them S and D, where S is the superior team. The first step towards determining HFA for each would be to determine how much better, on average in terms of points, S is compared to D at a neutral venue. To do this we'd simply require S to play D over and over at a neutral spot and average the margin of victory (or the occasional defeat).
We'd then ask the teams to repeat the exercise, with D acting as the hosts. After enough reruns, we would again average the margin of victory or defeat for S and could infer that the amount by which this figure had declined, compared to S's superiority on neutral turf, would be equal to D's HFA.
Similarly, when we played out an extended series at S's home field, the amount by which the average margin of victory had increased compared to the neutral figure would be the result of S's HFA.
Of course none of the above is possible. For a start, aside from the odd trip to London and the Superbowl, hardly any neutral venue games are played.
It does give a clue as to how to approach the problem, though. If you take the average difference between the margin of victory with S hosting D and D hosting S, you've got a figure that comprises the HFA of D plus the HFA of S.
There are still problems. Even for divisional rivals, to get, say, 40 pairs of games you're looking at 40 seasons of results. Teams change, as do likely contributing factors that could affect HFA.
So instead do the next best thing and take, say, San Diego's last 40 home and away games (5 years of games). The average margin of victory at home was 10.9 points; away from home it was +3.8 points. That's a spread of 7.1 points.
SD's opponents over those 40 games should represent a fair cross section of the NFL, as should the HFAs of those teams. Therefore, as a good approximation for the HFA of SD's opponents we could use the average HFA for the league over the same period. That figure is 2.5 points.
So subtracting 2.5 from 7.1 gives us SD's HFA over the last five years as 4.6 points.
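In code, the San Diego calculation looks something like this. It is a minimal sketch of the approximation described above; the function name is made up for illustration.

```python
def team_hfa(avg_home_margin, avg_away_margin, league_avg_hfa):
    """Approximate a team's home field advantage in points.

    The home/away spread contains the team's own HFA plus its opponents' HFA,
    so the league-average HFA is subtracted as a stand-in for the opponents.
    """
    spread = avg_home_margin - avg_away_margin
    return spread - league_avg_hfa

# San Diego: +10.9 at home, +3.8 away, league average HFA of 2.5.
print(round(team_hfa(10.9, 3.8, 2.5), 1))
```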
Do the same for every team and you get:
Team | HFA (points) |
---|---|
Baltimore | 8 |
Arizona | 7.8 |
Seattle | 6.9 |
Houston | 5.8 |
St Louis | 4.7 |
San Diego | 4.6 |
Jax | 4.3 |
NYJ | 4.3 |
Tampa | 4.1 |
KC | 4.1 |
Washington | 3.7 |
Atlanta | 3.7 |
SF | 3.6 |
Mini | 3 |
Philly | 2.8 |
Indy | 2.7 |
Denver | 2.5 |
Buffalo | 2 |
Dallas | 1.9 |
Chicago | 1.3 |
Oakland | 1.2 |
Cleveland | 1.2 |
Pittsburgh | 0.9 |
Green Bay | 0.8 |
Detroit | 0.5 |
Tennessee | -0.2 |
New Orleans | -0.3 |
NYG | -1.7 |
New England | -1.8 |
Miami | -1.8 |
Carolina | -1.8 |
Cinci | -2.1 |
So over the period Baltimore, as their win/loss record suggests, appear to have benefited enormously from being at home and have fared comparatively poorly on the road. Rather than defense travelling well, the reverse could actually be true.
The next step towards perhaps proving that defense doesn't travel involves using a statistic that can represent a team's defensive capability. I used FO's dvoa stats. Their website contains weekly offensive, defensive and special teams dvoa stats from 2004 onwards.
I recorded the offensive, defensive and ST dvoa each team and their opponents took into every game from week 4 2004 to week 16 2006. I regressed these variables against the actual game result. I then ran a live test through the 2007 and 2008 seasons, using the regression line to predict game outcomes.
In short, the respective dvoa stats that the home and away teams took into a matchup are statistically significant in predicting the outcome of that matchup.
The difference between the margin of victory (or defeat) predicted by the regression line and the actual margin averages out at 10.6 points per game over the two seasons of live testing. By contrast, the Vegas line is out by on average 10.5 points per game over the same time scale. However, the dvoa based regression shades Vegas overall by being closer to the actual margin of victory in 60% of games. The slightly better average margin of error by Vegas compared to dvoa could be explained by Vegas incorporating readily available team news.
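The two yardsticks used in that comparison, average miss and share of games in which one predictor is closer, can be sketched as follows. The margins below are invented purely to show that a predictor can be closer in most games while still having the larger average error; they are not the real results, and the function names are hypothetical.

```python
def mean_abs_error(predicted, actual):
    """Average absolute difference between predicted and actual margins."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def share_closer(pred_a, pred_b, actual):
    """Fraction of games in which pred_a is strictly closer to the result than pred_b."""
    wins = sum(1 for a, b, y in zip(pred_a, pred_b, actual) if abs(a - y) < abs(b - y))
    return wins / len(actual)

# Hypothetical home-team margins for five games.
actual = [7, -3, 14, 3, -10]
model = [5, -2, 12, 10, 6]
vegas = [3, -6, 10, 4, 0]

print(mean_abs_error(model, actual), mean_abs_error(vegas, actual),
      share_closer(model, vegas, actual))
```

A couple of big misses are enough to push the average error above a rival that loses most of the head-to-head comparisons.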
Having, hopefully, established the legitimacy of the dvoa regression line, we can now see if it says anything about Baltimore's defense.
To see which of the individual dvoa stats are most important in determining game outcome, I standardised the inputs and redid the regression.
The largest contributor to match outcome is the offensive dvoa of the away side, followed by the offensive dvoa of the home side. Because the regression contains a constant that equates to the average home field advantage for the NFL as a whole, this tells you that having a good offense helps a team more on the road than at home, but the gap is small.
The third biggest contributor is home defensive dvoa, but it is followed by the away special teams dvoa and the home special teams dvoa. The smallest contributor to game outcome is the defensive dvoa of the away side.
This is significant to the Baltimore case. The offensive and special teams dvoas merely tweak the built-in home field advantage because their home and away regression coefficients are similar in size. But if you rely greatly on defense, as Baltimore do, the fact that the home and away defensive regression coefficients differ greatly in size results in a comparatively good predicted home performance and a comparatively poor away one.
So does this predicted home/away split occur in practice?
Given the home/away win/loss splits and the apparently large home field advantage, it appears to. But just to be sure, I split Baltimore's 2008 games by venue and looked at the average yards per play allowed through the air and on the ground by the defense. I didn't correct for opponent.
At home the Ravens allowed 4.15 yards per pass and 3.34 yards per run. Those figures increased to 6.12 ypp (an increase of 47%) and 3.76 ypr (an increase of 13%) on the road.
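The percentage increases quoted for the Ravens' home and road figures are easy to check. A tiny Python sketch, with a made-up function name:

```python
def pct_increase(home_value, road_value):
    """Percentage by which the road figure exceeds the home figure."""
    return 100 * (road_value - home_value) / home_value

print(round(pct_increase(4.15, 6.12)))  # yards per pass allowed
print(round(pct_increase(3.34, 3.76)))  # yards per run allowed
```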
Spread over the 5 seasons from 2004 onwards, this drop-off in defense on the road is still present, albeit less dramatic. Overall, Baltimore yields 10% more yardage both to the run and the pass on the road compared to at home.
So to summarise, using dvoa ratings as a measure of team talent and regressing those ratings against game outcome implies that a good defense is very helpful at home, but performs comparatively poorly on the road.
The really interesting question is why, if Baltimore are typical, defense-reliant teams struggle on the road compared to at home. An obvious cause could be that it's easier for a ref to call pass interference against a road side, but for the moment that's just speculation. Penalty stats are also notoriously difficult to collect accurately.
Another point to consider is that teams that have large home field advantages (for whatever reason, climate being an obvious other factor) are going to be inconvenienced at some stage during the post season. Baltimore are rightly considered to be slight favourites even on the road in Miami, but it may be no coincidence that, if we look back at the HFA table, all of the Superbowl winners over the period considered had little or no home field advantage.
Sunday, January 4, 2009
What Wins Championships?
by Derek Singer
This study uses data from the 1990-2007 seasons culled from pro-football-reference.com. The main statistic used is percentage over league average of yards per play. From year to year, the average yards gained per pass play hovers around 6, while the average for run plays stays around 4. So an offense gaining 6.6 yards per pass play would be around 10% over league average. A defense allowing 5.4 yards per pass play would be around 10% over league average (i.e. >0% means the defense is above average, <0% means it is below average).
Of the 81 postseason teams with a pass offense >10% above average, 11 won the Super Bowl (13.6%).
The second table shows that teams that make it deeper into the playoffs are better in all four phases of the game on average. All of the last 18 Super Bowl champions have been above average in at least one phase, with 15 being above average in three phases. Only one champion has been less than 5% above average in all phases of the game: the 2001 Patriots. Sixteen of the last 18 champions were at least 5% above average in two phases of the game. Ten have been at least 10% above average in 2 or 3 phases (none in all 4, seven in only 1).
A team that excels in one area can make the playoffs. Championship teams, however, excel in more than one area. In general, they are balanced teams in that they are great at one or two things and terrible at very little. Only the 2001 Patriots were more than 5% below average in two phases (run offense and run defense).
Applying these ideas to the 2008 playoffs, the Carolina Panthers emerge as the favorites to win it all. They are more than 5% above average in pass offense, run offense, and pass defense. Four of the other five NFC teams are more than 5% above average in two areas. The Steelers look like the favorites in the AFC, with an exceptional defense and average passing game. Baltimore, Tennessee and Pittsburgh all have very good defenses and mediocre offenses this year. On the flipside, Miami, San Diego, and Indianapolis have good pass offenses but unexceptional running games and defenses.
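The percentage-over-league-average statistic defined at the start of this study can be sketched in a few lines of Python. The function names are invented for illustration, and the league averages of 6 (pass) and 4 (run) are the rough figures quoted in the text rather than exact yearly values.

```python
def offense_pct_over_avg(yards_per_play, league_avg):
    """Offense: gaining more than the league average is positive."""
    return 100 * (yards_per_play - league_avg) / league_avg

def defense_pct_over_avg(yards_allowed_per_play, league_avg):
    """Defense: allowing less than the league average is positive."""
    return 100 * (league_avg - yards_allowed_per_play) / league_avg

LEAGUE_AVG_PASS = 6.0  # approximate yards per pass play, per the text

print(round(offense_pct_over_avg(6.6, LEAGUE_AVG_PASS), 1))
print(round(defense_pct_over_avg(5.4, LEAGUE_AVG_PASS), 1))
```

Flipping the sign for defenses is what lets all four phases be read the same way: above 0% is always good.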
Perhaps the oldest and most revered cliché in all of sports is “defense wins championships.” Play a drinking game based on commentators using that cliché, and you may be dead before the divisional round. Teams such as the 70s Steelers, 2000 Ravens, and 2002 Bucs are often trotted out as proof. While the latter two certainly had great defenses backed up by mediocre offenses, the Steelers had several Hall of Famers on offense. The Steel Curtain was complemented with great running and passing attacks. “Smashmouth” football alone didn’t win. The 80’s 49ers were made famous by their West Coast Offense, one built on the passing game. The 90’s Cowboys were best known for the triple threat of QB Troy Aikman, WR Michael Irvin, and RB Emmitt Smith. Both dynasties also had good defenses, however, during their championship seasons. The greatest teams, it would seem then, are the best balanced teams. They have good offenses and good defenses.
In Football Outsiders’ Pro Football Prospectus 2006, they found that defense does indeed have a higher correlation with playoff success than offense. Correlation not being causation, we have to ask why offense would suddenly become less valuable in the postseason. How is the postseason different than the regular season? The competition is much better. If the proportion of teams with good offenses is higher in the postseason than in the regular season, then the only thing that will separate the best from the good is defense (and special teams to a lesser extent).

Phase and threshold | Total Teams | Playoffs | Playoff % | Conf. Champs | Conf. Champ % | SB Champs | SB Champ % |
---|---|---|---|---|---|---|---|
pass off. >5% | 167 | 115 | 68.86% | 25 | 14.97% | 13 | 7.78% |
pass off. >10% | 105 | 81 | 77.14% | 19 | 18.10% | 11 | 10.48% |
pass off. >15% | 60 | 49 | 81.67% | 14 | 23.33% | 7 | 11.67% |
pass off. >20% | 24 | 20 | 83.33% | 7 | 29.17% | 4 | 16.67% |
run off. >5% | 161 | 76 | 47.20% | 14 | 8.70% | 9 | 5.59% |
run off. >10% | 87 | 44 | 50.57% | 9 | 10.34% | 6 | 6.90% |
run off. >15% | 52 | 25 | 48.08% | 4 | 7.69% | 2 | 3.85% |
run off. >20% | 24 | 10 | 41.67% | 3 | 12.50% | 1 | 4.17% |
pass def. >5% | 158 | 101 | 63.92% | 18 | 11.39% | 11 | 6.96% |
pass def. >10% | 67 | 53 | 79.10% | 11 | 16.42% | 7 | 10.45% |
pass def. >15% | 21 | 17 | 80.95% | 5 | 23.81% | 4 | 19.05% |
pass def. >20% | 4 | 3 | 75.00% | 1 | 25.00% | 1 | 25.00% |
run def. >5% | 159 | 70 | 44.03% | 19 | 11.95% | 11 | 6.92% |
run def. >10% | 80 | 37 | 46.25% | 10 | 12.50% | 5 | 6.25% |
run def. >15% | 35 | 14 | 40.00% | 4 | 11.43% | 1 | 2.86% |
run def. >20% | 15 | 4 | 26.67% | 2 | 13.33% | 1 | 6.67% |

Of the 44 postseason teams with a run offense >10% above average, 6 won the Super Bowl (13.6%).
Of the 53 postseason teams with a pass defense >10% above average, 7 won the Super Bowl (13.2%).
Of the 37 postseason teams with a run defense >10% above average, 5 won the Super Bowl (13.5%).

Phase | Avg. | Max. | Team with Max. | Min. | Team with Min. |
---|---|---|---|---|---|
Pass Off | 11.197 | 30.739 | STL 1999 | -11.152 | BAL 2000 |
Run Off | 3.6846 | 22.536 | STL 1999 | -18.314 | NE 2003 |
Pass Def | 8.5432 | 22.304 | TB 2002 | -1.5238 | NE 2001 |
Run Def | 4.3924 | 34.174 | BAL 2000 | -28.287 | IND 2006 |

Conf. Champions (% above league avg)

Phase | Avg. | Max. | Team with Max. | Min. | Team with Min. |
---|---|---|---|---|---|
Pass Off | 10.804 | 34.146 | STL 2001 | -11.152 | BAL 2000 |
Run Off | 2.7994 | 22.536 | STL 1999 | -18.314 | NE 2003 |
Pass Def | 6.4548 | 22.304 | TB 2002 | -5.5967 | TEN 1999 |
Run Def | 4.3563 | 34.174 | BAL 2000 | -28.287 | IND 2006 |

Playoff Teams (% above league avg)

Phase | Avg. | Max. | Team with Max. | Min. | Team with Min. |
---|---|---|---|---|---|
Pass Off | 6.803 | 41.785 | STL 2000 | -17.34 | TB 2005 |
Run Off | 1.5066 | 38.559 | DET 1997 | -25.13 | NE 1994 |
Pass Def | 4.5736 | 22.904 | NO 1992 | -13.531 | WAS 2005 |
Run Def | 1.218 | 34.174 | BAL 2000 | -33.141 | IND 2005 |

Team | Seed | Pass Off | Run Off | Pass Def | Run Def |
---|---|---|---|---|---|
TEN | 1 | -4.93 | 2.94 | 15.09 | 11.37 |
PIT | 2 | -0.34 | -12.63 | 28.03 | 21.71 |
MIA | 3 | 13.00 | 0.70 | -1.03 | 0.71 |
SD | 4 | 23.30 | -2.50 | 0.06 | 4.35 |
IND | 5 | 6.91 | -18.12 | 4.43 | 0.95 |
BAL | 6 | -0.93 | -4.55 | 16.82 | 15.40 |
NYG | 1 | -1.15 | 19.29 | 4.66 | 5.55 |
CAR | 2 | 16.53 | 14.99 | 7.11 | -5.25 |
MIN | 3 | -0.09 | 6.86 | 0.59 | 21.16 |
ARI | 4 | 13.34 | -17.61 | -4.60 | 5.73 |
ATL | 5 | 17.43 | 3.75 | 1.88 | -16.90 |
PHI | 6 | -1.41 | -5.49 | 15.24 | 16.63 |