NFL Coach Quality: A Bayesian Approach To Approximating the Value of Coaches - UPDATED
by David Durschlag
You are currently viewing version 1 of this article. To view version two, please click here.
Summary
Evaluating NFL coaches is a difficult task, popular among fans and vitally important to franchises. This is a brief attempt at the task, using purely quantitative data.
Data
The numbers of regular season games each team won each year are treated as data points. No information beyond number of regular season wins was used.
While the "metagame" of the NFL continues to evolve, the data used herein is from 1993 onward, when the last Collective Bargaining Agreement was signed. While victories now come in different environments, they are all under (roughly) the same rules. Data from before this period could be skewed based on the different rules for control of players, so it was excluded.
Also excluded was the performance of any team in a year in which it had multiple head coaches. This was done to ensure that credit for a season was easy to assign.
In total, 107 coaches and 565 team-years of data were used.
Assumptions
- Each coach has a hidden "value". Better coaches have higher value.
- The number of games a team wins can be modeled as a draw from a normal distribution with a mean of the value of their coach, and an unknown variance dubbed "Season Variance". This variance is constant across all seasons for all coaches.
- The value of coaches is normally distributed across the population, with unknown mean and variance.
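Written out, the three assumptions above amount to a two-level normal model. The symbols are notation introduced for this sketch and do not appear in the original write-up: v_c is the hidden value of coach c, w_{c,s} is the win total in that coach's s-th season, and n_c is the number of seasons available for that coach.

```latex
\begin{align*}
v_c &\sim \mathcal{N}\!\left(\mu_{\text{pop}},\ \sigma_{\text{pop}}^{2}\right), & c &= 1, \dots, 107 \\
w_{c,s} \mid v_c &\sim \mathcal{N}\!\left(v_c,\ \sigma_{\text{season}}^{2}\right), & s &= 1, \dots, n_c, \quad \textstyle\sum_c n_c = 565
\end{align*}
```

The priors placed on the three unknown constants are not stated here, so none are shown.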
Process
The above assumptions were encoded as a model for BUGS, which was run for 10,000 iterations, then another 10,000 iterations.
Results
Coach quality had converged after 10,000 iterations.
The posterior distributions for the system constants were as follows:
| Constant | Mean | Standard Deviation | 
|---|---|---|
| Season Variance | 0.137 | 0.009 | 
| Coach Value Population Mean | 7.698 | 0.191 | 
| Coach Value Population Variance | 0.626 | 0.207 | 
The posterior distributions for coach values were as follows:
| Coach | Mean Value | Standard Deviation Value | 
|---|---|---|
| Bill Belichick | 10.100 | 0.633 | 
| Tony Dungy | 9.948 | 0.671 | 
| Mike Tomlin | 9.490 | 0.920 | 
| Bill Cowher | 9.350 | 0.636 | 
| Mike McCarthy | 9.344 | 0.865 | 
| John Harbaugh | 9.303 | 0.972 | 
| Sean Payton | 9.245 | 0.852 | 
| Marty Schottenheimer | 9.209 | 0.686 | 
| Mike Smith | 9.197 | 0.970 | 
| Andy Reid | 9.180 | 0.656 | 
| Mike Holmgren | 9.107 | 0.606 | 
| Wade Phillips | 9.051 | 0.792 | 
| Mike Shanahan | 8.970 | 0.606 | 
| Barry Switzer | 8.833 | 0.957 | 
| Mike Martz | 8.763 | 0.863 | 
| Jimmy Johnson | 8.751 | 0.892 | 
| Mike Sherman | 8.746 | 0.856 | 
| Tom Coughlin | 8.626 | 0.608 | 
| Jeff Fisher | 8.578 | 0.604 | 
| Jack Pardee | 8.549 | 1.221 | 
| Dennis Green | 8.507 | 0.684 | 
| Brian Billick | 8.505 | 0.738 | 
| Lovie Smith | 8.466 | 0.777 | 
| George Seifert | 8.412 | 0.812 | 
| Marv Levy | 8.397 | 0.905 | 
| Rex Ryan | 8.380 | 1.020 | 
| Don Shula | 8.375 | 1.017 | 
| Bill Parcells | 8.372 | 0.694 | 
| Jon Gruden | 8.370 | 0.692 | 
| Steve Mariucci | 8.216 | 0.778 | 
| Bobby Ross | 8.148 | 0.772 | 
| Jim Caldwell | 8.105 | 1.009 | 
| Wayne Fontes | 8.094 | 0.945 | 
| Jim Fassel | 8.077 | 0.805 | 
| Brad Childress | 8.066 | 0.908 | 
| Dick Vermeil | 8.058 | 0.778 | 
| John Fox | 7.978 | 0.743 | 
| Pete Carroll | 7.967 | 0.935 | 
| Al Groh | 7.954 | 1.210 | 
| Ken Whisenhunt | 7.876 | 0.897 | 
| Mike Tice | 7.845 | 0.944 | 
| Jim Mora | 7.833 | 0.956 | 
| Gunther Cunningham | 7.801 | 1.104 | 
| Gary Kubiak | 7.768 | 0.849 | 
| Jason Garrett | 7.766 | 1.203 | 
| Jack Del Rio | 7.759 | 0.746 | 
| Hue Jackson | 7.745 | 1.201 | 
| Tony Sparano | 7.734 | 0.943 | 
| Jim L. Mora | 7.731 | 0.960 | 
| Dave Wannstedt | 7.722 | 0.698 | 
| Dan Reeves | 7.705 | 0.707 | 
| Norv Turner | 7.704 | 0.634 | 
| Marvin Lewis | 7.695 | 0.748 | 
| Nick Saban | 7.631 | 1.094 | 
| Bill Callahan | 7.629 | 1.099 | 
| Joe Gibbs | 7.616 | 0.946 | 
| Mike White | 7.607 | 1.102 | 
| Jim Haslett | 7.584 | 0.855 | 
| Ray Rhodes | 7.546 | 0.887 | 
| Mike Singletary | 7.479 | 1.092 | 
| Mike Mularkey | 7.467 | 1.103 | 
| Art Shell | 7.423 | 1.016 | 
| Todd Haley | 7.403 | 1.022 | 
| Jerry Glanville | 7.373 | 1.210 | 
| Chan Gailey | 7.358 | 0.968 | 
| Tom Cable | 7.311 | 1.119 | 
| Rich Brooks | 7.307 | 1.100 | 
| Josh McDaniels | 7.155 | 1.102 | 
| Dick Jauron | 7.151 | 0.752 | 
| Tom Flores | 7.143 | 1.092 | 
| Buddy Ryan | 7.142 | 1.096 | 
| Steve Spurrier | 7.141 | 1.099 | 
| Lindy Infante | 7.139 | 1.119 | 
| Jim Zorn | 7.135 | 1.094 | 
| Eric Mangini | 7.124 | 0.906 | 
| Vince Tobin | 7.107 | 0.957 | 
| June Jones | 7.104 | 1.022 | 
| Dennis Erickson | 7.098 | 0.864 | 
| Herman Edwards | 7.090 | 0.787 | 
| Butch Davis | 7.002 | 0.947 | 
| Sam Wyche | 6.997 | 1.030 | 
| Scott Linehan | 6.983 | 1.104 | 
| Kevin Gilbride | 6.977 | 1.232 | 
| Jim E. Mora | 6.975 | 0.958 | 
| Richie Petitbon | 6.975 | 1.231 | 
| Joe Bugel | 6.967 | 1.113 | 
| Lane Kiffin | 6.962 | 1.231 | 
| Romeo Crennel | 6.870 | 0.957 | 
| Gregg Williams | 6.867 | 1.033 | 
| Raheem Morris | 6.864 | 1.035 | 
| Ted Marchibroda | 6.793 | 0.861 | 
| Mike Nolan | 6.722 | 1.025 | 
| Dave McGinnis | 6.717 | 1.030 | 
| Chuck Knox | 6.654 | 1.123 | 
| Bruce Coslet | 6.617 | 0.972 | 
| Dom Capers | 6.597 | 0.789 | 
| Mike Ditka | 6.573 | 1.050 | 
| Dave Campo | 6.569 | 1.032 | 
| Dick LeBeau | 6.496 | 1.124 | 
| Mike Riley | 6.443 | 1.052 | 
| Cam Cameron | 6.366 | 1.263 | 
| Dave Shula | 6.311 | 1.041 | 
| Rich Kotite | 6.263 | 0.995 | 
| Chris Palmer | 6.011 | 1.162 | 
| Marty Mornhinweg | 6.009 | 1.164 | 
| Steve Spagnuolo | 5.877 | 1.072 | 
| Rod Marinelli | 5.864 | 1.071 | 
Conclusions and Analysis
Despite the small sample size and extremely limited data, the estimated coach values agree closely with conventional wisdom. Value estimates are relatively narrow, ranging from a roughly 68% chance (one posterior standard deviation either side of the mean) that Jeff Fisher's value is between 7.974 and 9.182 to a roughly 68% chance that Cam Cameron's value is between 5.103 and 7.629. Applying Microsoft's μ - 3 * σ method of combining the parameters of a normal distribution for TrueSkill, then renormalizing (so that the unit of value is approximately the win), the following list is obtained:
| Coach | Normalized Combined Rating | 
|---|---|
| Bill Belichick | 10.060 | 
| Tony Dungy | 9.870 | 
| Bill Cowher | 9.520 | 
| Mike Holmgren | 9.410 | 
| Andy Reid | 9.360 | 
| Mike Shanahan | 9.310 | 
| Marty Schottenheimer | 9.310 | 
| Tom Coughlin | 9.070 | 
| Jeff Fisher | 9.040 | 
| Mike McCarthy | 9.030 | 
| Mike Tomlin | 9.010 | 
| Sean Payton | 8.980 | 
| Wade Phillips | 8.970 | 
| Dennis Green | 8.820 | 
| John Harbaugh | 8.770 | 
| Jon Gruden | 8.700 | 
| Bill Parcells | 8.700 | 
| Brian Billick | 8.700 | 
| Mike Smith | 8.700 | 
| Mike Sherman | 8.620 | 
| Mike Martz | 8.620 | 
| Lovie Smith | 8.590 | 
| Jimmy Johnson | 8.550 | 
| George Seifert | 8.480 | 
| Barry Switzer | 8.470 | 
| Steve Mariucci | 8.410 | 
| Bobby Ross | 8.380 | 
| Norv Turner | 8.350 | 
| John Fox | 8.320 | 
| Dick Vermeil | 8.300 | 
| Marv Levy | 8.270 | 
| Jim Fassel | 8.260 | 
| Dave Wannstedt | 8.230 | 
| Dan Reeves | 8.200 | 
| Jack Del Rio | 8.150 | 
| Marvin Lewis | 8.110 | 
| Brad Childress | 8.030 | 
| Don Shula | 8.010 | 
| Rex Ryan | 8.010 | 
| Wayne Fontes | 7.970 | 
| Gary Kubiak | 7.940 | 
| Ken Whisenhunt | 7.920 | 
| Pete Carroll | 7.900 | 
| Jim Caldwell | 7.840 | 
| Jim Haslett | 7.800 | 
| Mike Tice | 7.790 | 
| Jim Mora | 7.760 | 
| Tony Sparano | 7.720 | 
| Dick Jauron | 7.710 | 
| Ray Rhodes | 7.700 | 
| Jack Pardee | 7.700 | 
| Jim L. Mora | 7.680 | 
| Joe Gibbs | 7.630 | 
| Herman Edwards | 7.590 | 
| Dennis Erickson | 7.430 | 
| Gunther Cunningham | 7.420 | 
| Chan Gailey | 7.390 | 
| Eric Mangini | 7.360 | 
| Art Shell | 7.340 | 
| Nick Saban | 7.320 | 
| Todd Haley | 7.310 | 
| Bill Callahan | 7.310 | 
| Al Groh | 7.300 | 
| Mike White | 7.290 | 
| Vince Tobin | 7.240 | 
| Dom Capers | 7.240 | 
| Ted Marchibroda | 7.220 | 
| Mike Singletary | 7.220 | 
| Butch Davis | 7.190 | 
| Mike Mularkey | 7.190 | 
| Jason Garrett | 7.180 | 
| Hue Jackson | 7.170 | 
| Jim E. Mora | 7.150 | 
| June Jones | 7.100 | 
| Rich Brooks | 7.080 | 
| Romeo Crennel | 7.070 | 
| Tom Cable | 7.040 | 
| Sam Wyche | 7.010 | 
| Tom Flores | 6.980 | 
| Buddy Ryan | 6.970 | 
| Jim Zorn | 6.970 | 
| Josh McDaniels | 6.970 | 
| Steve Spurrier | 6.960 | 
| Lindy Infante | 6.920 | 
| Gregg Williams | 6.910 | 
| Raheem Morris | 6.900 | 
| Jerry Glanville | 6.890 | 
| Bruce Coslet | 6.860 | 
| Scott Linehan | 6.840 | 
| Mike Nolan | 6.820 | 
| Joe Bugel | 6.810 | 
| Dave McGinnis | 6.810 | 
| Dave Campo | 6.700 | 
| Mike Ditka | 6.660 | 
| Mike Riley | 6.570 | 
| Chuck Knox | 6.560 | 
| Richie Petitbon | 6.560 | 
| Kevin Gilbride | 6.560 | 
| Rich Kotite | 6.560 | 
| Lane Kiffin | 6.550 | 
| Dave Shula | 6.500 | 
| Dick LeBeau | 6.450 | 
| Steve Spagnuolo | 6.120 | 
| Rod Marinelli | 6.110 | 
| Cam Cameron | 6.060 | 
| Chris Palmer | 6.020 | 
| Marty Mornhinweg | 6.020 | 
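
The combined rating in this table (posterior mean minus three posterior standard deviations, then rescaled) can be sketched in a few lines. The exact renormalization is not spelled out in the article, so the `renormalize` function below is an assumption: it shifts and scales the conservative ratings linearly so that they share the mean and spread of the posterior means. The function names are hypothetical, and because only five coaches are included, the printed numbers will not reproduce the table.

```python
import statistics

# Posterior (mean, standard deviation) pairs copied from the coach value table.
POSTERIOR = {
    "Bill Belichick": (10.100, 0.633),
    "Tony Dungy": (9.948, 0.671),
    "Mike Tomlin": (9.490, 0.920),
    "Jack Pardee": (8.549, 1.221),
    "Cam Cameron": (6.366, 1.263),
}


def conservative_rating(mean, sd, k=3.0):
    """Collapse a normal posterior to one number: mean minus k standard deviations."""
    return mean - k * sd


def renormalize(ratings, reference):
    """ASSUMED normalization (not confirmed by the article): rescale `ratings`
    linearly so they have the same mean and population standard deviation
    as `reference`."""
    r_mean, r_sd = statistics.fmean(ratings), statistics.pstdev(ratings)
    ref_mean, ref_sd = statistics.fmean(reference), statistics.pstdev(reference)
    return [ref_mean + (r - r_mean) * ref_sd / r_sd for r in ratings]


def combined_ratings(posterior):
    """Return {coach: normalized conservative rating} for a posterior table."""
    names = list(posterior)
    means = [posterior[n][0] for n in names]
    raw = [conservative_rating(*posterior[n]) for n in names]
    return dict(zip(names, renormalize(raw, means)))


if __name__ == "__main__":
    ranked = sorted(combined_ratings(POSTERIOR).items(), key=lambda kv: -kv[1])
    for name, rating in ranked:
        print(f"{name:16s} {rating:6.2f}")
```

The subtraction penalizes coaches with few seasons: Mike Tomlin's posterior mean is close to Tony Dungy's, but his larger standard deviation pulls his combined rating down further.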
The top two coaches both spent a lot of their careers with transcendent quarterbacks. With the firing of Steve Spagnuolo and Raheem Morris, Romeo Crennel is the lowest-ranking active head coach. If Bill Cowher says he'd like to get back into coaching, and inquires as to whether your organization is hiring, you say "yes." Using a similar model based on points for/against might not only produce slightly more accurate information, but could also reveal whether head coaches can be "defense-oriented" or "offense-oriented" and, if so, which ones are which.
Summary
Evaluating NFL coaches is a difficult task, popular among fans and vitally important to franchises. This is a brief attempt at the task, using purely quantitative data. The goal here is dual -- both to increase our understanding of how (and how much) NFL teams are reflective of their coaches, as well as to introduce the dichotomy of frequentist and Bayesian analysis to the NFL statistics community, from which it is largely absent.
Data
The numbers of regular season games each team won each year are treated as data points. No information beyond number of regular season wins was used.
While the "metagame" of the NFL continues to evolve, the data used herein is from 1993 onward, when the last Collective Bargaining Agreement was signed. While victories now come in different environments, they are all under (roughly) the same rules. Data from before this period could be skewed based on the different rules for control of players, so it was excluded.
Also excluded was the performance of any team in a year in which it had multiple head coaches. This was done to ensure that credit for a season was easy to assign.
In total, 107 coaches and 565 team-years of data were used.
Assumptions
- Each coach has a hidden "value". Better coaches have higher value.
- The number of games a team wins can be modeled as a draw from a normal distribution with a mean of the value of their coach, and an unknown standard deviation which is constant across all seasons for all coaches. This will be referred to as "Season Standard Deviation".
- The value of coaches is normally distributed across the population, with unknown mean and standard deviation. These will be referred to as "Population Mean" and "Population Standard Deviation".
Process
The above assumptions were encoded as a model for BUGS, which was run for 10,000 iterations, then another 10,000 iterations.
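For readers who have not used BUGS, the kind of sampling it performs for a model like this can be sketched as a small Gibbs sampler. This is not the BUGS model used for the article: the priors are not stated in the write-up, so the vague priors below are assumptions, as are all function names and the synthetic data generator.

```python
import random
import statistics


def simulate_league(n_coaches=40, pop_mean=8.0, pop_sd=1.3, season_sd=2.7, seed=1):
    """Draw synthetic win totals from the stated assumptions.
    All numeric defaults are illustrative, not the article's data."""
    rng = random.Random(seed)
    league = {}
    for c in range(n_coaches):
        value = rng.gauss(pop_mean, pop_sd)   # hidden coach value
        n_seasons = rng.randint(1, 10)        # unequal tenures
        league[f"coach_{c}"] = [rng.gauss(value, season_sd) for _ in range(n_seasons)]
    return league


def gibbs_coach_model(wins_by_coach, iterations=2000, burn_in=500, seed=0):
    """Gibbs sampler for wins ~ N(value_c, season_sd^2), value_c ~ N(pop_mean, pop_sd^2)."""
    rng = random.Random(seed)
    coaches = list(wins_by_coach)
    n_obs = sum(len(w) for w in wins_by_coach.values())
    # ASSUMED vague priors (the article does not state its priors).
    prior_mean, prior_prec = 8.0, 0.01   # population mean ~ Normal(8, sd 10)
    shape0, rate0 = 0.01, 0.01           # both precisions ~ Gamma(shape 0.01, rate 0.01)
    # Starting values.
    values = {c: statistics.fmean(wins_by_coach[c]) for c in coaches}
    pop_mean, pop_prec, season_prec = prior_mean, 1.0, 0.1
    draws = {"pop_mean": [], "pop_sd": [], "season_sd": [],
             "values": {c: [] for c in coaches}}
    for it in range(iterations):
        # 1. Each coach value given everything else (normal-normal update).
        for c in coaches:
            w = wins_by_coach[c]
            prec = pop_prec + len(w) * season_prec
            mean = (pop_prec * pop_mean + season_prec * sum(w)) / prec
            values[c] = rng.gauss(mean, prec ** -0.5)
        # 2. Population mean given the coach values.
        prec = prior_prec + len(coaches) * pop_prec
        mean = (prior_prec * prior_mean + pop_prec * sum(values.values())) / prec
        pop_mean = rng.gauss(mean, prec ** -0.5)
        # 3. Precisions (gamma updates). gammavariate takes a scale, so invert the rate.
        sse = sum((y - values[c]) ** 2 for c in coaches for y in wins_by_coach[c])
        season_prec = rng.gammavariate(shape0 + n_obs / 2, 1.0 / (rate0 + sse / 2))
        ssv = sum((values[c] - pop_mean) ** 2 for c in coaches)
        pop_prec = rng.gammavariate(shape0 + len(coaches) / 2, 1.0 / (rate0 + ssv / 2))
        if it >= burn_in:
            draws["pop_mean"].append(pop_mean)
            draws["pop_sd"].append(pop_prec ** -0.5)
            draws["season_sd"].append(season_prec ** -0.5)
            for c in coaches:
                draws["values"][c].append(values[c])
    return draws


if __name__ == "__main__":
    draws = gibbs_coach_model(simulate_league())
    for key in ("pop_mean", "pop_sd", "season_sd"):
        print(key, round(statistics.fmean(draws[key]), 2))
```

Run on the synthetic league, the sampler should recover constants near the values used to generate the data, which is a quick sanity check before trusting it on real win totals.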
Results
Coach quality had converged after 10,000 iterations.
The posterior distributions for the system constants were as follows. Posterior information for individual coaches is included in the large table in the next section.
| Constant | Mean | Standard Deviation | 
|---|---|---|
| Season Standard Deviation | 2.70 | 0.09 | 
| Population Mean | 7.69 | 0.19 | 
| Population Standard Deviation | 1.33 | 0.18 | 
Conclusions and Analysis
Below is a table with a variety of data about each coach.
- Value Posterior Mean: The mean of the posterior distribution for the coach's value.
- Value Posterior StdDev: The standard deviation of the posterior distribution for the coach's value.
- Triple-Conservative Rating: μ - 3 * σ -- a conservative single-number rating. Microsoft uses this to collapse TrueSkill distributions to single ratings.
- Normalized Triple-Conservative Rating: The triple-conservative rating re-normalized to the same scale as posterior mean coach value.
- Raw Average Wins: The coach's raw number of average wins over the data set. Provided for comparison to the posterior means -- this is one way to gauge the benefit of the Bayesian model over a standard frequentist approach. Because the Bayesian model incorporates the idea of uncertainty, a coach with one season of 14 wins is not considered a 14 win coach. Alternately sorting between this column and that of posterior mean, then judging which list looks better, is a reasonable shortcut for judging the modeling approach taken here.
- 80% Confidence Range: The range of win values the model expects with 80% confidence. In other words, this coach would be expected to win fewer games than the minimum 10% of the time, and more than the maximum 10% of the time. This can be used to gauge the amount of information added by the model -- the narrower these ranges are, the more information was available for judging that coach. The fact that these tend to be very wide indicates that the model cannot make strong predictions -- a result of both the minimal amount of data available and the unpredictability of the NFL. Taking all coaches as a data set together yields an 80% confidence range of 4.27-11.91. Comparing this to the ranges for individual coaches is a good way to see how much information the model was able to add. In general, you'll find that the range has narrowed only very slightly, but has shifted a win or two. This can be translated roughly as "the uncertainty created by minimal data and the general difficulty in predicting the NFL means it is hard to predict how many games an individual coach's team will win, but we have a pretty good idea who the 'better' coaches are."
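
To make the gap between raw average wins and posterior means concrete, here is a minimal sketch of the shrinkage involved. It is a plug-in approximation, not the full model: it assumes the three constants are fixed at their posterior means from the Results table (7.69, 1.33 and 2.70), whereas the full model also carries their uncertainty. The function name is hypothetical, and the single-season counts in the usage lines are assumptions read off the integer raw averages.

```python
def shrunken_estimate(raw_average, n_seasons,
                      pop_mean=7.69, pop_sd=1.33, season_sd=2.70):
    """Normal-normal update with the constants held fixed (an ASSUMPTION).

    Returns the approximate posterior mean and standard deviation of a
    coach's value given `n_seasons` seasons averaging `raw_average` wins.
    """
    prior_precision = 1.0 / pop_sd ** 2          # weight on the population mean
    data_precision = n_seasons / season_sd ** 2  # weight on the coach's own record
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * pop_mean
                 + data_precision * raw_average) / post_precision
    return post_mean, post_precision ** -0.5


if __name__ == "__main__":
    # One 1-win season is pulled most of the way back toward the population mean.
    print(shrunken_estimate(raw_average=1, n_seasons=1))
    # One 12-win season is pulled down toward it.
    print(shrunken_estimate(raw_average=12, n_seasons=1))
```

With one season of data the coach's own record gets only about a fifth of the weight, which is why a single extreme season moves the estimate so little. For a single 1-win season the plug-in mean comes out near 6.4, close to the 6.36 shown for Cam Cameron.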
| Coach | Value Posterior Mean | Value Posterior StdDev | Triple-Conservative Rating | Normalized Triple-Conservative Rating | Raw Average Wins | 80% Confidence Range | 
|---|---|---|---|---|---|---|
| Al Groh | 7.95 | 1.21 | 4.32 | 7.30 | 9 | 4.03-11.87 | 
| Andy Reid | 9.18 | 0.65 | 7.21 | 9.36 | 9.69 | 5.81-12.54 | 
| Art Shell | 7.42 | 1.01 | 4.37 | 7.34 | 7 | 3.70-11.14 | 
| Barry Switzer | 8.83 | 0.95 | 5.96 | 8.47 | 10 | 5.17-12.49 | 
| Bill Belichick | 10.10 | 0.63 | 8.20 | 10.06 | 10.8 | 6.76-13.43 | 
| Bill Callahan | 7.62 | 1.09 | 4.33 | 7.31 | 7.5 | 3.82-11.43 | 
| Bill Cowher | 9.35 | 0.63 | 7.44 | 9.52 | 9.85 | 6.00-12.69 | 
| Bill Parcells | 8.37 | 0.69 | 6.29 | 8.70 | 8.63 | 4.97-11.77 | 
| Bobby Ross | 8.14 | 0.77 | 5.83 | 8.38 | 8.37 | 4.67-11.62 | 
| Brad Childress | 8.06 | 0.90 | 5.34 | 8.03 | 8.4 | 4.45-11.67 | 
| Brian Billick | 8.50 | 0.73 | 6.29 | 8.70 | 8.88 | 5.06-11.94 | 
| Bruce Coslet | 6.61 | 0.97 | 3.70 | 6.86 | 5.5 | 2.93-10.29 | 
| Buddy Ryan | 7.14 | 1.09 | 3.85 | 6.97 | 6 | 3.34-10.94 | 
| Butch Davis | 7.00 | 0.94 | 4.16 | 7.19 | 6.25 | 3.34-10.65 | 
| Cam Cameron | 6.36 | 1.26 | 2.57 | 6.06 | 1 | 2.39-10.33 | 
| Chan Gailey | 7.35 | 0.96 | 4.45 | 7.39 | 7 | 3.68-11.03 | 
| Chris Palmer | 6.01 | 1.16 | 2.52 | 6.02 | 2.5 | 2.14-9.87 | 
| Chuck Knox | 6.65 | 1.12 | 3.28 | 6.56 | 4.5 | 2.82-10.48 | 
| Dan Reeves | 7.70 | 0.70 | 5.58 | 8.20 | 7.7 | 4.29-11.11 | 
| Dave Campo | 6.56 | 1.03 | 3.47 | 6.70 | 5 | 2.83-10.30 | 
| Dave McGinnis | 6.71 | 1.03 | 3.62 | 6.81 | 5.33 | 2.98-10.45 | 
| Dave Shula | 6.31 | 1.04 | 3.18 | 6.50 | 4.33 | 2.56-10.05 | 
| Dave Wannstedt | 7.72 | 0.69 | 5.62 | 8.23 | 7.72 | 4.31-11.12 | 
| Dennis Erickson | 7.09 | 0.86 | 4.50 | 7.43 | 6.66 | 3.52-10.66 | 
| Dennis Green | 8.50 | 0.68 | 6.45 | 8.82 | 8.81 | 5.11-11.89 | 
| Dick Jauron | 7.15 | 0.75 | 4.89 | 7.71 | 6.88 | 3.69-10.60 | 
| Dick LeBeau | 6.49 | 1.12 | 3.12 | 6.45 | 4 | 2.66-10.32 | 
| Dick Vermeil | 8.05 | 0.77 | 5.72 | 8.30 | 8.25 | 4.57-11.54 | 
| Dom Capers | 6.59 | 0.78 | 4.23 | 7.24 | 6 | 3.10-10.09 | 
| Don Shula | 8.37 | 1.01 | 5.32 | 8.01 | 9.33 | 4.65-12.09 | 
| Eric Mangini | 7.12 | 0.90 | 4.40 | 7.36 | 6.6 | 3.51-10.73 | 
| Gary Kubiak | 7.76 | 0.84 | 5.22 | 7.94 | 7.83 | 4.21-11.32 | 
| George Seifert | 8.41 | 0.81 | 5.97 | 8.48 | 8.85 | 4.89-11.93 | 
| Gregg Williams | 6.86 | 1.03 | 3.76 | 6.91 | 5.66 | 3.12-10.60 | 
| Gunther Cunningham | 7.80 | 1.10 | 4.48 | 7.42 | 8 | 3.99-11.61 | 
| Herman Edwards | 7.09 | 0.78 | 4.72 | 7.59 | 6.75 | 3.59-10.58 | 
| Hue Jackson | 7.74 | 1.20 | 4.14 | 7.17 | 8 | 3.83-11.65 | 
| Jack Del Rio | 7.75 | 0.74 | 5.52 | 8.15 | 7.77 | 4.30-11.21 | 
| Jack Pardee | 8.54 | 1.22 | 4.88 | 7.70 | 12 | 4.62-12.47 | 
| Jason Garrett | 7.76 | 1.20 | 4.15 | 7.18 | 8 | 3.85-11.67 | 
| Jeff Fisher | 8.57 | 0.60 | 6.76 | 9.04 | 8.81 | 5.26-11.88 | 
| Jerry Glanville | 7.37 | 1.21 | 3.74 | 6.89 | 6 | 3.45-11.28 | 
| Jim Caldwell | 8.10 | 1.00 | 5.07 | 7.84 | 8.66 | 4.39-11.82 | 
| Jim E. Mora | 6.97 | 0.95 | 4.10 | 7.15 | 6.25 | 3.31-10.63 | 
| Jim Fassel | 8.07 | 0.80 | 5.66 | 8.26 | 8.28 | 4.56-11.58 | 
| Jim Haslett | 7.58 | 0.85 | 5.01 | 7.80 | 7.5 | 4.02-11.14 | 
| Jim L. Mora | 7.73 | 0.96 | 4.85 | 7.68 | 7.75 | 4.06-11.39 | 
| Jim Mora | 7.83 | 0.95 | 4.96 | 7.76 | 8 | 4.17-11.49 | 
| Jim Zorn | 7.13 | 1.09 | 3.85 | 6.97 | 6 | 3.33-10.93 | 
| Jimmy Johnson | 8.75 | 0.89 | 6.07 | 8.55 | 9.6 | 5.15-12.34 | 
| Joe Bugel | 6.96 | 1.11 | 3.62 | 6.81 | 5.5 | 3.14-10.78 | 
| Joe Gibbs | 7.61 | 0.94 | 4.77 | 7.63 | 7.5 | 3.96-11.26 | 
| John Fox | 7.97 | 0.74 | 5.74 | 8.32 | 8.11 | 4.52-11.42 | 
| John Harbaugh | 9.30 | 0.97 | 6.38 | 8.77 | 11 | 5.62-12.98 | 
| Jon Gruden | 8.37 | 0.69 | 6.29 | 8.70 | 8.63 | 4.97-11.76 | 
| Josh McDaniels | 7.15 | 1.10 | 3.84 | 6.97 | 6 | 3.34-10.96 | 
| June Jones | 7.10 | 1.02 | 4.03 | 7.10 | 6.33 | 3.37-10.83 | 
| Ken Whisenhunt | 7.87 | 0.89 | 5.18 | 7.92 | 8 | 4.27-11.47 | 
| Kevin Gilbride | 6.97 | 1.23 | 3.28 | 6.56 | 4 | 3.03-10.91 | 
| Lane Kiffin | 6.96 | 1.23 | 3.26 | 6.55 | 4 | 3.02-10.89 | 
| Lindy Infante | 7.13 | 1.11 | 3.78 | 6.92 | 6 | 3.31-10.96 | 
| Lovie Smith | 8.46 | 0.77 | 6.13 | 8.59 | 8.87 | 4.98-11.94 | 
| Marty Mornhinweg | 6.00 | 1.16 | 2.51 | 6.02 | 2.5 | 2.13-9.87 | 
| Marty Schottenheimer | 9.20 | 0.68 | 7.15 | 9.31 | 9.75 | 5.81-12.60 | 
| Marv Levy | 8.39 | 0.90 | 5.68 | 8.27 | 9 | 4.78-12.00 | 
| Marvin Lewis | 7.69 | 0.74 | 5.45 | 8.11 | 7.66 | 4.24-11.14 | 
| Mike Ditka | 6.57 | 1.05 | 3.42 | 6.66 | 5 | 2.81-10.32 | 
| Mike Holmgren | 9.10 | 0.60 | 7.29 | 9.41 | 9.5 | 5.79-12.41 | 
| Mike Martz | 8.76 | 0.86 | 6.17 | 8.62 | 9.5 | 5.19-12.33 | 
| Mike McCarthy | 9.34 | 0.86 | 6.74 | 9.03 | 10.5 | 5.77-12.91 | 
| Mike Mularkey | 7.46 | 1.10 | 4.15 | 7.19 | 7 | 3.65-11.27 | 
| Mike Nolan | 6.72 | 1.02 | 3.64 | 6.82 | 5.33 | 2.99-10.45 | 
| Mike Riley | 6.44 | 1.05 | 3.28 | 6.57 | 4.66 | 2.68-10.20 | 
| Mike Shanahan | 8.97 | 0.60 | 7.15 | 9.31 | 9.31 | 5.65-12.28 | 
| Mike Sherman | 8.74 | 0.85 | 6.17 | 8.62 | 9.5 | 5.18-12.30 | 
| Mike Singletary | 7.47 | 1.09 | 4.20 | 7.22 | 7 | 3.68-11.27 | 
| Mike Smith | 9.19 | 0.96 | 6.28 | 8.70 | 10.75 | 5.52-12.87 | 
| Mike Tice | 7.84 | 0.94 | 5.01 | 7.79 | 8 | 4.19-11.49 | 
| Mike Tomlin | 9.49 | 0.92 | 6.72 | 9.01 | 11 | 5.86-13.11 | 
| Mike White | 7.60 | 1.10 | 4.30 | 7.29 | 7.5 | 3.79-11.41 | 
| Nick Saban | 7.63 | 1.09 | 4.34 | 7.32 | 7.5 | 3.83-11.43 | 
| Norv Turner | 7.70 | 0.63 | 5.80 | 8.35 | 7.71 | 4.36-11.04 | 
| Pete Carroll | 7.96 | 0.93 | 5.16 | 7.90 | 8.25 | 4.32-11.60 | 
| Raheem Morris | 6.86 | 1.03 | 3.75 | 6.90 | 5.66 | 3.12-10.60 | 
| Ray Rhodes | 7.54 | 0.88 | 4.88 | 7.70 | 7.4 | 3.95-11.13 | 
| Rex Ryan | 8.38 | 1.02 | 5.32 | 8.01 | 9.33 | 4.65-12.10 | 
| Rich Brooks | 7.30 | 1.10 | 4.00 | 7.08 | 6.5 | 3.50-11.11 | 
| Rich Kotite | 6.26 | 0.99 | 3.27 | 6.56 | 4.75 | 2.56-9.96 | 
| Richie Petitbon | 6.97 | 1.23 | 3.28 | 6.56 | 4 | 3.03-10.91 | 
| Rod Marinelli | 5.86 | 1.07 | 2.65 | 6.11 | 3.33 | 2.08-9.64 | 
| Romeo Crennel | 6.87 | 0.95 | 3.99 | 7.07 | 6 | 3.20-10.53 | 
| Sam Wyche | 6.99 | 1.03 | 3.90 | 7.01 | 6 | 3.26-10.73 | 
| Scott Linehan | 6.98 | 1.10 | 3.67 | 6.84 | 5.5 | 3.17-10.79 | 
| Sean Payton | 9.24 | 0.85 | 6.68 | 8.98 | 10.33 | 5.68-12.80 | 
| Steve Mariucci | 8.21 | 0.77 | 5.88 | 8.41 | 8.5 | 4.73-11.69 | 
| Steve Spagnuolo | 5.87 | 1.07 | 2.66 | 6.12 | 3.33 | 2.09-9.65 | 
| Steve Spurrier | 7.14 | 1.09 | 3.84 | 6.96 | 6 | 3.33-10.94 | 
| Ted Marchibroda | 6.79 | 0.86 | 4.21 | 7.22 | 6.16 | 3.22-10.35 | 
| Todd Haley | 7.40 | 1.02 | 4.33 | 7.31 | 7 | 3.67-11.13 | 
| Tom Cable | 7.31 | 1.11 | 3.95 | 7.04 | 6.5 | 3.48-11.13 | 
| Tom Coughlin | 8.62 | 0.60 | 6.80 | 9.07 | 8.87 | 5.31-11.93 | 
| Tom Flores | 7.14 | 1.09 | 3.86 | 6.98 | 6 | 3.34-10.94 | 
| Tony Dungy | 9.94 | 0.67 | 7.93 | 9.87 | 10.69 | 6.57-13.32 | 
| Tony Sparano | 7.73 | 0.94 | 4.90 | 7.72 | 7.75 | 4.08-11.38 | 
| Vince Tobin | 7.10 | 0.95 | 4.23 | 7.24 | 6.5 | 3.44-10.77 | 
| Wade Phillips | 9.05 | 0.79 | 6.67 | 8.97 | 9.75 | 5.55-12.54 | 
| Wayne Fontes | 8.09 | 0.94 | 5.25 | 7.97 | 8.5 | 4.44-11.74 | 
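
One common way to build a range of this kind is to combine the two sources of uncertainty, the spread in a coach's value and the season-to-season noise, into a single normal predictive distribution and read off its 10th and 90th percentiles. The sketch below does that, assuming the season standard deviation is fixed at 2.70. The article does not spell out how its own ranges were computed, so these numbers will differ somewhat from the table; the function name is hypothetical.

```python
from statistics import NormalDist


def predictive_range(value_mean, value_sd, season_sd=2.70, coverage=0.80):
    """Central `coverage` interval for next season's wins.

    ASSUMES a normal predictive distribution whose variance is the sum of
    the coach-value variance and the season variance.
    """
    total_sd = (value_sd ** 2 + season_sd ** 2) ** 0.5
    dist = NormalDist(mu=value_mean, sigma=total_sd)
    tail = (1.0 - coverage) / 2.0
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


if __name__ == "__main__":
    low, high = predictive_range(10.10, 0.63)   # Bill Belichick's row
    print(f"{low:.2f}-{high:.2f}")
```

Because the season standard deviation dominates the sum, the width barely changes from coach to coach, which matches the observation that the ranges narrow only slightly while their centers shift.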
One thing to note about this information is that some of it points to possible violations of the model's assumptions. The top two coaches both spent a lot of their careers with transcendent quarterbacks, for example. That's OK -- the model will simply be off on coaches for whom the assumptions don't hold as well. This is not about finding an exact cause-and-effect relationship; it's about finding general correlative information, which means that some coaches will be easier to predict than others.
 

 
 
 
 

11 comments:
How is this a measure of coaching value? It's a (crude) measure of the quality of teams a given coach has had. The main assumption you are making is that all teams have the same skill level -- clearly false. If you want to attempt to measure the value of a coach, you have to do some kind of analysis of how the team performed just before and just after his tenure. Even that would be pretty severely biased by unaccounted-for personnel changes.
How come Norv Turner is 28th out of 107??? AFAIR, the guys from FO did some work before the 2008 season; there he is the worst coach in the history of the NFL (rightfully chosen, b/c even w/o numbers the human eye can witness this every given Sunday :-)). He lost the most games when entering the 4th Qtr and he had the fewest 4th Qtr comebacks. I doubt he improved in the last few years, b/c he is the same timid coach he always was. And we shall remember he inherited a 14-2 team which could have won some Super Bowls (I totally agree with Ryan here). Otherwise, I appreciate your work.
Karl, Germany
Tony Dungy and his ultra-conservatism #2? I think what this ended up measuring is which teams won the most. Although I really appreciate the effort.
I'll address the first response later, as it requires a somewhat more complex explanation, but the second is simple -- it's a matter of assumptions. The model assumes the difference between a coach's win count for a season and their actual skill or value can be modeled by Gaussian noise. If you believe the majority of Turner's career took place starting from a dominant team with a brilliant GM, which he gradually destroyed, well, that's not Gaussian noise. In other words, it's the same argument as "Belichick is only the top coach because he got lucky on Tom Brady" -- it may be true, but the model doesn't incorporate that information.
Also note that this is an attempt to evaluate a head coach in a way that considers team composition as part of a head coach's responsibility (which it is -- free agency, the draft, and trades are usually controlled by a team's head coach). I suspect the FO study was an attempt to measure coach value independent of player value -- what a coach actually does in terms of clock management, play design, etc. to take the same group of guys and make them better or worse.
To summarize -- the Chargers had a lot of Pro Bowlers under Norv. If you think he had nothing to do with that, then you will disagree with his ranking in the model. You may be right -- it is, after all, just a model.
Boston Chris -- you are correct. This is largely based on which teams won the most. The idea is to attribute those wins to their head coach, since he has final say over the teams. I'll write more about how much information this adds (i.e. how strong predictions based on these rankings are) at some point soon. The fact, however, that the model does not measure the things that cause a team to win does not mean it cannot predict which teams (i.e. coaches) will win in the future.
As to Tony Dungy himself -- one could argue as to whether the model applies well to him. To be honest, I think it does. His continuous success on both the Bucs and the Colts may have had extenuating circumstances, but all success involves a bit of luck. That he did it so consistently for so many years indicates that, given the opportunity, he'd probably do it again -- and that is what I set out to measure.
Yes, the Chargers had many Pro Bowlers, whom Turner inherited from Schott (another timid coach, who not coincidentally never won a playoff game, but at least his teams were ready by September). After all, as you said, Turner "had nothing to do with that". All he did was waste the most talented team of the late 2000s.
No, Beli-Cheat, as we readers all know, is a great coach (from a pure football standpoint). Year in, year out he has more wins than expected by Brian's and others' great models.
But as I said before, I still appreciate your good work.
Karl, Germany
Yes, assuming that a head coach is mostly responsible for drafting/trading/signing is a very poor assumption. Without this assumption the entire article becomes misleading at best.
I hope that version 2 of the article answers some of the comments above, both in terms of the math and the goals of this piece.
I'll point out that the Bayesian results are 0.946 correlated with simply dividing wins by 16. So for most purposes, the straight-up winning percentage is probably sufficient.
One potential point of interest: can you form any hypothesis as to what the remaining variation is due to? (For example, by looking at which coaches over- and underperform in the Bayesian value relative to the straight-up value, and trying to think about similarities in each set.)
If one is being careful, there is no need for the "season standard deviation". The draw should not be from a normal distribution with an unknown mean winning percentage and standard deviation, it should be from a binomial distribution with an unknown winning percentage. The variance of a binomial distribution is given by its mean, and so the extra parameter is unnecessary.
In fact, from a binomial perspective, it is impossible to get a "season standard deviation" (taking the square root of the variance) that is above 2. The fact that you found a value above 2 is probably the most interesting result of the analysis. It means there is some unknown additional factor (time-series non-invariance being the most likely culprit).
Probably need to explain how Bayesian analysis works... and whatever the Microsoft algorithm you used to normalize.
And what is BUGS?? I think you need to explain what you are doing better.