Monday, October 25, 2010

Interception and counter-momentum?

by Andy Steiner

David Romer considers momentum in his paper “Do Firms Maximize? Evidence from Professional Football”. In that study he looks at very good plays and very bad plays, and then at the next three plays, in an attempt to quantify momentum. He finds no significant momentum effect (if anything it goes in the reverse direction: the team that did poorly did slightly better than average on the next play). I wanted to look at a very similar effect: the momentum value of interceptions. Specifically, I wanted to see how much more likely the intercepting team was to score, measured in actual points. What I found definitely surprised me, although I should have seen it coming.

To see whether interceptions have some extra effect beyond the obvious value of taking the ball away from your opponent, I will simply compare the expected (actual) point curve for drives that start normally (“normal starting” drives) with the curve for drives that start from interceptions.

Each graph is plotted as actual points scored vs. yard line from end zone. The “normal starting” drives actual points graph is shown below, with the equation and R² value shown on the graph:

The interception drives actual points graph is shown here, with the equation and R² value also shown:

Lastly, interception actual points best fit line overlaid with “normal starting” actual points best fit line:

The interception drives yield considerably fewer points, but I have no idea how to test for significance for something like this. All I know is that there are 501 samples in the interception line and 15,683 samples in the “normal starts” line. One thing that indicates to me that this could be more than a statistical fluke is that the slopes of the two lines are remarkably similar, even if they aren’t exactly the same (0.0602 vs. 0.0597). To me, that says something in itself: even if the result is not statistically significant, there is no evidence to suggest that interceptions cause a positive change in momentum (positive for the intercepting team).

I do believe in sports momentum (because I have golfed!), but in the case of the interception I don’t think momentum works as expected. I am not a psychologist, but it makes sense, because the people who now take the field didn’t actually do anything wrong. This is certainly not an exhaustive study of momentum. Maybe there is a significant effect once the offense guilty of throwing the interception takes the field again, but that is not specifically what I studied.
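On the significance question, one possible approach is a permutation test on the vertical gap between the two lines. The sketch below is an illustration only, not the code behind the graphs above. It assumes each data point is a hypothetical (yard line, points) pair, that both groups share a common slope (which the similar fitted slopes suggest), and that the points are independent of each other, which is not strictly true when one drive feeds several yard lines. The function names are made up for this example.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def permutation_test(normal, intercepted, n_perm=2000, seed=0):
    """Test whether interception points sit above or below the pooled line.

    Returns (observed_gap, p_value). The gap is the mean residual of the
    interception points minus the mean residual of the normal points,
    both measured against one line fitted to all points together.
    """
    pooled = normal + intercepted
    slope, intercept = fit_line(pooled)
    residuals = [y - (slope * x + intercept) for x, y in pooled]
    n_normal = len(normal)

    def gap(res):
        normal_part, int_part = res[:n_normal], res[n_normal:]
        return sum(int_part) / len(int_part) - sum(normal_part) / len(normal_part)

    observed = gap(residuals)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        # Shuffling the residuals is the same as shuffling the group labels.
        rng.shuffle(residuals)
        if abs(gap(residuals)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (extreme + 1) / (n_perm + 1)
```

A small p-value would say that a gap this large rarely appears when the "interception" label is assigned at random. With 501 points in one group and 15,683 in the other, the pure-Python loop is slow but workable.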

Data Set

I have been using data from the 1st and 3rd quarters of the 2002 to 2009 seasons. I am assuming that intercepting teams are no more or less likely to score near the end of a half than they are in the first or third quarter, at least not when compared to a “normal starting” drive. I also required that the next score occur in the first or third quarter to be counted. Since an interception can happen at any point in the 1st (or 3rd) quarter, interceptions thrown near the offense’s own end zone are more likely to be turned into scores before the quarter expires, and so are more likely to stay in the data set instead of being thrown out. I am comparing this number to the actual points of “normal starting” drives that score during the 1st or 3rd quarter.

Trying to understand how expected points are calculated

When Brian Burke first explained expected points (EP) it seemed very easy to understand, but after thinking about it further I realized it could get very confusing. It took me a while to realize that over the course of a drive, several ‘bins’ are opened up (one for each yard line where the offense has a 1st down). The resulting points of that drive are added equally to all of those bins. This confused me for a while because I didn’t realize that all the bins were essentially independent of each other. When a team gets the ball on their own 40, they either score 7, 3, -7, -3, 2, or -2 (neglecting the kickoff for now). The 40 yard line receives a datum point, along with all the other yard lines where they happened to have a first down along the way. I thought this could lead to a bias, because only 3 or 4 bins are counted in a successful scoring drive. What happened to all the other yard lines? Essentially, I had to realize that all the other bins were independent and would receive their data at a later time. With this in mind, I recalculated an actual point curve to match the data set (scoring had to occur in the 1st or 3rd quarter) using this same method: putting each first down into its yard line bin, and then putting the appropriate number of points into that bin.
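The binning idea can be sketched in a few lines. This is a minimal illustration, not the code used for the graphs. It assumes each drive is a hypothetical pair of (list of first-down yard lines, resulting points), with yard lines counted as yards from the opponent's end zone. That convention and the sample drives are made up for the example.

```python
from collections import defaultdict

def build_bins(drives):
    """Credit each drive's resulting points to every first-down yard line it visited."""
    bins = defaultdict(list)
    for first_down_yard_lines, points in drives:
        for yard_line in first_down_yard_lines:
            bins[yard_line].append(points)
    return bins

def actual_points_curve(bins):
    """Average the points in each bin, ordered by yard line."""
    return {yard_line: sum(values) / len(values)
            for yard_line, values in sorted(bins.items())}

# Hypothetical drives: one touchdown drive with four first downs, and one
# drive that stalls at its starting spot before the opponent kicks a field goal.
drives = [
    ([60, 45, 30, 12], 7),
    ([60], -3),
]
curve = actual_points_curve(build_bins(drives))
```

Here the 60 yard bin holds two independent data points (7 and -3), while the 45, 30, and 12 yard bins each hold only the 7 from the touchdown drive. Bins a drive never touches simply wait for data from other drives.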

Note that I am actually re-calculating the actual scores from 2002-2009, not just comparing a play to an expected point curve. This shows how many points were actually scored after interceptions compared to “normal starting” drives.


I took a standard 0.7 points away from each score to account for kickoffs. Touchdowns are now worth 6.3 points and field goals are worth 2.3 points.
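As a quick check of that arithmetic, here is a minimal sketch. The constant and function names are made up for illustration, and it only covers the two scores mentioned above.

```python
KICKOFF_ADJUSTMENT = 0.7  # points subtracted from each score to account for the kickoff

def adjusted_points(raw_points):
    """Subtract the kickoff adjustment; round to one decimal to avoid float noise."""
    return round(raw_points - KICKOFF_ADJUSTMENT, 1)

touchdown = adjusted_points(7)   # touchdown plus extra point
field_goal = adjusted_points(3)
```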

Weird anomalies

Sometimes the point values aren’t right, including point values that don’t actually exist; specifically, 5.3 shows up. I think this may come from how I wrote my code, but I can’t quite narrow it down. It might be that the OFFSCORE column adds points to the scoring team in the wrong row (i.e. a few plays later), throwing off the “score” part of the bin data. I am not completely sure. I do know that these weird scores are not very frequent, even though they look frequent on the graphs. In fact, for both the interception and “normal starts” curves, ~97% of all data points come from normal score values (6.3 or 2.3). I see no reason that these randomly scattered values would throw off the slope or intercept of the curves by very much.
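A tally like the following is one way to check how common the odd values are and to list them so they can be traced back through the data. It is a sketch only. The function name is hypothetical and the sample values are invented to mirror the roughly 97% share described above.

```python
from collections import Counter

EXPECTED_VALUES = (6.3, 2.3)  # adjusted touchdown and field goal values

def tally_point_values(values, expected=EXPECTED_VALUES):
    """Return the share of expected values and a count of each unexpected one."""
    counts = Counter(round(v, 1) for v in values)
    total = sum(counts.values())
    expected_share = sum(counts[v] for v in expected) / total
    unexpected = {v: n for v, n in counts.items() if v not in expected}
    return expected_share, unexpected

# Invented sample: 97 normal values and 3 odd ones.
sample = [6.3] * 60 + [2.3] * 37 + [5.3] * 3
share, odd_values = tally_point_values(sample)
```

Here `share` comes out to 0.97 and `odd_values` lists 5.3 with a count of 3, so the stray values are visible without dominating the data.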

I didn’t separate interceptions from the “normal starting” drives, so interception drives are also counted in the “normal starting” line. If I had separated the two, all other types of “normal starts” would have a higher actual point value than the “normal starting” line shown above. Separating them would only widen the gap between normal starts and interception actual points.

Feel free to comment, and thanks for reading!


Anonymous said...

I suggest averaging all the data points for each yard line (or do a 5 yard moving average or something) - then your data will probably look very pretty and convincing.

Anonymous said...

5.3 points = TD with missed extra point or failed 2 point conversion.

Anonymous said...

Sorry, more like: 5.3 points = TD with missed extra point or failed 2 point conversion - your point adjustment of .7

Andy said...

That is a good thought about the averaging. The scatter might not be the best plot because there are so many points "hiding" behind common values. I will look into averaging. I have to use the same method to evaluate both the interception and normal drives.
About the 5.3 points, I totally overlooked that when I was putting this together. That would explain a lot of those point errors. Thanks.
