Comments on Advanced NFL Stats Community: Combining Win Probability Model With Live WP Graph

Is there any stat that simply measures the area un...

2012-09-17T02:34:24.540-04:00

Is there any stat that simply measures the area under the WPA graph (integral?) to measure game dominance? If not, WPA plus time remaining should be enough to come up with something like this, and I think it would be valuable when ranking or comparing teams.

Thanks.
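
One way to sketch the "area under the graph" idea is to integrate the win-probability curve with the trapezoid rule and divide by game length. The function name `wp_dominance`, the choice to measure the area relative to 0.5, and the sample values are all assumptions made for illustration. None of them is an existing stat from the site.

```python
def wp_dominance(minutes_elapsed, win_prob):
    """Time-averaged (WP - 0.5) over the sampled span, via the trapezoid rule.

    Positive values mean the team spent most of the game ahead in WP terms.
    """
    if len(minutes_elapsed) != len(win_prob) or len(win_prob) < 2:
        raise ValueError("need at least two matching samples")
    area = 0.0
    for i in range(1, len(win_prob)):
        dt = minutes_elapsed[i] - minutes_elapsed[i - 1]
        # Average height of this slice, measured from the 0.5 baseline.
        avg_height = (win_prob[i] + win_prob[i - 1]) / 2.0 - 0.5
        area += avg_height * dt
    return area / (minutes_elapsed[-1] - minutes_elapsed[0])


# Hypothetical WP samples at the end of each quarter (made-up numbers).
times = [0, 15, 30, 45, 60]
wps = [0.5, 0.7, 0.8, 0.9, 1.0]
score = wp_dominance(times, wps)
```

Dividing by elapsed time keeps the score between -0.5 and 0.5, so games sampled at different resolutions stay comparable.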

To put it simply, Brian's model gives us the c...

2012-09-15T11:26:42.850-04:00

To put it simply, Brian's model gives us the current measurement of the difference between the two teams due to what has happened in the game so far. The part I have introduced says: we know that a sixty-minute game between these two teams gives an expected result of X, with the probabilities of deviating from X governed by its standard deviation S. So if we shrink the game to the time that remains, T, we must calculate the new value of S, which is S*((T/60)^0.5), and the new value of X, which is simply X*(T/60). Then we can combine what Brian's model says with the new values of X and S to compute the total probability of winning.

Ah Andrew, perhaps I wasn't clear enough. The ...

2012-09-15T11:17:22.871-04:00

Ah Andrew, perhaps I wasn't clear enough. The team strength does indeed vary linearly with time, but any suitably distributed random variable (normal, logistic, etc.) must also be considered relative to its standard deviation, or in the case of the logistic distribution its scale parameter s, which is proportional to the standard deviation. See here: http://en.wikipedia.org/wiki/Logistic_distribution
As such, the overall logit must vary as (T/60)/((T/60)^0.5) = (T/60)^0.5, and hence we reach what I stated. This follows from the well-known rule that the standard deviation of a random variable of this sort (logistic, normal, lognormal...) varies as T^0.5. So if we have some value M measured over 60 minutes with a standard deviation of 7, then over 30 minutes we would expect a mean of M/2 but a standard deviation of 7*((30/60)^0.5) = 4.95, not 3.5. This is very important.
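
The arithmetic in that example can be checked with a few lines of Python. This is only a sketch: the helper name `scale_to_remaining` and the choice of M = 10 are illustrative assumptions, while the standard deviation of 7 and the 30-minute horizon come from the example above.

```python
import math


def scale_to_remaining(mean_60, sd_60, minutes_remaining, game_minutes=60.0):
    """Scale a full-game mean linearly and its standard deviation by sqrt(time)."""
    frac = minutes_remaining / game_minutes
    return mean_60 * frac, sd_60 * math.sqrt(frac)


# M = 10 is a hypothetical value; sd = 7 is the figure used in the text.
mean_30, sd_30 = scale_to_remaining(10.0, 7.0, 30.0)

# The mean-to-sd ratio (what the logit tracks) shrinks by (T/60)^0.5.
ratio_60 = 10.0 / 7.0
ratio_30 = mean_30 / sd_30
shrink = ratio_30 / ratio_60
```

Here `mean_30` is half the full-game mean, `sd_30` is about 4.95 rather than 3.5, and `shrink` equals (30/60)^0.5.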

I can confirm that this equation is the correct solution to the problem at hand when matches are modeled with a logistic distribution, which is much nicer to implement than the normal distribution and is the natural basis for Brian's equations. I use this equation for drawing graphs from NBA games, and it agrees to within a 1% tolerance with a model that instead uses a Monte Carlo simulation to play out the remaining game time, taking into account the difference in team strengths.
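
A minimal sketch of how such a combination might look in code follows. It assumes the pregame strength enters as a logit that is scaled by (T/60)^0.5 and then simply added to the logit of the in-game model's WP. That additive form is an assumption of this sketch, not something spelled out in the comment, and the names `adjusted_wp`, `model_wp`, and `pregame_wp` are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def adjusted_wp(model_wp, pregame_wp, minutes_remaining, game_minutes=60.0):
    """Combine an in-game WP with a time-decayed pregame strength term.

    Assumption: the two contributions add in logit space, and the strength
    logit decays as the square root of the fraction of the game remaining.
    model_wp and pregame_wp must lie strictly between 0 and 1.
    """
    frac = minutes_remaining / game_minutes
    strength_term = logit(pregame_wp) * math.sqrt(frac)
    return inv_logit(logit(model_wp) + strength_term)


# Hypothetical 65% pregame favorite.
at_kickoff = adjusted_wp(0.5, 0.65, 60.0)  # even game state, full game left
at_the_end = adjusted_wp(0.9, 0.65, 0.0)   # no time left, strength term vanishes
```

Under these assumptions the adjusted value starts at the pregame probability when the in-game model reads 0.5, and it collapses to the in-game model's value as the remaining time goes to zero.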

Have you tried out your equation with some real ga...

2012-09-15T10:51:18.859-04:00

Have you tried out your equation with some real game data to see if it does what you expect or want? It would be interesting to see a plot of the in-game WP as currently calculated alongside the in-game WP using this correction. The most important thing is to show that at the beginning of the game it equals S(60), and that toward the end of the game it slowly converges to the original WP at either one or zero (depending on who won).

However, I do suspect (and maybe you've found a version) that there's a much simpler first-order approximation to my original formula that would work practically as well. Finding such an approximation would be a worthy thing.

Although I should say that I have the sense that Brian knows how to implement the complicated version, but has hesitated. I'd speculate it's because you can introduce a lot of confusion and a lot to argue about.

"As such we reach the equation for the variation of the team strength parameter over the course of the game, S(T)=S(60)*((T/60)^0.5)."

What varies with the square root of time remaining is the standard deviation of the realized strength advantage outcome over many games, not the team strength parameter, which is the average of the realized strength advantage outcome over many games.

This is one reason it gets complicated--the things with simple time dependences are denominated in points, but what you want is a probability.

I believe this is indeed correct: "if I tell you team A will on average beat team B by ten points over a sixty minute game, then we'd both agree that if the games were cut to thirty minutes then on average A would beat B by five points", namely, that team strength S(t) varies linearly in time, not like the square root.
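
To make the points-versus-probability distinction concrete, here is a small sketch that turns a point margin into a win probability. It assumes the final margin is normally distributed and ignores ties. The full-game standard deviation of 13.5 points is a hypothetical figure chosen only for illustration, and `win_prob_from_points` is an invented name.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def win_prob_from_points(mean_margin_60, sd_margin_60, minutes, game_minutes=60.0):
    """Win probability for a game lasting `minutes` (must be > 0).

    The mean margin scales linearly with game length, the standard deviation
    with its square root, and the probability depends only on their ratio.
    """
    frac = minutes / game_minutes
    mean_t = mean_margin_60 * frac
    sd_t = sd_margin_60 * math.sqrt(frac)
    return normal_cdf(mean_t / sd_t)


SD_60 = 13.5  # hypothetical full-game standard deviation, in points

p_60 = win_prob_from_points(10.0, SD_60, 60.0)  # ten-point favorite, full game
p_30 = win_prob_from_points(10.0, SD_60, 30.0)  # same teams, thirty-minute game
```

In the shorter game the favorite's expected margin halves to five points, yet `p_30` stays above one half and below `p_60`, because the probability follows the mean-to-deviation ratio, which falls only by (30/60)^0.5.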