ExpG Comment Reply
Note: this is a detailed response to a comment left on another post (found here) about Expected Goals (ExpG).
The claim was that I “don’t have any idea what [I’m] doing.” That was the general conclusion drawn from the specific complaint that I had trashed someone else’s work, even though that work was apparently superior.
As was pointed out, having an Actual Goals/ExpG calculation closer to 1 is an indication that our model is doing a good job measuring the cumulative value of shots a team has taken. And yes, 1.116 is closer to 1 than 1.206 (this is the single statistic—two different measures of Swansea’s ExpG last season—from which the criticism above was derived).
But that’s one team. A better comparison would involve more than one team. So I did similar comparisons for multiple teams. I found four complete seasons worth of ExpG calculations using the model you referred to over at SB Nation—2013-14 seasons for each of Spain, Italy and Germany; and the 2014-15 EPL season—and did the full season calculations for three of those to compare against. I left out the Bundesliga. Why? Because I had just done something similar for a post below. Theoretically that would mean I had already done those calculations, making it less work. But I stupidly neglected to save my R workspace, and I don’t want to do the same work twice. This does take a little bit of time. Still, if you check out the charts in that Bundesliga post, visually, there’s a compelling case that we’re doing “better” (although I have no idea which model the original poster used for his calculations). Anyway, I’m pretty sure that three seasons will suffice for a comparison.
Again, just to be clear (and as you stated), closer to one is better. That’s for both an over- and under-shot. For example: Newcastle scored 56 goals in the 2011-12 season. That’s not made up; that’s their actual total (and holy crap, I totally forgot they finished 5th that season, even ahead of Chelsea).
Suppose both ExpG calculations over-estimate Newcastle: one is 58 and the other is 63. The former is “better,” as 56/58 (or .966) is closer to one than 56/63 (or .889). Similarly, if we undershoot, we want to undershoot by less. So if the calculations are 51 and 47, the former is again better, as 56/51 (1.098) is closer to one than 56/47 (1.191).
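The “closer to one” rule above reduces to a tiny function: score each estimate by how far its Actual/ExpG ratio sits from 1, and the smaller distance wins. A minimal sketch (the numbers are the hypothetical ones from the Newcastle example, not real model output):

```python
def ratio_error(actual_goals, expg):
    """Distance of the Actual/ExpG ratio from 1; smaller is better."""
    return abs(actual_goals / expg - 1)

actual = 56  # Newcastle's real 2011-12 total

# Over-estimates: 58 beats 63, since 56/58 (.966) is nearer to 1 than 56/63 (.889)
assert ratio_error(actual, 58) < ratio_error(actual, 63)

# Under-estimates: 51 beats 47, since 56/51 (1.098) is nearer to 1 than 56/47 (1.191)
assert ratio_error(actual, 51) < ratio_error(actual, 47)
```

This is the per-team comparison applied to every club in the three seasons below.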
With that in mind, here are the results for the three seasons specified:
For the 2014-15 EPL, I was more accurate on 15 of 20 teams (highlighted in green). For the 2013-14 La Liga season I was again more accurate on 15 of 20 teams. The 2013-14 Serie A campaign is a little trickier because some of the goal totals on the SB Nation chart are inaccurate. Specifically, those for the following teams (the ‘Posted’ values are those on the SB Nation page; the ‘Actual’ values are what actually happened over the course of the season).
I originally thought there might be incomplete data and that the calculations were for the games available. But in that instance none of the teams would have had a posted total greater than the actual (i.e. if there were missing games, there is no way Juve’s posted total (84) could exceed their actual total (80); you can’t total up goals you don’t have). My guess is that they were honest mistakes (it’s super easy to get lost when you are moving data around). So I simply substituted in the correct values and, after doing that, I was more accurate on 15 of the 19 Serie A teams.
Yes, there are 20 teams in Serie A. I had problems with my AS Roma calculations and, even though I tried to correct them in a way that penalized me for the missing data, it didn’t seem right to include the team (FWIW, my calculation was ultimately closer to 1 but, without any guarantee of uniformity across processes, it still seemed better to toss it out). That super conspicuous black line? That’s Roma.
Add all three seasons up and it’s 45 out of 59. That’s a .763 batting average.
If you count total goals to the good, I’m better by a cumulative 99.4 goals. And that’s net. Gross, I’m up 124.6 to 25.2.
Back to the original post and the relevant ExpG calculations. Swansea was one of my biggest misses this past season. They scored 46 on ExpG of 38.1 (my number). So that’s almost off by a full 8 goals (which to a low total is a sizable percentage). The starting point for the post at Statsbomb was that the Swans substantially over-performed their expectations. Swansea looked like an outlier. That’s what made them worth digging more into.
By the SBN model, the Swans’ Actual/Exp was 1.1165. If you take the three seasons here as the dataset, then a simple mean and variance calculation puts Swansea about .80 standard deviations above the mean. They weren’t even a full standard deviation above it. About 57% of the data are going to fall within +/- .80 standard deviations. I’m not sure what would constitute an outlier for ExpG, but I’m pretty sure that’s not it.
For comparison’s sake, SBN has a mean of 0.983 and a standard deviation of 0.165; my respective numbers are 1.004 and 0.1347. So my numbers have the Swans 1.50 standard deviations above the mean (so about 86% of the data are going to fall within +/- 1.50 standard deviations). Maybe not what you’d consider a true outlier either, but at least large enough to be worth another look. Plus, when we’re right most of the time, we can be pretty confident that, on a big miss, something is up.
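Those standard-deviation figures are easy to reproduce from the summary numbers quoted above. A quick sketch (the means and standard deviations are the ones stated in this post; the coverage function assumes a normal distribution, which is only an approximation here):

```python
from math import erf, sqrt

def z_score(ratio, mean, sd):
    """How many standard deviations a team's Actual/ExpG ratio sits from the mean."""
    return (ratio - mean) / sd

def two_sided_coverage(z):
    """P(|Z| <= z) for a standard normal: the share of data within +/- z SDs."""
    return erf(z / sqrt(2))

# SBN model: Swansea at 1.1165, against a mean of 0.983 and SD of 0.165
z_sbn = z_score(1.1165, 0.983, 0.165)      # ~0.81 SDs above the mean
# My model: 46 actual on 38.1 expected, against a mean of 1.004 and SD of 0.1347
z_mine = z_score(46 / 38.1, 1.004, 0.1347) # ~1.51 SDs above the mean
```

Under the normal assumption, `two_sided_coverage(0.80)` is roughly 0.58 and `two_sided_coverage(1.50)` roughly 0.87, which matches the 57% and 86% figures in the text to within rounding.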
Is my model perfect? Not even close. It’s football, fer chrissakes. It’s like non-linear dynamics as a game. Even if it were decipherable mathematically, I can think of four additional factors I want to add (so I’m not even complete by my own standards). They are just going to be ridiculously complex to add, and the gains will be marginal. Moreover, as I’ve said elsewhere, I don’t even think these calculations are where the real value in having a good model lies. Still, if someone is going to tell me that I have no idea what I’m talking about, then doing some math in my defense seems like an entirely reasonable response.
Even if it’s one I spent far too much time on for my own liking.
Thoroughly documented, bravo.
Changing topics slightly: why even build ExpG models? They only tell you that distance (plus the delivery of the pass leading to the shot, in other models) can predict goals better than raw shot counts. Well, duh, but that has no practical application…”hey, Jose Mourinho, tell the lads to shoot more from areas closer to the goal”…duh. The only thing I can think of is that it could be a better measure of how good a team is (better than goals or points, because discrete rare events are hard to quantify). But even then, this metric is one manifestation of a latent factor (or factors); a simple factor or dynamic factor model should tell us much, much more about a team’s quality.
North Yard Analytics does a model that’s not shot-based. They talk about it in a post on their site. That probably has more practical usefulness. And I have found one instance where I needed to regress onto ‘something’, and ExpG was pretty much the perfect solution. Moreover, I’ve got a fair few factors in the model, and two or three of those coefficients can tell you something about how to play the game (tactically). But generally, in isolation, it doesn’t do much. Even for teams or players, there’s nothing that tells you whether over- or under-performances are sustainable or just random occurrences (e.g. Messi is pretty much sustainable; Immobile ’13-’14 was an aberration). Sure, if it happens three seasons in a row, it’s reasonable to assume it’s not an aberration, but by then everyone knows about the player, so you don’t have any advantage.
It is super time-consuming to build a decent model, and once you run data through it and find that your results match the real world well enough to know you didn’t screw up, I can see where you could run into the Law of the Instrument (i.e. ‘when all you have is a hammer, every problem looks like a nail’).
why do you think being closest to actual goals is more accurate? surely expG isn’t trying to measure actual goals, we already have actual goals there. I’ve always heard it explained as trying to be more accurate than goals, so comparing it to actual goals seems counter-intuitive to me. am I missing something?
You’re the Saturdays on the Couch person, correct?
First, I really enjoy your stuff. Think you are doing more with passing data than anyone else blogging these days. That’s really rich data and I like seeing what you’ve been doing. Second, sorry for the delay in accepting and replying to your comment. I sometimes pay that little attention to my own blog.
Finally, regarding the actual question: Actual goals are what we use to build the model (usually… again, NYA does a non-shot-based model, which I’m starting to see the genius in). What we’re doing is assigning a probability to each individual shot. Sum a few hundred or thousand of those over a large set (like a season) and, if our model is doing a good job, we should be really close to what was actually scored. If we’ve missed a lot of ‘stuff’ then we’re not accurately assigning probabilities to individual shots. Over the course of the season that will also start to show up, as the actual and expected converge less than they would under a ‘better’ model.
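The mechanics described there fit in a few lines: each shot gets a scoring probability, ExpG is just their sum, and the season-level check is the ratio of actual goals to that sum. A toy sketch (the probabilities below are invented for illustration, not output from any real ExpG model):

```python
# Per-shot scoring probabilities a model might assign (toy values, invented here)
shot_probs = [0.05, 0.31, 0.02, 0.12, 0.76, 0.08]

expg = sum(shot_probs)        # expected goals for this set of shots: 1.34
actual_goals = 2              # what actually went in

# The Actual/ExpG check: a ratio near 1 means the model, in aggregate,
# priced these shots about right
ratio = actual_goals / expg
```

Scale the list up from six shots to a season’s worth of several hundred, and this is exactly the team-level comparison used throughout the post.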
I literally just put up a post that maybe (indirectly) addresses this (just go to the home page and it should be the top post… at least it will for a week or two as I have nothing else in the pipeline I’m going to push out). If you still have questions, fire away.
yes that’s me, thank you it’s appreciated.
I do a lot of non-shot based stuff as well, and while it sounds better it is not any sort of holy grail I can assure you. the differences in conversion from position to shot are much greater than shot to goal. non-shot based are better for description and making conclusions over small samples on team quality however.
I understand the totals should add up, but still don’t quite understand why each team should add up. at some point can you not simply explain everything away and wind up with a .98 R2 or something? shouldn’t we be taking samples and then testing going forward, as xG should be more predictive than goals mainly and then shots, SOT, deep completions, etc.
I guess I just don’t quite see why a season is the length of time that decides expected and real goals should converge.