I had some trouble developing this component of analysis. As I mentioned in my previous article on four factor offensive efficiency, some things don’t quite translate for defence.
The first and most obvious element for anyone familiar with the sport is that there aren’t clear defensive partners for each offensive statistical category. For example, the opposite of a bucket at the rim in basketball might be a block, but AFL doesn’t really have ‘blocks’ as a defensive measure of denying a score.
We can take into consideration smothers, usually rolled into the measure ‘1 per centers’ on public stats sites along with spoils and other discrete defensive actions. However, smothers are relatively rare and also don’t apply solely to players taking a shot. As a result, measuring defence in a quantitative manner is a trickier task.
Because I was basing my approach on Dean Oliver’s four factors for basketball my initial instinct was to simply try and find the reverse factors on defence. Where Oliver’s talking about offensive rebounding rate, for defence he can look at defensive rebounding rate.
The other factors more or less take into consideration the opponent’s results on offence. For example, in defensive four factors Oliver would look at the opponent’s effective field goal percentage as the counterpart to your own effective field goal percentage. We can do this too when we’re looking at team data.
In that case we would look at the defensive four factors as:
This works just fine if you’re looking at team data, where we only need to apply the data to the whole unit, but if we want individual results we have a different problem. For example, there is no ‘metres restricted’ or ‘metres decreased’ as a specific individual measure for a player’s defence. Additionally, what we are really doing here is just comparing one team’s average offence with the average offence they allow. It’s not a bad measure of defensive prowess, but (a) I don’t know if this tells the whole story and (b) it cannot be applied to individual stats to get a quantitative figure for defence.
If we take a look at this initial approach, we can see that the top three teams in 2018 for defensive efficiency were Richmond, Geelong and Hawthorn. This is a somewhat interesting result. Both Richmond and Geelong were in the top three for least points conceded, so their defensive efficiency status might come as little surprise.
Hawthorn, on the other hand, actually ranked eighth for points conceded but were third by my measure of defensive efficiency. This perhaps speaks to the fact that, relative to game speed and volume of possession, the Hawks’ defence is arguably still quite good. And although we are only talking about small differences between actual points and predicted points, we can also see that not much separates teams.
|Team||Opp EffScRate||Opp TO Rate||Opp MetresG||Opp Off Ret||Actual||Predicted|
Continuing with Hawthorn as the example, conceding just one goal fewer per game would make them the best defence in the game. Using my defensive efficiency formula, Hawthorn could achieve this by tweaking only a few aspects of their play.
In regard to the four factors the Hawks rank as the No. 1 team in opposition offensive retention and opposition advancement rate, meaning they are good at restricting the opposition’s run and perhaps long kicking and also don’t allow the opposition to hold the ball in their own forward half for very long. In the other two categories they rank sixth for opposition turnover rate and eighth for opposition effective scoring.
Looking at these two areas for improvement, we could posit Hawthorn turning one opposition goal per game from last season into a behind this season and forcing one extra error out of the opposition each game, for 22 more total forced errors on the season. These seemingly small improvements would take Hawthorn’s predicted score against to a little over 65 points per game which, if realised in actuality, would be the most miserly defence in some time.
Indeed to put it in perspective only three teams have averaged fewer than 65 points per game since the 1967 season.
|Season||Team||Ave points against|
Obviously this is easier imagined by my model than done in reality, but I think this gives a feel for how small changes in on-field actions adjust the bottom line.
So more or less I like looking at defence for teams using this model. It runs pretty close to actual results and makes good sense. However, it’s not great for the individual defence, and this is where my analysis becomes more complex and probably less reliable.
First of all, when doing big data set analysis of typical defensive indicators I found that they have little correlation with being scored against, or indeed the opposite correlation to the one you would expect. That is, if we look at, say, the number of spoils or rebound 50s as it relates to being scored against, we actually find that more rebound 50s happen in losing teams that do get scored against. What we would be hoping for is that more defensive stats correlate with restricting opposition scoring.
Even looking at those metrics relative to opposition inside 50s, which might account for the players who are doing the best or most defending relative to the amount of pressure they’re under, does not show a strong relationship.
It’s also worth mentioning that the defensive metric works in the inverse direction to our offensive factors. In other words, when a player does defensive things it reduces their cost on defence. What this means is we are basically saying that if every team gets scored against, every player is to varying degrees responsible for this, and as such their ‘defensive cost’ is better if it is lower.
What I found was that the things players do that correlate strongly with lower opposition scores are:
I felt there was some difficulty in just building out a model with these stats alone. Ultimately this would probably overrate contested possession midfielders as the best defenders on the ground. Although I think there’s something in that, the point of what this model is supposed to achieve is a reflection of defensive efficiency, primarily by those players who are mostly engaged with the job of actually defending.
In the end what I have tried to produce is a model with correlation and relevance so that it is predictive of the actual score, albeit not as strong as with the offensive model, but which also reflects legitimate defensive actions.
What I attempted to develop was:
Ultimately, however, I found that trying to use separate metrics to build a multivariable linear model similar to the one for offensive efficiency simply never gave a result that was reliable enough to keep me happy. What I’ve ended up with is a single metric combining the following stats, which I think still gets at the notion of ‘stopping’ and ‘releasing’ from defence:
All of this smushed together and then divided by the number of opposition entries inside 50 gave a very strong correlation to not being scored against. Indeed I was happy enough with the pattern this metric created with opposition scoring to build an exponential regression model as opposed to a linear one.
I might be stretching a tiny bit with the exponential relationship, but the linear model had about as much error as the exponential one and I like the idea that the model creates a bit more separation between points than a linear model does.
Ranking team defence with my new metric gives the following results.
|Team||K||D||Opp In50||T||R50||CP||MI5||1%||ITC||T50||Def metric||Predicted||Actual|
We see, for example, that Hawthorn and Richmond remain in the top three in terms of efficiency but Melbourne shoot to the first spot in the rankings and Geelong slide way down to ninth spot. In that case, although Geelong conceded only 70.7 points per game, this measure suggests their defensive efficiency is closer to 83.5 points per game.
Going back to the example of last year’s grand final, we can now apply individual defensive scores to each player and maybe establish a more well-rounded picture of performance on the day.
|Player||Team||K||HB||D||Def Metric||Pred Def||Pred Off||Net|
|Jordan de Goey||Coll||12||1||13||7.68||5.74||12.50||6.8|
As a test case I’m happy that this metric establishes a realistic but also an interesting explanation of efficiency and performance. To begin with, we can see that the least cost or best defensive players on the day were Luke Shuey, Tom Barrass and Jeremy McGovern. I think this fits with a subjective ‘eye-test’ and also makes some sense in respect to how we know the game is played and where defence comes from.
Especially pleasing from my point of view is that when I combined offensive efficiency with defensive efficiency to produce net results, a sensible albeit interesting picture starts to emerge in terms of performance. The best player in the game comes out as Luke Shuey. To be honest, had my results produced anyone other than Shuey as the top player, I would’ve felt that the model was fundamentally broken.
Following Shuey are Jordan de Goey and Josh Kennedy – not unreasonable estimates for second and third best but also interesting insofar as the two players touched the ball only 13 and 18 times respectively.
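The net figure appears to be the predicted offensive contribution minus the predicted defensive cost. That reading is inferred from Jordan de Goey’s row in the table above rather than stated as a formula, so treat the small sketch below as an assumption; `net_rating` is a hypothetical helper name.

```python
def net_rating(pred_off: float, pred_def: float) -> float:
    """Net points: predicted offence minus predicted defensive cost (assumed definition)."""
    return pred_off - pred_def


# Jordan de Goey's row from the table: Pred Def 5.74, Pred Off 12.50.
de_goey_net = net_rating(pred_off=12.50, pred_def=5.74)
print(round(de_goey_net, 1))  # 6.8, matching the Net column
```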
Another area I’d like to explore more is the concept of ‘threshold’ wins and losses. With more analysis we can get a measure for an average or replacement-level player. I still don’t have that, yet I can say that the average player in this game was worth -0.9 of a point. Given this, we can estimate a player’s impact on the end result by asking what the score would have been if that player were replaced with a game-average player. By that measure we can say that, in a game decided by five points, West Coast would have lost had any of the following players been replaced by a game-average player:
On the flip side, in a loss of five points the following Collingwood players can be deemed to have lost the game as replacing them with a game average player would have made a five-point difference or more:
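The threshold test described above can be sketched as follows. The -0.9 game average and the five-point margin come from this article; the function name and the example net ratings (other than de Goey’s 6.8) are hypothetical, and a drawn result is counted as a changed result, in line with the ‘five-point difference or more’ wording.

```python
GAME_AVERAGE_NET = -0.9  # net points of the average player in this game (from the article)


def result_changes(player_net: float, team_margin: float,
                   average_net: float = GAME_AVERAGE_NET) -> bool:
    """Would swapping this player for a game-average player erase the result?

    team_margin is positive for the winning side and negative for the losing side.
    """
    new_margin = team_margin + (average_net - player_net)
    if team_margin > 0:
        return new_margin <= 0  # a win turns into a draw or a loss
    return new_margin >= 0      # a loss turns into a draw or a win


# A winner by 5 with a hypothetical net of +4.5: the swap costs 5.4 points.
print(result_changes(4.5, team_margin=5))    # True
# Jordan de Goey (net 6.8) on the side that lost by 5: the swap only makes it worse.
print(result_changes(6.8, team_margin=-5))   # False
# A hypothetical player on the losing side with a net of -6.5: the swap gains 5.6 points.
print(result_changes(-6.5, team_margin=-5))  # True
```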
Having gone almost 2000 words, I’ll leave it to you to discuss the results further. My guess is that many of you will have taken issue with how Cox rates in terms of net impact, but I implore you to go to FootyWire and check out his stat line in relation to the stuff I have talked about in this article and my previous one. I’m ready to defend Mason in the comments section in any case.
Also, I have only used examples to illustrate methodology, not to preference any specific team or player. There are lots of things that would be worthy of investigation in respect to this analysis, so go ahead and suggest anything you’d like to look at in greater depth.