A math geek's guide to the true AFL ladder

By Gordon P Smith / Roar Guru

We all have opinions about footy. That’s why we follow sports; that’s why sports exist. The only real reason we allow so much money to be poured into sports in our society is that we enjoy wrestling with it, mentally and emotionally.

Why is my team better than your team? Who’s the best player? Why is player X better, more important, or improving faster than player Y? Which team is most likely to win the flag? Make the finals? Finish last? Improve the most?

How can I statistically justify the love for my chosen team, player, coach (or mascot)?

It gives us pleasure to cheer our team on. But it also gives us pleasure to be right about our beliefs!

For some of us, trying to quantify that process gives us even more pleasure – call us math geeks, we won’t complain.

So, for you math geeks and spreadsheet fanatics out there, we’re going to quantify the most important question we can ask about our sport of preference – where does your team rank among the 18 AFL clubs right now?

What are their chances for the rest of the season?

I’ve been writing about this topic using my patented ELO-Following Football rating system in this publication for two years now. The system’s been in place for several more years than that, long enough to work most of the kinks out.

But it’s not the only such rating system out there. Today, we’re going to combine four of the most mathematically justifiable systems available online: besides mine, we’ll use The Arc, FMI, and The Wooden Finger.

There are other sites I like which use pure statistical analyses to evaluate teams, like HPN or A Matter of Stats. However, these four each provide a single number to describe a team, which makes today’s exercise much easier.

So, here are the ratings in each system for each of the 18 teams in the AFL as of Round 18. Fair warning: those of you who think math is evil, skip on down below the two spreadsheets and let us math geeks have our fun.

Club ELO The Arc Wooden Finger FMI
Adelaide 51.7 1531 -0.9 1168
Brisbane 52.0 1431 -5.1 976
Carlton 7.9 1276 -40.7 755
Collingwood 64.6 1581 15.1 1217
Essendon 54.2 1539 3.5 1100
Fremantle 32.8 1398 -17.1 927
Geelong 63.4 1592 15.8 1274
Gold Coast 18.1 1326 -34.0 728
GWS Giants 63.9 1561 12.5 1210
Hawthorn 53.8 1523 8.7 1162
Melbourne 68.0 1590 18.9 1252
North Melbourne 45.1 1487 0.9 1095
Port Adelaide 53.8 1535 5.0 1194
Richmond 80.6 1671 29.3 1385
St Kilda 35.9 1415 -11.5 943
Sydney 56.1 1554 6.9 1292
West Coast 65.0 1605 15.6 1273
Western Bulldogs 27.6 1385 -22.9 918

Comparing these four disparate systems directly is difficult. But there’s something each system has in common: a centre point around which each set of numbers revolves.

For the ELO-FF, the eighteen ratings always add up to 900, and the average rating is always 50 because whatever is added to one team’s score is subtracted from their opponent’s rating (that’s what makes it an ELO system).

For The Arc, Matt Cowgill uses 1500 as his centre point, and when Gold Coast gained 41 points with its upset victory Saturday, Sydney lost those same 41 points on their Arc score.
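For the spreadsheet crowd, here’s a minimal Python sketch of that zero-sum mechanic. The ratings in the example simply reuse the Round 18 Arc figures for Gold Coast and Sydney for illustration (the actual pre-game numbers would have differed), and the k factor and expected-share curve are placeholders of my own, not the real Arc or ELO-FF maths, so the shift here won’t reproduce the real 41 points. The point is only that whatever one team gains, its opponent loses, so the league total never moves.

```python
# A minimal, illustrative zero-sum rating update (not the real ELO-FF or Arc formulas).
def expected_share(rating_a, rating_b, scale=400):
    """Placeholder expected result for team A on a 0-1 scale, chess-Elo style."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))

def update(rating_a, rating_b, result_a, k=40):
    """Shift points from loser to winner; result_a is 1 for a win, 0 for a loss, 0.5 for a draw."""
    shift = k * (result_a - expected_share(rating_a, rating_b))
    return rating_a + shift, rating_b - shift  # one team's gain is exactly the other's loss

# Example: an upset by a team rated 1326 over one rated 1554 (Arc-scale numbers).
low, high = update(1326, 1554, 1)
print(round(low, 1), round(high, 1))                   # ratings move by equal and opposite amounts
print(round(low + high, 1) == round(1326 + 1554, 1))   # the league total is conserved -> True
```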

The other two systems also balance the total ratings of the eighteen teams, but over a longer period of time – generally the last 22 games.

So, the average of the teams’ ratings stays about the same throughout the year: zero for Wooden Finger, and about 1100 for FMI.

While Gold Coast went up 7.5 points last week on the Finger’s rating system, Sydney’s went down slightly more than that (7.6 points). That’s because of idiosyncrasies in the ups and downs across each team’s previous 22 games. But that tenth of a point will balance out over the course of the round and the season.

So, if we re-examine each rating system using some basic statistical analysis, we can take those centre points, work out the standard deviation of each system’s ratings, and compare the systems on that common footing.

A quick explanation for those non-math geeks who snuck into the conversation: standard deviation is a standard yardstick for how spread out a set of numbers is, which lets us say how dramatically any one rating deviates from the mean.

In general, with a large enough and roughly bell-shaped set of numbers, about 68% of them should fall within one standard deviation of the average, and just over 95% within two standard deviations on either side.

In our ELO-FF system, that deviation is about 18.8 points, so we’re describing all the ratings between 31.2 (50–18.8) and 68.8 (50+18.8).

Right now that’s 14 of the 18 teams falling in that range. Earlier in the season, it was as low as 12 clubs. 68% of 18 is a bit over 12 teams, so we’re in the right ballpark.
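If you’d like to verify that for yourself, here’s a quick sketch using the ELO-FF column from the first table. I’m using the sample standard deviation; depending on the exact convention, the figure can land a few tenths either side of 18.8, and a borderline team like Melbourne can slip just inside or outside the one-deviation band.

```python
from statistics import mean, stdev

# ELO-FF ratings from the first table (as of Round 18).
elo_ff = [51.7, 52.0, 7.9, 64.6, 54.2, 32.8, 63.4, 18.1, 63.9,
          53.8, 68.0, 45.1, 53.8, 80.6, 35.9, 56.1, 65.0, 27.6]

avg = mean(elo_ff)   # lands close to the system's nominal centre of 50
sd = stdev(elo_ff)   # sample standard deviation of the 18 ratings

within_one_sd = sum(1 for r in elo_ff if avg - sd <= r <= avg + sd)
print(f"mean {avg:.1f}, std dev {sd:.1f}, "
      f"{within_one_sd} of {len(elo_ff)} teams within one deviation of the mean")
```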

On the chart below, I’ve translated all those numbers from the first list into standard deviations, so we can compare them across the four rating systems. Zero means the team is exactly on the average rating. Positive numbers indicate ratings above the mean and negative indicate teams below the mean.

Club ELO The Arc Wooden Finger FMI Average Ranking
A cut above the rest
Richmond 1.672 1.589 1.518 1.491 1.568 1
Legitimate competition
Melbourne 0.984 0.836 0.979 0.787 0.896 2
West Coast 0.820 0.976 0.808 0.898 0.875 3
Geelong 0.732 0.855 0.819 0.903 0.827 4
Collingwood 0.798 0.753 0.782 0.601 0.734 5
Semi-legitimate competition
GWS Giants 0.760 0.567 0.648 0.564 0.635 6
Sydney 0.333 0.502 0.358 0.998 0.548 7
Splashing about at sea level
Port Adelaide 0.208 0.325 0.259 0.479 0.318 8
Hawthorn 0.208 0.214 0.451 0.310 0.296 9
Essendon 0.230 0.362 0.181 -0.019 0.189 10
Adelaide 0.093 0.288 -0.047 0.342 0.169 11
North Melbourne -0.268 -0.121 0.047 -0.045 -0.097 12
Below average, but respectably so
Brisbane 0.109 -0.641 -0.264 -0.675 -0.368 13
St Kilda -0.770 -0.790 -0.596 -0.850 -0.752 14
Fremantle -0.940 -0.948 -0.886 -0.935 -0.927 15
Western Bulldogs -1.224 -1.069 -1.187 -0.983 -1.115 16
Witness protection program
Gold Coast -1.743 -1.617 -1.762 -1.989 -1.778 17
Carlton -2.301 -2.082 -2.109 -1.846 -2.084 18
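For the spreadsheet fanatics who want to rebuild that table, the recipe is simple: convert each system’s rating into ‘standard deviations from that system’s mean’, then average the four numbers for each club. Below is a minimal Python sketch of that process (my own, not the code behind any of the four systems), using the Round 18 figures from the first table. The exact decimals won’t match the table above perfectly, since each system’s true centre point and deviation differ slightly from what one snapshot of raw ratings implies, but the ordering and the groupings should come out much the same.

```python
from statistics import mean, stdev

# Round 18 ratings from the first table: (ELO-FF, The Arc, Wooden Finger, FMI).
ratings = {
    "Adelaide": (51.7, 1531, -0.9, 1168),     "Brisbane": (52.0, 1431, -5.1, 976),
    "Carlton": (7.9, 1276, -40.7, 755),       "Collingwood": (64.6, 1581, 15.1, 1217),
    "Essendon": (54.2, 1539, 3.5, 1100),      "Fremantle": (32.8, 1398, -17.1, 927),
    "Geelong": (63.4, 1592, 15.8, 1274),      "Gold Coast": (18.1, 1326, -34.0, 728),
    "GWS Giants": (63.9, 1561, 12.5, 1210),   "Hawthorn": (53.8, 1523, 8.7, 1162),
    "Melbourne": (68.0, 1590, 18.9, 1252),    "North Melbourne": (45.1, 1487, 0.9, 1095),
    "Port Adelaide": (53.8, 1535, 5.0, 1194), "Richmond": (80.6, 1671, 29.3, 1385),
    "St Kilda": (35.9, 1415, -11.5, 943),     "Sydney": (56.1, 1554, 6.9, 1292),
    "West Coast": (65.0, 1605, 15.6, 1273),   "Western Bulldogs": (27.6, 1385, -22.9, 918),
}

# Mean and spread of each rating system, taken column by column.
columns = list(zip(*ratings.values()))
centres = [mean(col) for col in columns]
spreads = [stdev(col) for col in columns]

# Each club's average number of deviations above or below the mean across the four systems.
combined = {
    club: mean((value - centre) / spread
               for value, centre, spread in zip(values, centres, spreads))
    for club, values in ratings.items()
}

for rank, (club, score) in enumerate(sorted(combined.items(), key=lambda kv: -kv[1]), start=1):
    print(f"{rank:2d}. {club:<17} {score:+.3f}")
```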

I also took the liberty of shuffling the order to demonstrate how the teams have stratified this season.

While the numbers slide around a bit from round to round (Brisbane is slowly moving from group four up to group three, while the Giants made a significant jump over the last two or three weeks as well), we can see some definite separations worthy of note.

Richmond is in a class of their own right now. Their lead stands out like a sore thumb when you see it like this – no other team is a full deviation above the average anywhere, but the Tigers are about one and a half deviations up in every system.

Right now, no matter how you measure it, Richmond is a couple of cuts above the rest.

Remember, though, the only thing our four systems measure is current performance. If Dustin Martin and Alex Rance get injured, everything could change in a heartbeat.

Also, these systems only compare apples to apples. They say nothing about how 2018 Richmond would do against, say, the 2000 Bombers, the 2009 Cats, the 2013 Hawks or the 1929 Magpies.

The next group, the competitors with a real chance to overtake the leaders, is composed of at least three and perhaps as many as six other teams.

The Demons, Eagles, and Cats are all demonstrably above average in every system, and to be close to a full deviation above average is to stand out against all (ordinary) competition.

Collingwood is part of that group, although the FMI isn’t so sure, and it has legitimate statistical reasons for its doubt.

Sydney was unquestionably in that group until they chose to take three quarters off on Saturday, whereas the Giants have been steadily moving up from the bottom of the lower group back to where we expected them to be all along.

Then we can see this collection of five teams struggling to stay at or above average. Competitively, they’re still trying to make finals, even if they know they have fatal flaws that by all rights should doom them to a quick out.

Thank you, 2016 Bulldogs, for being the patron saint of this group, giving them a reason for hope from now on.

All five were above the mean last week; the Kangaroos dropped just below after their demolition by Collingwood.

Every one of them is worthy of playing in finals as each one is a demonstrably good team, at least average among the 18 best in the country. But compared to the seven teams ahead of them? Yeah, they might win a game or three, but that’s not the way we’d tip.

Then there’s the group Brisbane’s working hard to advance out of: the teams that the top twelve figure they should have routine wins over each time out.

Notice something important: the deviations for these teams (ignoring the Lions) are even larger than those of the top non-Richmond teams, meaning that the Dockers and Bulldogs are ‘more bad’ than teams like Geelong and West Coast are ‘good’.

Freo almost certainly won’t be around in September.

The gap between these six teams and the top twelve is so large that the weak literally balance twice as many good teams on the other side.

Detour number two for non-math geeks: think of these deviation numbers as a way to balance the eighteen teams on a seesaw. The bigger the number, ignoring the negative sign, the ‘heavier’ the team’s rating is.

Essentially, there are six fat teams on the negative side balancing Richmond and ten much lighter teams on the positive side, with the Kangaroos sitting almost exactly on the pivot.
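You can check the seesaw yourself from the Average column above (or from the `combined` scores in the earlier sketch): the positive and negative sides add up to nearly equal and opposite totals.

```python
# Reusing the `combined` dictionary of average deviations from the earlier sketch.
positive = sum(score for score in combined.values() if score > 0)
negative = sum(score for score in combined.values() if score < 0)
print(round(positive, 2), round(negative, 2))  # the two sides of the seesaw balance out
```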

Then, there are two really, really fat teams on the bottom of the ladder, despite one dramatic win to the contrary.

Before their Round 17 victory, Gold Coast were also two full deviations below the average rating, as Carlton still is.

One look at Gold Coast’s and Carlton’s numbers on this chart tells you why there’s such consternation about their viability moving forward.

What those two teams need, and what they’re going through, deserves an entire column of its own.

Suffice it to say for now that two clubs existing in the AFL a full two deviations below even the average team means neither would be expected to win against anyone except each other. (That said, I’ll take Gold Coast by less than the twenty-point margin this weekend.)

The chances of them winning against any other opponent are always going to be one-in-ten or worse. Over their previous five games, each of these two teams had just one occasion when the oddsmakers and public tipsters gave them better than that ten per cent chance of winning.

That’s not healthy for the players who have dedicated themselves to achieving the level of personal excellence necessary to play AFL-quality football, nor is it healthy for the teams these bereft clubs play, nor is it beneficial for the barrackers of either squad.

It’s also unhealthy, strangely, for the opposition fans, for whom it’s a lose-lose proposition. Win, and you were supposed to win; lose, and you’re Sydney this week.

A league is healthiest when every team can go into the season dreaming of greatness.

The AFL is in pretty good shape in 2018. Twelve of its eighteen teams can still realistically say they’ve got a legitimate shot at finals and maybe a bit more (thirteen if you’re reading this on the west coast).

Sixteen of its teams can project out past 2018 and see success on the horizon if their vision’s good enough. (You know, if they have 2020 vision).

And as for Carlton and Gold Coast? Well, they can always hope for help from above.

Specifically, Gil McLachlan’s office.

The Crowd Says:

2018-07-26T13:39:02+00:00

Doctor Rotcod

Guest


Although it relates to Collingwood, in terms of a healthy %, the Eagles and Essendon, in the eight by a gnat's whisker over Melbourne last year, have gone in opposite directions off a lowish % and a 12/10 finish.

2018-07-26T04:21:38+00:00

Matt H

Roar Guru


Great article, well done.

2018-07-26T01:28:19+00:00

dontknowmuchaboutfootball

Guest


thanks, Gordon. Appreciate the response.

2018-07-25T12:30:09+00:00

Doctor Rotcod

Guest


Gordon, I noticed that quite a few of these predictive blogs have carryover points from last season and while there is some merit in this, I think using the HPN idea of Player Approximate Value allows for retirees, decline from career-best year, i.e. Martin, means that each game is judged on the players who are actually available and their current form trend for the game that is about to be played. The carryover points have skewed Richmond's metrics for this year and I don't think that they're a lock for the flag, despite the $2.35 on offer.

2018-07-25T09:17:16+00:00

Fat Toad

Roar Rookie


GPS, thanks for the considered reply(s). I thought your observation about the percentage was an interesting one and sort of brings together some thoughts that I have previously had in relation to lower sides with disproportionately high percentages. I think it seems to be a feature of clubs that are maturing their list and getting their systems right that they have good quarters rather than good games. Interestingly, I think that Collingwood started this process about three or four years ago, so in some respects it is overdue. (However, I think that it took Richmond about the same amount of time.)

You are probably already aware of some of the tricks of using lumpy results to get good outcomes. The wins and losses over the last five games on the AFL site's ladder are a sort of rolling average for the eyeball on how a team is tracking. I have also been thinking about things like using winning margins, but because of their high variance in relation to the mean, possibly transforming the figure to a log value. I have also said somewhere else here that, rather than looking at teams beaten above and below on the ladder (which combines a snapshot of current ladder position with out-of-date form), I am always bemused that people say team X has not beaten anyone ranked above them, hence are not that good. For example, what would this say about Richmond, or Carlton?

A better indicator might be to allocate a difficulty score based on the premiership points difference of teams on the day the game was played. This would account for the problems of beating a team three positions above, but which had won the same number of games, with the ladder position reflecting only their respective percentage. I have come to thinking that the idea of ranking a win and a rolling average of ranking points for the last five weeks might be of some value. I think that it might end up a little like the player rankings in tennis, where a win over a higher ranked player is of greater benefit than a win over a similarly ranked player. While this sounds difficult, it is the kind of thing that spreadsheets are made for and would be easy to automate.

Thanks for your time and sharing your stuff with everyone. I love it!! (you probably already guessed this from my comments B-) )

2018-07-25T08:43:47+00:00

Doran Smith

Roar Guru


Great article, most open season in years, injuries could play a part in deciding who the premiers are.

2018-07-25T08:14:16+00:00

Doctor Rotcod

Guest


I noticed the other day that the bookies for the first time this season have the Eagles on the second line of betting. Certainly Richmond are the team to beat but they are a beatable team and as you say a Rance or Martin injury might make it more likely.

2018-07-25T08:13:29+00:00

tibor nagy (big four sticks)

Guest


Great article. You are right, Richmond are a cut above the rest. Gold Coast will beat Carlton by triple figures. Carlton don't even want to win.

2018-07-25T05:05:59+00:00

Peter

Guest


Carlton actually had one of their better games against Sydney in Sydney (Rnd 11). In front at half time, they lost the third qtr but broke even in the 4th. The game was there to be won by the Blues. But yes, Carlton are rubbish.

AUTHOR

2018-07-25T04:36:49+00:00

Gordon P Smith

Roar Guru


To be utterly frank with you, I have on several occasions tried to refine this system by accommodating top players in or out using a variety of valuations (player ratings, fantasy values, you name it), but it was so time intensive and (amazingly) had such a minimal result that it wasn't worth the effort. Having just retired, it's possible that I'll take another shot at it, but my results have been good enough without doing so that, with just a bit of common sense added (GWS got some players back, so hedge your tips in that direction), you can be pretty accurate with your picks.

AUTHOR

2018-07-25T04:26:54+00:00

Gordon P Smith

Roar Guru


I chose these four because every one of them uses only the final score and treats the team as a whole - no breaking apart offense and defense, no player injuries accounted for, just the raw end results. What differentiates the results of the four systems is how they weigh immediate scores compared to older results. That basically decides how quickly the rating responds to one good or bad game - another way of saying how likely the system thinks an aberrant result is part of a trend (like the positive outcomes for the Magpies after about round three or so) or a one-off that should be taken with a grain of salt (think last year's Saints rout of the Tigers in R17).

I don't have a short answer as to what can be gleaned from these results that the ladder itself won't answer, although the predictive accuracy of all four systems significantly exceeds picking by ladder or percentage position. But here's a tidbit that you CAN glean from the ladder that most people don't realize - if there's a team at season's end whose percentage is much higher than their win/loss record would indicate, bet on them making a BIG jump next season. Collingwood was that team last year. (It doesn't consistently work the other way, BTW.)

AUTHOR

2018-07-25T04:13:01+00:00

Gordon P Smith

Roar Guru


FT, the quick answer is that doing the variances produced close enough to the same results as the much simpler (and less mathematically justifiable, you're right) averaging of the four sigmas that for a mainstream site like this it made much more sense to go the simple route for public consumption. As for your second question, I didn't "allow" for the differences - I felt that those differences highlighted the different ways the four systems analyzed results. The "lumpy nature" of the draw is not completely immaterial, and both WF and FMI try to account for it more than the Arc and FF do, but the nature of all four systems is to compare expected results rather than raw scores. If the Giants only defeat the Saints by, say, twenty points, then they'll drop in any of our systems, whereas if the Crows beat Melbourne by twenty, they'll move up. So it's almost immaterial which teams play whom. (Now, accounting for playing hot teams is nearly impossible...)

2018-07-25T03:12:40+00:00

Dutski

Roar Guru


Great article. Cheers! Always good to see someone having fun with numbers.

2018-07-25T02:22:26+00:00

Wayne Kerr

Guest


Our form is absolutely pathetic. Brown is totally right to criticize Marchbank. In fact he should criticize the bulk of the team. They have no endeavor. I think we are far worse than the Gold Coast. There is no way we would beat Sydney in Sydney. I see little hope for us going forward, to be honest. This is the worst Carlton team I have ever seen. They are an embarrassment to the club and to the rest of the competition. The club culture stinks.

2018-07-25T02:16:09+00:00

Fat Toad

Roar Rookie


One of the dilemmas facing predictive analytics is that they tend to use the same inputs, so similar outputs are expected. So they are all correct together or all wrong together; it is the great minds think alike or fools seldom differ conundrum. The real thing that separates them is the level of insight of the individual author of the algorithm over the others. Although you can use existing data to prove the algorithm, it can become data checking itself. For there to be any value you would probably need to go back and test the algorithm week by week for a number of years. Now that would be a great little challenge for all the Visual Basic buffs and Excel fans out there!

2018-07-25T02:13:25+00:00

Raj

Guest


Well done! Interesting perspective on the teams. It really emphasises how remarkable it was for Gold Coast to beat Sydney, after 5 goals down as the away team! Keep this ladder going... can we see Brisbane jump up into the next group?! Rising from the ashes. The Western Bulldogs are forever going to be that team, that one time a team leaped to the ultimate glory despite all odds haha Of particular interest is Melbourne. I actually don’t think they will make the finals but they are ranked number 2! If they lose against Adelaide you can put a line through them.

2018-07-25T02:04:58+00:00

Fat Toad

Roar Rookie


GPS, thanks for the work you have put into this. I always love to see the numbers. I am not exactly sure how you have arrived at some of the results, but I think you have averaged the standard deviations of the four statistics. It probably doesn't make much difference, but I recall that it is not accurate to average standard deviations. I think the correct process is to average the variances of the population and then take the square root to get the average Std Dev. In doing the work, how did you allow for recent results being of greater predictive value than old results? Also, how did you cater for the difference in the quality of opponents due to the lumpy nature of the draw?

2018-07-25T02:01:51+00:00

dontknowmuchaboutfootball

Guest


Thanks, Gordon. Always enjoy reading your columns, even though I'm not a maths geek. I've got a question for you, though – or maybe a challenge – about underlying assumptions, and more about the systems, I guess, than about the take-home message.

Decisions about what sort of data to analyse or factor into an analysis will obviously affect results. But to the extent that statistical analyses at least aspire to provide a representation or assessment of an objective reality, i.e. something that isn't simply an invention of the analysis itself (here: "underlying strength"), I take it as given that the expectation or hope is that different statistical systems will produce more or less the same findings. And this would be even more the case when those systems share key aspects of methodology: e.g. the use of a centre point. So my question is: why would or should we be surprised if it turns out that the systems largely confirm each other's results? From a quick skim, it looks like the systems rank the teams fairly similarly, and broadly with the same degrees of relative difference. And – again, broadly speaking – this ranking corresponds to the percentage ladder.

I appreciate that part of the point here is to see how far the actual ladder, i.e. points then percentage ladder, reflects a team's demonstrated "strength" (or potential). But to the extent that one can get similar results simply by looking at the percentage ladder, I'm wondering what these systems tell us that we can't already glean from the statistical system that is the bog standard ladder. Or to put the question another way: what are the (surprising) differences between the results produced by each system? To the extent that results vary according to decisions made about data and methodology, these differences obviously correspond to the specific methodology of the system, so what are the significant methodological differences in respect of these differences in results? Further – and this is the really interesting question for me – to the extent that each system aspires to measure an objective reality ("underlying strength"), what do these differences in results say about the specific quality of each team's underlying strengths (in the plural)? Sorry, that's probably a convoluted way to ask the question, and definitely goes to an issue which it isn't the point of this column to address.

2018-07-24T22:31:09+00:00

Fairsuckofthesav

Guest


Great article. Terrific amount of work here. Be interesting to see what this looks like starting from the bye. E.g. Crows have won 3 out of 4 since their bye. And/or could you factor in how many players from the top ten of a club's best and fairest are either in the team or missing and how that impacts...

Read more at The Roar