The Roar

PyEx: A statistic you can sink your teeth into

26th March, 2014
North Melbourne are 2014’s consensus ‘surge up the ladder’ team. Most pundits have them in the top four, and some were even so bold as to say they’ll go all the way and take the title ahead of Fremantle (who are my pick).

Last week’s insipid display may cause some to waver, although the Bombers do seem to find something when their club is in the headlines in the lead-up to the real show (and, yes, that was a play on words).

The Roos finished up last year in 10th place, with a 10-12 record and a very strong percentage of 119.5 per cent (good enough for sixth overall). In kicking 342.255 on the season (that is, goals and behinds), the Roos kicked more maximums than anyone except Geelong and Hawthorn, while their accuracy trailed only the Dockers.

Defensively, they were merely middling, but in a competition where just under half the teams make the finals, that’s good enough.

North were in 10 matches where the final margin was less than two goals – what I consider a ‘close’ game from an analytical perspective – and won just three of them. And they had a ‘contender’s’ off season, insofar as they brought in the specific players and coaches to target deficiencies.

So it’s no surprise they’re the sweethearts of this year’s footy season.

Well, I hate to burst the AFL world’s bubble, but the numbers say North finished pretty much exactly where they should have last year. I don’t mean the numbers I’ve presented above, nor the numbers of revisionist commentators who said so after the fact. I’m talking about a statistic I’d like to introduce you to: Pythagorean Expected Wins (let’s call it PyEx; it can’t be a stat unless it’s got a cool abbreviation).

You’re probably thinking: I know this Pythagoras thingy from somewhere! It’s the theorem we all learnt in high school as the way to calculate the length of one side of a right-angled triangle. In a sports statistics sense, it was first championed by the Godfather of sports analytics, Bill James.

If you’re interested in more information about the theory, shoot over to the Wikipedia page (which is, surprisingly, pretty good).

Essentially what it boils down to is that a team’s winning record should be a function of its offensive potency and defensive abilities, and that the two of these are, statistically, unrelated (that is, a team’s offense is independent of its defence).

It was first used in baseball, where it’s quite obvious the two are distinct. It’s since spread to a range of other American sports, and is able to accurately predict the wins a team should have throughout a season.

PyEx’s heaviest use, in baseball in particular, is to test the role of luck and/or outside factors in a team’s wins and losses. The theory behind this is fairly sound: if a team’s quality is measured by its offense and defence (PyEx), then this should be reflected in its overall wins and losses. Any deviation from this is caused by in-game or situational factors largely beyond the team’s control; a crucial free kick, for example, may lead to a close loss.

It has some issues, for sure. To be fully accurate, it needs a big sample: it’s incredibly effective in a 160+ game baseball slugfest, and less so in a 16-game gridiron grind.

For the AFL, there’s an argument that the offensive and defensive elements of the game aren’t independent – which is something I’d concede. However, as you’ll see towards the end of the article, PyEx is very, very accurate when a couple of tweaks are applied.
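For readers who want to see the arithmetic, here is a minimal sketch of the general Pythagorean calculation. The function name `pyex_wins` and the exponent are assumptions: the article does not say which exponent it uses, and the value 4.5 below is only an illustrative guess.

```python
def pyex_wins(points_for: float, points_against: float,
              games: int, exponent: float) -> float:
    """Expected wins from the generic Pythagorean formula.

    win share = PF^x / (PF^x + PA^x), scaled by games played.
    The exponent x is sport-specific and is NOT given in the article.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


# Only the ratio PF/PA matters, so an AFL percentage can be fed in directly:
# Hawthorn's 135.7% is the same as 135.7 points for per 100 against.
# exponent=4.5 is an illustrative guess, not the article's stated value.
hawthorn = pyex_wins(135.7, 100.0, games=22, exponent=4.5)
print(round(hawthorn, 1))
```

With that guessed exponent the result should land close to the PyEx figure listed for Hawthorn in the first table, but a different exponent may well have been used, so treat the match as suggestive rather than confirmed.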

But enough talk: what does the straight-up PyEx formula say about the 2013 AFL home-and-away season?

(Note, if you just want to see the final results, skip ahead to the last table of figures.)

| Team | Wins | % | Ladder Rank | PyEx Wins | PyEx Rank |
| --- | --- | --- | --- | --- | --- |
| Hawthorn | 19 | 135.7% | 1 | 17.6 | 1 |
| Geelong | 18 | 135.6% | 2 | 17.6 | 2 |
| Fremantle | 16.5 | 134.1% | 3 | 17.1 | 3 |
| Sydney | 15.5 | 132.5% | 4 | 17.1 | 4 |
| Richmond | 15 | 122.8% | 5 | 15.7 | 5 |
| Collingwood | 14 | 115.0% | 6 | 14.3 | 7 |
| Essendon* | 14 | 107.3% | 7 | 12.7 | 9 |
| Port Adelaide | 12 | 102.4% | 8 | 11.6 | 11 |
| Carlton | 11 | 106.7% | 9 | 12.6 | 10 |
| North Melbourne | 10 | 119.5% | 10 | 15.2 | 6 |
| Adelaide | 10 | 108.1% | 11 | 12.9 | 8 |
| Brisbane Lions | 10 | 89.6% | 12 | 8.4 | 14 |
| West Coast | 9 | 95.3% | 13 | 9.8 | 12 |
| Gold Coast | 8 | 91.7% | 14 | 8.9 | 13 |
| Western Bulldogs | 8 | 85.1% | 15 | 7.2 | 15 |
| St Kilda | 5 | 82.6% | 16 | 6.6 | 16 |
| Melbourne | 2 | 54.1% | 17 | 1.3 | 17 |
| Greater Western Sydney | 1 | 51.0% | 18 | 0.9 | 18 |

*Note I’ve chosen to place Essendon in their W/L position on the ladder, as we’re not testing the effect of executive orders on ladder position.

So PyEx gets it pretty well on the money from a rankings point of view. The top five teams according to PyEx ended up in the top five positions on the ladder, and it got the bottom four too – both in order, no less.

But in the middle things get a bit dicey. PyEx says North were the sixth-best team in the league, despite finishing 10th on the ladder, while Port Adelaide (even with Ken Hinkley) were only the 11th-best team, despite coming eighth on the W/L ladder.

You’ll also notice the wins and PyEx wins are quite different for a number of teams, particularly at the very pointy end of the ladder (Hawthorn miss out on 1.4 wins), but also at the bottom (St Kilda gain 1.6 wins) and, well, the middle too, with North ‘gaining’ more than five wins using PyEx.

What’s the issue? Well, we want the formula to give a true account of a team’s luck (good or bad) over the course of a season, so we can enter the following season with some guide as to which teams may perform better (or worse) based on some external elements, like a more favourable draw, more luck in the close games, or the impact of changes in personnel. So, some tweaks are required.

My investigations led to two conclusions: PyEx in the AFL doesn’t appropriately take into account the role of defence in the creation of scoring shots (and so overvalues offensive potency), nor does it account for the role close losses play in determining the number of wins a side has over the year.

Let’s mix these ingredients into PyEx and see what we get.

Close losses
As I foreshadowed earlier, close losses aren’t really taken into account in PyEx. But when you think about it, they can’t be. All PyEx is doing is trying to put a value on a team’s offensive and defensive potency; it won’t look at circumstances where a team wins or loses by a small margin.

Take North, who were involved in 10 games – 10 of their 22 – where the margin was two goals or less. That’s a phenomenal number; I might check this later on, but I’d hazard a guess it’s the most in any AFL season, and probably VFL too (although this is less likely as the game was lower scoring in the past).

There were 45 games decided by less than two goals over the whole home-and-away season; 44 of them were won, and there was one draw (between Fremantle and Sydney). How often should a side win a close game? A fair starting point is to assume that, over the long term, a side will win 50 per cent of games it plays in that are decided by two goals or less.

You could argue that some teams win more, consistently, but I’d almost guarantee that this stat will regress to the mean over the long run.
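The 50 per cent assumption turns into a very small calculation: close wins minus half the close games played. The sketch below uses a hypothetical helper name, `close_win_delta`, and treats a draw as half a win, which is an assumption inferred from the fractional win totals in the figures that follow.

```python
def close_win_delta(played: int, won: float, expected_rate: float = 0.5) -> float:
    """Close wins above (+) or below (-) what a 50 per cent split would give.

    A draw is counted as half a win here, so `won` may be fractional.
    """
    return won - expected_rate * played


# Figures from the article: North won 3 of 10 close games,
# Brisbane won 5 of 7, and Fremantle's 4 close games include one draw.
print(close_win_delta(10, 3))    # -2.0
print(close_win_delta(7, 5))     # 1.5
print(close_win_delta(4, 2.5))   # 0.5
```

Keeping the expected rate as a parameter makes it easy to test the claim that some sides consistently win more than half of their close games.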

Right, so if a side can expect to win 50 per cent of its close games, how did each side fare last year?

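| Team | Played | Won | % Won | +/- Close Wins |
| --- | --- | --- | --- | --- |
| Brisbane Lions | 7 | 5 | 71% | +1.5 |
| Essendon | 5 | 4 | 80% | +1.5 |
| Port Adelaide | 8 | 5 | 63% | +1.0 |
| Fremantle | 4 | 2.5 | 63% | +0.5 |
| Hawthorn | 5 | 3 | 60% | +0.5 |
| Melbourne | 1 | 1 | 100% | +0.5 |
| West Coast | 5 | 3 | 60% | +0.5 |
| Western Bulldogs | 5 | 3 | 60% | +0.5 |
| Geelong | 8 | 4 | 50% | 0.0 |
| Richmond | 4 | 2 | 50% | 0.0 |
| Carlton | 7 | 3 | 43% | -0.5 |
| Collingwood | 3 | 1 | 33% | -0.5 |
| Gold Coast | 3 | 1 | 33% | -0.5 |
| Greater Western Sydney | 1 | 0 | 0% | -0.5 |
| Sydney | 2 | 0.5 | 25% | -0.5 |
| Adelaide | 8 | 3 | 38% | -1.0 |
| St Kilda | 4 | 1 | 25% | -1.0 |
| North Melbourne | 10 | 3 | 30% | -2.0 |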

North Melbourne won two fewer close games than we would expect last year but, as you can see, its winning percentage of 30 per cent isn’t the worst in the league.

St Kilda were involved in four close games, and won just one of them – with our analysis saying they could expect to have won at least one more.

The top end of this ladder is also quite intriguing. The Bombers were involved in five close games last year, and managed to snag four of them, earning them 1.5 more wins than we would expect over the long run. How about the Lions though! Involved in seven close games, and they managed to edge over the line in five of them – again, good enough for 1.5 more wins than we would generally expect.

Let’s now add these figures to PyEx, and see where we get. Note, I’ll be adding each team’s +/- close wins to its PyEx figure, as we’re trying to calibrate the reasons as to why PyEx overstates or understates a team’s wins. As we’ve said, PyEx, theoretically, assumes that a team will win 50 per cent of its close games, so a side that won fewer than half of them (a negative +/-) has its PyEx wins marked down, and a side that won more has them marked up.

Or, to put it another way, we’re folding each team’s luck in close games into its PyEx wins, because we’re then in a way tempering PyEx’s cold statistical analysis with the colour and excitement of the actual competition.
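As a worked example of the adjustment, the sketch below adds the close-game +/- to the straight PyEx wins for three teams, using figures from the article's tables. The function and variable names are illustrative only.

```python
# Straight PyEx wins (first table) and +/- close wins (close-games table)
# for a few teams, copied from the article.
straight_pyex = {"North Melbourne": 15.2, "Essendon": 12.7, "Sydney": 17.1}
close_delta = {"North Melbourne": -2.0, "Essendon": 1.5, "Sydney": -0.5}


def adjust_for_close_games(pyex, delta):
    """Add each team's close-game +/- to its PyEx wins."""
    return {team: round(pyex[team] + delta[team], 1) for team in pyex}


adjusted = adjust_for_close_games(straight_pyex, close_delta)
for team, wins in adjusted.items():
    print(f"{team}: {wins}")
```

North's unlucky run in close games pulls its PyEx wins down by two, while Essendon's good fortune pushes its figure up by one and a half.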

Look, just take my word for it, and if anyone wants to duke it out in the comments, we’ll deal with it then.

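| Team | Wins | % | Ladder Rank | PyEx Wins | PyEx Rank |
| --- | --- | --- | --- | --- | --- |
| Hawthorn | 19 | 135.7% | 1 | 18.1 | 1 |
| Geelong | 18 | 135.6% | 2 | 17.6 | 3 |
| Fremantle | 16.5 | 134.1% | 3 | 17.6 | 2 |
| Sydney | 15.5 | 132.5% | 4 | 16.6 | 4 |
| Richmond | 15 | 122.8% | 5 | 15.7 | 5 |
| Collingwood | 14 | 115.0% | 6 | 13.8 | 7 |
| Essendon* | 14 | 107.3% | 7 | 14.2 | 6 |
| Port Adelaide | 12 | 102.4% | 8 | 12.6 | 9 |
| Carlton | 11 | 106.7% | 9 | 12.1 | 10 |
| North Melbourne | 10 | 119.5% | 10 | 13.2 | 8 |
| Adelaide | 10 | 108.1% | 11 | 11.9 | 11 |
| Brisbane Lions | 10 | 89.6% | 12 | 9.9 | 13 |
| West Coast | 9 | 95.3% | 13 | 10.3 | 12 |
| Gold Coast | 8 | 91.7% | 14 | 8.4 | 14 |
| Western Bulldogs | 8 | 85.1% | 15 | 7.7 | 15 |
| St Kilda | 5 | 82.6% | 16 | 5.6 | 16 |
| Melbourne | 2 | 54.1% | 17 | 1.8 | 17 |
| Greater Western Sydney | 1 | 51.0% | 18 | 0.4 | 18 |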

Right, so not a lot has changed in terms of the positions on the ladder. The top five and bottom four remain the same (although the Cats and Dockers have changed places), North and Adelaide have dropped back, while the Bombers have rocketed up into sixth position (causing everyone else to shift down a position or two).

But what we’re more interested in is the PyEx Wins versus the actual wins. And we’re significantly closer now. The biggest deviations remain North (10 wins versus a PyEx of 13.2) and Adelaide (10 versus 11.9), followed by West Coast (nine versus 10.3); Hawthorn (19 versus 18.1) are still nearly a win ahead of their PyEx.

PyEx still hasn’t quite given GWS its only win of 2013, but as you can see, 12 teams are now rated within one win of their actual wins, while seven are within half a win (which I’d consider pretty good). That compares with 10 and four respectively in the straight PyEx.
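The "within one win" and "within half a win" counts can be checked directly from the table. This is a small sketch using the actual wins and close-game-adjusted PyEx wins listed above; the helper name `count_within` is made up for illustration.

```python
# (actual wins, close-game-adjusted PyEx wins) from the table above
teams = {
    "Hawthorn": (19, 18.1), "Geelong": (18, 17.6), "Fremantle": (16.5, 17.6),
    "Sydney": (15.5, 16.6), "Richmond": (15, 15.7), "Collingwood": (14, 13.8),
    "Essendon": (14, 14.2), "Port Adelaide": (12, 12.6), "Carlton": (11, 12.1),
    "North Melbourne": (10, 13.2), "Adelaide": (10, 11.9),
    "Brisbane Lions": (10, 9.9), "West Coast": (9, 10.3),
    "Gold Coast": (8, 8.4), "Western Bulldogs": (8, 7.7),
    "St Kilda": (5, 5.6), "Melbourne": (2, 1.8),
    "Greater Western Sydney": (1, 0.4),
}


def count_within(results, tolerance):
    """Count teams whose PyEx wins are within `tolerance` of actual wins."""
    # Round the gap to one decimal so float noise cannot tip a boundary case.
    return sum(1 for wins, pyex in results.values()
               if round(abs(wins - pyex), 1) <= tolerance)


print(count_within(teams, 1.0), count_within(teams, 0.5))  # 12 7
```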

I’m not satisfied, though. Let’s now add the last ingredient: the underappreciation of the role defence plays in creating scoring opportunities.

Adjusting offense
The final tweak we’ll make is to reduce, slightly, the influence of offense on PyEx Wins. This is to acknowledge that, unlike in other sports, defensive skill plays a direct role in the ability to create scoring opportunities.

We’ve seen this emerge in recent years, with an increasing percentage of scores coming from direct turnovers, and with more and more direct turnovers resulting from the application of pressure by the defensive side.

I’m unfortunately not in a position to put some solid numbers behind how much we should cut ‘points for’ in PyEx. Intuitively, though, it makes sense. What I found in my fiddling with the numbers is that in the 2013 home-and-away season, reducing the influence of offense on PyEx by 2.7 per cent gives the best result.
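One way to picture the 2.7 per cent reduction is to shrink "points for" before it goes into the Pythagorean formula. That mechanism is an assumption, since the article does not spell out how the reduction is applied, and the exponent of 4.5 is again only an illustrative guess. The function name is hypothetical.

```python
def pyex_wins_discounted(points_for, points_against, games,
                         exponent, offence_discount=0.027):
    """Pythagorean wins with 'points for' scaled down before the formula.

    Scaling points_for by (1 - offence_discount) is one possible reading of
    "reducing the influence of offense by 2.7 per cent"; the exact mechanics
    are not given in the article.
    """
    pf = (points_for * (1.0 - offence_discount)) ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)


# North Melbourne: percentage of 119.5, then the -2.0 close-game adjustment.
# exponent=4.5 is an illustrative guess, not a value from the article.
north = pyex_wins_discounted(119.5, 100.0, 22, exponent=4.5) - 2.0
print(round(north, 1))
```

Under these assumptions North's figure comes out in the neighbourhood of the fully adjusted PyEx wins reported for them, though a close match for one team does not confirm the method.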

Now that our PyEx is fully baked, where have we landed?

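| Team | Wins | % | Ladder Rank | PyEx Wins | PyEx Rank |
| --- | --- | --- | --- | --- | --- |
| Hawthorn | 19 | 135.7% | 1 | 17.7 | 1 |
| Geelong | 18 | 135.6% | 2 | 17.1 | 3 |
| Fremantle | 16.5 | 134.1% | 3 | 17.2 | 2 |
| Sydney | 15.5 | 132.5% | 4 | 16.1 | 4 |
| Richmond | 15 | 122.8% | 5 | 15.1 | 5 |
| Collingwood | 14 | 115.0% | 6 | 13.2 | 7 |
| Essendon* | 14 | 107.3% | 7 | 13.6 | 6 |
| Port Adelaide | 12 | 102.4% | 8 | 11.9 | 9 |
| Carlton | 11 | 106.7% | 9 | 11.4 | 10 |
| North Melbourne | 10 | 119.5% | 10 | 12.6 | 8 |
| Adelaide | 10 | 108.1% | 11 | 11.2 | 11 |
| Brisbane Lions | 10 | 89.6% | 12 | 9.2 | 13 |
| West Coast | 9 | 95.3% | 13 | 9.6 | 12 |
| Gold Coast | 8 | 91.7% | 14 | 7.8 | 14 |
| Western Bulldogs | 8 | 85.1% | 15 | 7.1 | 15 |
| St Kilda | 5 | 82.6% | 16 | 5.1 | 16 |
| Melbourne | 2 | 54.1% | 17 | 1.7 | 17 |
| Greater Western Sydney | 1 | 51.0% | 18 | 0.3 | 18 |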

Ok, so we haven’t quite nailed the ladder. But take a look at the PyEx Wins versus actual wins. We’ve now got to a situation where 15 of the 18 teams have PyEx Wins that are within one of actual wins, and seven which are within half a win.

Here’s the table of the final PyEx versus actual wins, so you can see more clearly.

Fully baked

| Team | Wins | PyEx Wins | +/- |
| --- | --- | --- | --- |
| Hawthorn | 19 | 17.7 | +1.3 |
| Geelong | 18 | 17.1 | +0.9 |
| Fremantle | 16.5 | 17.2 | -0.7 |
| Sydney | 15.5 | 16.1 | -0.6 |
| Richmond | 15 | 15.1 | -0.1 |
| Collingwood | 14 | 13.2 | +0.8 |
| Essendon | 14 | 13.6 | +0.4 |
| Port Adelaide | 12 | 11.9 | +0.1 |
| Carlton | 11 | 11.4 | -0.4 |
| North Melbourne | 10 | 12.6 | -2.6 |
| Adelaide | 10 | 11.2 | -1.2 |
| Brisbane Lions | 10 | 9.2 | +0.8 |
| West Coast | 9 | 9.6 | -0.6 |
| Gold Coast | 8 | 7.8 | +0.2 |
| Western Bulldogs | 8 | 7.1 | +0.9 |
| St Kilda | 5 | 5.1 | -0.1 |
| Melbourne | 2 | 1.7 | +0.3 |
| Greater Western Sydney | 1 | 0.3 | +0.7 |

When reading this table, what matters is the +/- figure. This, effectively, is the difference in terms of wins between a team’s actual result and the result implied by their offensive and defensive skills (and the interaction between defence and offence) and their luck in winning close games.

A plus indicates that the overall wins earned by a team are likely to reflect circumstance rather than quality; or, if you’d like, their W/L column is overstated. A minus means the opposite.
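The sign convention can be sketched in a few lines: the +/- is actual wins minus PyEx wins, and sorting by it puts the likeliest regression candidates at one end and the likeliest improvers at the other. The four teams below are a sample taken from the table; the names `final` and `plus_minus` are illustrative.

```python
# (actual wins, fully adjusted PyEx wins) for a sample of teams from the table
final = {
    "Hawthorn": (19, 17.7),
    "Richmond": (15, 15.1),
    "Adelaide": (10, 11.2),
    "North Melbourne": (10, 12.6),
}


def plus_minus(wins, pyex):
    """Actual wins minus PyEx wins; positive means the W/L record is overstated."""
    return round(wins - pyex, 1)


ranked = sorted(final, key=lambda team: plus_minus(*final[team]))
for team in ranked:
    diff = plus_minus(*final[team])
    label = "overstated" if diff > 0 else "understated"
    print(f"{team}: {diff:+.1f} ({label})")
```

Sorted this way, North Melbourne and Adelaide sit at the understated end and Hawthorn at the overstated end.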

Let’s take Hawthorn, for example. Hawthorn ended the year on top of the ladder with 19 wins. Yet, according to the PyEx formula, they were only good enough for 17.7 wins – implying that circumstances outside of their control gave them an extra 1.3 wins over the course of the year. This also means that, if we were to replay the 2013 season from scratch, we would expect Hawthorn to end up between 17 and 18 wins – more likely 18.

It also means they are a prime candidate for regression this year, as those factors which gave them an extra 1.3 wins may or may not be repeated in 2014. Whether they are able to offset that by increasing their offensive or defensive potency isn’t for me to judge.

Make no mistake, the figures still show that Hawthorn was the best side in 2013, but perhaps by less than was implied by their 19 wins.

Now, have a look at North Melbourne. Even after accounting for their terrible luck in close games, and adjusting down offense (which North are more associated with than defence), North were still 2.6 wins short of what you would tend to expect given their output. Adelaide were in a similar boat, which took me by surprise a bit.

This makes North Melbourne your prime bounce (pardon the pun) candidate for 2014 (particularly after their offseason moves), followed closely by Adelaide. Don’t rule out an improvement from Fremantle and West Coast based on these figures either and, well, if we think Buddy is worth his price tag, Sydney should get a lift in 2014, too.

So, I commend PyEx to the masses. I’ll revisit this using 2014 numbers when we get to, say, the halfway mark of the season to see how your team is tracking. And if The Crowd considers it useful, I may even go back over previous seasons and see where we get to.
