Hundreds, ducks, averages and #RootMaths
Happier days for Root, happier days for England. England's Joe Root celebrates his century during play on Day 3 of the second Ashes Test at Lord's. AFP PHOTO / CARL COURT
During the latest Ashes series in Australia, a gentleman by the name of Dave Ticknor came up with the term #RootMaths.
If I interpreted him correctly, he was defending the age-old use of cricket averages: in his view, it was nonsensical to remove Joe Root’s highest Test score of 180 from any evaluation of him, because that is not how averages work. Or something like that.
In any case, it got me thinking about how the cricketing world uses maths.
However, there is a counter argument to #RootMaths and it goes something like this:
Averages are a poor predictor of a batsman’s likely score because they allow outlying scores (e.g. Root’s 180 and his one duck) to skew the prediction.
Put another way, would you prefer in your team Marcus North, who is perceived to score either 0 or 100 and nothing in between, or Ed Cowan, who will make you 35 every time? Depending on your team makeup and strategy, this could be very important data.
So, I applied standard deviation theory to the batting careers of Root (29 innings @ 36), North (35 @ 35.5) and Cowan (32 @ 31), who all have similar records, and here is what I found.
Let’s start with Root. Firstly, when plotted on a chart, his innings are clearly bunched to the left, with a long tail of big scores stretching to the right (statisticians call this a right, or positive, skew).
That is to say, on any given innings he is more likely to score less than his mean than more, with 72% of his scores falling below 32 (his mean).
Note also that for this purpose, not outs receive no special treatment, unlike in cricketing averages. This exercise only predicts Root’s likely scoring range, irrespective of not outs or the game situation. It is an innings score predictor, not an average-per-wicket-lost predictor.
However, if the principle is applied equally across all players, it should not matter.
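To make the difference between the two measures concrete, here is a minimal Python sketch. The four innings are invented purely for illustration (they are not any real player's record), and the function names `batting_average` and `innings_mean` are hypothetical labels of my own.

```python
# Each innings is (runs, not_out). These four innings are made up for
# illustration only; they are not taken from any real scorecard.
innings = [(50, False), (20, True), (0, False), (30, False)]

def batting_average(innings):
    """Conventional cricket average: total runs divided by dismissals."""
    runs = sum(r for r, _ in innings)
    dismissals = sum(1 for _, not_out in innings if not not_out)
    if dismissals == 0:
        return None  # undefined until the batsman has been dismissed
    return runs / dismissals

def innings_mean(innings):
    """Plain mean per innings: not outs get no special treatment."""
    return sum(r for r, _ in innings) / len(innings)

avg = batting_average(innings)
mean = innings_mean(innings)
print(f"batting average: {avg:.2f}, innings mean: {mean:.2f}")
# prints: batting average: 33.33, innings mean: 25.00
```

Because dismissals can never outnumber innings, the innings mean can never exceed the batting average, which is why Root's mean of 32 sits below his published average of 36.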
Next, using standard deviation mathematics and plotting those points, we get a standard deviation of 38 on a mean of 32.
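For anyone who wants to repeat the exercise, the mean, the standard deviation and the share of innings below the mean can all be worked out from a plain list of scores. The sketch below uses Python's standard `statistics` module on an invented list of scores (not Root's actual innings). The population and sample versions of the standard deviation are both shown, and either could reasonably be used for this kind of exercise.

```python
import statistics

# Hypothetical innings scores, invented for illustration only.
scores = [0, 4, 5, 10, 13, 16, 20, 26, 31, 40, 68, 87, 180]

mean = statistics.mean(scores)
median = statistics.median(scores)
sd_population = statistics.pstdev(scores)  # divides by n
sd_sample = statistics.stdev(scores)       # divides by n - 1

# Share of innings that finished below the mean.
share_below = sum(1 for s in scores if s < mean) / len(scores)

print(f"mean {mean:.1f}, median {median}")
print(f"SD (population) {sd_population:.1f}, SD (sample) {sd_sample:.1f}")
print(f"{share_below:.0%} of innings fall below the mean")
```

A mean sitting well above the median, with most innings below the mean, is the signature of a few big scores dragging the mean upwards.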
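In simple language, if we treat his scores as roughly normally distributed, probability maths says that about 68% of the time (within one standard deviation of the mean), Joe Root will score between 0 and 70. The lower bound is floored at 0, since 32 minus 38 is negative and a batsman cannot score fewer than zero runs.
95% of the time (two standard deviations), he will score between 0 and 108.
99% of the time (three standard deviations), he will score between 0 and 146.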
His highest score of 180 is therefore a 1% or even <1% event.
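The ranges above are simply the mean plus one, two and three standard deviations, with the bottom end floored at zero. Here is a rough Python sketch of that arithmetic using the mean of 32 and standard deviation of 38 quoted above. The helper names `score_ranges` and `upper_tail` are hypothetical, and the tail probability assumes a normal distribution, which a skewed set of cricket scores only loosely follows, so treat it as indicative rather than exact.

```python
import math

def score_ranges(mean, sd, multiples=(1, 2, 3)):
    """Return {k: (low, high)} for mean +/- k*sd, with low floored at 0."""
    return {k: (max(0, mean - k * sd), mean + k * sd) for k in multiples}

def upper_tail(score, mean, sd):
    """P(X > score) under a normal model (an assumption, not a given)."""
    z = (score - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

ranges = score_ranges(32, 38)
print(ranges)  # {1: (0, 70), 2: (0, 108), 3: (0, 146)}

# How many standard deviations above the mean is a score of 180?
z_180 = (180 - 32) / 38
p_180 = upper_tail(180, 32, 38)
print(f"180 is {z_180:.2f} SDs above the mean; normal-model tail {p_180:.5f}")
```

On these numbers 180 sits nearly four standard deviations above the mean, comfortably beyond the 146 upper bound, which is consistent with calling it a sub-1% event.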
Does that mean you can exclude it from his average? In a cricketing convention sense, probably not. However, in a Moneyball sense, I let maths do the speaking.
About 68% of the time, North will score between 0 and 76.
95% of the time, he will score between 0 and 117.
99% of the time, he will score between 0 and 159.
71.5% of his scores fall below his mean of 33.4.
About 68% of the time, Cowan will score between 0 and 60.
95% of the time, he will score between 0 and 89.
99% of the time, he will score between 0 and 118.
72.5% of his scores fall below his mean of 31.2.
Of these three players with similar records, North was the one most likely to score higher on any given day, even though his average was less than Root’s.
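One way to compare the three side by side is to back out each player's spread from the figures already quoted. The Python sketch below assumes the upper end of each 95% range is the mean plus two standard deviations; the helper `implied_sd` is a hypothetical name, and the inputs are only the means and 95% bounds given above.

```python
# (mean, upper end of the 95% range) as quoted in this post.
quoted = {"Root": (32, 108), "North": (33.4, 117), "Cowan": (31.2, 89)}

def implied_sd(mean, upper_95):
    """Back out the SD, assuming upper_95 = mean + 2 * sd."""
    return (upper_95 - mean) / 2

spread = {name: implied_sd(m, u) for name, (m, u) in quoted.items()}

# List the players from widest to narrowest spread of scores.
for name, sd in sorted(spread.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: implied SD of about {sd:.1f}")
```

On these figures North has the widest spread and Cowan the narrowest, which fits the perception of North as the 0-or-100 player and Cowan as the steadier one.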
I’m still developing my theories, and considering the strengths and weaknesses of this type of analysis, but welcome your thoughts. Let’s hear your comments!