Hundreds, ducks, averages and #RootMaths

By Dennis Freedman / Roar Guru

During the latest Ashes series in Australia, a gentleman by the name of Dave Ticknor came up with the term #RootMaths.

If I interpreted correctly, he was defending the age-old use of cricket averages and arguing that it was nonsensical to remove Joe Root’s highest Test score of 180 from any evaluation of him, as that is not how averages work… or something like that.

In any case, it got me thinking about how the cricketing world uses maths.

However, there is a counter argument to #RootMaths and it goes something like this:

Averages are a poor predictor of a batsman’s likely score, as they allow outlying scores (e.g. Root’s 180 and his one duck) to skew the prediction.

Put another way, would you prefer Marcus North in your team, who is perceived to score either 0 or 100, or Ed Cowan, who will make you 35 every time? Depending on your team makeup and strategy, this could be very important data.

So, I applied standard deviation theory to the batting careers of Root (29 innings @ 36), North (35 @ 35.5) and Cowan (32 @ 31), who all have similar records, and here is what I found.

Let’s start with Root. Firstly, when plotted on a chart, his innings are clearly bunched up at the left (low-scoring) end, with a long tail of big scores stretching out to the right.

That is to say, in any given innings he is more likely to score below his mean than above it, with 72% of his scores falling below 32 (his mean).

Note also that for this purpose, not outs receive no special treatment, unlike in cricketing averages. This exercise only predicts Root’s likely scoring range, irrespective of not outs or the game situation. It is an innings score predictor, not an average-per-wicket-lost predictor.

However, if the principle is applied equally across all players, it should not matter.
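To make the difference between the two measures concrete, here is a minimal Python sketch. The innings are invented for illustration (they are not Root’s real scores), and the function names are hypothetical.

```python
# Hypothetical innings as (runs, was_dismissed) pairs.
# These are made-up numbers, not any real player's record.
innings = [(0, True), (12, True), (45, False), (8, True), (100, True), (27, True)]


def cricket_average(innings):
    """Traditional batting average: total runs divided by dismissals."""
    runs = sum(r for r, _ in innings)
    outs = sum(1 for _, dismissed in innings if dismissed)
    # The average is undefined if the batsman has never been dismissed.
    return runs / outs if outs else None


def innings_mean(innings):
    """Plain mean per innings: not outs get no special treatment."""
    return sum(r for r, _ in innings) / len(innings)


def share_below_mean(innings):
    """Fraction of innings in which the score was below the innings mean."""
    mean = innings_mean(innings)
    return sum(1 for r, _ in innings if r < mean) / len(innings)


avg = cricket_average(innings)       # runs per dismissal
mean = innings_mean(innings)         # runs per innings
below = share_below_mean(innings)    # share of scores under the mean
```

With one not out in the sample, the cricketing average comes out higher than the per-innings mean, which is the same direction as the gap between Root’s average of 36 and his mean of 32.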

Next, using standard deviation mathematics and plotting those points, we get a standard deviation of 38 on a mean of 32.
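For anyone who wants to try this at home, the calculation can be sketched with Python’s built-in `statistics` module. The scores below are invented, and whether the population or the sample formula is the right one for a batting career is a judgment call, so both are shown.

```python
import statistics

# Hypothetical scores, one per innings (not a real player's record).
scores = [0, 12, 45, 8, 100, 27]

mean = statistics.fmean(scores)

# Population standard deviation: treats these innings as the whole career.
sd_population = statistics.pstdev(scores)

# Sample standard deviation: treats these innings as a sample of a longer
# career, so it divides by n - 1 and comes out slightly larger.
sd_sample = statistics.stdev(scores)

print(f"mean={mean:.1f}, population sd={sd_population:.1f}, sample sd={sd_sample:.1f}")
```

Over 29 or more innings the two versions differ only slightly, so the choice should not change the overall picture.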

In simple language, probability maths says that 66.6% of the time, Joe Root will score between 0 and 70.

95% of the time, he will score between 0 and 108.

99% of the time, he will score between 0 and 146.
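The three ranges above are simply the mean plus one, two and three standard deviations, with the bottom of each range cut off at zero because nobody can score negative runs. A small sketch of that arithmetic follows; it assumes the normal-curve rule of thumb (more commonly quoted as roughly 68%, 95% and 99.7%), and the function name is hypothetical.

```python
def scoring_bands(mean, sd, floor=0):
    """Return {k: (low, high)} for k = 1, 2, 3 standard deviations.

    The low end is clipped at `floor` because a score cannot go below zero.
    """
    return {k: (max(floor, mean - k * sd), mean + k * sd) for k in (1, 2, 3)}


# Root's figures as quoted above: mean 32, standard deviation 38.
root_bands = scoring_bands(32, 38)
print(root_bands)  # {1: (0, 70), 2: (0, 108), 3: (0, 146)}
```

Because the lower end is clipped at zero, each range is lopsided, with all of the spread showing up on the high side.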

His highest score of 180 is therefore a 1% event, or even rarer.

North’s numbers
66.6% of the time, North will score between 0 and 76.

95% of the time, he will score between 0 and 117.

99% of the time, he will score between 0 and 159

71.5% of his scores fall below his mean of 33.4

Cowan’s numbers
66.6% of the time, Cowan will score between 0 and 60

95% of the time, he will score between 0 and 89

99% of the time, he will score between 0 and 118

72.5% of his scores fall below his mean of 31.2

Of these three players with similar records, North was the one most likely to make a high score on any given day, even though his average was lower than Root’s.

I’m still developing my theories, and considering the strengths and weaknesses of this type of analysis, but welcome your thoughts. Let’s hear your comments!

The Crowd Says:

AUTHOR

2014-01-18T12:13:20+00:00

Dennis Freedman

Roar Guru


Thanks Steve

AUTHOR

2014-01-18T12:13:06+00:00

Dennis Freedman

Roar Guru


Mark Waugh is overrated. The figures prove it. The whole purpose of this article was to remove bias and subjectivity, which you have tried to bring to it

2014-01-14T14:05:57+00:00

Steve O'Loughlin

Guest


Excellent stuff, Dennis. If I was still teaching VCE maths I'd be knocking up a SAC for the kids with this sort of analysis as we speak. Well done!

2014-01-13T22:16:25+00:00

Aransan

Guest


Dennis, you have certainly attracted a lot of interest with your article. Congratulations! I think the statistical analysis is going to be very difficult; you would want to find a theoretical distribution (not normal) that can approximate the typical spread of runs that batsmen score. I suspect it will be bi-modal (two peaks) with one peak near the duck end of the distribution. The parameters would then be calculated for each batsman, including mean and standard deviation, but perhaps other parameters as well (the normal distribution only has mean and sd as parameters). You may find that more than one theoretical distribution is required depending on the batsman. Armed with this data you could then do computer simulations to give a range of results for a particular team. But just to make things really complicated you would need to take into account the attack of the opposing team. How would you model the bowling statistics? All beyond me!

2014-01-13T21:20:38+00:00

The Barry

Roar Guru


Whatever the statistics they ignore how the runs are scored. The biggest statistical anomaly I've seen is Mark Waugh. He only averaged 42 or 43 but was a much better player than that. I don't think SD would have altered things that much with his stats given the number of tests, runs, innings and his spread of scores. Mark would come in at 2-plenty and get out cheaply trying to dominate as opposed to coming in at 2-not many and score a classy backs to the wall hundred. I guess for players like him you can analyse all the stats in the world and they're never going to tell the full story.

2014-01-13T21:14:43+00:00

The Barry

Roar Guru


The 99% may have been rounded down...although if Gillespie got 200...? I guess I'd be hoping for my debut against Bangladesh or hope Sweden get test status sometime soon...

2014-01-13T21:03:04+00:00

Bearfax

Guest


Quite right this, Aaron. I guess it would be the next level in interpreting value. But I suspect the variations would not be that great and it would be at best tweaking how averages are to be interpreted. I suspect as a batsman ages, he becomes more like your Watson variation, but still all experienced batsmen do go through black periods where they struggle to exceed 50. And if a batsman was scoring 50 consistently, or 100 one innings and 0 the next, would that be a significant factor in selection? There are five other batsmen in the team and they also would average out the total scores. I would certainly prefer a batsman who gets out for a duck and then 100 in alternate innings (eg Warner) more than someone more consistent scoring 30 every innings (eg Cowan). It's the overall aggregate that's important, I would think.

AUTHOR

2014-01-13T20:37:23+00:00

Dennis Freedman

Roar Guru


Interesting concept. I will do some further work on it.

2014-01-13T15:51:41+00:00

Aaron

Guest


Good idea this. I've been thinking for a while that cricket needs a simple Standard Deviation measure for each batsman. A batsman who averages 50 can score 0, 100, 0, 100 (a Marcus North type) or 50, 50, 50, 50 (a Shane Watson type). Which one is better for your particular team? I think an interesting use for this stat would be a team 'collapsibility' measure. Combine the standard deviation and average for each batsman and it will give you a pretty good idea whether your top order is likely to be all out for < 50 in any one particular innings. Could be a really useful tool for selecting teams.

2014-01-13T05:52:17+00:00

Ed Lamb

Guest


Key to a player's success is when they get their luck - e.g. Hussain was not given out caught behind, then went on to his maiden Test hundred and to captain his country. Ed Smith's book about luck is a good read - he got a handful of Tests but had a bit of ill fortune. He doesn't give that as the reason he didn't have a long Test career, but talks about the role luck plays.

2014-01-13T03:57:17+00:00

Gr8rWeStr

Guest


The relative importance of runs goes beyond the quality of opposition to the match playing conditions for each innings, which can affect cricket more so than most sports. This article's idea of minimising the impact of extreme scores by using standard deviation will reduce the impact of scoring lots of runs against minnow sides, but the result will still be skewed by how often minnow sides are played. My question is, should a 50 in a total of 195 be considered a more significant innings than 100 in a score of 7dec/650? I think yes, so I favour adding a %age of innings, team and/or match totals to identify the hard run getters, rather than the 'flat track bullies'.

2014-01-13T03:13:27+00:00

Vikramsinh

Guest


Some do rate him . . .

2014-01-13T03:11:51+00:00

Vikramsinh

Guest


I would love to see pujara.

2014-01-13T03:08:47+00:00

josh

Roar Rookie


Typically middle-order batsmen have better averages than openers.

2014-01-13T02:49:17+00:00

JGK

Roar Guru


What the stats don't show is that Cowan was mostly picked as an opener and North as a middle order batsman.

2014-01-13T02:41:44+00:00

Matt F

Roar Guru


I'd probably go with 'none of the above' to be honest :)

2014-01-13T02:36:29+00:00

josh

Roar Rookie


On those stats you would pick Cowan.

2014-01-13T02:14:14+00:00

Pope Paul VII

Guest


Apparently so. Aaron Finch must be very good because he conjures a lot of luck.

Some significant innings by good batsmen making it happen, with their score when they had the let-off in brackets, where I can remember it:

S R Waugh (42) 200
D G Bradman (28) 187
S Broad
D M Jones (5) 184
A Symonds 5 (151), 30 (160)
E Cowan (47) 136
M Vaughan - 2002/3 and 2005 - stood his ground a couple of times in 2002/3, big drop in 2005

and many, many more.

Probably all were incorrectly dismissed often as well, so it evens out. But you need your 30-plus Tests to get a leg up.

2014-01-13T02:07:45+00:00

Matt F

Roar Guru


I certainly think that there's room for more statistics in cricket, or at least more recognition of statistics. Too often we hear people talk about how good a batsman somebody is because he 'looks comfortable at the crease' or some other irrelevant factor. The fact that they don't score big scores anywhere near as often as a 'scratchy' player is sometimes overlooked. Of course not every 'scratchy' player is better than others, it's just an example.

I suppose the only issue is that a lot of what you've explained can be seen from existing statistics. By simply looking at the number of hundreds by number of innings you can tell that North was more likely to make a big score than Root or Cowan.

Number of hundreds:
North - 5/35 or 14.3%
Cowan - 1/32 or 3.1%
Root - 2/29 or 6.9%

Number of times passed 50:
North - 9/35 or 25.7%
Cowan - 7/32 or 21.9%
Root - 20.7%

Of course, in contrast, look at the number of times each player passed 25.

Number of times passed 25:
North - 11/35 or 31.4%
Cowan - 17/32 or 53.1%
Root - 12/29 or 41.4%

2014-01-13T01:59:54+00:00

JGK

Roar Guru


Good batsmen make the most of their luck!
