The Roar

Lurch
Roar Rookie · Joined September 2021 · 0 views · 0 posts published · 17 comments

Seconded. Great article.

Yes, batting averages have stayed stable since 1920. Why does it matter?

The ICC formula has been tweaked over the years, but the core of it remains the same, and the bulk of the original algorithm is in a book called “Deloitte Ratings: The complete guide to Test Cricket in the Eighties” by Marcus Berkmann.

As far as the pitch goes, they use a formula to complete incomplete innings (because 180 for 2 will rarely equate to 900 all out), then use those figures to create a “match pitch factor” and an “innings pitch factor”. The match pitch factor carries greater weight in the overall ratings than the innings pitch factor, but both play a part. Players effectively get given an adjusted score or adjusted bowling figures, so I’ve always thought they could/should publicise adjusted career averages (or averages across their peak 20 matches, or something) in real number terms, instead of the “point in time” peaks converted into a score out of 1000 – that conversion doesn’t tell you that one batsman or bowler is better than another, just that their purple patch peaked higher.

There are certain nuances in the algorithm that heavily impact the overall ratings with a butterfly effect over 140-odd years: how to complete incomplete innings to assess the pitch was one, but also how to assess the quality of debutants or players without much experience, how to treat not outs (e.g. a low not-out score vs a high not-out score) and so on.

But effectively, the ICC rankings already do what this article is suggesting should be done, except they keep it to the figures available in all scorecards, not things like how much the ball is seaming or spinning, which can change significantly throughout a match.
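To make the completion-and-weighting idea concrete, here’s a rough Python sketch. The projection rule, the 0.4 damping constant and the 0.7/0.3 match/innings split are my own illustrative assumptions – the actual ICC constants aren’t public.

```python
def project_completed_total(runs, wickets_lost):
    """Project an incomplete innings to a notional completed total.

    Remaining wickets are assumed to be worth progressively less than
    those already lost, so 180/2 does not simply scale up to 900.
    The 0.4 damping constant is illustrative only.
    """
    if wickets_lost == 0:
        return runs * 2.0  # arbitrary handling of a wicketless innings
    remaining = 10 - wickets_lost
    return runs + remaining * (runs / wickets_lost) * 0.4


def combined_pitch_factor(match_factor, innings_factor, match_weight=0.7):
    """Blend the two pitch factors, with the match factor carrying the
    greater weight (the 0.7/0.3 split is an assumption)."""
    return match_weight * match_factor + (1 - match_weight) * innings_factor


# Example: a side 2 down for 180 projects to 180 + 8 * 90 * 0.4 = 468,
# not 900 - that projected total is what assesses the pitch.
print(project_completed_total(180, 2))
print(combined_pitch_factor(1.1, 0.9))
```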

Cricket may need to adopt the par score system to measure the skill level of its Test batters

matth, this is the same issue (the circularity of using batsmen’s quality to assess bowlers’ quality and vice versa) that the official ICC rankings face… Rob Eastaway (who invented the algorithm for that) addressed it years ago in a maths paper he wrote, which is still available here: https://plus.maths.org/issue24/features/eastaway/2pdf/index.html/op.pdf – the relevant portion states: “when updating a player’s rating, the value of a batsman’s performance needs to take account of the latest ratings of the opposing bowlers, while the bowler’s performance needs to take account of the opposing batsmen’s up-to-date ratings. There is a circularity here. Who should you rate first – the bowlers or the batsmen? We got around this problem by giving each player a “provisional” rating after the match, using that provisional rating in all the adjustment calculations, and then replacing the provisional figure with the updated figure. Although this introduces slight distortions, they are what might be described as “second order” errors and can be discounted.”
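As I read that passage, the update loop is roughly the following. This is a toy Python sketch: the blend rule, the 0–1000 scaling and the constants are placeholders, and only the two-pass structure follows Eastaway’s description.

```python
def update_ratings(batsmen, bowlers, perf, k=0.1):
    """Toy two-pass update illustrating the provisional-rating idea.

    batsmen / bowlers: dicts of player -> current rating (0-1000 scale).
    perf: dict of player -> raw match performance, pre-scaled to rating
    points. The formulas are placeholders, not the ICC's.
    """
    current = {**batsmen, **bowlers}

    # Pass 1: provisional post-match ratings from pre-match ratings only.
    provisional = {p: (1 - k) * r + k * perf[p] for p, r in current.items()}

    # Pass 2: final ratings, weighting each performance by the average
    # *provisional* rating of the opposition (a stronger opposition
    # makes the same raw performance worth more).
    avg_bowl = sum(provisional[p] for p in bowlers) / len(bowlers)
    avg_bat = sum(provisional[p] for p in batsmen) / len(batsmen)

    final = {}
    for p in batsmen:
        final[p] = (1 - k) * batsmen[p] + k * perf[p] * (avg_bowl / 500)
    for p in bowlers:
        final[p] = (1 - k) * bowlers[p] + k * perf[p] * (avg_bat / 500)
    return final


# Example usage with made-up ratings and performances:
ratings = update_ratings(
    batsmen={"A": 600, "B": 450},
    bowlers={"X": 700, "Y": 300},
    perf={"A": 800, "B": 200, "X": 650, "Y": 400},
)
print(ratings)
```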

Cricket may need to adopt the par score system to measure the skill level of its Test batters

Summed it up beautifully matth… I can’t fathom a test cricketer going “Oh well, doesn’t matter if I get out cheaply, it’s only…” the second innings/a dead rubber/weak opposition/a flat pitch/likely to be a draw/whatever. They’re playing in the moment and they’re representing their country. So why would you promote the efforts of a batsman who only achieves in a limited and specific set of circumstances? There were exceptions to the rule in days of yore, although I think the Trumper argument gets exaggerated a bit (and Clem Hill doesn’t get the accolades he deserves), and chasing quick runs to set a target gets a leave pass too. But otherwise you’d expect batsmen to be trying as hard as possible for as large a score as possible whenever they batted.

There’s nothing wrong with making runs – any time

👍 I’d only just come across it, that’s why I was late to the party!

How much is a run really worth? A tale of two Test matches

I guess that’s always been a flaw with their system, but pretty much by design. If you start trying to cater for contextual situations, how do you arrive at an agreed set of rules? Just using your opener example, is it about their score, or more about overs batted and taking the shine off the ball? That’s just one example. Almost every stage of the game would have rules needing to be defined to cater for all the potential scenarios – and some of those might relate to factors that aren’t available in all the scorecards going back to 1877, e.g. balls faced, or which specific bowlers were bowling to which specific batsmen.
With the ICC ratings, their stated desire was always to provide ratings that still effectively equate to batting averages (and bowling averages), but more contextual by adjusting scores for pitch quality and (general) opposition strength, with a focus on recent form. So to that end, they largely achieve that.
But even within that, they’ve had to make rules and assumptions that have a butterfly effect on the results – particularly around assessing/rating debutants (or relatively new players). Tweak those variables even slightly and you significantly change overall rankings of players by the time you have bowlers bowling to batsmen who face other bowlers who bowl to different batsmen and so on for 140-odd years.

How much is a run really worth? A tale of two Test matches

I really like this analysis, summary and reasoning, and it highlights deficiencies in how the official ICC rankings get calculated. But on the flipside, if you’re trying to turn it into an objective formula, when exactly does a pressure situation stop being a pressure situation, and how exactly do you compare a first innings setting the scene vs a second innings dealing with the current state of play? I guess that’s the whole point of the article!

For the record, in terms of the ICC ranking formula’s adjustments for bowling and pitch quality, the top 10 rated innings across those two matches, in sequence, are: Slater 176, Ponting 196, M.Waugh 140, Langer 100*, Collingwood 96, Pietersen 92, Hick 80, Ponting 60*, Thorpe 67, Atherton 54. I offer no commentary around it, just that “it is what it is”, and a reasonable portion of it happens to match up with your analysis – beyond the low scores that you rate highly, and Pietersen and Collingwood’s fightback second-innings knocks in 06/07 that you wiped off as “doomed”.

I can see both sides of that argument. Yeah, they were on a hiding to nothing and were never going to win, but they still put up a hell of a fight – is that worth complete denial? If so, what sort of effort/score in that situation would be worthy of a thumbs up?

Interestingly, the two main scores that rate lowly in the ICC rankings are Hussey’s 86 and Langer’s 82 in the first innings of the 2006/07 game, both of which you rate highly… and somewhat validly… although you can also understand the discounts in the ICC rankings – both are decent scores, but made in a huge total (therefore easier batting conditions) against a relatively average attack.

How much is a run really worth? A tale of two Test matches

No worries Renato… you’re a bit pot, kettle, black – you’ve actually missed the point rather than made your point, and I’ve used numbers throughout to make my point as objective as possible, rather than subjective statements like players being likely to perform better, which we have no way of knowing whatsoever. As I’ve said previously, Bradman is one of a number of factors – I was just responding to a message about that particular topic. Of those factors, the three main ones that explain almost the entire discrepancy in batting averages for your niche topic of 1926-40 Ashes Tests in England are: (1) the Bradman impact (which we’ve done to death – whether you replace or exclude, it moves the whole era average by 2.5 to 3 runs, which is statistically significant); (2) four of the only five English timeless Tests ever being scheduled in this era (and all during August); and (3) the remaining Tests being scheduled for only 3 and 4 days rather than the full 5 used after WWII (so less pitch deterioration). And you can use numbers to back all of this up objectively.

Busting cricket myths: The batting average has been constant since 1920

You can cherry pick silly examples all you like. I’ll cherry pick mine – how did Australia go in response to the 7/903 without Bradman batting? Or the first Test of Bodyline without him, when conditions were so good for batting that England made 524? We could spend hours picking apart each other’s arguments and citing examples in both directions – those were just two that sprang to mind.

You’re missing the point re: the statistical analysis – it’s not about individual efforts within specific games whatsoever. If you’re given an aggregate dataset where almost everything fits in a bell curve and then you have one data point that is about 6 standard deviations above the mean (or whatever, I’m not looking at specific numbers right now), it’s perfectly valid to discard that data point as an anomaly and analyse the rest. Or you can make arguments to substitute the anomaly with a data point that is more compatible (e.g. McCabe-esque numbers) – but either way, in a statistical sense, my point remains: a large part of the reason the 1926-40 numbers were higher than other eras was purely due to Bradman himself.
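For anyone who wants the mechanics, the textbook version of that outlier test looks something like this – the career averages below are invented placeholders, not the real era data:

```python
import statistics

# Invented career batting averages for an era's regular batsmen;
# 99.94 (Bradman) sits far above the rest of the distribution.
averages = [35.2, 40.1, 46.8, 38.5, 42.3, 33.9, 48.2, 39.7, 44.5, 99.94]

mean = statistics.mean(averages)
sd = statistics.stdev(averages)

# Flag anything more than 2.5 standard deviations from the mean.
# (Note the outlier itself inflates the sd in a small sample, which
# is why robust measures like the median are sometimes used instead.)
outliers = [x for x in averages if abs(x - mean) / sd > 2.5]
trimmed = [x for x in averages if abs(x - mean) / sd <= 2.5]

print(outliers)                   # [99.94]
print(statistics.mean(trimmed))   # era average without the anomaly, ~41.0
```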

Busting cricket myths: The batting average has been constant since 1920

Not a huge fan of couldashouldawoulda in statistical analysis. With no Bradman, the England bowlers might have been into the middle order earlier with a newer ball and ripped through them cheaper and everyone’s averages could have suffered. Like the recent Aussies when missing key players. Just as plausible an argument.

Busting cricket myths: The batting average has been constant since 1920

It was purely from a statistics/data analytics perspective, not a cricketing “who would bat instead?” perspective – removing a freak outlier altogether to analyse “typical” batsmen does have merit. But as I said, replace him with McCabe-esque numbers if you like – it doesn’t change the point much, really. One person batted in just 3.6% of the innings in England between 1926-40 but scored 11.4% of the runs and 20% of the centuries scored in the era, which significantly boosted the overall averages being analysed. There’s no other player within cooee of that in terms of impact on an era’s numbers, for any era. By removing those numbers altogether, you’re still analysing 563 remaining innings. Or if you want to substitute, substitute McCabe and the era’s average drops by 2.5 instead of 3. Either way, the impact of Bradman on the era’s numbers is a significant factor.
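In arithmetic terms, the remove-vs-substitute comparison is just this. All the inputs below are placeholder aggregates chosen for illustration, not the actual 1926-40 figures:

```python
# Back-of-envelope remove-vs-substitute comparison with placeholder data.
era_runs, era_dismissals = 20_000, 520      # assumed era aggregates
brad_runs, brad_dismissals = 2_280, 18      # assumed Bradman share (11.4% of runs)
mccabe_avg = 42.0                           # assumed substitute's average

with_bradman = era_runs / era_dismissals                               # ~38.5
removed = (era_runs - brad_runs) / (era_dismissals - brad_dismissals)  # ~35.3
substituted = (era_runs - brad_runs
               + mccabe_avg * brad_dismissals) / era_dismissals        # ~35.5

print(round(with_bradman, 1), round(removed, 1), round(substituted, 1))
```

Either way you do it, one player’s innings move the whole era’s average by multiple runs, which is the point being made above.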

Busting cricket myths: The batting average has been constant since 1920

Just a couple of notes… those figures aren’t batting averages like the rest of the figures have been – they are runs/wickets, which means they include extras and so run higher than the batting averages (e.g. the Old Trafford batting average is 45.2, not 47.98). The other thing is that, with the exception of The Oval, the batting averages at the other England venues of that era are inflated by a greater proportion of innings from the No.1 to 7 bats than from the tail, and a greater proportion of not outs – e.g. Old Trafford in those Tests had 19.7% not outs, as opposed to the all-time rate of 12.9% (and the “all out” rate of 9%), which is partly explained by the 3 and 4 day Tests of the era.
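To spell out the two quantities being conflated – the aggregates below are back-solved placeholders that reproduce the quoted Old Trafford figures, not the actual totals:

```python
# Runs-per-wicket includes extras in the numerator; the batting
# average counts only runs credited to batsmen. Placeholder aggregates.
team_runs_incl_extras = 4_798   # all runs, including byes, leg byes etc.
batsmen_runs = 4_520            # runs off the bat only
wickets = 100                   # dismissals

runs_per_wicket = team_runs_incl_extras / wickets  # 47.98 (quoted figure)
batting_average = batsmen_runs / wickets           # 45.20 (quoted figure)
```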

Busting cricket myths: The batting average has been constant since 1920

It’s perfectly valid to remove a single statistical outlier from a large dataset. But if you did, McCabe is a more apples-to-apples replacement – the next best run scorer in the same Tests of the same series – not that it matters much… 35/36.2/36.5… it’s all a large chunk lower than 38. Regardless, there’s a whole heap of interrelated factors, which I haven’t gone into in my brief responses… e.g. the duration of Ashes Tests in England was 3 days up to and including 1926, 4 days from 1930 to 1938, and then 5 days from 1948 onwards. That has an impact on a whole heap of things, from batting aggression to pitch wear and tear to quantities of not outs and the ratio of draws and so on… and those changes in duration happen to coincide with your chosen era. You won’t be able to isolate a single factor (or even 2 or 3) and pin it down, because it was a combination of many different things, including the Bradman effect.

Busting cricket myths: The batting average has been constant since 1920

These numbers actually get back to the Charles Davis argument… remove the statistical outlier (Bradman) and those figures from 1926-40 become 32.3 (instead of 34.7) and 35.0 (instead of 38.0), which brings them back to the pack.

Busting cricket myths: The batting average has been constant since 1920

I’ve got a couple of minutes before work… I’ve got some detailed analysis of timeless Tests somewhere, but just quickly – the short story on why timeless Tests are apples v oranges is largely: Australian weather vs English weather (especially with all 4 England timeless Tests in mid-late August, vs Australian ones across the whole summer with more extreme weather variations), Australian dirt vs English dirt (initial pitch prep, propensity to crack), and smaller ovals in England vs larger ones in Australia. And taking away the timeless Tests, the scores over 600 that you list aren’t too dissimilar to scores made in the 1989 and 1993 Ashes. BTW, it’s not unreasonable to remove a statistical outlier, but I probably should have done the same with a batsman from the comparison era.

Busting cricket myths: The batting average has been constant since 1920

I won’t have a chance to properly reply until later tonight, but comparing scores in different countries introduces a heap of different factors that make it apples v oranges. Also, just quickly, 675 is such an arbitrary figure. 550 is massive too. Why 675? Make the cut off a different number and you get a different picture. And again, a number of different factors come into play – e.g. attitudes towards both batting aggression and declarations have changed a lot since I’ve been alive, let alone prior to then.

Busting cricket myths: The batting average has been constant since 1920

I feel there are a number of flaws in the choice of data for this analysis (which I won’t get into), but probably the two main contributing factors to the anomalies described are (1) the “Bradman factor” and (2) timeless Tests.

Taking the “Bradman factor” first: he personally raises the batting average of 1926-40 (using the chosen criteria) by 4.31, and the 100s percentage by 1.7% (!!).

As far as timeless Tests go, in a risk-vs-reward game like cricket, batsmen are clearly going to take a less risky approach when they can theoretically bat forever. Of the 7 mammoth scores in 18 Tests, 3 came in the 4 timeless Tests in England in that period (the 5th Test of each series – 1926, 1930, 1934 and 1938).

If you remove the timeless Tests and all the scores made by Bradman, all of a sudden the batting average is 47.5 and 100s are 13.5% of all innings – still fairly high, but relatively comparable to another three-series combination, 1985/1989/1993, where the average is 47.1 and 100s are 12.4% of all innings – again, using the same criteria of top 7, first innings only, Ashes Tests in England.

(By the way, there are a few errors in the data shown too – e.g. 252 innings between 1926 and 1940 and 100s at 15.1%, not 255 and 16.1%.)
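The filter-and-recompute step itself is simple to express. A toy Python version, with invented records standing in for the real top-7, first-innings Ashes data:

```python
# Toy version of the filter-and-recompute step. The records are
# invented placeholders, not the real 1926-40 innings.
innings = [
    {"bat": "Bradman", "runs": 254, "out": True,  "timeless": False},
    {"bat": "Bradman", "runs": 334, "out": True,  "timeless": True},
    {"bat": "Other A", "runs": 82,  "out": True,  "timeless": False},
    {"bat": "Other B", "runs": 13,  "out": True,  "timeless": True},
    {"bat": "Other C", "runs": 101, "out": False, "timeless": False},
]

# Drop Bradman's innings and anything from a timeless Test.
kept = [i for i in innings if i["bat"] != "Bradman" and not i["timeless"]]

runs = sum(i["runs"] for i in kept)
dismissals = sum(1 for i in kept if i["out"])   # not outs don't count
hundreds = sum(1 for i in kept if i["runs"] >= 100)

print(runs / dismissals)        # filtered batting average
print(hundreds / len(kept))     # 100s as a share of all kept innings
```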

Busting cricket myths: The batting average has been constant since 1920
