In writing cricketstats I have become more and more interested in how analytics is used to evaluate players and teams. Fortunately, there has been a lot of Test cricket played recently and so a lot of ink spilled on analysing player and team performances. Reading some of these analyses raised some questions for me about what phenomena analytics is used to evaluate and how these uses are communicated to lay readers like myself. In this post I thought I might explore some of these questions and how I think the distinction between capacity, execution and outcome can help in answering them.

Although I’m sure there are other pieces of writing that contributed, the pieces that I remember most clearly raising the questions I want to explore were Jarrod Kimber’s substack post on England’s batting, Kartikeya Date’s post on why England is losing the Ashes, and Dan Weston’s post on England’s use of Jack Leach. None of what I write here is necessarily meant as an argument against the evaluations in those pieces. Rather I want to focus on how, at times, the use of analytics to evaluate players and teams can be unclear or confusing.

For instance, in the course of presenting various statistics of England’s batting performances, Kimber writes that “…they have never been this bad this often ever,” and “There is no way around this, they have had one batter all year, and he has been absolutely incredible, and the rest of the batters have been as bad as he is good.” Then in the final summation and in responding to possible objections, Kimber concludes, “But this is not a team good at batting. What coach could turn five batting spots that can’t average 35 between them into a good team? Silverwood didn’t coach their techniques; he inherited a batting lineup where every player has multiple failings. We’re not talking about refining working techniques. We’re looking at taking players who have never made big runs domestically or batters who have struggled at international cricket all the time and making them into proper batters.”

As I have said already, my questions about this use of analytics aren’t about the merits of the actual evaluations made but about clarifying how analytics is being used in those evaluations. The central question that Kimber’s analysis raised for me was: what is actually being evaluated as good or bad? Is it the English batting lineup’s past performances? That is, are we using analytics to show that the English batting lineup’s past performances are bad compared to other teams, or to some implicit standard we have for batting (e.g. averaging over 40)? Or are we using analytics to evaluate the inherent quality of the batting lineup itself?

The same questions arise when looking at Date’s and Weston’s posts. In analysing the English and Australian bowling lineups, Date says “Those are good figures, but they’re unlikely to worry any reasonably strong opponent”, and when concluding writes, “England’s serious problems lie on the bowling side. Their worst players in this series have been Stuart Broad, Chris Woakes, Jack Leach and Mark Wood.” Weston, in analysing England’s use of Jack Leach in terms of matches and matchups, refers to Leach being “pretty good when used in the right circumstances” and being “very good, in fact probably even better than that, when he’s asked to perform the role which he should be in the team for - as a strong match-up option against right-handed batters”. In these posts the question is again: what is being evaluated as good or bad? Is it the past performances of particular players and teams, or the inherent quality of the players and teams?

The problem here is not that either option involves some contradiction. Rather, it isn’t always clear which evaluation is actually taking place. I suspect that the authors of the posts I’ve mentioned implicitly understand what they are evaluating and when they’re switching from evaluating past performances to evaluating players and teams themselves. But I would say it isn’t always clear to lay readers like myself.

What I think can help here is a distinction between capacity, execution, and outcomes. To get a sense of this distinction, think of a vase being bumped off a table and smashing on a concrete floor. The outcome is the final state of affairs of the vase broken on the floor. The execution is the series of events that produced the outcome, i.e. being bumped off the table and making contact with the floor at a given speed. The capacity is the set of structural properties of the vase that give it the disposition to break when hitting concrete at a certain velocity.

Translated into cricket, outcomes are the scores recorded in the scoresheet or ball-by-ball record: quite literally, the outcome of each ball. This is of course different from execution, which is what the player actually does to produce the outcome on each ball. After all, to score four runs a batter can hit the ball in ways that carry different risks, e.g. a lofted drive at catchable height or a flick behind square. Finally, capacity is the property of a player that gives them the disposition to execute a particular delivery or shot in various ways and hence produce various outcomes.
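To make the point concrete, here is a minimal sketch of what a ball-by-ball record captures. The field names are my own illustration, not the schema of cricketstats or any real dataset; the point is simply that every entry is an outcome.

```python
# A minimal, hypothetical ball-by-ball record: each entry is purely an outcome.
# Field names are illustrative only, not any real dataset's schema.
deliveries = [
    {"over": 23.4, "batter": "Batter A", "bowler": "Bowler X", "runs": 4, "wicket": False},
    {"over": 31.2, "batter": "Batter A", "bowler": "Bowler Y", "runs": 4, "wicket": False},
]

# Both entries record the same outcome (four runs, no wicket), but nothing in
# the record distinguishes the execution: the first four might have been a
# lofted drive at catchable height, the second a controlled flick behind square.
for ball in deliveries:
    print(ball["over"], ball["batter"], ball["runs"])
```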

The important point in all of this is that statistics (excluding ball-tracking and shot-selection information) are only ever a description of outcomes. At times it can seem like we are analysing capacity or execution directly, but we aren’t. We can then take those outcomes and infer a player’s capacity to execute and produce those outcomes in the future. This of course doesn’t mean we can never analyse capacity and execution directly. For a more direct analysis of capacity and execution, Rob Johnston’s piece on England’s batting is a good example that goes into player techniques and preparation. But aside from those types of analyses, and analytics involving ball-tracking and shot-selection information, we should always keep in mind that when evaluating past performances we are dealing purely with outcomes. It is only on the basis of those evaluations that we then indirectly evaluate a player or team’s capacity.
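As a rough illustration, and assuming the same hypothetical record format as the sketch above, a batting average is nothing more than an aggregation of recorded outcomes; any claim about the batter’s capacity is an inference layered on top of that aggregation, not something the number itself contains.

```python
def batting_average(deliveries, batter):
    """Summarise recorded outcomes for one batter: runs scored per dismissal.

    This only describes what happened. It says nothing directly about how the
    runs were made (execution) or what the batter could do next (capacity).
    """
    runs = sum(b["runs"] for b in deliveries if b["batter"] == batter)
    dismissals = sum(1 for b in deliveries if b["batter"] == batter and b["wicket"])
    return runs / dismissals if dismissals else float("inf")  # never dismissed
```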

Another point to keep in mind is that whilst past performances are fixed, capacities can change. After all, a player’s capacity can be developed, improved and even degraded. This means that in the absence of a continuous stream of outcome evidence, there is an open question at any given point whether a player has improved beyond what the outcomes we use to evaluate them would suggest.

I want to stress again that none of what I have said is a criticism of any particular writer or use of analytics, but rather a reflection on clarifying certain concepts and methods that could prove useful to lay readers like myself. For instance, I think Kimber’s articles on Bangladesh’s win against NZ and Australia’s T20 World Cup win actually show and explain the importance of the distinction. Both pieces mention how some players developed their capacity, or had latent untapped capacities, to produce better outcomes than their past performances indicated. And I think one of the clearest expressions of the distinction is by Jack Hope on the use and limitations of analytics in predicting success in Tests based on first-class batting averages.

I guess the takeaway for those like me who are new to looking at cricket from an analytics perspective is to always keep clearly in mind what we are using analytics to evaluate. Being clear about that will determine how confident we can be in our evaluations of players or teams, and how we should update those evaluations as time passes.