Ranking the teams in any team sport is an interesting problem, especially when the teams do not all play the same number of matches against one another. For example, in the Indian Premier League, each team plays against every other team twice, so sorting the teams in decreasing order of the number of wins gives a fair ranking. However, unlike the domestic cricket leagues, the schedule of international cricket is not so symmetric. Ranking the teams by their win percentage will not be fair, since a team can inflate its win-loss record by playing mostly against “weak” teams. This leads us to an interesting circular problem: the definition of a “weak” or a “strong” team depends on its ranking relative to other teams, but to rank a team we need to know which teams are “weak” and which are “strong”.
The International Cricket Council uses its own methodology for computing the team ranks in ODIs and Test cricket. However, this method relies on arbitrary parameters (for example, notice the two arbitrary numbers in the following rule: “If the gap between the ratings of the two teams at the commencement of the match is fewer than 40 points, the winner scores 50 points more than the opponent’s rating”). Ideally, a ranking methodology should be parameter-free; that is, it should not depend on any tunable value whose choice can change the outcome of the method. In this blog post, we propose such a parameter-free method for ranking teams that uses only the win-loss-tie record between each pair of teams.
The model we use for ranking the teams is that of a random tournament. Consider a hypothetical tournament conducted in the following fashion. First, we pick two random teams, say A and B, and make them play against each other. The probability of A winning this match is proportional to the number of times A has defeated B in games actually played between the two teams. The winner of this match then plays against another randomly picked team, and so on. If this tournament is played out for a large enough number of matches, we will end up with a winner, which with high probability will be among the “stronger” teams. Suppose we play out this tournament many times, and keep a frequency table of the number of times each team emerges as the winner. Then arranging the teams in decreasing order of their tournament wins gives us a ranking of the teams. This is roughly (but not exactly) the model we use for ranking teams.
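To make the random tournament concrete, here is a minimal simulation sketch in Python. It is only an illustration of the informal model above, not the method actually used for the rankings. The teams X, Y and Z, their head-to-head record, and the function names are all made up for this example, and deciding a match between teams that have never beaten each other by a coin flip is an assumption, since the description above does not cover that case.

```python
import random
from collections import Counter

def play_match(a, b, wins, rng):
    """Simulate one match; wins[i][j] is how often team i has beaten team j."""
    decided = wins[a][b] + wins[b][a]
    # Assumption: with no decided games between the pair, flip a fair coin.
    p_a = wins[a][b] / decided if decided else 0.5
    return a if rng.random() < p_a else b

def run_tournament(wins, n_matches, rng):
    """Play n_matches in a row; the current winner always stays on."""
    teams = range(len(wins))
    champion, challenger = rng.sample(teams, 2)
    champion = play_match(champion, challenger, wins, rng)
    for _ in range(n_matches - 1):
        challenger = rng.choice([t for t in teams if t != champion])
        champion = play_match(champion, challenger, wins, rng)
    return champion

def simulate(wins, n_tournaments=2000, n_matches=20, seed=0):
    """Frequency table: how often each team ends up as the tournament winner."""
    rng = random.Random(seed)
    return Counter(run_tournament(wins, n_matches, rng)
                   for _ in range(n_tournaments))

# Hypothetical record: X beat Y 3 times and Z twice, Y beat X once and Z twice,
# and Z beat X once and Y once.
teams = ["X", "Y", "Z"]
wins = [[0, 3, 2],
        [1, 0, 2],
        [1, 1, 0]]

counts = simulate(wins)
for idx, n_titles in counts.most_common():
    print(teams[idx], n_titles)
```

Sorting the frequency table by count gives the ranking for this toy example. The number of tournaments and the number of matches per tournament are arbitrary choices here.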
To be more specific (and technical), we construct a Markov Chain in which the states are the teams, and the transition probabilities for the state corresponding to a team, say A, are given by:
- The transition probability from team A to team B is proportional to the number of times B has defeated A in games actually played between the two teams, plus half the number of times the games between the two have ended in a tie.
- The transition probability from A to itself is proportional to the total number of wins of team A, plus half the number of tied games in which A was involved.
The transition probabilities out of each state are then normalized so that they sum to 1. The steady-state probabilities of this Markov Chain give us the rating for each team, and arranging the teams in decreasing order of their steady-state probabilities gives us the ranking of the teams. Equivalently, we can look at the eigenvector of the transpose of the transition matrix corresponding to the eigenvalue 1, which gives us the same ranking. For this reason, we will refer to the score of a team obtained using this method as its EVscore and the corresponding rank as its EVrank.
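The construction above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code behind the published rankings. The function names, the teams X, Y and Z, and their record are hypothetical. The steady state is approximated by power iteration instead of an explicit eigen-solver, and giving a team with no recorded games a self-loop of probability 1 is an assumption.

```python
def transition_matrix(wins, ties):
    """Build the row-normalized transition matrix.

    wins[a][b] is the number of times a beat b; ties[a][b] is the number
    of tied games between a and b (assumed symmetric).
    """
    n = len(wins)
    P = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            if a != b:
                # A -> B: times B defeated A, plus half the ties between them.
                P[a][b] = wins[b][a] + 0.5 * ties[a][b]
        # A -> A: total wins of A, plus half of all ties involving A.
        P[a][a] = sum(wins[a][b] + 0.5 * ties[a][b]
                      for b in range(n) if b != a)
        total = sum(P[a])
        if total == 0:
            # Assumption: a team with no games simply stays where it is.
            P[a][a], total = 1.0, 1.0
        P[a] = [x / total for x in P[a]]
    return P

def ev_scores(P, tol=1e-12, max_iter=10000):
    """Approximate the steady state pi = pi P by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[a] * P[a][b] for a in range(n)) for b in range(n)]
        if max(abs(x - y) for x, y in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

# Hypothetical record: X beat Y 3 times and Z twice, Y beat X once and Z twice,
# and Z beat X once and Y once; no ties.
teams = ["X", "Y", "Z"]
wins = [[0, 3, 2],
        [1, 0, 2],
        [1, 1, 0]]
ties = [[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]]

scores = ev_scores(transition_matrix(wins, ties))
ranking = sorted(zip(teams, scores), key=lambda pair: pair[1], reverse=True)
for rank, (team, score) in enumerate(ranking, start=1):
    print(rank, team, round(score, 4))
```

For this toy record the steady state can also be worked out by hand as 49/85, 21/85 and 15/85 for X, Y and Z respectively, so X is ranked first. Power iteration converges here because every team has at least one win, which gives every state a self-loop, and because every team can be reached from every other team.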
The table below gives the ranking of the teams in ODI for the year 2011. Note that for ranking the teams in a particular year, we include only the games played in that year.
The complete ranking of the teams in Tests/ODIs/T20Is can be found on the ranks page. The idea of using eigenanalysis to rank teams is not entirely new; the astute reader will see the similarity between this ranking methodology and the PageRank algorithm used by Google for ranking web pages.
An interesting extension would be to use a similar method to obtain rankings of the players, since we have the batsman versus bowler match-up data available. However, there is an inherent asymmetry in the batsman versus bowler interaction that is absent from the team versus team interaction. Moreover, the interactions between batsmen and bowlers are much sparser than those between teams, so the small sample size of the pairwise interactions may not lead to a meaningful ranking of the players. Nevertheless, it is an interesting idea worth pursuing in the future.