Ranking methods for 1st-person-shooters

Static rankings

This section is about the problem to estimate the strengths of the players with help of some given data. Here the strength of a player is looked upon as an unknown constant, which has to be found. A development in time is not subject of this section, but of the section about dynamic rankings.

The data

The following data is needed:

ntotal number of players
a_ijThis entry of a quadratic n×n-matrix a gives the number of frags, player i made against player j (i.e. how often player i killed j). The entry a_ii is the number of suicides of player i.

This data may be collected over one or more games or maps. The more data, the better the ranking will be. If the frag numbers a_ij are too small, there is a serious problem with statistical relevance. I will discuss this in the section about the statistical relevance.

The ranking as optimizing problem

Now we try to assign to every player i his strength g_i. Those strengths should be choosen such that one can see from g_i and g_j, how the players i and j do in direct comparison:

   (g_i-g_j) = (a_ij-a_ji)/(a_ij+a_ji)

Obviously the g_i should be between 0 and 1. Of course the above equation generally cannot be valid for all i,j. (For example it may be, that player 1 is better than 2, 2 better than 3, and 3 better than 1). That's why we try to minimize the sum over all pairs (i,j) (i!=j) of the squares of the differences. The solution is:

   g_i = 1/n * \sum_{j!=i} (a_ij+a_ji)/(a_ij+a_ji)

If there is a (i,j) with a_ij+a_ji=0, the problem is not well defined. Unfortunately this will occur in practice quite frequently, because not every two players meet. We suggest to skip such summands and to decrease the number of players n accordingly.

Suicides

The above formula doesn't take into account the number of suicides of a player, and there is no canonical way to do this. One could for example subtract the number a_ii/B_i, where B_i is the total number of frags player i was involved into. Alternatively one could distribute his suicides proportionately under his oppenents, before calculating the strength according to this formula.

Nevertheless those possiblities to take the suicides into account, are very arbitrary, and their consequences are not so clear. In practice none of these solutions was really satisfying.

The problem with the statictical relevance

The first tests with this formula alwas resulted in players on the first ranks, which nearly didn't play. And worse: The longer a player played, the smaller became the chances, that he came onto one of the first places. The problem was, that scores of 1:0 or 2:0 beween two single players clearly dominated the ranking.

Our attempts to devalue or to weigh less those quotients in the formula, seemed to be theoretically proper, but all failed in practice. Finally the only practical solution was to make the list of players more sparse, so that only a certain "core" of player rested, of which everyone did enough frags against every other to found a solid statistics.

The problem with the size of the data

Another serious disadvantage of this method of ranking is, that the size of the data a_ij to hold in the memory increases quadratically with the number of players. Normally most of the a_ij are zero, but if one tries not to save those zeros, the programming expense and the error proneness increases dramatically.

This is a problem on principle of the method, that can't be avoided by clever programming.

Dynamic rankings

This section is about a ranking method, which tries to calculate the development of the strengths of the players in time. It is similar to the one used for the world ranking lists in chess and table tennis (the ELO ladder). Given constant strength, the calculated strengths converge against the correct values after some time.

A duel league

First we look at the case very similar to the world ranking lists mentioned above, where the players play against each other in pairs (i.e. duel). To every player i his strength g_i is assigned. The probablity, that player i wins against player j is by convention

   w_ij = 1 / (1+10^((g_i-g_j)/400))

The number 400 only determines, how high the values in the rankings will be. Now let the players i and j play against each other. How are the g_i and g_j to be adapted? If a_i is the number of frags of player i, and a_j the one of player j, the number

   e_ij = a_i / (a_i+a_j)

is between 0 and 1 and says, how strong player i was (in comparison with player j). If the duel ends 0:0, we take e_ij=0.5.

The new strengths of the players i and j can now be calculated as follows:

   g_new_i = g_i + K * (e_ij-w_ij)
   g_new_j = g_j + K * (e_ji-w_ji)

If one of the two new values gets lesser than zero, it is set to zero. The number K determines, how dynamically the ranking will be. For big K a few wins are enough to jump up the ladder. On the other hand some losses, and one falls down again. If K is small, it is necessary to play rather well over a longer time to go up in the ranking, and it is possible to loose some duels.

As one can see, it is possible, that on goes down in the ranking, even if one has won a duel. This is the case, of one didn't win high enough according to the strenght of the oppenent.

Accounting suicides

The number of suicides may simply be substracted from the number of frags of the player. A problem then occurs, if one or both of the a_i and a_j become negative. In this case the formula can't be used.

For that reason, if one of the values is negative, it can be substracted from both values. Two examples:

    6 : -2   becomes   +8 :  0
   -3 : -5   becomes   +5 : +3

A decline in time

Players, which don't play any more, or which play too seldom, should fall down in the ranking. Otherwise players, which reached a high score, could stop playing.

So from each rank should some points be substracted in regular intervals. If a player is the ranked less than zero, he is reset to zero.

Where the points come from

Perhaps the reader has remarked, that the sum of the g_i in the formula above is constant: \sum g_i = \sum g_new_i. The only exception is, when a g_i becomes negative and so is reset to zero. Ultimately all points of the ranking come from this and were distributed from below to above.

Dynamic rankings for deathmatches

For the problem of a dynamic ranking, if the players don't duel, but if many player are in one map, there is a rather nice solution: Instead of the matrix a_ij we process directly one frag after the other, as they appear in the log files of the game servers. We simply count every frag of player i against player j as a duel with result 1:0.

Suicides may be looked upon as -1:0 results against an imaginary player of the same strength.

In practice this method works rather well and is even suitable for vast amounts of data, since for every player only a real number (the strength) has to be rememberd in memory or in a database. The log files of the servers may be read as data streams, processed line by line and then thrown away.

Dynamic rankings for CTF and other Mods

As simple as rankings for multiplayer deathmatches are, other modifications may be ranked. As an example I will tell a bit about CTF (Capture the Flag) rankings.

At first we have to identify the important events. Let's say in CTF games these are captures and frags. (There are other important events, which I don't enumerate here and which partially depend on the exact game.)

For every event we give a certain number of points x, for example 7 points for a capture, and 1 point for a frag. Every event is now counted as x:0 result for the player, who is concerned, and as 0:x result for his oppenent, if a well defined oppenent exists at all. For events, for which no well defined opponent exists, for example for captures, we take as strength of the (imaginary) opponent in the ranking formula the average strength of all players of the oppsosing team.

Team-kills could be processed as -1:0 results for the player, which did the kill. The killed player should in this case not be ranked down.

The rate of convergence

In dynamic rankings every player starts with zero. Then his calcualted strength will increase, first quickly, then slower and slower, and finally converge against a value, which corresponds to his real strength.

In dynamic rankings of multiplayer games the rate of convergence does not only depend on the constant K, but also in how many frag events the player is involved per time unit. So if a player plays very hectically, his rank will converge rather quickly. Another player with the same real strength, but who is more careful, will be ranked lower in the beginning, because his rank doesn't increase so quickly.

If there is a decline in time, the latter player has a real disadvantage: The calculated ranks are equilibria between the real strengths and the decline. So the calculated rank of the slower player will be lower than the calculated rank of the more hectic player of the same real strength.

Is there a program?

Yes. Christoph Loewe aka AEon has written a program parsing the log files of game servers and writing html pages with many statistics: AEstats.

Which of the methods above is implemented, I don't know. I only did the mathematical background.

by Michael Becker, 8/2001. Translated: 4/2002, Last modification: 4/2002.