How to get a single player's rank based on the score
Recently, a friend of mine approached me asking an advise for the task of getting the rank based on the score. When we have a list of, say, players, each having some score points, naturally these points are used to range the players on the scoreboard. Although to create a scoreboard is a trivial task - just order the table by the score,
SELECT * FROM players ORDER BY score DESC
to get a single player's rank based on their score is not that simple.
Surprisingly, there are no good solutions on Stack Overflow (the top result from Google for me for example) . Instead, some SQL monsters are offered, using variables, DBMS-specific functions like RANK(), derived tables and such. But I felt that the solution could be much simpler and could be done using just basic SQL operators.
The basic principle
The basic solution, given the score is known, is ridiculously simple: we just have to count all players placed on the scoreboard before. Naturally, it means that we need to count all players with a score higher than the given value:
SELECT count(*) FROM players WHERE score > ?
Not yet. This query is giving us just the number of players before us, whereas our rank is one point bigger. So the proper query would be
SELECT count(*)+1 FROM players WHERE score > ?
Adding another field
Well, there is a pitfall: it may happen that some users will share the same score. On the one hand, we can give such users the same rank. That's possible, but it will make the query too complex. Besides, such an approach is rarely used because people prefer to be certain about their rank. To make things straight, an additional field is often used in the ORDER BY clause when displaying a scoreboard, to make the resulting list consistent. It could be any field - the player's name or id for example. But it seems fair to use the time when the current score has been achieved - the first to come is higher on the scoreboard:
SELECT * FROM players ORDER BY score DESC, score_updated ASC
Therefore, to display the rank, we need to take into consideration this another field as well
The first thing that comes to mind (well, at least came to my mind, I have to confess) is simply to add the new field to the condition:
SELECT count(*)+1 FROM players WHERE score > ? AND score_updated < ?
Well, obviously, it doesn't work. It will cut off all the players who have a higher score but also a lesser update time. We need to separate these conditions, so they won't interfere. It is usually done with
OR operator. But yeah, simply changing AND to OR will just wreck havoc in the results. By this time I started visualizing the players table as a sort of line, where we have a part with all the players with higher score and a part with mates sharing the same score, like this
and the solution instantly sprang to my mind: we just need to count these two parts separately:
SELECT count(*)+1 FROM players WHERE score > ? OR (score = ? AND score_updated < ?)
the leftmost condition is to select all the superiors and the second condition to count all the peers who have lesser update time. Bingo!
Getting the rank along with other data
The last stroke.
Imagine we are displaying the player's account page and among other stats have to provide their rank from the scoreboard. All we know is the player's id. Of course we could get the desired info in two queries - first getting the score and the update time, and then getting the rank. But it smells. That's a user's data all the same, why not to get it in a single query? So it makes the question, how to get the score based on id. The most obvious solution is a sub-query, like
SELECT count(*)+1 FROM players WHERE score > (SELECT score FROM players WHERE id = ?)
but that's just for one condition and we have two! It will make the SQL monster we are trying to avoid. Luckily, there is another way to get the data from the same row, by joining the table to itself:
SELECT p1.*, count(p2.id)+1 as rank FROM players p1
JOIN players p2 on p2.score > p1.score or (p2.score = p1.score AND p2.score_updated < p1.score_updated)
WHERE p1.id = 7393
this latter query is just a familiar LEFT JOIN with a count() - what could be simpler?
Note: an index for the
score is not to be forgotten! This field is heavily used in this query and without an index it will take for the database to rake through all the table. Whereas using an index it will only take to calculate the entries in the index, which is incomparable faster.