pabloskimg's blog

By pabloskimg, history, 5 years ago, In English

Problem: there are N players, each one starts with some initial score, therefore they appear sorted non-increasingly by their scores in a dynamic leaderboard. From time to time some player changes his score, and the leaderboard gets updated to reflect the player's new ranking (in case of ties, for simplicity let's assume ties are broken by comparing the players' unique IDs). At the same time, we need to answer queries of the form "What is the accumulated sum of the scores of all players with rank between L and R in the leaderboard?". Score updates and range sum queries are interleaved.

Does anybody know how to implement this efficiently? I know I can implement a dynamic ranking using STL policy based data structures, because I can query the current ranking of a player in O(log(N)), remove him in O(log(N)), insert him again with his new score in O(log(N)) and finally query his updated ranking in O(log(N)). However I don't know how to keep track of accumulated sums of ranked scores efficiently. Maybe some clever adaptation of segment tree or fenwick tree might be of help but I'm not sure how.

Any help is very appreciated.


Edit: motivating problem: link

  • Vote: I like it
  • +2
  • Vote: I do not like it

»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Segment Tree idea (hoping everything is offline): After reading input, you know all the scores that are ever possibly achieved by any player — this amounts to a linear number of possible scores. Do a coordinate-compression type thing on those scores, and then build a segment tree that ranges over those values.

You can do two segment tree walks (O(log N)) per query to find the indices of the people that are at rank L and at rank R. Then you can do a range sum query between these two indices for the answer (O(log N)).

To update, you can do point updates in O(log N) per update to change the number of people at an index (-1 where they currently are, +1 to their new location).

  • »
    »
    5 years ago, # ^ |
    Rev. 4   Vote: I like it 0 Vote: I do not like it

    You can do two segment tree walks (O(log N)) per query to find the indices of the people that are at rank L and at rank R

    I didn't understand this. You map score values to compact indices. Why (and how) would you use a segment tree for that? Isn't an unordered_map enough? Likewise, I think an STL policy based data structure should be enough to know the value ranked at position k-th right?

    Then you can do a range sum query between these two indices for the answer (O(log N)).

    But if you do a range sum over indices, you would be summing indices, not the actual scores, or am I misunderstanding something?

    To update, you can do point updates in O(log N) per update to change the number of people at an index (-1 where they currently are, +1 to their new location).

    I'm a bit confused here, if someone moves a lot of in the leaderboard, then not only that person position changes but all the people between the old position and the new position change their position as well, so you need to do a range update, not many point updates, or am I misunderstanding something here as well?

    • »
      »
      »
      5 years ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      Here, I'll go into more detail:

      Let's say that you found that all the scores ever attained are: [13, 24, 25, 26, 28, 31, 47]

      Then that would get mapped into: [13:0, 24:1, 25:2, 26:3, 28:4, 31:5, 47:6]

      Now, I'm not using the segment tree for this part — you don't even need a pbds for this — just a set.

      Then, we build a segment tree that ranges from 0 to 6. Each node in the segment tree will contain two pieces of information — the number of people in the range represented by that node, and the total score of the people in the range represented by that node.

      So, if you have four players, and you're looking at the number of people, the bottom layer of nodes could look like: [1, 1, 0, 0, 1, 0, 1] to indicate that your players have scores 13, 24, 28, and 47. If you're looking at scores, the bottom layer of nodes would look like: [13, 24, 0, 0, 28, 0, 47].

      Okay so then if we're looking for the people of rank L=2 and rank R=4, then we can walk the segment tree to find the 2nd guy (who is at index 1) and the fourth guy (who is at index 6). Then a range sum query between index 1 and index 6 (using the other info in the segtree) will get the sum of the scores of the people between rank 2 and rank 4 (24+28+47).

      Now, if a person gets their score updated, then we can do -1 at their old index, and +1 to their new index, and also modify the scores appropriately. Let's say the guy with score 13 gets upped to score 26. Then the bottom layer of nodes will change from:

      [1, 1, 0, 0, 1, 0, 1]

      [13, 24, 0, 0, 28, 0, 47]

      into:

      [0, 1, 0, 1, 1, 0, 1]

      [0, 24, 0, 26, 28, 0, 47]

      By subtracting one at the old index and moving it to the new index, the people in between implicitly get their ranks shifted by one.

      While writing this I assumed that rank was such that the first rank is equal to the lowest number, but for this problem it might actually be that first rank is equal to the highest number, which would be a trivial change.

»
5 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Also, I think this problem can be handled by a binary search tree (like treap or splay) but I'm not experienced with that so if someone can tell me if that's feasible please do :)

»
5 years ago, # |
  Vote: I like it +6 Vote: I do not like it

Use a treap or a splay tree. A basic treap can be implemented in around 40 lines of code.

  • »
    »
    5 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thank you for your reply. Would you mind elaborating a bit more? How can a treap or a splay tree be used in this problem? What operations would they be useful for and why?