adamant's blog

By adamant, 3 years ago, translation, In English,

Hi everyone! After a relatively long lull, I decided that my contribution growing too slowly the hour has come to please you with another article in the blog :)

2 months ago user Perlik wrote an article, in which he described a very interesting STL implemented data structure that allows you to quickly perform various operations with substrings. Some time after I tested it on various tasks and, unfortunately, tend to get a negative result — rope was too slow, especially when it came to working with individual elements.

For some time, I forgot about that article. Increasingly, however, I was faced with problems in which it was necessary to implement set with the ability to know ordinal number of item and also to get item by its ordinal number (ie, order statistic in the set). And then I remembered that in the comments to that article, someone mentioned about the mysterious data structure order statistics tree, which supports these two operations and which is implemented in STL (unfortunately only for the GNU C++). And here begins my fascinating acquaintance with policy based data structures, and I want to tell you about them :)

Let's get started. In this article I will talk about IMO the most interesting of the implemented structures — tree. We need to include the following headers:

#include <ext/pb_ds/assoc_container.hpp> // Common file
#include <ext/pb_ds/tree_policy.hpp> // Including tree_order_statistics_node_update

After closer inspection you may find that the last two files contained in the library

#include <ext/pb_ds/detail/standard_policies.hpp>

Namespace, which we will have to work in newer versions of C++ is called __gnu_pbds;, earlier it was called pb_ds;

Now let's look at the concrete structure.

The tree-based container has the following declaration:

	  template<
	  typename Key, // Key type
	  typename Mapped, // Mapped-policy
	  typename Cmp_Fn = std::less<Key>, // Key comparison functor
	  typename Tag = rb_tree_tag, // Specifies which underlying data structure to use
	  template<
	  typename Const_Node_Iterator,
	  typename Node_Iterator,
	  typename Cmp_Fn_,
	  typename Allocator_>
	  class Node_Update = null_node_update, // A policy for updating node invariants
	  typename Allocator = std::allocator<char> > // An allocator type
	  class tree;
	

Experienced participants may have already noticed that if initialize the template only the first two types, we obtain almost exact copy of the container map. Just say, that this container can be set, for this you just need to specify the second argument template type as null_type ( in older versions it is null_mapped_type).

By the way Tag and Node_Update are missing in map. Let us examine them in more detail.

Tag — class denoting a tree structure, which we will use. There are three base-classes provided in STL for this, it is rb_tree_tag (red-black tree), splay_tree_tag (splay tree) and ov_tree_tag (ordered-vector tree). Sadly, at competitions we can use only red-black trees for this because splay tree and OV-tree using linear-timed split operation that prevents us to use them.

Node_Update — class denoting policy for updating node invariants. By default it is set to null_node_update, ie, additional information not stored in the vertices. In addition, C++ implemented an update policy tree_order_statistics_node_update, which, in fact, carries the necessary operations. Consider them. Most likely, the best way to set the tree is as follows:

typedef tree<
int,
null_type,
less<int>,
rb_tree_tag,
tree_order_statistics_node_update>
ordered_set;

If we want to get map but not the set, as the second argument type must be used mapped type. Apparently, the tree supports the same operations as the set (at least I haven't any problems with them before), but also there are two new features — it is find_by_order() and order_of_key(). The first returns an iterator to the k-th largest element (counting from zero), the second — the number of items in a set that are strictly smaller than our item. Example of use:


ordered_set X; X.insert(1); X.insert(2); X.insert(4); X.insert(8); X.insert(16); cout<<*X.find_by_order(1)<<endl; // 2 cout<<*X.find_by_order(2)<<endl; // 4 cout<<*X.find_by_order(4)<<endl; // 16 cout<<(end(X)==X.find_by_order(6))<<endl; // true cout<<X.order_of_key(-5)<<endl; // 0 cout<<X.order_of_key(1)<<endl; // 0 cout<<X.order_of_key(3)<<endl; // 2 cout<<X.order_of_key(4)<<endl; // 2 cout<<X.order_of_key(400)<<endl; // 5

Finally I would like to say about the performance of order_statistics_tree in STL. For this, I provide the following table.

Solution\Problem 1028 1090 1521 1439
order_statistics_tree, STL 0.062 0.218 0.296 0.468
Segment tree 0.031 0.078 0.171 0.078
0.859*
Binary Indexed Tree 0.031 0.062 0.062
-

* The final task requires direct access to the nodes of the tree for the implementation of solutions for O (mlogn). Without it, the solution works in O (mlogn*logn).

As you can see from all this , order_statistics_tree relatively little behind handwritten structures, and at times ahead of them in execution time. At the same time the code size is reduced considerably. Hence we can conclude is that order_statistics_tree — it is good and it can be used in contests.

Besides tree, I also wanted to describe here trie. However , I was confused by some aspects of its implementation, greatly limiting its usefulness in programming olympiads, so I decided not to talk about it. If anyone want he is encouraged to try to learn more about this structure by himself.

Useful links:
Documentation of pb_ds
Testing of pb_ds
Using of pb_ds
Demonstration of order_statistics_tree
Demonstration of trie with prefix search
Operations with intervals with handwritten update policy class
More examples from that site

P.S. Sorry for my poor English :)

 
 
 
 
  • Vote: I like it  
  • +119
  • Vote: I do not like it  

»
3 years ago, # |
  Vote: I like it +12 Vote: I do not like it

Example of trie with search of prefix range.
Problem: 1414
Solution: http://ideone.com/6VFNZl

»
3 years ago, # |
  Vote: I like it +17 Vote: I do not like it

Возможно, вам покажутся слегка нетривиальными решения деревом отрезков и деревом Фенвика, особенно, задач 1521 и 1439. Скорее всего, позже я также предоставлю статью, в которой опишу некоторые интересные способы использования этих структур, которые редко встречаются.

======================================================================================= You may be wondered about how I use segment tree and binary indexed tree in my solutions, especially for problems 1521 and 1439. Most likely, later I'll provide an entry about some interesting ways of using this structures, which are quite rare.

»
3 years ago, # |
  Vote: I like it +21 Vote: I do not like it

This is really useful. Thanks a lot!

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Very useful article! I need order-statistics on a multiset. How should I define the tree as?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    As I know, there is no implemented tree multiset in STL. However you can use pair<T,int> as a key where the second element in pair is the time when item has been added.

    • »
      »
      »
      3 months ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      Apparently, you can. Once I tried to write less_equal instead of less and it started to work as multiset, I even got AC using it in region olympiad)

      • »
        »
        »
        »
        3 months ago, # ^ |
        Rev. 2   Vote: I like it +11 Vote: I do not like it

        I can't erase elements with less_equal comparator, e.g. this code output "1"

        code

        So I guess it's not very useful thing (or I do something wrong).

        UPD: I can delete iterator which I got with lower_bound. But it works incorrectly. This code erase 1, not 0

        Code
        • »
          »
          »
          »
          »
          3 months ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          Wow. then it really sucks. Seems like I only used it with insert operations and strangely enough it worked)

»
3 years ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

.(same comment as above)

»
2 years ago, # |
  Vote: I like it 0 Vote: I do not like it

what will be the complexity of erase operation? O(logn) or O(n)

»
2 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Is there any efficient function to merge 2 trees ?

»
21 month(s) ago, # |
  Vote: I like it 0 Vote: I do not like it

Surely worth looking into :) Thanks

»
20 months ago, # |
  Vote: I like it 0 Vote: I do not like it

how can i use it like multiset ?

  • »
    »
    20 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Main idea is to keep pairs like {elem, id}.

    typedef tree<
    pair<int, int>,
    null_type,
    less<pair<int, int>>,
    rb_tree_tag,
    tree_order_statistics_node_update> ordered_set;
    
    int t = 0;
    
    ordered_set me;
    ...
    me.insert({x, t++});
    me.erase(me.lower_bound({x, 0}));
    cout << me.order_of_key({x, 0}) << "\n";
    
»
15 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Hi, adamant, the code files in Useful Links don't seem to work. Could you fix them?

Thanks for this great post. I am looking forward to your next and next next posts.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Can you elaborate please?

    • »
      »
      »
      15 months ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      For example, the code in "Demonstration of trie with prefix search" cannot run on my computer. I saw that there was some old syntax like the namespace pb_ds. I changed it, then it returned a new error in another place. The truth is I am not good enough to change things any more. I hope that you can update it. (I know that I can use the trie code in one of your comments, but this post would be even better if the cost in Useful Links were also updated)

      Thank you.

»
3 days ago, # |
  Vote: I like it 0 Vote: I do not like it

https://www.e-olymp.com/ru/problems/2961 Is it possible to solve this problem using this algorithm?