### SirRembocodina's blog

By SirRembocodina, history, 13 months ago, translation,

Hi. I want to share a simple method for coordinate compression. Even noob can write it.

vector<int> a(n);
vector<int> b = a;
sort(b.begin(), b.end());
map<int, int> m;
for (int i = 0; i < n; i++) {
m[b[i]] = i;
}
for (int i = 0; i < n; i++) {
a[i] = m[a[i]];
}


Now every value of an array lies in [0, n). The most convineint it that if you need the original value for a[i], you can just write b[a[i]].

And how do you write it?

• +21

 » 13 months ago, # |   +15 I think when you are initializing the map, you need to use m[b[i]] instead of m[a[i]], right?
•  » » 13 months ago, # ^ |   +3 Yes, thank you.
 » 13 months ago, # |   +301 vector d = a; sort(d.begin(), d.end()); d.resize(unique(d.begin(), d.end()) - d.begin()); for (int i = 0; i < n; ++i) { a[i] = lower_bound(d.begin(), d.end(), a[i]) - d.begin(); } //original value of a[i] can be obtained through d[a[i]] 
•  » » 13 months ago, # ^ | ← Rev. 5 →   +87 +1, this method is faster because it doesn't use sets/mapsThe one from the blog doesn't map to values 0,1,2,3,... but instead, it leaves some gaps. This might be confusing while debugging.EDIT: It might be even faster to sort pairs (value, index) to avoid binary search. This requires iterating the sorted sequence and doing nxt++ when we have value different than previous value. Longer implementation, better runtime (I think). int n = a.size(); vector> pairs(n); for(int i = 0; i < n; ++i) { pairs[i] = {a[i], i}; } sort(pairs.begin(), pairs.end()); int nxt = 0; for(int i = 0; i < n; ++i) { if(i > 0 && pairs[i-1].first != pairs[i].first) nxt++; a[pairs[i].second] = nxt; } 
•  » » » 13 months ago, # ^ | ← Rev. 2 →   0 template struct compress { vector &src; vector comped;//change this to reference to mutate inplace vector helper; int n; explicit compress(vector &src) : src(src), comped(src), helper(comped), n(src.size()) { sort(helper.begin(), helper.end()); helper.resize(unique(helper.begin(), helper.end()) - helper.begin()); for (int i = 0; i < n; ++i) { comped[i] = lower_bound(helper.begin(), helper.end(), src[i]) - helper.begin(); } } T &operator[](int i) { assert(i >= 0 && i < n); return comped[i]; }//returns compressed val T &at(int i) { assert(i >= 0 && i < n); return helper[comped[i]]; }//returns original val before compression }; A more generic ,OOP implementation of this great snippet :)
•  » » » 13 months ago, # ^ |   0 would you please give a small snippet of it? I'm not getting it
•  » » » » 13 months ago, # ^ |   0 void solve() { int n = 5; vector src = {5, 4, 3, 2, 5}; compress obj(src); deb(obj.helper); //2 3 4 5 -> sorted non-duplicate val list deb(obj.comped); // 3 2 1 0 3 -> compressed list where each val this is same as src[4] deb(obj[4]); // 3 -> the hash-key which src[4] is mapped to } 
•  » » » 13 months ago, # ^ |   +1 I know my method isn't fast. I just think it's very easy to comprehend for the beginners.
•  » » » 13 months ago, # ^ | ← Rev. 4 →   0 Thanks for your idea, it is greate, here is my "short" template implementation from your code UwU #define fi first #define se second template void compress(vector &a) { vector > M; for (int i = 0; i < a.size(); ++i) M.emplace_back(a[i], i); sort(M.begin(), M.end()); a[M[0].se] = 0; /// Assign first value with zero for(int i = 1; i < a.size(); ++i) /// Iterate the rest and increase when adj are different a[M[i].se] = a[M[i - 1].se] + (M[i - 1].fi != M[i].fi); } 
•  » » » » 13 months ago, # ^ |   0 I don't claim that [N lower bounds] are slower than [sort]. I claim that sort + lower bounds are together slower than sort (even if it's sort on pairs).Also, it's faster to choose the size of vector M while creating it, instead of doing emplace_back many times.
•  » » » » » 13 months ago, # ^ | ← Rev. 8 →   +10 Yes sir, thanks for pointing my mistakesAnd yes sir, but I just afraid that some class might not have constructor.Here are my fixed versions Reserve + Emplace back#define fi first #define se second template void compress(vector &a) { vector > M; M.reserve(a.size()); for (int i = 0; i < a.size(); ++i) M.emplace_back(a[i], i); sort(M.begin(), M.end()); a[M[0].se] = 0; /// Assign first value with zero for(int i = 1; i < a.size(); ++i) /// Iterate the rest and increase when adj are different a[M[i].se] = a[M[i - 1].se] + (M[i - 1].fi != M[i].fi); }  Assign directly#define fi first #define se second template void compress(vector &a) { vector > M(a.size()); for (int i = 0; i < a.size(); ++i) { M[i].first = a[i]; M[i].second = i; } sort(M.begin(), M.end()); a[M[0].se] = 0; for(int i = 1; i < a.size(); ++i) a[M[i].se] = a[M[i - 1].se] + (M[i - 1].fi != M[i].fi); }  When it is not good to create clone value of arraytemplate void compress(vector &a) { vector M(a.size()); iota(M.begin(), M.end(), 0); sort(M.begin(), M.end(), [&](int i, int j) { return a[i] < a[j]; } ); bool diff[a.size()]; for (int i = 1; i < a.size(); ++i) diff[i] = (a[M[i - 1]] < a[M[i]]); a[M[0]] = T(0); for (int i = 1; i < a.size(); ++i) { a[M[i]] = a[M[i - 1]]; if (diff[i]) ++a[M[i]]; } } 
•  » » » » » » 13 months ago, # ^ | ← Rev. 2 →   0 Is there anything in C++ that can be compared with < and != and doesn't have a constructor? Even if you create your own class with some int/string attributes, they are 0/empty by default. So it's only if you create a class with some custom constructor with arguments? Seems very rare in CP.Btw. I don't think something so simple should be templated and copy-pasted to use during a contest. In the long run, you are better off by just implementing it every time. This way you get to know an algorithm well and you can modify it.
•  » » » » » » » 13 months ago, # ^ |   0 Yes sir, it is really just very small chances for random cases when I read other solutions with non-constructor custome class/struct, but/and all of them also dont have comparators
•  » » » » » » » 13 months ago, # ^ | ← Rev. 2 →   0 Can I ask for a question related to the implementations ?Yesterday I use priority_queue with value comparator and it is amolst TLE while using position comparator give TLE. Why using position are slower struct value_comparator { bool operator < (const myClass &A, const myClass &B) const { return A > B; } };  vector A; struct position_comparator { bool operator < (int i, int j) const { return A[i] > A[j]; } }; And is it good to use my third implementation of coordinate compression than my other two ? (comparing using position instead of directly compare the values)
•  » » » » » » » » 13 months ago, # ^ |   +3 Accessing something in a separate array costs time and it isn't cache-efficient. It's better to have a value here. And is it good to use my third implementation of coordinate compression than my other two? Just run some tests and see which version is faster.
•  » » » » » » » » » 13 months ago, # ^ |   0 Oh thanks sir, I thought using index will reduce the amount of extra memory but now I know sometimes cache miss is really a problem
•  » » » » » » » 13 months ago, # ^ |   0 Yes sir, but instead of copying the template to every problems, I usually only write the template<> when I face such problem, since the algorithm is simple but it is still a bunch of lines of code if we use many times :V
•  » » » 7 weeks ago, # ^ |   0 That's nice code. It help me. Thank you!
 » 13 months ago, # |   0 Better if(!m.count(b[i]) m[b[i]]=i;
 » 13 months ago, # | ← Rev. 3 →   +4 My way template void compress(vector &a) { int order = -1; set S; map M; for (T x : a) S.insert(x); /// Taking all unique elements for (T x : S) M[x] = ++order; /// Assign value for unique elements for (T &x : a) x = M[x]; /// Assign value for array } Notice that by using set, this would work faster when the number of duplicates higher, but will work slower when all array are unique
•  » » 13 months ago, # ^ |   0 Notice that by using set, this would work faster when the number of duplicates higher, but will work slower when all array are uniqueI came across a problem on CSES Salary Queries where Code using Set+Map for compression resulted in TLE, and Code using Vector+Sort+Map gave AC.PS: The Time-Limit of the problem is 1s.
•  » » 13 months ago, # ^ |   0 you don't need sets map mp; int cnt = 0; for (auto v : a) mp[v]; for (auto& e : mp) e.second = cnt++; for (auto& v : a) v = mp[v]; 
•  » » » 13 months ago, # ^ | ← Rev. 2 →   0 Thanks for giving me a new ideaI forgot auto method so I thought it would be long to use something like range based looptemplate void compress(vector &a) { int order = -1; map M; for (T x : a) M[x]; for (pair &e : M) e.second = ++order; for (T &x : a) x = M[x]; }  iteratortemplate void compress(vector &a) { int order = -1; map M; for (T x : a) M[x]; for (ITR it = M.begin(); it != M.end(); ++it) it->second = ++order; for (T &x : a) x = M[x]; } but also I think I can use unordered_map to assigning value faster (if hash function for specific class then it would be more worse) unordered_maptemplate void compress(vector &a) { int order = -1; set S; unordered_map M; for (T x : a) S.insert(x); /// Taking all unique elements for (T x : S) M[x] = ++order; /// Assign value for unique elements for (T &x : a) x = M[x]; /// Assign value for array } 
 » 13 months ago, # |   +11 I write it the same way as you
 » 13 months ago, # |   0 Can you suggest some problems based on this cool trick?
•  » » 13 months ago, # ^ | ← Rev. 3 →   0 Many algorithms only need the order of the number compare to each other and not they value, so something like {$1, 4, 9, 16$} can be use as {$1, 2, 3, 4$} with return the same result (smaller value, faster to do). So when you face some problem like that, we can compress the array down to smaller (since the limitation for $n$ usually much smaller than maximum value of $a_i$). One algorithm usually use this trick is Sweep Line
•  » » 13 months ago, # ^ |   0 Well, for example 61E - Противник слаб. And in general it is used when you want to store a set of numbers with a segment tree.
 » 13 months ago, # | ← Rev. 2 →   -13 set co; // Put all vals in the set map cocom; int cocnt=1; for(int x:co) cocom[x] = cocnt++; // cocom[x] = compressed coordinate of x 
•  » » 13 months ago, # ^ | ← Rev. 2 →   -8 [Deleted]
•  » » » 13 months ago, # ^ |   +5 In unordered_map the keys are not sorted
 » 13 months ago, # |   0 vector coordinate_compression(vector x, int start = 0, int step = 1) { setunique(all(x)); mapvalpos; int idx = 0; for (auto it : unique) { valpos[it] = start + idx * step; idx++; } for (auto& pos : x)pos = valpos[pos]; return x; } 
 » 7 weeks ago, # |   -20 I use this template: struct COORDINATE_COMPRESSION { typedef long long ll; vector vect; bool is_compress = true; int size() { if(!is_compress)compress(); return vect.size(); } void clear() { vect.clear(); is_compress = true; } void insert(ll x) { vect.push_back(x); is_compress = false; } void compress() { sort(vect.begin(), vect.end()); vect.resize(unique(vect.begin(), vect.end()) - vect.begin()); is_compress = true; } vector compress_offline(vector vect) { if(!vect.size())return vect; vector> vvv; for(int i = 0 ; i < vect.size() ; i++) { vvv.push_back({vect[i],i}); } sort(vvv.begin(), vvv.end()); int cont = 0; ll last = vvv[0].first; vect[vvv[0].second] = 0; for(int i = 1 ; i < vvv.size() ; i++) { if(vvv[i].first != last) { cont++; last = vvv[i].first; } vect[vvv[i].second] = cont; } return vect; } int get(ll x) { if(!is_compress)compress(); int pos = lower_bound(vect.begin(), vect.end(), x) - vect.begin(); assert(pos != vect.size() && vect[pos] == x); return pos; } ll iget(int x) { if(!is_compress)compress(); assert(0 <= x && x < vect.size()); return vect[x]; } };