### i_love_emilia_clarke's blog

By i_love_emilia_clarke, history, 6 years ago,

Given an array A of N integers answer Q queries of type (L, R). Answer to a query (L, R) is the sum of distinct integers in the range A[L...R]

Constraints: N, Q <= 100000

0 <= L <= R < N

abs(A[i]) <= 1 000 000 000

I read somewhere that this can be done using persistence segment tree.

So, it would be helpful if one can explain that approach.

Other approach are welcome too.

• +11

| Write comment?
 » 6 years ago, # | ← Rev. 2 →   0 I don't now the persistent tree solution,but I can tell you a nice offline solution. Let's use B[]=an array in which we can add/delete elemnts/query the sum in O(logN).So B can be basically a standard Fenwick Tree or Segment Tree. It is easy to understand that the sum of the distinct integers in the range [L,R] is the sum of the integers that appear for the last time in the sequence(we simply add each element the last time it occurs in the subsegment).We sort the queries by R and solve them offline.At each i (1<=i<=N) we insert on the position i of B[] the value of A[i].Let ist(1<=lst
 » 6 years ago, # | ← Rev. 3 →   0 Imagine if you created a segment tree for every range of the array. To answer a query you just pick the segment tree corresponding to the range [L,R] and get your answer from the root of the tree. Query is O(1) but the construction would be O(N^3) which isn't feasible. What you can do is build a segment tree for every prefix of the array which can be done in O(NlogN). (This is what persistent segment tree is basically). Answer for a query [L,R] would just be answer for ans_for(R) — ans_for(L-1). Use coordinate compression initially to map the values in the range [1,N]. EDIT : ^This wouldn't work. Instead of just storing counts of values in segment tree node, store count of their rightmost occurences.
•  » » 6 years ago, # ^ |   +1 He wants the sum of distinct integers in the specified.range. Simply doing ans(R) — ans(L — 1) won't work here.
•  » » » 6 years ago, # ^ | ← Rev. 2 →   -9 I dont see why not , we can traverse both the trees simultaneously like we do to find the Kth smallest number in a range. EDIT : My bad!
•  » » 6 years ago, # ^ |   0 When you remove the contribution of distinct integer by subtracting ans_for(L-1), you are not simultaneously adding(if there are any) integers which are in the range L to R but were not counted because of ans_for(L-1)
•  » » » 6 years ago, # ^ |   -9 say the array is A = [1,2,2,3,3,3,4,4,4,4] , first store the rightmost occurence of every index , for this array right_most_occur = [11,3,11,5,6,11,8,9,10,11] , where 11 is a way of storing that the right most occurence does not exist. Now the distinct integers in a range [L,R] are all those indexes i where right_most_occur[i] > R. Should be easy to figure out the persistent segment tree solution now.
 » 6 years ago, # |   +30 A persistent segment tree is able to support queries of the form : Count the number of values <= X in the range [L, R]. Using it, we can solve the given problem online (i.e without preprocessing the queries)Let's solve a different problem first using persistent segment tree: Number of distinct integers in a range. DQUERYLet prev[i] denote the previous occurrence of arr[i]. Replace every element of your array by the location of it's previous index (i.e by prev[i]).For, example 1 2 1 2 3 (1-based indexing) will now become 0 0 1 2 0 Now the answer for the number of distinct values in range [L, R] is simply the number of values in this new array <= L - 1 in the range [L, R]. Why? Because if a value occurs more than once in the range, all its occurrences except the first one will have prev[i] >= L. So, by restricting ourselves to values <= L - 1, we are only considering the first occurrence in the range. When adding index i of the array to your persistent segment tree, you need to add 1 to index prev[i] of the persistent segment tree.Now, lets come back to your original problem. It can be solved in a similar fashion. You will need to create a persistent segment tree which supports weighted sum for all values <= X in range [L, R]. (This is a trivial modification to the usual persistent segment tree. You may practice on this problem if you wish.)When adding index i of the array to your persistent segment tree, you need to add a weight of arr[i] to index prev[i] of the persistent segment tree.Time complexity : O(N * log(N) + Q * log(N) )