[Need Help] Order statistics in rooted trees

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	163
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	147
9	orz	146
10	pajenegod	145

Hello!

I am solving a problem where a rooted tree with N vertices is given, each vertex has a value V <= 10^9, and i have to answer M queries. Each of the queries has the format (V,K), and i have to answer which is the K-th smallest value in the subtree rooted in vertex V. The limits for N,M are 10^5.

My current approach solves almost every instance of the problem (O(N log N) average) but fails when the tree degenerates to a linked list (O(N^2 log N) worst case). I store every query so i can answer all them after i run my algorithm. I process the tree from the leaves up to the root using topological sort, and when i am processing some vertex, i merge the leaves' subtrees values using heapsort (found it to be faster than mergesort, but still slow if the tree is a linked list), and answer every query related to this vertex. After every vertex has been processed, i iterate over the answer array and print the results.

Can anyone help me with a better approach, or point some property that can help my solution so it doesn't get TLE verdict?

Thanks in advance.

EDIT : problem solved, used the idea given by ffao.

Original statement : https://www.urionlinejudge.com.br/judge/en/problems/view/1695

My code : http://pastebin.com/ncWeC102

#include <bits/stdc++.h> using namespace std; #define MAXN 100011 // #define debug(...) do{ fprintf( stdout, __VA_ARGS__ ); } while( false ) #define debug(...) do{ } while( false ) int T,N,M,A,B; int v[MAXN], vpre[MAXN], ibegin[MAXN], iend[MAXN]; vector<int> adj[MAXN]; /* https://www.quora.com/How-can-you-build-a-data-structure-on-an-array-that-returns-kth-order-statistics-on-subarrays-in-logarithmic-time */ struct elem { int val, pos; bool operator< (elem b) const { return val<b.val; } }; int* tree[4*MAXN+10]; elem temp[MAXN], arr[MAXN], sorted[MAXN]; int* merge(int e, int d) { int* num_left = (int*) malloc(sizeof(int) * (d - e + 1)); int left = e, right = (e+d)/2+1; int i = 0, cnt = 0; while (left <= (e+d)/2 && right <= d) { if (arr[left].pos <= arr[right].pos) { num_left[i] = ++cnt; temp[i] = arr[left++]; } else { num_left[i] = cnt; temp[i] = arr[right++]; } i++; } while (left <= (e+d)/2) { num_left[i] = ++cnt; temp[i] = arr[left++]; i++; } while (right <= d) { num_left[i] = cnt; temp[i] = arr[right++]; i++; } for (int j = 0; j < (d-e+1); j++) { arr[e+j]=temp[j]; } return num_left; } void create_tree (int i=1, int e=1, int d=N) { if (e == d) return; else { create_tree(2*i, e, (e+d)/2); create_tree(2*i+1, (e+d)/2 + 1, d); tree[i] = merge(e-1, d-1); } } int query(int p, int q, int k, int i=1, int st=1, int end=N) { if (st == end) return sorted[st-1].val; int left = (p!=1 ? tree[i][p-2] : 0); int right = tree[i][q-1]; int diff = right - left; if (diff >= k) return query(left+1,right,k,2*i,st,st+(end-st)/2); else return query(p-left,q-right,k-diff,2*i+1,st+(end-st)/2+1,end); } /* ------------------------------------------------------------------------ */ int preorderCount; int dfs(int root, int pred) { ibegin[root] = preorderCount++; vpre[ibegin[root]] = v[root]; for (auto u : adj[root]) if (u != pred) dfs(u,root); iend[root] = preorderCount-1; } int main() { ios_base::sync_with_stdio(false); scanf("%d", &T); while (T--) { scanf("%d %d", &N, &M); for (int i = 0; i < N; i++) { adj[i].clear(); scanf("%d", &v[i]); } for (int i = 0; i < N-1; i++) { scanf("%d %d", &A, &B); A--; B--; adj[A].push_back(B); adj[B].push_back(A); } preorderCount = 0; dfs(0,-1); debug("pre: "); for (int i = 0; i < N; i++) { debug("%d ", vpre[i]); sorted[i].val = vpre[i]; sorted[i].pos = i; } debug("\n"); sort(sorted,sorted+N); memcpy(arr,sorted,sizeof(sorted)); create_tree(); for (int i = 0; i < M; i++) { scanf("%d %d", &A, &B); A--; int l = ibegin[A], r = iend[A]; printf("%d ", query(l+1,r+1,B)); } printf("\n"); } }

Comments (14)

Write comment?

snacache

8 years ago, # |

Link to the problem? It sounds interesting :)

→ Reply

skavurskaa

8 years ago, # ^ |

https://www.urionlinejudge.com.br/judge/en/problems/view/1695

shubham1694

1.First of all flatten your tree into array using euler tour(single dfs).The property of the euler tour generated is that all the subtrees can be represented by a single subarray.

The problem now reduces to finding kth element in a range in an array.

2.Now use MO's algorithm and a BST.Insert and remove them from BST as you encounter them in Mo's algorithm. Finding kth element in a BST is a standard problem which can be done in O(logn)

Total complexity: O(M*sqrt(N)*log(N))

anh1l1ator

Hello finding k th element in a range of an array can be done in O(log N ) — search fractional cascading or O(log N^2 ) ( Search Merge sort tree ) . There exists an offline solution too with Amortized complexit O(logN).

But I was thinking some solution without flattening of the tree.

My first idea was: Although the idea is same as of flattening we can keep a global counter i.e. time and append values onto a segment tree built on time and answer the query during this process only ( we can add an element in merge sort tree with given bounds in O(logN^2) .

Assuming a worst case where every vertex stores exactly 1 query, could fractional cascading still be used? I searched how it works but never used it so i don't know its limitations very well

Sounds like a good approach, but would it pass given the constraints? Worst case could go above 10^8 operations

I don't think it would, given that there are upto 35 test cases in one input file.But you could try anyway.

ffao

+17

This solution is O(logN) per query and works online, but you'll need some preprocessing first.

First, number all your nodes according to a preorder traversal of the tree. This reduces the problem to "Given an array A with n elements, answer some queries of the form (i, j, k), which ask for the k-th smallest element in A[i...j]".

Take your array A and subdivide it into two reduced arrays, such that all elements in the left reduced array are smaller than the ones in the right. Then do it recursively until you're left with single elements in every reduced array, for example, if array A was [1 7 3 5 4 6 2]:

[1 7 3 5 4 6 2]  
[1 3 4 2] [7 5 6]
[1 2] [3 4] [5 6] [7]
[1] [2] [3] [4] [5] [6] [7]

Now, given a query (i, j, k), your goal is to find out in which of the two reduced arrays the answer is. For that, for every one of our arrays R we will compute num_left[i] = how many elements in R[0...i] go to the left reduced array.

Now, to answer a query (i, j, k), we know that num_left[j] - num_left[i-1] elements are in the left reduced array. Because all elements in the left are smaller than the ones in the right, if this number is greater than or equal to k, our answer is in the left, otherwise it is in the right. Call the function recursively for the reduced subarray until you reach a leaf, which has the value you want.

Note: When you call the function recursively, positions [i..j] in the original array will not correspond to positions [i..j] in the reduced subarray. So you need to do some math to figure the interval that is equivalent to [i..j].

This looks efficient enough to pass the problem. I could simulate a heap to get the indexes of the sub-arrays, right? I mean, imagine each sub-array as a heap node (full array is the root), so, for each array a, its left subarray starts in index 2a and right subarray starts in index 2a+1.

That's how I would code it, yes.

Auto comment: topic has been updated by skavurskaa (previous revision, new revision, compare).

SarvagyaAgarwal

← Rev. 2 →

could you post your code ? the link isn't opening for some reason !

EDIT : code in rev1 was wrong, my bad.

gongy

This is a really interesting problem, but I found a different solution to the very interesting one presented by ffao.

We do the same thing with preorder traversal, turning the tree into an array, and then our operations are to query k-th largest element in a range.

Consider a segment tree over value (storing count of how many have a value in my range): we can walk through the nodes, choosing left and right based on the count stored by the children. So each segment tree node stores the range of values for which it is counting (e.g. I count how many values are between 4 and 20), and its count (e.g. there are 4 things in the array between 4 and 20 which my tree counts).

We create N of these trees each describing a prefix of the array, persistently (resulting in O(N log N) memory).

Now, by walking the tree as described before in one of these trees, we can query: k-th largest element in a prefix of this array. So how do we extend this to any range (not just a prefix)?

Notice that the count of values in each node of the tree is just a cumulative sum up to the index the tree describes. So at each respective position in the tree, the value for a range [L, R] is simply the value in tree R at this node position, minus the value in tree L-1 at this node position. Therefore we can walk through the L-1 and R trees at the same time obtaining the k-th largest for the range [L, R].

skavurskaa's blog