Given a $$$N \times N$$$ table, each cell $$$(x,y)$$$ has a value $$$a[x][y]$$$.

From cell $$$(x, y)$$$ we can jump to cell $$$(x_1, y_1)$$$ if **both** of these conditions are satisfied:

$$$a[x_1][y_1] > a[x][y]$$$

$$$(|x - x_1| = 1 \text{ and } |y - y_1| > 1)$$$ or $$$(|x - x_1| > 1 \text{ and } |y - y_1| = 1)$$$.

Find the longest path (measured in number of jumps) we can make if we start at cell $$$(X_0, Y_0)$$$.

Input is $$$N, X_0, Y_0$$$ and the array $$$a$$$. $$$(N \leq 1500, a[i][j] \leq 10^6)$$$

Example:

```
4
1 1
1 2 3 4
2 3 4 5
3 4 5 6
4 5 6 7
```

Output: $$$3$$$ (The path is $$$(1,1) \Rightarrow (2, 3) \Rightarrow (4,2) \Rightarrow (3,4)$$$).

My current solution runs in about $$$O(4 \cdot N^2 \log N + 10^6 \log 10^6)$$$, and it didn't pass the $$$1$$$-second time limit.

**Details of my solution**

Please show me a faster (possibly the fastest) solution, or some improvements I can make to my algorithm. Thanks so much <3 <3.

I think in this particular case, changing your BITs to a segment tree should help. There is a really simple segment tree that handles point update and range max query (for example, the one with $$$2n$$$ nodes, where node $$$i$$$ is the parent of nodes $$$2i$$$ and $$$2i+1$$$). It would make the constant $$$2n \log n$$$ instead of $$$4n \log n$$$, and in practice it is quite fast as well.

Edit: for each row and column you need at most $$$4$$$ values, so you don't need a BIT/segment tree at all.

Thanks for your answer.

But I forgot to mention that I tried that earlier, and it turned out to be slower than using $$$4$$$ BITs (with my implementation).

For each row and column I save a multiset of $$$4$$$ values. Updates are fast. But when querying, since we have to query $$$4$$$ times, each time I push all three dp values into a vector, sort it, and compare it against the multiset of that row/column. I suspect that $$$2 \cdot \log(1500)$$$ is about $$$24$$$, while the query above costs about $$$4 + 8 + 4$$$ operations with a probably bigger constant factor (all the vector work), so with my bad implementation it turned out slower in practice. Do you have any better suggestions for the implementation?

I will try replacing the seg tree. Thanks.

And by the way, this wasn't from an ongoing contest ^^. (I have the testcase and can show it to you.)

Let's store an array of $$$k$$$ elements (the $$$k$$$ maximums you are interested in) in non-increasing order. To update this array with $$$val$$$, iterate with $$$i$$$ from $$$0$$$ to $$$k-1$$$ through it: on each iteration, if $$$mx_i < val$$$, do $$$swap(mx_i, val)$$$; otherwise do nothing. This way every update is $$$O(k)$$$ and everything runs fast.

I hope this will help :)

Thanks so much!! Without all that vector and multiset stuff, everything is $$$1$$$ second faster.

That implementation trick for maintaining $$$k$$$ maximums is surprisingly neat.

I hope it wasn't from an ongoing contest.

Same as an older comment.