Dragonado's blog

By Dragonado, 14 months ago, In English

Hello everyone,

Recently peltorator created a challenge to create interesting educational blogs. This gave me the motivation to write about a trick I discovered a while back. The trick is simple and I wouldn't be surprised if it already exists with some other name. But I couldn't find any blog on it so I'm making my own.

I have made a data structure that deals with the partition of an array. I call it Cut-Merge-Stick (CMS) trick. A partition of an array is a grouping of its elements into non-empty subarrays, in such a way that every element belongs to exactly one subarray. I refer to every segment in this partition as a "stick".

Some operations on this data structure are merging two consecutive sticks, cutting a single stick into two smaller sticks, getting the longest stick in a range, counting the number of sub-sticks in a range, etc.

Problem statement

You are given a stick of length $$$N$$$ units, placed on the positive $$$X$$$ axis with its left end at the origin. The stick is indexed such that, in between the $$$i$$$-th meter and the $$$i+1$$$-th meter, the number $$$i$$$ is written on the stick, for all $$$0 \le i \le N-1$$$.

You must address $$$Q$$$ queries which will be of the following types:

  • 1 k : Make a cut at $$$x = k$$$ units , As a result, the stick which covers this point is cut into two. It is guaranteed that a cut has not been made at this point prior to the query.
  • 2 k : Choose the stick containing the index $$$k$$$. Join it with the stick to its right which is immediately adjacent to it (if such a stick exists).
  • 3 k : Print the length of the stick containing the index $$$k$$$.
  • 4 l r : Print the length of the longest stick between index $$$l$$$ and index $$$r$$$ (both inclusive). Note that for sticks extending outside the range $$$[l, r]$$$, the length outside the range is not considered.
  • 5 l r : Print the number of pairs $$$(i, j)$$$ such that
    • $$$l \le i \le j \le r$$$
    • index $$$i$$$ and index $$$j$$$ are contained in the same stick.

All values in the input are integers. Solve for $$$N, Q \le 2 \cdot 10^5$$$

Preqrequistes

You only need to know about arrays and segment trees. Even the segment tree part can be thought of as a black box with the following points.

Suppose there is an array $$$A$$$ of size $$$N$$$ initially filled with $$$0$$$s. You can assume the following operations/queries are done in $$$O(\log N)$$$ time each.

  • Set the value of $$$A_i := A_i + v$$$ for all $$$i$$$ in the range $$$[l, r]$$$
  • Get the value of $$$A_l + A_{l+1} + \dots + A_r$$$
  • Get the value of $$$\max(A_l, A_{l+1}, \dots, A_r)$$$

If you want to know how the operations can be done then you can refer to the lazy segment tree tutorial in the EDU section.

Idea

Initial setup

For now, let us only focus on the first $$$3$$$ types of queries.

We will denote each stick by consecutive natural numbers starting from $$$1$$$. Let $$$A$$$ be an array of length $$$N$$$ defined as follows. So $$$A_i$$$ will denote the prefix-length of the stick that is at index $$$i$$$. Look at the below picture for a better understanding.

So in this case, the first stick has length $$$5$$$ meters, the second stick has length $$$1$$$ meter, the third stick has length $$$3$$$ meters, the fourth stick has length $$$1$$$ meter and the fifth stick has length $$$2$$$ meters.

Trying the cut operation

Suppose we want to make a cut at $$$x = 2$$$ meters. This will give us $$$2$$$ sticks of length $$$2$$$ and $$$3$$$. The values of $$$A$$$ of the left stick, that is $$$A[0 \dots 1]$$$ are still valid. But for the right stick of length $$$2$$$, that is $$$A[2 \dots 4]$$$, we should re-assign it.

Currently we have $$$A[2 \dots 4] = [3, 4, 5]$$$ but since $$$A[2 \dots 4]$$$ is a new stick we want it to be $$$A[2 \dots 4] = [1, 2, 3]$$$.

We want to add something to $$$A[2...4]$$$ to have it updated correctly.

So $$$X + [3, 4, 5] = [1, 2, 3]$$$

$$$\implies X = [1, 2, 3]- [3, 4, 5] = [-2, -2, -2]$$$

It turns out that If we add $$$-2$$$ to every index in the range $$$[2, 4]$$$ then our array $$$A$$$ will be correctly updated. This is a very useful observation as we can achieve this operation in $$$O(\log N)$$$ time using our black box segment tree.

In general, if we have to make a cut at $$$x=i$$$ meters and the index of the right endpoint of the stick is at $$$r$$$ then we need to do the following to correctly update the array $$$A$$$.

  • Let $$$t := A_i$$$
  • $$$A_j := A_j + (1 - t)$$$ for all $$$j$$$ in the range $$$[i, r]$$$

Small problem with current cut operation

But not everything is solved yet, how do we get the right-end point of the stick that contains the index $$$i$$$? We can't iterate from left-to-right and check for every index as that takes $$$O(N)$$$ time which is too slow.

Suppose you want to know the index of the left end point of the stick that contains the index $$$i$$$, It is simply the value $$$i -A_i + 1$$$. We can get this value from in $$$O(\log N)$$$ time (because It takes $$$O(\log N)$$$ time to get the value of $$$A_i$$$ from the segment tree).

So we can easily find the left-end point of a stick but we can't find the right-end point? That doesn't sound right, because there is nothing about this problem statement that is special about the left. So the left-side and the right-side should be somewhat symmetric.

Fixing the problem with cut operation

Because of our labeling with the array $$$A$$$ we only get the information about the left-end point for every stick. So, we should just make another array that does the opposite of $$$A$$$. Let us call this array $$$B$$$. So $$$B_i$$$ will hold the suffix length of the stick at index $$$i$$$.

This double-sided labeling turns out to be very useful. Now if we want the left and right endpoints of the stick that contains the index $$$i$$$:

  • $$$\text{left endpoint} = i + (1 - A_i)$$$
  • $$$\text{right endpoint = } i + (B_i - 1)$$$
  • Let us define the function $$$\text{getEndPoints(i) = (left endpoint, right endpoint)}$$$ [pair of left and right end points].

We can get the value of $$$A_i$$$ and $$$B_i$$$ for any $$$i$$$ in $$$O(\log N)$$$ time.

Notice that after cutting at $$$x = 2$$$ meters, the subarray $$$B[2 \dots 4]$$$ will be correctly filled. It is $$$B[0 \dots 1]$$$ that is now incorrect. So to perform the cut operation, we have to update the values in both $$$A$$$ and $$$B$$$.

Before cutting, $$$B[0 \dots 1] = [5, 4]$$$, and after cutting we want it to be $$$B[0 \dots 1] = [2, 1]$$$. Again we observe that $$$[2, 1] - [5, 4] = [-3, -3]$$$

If we add $$$-3$$$ to every index in the range $$$[0, 1]$$$ then our array $$$B$$$ gets updated correctly.

In general if we have to make a cut at $$$x=i$$$ meters and the index of the left endpoint of the stick is at $$$l$$$ then we need to do the following to correctly update the array $$$B$$$.

  • Let $$$t := B_{i-1}$$$
  • $$$B_j := B_j + (1 -t)$$$ for all $$$j$$$ in the range $$$[l, i-1]$$$

cut operation

So finally, to make a cut at $$$x=i$$$ meters, we have to do the following:

  • Let $$$l, r := \text{getEndPoints(i)}$$$
  • Let $$$v_A := A_i$$$ and $$$v_B := B_{i-1}$$$
  • Set $$$A_j := A_j + (1 - v_A)$$$ for all $$$j$$$ in the range $$$[i, r]$$$
  • Set $$$B_j := B_j + (1 - v_B)$$$ for all $$$j$$$ in the range $$$[l, i-1]$$$

So now the arrays $$$A$$$ and $$$B$$$ look like

Merge operation

Lets say you want to join two consecutive sticks.

To update the values of $$$A$$$, simply add the length of the left stick to each element of $$$A$$$ covered by the right stick. Similarly, to update $$$B$$$, add the length of the right stick to each element of $$$B$$$ covered by the left stick.

  • Let $$$l_1, r_1 := \text{getEndPoints(i)}$$$
  • Let $$$l_2, r_2 := \text{getEndPoints}(r_1 + 1)$$$
  • Set $$$A_j := A_j + (r_1 - l_1 + 1)$$$ for all $$$j$$$ in the range $$$[l_2, r_2]$$$
  • Set $$$B_j := B_j + (r_2-l_2+1)$$$ for all $$$j$$$ in the range $$$[l_1, r_1]$$$

It is easy to verify that the updated values of $$$A$$$ and $$$B$$$ become consistent with their definitions

Suppose we perform the merge operation (query of type $$$2$$$) for index $$$9$$$ in the above example. The figure now becomes

and again perform the merge operation (query of type $$$2$$$) for index $$$7$$$. The figure then becomes

So far we have successfully done queries of type $$$1$$$ and $$$2$$$.

Answering query of type $$$3$$$

The length of the stick containing the index $$$i$$$ $$$= \text{getEndPoints(i).second -getEndPoints(i).first +1}$$$

Answering query of type $$$4$$$

Query of type $$$4$$$ is quite simple. Just do the following:

  • Cut at $$$L$$$ meters.
  • Cut at $$$R$$$ meters.
  • Print the value of $$$\max(A_{L-1}, A_L, \dots, A_{R-1})$$$
  • Merge at $$$L-1$$$
  • Merge at $$$R-1$$$

Answering query of type $$$5$$$

Query of type 5 is almost exactly the same as type $$$4$$$.

  • Cut at $$$L$$$ meters.
  • Cut at $$$R$$$ meters.
  • Print the value of $$$A_{L-1} + A_L + \dots + A_{R-1}$$$
  • Merge at $$$L-1$$$
  • Merge at $$$R-1$$$

There are actually too many cut and merge operations that are happening. Although asymptotically it is still $$$O(\log N)$$$ and optimal, in practice, it may slow the program. We can do some case analysis and reduce the number of these function calls. I won't describe them here but they will be implemented in the source code.

Implementation

I have used the AtCoder Library for the implementation of the lazy segment tree required to support range update/query operations. You may want to read the documentation for the lazy segment tree that is used in AtCoder Library.

In the implementation I call the array $$$A$$$ as $$$\text{left_label}$$$ and $$$B$$$ as $$$\text{right_label}$$$.

The implementation has fewer function calls of cut and merge while getting the answer for queries of type $$$3, 4$$$, and $$$5$$$, than what is described above. It is left as an exercise to the reader to figure out these minor optimizations from the implementation (I definitely didn't forget how they work)

Implementation of the data structure

Example Problems

ABC217 D Cutting Woods

This is the problem that inspired the trick. This contest was held on Sep 4th 2021. I thought of this trick during the contest, implemented the basics of it and submitted it. During the contest I used a fenwick tree because only range add and point query was needed.

The intended solution of using a std::set<int> never occurred to me during the contest. So I was continuously thinking of compressing the long stick using coordinate compression and solving for the stick of size $$$2\cdot10^5$$$

Once we have the CMS trick the problem is quite easy. We just have to maintain a map that compresses and uncompresses coordinates.

clean code

You can refer to my submission here

CF1567E Non-Decreasing Dilemma

I encountered this problem in a rated contest and was shocked. This contest was on Sep 5th, 2021. Exactly $$$1$$$ day after the AtCoder contest, when I thought of this trick. Unfortunately, I had not implemented the code for the full version of this data structure, hence I spent a lot of time debugging bad code that I wrote during the contest.

To solve this problem we realize that a "stick" in this context is a maximal non-decreasing subarray. We can split the given array into disjoint subarrays (partition) such that each subarray is a non-decreasing subarray. Every point update can either cut or merge some sticks.

Instead of doing case analysis to determine if a point update does the cut or merge or both operations, we can explicitly cut at $$$i-1$$$ and $$$i$$$. Perform the point update and then merge at $$$i-1$$$ and $$$i$$$.

Snippet of my in-contest submission (hot garbage)
Snippet of the intended solution using generic segment tree (I wouldn't recommend)
clean code using cms trick

You can refer to my submission of the clean version here

Conclusion

I would like to thank Ashishgup, InfernoSloth and JeevanJyot for reviewing/discussing the blog. Special thanks to the_hyp0cr1t3 for many helpful suggestions to improve this blog.

I feel this is a straightforward trick and it solves very specific problems. I had a hard time searching for problems where this trick was applicable. Please post some problems in the comments if the CMS trick can solve them.

I hope this double-sided labeling trick is new to you and you learned something worthwhile. This is my first time writing a tutorial blog, so please leave a comment if I was unclear about something and I will try to clear your doubt. Upvote if you found the blog helpful.

Thank you.

Some Shower Thoughts

  • I have used $$$0$$$-based indexing and $$$[l, r]$$$ notation for denoting ranges, as this is my style. But the implementation may be neater if we use $$$1$$$-based indexing and $$$[l, r)$$$ to denote the ranges.
  • The capability of this data structure is a subset of the capabilities of a lazy segment tree. The same operations can be achieved using std::set<int> and/or a generic lazy segment tree in the same time complexity. By generic segment tree I mean, each node holds some information like the size of the subarray, leftmost stick, rightmost stick, etc. Merging $$$2$$$ nodes of the segment tree is not too difficult but it's a pain. As mentioned in the $$$\text{CF 1567E}$$$ problem, the intended code for this alternate implementation is very ugly and laborsome.
  • I haven't tried any benchmark tests between the CMS trick and the generic segment tree but I'm guessing the CMS trick implementation is faster. Feel free to try your own benchmark tests to prove/disprove my guess.
  • It may be possible to merge multiple consecutive sticks in $$$O(\log N)$$$ by lazily merging, but I haven't thought much about it.
  • It may be possible to generalize this trick to trees or some other generalization but I have been unsuccessful so far.
  • It may be possible to add more functions other than $$$\text{max, min}$$$ but I don't have a deep understanding of the abstractness of lazy propagation in segment trees. I only know that the function $$$f$$$ you apply must satisfy $$$f(ab) = f(a)f(b)$$$ as given in the AtCoder Library documentation.

Full text and comments »

  • Vote: I like it
  • +80
  • Vote: I do not like it

By Dragonado, 2 years ago, In English

ನಮಸ್ಕಾರ ಕೋಡ್‌ಫೋರ್ಸಸ್ (Hello Codeforces)!

We invite you to participate in CodeChef’s March Cook-Off, this Saturday, 5th March, rated for all.

Time: 8 PM — 10:30 PM IST. Note the new schedule.

Web3 giant Chingari is onboard to pick their recruits in the areas of Machine Learning, Rust and Web3.

These job openings are for Indians, and Non-Indians (who can work from home), and are experienced in any of the following: Rust, React, Golang,C/ C++, Python, Machine Learning, Deep Learning, Data Mining, Recommendation Systems etc. You’re also a desirable candidate if you have a deep understanding of the blockchain and cryptocurrency space and smart contract systems, specifically having a good understanding of Solana. The application for the openings are given in the Cook-Off contest landing page.

This edition of CookOff is part of the NITK annual CP contest called Inscription. We have many more tech events under our annual tech fast called Engineer 2022. You can check our events out here. Most of the problems of this contest were written by the undergraduate students of NITK ^_^

The contest will be 2.5 hours long. There will be 7 problems in Div 1/3 and 8 problems in Div 2/4. It will be rated for all four Divisions.

Joining me on the problem setting panel are:

Prizes:

  • Top 10 global Division One users will get $100 each.
  • Top 25 Indian Division One coders to get Amazon Vouchers worth Rs. 1500 each.

The video editorials of the problems will be available on our YouTube channel as soon as the contest ends. Subscribe to get notifications about our new editorials.

Also, if you have some original and engaging problem ideas, and you’re interested in them being used in CodeChef's contests, you can share them here.

Hope to see you participating. Good Luck!

Full text and comments »

  • Vote: I like it
  • +45
  • Vote: I do not like it

By Dragonado, 3 years ago, In English

Hello Codeforces!

We , aditya_chirania, niranjan.sy99, lomadapraveenreddy, janmansh and I (Dragonado) of IEEE-NITK, are extremely happy to release our VSCode Extension for CF called CodePal

CodePal is made to help Codeforces Users Code with Convenience. This extension is especially for people who want to save time in a live codeforces contest and upsolve problems comfortably. This extension responds quickly to users. It can swiftly filter through the problem list by specifying tags and ratings, create folders for contests and problems containing sample tests of each problem in them and compile and run tests automatically. For added convenience, we've created buttons to directly open problem statements and submission pages on the default browser.

Current Features of this Extension

  • View Complete ProblemSet List along with their associated tags and ratings.
  • Swiftly Filter through the ProblemSet by specifying Ratings and Tags.
  • View Currently running, Upcoming and all Past contests and also view the precise start date, time, and duration of upcoming contests.
  • Fast Folder Creation on a single click for Contests from the Codeforces contest list and Problems from the Codeforces problem list.
    • Folder for a contest contains a folder for each problem and also contains all sample test cases and program files for each problem.
    • Folder for a problem consists of all its sample test cases and a program file loaded with a template whose path may be specified in the settings.
    • Manually create a contest or problem folder by simply specifying name of contest/problem and number of problems.
  • Add additional tests to any problem.
  • Compile and run any program file against the test cases and get comprehensive results.
  • Open the problem statement or submission page with a single click, on your default browser. (You must be logged in to codeforces beforehand to open the submission page successfully)
  • Compiler may be selected and compilation flags can be set through the CodePal settings.
  • Perform Stress testing to find a counter test case for your code.
  • Get a personalized experience of viewing details of your Codeforces profile and the status of problem submissions made, by entering your Codeforces handle in the codepal settings.
  • Use AtCoder library in C++ with no extra setup.

Languages Supported

  • C++ (compiler: g++)
  • C (compiler: GCC)
  • Java (compiler: javac)
  • Python
  • Kotlin
  • Note: You may add additional compilation flags through the codepal settings (example: -std=c++14) and also choose between python, python2, or python3 depending upon the python command you use on your system to run.

Operating Systems Supported

  • Windows
  • Linux
  • Mac

Here is a snapshot of the extension

You can install our extension here: https://marketplace.visualstudio.com/items?itemName=IEEE-NITK.codepal

You can find the usage guide, developer docs, and demo GIFs here: https://github.com/IEEE-NITK/CodePal/blob/master/README.md

Please do try it and give us your feedback. If you face issues or bugs, please comment on this blog or here. We're planning some more features for our subsequent releases and wish to keep all subsequent releases as smooth functioning and responsive as this.

Thank you :)

UPDATE: We have added AtCoder library support for C++. You can now use this great library with no extra setup. You can easily submit it to any Online Judge other than AtCoder as well. For more information on usage, you check our README here

UPDATE: After the recent feature on codeforces to highlight each test case, our extension had broken. We have now patched it. We have now also added support for Haskell.

Full text and comments »

  • Vote: I like it
  • +135
  • Vote: I do not like it