Noble_Mushtak's blog

By Noble_Mushtak, history, 2 months ago, In English

I recently found an implementation of Li Chao tree in this ICPC notebook, and it is quite different than the implementations I have seen before. Usually, I see the insert_line method implemented in a way similar to that described in this cp-algorithms.com article, where you recur either on the left child or the right child depending on if the new line is above or below the current line at the midpoint of the interval. However in this implementation, they recur on both the left and the right child, but they stop recurring once the new line is completely below or completely above the current line in that node.

Does this implementation still handle inserting lines in $$$O(\log n)$$$ time in the worst case? I tried using this implementation on the Frog 3 problem on AtCoder and it got AC, but I am wondering if it is due to weak test cases, or if this implementation is actually valid and really handles inserting lines in $$$O(\log n)$$$ time. I wasn't able to find a counterexample where inserting a line would cause the implementation to traverse the whole tree, but recurring on both the left and the right children seems strange to me.

Read more »

 
 
 
 
  • Vote: I like it
  • +80
  • Vote: I do not like it

By Noble_Mushtak, history, 2 months ago, In English

I think the editorial for Round#810 is not out yet, and I found a solution for 810C: XOR Triangle which I thought was interesting, even if my solution is probably more complicated the intended solution, so I wanted to write it up both because I want to share it with others and because I feel like explaining a solution can help myself remember it better. My implementation is here.

First, notice that, since every leg of the triangle $$$a\oplus b, a\oplus c, b\oplus c$$$ is the XOR of the other two legs in the triangle, and it is well-known that $$$x+y\geq x\oplus y$$$ for all positive integers $$$x, y$$$, the triangle is degenerate if and only if $$$x+y=x\oplus y$$$ for some legs $$$x, y$$$ of the triangle, i.e. $$$x+y < x\oplus y$$$ is not a possibility. I will be using this assumption implicitly throughout my solution.

Let's say the binary string in the test case is $$$b_{N-1} \dots b_0$$$ (rightmost bit is labelled $$$b_0$$$, because it is the least significant bit). The first idea behind my solution is that you can solve it using DP, where you loop from right to left in the binary string and compute the answer for $$$n=b_i\dots b_0$$$ for all $$$i=0$$$ to $$$N-1$$$, where $$$N$$$ is the length of the given binary string. That is, you compute the answer for every $$$n$$$ which is a suffix of the binary string in the given test case.

Now, let's say we are trying to find the answer for $$$n=b_i\cdots b_0$$$. If $$$b_i=0$$$, then the answer is just whatever the last answer for $$$n=b_{i-1}\cdots b_0$$$ was, which we already know from the previous DP answer. Otherwise if $$$b_i=1$$$, we have to do some work.

If $$$b_i=1$$$, I split the answer into four cases and then I add the answers from each of the four cases up to get the total answer:

  1. Number of triples where exactly 0 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
  2. Number of triples where exactly 1 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
  3. Number of triples where exactly 2 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
  4. Number of triples where exactly 3 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1

For the rest of the post, let $$$lastVal$$$ be the value of $$$b_{i-1}\cdots b_0$$$, mod $$$998244353$$$. It is easy to compute $$$lastVal$$$ for all $$$i$$$ using a simple for loop, adding $$$2^i$$$ wherever $$$b_i$$$ is set in the binary string.

Case where 0 numbers have $$$i$$$ th bit set to 1

In this case, since the $$$i$$$th bit of all of $$$a,b,c$$$ are $$$0$$$, it is obvious that $$$a,b,c < n$$$, so $$$a,b,c$$$ are just arbitrary $$$i$$$-bit numbers. Let $$$d=a\oplus b$$$ and $$$e=a\oplus c$$$. Then, the legs of the triangle have lengths $$$d, e, d\oplus e$$$. Again, $$$d$$$ and $$$e$$$ are just arbitrary $$$i$$$-bit numbers, since $$$a,b,c$$$ are arbitrary $$$i$$$-bit numbers. Overall, there are $$$2^i\cdot 2^i$$$ possible values of $$$(d, e)$$$, and there are three cases where the triangle is degenerate:

  • $$$d+e=d\oplus e$$$: This happens when $$$d$$$ and $$$e$$$ are disjoint, i.e. there is no $$$j$$$ where both $$$d$$$ and $$$e$$$ have their $$$j$$$th bit set to $$$1$$$. There are $$$3^i$$$ possibilities where this could occur, because there are $$$3$$$ possibilities for each bit: each of the $$$i$$$ bits are either in $$$d$$$, in $$$e$$$, or in neither.
  • $$$d+(d\oplus e)=e$$$: This happens when $$$d$$$ is a subset of $$$e$$$, i.e. there is no $$$j$$$ where $$$d$$$ has its $$$j$$$ th bit set but $$$e$$$ does not have its $$$j$$$ th bit set. There are $$$3^i$$$ possibilities where this could occur, because there are $$$3$$$ possibilities for each bit: each of the $$$i$$$ bits are either in $$$d$$$, in both $$$d$$$ and $$$e$$$, or in neither.
  • $$$e+(d\oplus e)=d$$$: This happens when $$$e$$$ is a subset of $$$d$$$, similar to above case, there are $$$3^i$$$ possibilities where this could occur.

Now, we need to account for the intersections of these cases:

  • Both $$$d$$$ and $$$e$$$ are disjoint and $$$d$$$ is a subset of $$$e$$$ when $$$d=0$$$ -> $$$2^i$$$ possibilities
  • Both $$$d$$$ and $$$e$$$ are disjoint and $$$e$$$ is a subset of $$$d$$$ when $$$e=0$$$ -> $$$2^i$$$ possibilities
  • $$$d$$$ is a subset of $$$e$$$ and $$$e$$$ is a subset of $$$d$$$ when $$$d=e$$$ -> $$$2^i$$$ possibilities

Finally, all three cases occur at the same time when $$$d=e=0$$$ -> $$$1$$$ possibility

Thus, total number of pairs $$$(d,e)$$$ which lead to a non-degenerate triangle is: $$$4^i-3^i-3^i-3^i+2^i+2^i+2^i-1=4^i-3\cdot 3^i+3\cdot 2^i-1$$$

Finally, notice that for every pair $$$(d,e)$$$, there are $$$2^i$$$ triples $$$(a,b,c)$$$ where $$$(a\oplus b,a\oplus c)=(d,e)$$$. This is because for every pair $$$(d,e)$$$, any pair of the form $$$(a,b,c)=(f, f\oplus d, f\oplus e)$$$ for an arbitrary $$$i$$$-bit number $$$f$$$ will lead to $$$(a\oplus b,a\oplus c)=(d,e)$$$, and there are $$$2^i$$$ possibilities for $$$f$$$. Ergo, we need to multiply the above answer by $$$2^i$$$.

Case where 1 numbers have $$$i$$$ th bit set to 1

Without loss of generality, assume $$$a$$$ has $$$i$$$ th bit set to 1 and $$$b$$$ and $$$c$$$ have $$$i$$$ th bit set to 0. (We'll multiply the answer from this case by $$$3$$$ at the end to account for this arbitrary choice.) Thus, $$$a=2^i+f$$$ for some $$$0\leq f \leq lastVal$$$ and $$$b$$$ and $$$c$$$ are arbitrary $$$i$$$-bit numbers. The triangle has legs $$$2^i+(b\oplus f), 2^i+(c\oplus f), (b\oplus c)$$$. If we let $$$g$$$ and $$$h$$$ be $$$b\oplus f$$$ and $$$c\oplus f$$$, respectively, then the triangle has legs $$$2^i+g, 2^i+h, g\oplus h$$$. Like the last case, the number of possibilities for the pair $$$(g,h)$$$ is $$$2^i\cdot 2^i=4^i$$$.

There are two cases where the triangle could be degenerate (notice that $$$2^i+g+2^i+h=g\oplus h$$$ is not possible because the left side is at least $$$2^i$$$ while the right side is an $$$i$$$-bit number):

  • $$$2^i+g+(g\oplus h)=2^i+h$$$: This happens when $$$g$$$ is a subset of $$$h$$$, so there are $$$3^i$$$ possibilities
  • $$$2^i+h+(g\oplus h)=2^i+g$$$: This happens when $$$h$$$ is a subset of $$$g$$$, so there are $$$3^i$$$ possibilities

Finally, the intersection of both cases occur when $$$h=g$$$, so there are $$$2^i$$$ possibilities for the intersection

Thus, total number of pairs $$$(g,h)$$$ which lead to a non-degenerate triangle is: $$$4^i-3^i-3^i+2^i=4-2\cdot 3^i+2^i$$$

Finally, notice that for every pair $$$(g,h)$$$, there are $$$lastVal+1$$$ triples $$$(a,b,c)$$$ where $$$a=2^i+f$$$ and $$$(b\oplus f, c\oplus f)=(g,h)$$$. In the last case, this was $$$2^i$$$ because $$$f$$$ could be any arbitrary $$$i$$$-bit number, but here, it is $$$lastVal+1$$$ because we have the restriction $$$0\leq f\leq lastVal$$$, and there are only $$$lastVal+1$$$ $$$i$$$-bit numbers which satisfy this inequality. Ergo, we need to multiply the above answer by $$$lastVal+1$$$, and then multiply by $$$3$$$ to account for the arbitrary choice of making $$$a$$$ have the $$$1$$$ bit.

Case where 2 numbers have $$$i$$$ th bit set to 1

Without loss of generality, assume that both $$$a$$$ and $$$b$$$ have their $$$i$$$ th bit set to $$$1$$$ and $$$c$$$ has its $$$i$$$ th bit set to $$$0$$$. (We'll multiply the answer from this case by $$$3$$$ at the end to account for this arbitrary choice.) Then, $$$a=2^i+f$$$ and $$$b=2^i+g$$$ for $$$0\leq f,g\leq lastVal+1$$$ and $$$c$$$ is an arbitrary $$$i$$$-bit number. The legs of the triangle have lengths $$$f\oplus g, 2^i+(f\oplus c), 2^i+(g\oplus c)$$$, and there are $$$(lastVal+1)\cdot (lastVal+1)\cdot 2^i$$$ total possibilities for the triple $$$(f,g,c)$$$ (and there is a one-to-one mapping between triples $$$(a,b,c)$$$ and triples $$$(f,g,c)$$$ by the equations $$$a=2^i+f$$$ and $$$b=2^i+g$$$, so it suffices to count the number of triples $$$(f,g,c)$$$ for this case).

The triangle is degenerate in two cases: (notice that $$$2^i+(f\oplus c)+2^i+(g\oplus c)=f\oplus g$$$ is not possible because the left side is at least $$$2^i$$$ while the right side is an $$$i$$$-bit number):

  • $$$2^i+(f\oplus c)+(f\oplus g)=2^i+(g\oplus c)$$$: In this case, $$$(f\oplus c)+(f\oplus g)=(g\oplus c)$$$, meaning that $$$f\oplus c$$$ is a subset of $$$g\oplus c$$$. In terms of sets, XOR is like symmetric difference so this means $$$c$$$ contains all of $$$f\setminus g$$$ and is disjoint with $$$g\setminus f$$$. In other words, $$$c$$$ has a $$$1$$$ bit wherever $$$f$$$ has a $$$1$$$ bit and $$$g$$$ has a $$$0$$$ bit and $$$c$$$ has a $$$0$$$ bit wherever $$$g$$$ has a $$$1$$$ bit and $$$f$$$ has a $$$0$$$ bit. Ergo, if we fix $$$f$$$ and $$$g$$$, then every bit in $$$c$$$ where the bits of $$$f$$$ and $$$g$$$ differ is determined, but every other bit in $$$c$$$ is free, i.e. every other bit could be either $$$0$$$ or $$$1$$$. The bits where $$$f$$$ and $$$g$$$ differ are exactly those bits where $$$f\oplus g$$$ has a $$$1$$$ bit. Ergo, there are $$$popcnt(f\oplus g)$$$ bits where that bit in $$$c$$$ is determined, so $$$i-popcnt(f\oplus g)$$$ bits in $$$c$$$ are free, meaning there are $$$2^{i-popcnt(f\oplus g)}$$$ possible values for $$$c$$$. (Note that $$$popcnt$$$ is a function that takes in a nonnegative integer and returns the number of $$$1$$$ bits.) Ergo, the total number of triples $$$(f,g,c)$$$ where this occurs is: $$$\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{i-popcnt(f\oplus g)}$$$
  • $$$2^i+(g\oplus c)+(f\oplus g)=2^i+(f\oplus c)$$$: Same number of possibilities as previous case, so we'll just multiply the number of possibilities we got from the previous case by two.

Finally, the intersection of both cases happens only when $$$f\oplus g=0$$$, i.e. when $$$f=g$$$. Since there are $$$lastVal+1$$$ possibilities for $$$f$$$ and, in this case, $$$c$$$ could be any $$$i$$$-bit number, there are $$$2^i\cdot (lastVal+1)$$$ possibilities for the intersection.

Thus, in total, the number of triplets where $$$(f,g,c)$$$ leads to a nondegenerate triangle is:

$$$2^i\cdot (lastVal+1)^2-2\left(\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{i-popcnt(f\oplus g)}\right)+2^i\cdot lastVal$$$

Factor out $$$2^i$$$ from everything for convenience:

$$$2^i\left( (lastVal+1)^2-2\left(\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{-popcnt(f\oplus g)}\right)+lastVal \right)$$$

Notice that negative exponents are OK because we are working in modular arithmetic, and $$$2$$$ is invertible modulo the modulus they give us.

Let $$$y_i=\sum_{0\leq f\leq b_i\dots 0} \sum_{0\leq g\leq b_i\dots b_0} 2^{-popcnt(f\oplus g)}$$$. I'll explain how to compute $$$y_i$$$ using DP at the end of this post, but for now, just notice that we can replace the big sum in the expression above with $$$y_{i-1}$$$.

Finally, don't forget to multiply by $$$3$$$ due to the arbitrary choice we made at the beginning.

Case where 3 numbers have $$$i$$$ th bit set to 1

In this case, there are numbers $$$f, g, h$$$ such that $$$0\leq f,g,h\leq lastVal$$$ such that $$$a=2^i+f, b=2^i+g, c=2^i+h$$$ and the legs of the triangle have lengths $$$f\oplus g, f\oplus h, g\oplus h$$$, i.e. the $$$i$$$ th bits cancel out when we take the XORs since every number has their $$$i$$$ th bit set. Ergo, the answer for this case is just the answer we computed for $$$n=b_{i-1}\dots b_0$$$ previously, so just use that previous answer for this case.

We finally went through all four cases, so we are basically done. Now we just need to figure out how to compute $$$y_i$$$ for all $$$i=0$$$ to $$$N-1$$$.

First, the base case is $$$y_{-1}=1$$$. I treat $$$b_{-1}\dots b_0$$$ as an "empty" number which is equal to $$$0$$$, and it is easy to verify that $$$\sum_{0\leq f\leq 0}\sum_{0\leq g\leq 0} 2^{-popcnt(f\oplus g)}=2^{-0}=1$$$.

Now let's handle the recurrence case: First, if $$$b_i=0$$$, then $$$y_i=y_{i-1}$$$ so we can just use the previous answer. Otherwise, if $$$b_i=1$$$, we split the sum into four cases:

  • $$$2^i\leq f, g\leq b_i\dots b_0$$$: In this case, both $$$f$$$ and $$$g$$$ has their $$$i$$$th bit set, so $$$f\oplus g$$$ cancels the $$$i$$$ th bit out. Ergo, this is the same as summing over $$$0\leq f,g\leq b_{i-1}\dots b_0$$$, so we just use the previous answer of $$$y_{i-1}$$$ in this case.
  • $$$2^i\leq f\leq b_i\dots b_0$$$ and $$$0\leq g\leq 2^i-1$$$: In this case, notice that for any fixed $$$f$$$, $$$f\oplus g$$$ is some $$$(i+1)$$$-bit number with its $$$i$$$ th bit set, which we can represent as $$$2^i+z$$$ for some $$$i$$$-bit number $$$z$$$. Moreover, for any $$$i$$$-bit number $$$z$$$, the only solution to the equation $$$f\oplus g=2^i+z$$$ is $$$g=f\oplus (2^i+z)$$$. In other words, for every $$$i$$$-bit number $$$z$$$, there is exactly one $$$g$$$ such that $$$f\oplus g=2^i+z$$$, so we have: $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=\sum_{0\leq z\leq 2^i-1} 2^{-popcnt(2^i+z)}=\sum_{0\leq z\leq 2^i-1} 2^{-(1+popcnt(z))}=2^{-1}\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$. Let $$$x_i=\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$. I'll explain how to compute $$$x_i$$$ at the very end of this post. Thus, since $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}$$$ doesn't actually depend on the value of $$$f$$$, we find that $$$\sum_{2^i\leq f\leq b_i\dots b_0}\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=(lastVal+1)\cdot 2^{-1}\cdot x_i$$$, since there are $$$lastVal+1$$$ possible values for $$$f$$$.
  • $$$2^i\leq g\leq b_i\dots b_0$$$ and $$$0\leq f\leq 2^i-1$$$: Same as previous case, so we can just multiply the answer by the previous case by $$$2$$$, which gets rid of the $$$2^{-1}$$$
  • $$$0\leq f,g\leq 2^i-1$$$: Similar to the previous case, for any fixed $$$f$$$, and for any $$$i$$$-bit number $$$z$$$, there is exactly one $$$g$$$ such that $$$f\oplus g=z$$$, so $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=\sum_{0\leq z\leq 2^i-1} 2^{-popcnt(z)}=x_i$$$. Since the value of this sum doesn't depend on $$$f$$$ and there are $$$2^i$$$ possible values of $$$f$$$, we find that $$$\sum_{0\leq f,g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=2^i\cdot x_i$$$.

Now we know how to compute $$$y_i$$$, but we need to compute $$$x_i=\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$ for all $$$i=0$$$ to $$$N-1$$$. First, $$$x_0=1$$$, because $$$\sum_{0\leq z\leq 0} 2^{-popcnt(z)}=2^{-0}=1$$$.

Now let's handle the recurrence case: We split the sum into two cases:

  • $$$0\leq z\leq 2^{i-1}-1$$$: $$$\sum_{0\leq z\leq 2^{i-1}-1} 2^{-popcnt(z)}$$$ is just $$$x_{i-1}$$$, so we just use the previous answer of $$$x_{i-1}$$$ in this case.
  • $$$2^i\leq z\leq 2^i-1$$$: Here, every $$$2^i\leq z\leq 2^i-1$$$ can be written as $$$z=2^i+w$$$ for some $$$0\leq w\leq 2^{i-1}-1$$$, and $$$popcnt(z)=popcnt(2^i+w)=1+popcnt(w)$$$, so we have $$$\sum_{2^{i-1}\leq z\leq 2^i-1} 2^{-popcnt(z)}=\sum_{0\leq w\leq 2^{i-1}-1} 2^{-(1+popcnt(w))}=2^{-1} \sum_{0\leq w\leq 2^{i-1}-1} 2^{-popcnt(w)}=2^{-1}\cdot x_{i-1}$$$.

Ergo, the recurrence for $$$x_i$$$ is simply $$$x_i=(1+2^{-1})x_{i-1}$$$, and with this, we have all the info we need to solve the problem.

Read more »

 
 
 
 
  • Vote: I like it
  • +93
  • Vote: I do not like it

By Noble_Mushtak, history, 3 years ago, In English

Hello. As some of you may know, the Mount Allison Programming Showdown 2020 took place yesterday, and I had a question about the Divide and Conquer problem.

You can read the problem yourself, but basically, the problem gives you two integers $$$b$$$ and $$$d$$$, where $$$1 < b, d < 2^{63}$$$ and $$$d$$$ is prime, and then asks you if there exists an $$$m$$$ such that $$$b^m \equiv -1 \pmod{d}$$$. You need to output yes if such an $$$m$$$ exists and output no otherwise. (This is an oversimplification of the problem because the actual problem statement is quite long, so it is possible I misread what it is asking, but I am just giving this short summary of the problem so that I don't have to repeat the whole problem statement in this post.)

Here is how my solution works:

  • If $$$d=2$$$, then $$$m$$$ exists if and only if $$$b$$$ is odd.
  • If $$$d$$$ is an odd prime, then $$$m$$$ exists if and only if $$$b$$$ has even order in the group of units $$$\pmod{d}$$$.
    • PROOF: If $$$m$$$ exists, then we can let $$$m'$$$ is the smallest positive integer such that $$$b^{m'}\equiv -1\pmod{d}$$$. Thus, the order of $$$b$$$ must be greater than $$$m'$$$. Moreover, we have $$$b^{2m'}\equiv (-1)^2\equiv 1\pmod{d}$$$, so the order of $$$b$$$ must be at most $$$2m'$$$. Now, for any integer $$$m < n < 2m'$$$, it holds that $$$0 < n-m' < m'$$$, so $$$b^{n-m'}\not\equiv -1\pmod{d}$$$ since $$$m'$$$ is the smallest positive integer such that $$$b^{m'}\equiv -1\pmod{d}$$$ and $$$n-m' < m'$$$. Thus, $$$b^n\equiv b^mb^{n-m}\equiv -b^{n-m}\not\equiv 1\pmod{d}$$$, so the order of $$$b$$$ is not $$$n$$$ for any $$$m' < n < 2m'$$$. Therefore, the order of $$$b$$$ must be $$$2m'$$$, so the order of $$$b$$$ is even.
    • PROOF CONTINUED: If $$$b$$$ has even order, then let $$$2m$$$ be the order of $$$b$$$. Then, $$$b^{2m}\equiv (b^m)^2\equiv 1\pmod{d}$$$. Thus, either $$$b^m\equiv -1\pmod{d}$$$ or $$$b^m\equiv 1\pmod{d}$$$. However, since $$$2m$$$ is the order of $$$b$$$, $$$b^m\not\equiv 1\pmod{d}$$$, so it must be that $$$b^m\equiv -1\pmod{d}$$$.
    • Now, to see if the order of $$$b$$$ is even, calculate $$$b^n\pmod{d}$$$, where $$$n$$$ is the biggest odd factor of $$$d-1$$$. If $$$b^n\equiv 1\pmod{d}$$$, then the order of $$$b$$$ divides $$$n$$$, so $$$b$$$ has odd order, so output no.
    • Otherwise, if $$$b^n\not\equiv 1\pmod{d}$$$, then the order of $$$b$$$ does not divide $$$n$$$. Since the order of $$$b$$$ must divide $$$d-1$$$, this means that the order of $$$b$$$ must be even, so output yes.

However, when I submit a solution off the above methods, I get Wrong Answer. Can anyone see what I am doing wrong? I have included my C++ code below. Thanks for the help.

#pragma GCC optimize("Ofast")
#pragma GCC optimize("unroll-loops")
#pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,abm,mmx,avx,avx2")
#include <bits/stdc++.h>

typedef int64_t num;

static inline num multModulo(num num1, num num2, num MOD) {
    return ( static_cast<__int128>(num1) * num2) % MOD;
}

num calcModuloExp(num base, num exp, num MOD) {
    num result = 1, cur = base;
    while (exp) {
        if (exp & 1) result = multModulo(result, cur, MOD);
        cur = multModulo(cur, cur, MOD), exp >>= 1;
    }
    return result;
}

int main() {
    num b, d;
    scanf("%" PRId64 " %" PRId64, &b, &d);
    if (d == 2) {
        puts((b & 1) ? "yes" : "no");
    } else {
        num biggestOddFactor = d-1;
        while (!(biggestOddFactor & 1)) biggestOddFactor >>= 1;
        num answer = calcModuloExp(b, biggestOddFactor, d);
        puts((answer == 1) ? "no" : "yes");
    }

    return 0;
}

Read more »

 
 
 
 
  • Vote: I like it
  • +3
  • Vote: I do not like it

By Noble_Mushtak, history, 3 years ago, In English

Usually, I hear people saying that std::cin and std::cout is a relatively slow form of input while scanf and printf is a faster form of input which should be sufficient for all problems. However, sometimes I wonder if there are any problems where scanf and printf would be too slow.

For example, look at the problem 1181D — Irrigation. This problem asks us to read in 500,000 numbers, each of which could be as large as 500,000, and then asks us to read in 500,000 more numbers, each of which could be as large as $$$10^{18}$$$. Clearly, this is a lot of input/output, so users should use scanf and printf over std::cin and std::cout. However, even when I used scanf and printf, my solution ran in 2355 ms, barely under the time limit of 2.5 seconds. On the other hand, when I wrote my code using an I/O library I wrote myself, which uses fread and fwrite to read/write 32768 bytes at a time, my improved solution ran in 592 ms, almost 4 times faster than the original solution.

Since my code barely ran under time using scanf and printf, I wonder if there are any CodeForces problems where using scanf and printf will inevitably lead to a Time Limit Exceeded error and users must use some other I/O method like I did. Has anyone else encountered a problem where scanf and printf just weren't quite fast enough? Moreover, has anyone else built a custom I/O library for competitive programming like I did? If anyone has knows how to do input/output faster than fread and fwrite, I would love to hear about it.

Read more »

 
 
 
 
  • Vote: I like it
  • +5
  • Vote: I do not like it

By Noble_Mushtak, history, 3 years ago, In English

Recently, I have become interested in x86 assembly, and I wanted to try out my skills on CodeForces using __asm__ extension in GCC. For example, I was very pleased when I solved problem 495B using a very naive algorithm implemented in assembly which would probably have TLEd if it was written in regular C (basically, looped all the way up to $$$\frac{n}{4}$$$ to find all divisors of $$$n$$$ for $$$n \leq 10^9$$$ rather than using $$$O(\sqrt{n})$$$ algorithm).

However, this algorithm was implemented by using only 32-bit numbers and registers, so I was curious as to how the performance would be affected by using 64-bit numbers and registers. Therefore, I tried to use the %rax, %rbx, etc. registers instead of the %eax, %ebx, etc. registers, but for some reason, this gave me a "bad register" error. I then tried running my code using Clang++17 Diagnostics and this gave me a much clearer error:

p71.cpp:24:13: error: register %rax is only available in 64-bit mode
    __asm__("push %%rax \n\t"

Of course, this implies that CodeForces is compiling our programs using 32-bit mode, so using 64-bit integers is even slower than it would normally be, because GCC/G++/Clang++ has to manipulate multiple registers in order to implement 64-bit operations in 32-bit x86. The fact that CodeForces uses 32-bit mode surprised me very much, since 64-bit has been the standard for computers and even smartphones for years. In fact, Ubuntu recently decided to drop its support for 32-bit computers in its upcoming versions, showing how old 32-bit really is.

Given all this, why does CodeForces continue to use GCC/G++/Clang++ in 32-bit version? Is this to penalize programmers from using 64-bit integers when it's not necessary, or do they use 32-bit mode for a more technical reason? I am not sure using 32-bit mode makes a huge difference since I find most problems can be solved using only 32-bit numbers anyway, but I am very curious as to why CodeForces made the decision to choose 32-bit over 64-bit.

Read more »

 
 
 
 
  • Vote: I like it
  • +17
  • Vote: I do not like it

By Noble_Mushtak, history, 3 years ago, In English

On SPOJ, there is a problem named POLYMUL, which is pretty straightforward: Multiply two polynomials, which have integer coefficients in the interval $$$[0, 1000]$$$ and can each have a degree of up to $$$10000$$$.

Currently, I am trying to solve this problem with NTT and my solution can be found here. As the solution shows, I am doing NTT with the primes of both $$$998244353$$$ and $$$2013265921$$$, and then using Chinese Remainder Theorem in order to find the true coefficients without any modulo (I am assuming the coefficients are less than $$$1000^2\cdot 10000=10^{10}$$$, which is much less than the product of $$$998244353$$$ and $$$2013265921$$$). On my machine, it takes about 3 seconds for this solution to solve a test case with $$$T=10$$$ (i.e. $$$10$$$ pairs of polynomial) where each polynomial in each pair is of degree $$$10000$$$ and the coefficients are integers randomly picked from the interval $$$[0, 1000]$$$.

However, the time limit is 1 second, and I do not know what to do to optimize my solution more. Does any one have any suggestions on how to optimize my NTT algorithm or how to optimize my C code in general? Any solutions would be very welcome. Thanks!

Read more »

 
 
 
 
  • Vote: I like it
  • +11
  • Vote: I do not like it

By Noble_Mushtak, history, 6 years ago, In English

All solutions below are thoroughly commented and based on the official editorial. I might start doing this for one or two editorials every month starting now because writing alternate solutions for 701C and 701D and writing solutions for 701E and 701F (could not figure out on my own) really helped me understand the problems better. Also, I'm sorry if the later solutions are more clunky; I'm not used to solving problems of that difficulty. Thanks!

701A — Cards

Their explanation is pretty straight-forward, so here's the solution.

701B — Cells Not Under Attack

Here, I deviate from the editorial in two ways:

  • Because I am using C and do not have set, I instead mark rows and columns as attacked in an array of bools and count the number of attacked rows and columns as I go along.
  • Their formula is n·n - cnt where cnt = szY·n + szX·n - szX·szY. This gives us n·n - szY·n - szX·n + szX·szY = (n - szX)(n - szY). I use the latter in my solution.

Here is my solution for this problem.

701C — They Are Everywhere

The confusing part of their answer is that they're not really clear about when they're talking about looping through the different types of letters in s or all of the letters in order. Here is my implementation of their solution.

701D — As Fast As Possible

Here is my binary search solution. The problem with their explanation here is it's out of order, they never actually show their equation for what I called x, their variable posM convoluted their formula for x, and they never actually solve for what I called y but instead only briefly mention that solutions must account for the bus going back.

701E — Connecting Universities

I think the way they explained 701E pretty well other than the fact that lv is unnecessary since lv  =  1 for all v. However, it is missing lots of detail if you're not so good at working with trees. Here's my implementation of the editorial solution.

701F — Break Up

I could not understand from the editorial's explanation how to find the second edge in the case of two edges until I looked at the dfs2() function from this solution to 701F from Vosatorp, so thanks to them for this solution!

Basically, in our depth first search, if an edge from va to vb in path from s to t is unnecessary, then there will contain a cycle with vb and a vertex before vb in the path. This is because since the edge is unnecessary, there exists another path from s to t without that edge. Let vl be the vertex farthest along the original path that is before vb on this new path. and vr be the vertex closest to vb that is after it on this new path. Then, using the old path, we get to vb, then vr and using the new path, we get to vl. Thus, by this process, we have created a cycle with both vb and vl, proving our original point. Ergo, by contrapositives, edges where vb is not in any cycle with a previous vertex in the path are necessary.

This process seems to be a variation on Tarjan's Bridge-Finding algorithm which I did not know before.

Other than that proof of correctness, everything else is pretty much explained in the comments of my re-writing of Vosatorp's solution in C, without vector.

700D — Huffman Coding on Segment

Coming when I have enough time/experience to understand the solution!

700E — Cool Slogans

Coming when I have enough time/experience to understand the solution!

Read more »

 
 
 
 
  • Vote: I like it
  • +12
  • Vote: I do not like it

By Noble_Mushtak, history, 6 years ago, In English

There's this really good article on how to find the maximal empty subgrid of a grid so one might wonder, "How do we find the number of empty subgrids of a grid?" I first saw the answer to the second question in this solution to CHSGMNTS on CodeChef by lohit_97 (found on CodeChef and on CodeForces), so thanks to them for the following algorithm!

First, here is a more formalized way of saying the problem:

  • The first line of input contains two integers width and height (where 1 ≤ width, height ≤ 1000).
  • The next height lines of input contain width integers that fit into long in C/C++.
  • Let matrix[i][j] be the integer number i on row number j.
  • Find the number of tuples (a, b, c, d) such that matrix[i][j] is 0 for all a ≤ i ≤ c and b ≤ j ≤ d.

The way we can solve this problem is by keeping an array emptyToLeft[i][j] which is the maximum number such that i - emptyToLeft[i][j] < x ≤ i implies matrix[x][j] is 0. (If matrix[i][j] is not 0, then we just set emptyToLeft[i][j] to 0 to make the inequality always false.)

Once we have that, we loop through each column i and in each column, we make a stack and a tempAnswer = 0. In the bottom of a stack, we start with  - 1 and as we loop through each row j, each row is popped on and some rows are popped off in order to preserve the property that between two consecutive stack elements a and b where b is directly above a, we have a < y ≤ b implies emptyToLeft[i][y] ≥ emptyToLeft[i][b]. By the definition of emptyToLeft, this means i - emptyToLeft[i][b] < x ≤ i and a < y ≤ b implies matrix[x][y] is 0. Thus, all such (x, y) as (a, b) and (i, j) as (c, d) in the tuple (a, b, c, d) is an answer. Therefore, two consecutive stack elements a and b give us (b - a) * emptyToLeft[i][b] answers, so we add that to tempAnswer when we add b to the stack and subtract that from tempAnswer when we pop b off. Finally, the answer is simply the sum of all tempAnswers across all i, j.

Here is the C code for this algorithm:

#include <stdio.h>
#include <stdlib.h>
#define REPEAT(token, num) for (token = 0; token < num; token++)

typedef long num_cells;
typedef long cell;
typedef long long num_matrices;

cell matrix[1000][1000];
num_cells width, height, emptyToLeft[1000][1000], stack[1000], stackLength;
num_matrices answer, tempAnswer;

int main() {
    num_cells i, j, prevToLeft;
    //Get the input for the matrix:
    scanf("%li %li", &width, &height);
    REPEAT(j, height) REPEAT(i, width) scanf("%li", matrix[i]+j);

    //Create emptyToLeft by looping through the rows:
    REPEAT(j, height) {
        //This is the previous emptyToLeft, which starts out at 0 because at the beginning of a row, there are no elements to the left.
        prevToLeft = 0;
        //Loop through each column:
        REPEAT(i, width) {
            //If matrix[i][j] != 0, then emptyToLeft[i][j] = 0.
            if (matrix[i][j]) emptyToLeft[i][j] = 0;
            //Otherwise, add one to prevToLeft.
            else emptyToLeft[i][j] = prevToLeft+1;
            //Also, update prevToLeft:
            prevToLeft = emptyToLeft[i][j];
        }
    }

    REPEAT(i, width) {
        //For each column, create a stack with -1 at the bottom.
        stack[0] = -1, stackLength = 1;
        //Also, reset tempAnswer:
        tempAnswer = 0;
        //Loop through each row:
        REPEAT(j, height) {
            //If emptyToLeft[i][stack[stackLength-1]] >= emptyToLeft[i][j], then putting j onto the stack will violate the stack property, so we need to pop the stack:
            while (stack[stackLength-1] != -1 && emptyToLeft[i][stack[stackLength-1]] >= emptyToLeft[i][j]) {
                //Update tempAnswer now that we are going to pop the stack:
                //Note that here, stack[stackLength-1] is described as b above while stack[stackLength-2] is a.
                tempAnswer -= (stack[stackLength-1]-stack[stackLength-2])*emptyToLeft[i][stack[stackLength-1]];
                //Pop the stack:
                stackLength--;
            }
            //Add the current row to the stack and update tempAnswer accordingly.
            stack[stackLength++] = j;
            tempAnswer += (stack[stackLength-1]-stack[stackLength-2])*emptyToLeft[i][stack[stackLength-1]];
            //Finally, add tempAnswer to answer across all i, j.
            answer += tempAnswer;
        }
    }
    //Print the answer:
    printf("%lli\n", answer);
    
    exit(0);
}

Notice that we have a while loop inside of the j loop, giving us a third inner loop and thus possibly O(height2width) complexity. However, each run of the while loop pops an element off the stack and we only add an element to the stack for each j, meaning overall, for one i, we can do at most height pops. Thus, the j loop is O(height) for each i and the while loop is O(height) for each i, which gives us O(height·width) complexity for this problem.

Read more »

 
 
 
 
  • Vote: I like it
  • -20
  • Vote: I do not like it

By Noble_Mushtak, history, 6 years ago, In English

After the contest, I read other people's solution to Vacations and people kept using big dynamic programming arrays on Div. 2 Problem C, which confused me, so I decided to show all of you that this was unnecessary by solving this problem using only one uint_32 variable in C. Using bit fields in a struct, we can get this down to only three bytes.

The intuition behind this solution is that you only need to keep track of what is possible from the day before and if nothing is possible, then you have to rest this day and everything will be possible the next day. I think it's kind of a greedy algorithm because unlike the DP algorithms, which seemed to take into account solutions where we rest early in order to do something the next day, this algorithm does something on every day when it can and rests only when it finds that it needs to. I do this because whether or not we rest early or the day after when we have to is irrelevant: We still end up resting one day, so the answer is the same. Thus, no dynamic programming is necessary and we only need to keep track of one answer at every point.

Read more »

 
 
 
 
  • Vote: I like it
  • +10
  • Vote: I do not like it

By Noble_Mushtak, history, 6 years ago, In English

Here is my analysis of the solution to Contest 362 Div. 1 Problem A:

  • There are edges per query.
  • Each edge takes to process. There are added every query and O(q) queries, so we have edges, meaning each edge is to process.
  • Finally, there are O(q) queries.

This gives us overall. However, the editorial does not have the term and simply has . How did they get rid of the term in the analysis? Did I do something wrong here?

Read more »

 
 
 
 
  • Vote: I like it
  • -3
  • Vote: I do not like it