Blog entries - Codeforces

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	161
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	147
9	orz	146
10	pajenegod	145

Noble_Mushtak's blog

Every Technique and Algorithm I Used to Become Grandmaster

By Noble_Mushtak, history, 15 months ago, In English

Hi everyone, I became grandmaster today so I decided to go down memory lane and review every problem I have ever done in a CodeForces contest to look back and see every technique and algorithm I had to use to become a grandmaster. I am not sure how useful this is, I mostly think it's fun to look at all the old solutions I wrote and how different it was from how I write code now. The main takeaway is that, as many people have said before, you don't need to know or memorize a bunch of advanced algorithms to become grandmaster. As you will see, there are many problems where I just used "ad hoc reasoning," meaning there's not a standard technique I used to solve the problem and you just need to make some clever mathematical observations to solve the problem. Also, there are many popular algorithms that I have never needed in a CodeForces contest despite being a grandmaster:

Sparse tables, Fenwick trees, and segment trees
String algos, like rolling hashes, Knuth-Morris-Pratt, suffix arrays, and Aho-Corasick
Shortest path algos, like Dijkstra's algorithm, Bellman-Ford, and Floyd-Warshall
Flow algos, like Edmonds-Karp, Dinic's, and push-relabel
Convex hull trick
Fast Fourier transform and number theoretic transform

This isn't to say that learning algorithms is bad—I love reading cp-algorithms.com and learning new algorithms—and I have used many of these algorithms in other contests, like CodeChef, ICPC, and Google Code Jam, but practicing consistently and getting better at ad hoc reasoning and your general problem-solving abilities matter much more.

CodeForces Round #362 (Div. 2): Rating went from 1500 to 1504 (back then, default rating was 1500 instead of 0)

Problem A: Ad hoc reasoning and math
Problem B: Basic string manipulation

Codeforces Round #363 (Div. 2): Rating went from 1504 to 1536 (+36)

Problem A: Brute force
Problem B: Brute force
Problem C: Ad hoc reasoning and basic bitmasks (there is a dynamic programming solution, but that's not how I solved it)

Codeforces Round #553 (Div. 2): Rating went from 1540 to 1566 (+26)

Problem A: Brute force
Problem B: Brute force and bitwise operations
Problem C: Ad hoc reasoning and math

Codeforces Round #570 (Div. 3): Rating went from 1566 to 1732 (+166)

Problem A: Basic number theory
Problem B: Ad hoc reasoning and math
Problem C: Ad hoc reasoning and math
Problem D: Greedy and priority queue
Problem E: Ad hoc reasoning with strings (there is a BFS solution, but that's not how I solved it)
Problem H: Dynamic programming (I used the count unique subsequences of a given length algorithm)

Educational CodeForces Round 68: Rating went from 1732 to 1766 (+44)

Problem A: Ad hoc reasoning and math
Problem B: Brute force and row/column sums
Problem C: Ad hoc reasoning with strings and subsequences
Problem D: Game theory

CodeForces Global Round 4: Rating went from 1776 to 1807 (+31)

Problem A: Greedy
Problem B: Dynamic programming
Problem C: Binary exponentiation
Problem D: Ad hoc reasoning with graphs
Problem E: Deques

Educational Codeforces Round 74: Rating went from 1807 to 1920 (+113)

Problem A: Ad hoc reasoning and math
Problem B: Greedy and sorting
Problem C: Greedy
Problem D: Ad hoc reasoning and combinatorics

Codeforces Round #600 (Div. 2): Rating went from 1920 to 1877 (-43)

Problem A: Ad hoc reasoning and math
Problem C: Sorting, prefix sums, and dynamic programming
Problem D: Depth-first search and connected components

Codeforces Round #601 (Div. 2): Rating went from 1877 to 2005 (+128)

Problem A: Ad hoc reasoning and math
Problem B: Ad hoc reasoning with graphs and sorting
Problem C: Ad hoc reasoning with graphs and permutations
Problem D: Ad hoc reasoning with grids
Problem E1: Greedy

Codeforces Global Round 14: Rating went from 2005 to 1877 (-128)

Problem A: Ad hoc reasoning with sorting
Problem B: Ad hoc reasoning with number theory
Problem C: Greedy and priority queue

Deltix Round, Spring 2021: Rating went from 1877 to 1975 (+98)

Problem A: Ad hoc reasoning with simulations
Problem B: Ad hoc reasoning and math
Problem C: Stacks

CodeForces LATOKEN Round 1: Rating went from 1975 to 1974 (-1)

Problem A: Ad hoc reasoning with grids
Problem B: Ad hoc reasoning and math
Problem C: Cycle decomposition and binary exponentiation
Problem D: Ad hoc reasoning with trees

Codeforces Round #729 (Div. 2): Rating went from 1974 to 2132 (+158)

Problem A: Ad hoc reasoning and math
Problem B: Ad hoc reasoning with number theory
Problem C: Ad hoc reasoning with number theory
Problem D: Binary exponentiation and dynamic programming
Problem E1: Dynamic programming and combinatorics (I used the count permutations with given number of inversions algorithm, the numbers of permutations with a given number of inversions are also known as Mahonian numbers)

Hello 2022: Rating went from 2132 to 2085 (-47)

Problem A: Ad hoc reasoning with grids
Problem B: Greedy
Problem C: Cycle decomposition

Codeforces Round #765 (Div. 2): Rating went from 2085 to 2092 (+7)

Problem A: Greedy and bitwise operations
Problem B: Ad hoc reasoning and maps
Problem C: Dynamic programming

Educational Codeforces Round 122: Rating went from 2092 to 2168 (+76)

Problem A: Ad hoc reasoning with number theory
Problem B: Greedy
Problem C: Brute force
Problem D: Dynamic programming
Problem E: Kruskal's algorithm and binary search tree maps

Codeforces Round #808 (Div. 1): Rating went from 2168 to 2138 (-30)

Problem A: Greedy
Problem B: Ad hoc reasoning with simulations

Codeforces Global Round 23: Rating went from 2138 to 2239 (+101)

Problem A: Ad hoc reasoning and math
Problem B: Greedy with stacks
Problem C: Ad hoc reasoning with permutations
Problem D: Tree DP
Problem E: Ad hoc reasoning, with inspiration from binary search

Codeforces Global Round 24: Rating went from 2239 to 2291 (+52)

Problem A: Ad hoc reasoning and math
Problem B: Ad hoc reasoning with number theory
Problem C: Prefix sums and maps
Problem D: Modular arithmetic, extended Euclidean algorithm, and combinatorics
Problem E: Greedy, sorting, and binary search tree sets

Good Bye 2022: 2023 is NEAR: Rating went from 2291 to 2358 (+67)

Problem A: Greedy and binary search tree sets
Problem B: Ad hoc reasoning with permutations
Problem C: Ad hoc reasoning with number theory (I did use Miller-Rabin, but only to generate the primes <100, which could have been hardcoded instead)
Problem D: Disjoint set union, and I also used DFS trees for the intuition behind this solution
Problem E: Tree DP and extended Euclidean algorithm

Hello 2023: Rating went from 2358 to 2331 (-27)

Problem A: Ad hoc reasoning with simulations
Problem B: Ad hoc reasoning and math
Problem D: IntervalContainer and maps
Problem E: Sorting and ad hoc reasoning with graphs and strongly connected components

TypeDB Forces 2023: Rating went from 2331 to 2416 (+85)

Problem A: Ad hoc reasoning and math
Problem B: Miller-Rabin, Pollard's rho algorithm, and number theory
Problem C: Dynamic programming
Problem D: Tarjan's SCC algorithm and DP over directed acylic graphs
Problem E: Greedy and bitwise operations
Problem F: Binary exponentiation, extended Euclidean algorithm, and cycle decomposition

Full text and comments »

+213

Noble_Mushtak
15 months ago
19

Alternate Implementation of Li Chao Tree?

By Noble_Mushtak, history, 21 month(s) ago, In English

I recently found an implementation of Li Chao tree in this ICPC notebook, and it is quite different than the implementations I have seen before. Usually, I see the insert_line method implemented in a way similar to that described in this cp-algorithms.com article, where you recur either on the left child or the right child depending on if the new line is above or below the current line at the midpoint of the interval. However in this implementation, they recur on both the left and the right child, but they stop recurring once the new line is completely below or completely above the current line in that node.

Does this implementation still handle inserting lines in $$$O(\log n)$$$ time in the worst case? I tried using this implementation on the Frog 3 problem on AtCoder and it got AC, but I am wondering if it is due to weak test cases, or if this implementation is actually valid and really handles inserting lines in $$$O(\log n)$$$ time. I wasn't able to find a counterexample where inserting a line would cause the implementation to traverse the whole tree, but recurring on both the left and the right children seems strange to me.

Full text and comments »

#convex hull trick, #li chao tree

Noble_Mushtak
21 month(s) ago
5

Solution to #810 Div. 1 Problem C: XOR Triangle

By Noble_Mushtak, history, 21 month(s) ago, In English

I think the editorial for Round#810 is not out yet, and I found a solution for 810C: XOR Triangle which I thought was interesting, even if my solution is probably more complicated the intended solution, so I wanted to write it up both because I want to share it with others and because I feel like explaining a solution can help myself remember it better. My implementation is here.

First, notice that, since every leg of the triangle $$$a\oplus b, a\oplus c, b\oplus c$$$ is the XOR of the other two legs in the triangle, and it is well-known that $$$x+y\geq x\oplus y$$$ for all positive integers $$$x, y$$$, the triangle is degenerate if and only if $$$x+y=x\oplus y$$$ for some legs $$$x, y$$$ of the triangle, i.e. $$$x+y < x\oplus y$$$ is not a possibility. I will be using this assumption implicitly throughout my solution.

Let's say the binary string in the test case is $$$b_{N-1} \dots b_0$$$ (rightmost bit is labelled $$$b_0$$$, because it is the least significant bit). The first idea behind my solution is that you can solve it using DP, where you loop from right to left in the binary string and compute the answer for $$$n=b_i\dots b_0$$$ for all $$$i=0$$$ to $$$N-1$$$, where $$$N$$$ is the length of the given binary string. That is, you compute the answer for every $$$n$$$ which is a suffix of the binary string in the given test case.

Now, let's say we are trying to find the answer for $$$n=b_i\cdots b_0$$$. If $$$b_i=0$$$, then the answer is just whatever the last answer for $$$n=b_{i-1}\cdots b_0$$$ was, which we already know from the previous DP answer. Otherwise if $$$b_i=1$$$, we have to do some work.

If $$$b_i=1$$$, I split the answer into four cases and then I add the answers from each of the four cases up to get the total answer:

Number of triples where exactly 0 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
Number of triples where exactly 1 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
Number of triples where exactly 2 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1
Number of triples where exactly 3 of $$$a,b,c$$$ have their $$$i$$$ th bit set to 1

For the rest of the post, let $$$lastVal$$$ be the value of $$$b_{i-1}\cdots b_0$$$, mod $$$998244353$$$. It is easy to compute $$$lastVal$$$ for all $$$i$$$ using a simple for loop, adding $$$2^i$$$ wherever $$$b_i$$$ is set in the binary string.

Case where 0 numbers have $$$i$$$ th bit set to 1

In this case, since the $$$i$$$th bit of all of $$$a,b,c$$$ are $$$0$$$, it is obvious that $$$a,b,c < n$$$, so $$$a,b,c$$$ are just arbitrary $$$i$$$-bit numbers. Let $$$d=a\oplus b$$$ and $$$e=a\oplus c$$$. Then, the legs of the triangle have lengths $$$d, e, d\oplus e$$$. Again, $$$d$$$ and $$$e$$$ are just arbitrary $$$i$$$-bit numbers, since $$$a,b,c$$$ are arbitrary $$$i$$$-bit numbers. Overall, there are $$$2^i\cdot 2^i$$$ possible values of $$$(d, e)$$$, and there are three cases where the triangle is degenerate:

$$$d+e=d\oplus e$$$: This happens when $$$d$$$ and $$$e$$$ are disjoint, i.e. there is no $$$j$$$ where both $$$d$$$ and $$$e$$$ have their $$$j$$$th bit set to $$$1$$$. There are $$$3^i$$$ possibilities where this could occur, because there are $$$3$$$ possibilities for each bit: each of the $$$i$$$ bits are either in $$$d$$$, in $$$e$$$, or in neither.
$$$d+(d\oplus e)=e$$$: This happens when $$$d$$$ is a subset of $$$e$$$, i.e. there is no $$$j$$$ where $$$d$$$ has its $$$j$$$ th bit set but $$$e$$$ does not have its $$$j$$$ th bit set. There are $$$3^i$$$ possibilities where this could occur, because there are $$$3$$$ possibilities for each bit: each of the $$$i$$$ bits are either in $$$d$$$, in both $$$d$$$ and $$$e$$$, or in neither.
$$$e+(d\oplus e)=d$$$: This happens when $$$e$$$ is a subset of $$$d$$$, similar to above case, there are $$$3^i$$$ possibilities where this could occur.

Now, we need to account for the intersections of these cases:

Both $$$d$$$ and $$$e$$$ are disjoint and $$$d$$$ is a subset of $$$e$$$ when $$$d=0$$$ -> $$$2^i$$$ possibilities
Both $$$d$$$ and $$$e$$$ are disjoint and $$$e$$$ is a subset of $$$d$$$ when $$$e=0$$$ -> $$$2^i$$$ possibilities
$$$d$$$ is a subset of $$$e$$$ and $$$e$$$ is a subset of $$$d$$$ when $$$d=e$$$ -> $$$2^i$$$ possibilities

Finally, all three cases occur at the same time when $$$d=e=0$$$ -> $$$1$$$ possibility

Thus, total number of pairs $$$(d,e)$$$ which lead to a non-degenerate triangle is: $$$4^i-3^i-3^i-3^i+2^i+2^i+2^i-1=4^i-3\cdot 3^i+3\cdot 2^i-1$$$

Finally, notice that for every pair $$$(d,e)$$$, there are $$$2^i$$$ triples $$$(a,b,c)$$$ where $$$(a\oplus b,a\oplus c)=(d,e)$$$. This is because for every pair $$$(d,e)$$$, any pair of the form $$$(a,b,c)=(f, f\oplus d, f\oplus e)$$$ for an arbitrary $$$i$$$-bit number $$$f$$$ will lead to $$$(a\oplus b,a\oplus c)=(d,e)$$$, and there are $$$2^i$$$ possibilities for $$$f$$$. Ergo, we need to multiply the above answer by $$$2^i$$$.

Case where 1 numbers have $$$i$$$ th bit set to 1

Without loss of generality, assume $$$a$$$ has $$$i$$$ th bit set to 1 and $$$b$$$ and $$$c$$$ have $$$i$$$ th bit set to 0. (We'll multiply the answer from this case by $$$3$$$ at the end to account for this arbitrary choice.) Thus, $$$a=2^i+f$$$ for some $$$0\leq f \leq lastVal$$$ and $$$b$$$ and $$$c$$$ are arbitrary $$$i$$$-bit numbers. The triangle has legs $$$2^i+(b\oplus f), 2^i+(c\oplus f), (b\oplus c)$$$. If we let $$$g$$$ and $$$h$$$ be $$$b\oplus f$$$ and $$$c\oplus f$$$, respectively, then the triangle has legs $$$2^i+g, 2^i+h, g\oplus h$$$. Like the last case, the number of possibilities for the pair $$$(g,h)$$$ is $$$2^i\cdot 2^i=4^i$$$.

There are two cases where the triangle could be degenerate (notice that $$$2^i+g+2^i+h=g\oplus h$$$ is not possible because the left side is at least $$$2^i$$$ while the right side is an $$$i$$$-bit number):

$$$2^i+g+(g\oplus h)=2^i+h$$$: This happens when $$$g$$$ is a subset of $$$h$$$, so there are $$$3^i$$$ possibilities
$$$2^i+h+(g\oplus h)=2^i+g$$$: This happens when $$$h$$$ is a subset of $$$g$$$, so there are $$$3^i$$$ possibilities

Finally, the intersection of both cases occur when $$$h=g$$$, so there are $$$2^i$$$ possibilities for the intersection

Thus, total number of pairs $$$(g,h)$$$ which lead to a non-degenerate triangle is: $$$4^i-3^i-3^i+2^i=4-2\cdot 3^i+2^i$$$

Finally, notice that for every pair $$$(g,h)$$$, there are $$$lastVal+1$$$ triples $$$(a,b,c)$$$ where $$$a=2^i+f$$$ and $$$(b\oplus f, c\oplus f)=(g,h)$$$. In the last case, this was $$$2^i$$$ because $$$f$$$ could be any arbitrary $$$i$$$-bit number, but here, it is $$$lastVal+1$$$ because we have the restriction $$$0\leq f\leq lastVal$$$, and there are only $$$lastVal+1$$$ $$$i$$$-bit numbers which satisfy this inequality. Ergo, we need to multiply the above answer by $$$lastVal+1$$$, and then multiply by $$$3$$$ to account for the arbitrary choice of making $$$a$$$ have the $$$1$$$ bit.

Case where 2 numbers have $$$i$$$ th bit set to 1

Without loss of generality, assume that both $$$a$$$ and $$$b$$$ have their $$$i$$$ th bit set to $$$1$$$ and $$$c$$$ has its $$$i$$$ th bit set to $$$0$$$. (We'll multiply the answer from this case by $$$3$$$ at the end to account for this arbitrary choice.) Then, $$$a=2^i+f$$$ and $$$b=2^i+g$$$ for $$$0\leq f,g\leq lastVal+1$$$ and $$$c$$$ is an arbitrary $$$i$$$-bit number. The legs of the triangle have lengths $$$f\oplus g, 2^i+(f\oplus c), 2^i+(g\oplus c)$$$, and there are $$$(lastVal+1)\cdot (lastVal+1)\cdot 2^i$$$ total possibilities for the triple $$$(f,g,c)$$$ (and there is a one-to-one mapping between triples $$$(a,b,c)$$$ and triples $$$(f,g,c)$$$ by the equations $$$a=2^i+f$$$ and $$$b=2^i+g$$$, so it suffices to count the number of triples $$$(f,g,c)$$$ for this case).

The triangle is degenerate in two cases: (notice that $$$2^i+(f\oplus c)+2^i+(g\oplus c)=f\oplus g$$$ is not possible because the left side is at least $$$2^i$$$ while the right side is an $$$i$$$-bit number):

$$$2^i+(f\oplus c)+(f\oplus g)=2^i+(g\oplus c)$$$: In this case, $$$(f\oplus c)+(f\oplus g)=(g\oplus c)$$$, meaning that $$$f\oplus c$$$ is a subset of $$$g\oplus c$$$. In terms of sets, XOR is like symmetric difference so this means $$$c$$$ contains all of $$$f\setminus g$$$ and is disjoint with $$$g\setminus f$$$. In other words, $$$c$$$ has a $$$1$$$ bit wherever $$$f$$$ has a $$$1$$$ bit and $$$g$$$ has a $$$0$$$ bit and $$$c$$$ has a $$$0$$$ bit wherever $$$g$$$ has a $$$1$$$ bit and $$$f$$$ has a $$$0$$$ bit. Ergo, if we fix $$$f$$$ and $$$g$$$, then every bit in $$$c$$$ where the bits of $$$f$$$ and $$$g$$$ differ is determined, but every other bit in $$$c$$$ is free, i.e. every other bit could be either $$$0$$$ or $$$1$$$. The bits where $$$f$$$ and $$$g$$$ differ are exactly those bits where $$$f\oplus g$$$ has a $$$1$$$ bit. Ergo, there are $$$popcnt(f\oplus g)$$$ bits where that bit in $$$c$$$ is determined, so $$$i-popcnt(f\oplus g)$$$ bits in $$$c$$$ are free, meaning there are $$$2^{i-popcnt(f\oplus g)}$$$ possible values for $$$c$$$. (Note that $$$popcnt$$$ is a function that takes in a nonnegative integer and returns the number of $$$1$$$ bits.) Ergo, the total number of triples $$$(f,g,c)$$$ where this occurs is: $$$\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{i-popcnt(f\oplus g)}$$$
$$$2^i+(g\oplus c)+(f\oplus g)=2^i+(f\oplus c)$$$: Same number of possibilities as previous case, so we'll just multiply the number of possibilities we got from the previous case by two.

Finally, the intersection of both cases happens only when $$$f\oplus g=0$$$, i.e. when $$$f=g$$$. Since there are $$$lastVal+1$$$ possibilities for $$$f$$$ and, in this case, $$$c$$$ could be any $$$i$$$-bit number, there are $$$2^i\cdot (lastVal+1)$$$ possibilities for the intersection.

Thus, in total, the number of triplets where $$$(f,g,c)$$$ leads to a nondegenerate triangle is:

$$$2^i\cdot (lastVal+1)^2-2\left(\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{i-popcnt(f\oplus g)}\right)+2^i\cdot lastVal$$$

Factor out $$$2^i$$$ from everything for convenience:

$$$2^i\left( (lastVal+1)^2-2\left(\sum_{0\leq f\leq lastVal} \sum_{0\leq g\leq lastVal} 2^{-popcnt(f\oplus g)}\right)+lastVal \right)$$$

Notice that negative exponents are OK because we are working in modular arithmetic, and $$$2$$$ is invertible modulo the modulus they give us.

Let $$$y_i=\sum_{0\leq f\leq b_i\dots 0} \sum_{0\leq g\leq b_i\dots b_0} 2^{-popcnt(f\oplus g)}$$$. I'll explain how to compute $$$y_i$$$ using DP at the end of this post, but for now, just notice that we can replace the big sum in the expression above with $$$y_{i-1}$$$.

Finally, don't forget to multiply by $$$3$$$ due to the arbitrary choice we made at the beginning.

Case where 3 numbers have $$$i$$$ th bit set to 1

In this case, there are numbers $$$f, g, h$$$ such that $$$0\leq f,g,h\leq lastVal$$$ such that $$$a=2^i+f, b=2^i+g, c=2^i+h$$$ and the legs of the triangle have lengths $$$f\oplus g, f\oplus h, g\oplus h$$$, i.e. the $$$i$$$ th bits cancel out when we take the XORs since every number has their $$$i$$$ th bit set. Ergo, the answer for this case is just the answer we computed for $$$n=b_{i-1}\dots b_0$$$ previously, so just use that previous answer for this case.

We finally went through all four cases, so we are basically done. Now we just need to figure out how to compute $$$y_i$$$ for all $$$i=0$$$ to $$$N-1$$$.

First, the base case is $$$y_{-1}=1$$$. I treat $$$b_{-1}\dots b_0$$$ as an "empty" number which is equal to $$$0$$$, and it is easy to verify that $$$\sum_{0\leq f\leq 0}\sum_{0\leq g\leq 0} 2^{-popcnt(f\oplus g)}=2^{-0}=1$$$.

Now let's handle the recurrence case: First, if $$$b_i=0$$$, then $$$y_i=y_{i-1}$$$ so we can just use the previous answer. Otherwise, if $$$b_i=1$$$, we split the sum into four cases:

$$$2^i\leq f, g\leq b_i\dots b_0$$$: In this case, both $$$f$$$ and $$$g$$$ has their $$$i$$$th bit set, so $$$f\oplus g$$$ cancels the $$$i$$$ th bit out. Ergo, this is the same as summing over $$$0\leq f,g\leq b_{i-1}\dots b_0$$$, so we just use the previous answer of $$$y_{i-1}$$$ in this case.
$$$2^i\leq f\leq b_i\dots b_0$$$ and $$$0\leq g\leq 2^i-1$$$: In this case, notice that for any fixed $$$f$$$, $$$f\oplus g$$$ is some $$$(i+1)$$$-bit number with its $$$i$$$ th bit set, which we can represent as $$$2^i+z$$$ for some $$$i$$$-bit number $$$z$$$. Moreover, for any $$$i$$$-bit number $$$z$$$, the only solution to the equation $$$f\oplus g=2^i+z$$$ is $$$g=f\oplus (2^i+z)$$$. In other words, for every $$$i$$$-bit number $$$z$$$, there is exactly one $$$g$$$ such that $$$f\oplus g=2^i+z$$$, so we have: $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=\sum_{0\leq z\leq 2^i-1} 2^{-popcnt(2^i+z)}=\sum_{0\leq z\leq 2^i-1} 2^{-(1+popcnt(z))}=2^{-1}\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$. Let $$$x_i=\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$. I'll explain how to compute $$$x_i$$$ at the very end of this post. Thus, since $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}$$$ doesn't actually depend on the value of $$$f$$$, we find that $$$\sum_{2^i\leq f\leq b_i\dots b_0}\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=(lastVal+1)\cdot 2^{-1}\cdot x_i$$$, since there are $$$lastVal+1$$$ possible values for $$$f$$$.
$$$2^i\leq g\leq b_i\dots b_0$$$ and $$$0\leq f\leq 2^i-1$$$: Same as previous case, so we can just multiply the answer by the previous case by $$$2$$$, which gets rid of the $$$2^{-1}$$$
$$$0\leq f,g\leq 2^i-1$$$: Similar to the previous case, for any fixed $$$f$$$, and for any $$$i$$$-bit number $$$z$$$, there is exactly one $$$g$$$ such that $$$f\oplus g=z$$$, so $$$\sum_{0\leq g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=\sum_{0\leq z\leq 2^i-1} 2^{-popcnt(z)}=x_i$$$. Since the value of this sum doesn't depend on $$$f$$$ and there are $$$2^i$$$ possible values of $$$f$$$, we find that $$$\sum_{0\leq f,g\leq 2^i-1} 2^{-popcnt(f\oplus g)}=2^i\cdot x_i$$$.

Now we know how to compute $$$y_i$$$, but we need to compute $$$x_i=\sum_{0\leq z\leq 2^i-1}2^{-popcnt(z)}$$$ for all $$$i=0$$$ to $$$N-1$$$. First, $$$x_0=1$$$, because $$$\sum_{0\leq z\leq 0} 2^{-popcnt(z)}=2^{-0}=1$$$.

Now let's handle the recurrence case: We split the sum into two cases:

$$$0\leq z\leq 2^{i-1}-1$$$: $$$\sum_{0\leq z\leq 2^{i-1}-1} 2^{-popcnt(z)}$$$ is just $$$x_{i-1}$$$, so we just use the previous answer of $$$x_{i-1}$$$ in this case.
$$$2^i\leq z\leq 2^i-1$$$: Here, every $$$2^i\leq z\leq 2^i-1$$$ can be written as $$$z=2^i+w$$$ for some $$$0\leq w\leq 2^{i-1}-1$$$, and $$$popcnt(z)=popcnt(2^i+w)=1+popcnt(w)$$$, so we have $$$\sum_{2^{i-1}\leq z\leq 2^i-1} 2^{-popcnt(z)}=\sum_{0\leq w\leq 2^{i-1}-1} 2^{-(1+popcnt(w))}=2^{-1} \sum_{0\leq w\leq 2^{i-1}-1} 2^{-popcnt(w)}=2^{-1}\cdot x_{i-1}$$$.

Ergo, the recurrence for $$$x_i$$$ is simply $$$x_i=(1+2^{-1})x_{i-1}$$$, and with this, we have all the info we need to solve the problem.

Full text and comments »

#dp, #bitmasks, #xor

Noble_Mushtak
21 month(s) ago
5

Help with "Divide and Conquer" Problem from MAPS 2020

By Noble_Mushtak, history, 4 years ago, In English

Hello. As some of you may know, the Mount Allison Programming Showdown 2020 took place yesterday, and I had a question about the Divide and Conquer problem.

You can read the problem yourself, but basically, the problem gives you two integers $$$b$$$ and $$$d$$$, where $$$1 < b, d < 2^{63}$$$ and $$$d$$$ is prime, and then asks you if there exists an $$$m$$$ such that $$$b^m \equiv -1 \pmod{d}$$$. You need to output yes if such an $$$m$$$ exists and output no otherwise. (This is an oversimplification of the problem because the actual problem statement is quite long, so it is possible I misread what it is asking, but I am just giving this short summary of the problem so that I don't have to repeat the whole problem statement in this post.)

Here is how my solution works:

If $$$d=2$$$, then $$$m$$$ exists if and only if $$$b$$$ is odd.
If $$$d$$$ is an odd prime, then $$$m$$$ exists if and only if $$$b$$$ has even order in the group of units $$$\pmod{d}$$$.
- PROOF: If $$$m$$$ exists, then we can let $$$m'$$$ is the smallest positive integer such that $$$b^{m'}\equiv -1\pmod{d}$$$. Thus, the order of $$$b$$$ must be greater than $$$m'$$$. Moreover, we have $$$b^{2m'}\equiv (-1)^2\equiv 1\pmod{d}$$$, so the order of $$$b$$$ must be at most $$$2m'$$$. Now, for any integer $$$m < n < 2m'$$$, it holds that $$$0 < n-m' < m'$$$, so $$$b^{n-m'}\not\equiv -1\pmod{d}$$$ since $$$m'$$$ is the smallest positive integer such that $$$b^{m'}\equiv -1\pmod{d}$$$ and $$$n-m' < m'$$$. Thus, $$$b^n\equiv b^mb^{n-m}\equiv -b^{n-m}\not\equiv 1\pmod{d}$$$, so the order of $$$b$$$ is not $$$n$$$ for any $$$m' < n < 2m'$$$. Therefore, the order of $$$b$$$ must be $$$2m'$$$, so the order of $$$b$$$ is even.
- PROOF CONTINUED: If $$$b$$$ has even order, then let $$$2m$$$ be the order of $$$b$$$. Then, $$$b^{2m}\equiv (b^m)^2\equiv 1\pmod{d}$$$. Thus, either $$$b^m\equiv -1\pmod{d}$$$ or $$$b^m\equiv 1\pmod{d}$$$. However, since $$$2m$$$ is the order of $$$b$$$, $$$b^m\not\equiv 1\pmod{d}$$$, so it must be that $$$b^m\equiv -1\pmod{d}$$$.
- Now, to see if the order of $$$b$$$ is even, calculate $$$b^n\pmod{d}$$$, where $$$n$$$ is the biggest odd factor of $$$d-1$$$. If $$$b^n\equiv 1\pmod{d}$$$, then the order of $$$b$$$ divides $$$n$$$, so $$$b$$$ has odd order, so output no.
- Otherwise, if $$$b^n\not\equiv 1\pmod{d}$$$, then the order of $$$b$$$ does not divide $$$n$$$. Since the order of $$$b$$$ must divide $$$d-1$$$, this means that the order of $$$b$$$ must be even, so output yes.

However, when I submit a solution off the above methods, I get Wrong Answer. Can anyone see what I am doing wrong? I have included my C++ code below. Thanks for the help.

#pragma GCC optimize("Ofast")
#pragma GCC optimize("unroll-loops")
#pragma GCC target("sse,sse2,sse3,ssse3,sse4,popcnt,abm,mmx,avx,avx2")
#include <bits/stdc++.h>

typedef int64_t num;

static inline num multModulo(num num1, num num2, num MOD) {
    return ( static_cast<__int128>(num1) * num2) % MOD;
}

num calcModuloExp(num base, num exp, num MOD) {
    num result = 1, cur = base;
    while (exp) {
        if (exp & 1) result = multModulo(result, cur, MOD);
        cur = multModulo(cur, cur, MOD), exp >>= 1;
    }
    return result;
}

int main() {
    num b, d;
    scanf("%" PRId64 " %" PRId64, &b, &d);
    if (d == 2) {
        puts((b & 1) ? "yes" : "no");
    } else {
        num biggestOddFactor = d-1;
        while (!(biggestOddFactor & 1)) biggestOddFactor >>= 1;
        num answer = calcModuloExp(b, biggestOddFactor, d);
        puts((answer == 1) ? "no" : "yes");
    }

    return 0;
}

Full text and comments »

#help, #number theory, #modular arithmetic

Noble_Mushtak
4 years ago
4

Is scanf/printf ever too slow?

By Noble_Mushtak, history, 5 years ago, In English

Usually, I hear people saying that std::cin and std::cout is a relatively slow form of input while scanf and printf is a faster form of input which should be sufficient for all problems. However, sometimes I wonder if there are any problems where scanf and printf would be too slow.

For example, look at the problem 1181D — Irrigation. This problem asks us to read in 500,000 numbers, each of which could be as large as 500,000, and then asks us to read in 500,000 more numbers, each of which could be as large as $$$10^{18}$$$. Clearly, this is a lot of input/output, so users should use scanf and printf over std::cin and std::cout. However, even when I used scanf and printf, my solution ran in 2355 ms, barely under the time limit of 2.5 seconds. On the other hand, when I wrote my code using an I/O library I wrote myself, which uses fread and fwrite to read/write 32768 bytes at a time, my improved solution ran in 592 ms, almost 4 times faster than the original solution.

Since my code barely ran under time using scanf and printf, I wonder if there are any CodeForces problems where using scanf and printf will inevitably lead to a Time Limit Exceeded error and users must use some other I/O method like I did. Has anyone else encountered a problem where scanf and printf just weren't quite fast enough? Moreover, has anyone else built a custom I/O library for competitive programming like I did? If anyone has knows how to do input/output faster than fread and fwrite, I would love to hear about it.

Full text and comments »

fast input/output, big inputs, scanf/printf

Noble_Mushtak
5 years ago
10

Why Doesn't CodeForces Use 64-bit Mode?

By Noble_Mushtak, history, 5 years ago, In English

Recently, I have become interested in x86 assembly, and I wanted to try out my skills on CodeForces using __asm__ extension in GCC. For example, I was very pleased when I solved problem 495B using a very naive algorithm implemented in assembly which would probably have TLEd if it was written in regular C (basically, looped all the way up to $$$\frac{n}{4}$$$ to find all divisors of $$$n$$$ for $$$n \leq 10^9$$$ rather than using $$$O(\sqrt{n})$$$ algorithm).

However, this algorithm was implemented by using only 32-bit numbers and registers, so I was curious as to how the performance would be affected by using 64-bit numbers and registers. Therefore, I tried to use the %rax, %rbx, etc. registers instead of the %eax, %ebx, etc. registers, but for some reason, this gave me a "bad register" error. I then tried running my code using Clang++17 Diagnostics and this gave me a much clearer error:

p71.cpp:24:13: error: register %rax is only available in 64-bit mode
    __asm__("push %%rax \n\t"

Of course, this implies that CodeForces is compiling our programs using 32-bit mode, so using 64-bit integers is even slower than it would normally be, because GCC/G++/Clang++ has to manipulate multiple registers in order to implement 64-bit operations in 32-bit x86. The fact that CodeForces uses 32-bit mode surprised me very much, since 64-bit has been the standard for computers and even smartphones for years. In fact, Ubuntu recently decided to drop its support for 32-bit computers in its upcoming versions, showing how old 32-bit really is.

Given all this, why does CodeForces continue to use GCC/G++/Clang++ in 32-bit version? Is this to penalize programmers from using 64-bit integers when it's not necessary, or do they use 32-bit mode for a more technical reason? I am not sure using 32-bit mode makes a huge difference since I find most problems can be solved using only 32-bit numbers anyway, but I am very curious as to why CodeForces made the decision to choose 32-bit over 64-bit.

Full text and comments »

64-bit integers, long long

Noble_Mushtak
5 years ago
1

How to Optimize C Code for NTT -- POLYMUL on SPOJ

By Noble_Mushtak, history, 5 years ago, In English

On SPOJ, there is a problem named POLYMUL, which is pretty straightforward: Multiply two polynomials, which have integer coefficients in the interval $$$[0, 1000]$$$ and can each have a degree of up to $$$10000$$$.

Currently, I am trying to solve this problem with NTT and my solution can be found here. As the solution shows, I am doing NTT with the primes of both $$$998244353$$$ and $$$2013265921$$$, and then using Chinese Remainder Theorem in order to find the true coefficients without any modulo (I am assuming the coefficients are less than $$$1000^2\cdot 10000=10^{10}$$$, which is much less than the product of $$$998244353$$$ and $$$2013265921$$$). On my machine, it takes about 3 seconds for this solution to solve a test case with $$$T=10$$$ (i.e. $$$10$$$ pairs of polynomial) where each polynomial in each pair is of degree $$$10000$$$ and the coefficients are integers randomly picked from the interval $$$[0, 1000]$$$.

However, the time limit is 1 second, and I do not know what to do to optimize my solution more. Does any one have any suggestions on how to optimize my NTT algorithm or how to optimize my C code in general? Any solutions would be very welcome. Thanks!

Full text and comments »

ntt, optimization, #help

Noble_Mushtak
5 years ago
6

Commented Solutions to Codeforces Round #364

By Noble_Mushtak, history, 8 years ago, In English

All solutions below are thoroughly commented and based on the official editorial. I might start doing this for one or two editorials every month starting now because writing alternate solutions for 701C and 701D and writing solutions for 701E and 701F (could not figure out on my own) really helped me understand the problems better. Also, I'm sorry if the later solutions are more clunky; I'm not used to solving problems of that difficulty. Thanks!

701A — Cards

Their explanation is pretty straight-forward, so here's the solution.

701B — Cells Not Under Attack

Here, I deviate from the editorial in two ways:

Because I am using C and do not have set, I instead mark rows and columns as attacked in an array of bools and count the number of attacked rows and columns as I go along.
Their formula is n·n - cnt where cnt = szY·n + szX·n - szX·szY. This gives us n·n - szY·n - szX·n + szX·szY = (n - szX)(n - szY). I use the latter in my solution.

Here is my solution for this problem.

701C — They Are Everywhere

The confusing part of their answer is that they're not really clear about when they're talking about looping through the different types of letters in s or all of the letters in order. Here is my implementation of their solution.

701D — As Fast As Possible

Here is my binary search solution. The problem with their explanation here is it's out of order, they never actually show their equation for what I called x, their variable posM convoluted their formula for x, and they never actually solve for what I called y but instead only briefly mention that solutions must account for the bus going back.

701E — Connecting Universities

I think the way they explained 701E pretty well other than the fact that l_v is unnecessary since l_v = 1 for all v. However, it is missing lots of detail if you're not so good at working with trees. Here's my implementation of the editorial solution.

701F — Break Up

I could not understand from the editorial's explanation how to find the second edge in the case of two edges until I looked at the dfs2() function from this solution to 701F from Vosatorp, so thanks to them for this solution!

Basically, in our depth first search, if an edge from v_a to v_b in path from s to t is unnecessary, then there will contain a cycle with v_b and a vertex before v_b in the path. This is because since the edge is unnecessary, there exists another path from s to t without that edge. Let v_l be the vertex farthest along the original path that is before v_b on this new path. and v_r be the vertex closest to v_b that is after it on this new path. Then, using the old path, we get to v_b, then v_r and using the new path, we get to v_l. Thus, by this process, we have created a cycle with both v_b and v_l, proving our original point. Ergo, by contrapositives, edges where v_b is not in any cycle with a previous vertex in the path are necessary.

This process seems to be a variation on Tarjan's Bridge-Finding algorithm which I did not know before.

Other than that proof of correctness, everything else is pretty much explained in the comments of my re-writing of Vosatorp's solution in C, without vector.

700D — Huffman Coding on Segment

Coming when I have enough time/experience to understand the solution!

700E — Cool Slogans

Coming when I have enough time/experience to understand the solution!

Full text and comments »

#364, editorial, solutions

Noble_Mushtak
8 years ago
0

Number of Empty Subgrids of a Grid

By Noble_Mushtak, history, 8 years ago, In English

There's this really good article on how to find the maximal empty subgrid of a grid so one might wonder, "How do we find the number of empty subgrids of a grid?" I first saw the answer to the second question in this solution to CHSGMNTS on CodeChef by lohit_97 (found on CodeChef and on CodeForces), so thanks to them for the following algorithm!

First, here is a more formalized way of saying the problem:

The first line of input contains two integers width and height (where 1 ≤ width, height ≤ 1000).
The next height lines of input contain width integers that fit into long in C/C++.
Let matrix[i][j] be the integer number i on row number j.
Find the number of tuples (a, b, c, d) such that matrix[i][j] is 0 for all a ≤ i ≤ c and b ≤ j ≤ d.

The way we can solve this problem is by keeping an array emptyToLeft[i][j] which is the maximum number such that i - emptyToLeft[i][j] < x ≤ i implies matrix[x][j] is 0. (If matrix[i][j] is not 0, then we just set emptyToLeft[i][j] to 0 to make the inequality always false.)

Once we have that, we loop through each column i and in each column, we make a stack and a tempAnswer = 0. In the bottom of a stack, we start with - 1 and as we loop through each row j, each row is popped on and some rows are popped off in order to preserve the property that between two consecutive stack elements a and b where b is directly above a, we have a < y ≤ b implies emptyToLeft[i][y] ≥ emptyToLeft[i][b]. By the definition of emptyToLeft, this means i - emptyToLeft[i][b] < x ≤ i and a < y ≤ b implies matrix[x][y] is 0. Thus, all such (x, y) as (a, b) and (i, j) as (c, d) in the tuple (a, b, c, d) is an answer. Therefore, two consecutive stack elements a and b give us (b - a) * emptyToLeft[i][b] answers, so we add that to tempAnswer when we add b to the stack and subtract that from tempAnswer when we pop b off. Finally, the answer is simply the sum of all tempAnswers across all i, j.

Here is the C code for this algorithm:

#include <stdio.h>
#include <stdlib.h>
#define REPEAT(token, num) for (token = 0; token < num; token++)

typedef long num_cells;
typedef long cell;
typedef long long num_matrices;

cell matrix[1000][1000];
num_cells width, height, emptyToLeft[1000][1000], stack[1000], stackLength;
num_matrices answer, tempAnswer;

int main() {
    num_cells i, j, prevToLeft;
    //Get the input for the matrix:
    scanf("%li %li", &width, &height);
    REPEAT(j, height) REPEAT(i, width) scanf("%li", matrix[i]+j);

    //Create emptyToLeft by looping through the rows:
    REPEAT(j, height) {
        //This is the previous emptyToLeft, which starts out at 0 because at the beginning of a row, there are no elements to the left.
        prevToLeft = 0;
        //Loop through each column:
        REPEAT(i, width) {
            //If matrix[i][j] != 0, then emptyToLeft[i][j] = 0.
            if (matrix[i][j]) emptyToLeft[i][j] = 0;
            //Otherwise, add one to prevToLeft.
            else emptyToLeft[i][j] = prevToLeft+1;
            //Also, update prevToLeft:
            prevToLeft = emptyToLeft[i][j];
        }
    }

    REPEAT(i, width) {
        //For each column, create a stack with -1 at the bottom.
        stack[0] = -1, stackLength = 1;
        //Also, reset tempAnswer:
        tempAnswer = 0;
        //Loop through each row:
        REPEAT(j, height) {
            //If emptyToLeft[i][stack[stackLength-1]] >= emptyToLeft[i][j], then putting j onto the stack will violate the stack property, so we need to pop the stack:
            while (stack[stackLength-1] != -1 && emptyToLeft[i][stack[stackLength-1]] >= emptyToLeft[i][j]) {
                //Update tempAnswer now that we are going to pop the stack:
                //Note that here, stack[stackLength-1] is described as b above while stack[stackLength-2] is a.
                tempAnswer -= (stack[stackLength-1]-stack[stackLength-2])*emptyToLeft[i][stack[stackLength-1]];
                //Pop the stack:
                stackLength--;
            }
            //Add the current row to the stack and update tempAnswer accordingly.
            stack[stackLength++] = j;
            tempAnswer += (stack[stackLength-1]-stack[stackLength-2])*emptyToLeft[i][stack[stackLength-1]];
            //Finally, add tempAnswer to answer across all i, j.
            answer += tempAnswer;
        }
    }
    //Print the answer:
    printf("%lli\n", answer);
    
    exit(0);
}

Notice that we have a while loop inside of the j loop, giving us a third inner loop and thus possibly O(height²width) complexity. However, each run of the while loop pops an element off the stack and we only add an element to the stack for each j, meaning overall, for one i, we can do at most height pops. Thus, the j loop is O(height) for each i and the while loop is O(height) for each i, which gives us O(height·width) complexity for this problem.

Full text and comments »

dynamic programming, subarray, counting

Noble_Mushtak
8 years ago
0

One 32-Bit Variable Solution to Contest 363 Div. 2 Problem C:

By Noble_Mushtak, history, 8 years ago, In English

After the contest, I read other people's solution to Vacations and people kept using big dynamic programming arrays on Div. 2 Problem C, which confused me, so I decided to show all of you that this was unnecessary by solving this problem using only one uint_32 variable in C. Using bit fields in a struct, we can get this down to only three bytes.

The intuition behind this solution is that you only need to keep track of what is possible from the day before and if nothing is possible, then you have to rest this day and everything will be possible the next day. I think it's kind of a greedy algorithm because unlike the DP algorithms, which seemed to take into account solutions where we rest early in order to do something the next day, this algorithm does something on every day when it can and rests only when it finds that it needs to. I do this because whether or not we rest early or the day after when we have to is irrelevant: We still end up resting one day, so the answer is the same. Thus, no dynamic programming is necessary and we only need to keep track of one answer at every point.

Full text and comments »

#363 (div. 2) c, dynamic programming, greedy, memory manipulation

Noble_Mushtak
8 years ago
5

Confused about Complexity of Contest 362 Div. 1 Problem A:

By Noble_Mushtak, history, 8 years ago, In English

Here is my analysis of the solution to Contest 362 Div. 1 Problem A:

There are $\text{[math]}$ edges per query.
Each edge takes $\text{[math]}$ to process. There are $\text{[math]}$ added every query and O(q) queries, so we have $\text{[math]}$ edges, meaning each edge is $\text{[math]}$ to process.
Finally, there are O(q) queries.

This gives us $\text{[math]}$ overall. However, the editorial does not have the $\text{[math]}$ term and simply has $\text{[math]}$ . How did they get rid of the $\text{[math]}$ term in the analysis? Did I do something wrong here?

Full text and comments »

#362 (div. 2) c, analysis

-3

Noble_Mushtak
8 years ago
2