w33z8kqrqk8zzzx33's blog

By w33z8kqrqk8zzzx33, history, 7 weeks ago, In English

My last blog was a bit too esoteric

This blog assumes that you can do a basic convolution (e.g. FFT). After I saw that many people did not know how to do Chirp Z-Transform with arbitrary exponents fast from my blog 9 weeks ago I decided to shitpost write a short tutorial on it.

Let's say we have a polynomial $$$P(x)$$$, such that

$$$\begin{align}P(x)=\sum_{i=0}^{n-1}a_ix^i\end{align}$$$

Now, let's say we have $$$c$$$ and $$$m$$$, and we want to find $$$P(c^0),P(c^1),P(c^2),\dots,P(c^m)$$$. Let $$$b_j=P(c^j)$$$. Consider the implications of having $$$x$$$ as such a form:

$$$\begin{align}b_j=\sum_{i=0}^{n-1}a_ic^{ij}\end{align}$$$

Tinkering around with different expressions for $$$ij$$$, one finds that $$$ij=\frac{(i+j)^2}{2}-\frac{i^2}{2}-\frac{j^2}{2}$$$. This means that

$$$\begin{align}b_j=\sum_{i=0}^{n-1}a_ic^{\frac{(i+j)^2}{2}-\frac{i^2}{2}-\frac{j^2}{2}}\end{align}$$$
$$$\begin{align}b_jc^{\frac{j^2}{2}}=\sum_{i=0}^{n-1}(a_ic^{-\frac{i^2}{2}})c^{\frac{(i+j)^2}{2}}\end{align}$$$

Hence we can find $$$b_j$$$ from the difference-convolution of $$$a_ic^{-\frac{i^2}{2}}$$$ and $$$c^{\frac{i^2}{2}}$$$. However, in many cases — especially when working under a modulus — we can't find the $$$c^{\frac{i^2}{2}}$$$ as $$$i$$$ may be odd. We use a workaround: $$$ij=\binom{i+j}{2}-\binom{i}{2}-\binom{j}{2}$$$. Proof:

$$$ij=\binom{i+j}{2}-\binom{i}{2}-\binom{j}{2}$$$
$$$ij=\frac{(i+j)(i+j-1)}{2}-\frac{(i)(i-1)}{2}-\frac{(j)(j-1)}{2}$$$
$$$2ij=(i+j)(i+j-1)-(i)(i-1)-(j)(j-1)$$$
$$$2ij=(i)(i+j-1)+(j)(i+j-1)-(i)(i-1)-(j)(j-1)$$$
$$$2ij=(i)((i+j-1)-(i-1))+(j)((i+j-1)-(j-1))$$$
$$$2ij=(i)(j)+(j)(i)$$$

Modifying our formula a bit, we get

$$$\begin{align}b_jc^{\binom j2}=\sum_{i=0}^{n-1}(a_ic^{-\binom i2})c^{\binom{i+j}2}\end{align}$$$

As for implmentation details, notice that

$$$\begin{align}b_jc^{\binom j2}=\sum_{i=0}^{n-1}(a_{(n-(n-i))}c^{-\binom{n-(n-i)}2})c^{\binom{i+j}2}\end{align}$$$

Define $$$C_i=a_{n-i}c^{-\binom{n-i}2}$$$; $$$D_i=c^{\binom i2}$$$. We hence have

$$$b_jx^{\binom j2}=\sum_{i=0}^{n-1}C_{n-i}D_{i+j}$$$
$$$b_j=x^{-\binom j2}(C*D)_{n+j}$$$

(the second through definition of convolution)

You can test your implementations here, mod 998244353 and here, mod 10^9+7, although note that the second one is both intense in precision and constant factor.

My code for the former

This method can be used to cheese 1184A3 - Heidi Learns Hashing (Hard) and 1054H - Epic Convolution, and is also a core point in 901E - Cyclic Cipher.

Read more »

 
 
 
 
  • Vote: I like it
  • +146
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 2 months ago, In English

A brief introduction into the applications of square root decomposition

Square root decomposition is the process of separating a structure of size $$$O(N)$$$ into $$$O(\sqrt{N})$$$ "blocks" of size $$$O(\sqrt{N})$$$ each, in a way that aids manipulation of the entire structure. Square root decomposition is extremely versatile. Some of its most well-known use cases are:

  • answering queries on a static array, with methods such as Mo's algorithm or block precomputation
  • "lazy" brute force modifications, where it is easy to query an entire block, but tag pushing isn't obvious
  • separating objects based on a threshold, when there exists both an $$$O(x)$$$ algorithm and an $$$O(n/x)$$$ algorithm

In this tutorial we will walk through multiple types of square root decomposition.

Answering queries on a static array, offline (Mo's algorithm)

Consider a problem where we are asked to find the answer for certain intervals $$$[l,r]$$$. We can't quickly compute the answer for an arbitrary interval, but we know how to transition to $$$[l,r\pm1]$$$ and $$$[l\pm1,r]$$$ fast given some information remaining from $$$[l,r]$$$ answer calculation. The number of transitions we need to do to get from $$$[l_1,r_1]$$$ to $$$[l_2,r_2]$$$ is $$$|l_1-l_2|+|r_1-r_2|$$$.

If there are only two intervals we need to answer such transitioning wouldn't help us. However, if there are many intervals, choosing a good transitioning route will drastically reduce the total time needed. Finding the best transition route quick is allegedly NP-hard, so we will focus on estimating a "good enough" route.

Consider the following scheme for 2D Manhattan TSP: We do square root decomposition on the $$$x$$$ coordinate, separating the grid into vertical blocks, $$$n$$$ cells tall and $$$B$$$ cells wide. There are $$$n/B$$$ such blocks.

For an arbitrary block, call the number of points we need to visit in it $$$C$$$. To get to all of them, we can visit them in order from top to bottom: we spend at most $$$O(BC)$$$ transitions moving left and right, and $$$O(n)$$$ transitions going down.

If for each block we need $$$O(BC+n)$$$ transitions, in total we will need $$$O(\sum(BC+n))$$$ = $$$O(Bq+n^2/B)$$$ transitions. Selecting a block size of $$$B=O(n/\sqrt{q})$$$ gives us a total transition count of $$$O(qn/\sqrt{q}+n\sqrt{q})$$$ = $$$O(n\sqrt{q})$$$.

Implementation of sorting these transitions can be done as follows:

int B = ???;
struct query {
    int l, r, id;
    const bool operator<(const query& o) const {
        if(l/B == o.l/B) {
            // they are in the same block, sort going down
            return r < o.r;
        } else {
            // they are not in the same block, sort by block number
            return l < o.l;
        }
    }
}

Observe here that between each block, we need to run from the bottom of the "grid" all the way up. We can prevent this by using a zig-zag pattern, getting a bit of a constant optimization:

int B = ???;
struct query {
    int l, r, id;
    const bool operator<(const query& o) const {
        if(l/B == o.l/B) {
            if((l/B) % 2 == 0) return r < o.r;
            else return r > o.r;
        } else {
            return l < o.l;
        }
    }
}

In the following we will look at some examples.

Range distinct query (SPOJ DQUERY)

Statement: Given a static array and queries of the form $$$[l,r]$$$, count the number of different values that occur between $$$[l,r]$$$.

The naive transition is intuitive: we maintain a table $$$v$$$ where $$$v[x]$$$ is the number of times $$$x$$$ occured in the interval that we are maintaining. When adding an element $$$x$$$ such that $$$v[x]=0$$$, increment the answer; when deleting an element $$$x$$$ such that $$$v[x]=1$$$, decrement the answer.

However, this needs many random accesses to this table, which is very cache-unfriendly, leading to a massive constant. Can we do better?

Consider arrays $$$pre[i]$$$ and $$$nxt[i]$$$, which equal the last location of the element $$$i$$$ and the next location of the element $$$i$$$, respectively (equal to some infinity if this element doesn't exist.) We can use these arrays to quickly check whether or not some element occurs in an range.

  • Case 1: $$$[l,r]\rightarrow[l,r+1]$$$. We add 1 if and only if $$$r+1$$$ has not occured in $$$[l,r]$$$, or equivalently, $$$pre[r+1] < l$$$.
  • Case 2: $$$[l,r]\rightarrow[l,r-1]$$$. We subtract 1 iff $$$r$$$ has not occured in $$$[l,r-1]$$$.
  • Case 3: $$$[l,r]\rightarrow[l-1,r]$$$. We add 1 iff $$$l-1$$$ has not occurned in $$$[l,r]$$$, or equivalently, $$$r < nxt[l-1]$$$.
  • Case 4: $$$[l,r]\rightarrow[l+1,r]$$$. We subtract 1 iff $$$l$$$ has not occured in $$$[l+1,r]$$$.

This leads to a much quicker solution, which runs plausibly fast even for $$$n=10^6$$$ even if we directly implement Mo's.

Range inversion query (naive version)

Statement: Given a static array and queries of the form $$$[l,r]$$$, count the number of pairs $$$(i,j)$$$ such that $$$a[i]>a[j]$$$. $$$O(n\log n\sqrt n)$$$.

Every time we add or erase an element, we consider how much this element contributes to the answer. If we maintain a Fenwick tree corresponding to the current interval, we can quickly find out how many inversions involve a certain value. Transition is $$$O(\log n)$$$; hence overall it is $$$O(n\log n\sqrt n)$$$.

Advanced: Sweepline Mo / Offline Again

This is 617E - XOR and Favorite Number, except there are multiple favorite numbers.

Statement: Given a static array, a static set $$$S$$$ of size $$$C$$$ and queries of the form $$$[l,r]$$$, count the number of pairs $$$(i,j)$$$ such that $$$a[i]\text{ xor }a[j]\in S$$$. $$$O(nC+n\sqrt n)$$$ time; $$$O(n)$$$ memory. Assume $$$q=O(n)$$$.

Notice that $$$a[i]\text{ xor }a[j]\in S$$$ is equivalent to $$$\exists v\in S:a[i]\text{ xor }v=a[j]$$$. We can consider each $$$v$$$ separately and run $$$C$$$ instances of Mo's, but that would be $$$O(nC\sqrt n)$$$. Consider the transitions: we add an element $$$j$$$ and want to know the amount of $$$i$$$ such that $$$a[i]\text{ xor }a[j]\in S$$$ and $$$i\in[l,r]$$$.

Call the transition delta $$$f(l,r,p)$$$ where $$$p$$$ is the element we are adding. Removing can be done by subtracting $$$f(l,r-1,p)$$$ or $$$f(l+1,r,p)$$$ depending on the direction. Sweepline Mo can be used when this transition function is differentiable: i.e. it satisfies $$$f(l,r,p)=f(1,r,p)-f(1,l-1,p)$$$.

The main idea is to first run a "dry" Mo and find all $$$f(1,x,p)$$$ that we need to calculate. Then, we do a sweep-line on $$$x$$$, updating the data structure accordindly, and calculating the $$$f(1,x,p)$$$ that involve this $$$x$$$. We generally needs the "calculation" part to be $$$O(1)$$$, like normal Mo, but we make more wiggle room for the update part.

However, if we store all $$$(x,p)$$$, we get $$$O(n\sqrt n)$$$ memory and a ridiculous constant factor. Notice that because of the Mo transition order, we can separate all $$$p$$$ into several types, given a fixed $$$x$$$:

  1. $$$p=x$$$
  2. $$$p=x+1$$$
  3. $$$p\in[A,B]$$$

For the first and second type, we calculate $$$ansPrev[i]$$$ and $$$ansNext[i]$$$ and don't update anything, leaving these arrays for the second "wet" Mo to use.

For the third type, we find all $$$(p,A,B)$$$ that result from the dry Mo. There are at most $$$4(M-1)$$$ such intervals because the movement of any end of the Mo interval creates exactly one interval. Then, on the corresponding $$$x$$$, we iterate through the interval and add the transition value we found to the respective query.

Example implementation:

    sort(qs, qs + m);
    cl = 1, cr = 0;
    for(int i = 0; i < m; i++) {
        if (cr < qs[i].r) {
            iv[cl - 1].push_back({cr + 1, qs[i].r, i, -1});
            cr = qs[i].r;
        }
        if (qs[i].l < cl) {
            iv[cr].push_back({qs[i].l, cl - 1, i, 1});
            cl = qs[i].l;
        }
        if (qs[i].r < cr) {
            iv[cl - 1].push_back({qs[i].r + 1, cr, i, 1});
            cr = qs[i].r;
        }
        if (cl<qs[i].l) {
            iv[cr].push_back({cl, qs[i].l - 1, i, -1});
            cl = qs[i].l;
        }
    }
    for(int i = 1; i <= n; i++) {
        ansPrev[i] = ansPrev[i-1] + cgc[ar[i]];
        add(ar[i]);
        for (auto [l, r, i, c] : iv[i]) {
            ll g = 0;
            for(int p = l; p <= r; p++)
                g += cgc[ar[p]];
            ans[i] += c * g;
        }
        ansThere[i] = ansThere[i - 1] + cgc[ar[i]];
    }
    for(int i = 0; i < m; i++) {
        curans += ans[i];
        fans[qs[i].i] = curans + ansPrev[qs[i].r] + ansThere[qs[i].l-1];
    }
    for(int i = 0; i < m; i++) {
        print(fans[i]);
        pc('\n');
    }

Special implementation details:

  • Here add just updates the cgc array, which is a population count of the things in the current prefix xor the things in the set.
  • ivs is the intervals on each x, along with the query and contribution coefficient.
  • If we do a prefix sum on ansPrev and ansThere, we don't even need to use a while loop in the second Mo.

Range inversion query, again

$$$O(n\sqrt n)$$$

Consider sweepline mo. Sweepline mo in general is a method used to reduce the number of modifications of the Mo data structure to $$$O(n)$$$, not affecting the query count of $$$O(n\sqrt n)$$$. This means modifications have to be at most $$$O(\sqrt n)$$$ and queries have to be at most $$$O(1)$$$.

For this problem we need to support point increment/decrement in $$$O(\sqrt n)$$$ and prefix sum query in $$$O(1)$$$. Flip it around to get an equivalent problem: suffix increment/decrement in $$$O(\sqrt n)$$$ and point query in $$$O(1)$$$.

Separate this auxillary array into blocks of size $$$O(\sqrt n)$$$. For each block, maintain a value $$$lazy$$$. When we do increment/decrement a suffix, for blocks that intersect with this suffix yet are not completely covered, we change the elements with brute force. Otherwise, we modify the $$$lazy$$$ tag. To query a position, we add the (brute force) value with its corresponding $$$lazy$$$ tag.


For people who think there is not much of a difference between $$$O(n\sqrt n)$$$ and $$$O(n\log n\sqrt n)$$$: When $$$n=100000$$$, the former runs in $$$300ms$$$ while the latter runs in $$$3000ms$$$. Normally time limits are $$$1000ms$$$.

Answering queries on a static array, online

When the data structure for Mo's is small enough, you can use first do block decomposition into size $$$B$$$ and then calculate the data structure for all intervals of blocks, creating $$$O((N/B)^2)$$$ structures.

Then, assuming the transition time for this data structure is still $$$O(T)$$$, querying an arbitrary interval can be done in $$$O(TB)$$$.

Normally intialization of each structure is $$$O(B)$$$ and transition is $$$O(1)$$$, so it would be best to choose a block size of $$$O(\sqrt n)$$$ to get $$$O(n\sqrt n)$$$ precalculation and $$$O(\sqrt n)$$$ query time. Here we look at some examples:

522D - Closest Equals

Statement: Given a static array and queries of the form $$$[l, r]$$$, for each query calculate $$$\min(|i-j|:a[i]=a[j]\wedge i,j\in[l,r])$$$. $$$n\le 500000$$$

Doing Mo's for this problem looks a bit intractable because although insertion is easy using $$$pre$$$ and $$$nxt$$$ arrays, it seems impossible to delete values and "roll back" the $$$\min$$$ value. Hence we consider block decomposition, which would only require insertion.

Decompose this array into contiguous blocks of size $$$O(B)$$$; for each contiguous interval of blocks compute the answer, for a total precomputation complexity of $$$O((N/B)N)$$$.

Note that if we want the minimum distance the only possible $$$j$$$ values when we fix an $$$i$$$ that couuld possibly contribute to the answer are $$$pre[i]$$$ and $$$nxt[i]$$$. As such, when extending a precomputed block, we calculate if $$$pre[i]$$$ and $$$nxt[i]$$$ are in the interval we need to answer, and if so we try to update the minimum, for a query complexity of $$$O(B)$$$.

Total time complexity is $$$O(n^2/B+qB)$$$, selecting $$$B=O(\sqrt n)$$$ is best, getting a total time complexity of $$$O((n+q)\sqrt n)$$$ with tiny constant. Note that input and outout for this problem is a massive portion of the total running time.

617E - XOR and Favorite Number online version

Statement: Given a static array, a number $$$k$$$ and queries of the form $$$[l,r]$$$, for each query calculate the number of $$$(i,j)$$$ such that $$$a[i]\text{ xor }a[j]=k$$$.

$$$a[i]\text{ xor }a[j]=k \iff a[i]\text{ xor }k=a[j]$$$; we precalculate answers for each block and now count the contribution of new elements to the block interval. We want to know how many elements with value $$$a[i]\text{ xor }k$$$ there are in the current interval.

First of all the current interval is block chunk + at most $$$O(B)$$$ elements, so we can directly maintain a population array on these $$$O(B)$$$ elements. For the block chunk, notice that the count "# of $$$x$$$ in blocks from $$$l$$$ to $$$r$$$" is differentiable. Precalculate "# of $$$x$$$ in blocks from $$$1$$$ to $$$r$$$", and then we're done.

But not quite; this method takes $$$O(B\max A)$$$ memory for the chunk count array. Notice that there are at most $$$O(n)$$$ possible values for $$$a[i]$$$ and $$$a[i]\text{ xor }k$$$; hash these values and we get $$$O(nB+(n/B)^2)$$$ memory, $$$O((n+q)B+(n/B)^2)$$$. Square root decomposition gives $$$O((n+q)\sqrt n)$$$.

Range distinct query revisited

Statement: Range distinct query but online

Obviously direct precomputation and utilizing $$$pre[i]$$$ and $$$nxt[i]$$$ arrays lead to an $$$O((n+q)\sqrt n)$$$ running time. But can we do better?

Notice that the bottleneck is in the precomputation $$$O((n/B)n)$$$, because for each block we need to compute the answers for all intervals that have this block as its first block. Yet we only store $$$O((n/B)(n/B))$$$ values. Obviously time complexity will be lower bounded by this if we use decomposition methods. Here we will reach this lower bound.

Consider the definition of the answer for an arbitrary interval $$$[l,r]$$$:

$$$A_{[l,r]}=(\sum_i[pre[i]+1\le l\wedge i\le r])-(\sum_i[pre[i]+1\le l\wedge i\le l-1])$$$

Let's put the points $$$(pre[i]+1, i)$$$ onto the 2D plane. Define a function $$$p(x,y)$$$ as the number of points such that its x-coordinate is smaller than $$$x$$$ and its y-coordinate is smaller than $$$y$$$, or a 2D prefix sum. Our answer is now

$$$A_{[l,r]}=p(l,r)-p(l,l-1)$$$

but $$$p(x,y)$$$ takes $$$O(n^2)$$$ time to calculate. Consider transforming the $$$A$$$ expression to incorporate decomposition into blocks of size $$$B$$$:

$$$A_{[l,r]}=(\sum_i[pre[i]/B+1\le l\wedge i/B\le r])-(\sum_i[pre[i]/B+1\le l\wedge i/B\le l-1])$$$

Now this is a prefix sum on the points $$$(pre[i]/B+1,i/B)$$$ where division is rounded down. This can obiously be computed in $$$O((n/B)^2)$$$.

Consider the optimal choice of $$$B$$$. The time complexity is

$$$O(\frac{n^2}{B^2}+qB)$$$

Selecting $$$B=O(\sqrt[3]{n})$$$ gives us a total time complexity of

$$$O((n+q)\sqrt[3]{n})$$$

This runs faster than the currently best known method of persistent segment trees because of its tiny constant for $$$n=10^6$$$. Its memory usage is also much, much smaller.

More compilicated examples

Optimizing brute force: 911G - Mass Change Queries

Statement: Given an array and operations of the form $$$l,r,x,y$$$, set $$$a[i]=y$$$ for all $$$i$$$ such that $$$i\in[l,r]$$$ and $$$a[i]=x$$$. Here $$$1\le a[i],x,y\le10^5$$$.

We observe that we can apply these operations to the entire array pretty quickly by storing the locations each value occur in something simlar to a vector and do small-to-large merging.

Do block decomposition on this array into blocks of size $$$B$$$. For blocks that are completely contained by an operation, do small-to-larger merging; otherwise, reconstruct this block and modify with brute force.

I don't know how to prove the true time complexity rigorously, but we can upper bound it. If there is no brute force changes, for each block it will take a total of $$$O(B\log B)$$$ time, because whenever an element is moved, the size of the vector that it is in at least doubles. Total time would thus be $$$O(NB\log B)$$$. If brute force changes on a block absolutely breaks amortized time, moving would take $$$O((N+Q)B\log B)$$$ time, for a total of $$$O((N+Q)B\log B+QN/B)$$$. Assume $$$Q=O(N)$$$ to simplify; $$$O(NB\log B+N^2/B)$$$. Selecting $$$B=\sqrt{\frac{N}{\log N}}$$$ gives

$$$O(N\sqrt{\frac{N}{\log N}}\log \sqrt{\frac{N}{\log N}}+\frac{N^2}{\frac{\sqrt{N}}{\sqrt{\log N}}}) \le O(N\sqrt{N\log N})$$$

Note here that the $$$\log$$$ is inside the square root sign.

Value distance query

Statement: Given a static array and queries of the form $$$x, y$$$, find $$$\min(|i-j|:a[i]=x,a[j]=y)$$$.

There are at most $$$O(\sqrt n)$$$ values that occur more than $$$O(\sqrt n)$$$ times. Consider a pair $$$x,y$$$ such that at least one of $$$x,y$$$ occurs more than $$$O(\sqrt n)$$$ times. We try to precompute answers for all such $$$x, y$$$ as there are at most $$$O(n\sqrt n)$$$ pairs.

For all $$$x$$$ where $$$x$$$ occurs more than $$$O(\sqrt n)$$$ times, we sweep the array using $$$x$$$, once forwards and once backwards. We keep track of the last location where $$$x$$$ occured in the direction that we are sweeping, and then we try to update the answer for $$$x$$$ and $$$a[i]$$$ (i is the index of the sweep pointer.)

As well as that, for each value keep a sorted vector of the indices where said value occurs.

When we answer queries, if we already precomputed the answer, we directly use it; otherwise, we run two-pointers on the corresponding vectors that we stored. The total time complexity is $$$O((n+q)\sqrt n)$$$.

[Ynoi2008] rplexq

Statement: Given a rooted tree and queries of the form $$$(l,r,x)$$$, for each query find the number of $$$i,j$$$ such that $$$l\le i<j\le r$$$ and the LCA of $$$i$$$ and $$$j$$$ is $$$x$$$.

First of all, we answer these queries offline and put the $$$[l,r]$$$ values on the corresponding node. We do dsu on tree, maintaining an implicit segtree of the indices of nodes that belong to some subtree with small-to-large merging.

If we don't even care about the intervals, the amount of pairs that have LCA at node $$$x$$$ is equal to

$$$\binom{sz_x}2-\sum\binom{sz_j}2$$$

where $$$j$$$ is the children of $$$x$$$.

When a node has a pretty small amount of children, we can answer the queries with brute force, as prefix sum queries on an implicit segtree has a small enough constant. But when a node has a lot of children AND has a lot of queries on it, this is much too slow.

First off we do a special check for the heavy child, so as to guarantee that the amount of nodes we parse are on the order of $$$O(n\log n)$$$. We then take the light children and condense the indices. Now, we can run Mo's on these indices, where what we store is $$$\sum\binom{colors}2$$$. Due to the ridiculously high initialization constant for Mo's it might be better to run brute force when queries or children count is small enough.

Note that the complexity for Mo's with length $$$n$$$ queries $$$m$$$ is $$$O(n\sqrt m)$$$. Because of

$$$\sum n_i\sqrt m_i\le(\sum n_i)\sqrt{\max m}$$$

the total time complexity is upper-bounded by $$$O(n\log n\sqrt m)$$$. ODT claims that the time complexity can be $$$O(n\sqrt{m\log n})$$$ to $$$O(n\sqrt m)$$$, but I do not know how to prove this.


If you have other interesting square root decomposition problems, please share them in the comments below and I'll include them in the next edit!

Read more »

 
 
 
 
  • Vote: I like it
  • +272
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 2 months ago, In English

Currently the best known complexities for querying the number of inversions in a range of a static array is $$$O(n+m)$$$ space $$$O(n\sqrt{m})$$$ time for offline; $$$O(n\sqrt{m})$$$ space same time for online. Is there any proof of these lower bounds (a proof that it's impossible to solve range inversion query in linear times polylog time), or can range inversion queries theoretically be done with lower time complexity?

source for current time complexities

Read more »

 
 
 
 
  • Vote: I like it
  • +5
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 3 months ago, In English

1054H - Epic Convolution

When I first saw this problem, I did not understand the editorial at all, so I decided to come up with my own solution. After noticing the extremely small modulo, I tried using it to my advantage.

First of all, $$$i^2$$$ and $$$j^3$$$ are purely random functions. They are really just a generic function that maps integers to numbers in the range $$$[0,MOD-2]$$$. Here we can take them modulo $$$MOD-1$$$ because of Fermat's little theorem and how the given mod, $$$490019$$$, is prime.

Call the function for array $$$a$$$ as $$$f$$$, and the function for array $$$b$$$ as $$$g$$$. Notice that if we "relabel" the arrays $$$a$$$ and $$$b$$$ into arrays $$$a'$$$ and $$$b'$$$ respectively such that $$$a'$$$ and $$$b'$$$ satisfy the following equalities:

$$$\begin{align}a'_i=\sum_{f(j)=i}a_j,b'_i=\sum_{g(j)=i}b_j\end{align}$$$

then the answer that we seek effectively becomes

$$$\begin{align}\sum_{i=0}^{MOD-2}\sum_{j=0}^{MOD-2}a'_ib'_jc^{ij}\end{align}$$$

Now, if we factor out $$$a'_i$$$, this becomes

$$$\begin{align}\sum_{i=0}^{MOD-2}a'_i\sum_{j=0}^{MOD-2}b'_jc^{ij}\end{align}$$$

Notice how the internal part is effectively evaluating the generating function of $$$b'$$$ at point $$$c^i$$$. Since $$$b'$$$ is a finite sequence of length $$$MOD-1$$$ evaluating it quickly at a points of the form $$$c^i$$$ is doable with Chirp-Z transform. The answer is now reduced to

$$$\begin{align}\sum_{i=0}^{MOD-2}a'_iB'(c^i)\end{align}$$$

which is computable in $$$O(\text{mod}\log\text{mod})$$$ time.

Read more »

 
 
 
 
  • Vote: I like it
  • +87
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 7 months ago, In English

Wow, I've never seen a problem that needs to use as much DP as 1322F - Assigning Fares! /s

Apparently the dp tag was added multiple times:

MikeMirzayanov, please fix this bug...

Read more »

 
 
 
 
  • Vote: I like it
  • +4
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 11 months ago, In English

Normally, when a Division 2 round is hosted with a Division 1 round, candidate masters are only allowed to register in Division 1. This isn't the case today:

Read more »

 
 
 
 
  • Vote: I like it
  • -11
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 13 months ago, In English

Since the next contest is division 2, users with >2100 rating are marked with an asterisk and as "Out of competition", like so:

r597

However, in the upcoming Educational Round, >2100 users are not marked with the standard unrated flag, unlike normal division 2 rounds:

edu76

Note that there is no "Out of competition" count, either.

This shouldn't be intended as the name says "Rated for Div. 2". What happened, and is it rated for both divisions?

Read more »

 
 
 
 
  • Vote: I like it
  • +11
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 13 months ago, In English

There is no difference between the rating (6 months) tab and the rating (all) tab. Although I can't find explicit differences, the last entries of the 5th page are the same, which shouldn't happen if old users are listed.

Read more »

 
 
 
 
  • Vote: I like it
  • +28
  • Vote: I do not like it

By w33z8kqrqk8zzzx33, history, 13 months ago, In English

While scrolling through the registrants for round #597, I saw this:

User dewitast's color should be yellow (as a master) and not purple. On his/her user page the error shows too:

Read more »

 
 
 
 
  • Vote: I like it
  • +26
  • Vote: I do not like it