### errorgorn's blog

By errorgorn, 2 years ago,

Recently, there was a local contest which I set a problem for (task 4 in this pdf).

The abridged problem statement is that you choose an array $A$ of length $N$ where the elements are $<2^X$. The grader will pick $[L,R]$ and return $A_L \oplus A_{L+1} \oplus \ldots \oplus A_R$, where $\oplus$ is the bitwise xor operation. You need to find any integer $z$ such that $L \leq z \leq R$.

I set this in contest with the constraints of $N=2^{19}$ and $X=192 \approx \frac{(\log N)^2}{2}$ where you also had to answer queries quickly.

Solution

During testing, oolimry and icypiggy decided it would be funny to write solutions that had $X \approx c \cdot \log N$ (of course they also had to ensure they could answer queries quickly so these are more of describing speedup techniques rather than techniques for reducing number of bits).

oolimry's solution
icypiggy's solution

Anyways, this raises some interesting questions about the minimal $X$ we need if we completely do not care about answering queries quickly. A trivial lower bound is $X \approx \log N$ since we could query $[pos,pos]$ for all values of $pos$. An upper bound is randomly choosing bits. I would think that it is reasonable to assume that the xor-sum of subarrays should be independent and we should require somewhere around $X \approx 4 \cdot \log N$ bits?

Is there a better lower and upper bound $X$ in this problem?

• +117

 » 2 years ago, # |   +57 I believe that I can improve the lower bound to $2 \log n$ (asymptotically) by showing that any array $A$ for which the problem is 100% solvable will also allow you to recover the interval $[L, R]$ (up to a small edge case that doesn't affect asymptotics). In this case, being able to recover any interval means that any two distinct intervals must have different xor-sums.Let $P_i = A_1 \oplus A_2 \oplus \dots \oplus A_i$. Then the xor-sum of $[L,R]$ is $P_{L-1} \oplus P_R$. Now the only way an interval is un-recoverable is if there are two interval $[L,R], [L',R']$ (WLOG $L \leq L'$) with the same xor-sum. This means $P_{L-1} \oplus P_R = P_{L'-1} \oplus P_R' \implies P_{L-1} \oplus P_R \oplus P_{L'-1} \oplus P_R' = 0 \implies P_{L-1} \oplus P_{L'-1} = P_R \oplus P_R'$.Now if we know the original problem is solvable, then there must be some $z$ we can answer such that $L \leq z \leq R$, $L' \leq z \leq R'$. This implies $L' \leq z \leq R \implies L' - 1 < R$.Now this means that the two intervals $[L, L'-1], [R+1, R']$ (if they exist) must have the same xor-sum and do not overlap, meaning the original problem was not 100% solvable. Note that if $R > R'$ we use $[R'+1,R]$ instead.The only exception is when $L = L'$ or $R = R'$. However in either case this leads to some interval $[B, C]$ with xor-sum $0$, which means (unless $B = C$) that the xor-sums of the non-overlapping intervals $[B,B],[B+1,C]$ are the same, which is bad. The only remaining exception is an interval of length $1$ with xor-sum $0$, which we can see is an actual counterexample to my earlier claim in the form of arrays like $A = [0, 1]$ for which the original problem is solvable, but there are two distinct intervals with xor-sum 1. However, as we can have at most one element $A_i$ equal to 0, this only allows us to increase $N$ by at most 1. For completeness, I'll explain why being able to recover the interval forces $X \geq 2 \log N$ asymptotically. This is just because there must be at least $\binom{N}{2}$ xor-sums of intervals, and each xor-sum is between $0$ and $2^X$. As each xor-sum must be distinct, we get $2^X \geq \binom{N}{2} \implies X \geq \log \binom{N}{2} \approx 2 \log N$.
 » 2 years ago, # |   +39 The upper bound is $2 \cdot \log N$ bits. Check XX OpenCup GP of Zhejiang Problem J.