Max XOR of a subarray in O(nlog(C))

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
1	maomao90	161
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

As a pupil there are a lot of problems I come across that I can't figure out by myself, at least not in a contest's time. Most of the time there are great resources available online and on this very site to learn more and understand all the nuances implied in these problems, but I recently stumbled upon what is regarded as a "standard problem" and couldn't find satisfying explanations about it so I decided to write one myself hoping it'll be of some help to other beginners. I'm going to discuss the problem of finding the maximum bitwise XOR of a subarray in $O(n \cdot log(C))$ as it pertains to the problem [problem:1847C]. I'll try to be as thorough as possible in the first part and as formal as possible in the second, only skipping over basic concepts (what is a subarray, what is bitwise xor, etc).↵
↵
#### Task↵
Given an array $a$, let's define $f(l, r)$ as the xor of elements of $a$ from index $l$ to $r$, where $0 \leq l < r < n$. Find $\text{max} f(l, r)$ in $O(n \cdot log(C))$ where $C$ is the greatest value in the array.↵
↵
#### Solution↵
The idea is to use the fact that $f(0, r) = f(0, l-1) \oplus f(l, r) \rightarrow f(0, r) \oplus f(0, l-1) = f(l, r)$. This means that one can traverse the array and maintain a variable $\text{pre_xor} = f(0, i)$ where $i$ is the current position, then all that remains to do is to find $l-1$ such that $\text{pre_xor} \oplus f(0, l-1)$ is maximised, this way one has found $\text{max} (f(0, i) \oplus f(0, l-1)) = \text{max} f(l, i)$ and solves the problem by iterating over all values of $i$ from left to right.↵
↵
The ideal way to find $l-1$ is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.↵
↵
~~~~~↵
struct trieNode{↵
ll value;↵
trieNode *arr[2];↵
};↵
↵
trieNode *newNode(){↵
trieNode *temp = new trieNode;↵
temp->value = 0;↵
temp->arr[0] = NULL;↵
temp->arr[1] = NULL;↵
return temp;↵
}↵
~~~~~↵
↵
Start traversing $a$ from index 0 and insert $\text{pre_xor}$ into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing $\text{pre_xor}$. Append $\text{pre_xor}$ to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.↵
↵
↵
~~~~~↵
void insert(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[val] == NULL) temp->arr[val] = newNode();↵
temp = temp->arr[val];↵
}↵
↵
temp->value = pre_xor;↵
}↵
~~~~~↵
↵
Given $\text{pre_xor}$ the ideal value to xor it with is one that has 0s where $\text{pre_xor}$ has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among $f(0, l-1)$ for all $0 \leq l \leq i$. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of $\text{pre_xor}$ is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal $f(0, l-1)$.↵
↵
~~~~~↵
ll query(trieNode *root, ll pre_xor, ll max_bits){↵
trieNode *temp = root;↵
↵
bool val;↵
for (ll i=max_bits; i>=0; i--){↵
val = pre_xor & (1<<i);↵
↵
if (temp->arr[!val] != NULL) temp = temp->arr[!val];↵
else if (temp->arr[val] != NULL) temp = temp->arr[val];↵
}↵
↵
return pre_xor^(temp->value);↵
}↵
↵
ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){↵
if(n == -1) n = arr.size();↵
if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());↵
ll max_bits = 64 - __builtin_clzll(max_val | 1);↵
↵
trieNode *root = newNode();↵
insert(root, 0, max_bits);↵
ll result = 0, pre_xor =0;↵
↵
for (ll i=0; i<n; i++){↵
pre_xor = pre_xor^arr[i];↵
insert(root, pre_xor, max_bits);↵
result = max(result, query(root, pre_xor, max_bits));↵
}↵
↵
return result;↵
}↵
~~~~~↵
↵
#### Why does this solve [problem:1847C]↵
It's not at all evident that problem 1847C tasks one to find the max xor subarray, but indeed it does, with ~~just~~ one little adjustment. The function to optimise is still $f(l, r)$ but instead of $0 \leq l < r < n$, $0 \leq l \leq r < n$ stands where $f(i, i)$ is equal by definition to $a[i]$ for all $0 \leq i < n$. This means that if there's a value in the array which is not smaller than any xor will ever be then one needs to return that value. This is easily solved by inserting 0 into the trie at the beginning so that each number can be xored with 0 (that is, remain the same number) if there isn't anything better to xor it with. The piece of code posted above already does so.↵
↵
If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is $f(l, n-1)$, but allow to do so multiple times. Let's show that each xor of the form $f(l, r)$ can be constructed from 2 xors of the form $f(l_{1}, n-1)$, $f(l_{2}, n-1)$. This will prove that the max xor subarray is not bigger than the solution to the problem at hand.↵
↵
It goes as simple as this $f(l, r) = f(l, n-1) \oplus f(r+1, n-1)$, but there's a little caveat: after having performed the first xor the length of array $a$ will have grown to n+1 so, formally, the process is described as follows:↵
↵
$a[0, ..., n-1] \rightarrow ~~f(l, n-1) \rightarrow~~ a[0, ..., n]$ where $a[n] = f(l, n-1) \rightarrow a[0, ..., n+1]$ where $ a[n+1] = f(r+1, n) = f(r+1, n-1) \oplus a[n] = f(r+1, n-1) \oplus f(l, n-1) = f(l, r)$↵
↵
To end the proof that the solution to problem 1847C is the max xor of a subarray it's still required to show that vampire powers cannot create a solution that's better than $\text{max} f(l, r)$, let's do it by contradiction.↵
Suppose vampire powers can create a solution that's better than $\text{max} f(l, r)$, this means that after having appended to the array several elements and having expanded it to be like $a[0, ..., m]$ where $a[i] = f(l_{i}, i-1)$ for all $n \leq i \leq m$, $0 \leq l_{i} < i-1$ one finds himself able to create a solution $f(l, m) >$ max xor of a subarray.↵
↵
$f(l_{1}, m) = f(l_{1}, m-1) \oplus a[m] = f(l_{1}, m-1) \oplus f(l_{2}, m-1)$. Let $l_{+} = max(l_{1}, l_{2})$ and $l_{-} = min(l_{1}, l_{2})$, then $f(l_{1}, m) = f(l_{-}, l_{+})$. Repeating this process eventually yields the equality $f(l_{1}, m) = f(l, r)$ where $0 \leq l < r < n$ which is absurd and concludes the proof and this post. Have a nice rest of the day.

Rev.	By	When	Δ	Comment
en13	TunaMayoSandwich	2024-07-26 21:58:08	66	Tiny change: 'eq m$, $0 leq l_{i} ' -> 'eq m$, $0 \leq l_{i} ' (published)
en12	TunaMayoSandwich	2024-07-26 21:43:15	42
en11	TunaMayoSandwich	2024-07-26 21:29:33	675	Tiny change: 'barray in O(nlogC) as it per' -> 'barray in $O(n \cdot logC)$ as it per' (saved to drafts)
en10	TunaMayoSandwich	2024-01-01 03:59:17	2	Tiny change: ' all 0 <= I <= n-1. T' -> ' all 0 <= i <= n-1. T'
en9	TunaMayoSandwich	2023-12-18 23:27:31	2	Tiny change: 'y case, than the vamp' -> 'y case, then the vamp'
en8	TunaMayoSandwich	2023-12-18 20:31:13	2	Tiny change: 'of a form index l ' -> 'of a from index l '
en7	TunaMayoSandwich	2023-12-17 23:01:47	2	Tiny change: 'st value it the array' -> 'st value in the array'
en6	TunaMayoSandwich	2023-12-17 16:54:31	23	Tiny change: 'having expended it to' -> 'having expanded it to'
en5	TunaMayoSandwich	2023-12-17 14:52:49	4	Tiny change: 'ray in O(n) as it pe' -> 'ray in O(nlogC) as it pe'
en4	TunaMayoSandwich	2023-12-17 14:51:42	51
en3	TunaMayoSandwich	2023-12-17 14:16:27	0	(published)
en2	TunaMayoSandwich	2023-12-17 14:09:12	8183	Tiny change: 'ains to this problem [' -> 'ains to the problem ['
en1	TunaMayoSandwich	2023-12-17 13:25:25	2842	Initial revision (saved to drafts)

History