Alternative Editorial to 2019 USACO Gold February, Problem 1 (Cow Land)

#	User	Rating
1	tourist	3690
2	jiangly	3647
3	Benq	3581
4	orzdevinwang	3570
5	Geothermal	3569
5	cnnfls_csy	3569
7	Radewoosh	3509
8	ecnerwala	3486
9	jqdai0815	3474
10	gyh20	3447

#	User	Contrib.
1	maomao90	174
2	awoo	165
3	adamant	161
4	TheScrasse	160
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	orz	146
9	SecondThread	145
9	pajenegod	145

The official editorial to this problem is not asymptotically optimal and involves an advanced concept that's infrequent in USACO Gold. I managed to derive a relatively "low-tech" and faster solution, which I will describe below.

Problem link

Solution

for the sake of convenience, let $$$\sum\limits_u^v$$$ denote the XOR of all labels on the path from node $$$u$$$ to node $$$v$$$.

Subtask 1

We want a way to efficiently calculate the XOR of every label along an arbitrary path without modifications. Note that after rooting the tree at an arbitrary root $$$r$$$, if we split a path by the LCA of its endpoints, this can be computed alongside binary-jumping LCA, but that approach can't be easily transferred to subtask 2.

The key observation is that, since XOR is an involution, if we take the XOR of $$$\sum\limits_u^r$$$ and $$$\sum\limits_v^r$$$, nodes that are on both paths will not contribute. By definition, these are exactly the common ancestors of $$$u$$$ and $$$v$$$, and almost all of them are nodes that we wouldn't want to count in the query!

Since the LCA is the only common ancestor of $$$u,v$$$ on the path between them, we can simply return $$$(\sum\limits_u^r\oplus\sum\limits_v^r)\oplus e_{\text{LCA(u,v)}}$$$ for each query. The "prefix sums" of labels can be pre-computed in $$$O(n)$$$ with any graph traversal algorithm, so the overall complexity of this subtask is $$$O(n+b(n)+q\cdot (1+c(n))$$$, where $$$b(n),c(n)$$$ are the pre-computation and query times of our LCA method, respectively.

Subtask 2

Going further along the idea to keep track of paths sourcing from the root, for some arbitrary node $$$x$$$, which prefix sums will have to be updated if it is modified? The answer is exactly every node that has $$$x$$$ as an ancestor. This is the subtree of $$$x$$$ by definition, and subtree modifications are right up the alley of the Euler Tour technique!

If we establish an Euler Tour of the tree, we can effectively modify subtrees in $$$O(\log(n))$$$ with any data structure that supports range modification, such as a lazy propagation segment tree. However, note that when retrieving the answer, we only need to query two individual nodes, so it's sufficient to use a slightly modified version of an iterative segment tree due to Al.Cash, where a node keeps track of every update ever done over its corresponding segment, instead of the value of its corresponding segment; that way, to query a specific index, we can traverse along the path from it to the root of the segment tree, and their XOR is the current value of that index. A more detailed explanation of iterative segment trees and this specific trick can be found here.

(EDIT: it's also plausible to simply use a typical point-update range-query data structure on the difference array. Not sure why I didn't include that in my initial revision.)

Every technique used here can be found in multiple past USACO Gold problems (excluding range modification/point query, but that is a fairly natural extension of vanilla segment trees IMO), hence I consider my solution canonical to the gold division.

Time complexity: $$$O(n+b(n)+q\log(n))$$$.

My C++ code, which comfortably AC's, is shown below. Credits to Al.Cash once again for the segment tree implementation.

Click

The DFS2 method implicitly keeps track of a reverse topological sort, and retrieves the original tag for each node by XOR-ing its prefix sum with its parent's prefix sum in reverse topological order.

#include <bits/stdc++.h>
using namespace std;

#define MAXN 100000
#define MAXLOG 18
int t[2*MAXN],st[MAXN],en[MAXN],n,q,timer,val[MAXN],up[MAXN][MAXLOG],l2;
vector<int> adj[MAXN];


void dfs(int node, int parent) {
    st[node] = timer++;
    up[node][0] = parent;
    for (int i = 1; i <= l2; ++i){
        up[node][i] = up[up[node][i-1]][i-1];
    }
    for (int i : adj[node]) {
        if (i != parent) {
            val[i] ^= val[node];
            dfs(i, node);
        }
    }
    en[node] = timer;
}

void dfs2(int node, int parent) {
    for (int i : adj[node]) {
        if (i != parent) {
            dfs2(i,node);
        }
    }
    if (node) val[node] ^= val[parent];
}

void modify(int l, int r, int value) {
    for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
        if (l&1) t[l++] ^= value;
        if (r&1) t[--r] ^= value;
    }
}


int query(int p) {
    int res = 0;
    for (p += n; p > 0; p >>= 1) res ^= t[p];
    return res;
}


bool is_ancestor(int u, int v) {
    return (st[u] <= st[v]) && (en[u] >= en[v]);
}


int lca(int u, int v) {
    if (is_ancestor(u, v))
        return u;
    if (is_ancestor(v, u))
        return v;
    for (int i = l2; i >= 0; --i) {
        if (!is_ancestor(up[u][i], v)) u = up[u][i];
    }
    return up[u][0];
}


int main() {
    freopen("cowland.in", "r", stdin);
    freopen("cowland.out", "w", stdout);
    cin >> n >> q;
    l2 = (int)ceil(log2(n));
    for (int i = 0; i < n; i++) cin >> val[i];
    for (int i = 1; i < n; i++){
        int a,b;
        cin >> a >> b;
        adj[--a].push_back(--b); adj[b].push_back(a);
    }
    dfs(0,0);
    for (int i = 0; i < n; i++) t[n + st[i]] = val[i];
    dfs2(0,0);
    while(q--){
        int type; cin >> type;
        if (type == 1){
            int i,v;
            cin >> i >> v;
            int delta = val[--i]^v;
            val[i] = v;
            modify(st[i], en[i], delta);
        }
        else {
            int i,j;
            cin >> i >> j;
            int ret = query(st[--i]) ^ query(st[--j]);
            ret ^= val[lca(i,j)];
            cout << ret << endl;
        }
    }
}

Thank you for reading, and remember to praise the cows.

nsqrtlog's blog

Problem link

Solution