

By willy108, history, 11 months ago,

This is the editorial for a recent TeamsCode contest. The problems are open for upsolving in this gym. Problems were prepared by oursaco, dutin, thehunterjames, Bossologist, Esomer, and me.

Editorial
Code

Solution
Code

Solution
Code

Solution
Code

Hint
Solution
Code

Solution
Code

Hint 1
Hint 2
Solution
Code

H. A Certain Scientific Tree Problem

There are many solutions to this problem, some simpler than others, but I'll present the intended solution which involves the distance formula between two nodes.
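For reference, the distance formula in question is presumably the standard identity $dist(u, v) = dep(u) + dep(v) - 2 \cdot dep(lca(u, v))$, where $dep(x)$ is the distance from the root to $x$; with it, tree distances reduce to depths plus an lca query.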

Hint 1
Hint 2
Solution
Code

Solution
Code

Editorial
Code

Hint 1
Solution
Code

Solution
Code

M. Magic labyrinth

Hint 1
Hint 2
Hint 3
Solution
Code for the first method
Code for the second method

Hint 1
Hint 2
Solution
Code

Hint 1
Hint 2
Hint 3
Solution
Code

Hint 1
Hint 2
Solution
Code

Q. Another Floors Problem

A solution for this problem is not published yet. For now, please refer to this tester solution.

Code

R. Bingo

Solution for k = 20 that also happens to cheese and ac
Intended solution. Can be used to solve k = 27
Code


By willy108, history, 3 years ago,

The long awaited editorial and CF gym for the TeamsCode Summer 2021 Contest is finally out! The CF gym is here with all 25 problems sorted in difficulty. In each problem at the bottom (after the sample explanation) are credits for each problem as well as in which divisions they appeared in. The editorial is linked in the form of a pdf in the CF gym and can also be found here.

The problem difficulties range from less than $800$ to well over $2500$ in terms of CF ratings. I challenge you to upsolve them. I know that the problemsetting team, codicon, Bossologist, Mantlemoose, Spark, and me, put a lot of work into making 25 quality problems this time around.

We also have a github with relevant materials as well as a Discord.

Special thanks to BucketPotato, lunchbox, Daveeed, and jli505 for testing the Advanced Division and Shreybob, Nikhil Chatterjee (nikhilc1527), and Ryan Chou for testing the Intermediate Division. Also massive props to PurpleCrayon for coordinating the round! We could not have done it without him. If there are any problems that you do not like, I guarantee you that Purple played a part in making it :).

Also thanks if you took the round! I know that a few of the problems were scuffed and harder than previous years, but I still hope you enjoyed! Now go upsolve the problems :)

Announcement of Teamscode Summer 2021

By willy108, history, 3 years ago,

disclaimer: yet again, i do not claim that this algorithm is useful. bin lifting is probably better in all aspects.

======================================================================================

intro

if you havent already, consider reading this, since most of my terminology will be from there.

cube root decomp

before i start, a warning: this algorithm is entirely useless. bin lifting has both a shorter impl and a better time complexity. but understanding this algorithm will help with the generalized algorithm later in this article.

so i first came up with this idea when i was doing the CSES Distance Queries problem with my sqrt lca and it tle'd. so i was thinking "why on earth is sqrt lca so slow", and then it hit me: sqrt per query is horrible (and i should learn bin lifting, but i would rather brainstorm useless algs like this instead of learning bin lifting). but then i thought, what if i used cube root instead of square root? at first glance the complexity would go from O(n^(1/2)) to O(n^(2/3)), since i would have to loop over up to n^(2/3) segments of length n^(1/3). but thankfully we can do better than that.

so lets do some defining of terms:

CUBE is a constant such that CUBE^3 >= n and CUBE is minimal (a fancy way of saying the cube root of n)

par[u] is the parent directly up from u

super[u] is redefined as the parent CUBE steps up from u

duper[u] is defined as the parent CUBE^2 steps up from u (i have great naming skills btw)

we'll get onto how to compute super and duper fast in a second, but first i'll show you why they help.

so first is the k_up function

int k_up(int u, int k){ //returns the k'th parent up from node u
    for(; k >= CUBE*CUBE; k -= CUBE*CUBE) u = duper[u];
    for(; k >= CUBE; k -= CUBE) u = super[u];
    for(; k--; ) u = par[u];
    return u;
}


so what is the complexity of the above code? well, the first loop iterates at most CUBE times: since k <= n and you go up CUBE*CUBE steps each time, there are at most n/(CUBE*CUBE) <= CUBE iterations (similar to how sqrt lca works). the second loop also iterates at most CUBE times, since after the first loop k < CUBE*CUBE, and there are only CUBE CUBE's in CUBE*CUBE. the third loop also runs at most CUBE times since k < CUBE after the second loop. so this k_up implementation runs in O(CUBE) (hahah CUBE is a constant so its O(1) hahahh ahahah big O abuse hahaha).

now the important thing is the actual lca finding. for two nodes on the same level, you can also use the duper/super/par arrays to find their lca in O(CUBE) (if they are on different levels, just use the k_up function to move the lower one up). so lets say two nodes u and v (on the same level) have the same duper value. then their lca is no further from u (and v) than duper[u] is, since if it were further up, duper[u] would itself be a closer common ancestor (this is more or less the same as the proof of correctness of the sqrt lca). if their duper values are different, you set u = duper[u] and v = duper[v] and try again; this happens at most CUBE times (just like the k_up complexity). then you can do the same thing with the super array: since the distance from u to duper[u] is exactly CUBE^2 and we know the lca is at least as close as duper[u], the problem becomes the same thing on a tree of size CUBE^2 (instead of CUBE^3 as it was previously), then on a tree of size CUBE with the par array, and you have an lca! this is very similar to the sqrt lca idea, its just that i used 2 arrays (duper and super) instead of 1 (super).

below is an implementation of the code.

int lca(int u, int v){
    if(dep[u] < dep[v]) swap(u, v); //im just yoinking this
    u = k_up(u, dep[u] - dep[v]);   //from my sqrt lca code
    while(duper[u] != duper[v]) u = duper[u], v = duper[v]; //you can only jump duper CUBE times before it becomes the root
    while(super[u] != super[v]) u = super[u], v = super[v]; //you can only jump super CUBE times before you wouldve just jumped duper again
    while(u != v) u = par[u], v = par[v]; //you can only jump par CUBE times before you would've just jumped super
    return u; //or v since they should be equal
}


and this is O(CUBE) per lca query, since k_up runs in O(CUBE) and each loop runs in O(CUBE). now for the precomp: since duper[u] is CUBE^2 nodes up and super[u] is CUBE up, you can compute super by looping par CUBE times, and duper by looping super CUBE times. below is some code since this'll come up later

void precomp(){
    //assume dep and par are computed elsewhere; if you dont know how to compute those, you shouldnt be reading abt lca :/
    for(int i = 1; i <= n; i++){ //assume n is global and there are n nodes in the tree
        super[i] = i;
        for(int j = 0; j < CUBE && super[i] != 0; j++) //optimization to stop if super goes out of the tree
            super[i] = par[super[i]]; //just looping par up CUBE times, like the sqrt lca but with CUBE
    }
    for(int i = 1; i <= n; i++){
        duper[i] = i;
        for(int j = 0; j < CUBE && duper[i]; j++)
            duper[i] = super[duper[i]]; //the same thing but jumping super up to save time
    }
}


the precomp complexity is O(n*CUBE) from the nested loops, and queries are O(CUBE). yet again, you can do precomp in O(n) but im too lazy to cover it; if you figured out how to do O(n) precomp for the sqrt version, its the same thing but with 2 arrays to maintain. this finally ac'ed the cses problem :D. since CUBE < SQRT this algorithm runs better (even if barely and more scuffed).
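for the curious, the O(n) precomp can be sketched like this: dfs from the root while keeping a stack of the current node's ancestors, so super and duper become plain lookups into that stack. this is my own sketch (names like adj and stk are made up here, and it assumes the tree is rooted at node 1 with 0 meaning "no node"):

```cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100005;
int CUBE;                        // ~ cbrt(n); set before calling dfs
vector<int> adj[MAXN];           // adjacency list of the tree (assumed built elsewhere)
int par[MAXN], dep[MAXN], super[MAXN], duper[MAXN];
vector<int> stk;                 // stk[d] = ancestor of the current node at depth d

// one dfs from the root computes par, dep, super, and duper in O(n):
// the ancestor CUBE (or CUBE*CUBE) steps up is just a lookup into stk
void dfs(int u, int p){
    par[u] = p;
    dep[u] = (int)stk.size();    // here dep[root] = 0
    super[u] = dep[u] >= CUBE        ? stk[dep[u] - CUBE]        : 0;
    duper[u] = dep[u] >= CUBE * CUBE ? stk[dep[u] - CUBE * CUBE] : 0;
    stk.push_back(u);
    for(int v : adj[u]) if(v != p) dfs(v, u);
    stk.pop_back();
}
```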

nth root lca

well, can we generalize it? what if you used 3 arrays super, duper, and juper (haha great naming) and did O(n^(1/4)) per query and O(n * n^(1/4)) precomp, or 4 arrays for n^(1/5)? you can, and i dont want to implement each one, so.... the idea is that you can use R arrays (including the par array) to handle queries in O(R * n^(1/R)) with O(n * R * n^(1/R)) precomp. just note that if you make R too big, nothing happens, since you'd be jumping up by 1 or 0 each time; actually, if you set R to be log2(n) you end up with bin lifting. so imagine i actually implemented decomp with n^(1/4), and then realized i should make it a 2d array where super[u][j] stores the n^(j/4) parent up from u. then i decided not to limit myself to 4: i pick a constant R and super[u][j] stores the parent n^(j/R) up from u. you can actually just set the par array to be super[][0], since n^(0/R) is 1 and that's exactly what the parent array stores. below is an implementation of the functions needed for lca (namely, k_up, lca, and precomp).

i left RT and LOG blank, that's for the reader to play with :), LOG is the exponent and RT is the "root"

super[u][j] is the parent RT^j steps up

base[i] = RT^i

par[u] is the parent

dep[u] is the distance from the root, dep[root] is arbitrary (0 or 1, your choice)

const int max_v = ;
const int RT = , LOG = ; //RT^LOG approx max_v
int super[max_v][LOG * 2], par[max_v], dep[max_v], base[max_v], n, q;

void dfs(int u, int p, int d){ //general util dfs, this is O(n)
    par[u] = super[u][0] = p;
    dep[u] = d;
    for(int v : adj[u]) //assume adj is the adjacency list of the tree
        if(v != p)
            dfs(v, u, d + 1);
}

void precomp(){ //three nested for loops so its just O(n * RT * LOG)
    base[0] = 1;
    for(int k = 1; k <= LOG; k++){
        base[k] = base[k - 1] * RT;
        for(int i = 1, j; i <= n; i++){
            for(j = 0, super[i][k] = i; j < RT && super[i][k]; j++){
                super[i][k] = super[super[i][k]][k - 1];
            }
        }
    }
}

int k_up(int u, int k){            //O(LOG*RT)
    for(int i = LOG; i >= 0; i--){ //similar proof of complexity as the cube root decomp
        while(k >= base[i]) k -= base[i], u = super[u][i];
    }
    return u;
}

int LCA(int u, int v){
    if(dep[u] < dep[v]) swap(u, v);
    u = k_up(u, dep[u] - dep[v]);
    if(u == v) return u;
    for(int i = LOG; i >= 0; i--){ //this is always O(LOG*RT); just think abt the proof for cube root, but the "root" we're using isnt constant
        while(super[u][i] != super[v][i]) u = super[u][i], v = super[v][i];
    }
    return (par[u]) ? par[u] : u; //cant return 0
}


hopefully, it makes sense, it's just compressing the idea described above with some nested for loops and 2d arrays. so it is O(n * RT * LOG) for precomp and O(RT * LOG) per query. just note we were playing with small values of LOG (2 and 3) so we didnt really care abt it, but when you use anything more, it's no longer "constant" if you see what i mean.

resolution?

thank you for reading this scuffed blog post, and please do in fact learn bin lifting. it's cleaner than my "nth root decomp" and it runs faster too (unless you set RT to 2 and LOG to log2(n) >:) ). the only advantage of this method is that you can trade memory for time easily: if you choose RT to be larger, you'll use less memory, and if you set RT to be small, you'll use more. the memory complexity is always O(n * LOG) since that's the size of the super array. yet again, please do learn bin lifting; this article is what came out of someone who was too lazy to learn it. and after doing all of this, i always write bin lifting for lca. hahah

thanks for reading and i apologize for any typos, unclearness, or over scuffedness.


By willy108, history, 3 years ago,

Disclaimer: I will not promise that this blog will make you code lca better. binary lifting is generally better than the following algorithm and i am only sharing it since i find it very interesting.

===================================================================================

intro (skip this if you know what lca is)

if you didnt know, lca stands for least common ancestor (and implies a rooted tree). an ancestor of a node u is any node on u's path to the root node (including the root itself). in the diagram below, nodes 1 and 2 are both ancestors of node 4. the common ancestors of nodes u and v are the nodes that are both ancestors of u and ancestors of v, and the least common ancestor of nodes u and v is the first node that u and v share as an ancestor (first meaning the ancestor with the largest depth). you can see in the picture below that the lca of 4 and 6 is 2, since that's the node with the furthest depth from the root that is both an ancestor of 4 and an ancestor of 6. 1 does not count as the lca since 2 is further down, and 5 does not count since it is not an ancestor of 4 (or 6 for that matter). for any pair of nodes on a rooted tree there is only one lca (hopefully this is intuitive), and this blog post describes ways to find that lca.

here is a sketch of the tree used in the examples (the original image was pulled from a google search; this drawing just matches the node references in the text). remember lca only works on a rooted tree since you have a sense of "up" and "down" only if there is a root

        1
       / \
      2   3
     /|\   \
    4 5 6   8

Sqrt LCA

imagine if we had a function k_up where k_up(u, k) returns the k'th node up from u. so from the diagram above k_up(2, 1) is 1 and k_up(8, 2) is also 1. this will be very helpful to do lca as we will see later. the array par[] stores for each u the direct parent of u (the parent of the root node is an imaginary node, 0 or -1 depending upon your implementation). in the diagram above par[2] = 1, par[5] = 2, and par[8] = 3.

so in all of the following code assume the par array is already calculated; as for how to calculate it.... go google it.

int k_up(int u, int k){
    for(int i = 0; i < k; i++){
        u = par[u];
    }
    return u;
}


so what is the complexity of the above code? its O(k) since you apply par k times. darn that's slow. i really wish we could do that faster. well that's the point of this blog.

so now lets define a constant R; all we have to know abt R is that it's a constant <= the number of nodes in the tree. now for each u we define super[u] as the parent of u R times up (super for superparent, i love that word "superparent"). in the case that going R steps up from u takes us higher than the tree allows, super[u] = 0 (or -1, depending upon your implementation). so how does this help? well, now in k_up(u, k) we can first jump the superparent floor(k/R) times, then loop k%R times on the parent array. so in total we move up floor(k/R)*R + k%R steps (or k steps :OOO). now the complexity is reduced to O(k/R + R), since the first half can take at most k/R jumps and the second half can take at most R steps. below is some code of the idea

int k_up(int u, int k){
    while(k >= R) k = k - R, u = super[u];
    for(int i = 0; i < k; i++) u = par[u];
    return u;
}


now there are a few problems, namely how to find super[] and what R is. if we set R to be sqrt(n), O(k/R + R) is minimized (hopefully the intuition makes sense); in fact, picking R like this is so common its even got a name: "sqrt decomp". and to find super, you can, for each u, loop up R times to find the parent R steps up. you can do this even faster, but im too lazy to cover that.
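that "loop up R times" precomp might look like this (a sketch with my own names; it assumes par is already computed, with par[root] = 0 as the imaginary out-of-tree node):

```cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100005;
int n, R;                        // R ~ sqrt(n); both set before calling
int par[MAXN], super[MAXN];      // par[root] = 0, the imaginary "out of tree" node

// O(n * R): for every node, walk the par array R steps up
// (stopping early once we leave the tree)
void precomp_super(){
    for(int u = 1; u <= n; u++){
        super[u] = u;
        for(int i = 0; i < R && super[u] != 0; i++)
            super[u] = par[super[u]];
    }
}
```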

be sure you understand all of this since the concept of a k_up function will come up a lot later in this blog.

now onto the actual lca part. consider a rooted tree and any two nodes u and v that are on the same level, i.e. have the same distance from the root. if we just marched up from both nodes, taking 1 step from each at the same time, the first node where the two meet is their lca (note this only works if u and v start on the same level). below is an implementation of the idea

int lca(int u, int v){ //remember that they have the same depth
    while(u != v){
        u = par[u], v = par[v];
    }
    return u;
}


so whats the complexity of this code? its O(d) where d is the depth difference between node u (or v) and the lca. in the worst case this is O(n) where n is the number of nodes in the tree (imagine a chain). so what can we do to speed it up? let us refer to the super array again. if super[u] == super[v], then we know that the lca of u and v has to be super[u] or further down the tree: if it were further up, we could just take super[u] instead, since super[u] is an ancestor of u and super[v] is an ancestor of v (and they are the same :O). so (remembering nodes u and v have the same depth) we can just loop over super[u] and super[v] while they are different, and when they are the same, we loop over the parent array. below is the code for this idea.

int lca(int u, int v){ //remember that they have the same depth
    while(super[u] != super[v]){
        u = super[u], v = super[v];
    }
    while(u != v){
        u = par[u], v = par[v];
    }
    return u;
}


what is the complexity of the above code? its O(n/R + R) worst case, which is O(sqrt(n)) since R is sqrt(n). sadly this only finds the lca when the two nodes have the same depth from the root. luckily this is where our k_up function comes in handy. if we define dep[u] as the depth of node u from the root with dep[root] = 1, then we can just set u to be the deeper of u and v and do u = k_up(u, dep[u] - dep[v]). and now it becomes lca on the same depth. below is the implementation of the idea.

int k_up(int u, int k){
    while(k >= R) k = k - R, u = super[u];
    for(int i = 0; i < k; i++) u = par[u];
    return u;
}

int lca(int u, int v){
    if(dep[u] < dep[v]) swap(u, v); //setting u to be the node with greater depth
    u = k_up(u, dep[u] - dep[v]);   //moving u up to the same level as v
    while(super[u] != super[v]){    //iterating over super
        u = super[u], v = super[v];
    }
    while(u != v){                  //iterating over par
        u = par[u], v = par[v];
    }
    return u;                       //u and v will be the same, so return v works too
}



since k_up is also O(sqrt(n)) worst case, the overall query is still O(sqrt(n)) in time complexity.

and our lca implementation is complete! its O(n sqrt(n)) precomp (or O(n) if you are smart abt it, try figuring out how later :D) and O(sqrt(n)) per query. a more complete implementation can be found here (along with calculating the par, super, and dep arrays).
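one way to do that O(n) precomp can be sketched like this (my own code, not from the linked implementation; adj and stk are names i just picked): during the dfs, keep a stack of the ancestors of the current node, so super[u] is just the stack entry R positions from the top.

```cpp
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 100005;
int R;                           // ~ sqrt(n); set before calling
vector<int> adj[MAXN];           // adjacency list (assumed built elsewhere)
int par[MAXN], dep[MAXN], super[MAXN];
vector<int> stk;                 // stack of ancestors of the current dfs node

// computes par, dep, and super in one O(n) dfs: the ancestor R steps up
// is a direct lookup into the ancestor stack, no walking needed
void dfs(int u, int p){
    par[u] = p;
    dep[u] = (int)stk.size() + 1;                          // dep[root] = 1, as in the text
    super[u] = (int)stk.size() >= R ? stk[stk.size() - R] : 0;
    stk.push_back(u);
    for(int v : adj[u]) if(v != p) dfs(v, u);
    stk.pop_back();
}
```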

Conclusion

the overall time complexity is O((n+q)sqrt(n)) and the memory is O(n), since we just have to store the par, super, and dep arrays. if you squish the code, it becomes a pretty short impl (though a slow one), and you can make the complexity O(n + q*sqrt(n)) if you want.

please remember O(sqrt(n)) for lca queries is really slow. reaaallllly slow. there are better lca algorithms, but this is a cool one i wanted to introduce. also note the actually fun lca algorithm (that's not bin lifting) will be in the next blog post since this one got too long. rip.

thank you for reading and i apologize for any typos, instances of unclearness, and bugs in the code.


By willy108, history, 3 years ago,

This is my first blog post, so it probably will end up being scuffed.

==============================

The problem

So I was working on CF 472D, Design Tutorial: Inverse the Problem, which requires an MST (sorry for spoiling), and I chose to use Kruskal's to build it (just note that I could've avoided this issue by using any other MST alg, but that's not the point here). So I coded vanilla Kruskal's where I std::sort'ed the edges and used a DSU with path compression and union by size. I submitted the code and, lo and behold, it TLE'd on tc 39! I entered into a state of depression and decided to change my std::sort to std::stable_sort, and wow, it ac'ed (with 1934ms on a 2 second TL)! Well, I was talking to some people on Discord later and brought this up, and one of them told me that the only reason std::sort was slow was that I had used c++11 (which I did) instead of c++17, where std::sort was upgraded. So I submitted it again with std::sort and c++17 and yay, it passed (with 1466ms), some ~470ms faster than my last submission! But to satisfy my curiosity I submitted it again with c++17 and std::stable_sort, and woah, it passed with 1154ms, some 310ms faster than the last one.

Please do not judge my template/coding style as that is not the point here, but if you look at my code, the only differences between the submissions are the submission settings and a different sort on line 65 (or the second line of the kruskals function).
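For context, here is a minimal sketch of the Kruskal's shape described above (this is an illustration, not my actual submission): sort the edges, then union with a DSU using path compression and union by size. The single line at issue is the marked sort call, which you can flip between std::sort and std::stable_sort.

```cpp
#include <bits/stdc++.h>
using namespace std;

struct Edge { int w, u, v; };

// DSU with path compression and union by size
struct DSU {
    vector<int> p, sz;
    DSU(int n) : p(n), sz(n, 1) { iota(p.begin(), p.end(), 0); }
    int find(int x){ return p[x] == x ? x : p[x] = find(p[x]); } // path compression
    bool unite(int a, int b){
        a = find(a), b = find(b);
        if(a == b) return false;
        if(sz[a] < sz[b]) swap(a, b); // union by size
        p[b] = a, sz[a] += sz[b];
        return true;
    }
};

long long kruskal(int n, vector<Edge> edges){
    // the line in question: swap stable_sort for sort to compare the two
    stable_sort(edges.begin(), edges.end(),
                [](const Edge& a, const Edge& b){ return a.w < b.w; });
    DSU d(n);
    long long total = 0;
    for(const Edge& e : edges)
        if(d.unite(e.u, e.v)) total += e.w; // take the edge if it joins two components
    return total;
}
```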

the submissions from each correlate to:

• 1154ms, c++17 with stable_sort

• 1466ms, c++17 with sort

• 1934ms, c++11 with stable_sort

• TLE tc 39, c++11 with sort

also note I did not benchmark memory since that was not something that bothered me in all of this

===============================

The resolution?

So here is one (of probably very few) situations where std::stable_sort outperformed std::sort. The problem I have right now is that I do not want to lose a submission in a future contest over which sort I chose, since there are definitely problems out there where this issue cannot be avoided by something as simple as using Prim instead of Kruskal.

If anyone has a good suggestion of which to use (std::stable_sort or std::sort), or a simple impl of a sorting alg that I can write in a little more time for similar results, please link it. I will probably just end up using stable_sort from now on, since that was what made the difference here, unless there is a really convincing argument or implementation that'll make me switch. No upvoting is needed (unless you want to :D), I just need the help. Thanks for reading and I apologize for any typos, unclear sections, and overall scuffedness.
