retrograd's blog

By retrograd, history, 4 days ago, In English,

TL;DR I am thinking of holding a series of livestreams on youtube/twitch of me solving problems without having seen them beforehand. I want to highlight how to approach a medium-hard programming task when you are faced one. I want this to be a social and chill environment. Your ideas/feedback would be greatly appreciated.

TS;WR

Hello guys! I have been doing competitive programming for a while, and lately I have been thinking about how I have developed myself during the past years, and what is different between how I used to approach problems in the beginning and how I do it nowadays. I was thinking that it would be a pretty nice idea to give back my thoughts to the community.

What I was thinking about is recording some videos or streaming myself solving some competitive programming tasks that I haven't seen before. The inspiration for this idea came from a friend (teammate) of mine who wanted to do this quite some time ago, in order to train for ICPC. The format would sound like this: you would propose tasks that you find interesting or that you had been challenged by when approaching them on first sight, and I would try to solve them during the stream (deciphering the statement, finding the solution, implementing the solution).

Why I want to do this

I have been coaching some people for a while, and during this time I have the feeling that somehow explaining a problem/solution turns out to be very different when you have solved the problem beforehand as opposed to coming up with it when you are facing with the task for the very first time. Not only are there a lot of pitfalls that I tend to forget having made after I had solved a problem, but also it's that a lot of the times I don't recall the thought process that needed to happen in order to solve it. That's why it's sometimes hard for me to even give hints to people, if they're stuck on a task, for example.

With this format, I am hoping to capture and highlight the general thought process that happens between reading the statement of a problem and getting AC on it, in addition to the solution itself. It might also include tips on implementation and debugging along the way.

Some more details (FAQ style)

So is this some sort of online coaching sessions?

Yes and no. While the main goal of this is me trying to highlight the thought process of solving the problem, I want to see it as more of a group where you and I both discover which ideas and approaches work on tasks, and which don't

What happens when you run out of ideas for a solution?

Some problems might pose difficult for me as well, and given that I would be live, there's a high chance that I might get stuck. At these times, I think I might move on to another problem (which is probably what I would do in a contest), or ask for a hint from you guys.

How do we propose tasks?

What I was thinking of is a simple poll where you could add options, and vote for the best ones. This website seems to provide a quite neat interface for creating exactly what I need. You could also post a comment, and whichever comments receive most upvotes will be chosen. This way, the post will be kept more alive until the actual stream.

Some people solve contests; why don't you just do that?

While I certainly don't want to restrict myself to this proposed format, the reason why I am thinking of solving problems "offline" as opposed to during the contest boils down to the fact that even though solving contests would yield a similar scenario, I am usually more stressed during contests, and that would relate to less talking and more thinking; for now, I think solving tasks offline would provide me with enough calm to more easily and succinctly describe my thoughts.

When is it going to happen?

I was thinking of doing the first stream on Sunday (probably at around 3 PM EEST, probably on YouTube). Not much is settled (I would like to hear your opinions first); more details coming soon.

Vote your options here (you can also just comment down below)

Please only use links when adding options. I would pick the tasks that have most votes during the stream. For this session, please restrict yourselves to problems that are not too hard, so that I would be able to build up my courage :). Problems can be on any judge from Codeforces, Atcoder, CSAcademy, Kattis, Infoarena, and others. They can be either ICPC style (preferred) or IOI style. Difficuly level should be around Codeforces Div1 A/B/C (or Div2 C/D/E)

Please, do your best to check beforehand if I have solved a task, and do not add/upvote tasks which I have already solved.

Final words

I will be updating this post as the week progresses, but I was very eager to post this. I am very excited about this idea, and I can't wait to hear your feedback. Any ideas of yours are greatly appreciated.

I hope that I'll see you during the stream!

Read more »

 
 
 
 
  • Vote: I like it
  • +181
  • Vote: I do not like it

By retrograd, history, 4 weeks ago, In English,

There seems to be a lot of encouragement for new people to learn segment trees, and in particular the lazy propagation technique, and it seems to me that most of the time it is not actually needed.

As a quick refresher (although I feel most of you would already know), lazy propagation is a technique in segment trees that lets you do updates on whole ranges of elements, by first only updating the smallest factoring of the update range in the tree. For example, in order to update range $$$[2, 5]$$$, what you do is you update only ranges $$$[2, 2], [3, 4], [5, 5]$$$ in the segment tree, and next time node $$$[3, 4]$$$ is accessed you "propagate" the updates downward into ranges $$$[3, 3]$$$ and $$$[4, 4]$$$. This allows you to effectively aggregate (combine) multiple updates on a given range, to save extra work.

Usually when people talk about lazy propagation, some tasks like "add on range" — "minimum on range" or "add on range" — "sum of range" naturally come to mind. However, I feel like these are poor examples of applications, as most of these can be solved without lazy propagation.

OMG, how is that possible?

An alternative way one could approach these kind of problems is to keep an extra array $$$lazy$$$ with the usual segment tree. $$$lazy(node)$$$ is an aggregate of all the update operations done on $$$node$$$ until present time. In the above examples, $$$lazy(node)$$$ would be the sum of all the updates done on the node. Then the recurrence becomes $$$data(node) = lazy(node) + min(data(node * 2), data(node * 2 + 1))$$$ in the "minimum on range" case and $$$data(node) = data(node * 2) + data(node * 2 + 1) + lazy(node) * length(node)$$$ in the "sum on range" case (here $$$length(node)$$$ denotes the length of the range corresponding to $$$node$$$ in the tree.

The way to query is by having an extra parameter passed down in the traversal, which aggregates the operations inside $$$lazy$$$ while going down in the tree. The technique is similar to something called "upwards-downwards dynamic programming on trees" in Romania.

An example implementation is here.

So, you still store lazy array, but you just don't propagate. Why should we care?

Honestly, firstly I would say that it's more simple and natural. A lot of data structure problems I've encountered need some sort of "lazy" value that holds an aggregate of operations that affect the whole structure (take a simple data structure problem, where you have to model adding a constant value to all elements of a collection, and pop the min element, which can be done by using a priority queue, along with an extra value that lazily aggregates all add operations).

Second of all, I think it's easier to debug a solution that does not propagate, as I feel that I can reason about the correctness of the segment tree values more easily with this approach. In contests like ACM ICPC, it is key to debug on paper as much as possible when applicable, and with this approach one could simulate the range updates and build expectations upon the data structure at different points in time easier.

Third of all (and maybe most importantly), it is way faster. I will probably make a full comparison if people are interested, but from my experience I found that lazy propagation yields a particularly slow solution (hidden constant is bigger), probably because it does a lot more mutations on the underlying array, and this approach reduces that constant very considerably.

Ok, then, why ever propagate?

Well, it seems that not all problems can be solved via this method. For example, a "set to constant value on range" and "sum on range" type problem cannot be easily solved by this. What is particular about these scenarios is that the update operations are order-dependent (in other words, they are not commutative). However, in a lot of cases the update operations are commutative, and I don't see why any of these kind of problems would use lazy propagation instead of this technique.

I'm curious to hear your opinions on this.

Read more »

 
 
 
 
  • Vote: I like it
  • +65
  • Vote: I do not like it

By retrograd, history, 2 months ago, In English,

I was recently trying to solve problem K. Kingdom of Ants on Kattis. However, my submission strangely receives a Memory Limit Exceeded verdict, and I can see no way of achieving that with my code.

Code: https://pastebin.com/0sXQaCdi

Do you guys spot any reasons why the above code would give MLE? Maybe some weird C++ undefined behaviour, or maybe some compiler flaw?

UPDATE: After checking for the (very degenerate) case if there is only one y-coordinate (right between lines 119 and 120), the solution gets Accepted verdict. However, I still don't understand why the initial verdict was MLE, and it was very misleading.

Read more »

 
 
 
 
  • Vote: I like it
  • +3
  • Vote: I do not like it

By retrograd, history, 22 months ago, In English,

I wanted to showcase a simpler algorithm for the closest pair of points in 2D problem, and maybe to discuss its performance / countercases.

The problem is:

Given N distinct points in Euclidean 2D space, compute the minimum (squared) distance between any two distinct points.

The usual approach here is a divide-and-conquer algorithm, which can be found virtually anywhere, including on Wikipedia. The complexity of this algorithm is O(nlog(n)), but it is rather tricky to achieve this complexity.

The alternative approach (based on the same algorithm), is to do sweep-line. We sort the points based on the x-coordinate and we keep a set of the points in the region x - d, x, sorted by y coordinate. Here d is the smallest distance so far (we do that with the two-pointers technique). Now, for each new point x, y, we query the whole range y - d, y + d in this set and possibly update our answer.

Due to the proof of the D&C algorithm, at each time the quieried range should be of size O(1) on average, so total complexity would be O(nlog(n)). Code is below:

long long ClosestPair(vector<pair<int, int>> pts) {
    int n = pts.size();
    sort(pts.begin(), pts.end());
    set<pair<int, int>> s;

    long long best_dist = 1e18;
    int j = 0;
    for (int i = 0; i < n; ++i) {
        int d = ceil(sqrt(best_dist));
        while (pts[i].first - pts[j].first >= best_dist) {
            s.erase({pts[j].second, pts[j].first});
            j += 1;
        }

        auto it1 = s.lower_bound({pts[i].second - d, pts[i].first});
        auto it2 = s.upper_bound({pts[i].second + d, pts[i].first});
        
        for (auto it = it1; it != it2; ++it) {
            int dx = pts[i].first - it->second;
            int dy = pts[i].second - it->first;
            best_dist = min(best_dist, 1LL * dx * dx + 1LL * dy * dy);      
        } 
        s.insert({pts[i].second, pts[i].first}); 
    }
    return best_dist;
}

What do you think? Are there any special cases (e.g. many points at the same x coordinate) that would break the solution?

Read more »

 
 
 
 
  • Vote: I like it
  • +43
  • Vote: I do not like it

By retrograd, history, 2 years ago, In English,

This article will be presenting a rather classical problem that can be solved using deques, along with an extension that allows you to solve the problem in its more general multi-dimensional case. I have decided to write this article after this discussion on 2D range-minimum query.

The article will be mainly based on this following problem:

You are given an array of numbers A[] of size n and a number k ≤ n. Find the minimum value for each continuous subarray of size k.

We will be now focusing on the linear-time solution to this problem.

Solution:

Consider sweeping from left to right through the array. At every moment we keep a list of "candidates" for minimum values throughout the process. That means that at each moment, you have to add one element to the list and (potentially) remove one element from the list.

The key observation is that, during the sweep line process, we find two values A[i] and A[j] which have i < j and A[i] ≥ A[j], then we can safely discard A[i]. That is because, intuitively, A[j] will continue to "live" in our sweep line more than A[i], and we will never prefer A[i] instead of A[j].

We should now consider pruning all the "useless" values ("useless" as in the statement above). It is easy to see now that doing this will lead to a strictly increasing list of candidates (why?). In this case, the minimum will always be the first element (O(1) query).

In order to insert an element to the back of the pruned candidate list, we will do a stack-like approach of removing all elements that are greater than it, and to erase on element, we just pop the front of the list (if it is not already removed).

This is a well-known approach for finding minima over fixed-size continuous subarrays. I will now present an extensions that allows you to do the same trick in matrices and even multi-dimensional arrays.

The multi-dimensional extension

Problem (2D):

You are given an matrix of numbers A[][] of size n × m and two numbers k ≤ n, l ≤ m. Find the minimum value for each continuous submatrix of size k × l.

Solution:

Consider the matrix as a list of rows. For each row vector of A, use the 1D algorithm to compute the minimum value over all l-length subarrays, and store them in ColMin[][] (obviously, ColMin[][] is now a n × (m - l + 1)-sized matrix).

Now, consider the new matrix as a list of columns. For each column vector of ColMin, use the algorithm to compute the minimum value over all k-length subarrays, and store them in Ans[][] (of size (n - k + 1) × (m - l + 1)).

The Ans[][] is the solution to our problem.

The following picture shows the intutition behind how it works for computing Ans[1][1] for n = 5, m = 7, k = 3, l = 4

The pseudocode is as follows:

def solve_2d(M, k, l):
  column_minima = {} # empty list
  for each row in M.rows:
    # We suppose we have the algorithm that solves
    # the 1D problem
    min_row = solve_1d(row, l)
    column_minima.append_row(min_row)
  
  ans = {}
  for each col in column_minima.cols:
    min_col = solve_1d(col, k)
    ans.append_col(min_col)
  
  return ans

Note that the pseudocode is (deliberately) hiding some extra complexity of extracting rows / columns and adapting the 1D algorithm to the 2D problem, in order to make the understanding of the solution clearer.

The total complexity of the algorithm can be easily deduced to be O(n * m)

Multi-dimensional case analysis

The solution can be extended to an arbitrary order of dimensions. For a d-dimensional matrix of size s1, s2, ..., sd, the time-complexity of the problem is O(d * s1 * ... * sd), and the memory complexity is O(s1 * ... * sd). This is much better than other algorithms that do the same thing on non-fixed size submatrices (e.g. multi-dimensional RMQ has O(s1 * ... * sd * log(s1) * ... * log(sd)) time and memory complexity).

Finding the best k minima

The deque approach itself is limited in the sense that it allows you to find only the minimum value over the ranges. But what happens if you want to calculate more that one minimum? We will discuss an approach that I used during a national ACM-style contest where we were able to calculate the best 2 minima, and then argue that you can extend to an arbitrary number of minimum values.

In order to store the lowest 2 values, we will do the following:

Keep 2 deques, namely D1 and D2. Do a similar algorithm of "stack-like popping" on D1 when you add a new element, but instead of discarding elements from D1 when popping, transfer them down to D2 and "stack-like pop" it.

It is easy to see why the lowest 2 elements will always be in one of the two deques. Moreover, there are only 2 cases for the lowest two elements: they are either the first two elements of D1, or the first elements of D1 and D2 subsequently. Checking the case should be an easy thing to do.

The extension to an arbitrary number of minima is, however, not so great, in the sense that the complexity of this approach becomes O(n * k2) for a n-sized array, currently bottlenecked by the number of elements you have to consider in order to find the first k minima. [Maybe you can come up with a cleverer way of doing that?]

Useful links

This is the problem I referred to above: http://www.infoarena.ro/problema/smax. I recommend trying to think it through and implementing it, and translating the statement via Google Translate or equivalent.

Read more »

 
 
 
 
  • Vote: I like it
  • +349
  • Vote: I do not like it

By retrograd, history, 4 years ago, In English,

Hello! I am a fairly advanced programmer (although I'm pretty new), and I know the basic algorithms and data structures used for most problems. However, I don't know how to get better from this state onwards.

I should say that my strength is mainly DS problems. Greedy, D&C, DP I do pretty well (once I recognize a specific-type problem), constructive algorithms seem very hard to me (for example link). For DS, I know pretty much all there is to know about segment trees, BIT, sqrt-decomposition (I call it a DS, don't blame me :D), BSTs, hash, etc (the basic ones), although problems that involve advanced tricks with these (e.g. persistent segment trees, lazy propagation) seem very appealing.

I want to prepare mysef for this year's ACM-ICPC contest, as it is my first year in Uni. I have been learning algo intensively since December.

I would be grateful if you know any good lists of problems that are really crucial and/or teach useful new techniques, as I am stuck momentarily. I have read the results on Google for ACM algorithms and such, so a personal response would be a lot more appreciated :).

I will also try to keep a list updated with interesting problems and techniques, in case other people struggle with the same issue. Maybe this tread will become a learning portal for more people :D.

LIST:

  • (http://e-maxx.ru/algo/) -> there are a lot of algorithmic problems greatly explained, as well as tricks and great algorithms -- the website is in Russian, although it can be translated using Google Translate -- great resource

Thank you a lot!

Read more »

 
 
 
 
  • Vote: I like it
  • +14
  • Vote: I do not like it