-is-this-fft-'s blog

By -is-this-fft-, history, 4 days ago, In English

Hi everyone!

After a successful camp in February, we are excited to announce that Osijek camp will be returning this September. Precise dates and registration details will be announced later, but for now, we are looking for contests and problemsetters.

To apply, contact us at ocpc (at) mathos (dot) hr or contact me (-is-this-fft-) or adamant directly. Specific compensation might depend on how successful we are with sponsors and participation fees. As a reference point, last time, we offered 2000€ per contest. Note that the compensation is subject to Croatian income tax (around 25%).

You don't need to have enough ideas for a full problemset yet, but you should have a somewhat good idea of what your contest is going to be, i.e. you should have at least 6-7 problem ideas already. The most important criterion for us is whether your contest fits the camp in terms of style and difficulty. The camp is aimed at the highest level of competition, thus the difficulty should be comparable or harder than typical ICPC regional contests. However, it is good if the difficulty distribution between problems is spread out: do not hesitate to add a few very easy or super-hard problems.

Any specific deadlines and things you would want me to do?

We would expect the problem set to be finalized in terms of problem ideas a month before that, and be mostly ready (prepared on Polygon) around a week before the camp. We will also expect that you present a tutorial after the contest, or make a PDF doc or presentation with some written details, so that we can explain it to participants. We will likely accept contest proposals on first come, first served basis, so it's better to confirm your commitment as soon as you're sure that you want to set a contest.

Can I propose a contest if I am also a camp participant?

Yes, absolutely! In fact, we encourage you to. If you are a camp participant and prepare a contest, your team will not participate on that day, but will otherwise take part in the camp normally. We will also charge no admission fee from your team if you author a contest.

What if I have some interesting ideas, but no time to prepare them?

We recommend you to find a co-author for your contest, who may help you with preparation. Feel free to reach out to us, we can discuss some options. Please bear in mind that since a significant part of the work involved in authoring a contest is preparing the tests, the compensation might be reduced for such proposals.

Can I re-use problems or contests that appeared somewhere else before?

Generally, we prefer problems that are original, or have very little public exposure. In principle, we can still consider your contest if it has some problems from low profile local or private competitions that already happened, but we would need to minimize the possibility that our participants were exposed to them. We might also reduce the compensation for the contest, if you already was compensated for it before.

Can I re-use the problems after I used them in the camp?

We recognize that it is natural for an author to desire for greater exposure for their problems, and we believe that it is a nice thing to give back to the community by making the contests more accessible. As such, we do not want to restrict you from re-using or publishing your problems after the camp has concluded.

However, based on the feedback from the previous camp, we will have a "silence period" after the camp, during which camp materials won't be released to the public. Thus, we ask you to refrain from publishing the problems at least until the end of November 2023.

Full text and comments »

  • Vote: I like it
  • +88
  • Vote: I do not like it

By -is-this-fft-, history, 5 months ago, In English

If I was a YouTuber, this would be the place where I fill the entire screen with screenshots of people asking me to make this. Anyway, not so long ago I gave a lecture on FFT and now peltorator is giving away free money, so let's bring this meme to completion.

In this blog, I am going to cover the basic theory and (competitive programming related) applications of FFT. There is a long list of generalizations and weird applications and implementation details that make it faster. Maybe at some point I'll write about them. For now, let's stick to the basics.

The problem. Given two arrays $$$a$$$ and $$$b$$$, both of length $$$n$$$. You want to quickly and efficiently calculate another array $$$c$$$ (of length $$$2n - 1$$$), defined by the following formula.

$$$c_k = \displaystyle \sum_{i + j = k} a_i \cdot b_j.$$$

Solve it in $$$O(n \log n)$$$.

First of all, why should you care? Because this kind of expression comes up in combinatorics all the time. Let's say you want to count something. It's not uncommon to see a problem where you can just write down the formula for it and notice that it is exactly in this form.

FFT is what I like to call a very black-boxable algorithm. That is, in roughly 95% of the FFT problems I have solved, the only thing you need to know about FFT is that FFT is the key component in an algorithm that solves the problem above. You just need to know that it exists, have some experience to recognize that and then rip someone else's super-fast library off of judge.yosupo.jp.

But if you are still curious to learn how it works, keep reading...

Full text and comments »

  • Vote: I like it
  • +436
  • Vote: I do not like it

By -is-this-fft-, history, 8 months ago, In English

When solving problems in competitive programming, it is almost never a good idea to use the following inbuilt C++ functions. You will be hacked or fail a pretest or worse, a systest.

Why? Because they use floating-point numbers. They are designed to be used with a floating-point input and a floating-point output. The issue is that on a floating-point number, the result may not be exact. Worse, floating-point numbers may not be able to accurately encode the integer.

To calculate $$$\lfloor \frac{a}{b} \rfloor$$$ for positive integers $$$a$$$ and $$$b$$$:
  • Don't use floor((double) a / (double) b) or similar.
  • Do use a / b. It will round down.
  • Warning: be careful with negative numbers. The answer depends on whether we should round down or towards 0.
To calculate $$$\lceil \frac{a}{b} \rceil$$$ for positive integers $$$a$$$ and $$$b$$$:
  • Don't use ceil((double) a / (double) b) or similar.
  • Do use (a + b - 1) / b.
  • Warning: the same caveat goes for negative numbers as before.
To calculate $$$\lfloor \sqrt{a} \rfloor$$$ for a nonnegative integer $$$a$$$:
  • Don't use sqrt(a) or similar.
  • Do use binary search to calculate the square root.
Implementation
To calculate $$$a^b$$$ for nonnegative integers $$$a$$$ and $$$b$$$:
  • Don't use pow(a, b) or similar.
  • Do use the naive algorithm. If you really want to do this and $$$a^b$$$ fits in a long long, then it will take no more than 64 steps, unless $$$a = 0$$$ or $$$a = 1$$$, in which case you can just return $$$a$$$.
To calculate $$$\lfloor \log_2 (a) \rfloor$$$ for a positive integer $$$a$$$:
  • Don't use log2(a) or similar.
  • Do use __lg or an approach based on __builtin_clz or __builtin_clzll. The "clz" stands for "count leading zeroes" and it does exactly what it says — on the binary representation of the number. If you have access to C++20, there is also the bit_width library function.
  • Warning: __builtin_clz(0) and __builtin_clzll(0) are undefined behavior. Also all these functions starting with __ are specific to GCC and you're technically not meant to use them, but it's fine in competitive programming.

Exceptions

The only major exception is if floating-point numbers are an inherent part of your solution. Most commonly, this happens in problems where the output is itself a floating-point number (especially geometry problems, sometimes probability). There are also some algorithms where you need to work with floating-point numbers even if the input and output are integer (FFT, possibly some LP).

But even in these cases, it's always a good idea to do as much as possible with integers and move to floats as late as possible.

I got WA with one compiler, accepted with another

Most likely it has to do with 64-bit versus 32-bit. Also the widths of the various floating-point types vary. It's possible that in some circumstances your solution is actually correct. However, rely on this only if you know what you're doing.

Full text and comments »

  • Vote: I like it
  • +500
  • Vote: I do not like it

By -is-this-fft-, 8 months ago, In English

Perhaps this will be helpful to some. The reason I'm writing this is that these words are used often, and a lot of people understand them the wrong way. This means that if you read some advice containing these words and misunderstand them, you will walk away with a wrong idea.

  • Constructive. Refers to problems where a significant part of the solution is about coming up with a pattern that satisfies some properties, e.g. a grid coloring or a graph. A blatant example is this. "Constructive" does not mean the same as "ad hoc", nor does it mean "a problem where no well-known algorithm is needed", nor "a problem where you have to make observations".
  • Doubt. You have a doubt when you are not sure about something. That is, you have a pretty good idea, but something seems strange or a little off. If you have no clue, then it is not a doubt.
  • Observation. An observation is a friendlier synonym for words like "theorem", "lemma" and "proposition". It is a (usually relatively short) chain of reasoning leading to a conclusion. "Observing" does not in general mean looking at input/output pairs and noticing patterns.
  • Plagiarism, plagiarize. Refers to the act of copying itself, not to its detection. For example, "I solved this problem, my friend plagiarized it from me" is correct, "my friend copied my solution and got plagiarized" is not.
  • Plague. A disease. Doesn't have much to do with plagiarism.
  • Solution, solve. These kind of have two meanings. Almost every problem in CP has two parts: first, you think, and when you have thought something up, you code. "Solving" primarily refers to the former and only secondarily to the latter. When people tell you to solve more problems, they mostly mean that you have to think more, not implement more. Reading an editorial and then implement the idea they give there is not "solving".
  • Subquadratic complexity. Refers to any complexity strictly less than $$$n^2$$$ (formally, any complexity in $$$o(n^2)$$$). This includes complexities in $$$\Theta(1)$$$, $$$\Theta(\log n)$$$, $$$\Theta(n)$$$ etc., not only things like $$$\Theta(n^{1.999})$$$.
  • Trivial. Something that does not require any special insight beyond what is already given. It does not necessarily mean "easy" or "simple". For example, it would be valid to say that something is "a trivial FFT problem" if the application of FFT is fairly straightforward. This statement doesn't imply that the problem would be easy for everyone, including beginners who have never even heard of FFT. In a similar fashion, a long calculation is often called trivial if it is all pretty much mechanical.

Full text and comments »

  • Vote: I like it
  • +330
  • Vote: I do not like it

By -is-this-fft-, history, 9 months ago, In English

For no reason in particular I'm curious about this, for both spring and fall (or winter and summer) semesters. There seems to be a lot of regional variation.

Here is a Google Form I made.

Please note that I'm interested in specific answers, i.e. not "in the spring" but rather "from April 31 to May 15".

Full text and comments »

  • Vote: I like it
  • +55
  • Vote: I do not like it

By -is-this-fft-, history, 9 months ago, In English

Introduction

When it comes to "problem-solving techniques", there are roughly 3 levels:

  1. Concrete algorithms (Kruskal's algorithm, Li-Chao tree, fast Fourier transform)
  2. General patterns (dynamic programming, greedy, square-root decomposition)
  3. Meta-strategies ("how do I even go about solving this problem?")

There is some grey area, but in general this classification works well.

Many tutorials have been written about ideas that fall into 1 or 2. But very little has been written about the third category. On the catalog, I think the only blogs that really qualify are this and this. There is also this valuable comment. As to why there is so little written, I think I can identify two reasons.

  • Most strong contestants don't really consciously think about these. After solving a problem with FFT, it's hard not to know you used FFT. If you used DP, even if you did it without thinking that it's DP, it's not hard to recognize that you used DP later anyway. But about things in the third category... it's hard to recall how exactly your thought process went. Furthermore, at least I have never really learned (i.e. read about and understood) anything from that category. Rather my thought process has developed from a lot of practice and experience.
  • When writing about algorithms, it is easy to know you are right. You just prove it. But now we are veering into a more uncomfortable realm, where it is harder to understand what is correct or what is even useful. It is not as bad as writing about psychology, but still. Even with this blog, as I'm writing this, I think "does it even make sense?".
  • It is not clear if writing blogs of this type actually helps anyone. I still maintain that in your ability to solve problems, the most important thing that you have control over is experience and seeking substitutes for that (algorithms to come up with algorithms, reading other people's thought processes) etc don't work. However there are many people who in theory should have a lot of experience but whose results don't reflect that.

It's far too much for me to write and describe exactly how my thought process works, and I'm not at all convinced that this would be useful for anyone. But I think I've identified something that beginners regularly get wrong, so I'll try my best to explain that.

Full text and comments »

  • Vote: I like it
  • +548
  • Vote: I do not like it

By -is-this-fft-, history, 10 months ago, In English

Part 1: [Tutorial] My way of understanding Dinitz's ("Dinic's") algorithm
Part 2: [Tutorial] Minimum cost (maximum) flow
Part 3: [Tutorial] More about minimum cost flows: potentials and Dinitz

Introduction

As mentioned in the introduction of part 2, there is something in out ICPC notebook called "min cost dinic". This cycle (no pun intended) of blogs was initially meant to be just one blog explaining it all, but in time it developed into something much longer. The general idea here doesn't appear to be particularly widely discussed, but it is certainly not novel either; TopCoder, for example, calls it the primal-dual algorithm (why is it called that? That's a story for another time, but it has to do with LP duality.)

Picking up from where we left off, there are still two things to talk about:

  • Where does Dinitz come in?
  • Something called "potentials".

Even more generally, recall the "universal max-flow algorithm template" from part 1.

Algorithm 1. Given a flow network $$$(G, c, s, t)$$$.

  • Initialize $$$f$$$ by setting $$$f(e) \gets 0$$$ for all edges. This is a valid flow.
  • While $$$t$$$ is reachable from $$$s$$$ in $$$G_f$$$:
    • Find some flow $$$\Delta f$$$ in $$$G_f$$$.
    • Let $$$f \gets f + \Delta f$$$.
  • Return $$$f$$$.

Our aim is to prove the correctness of the following generalization.

Algorithm 2. Given a flow network $$$(G, c, a, s, t)$$$ without negative cycles.

  • Initialize $$$f$$$ by setting $$$f(e) \gets 0$$$ for all edges.
  • While $$$t$$$ is reachable from $$$s$$$ in $$$G_f$$$:
    • Let $$$S$$$ be the shortest-paths subgraph of $$$G_f$$$.
    • Find some flow $$$\Delta f$$$ in $$$S$$$.
    • Let $$$f \gets f + \Delta f$$$.
  • Return $$$f$$$.

A special case of this is if we use "maximum" instead of "some".

Again thanks to adamant and brunovsky for reviewing, as the two blogs used to be just one.

Full text and comments »

  • Vote: I like it
  • +126
  • Vote: I do not like it

By -is-this-fft-, history, 10 months ago, In English

I tried to remove the "constructive algorithms" tag from 1704F - Игра-раскраска because the way I see it, there is nothing "constructive" about that problem. Within a minute, it reappears.

The first impression one might have is that I got into an "edit war" with someone who adds it back. However, I tried it 3 different times during the day (each time multiple times) and it has reappeared quickly every time. I don't believe that out of the few hundred people who have tag edit access on that problem, out of whom only a fraction care about the tags, out of whom only a fraction think the problem is "constructive", someone would instantly notice that the tag was gone and re-add it every time.

There is the possibility that it is done by a bot, but it seems unlikely that someone would write a bot like that seeing as it serves no purpose and is not that trivial to do.

The most likely interpretation still seems to be that removing the tag is actually broken. My theory is that the server caches that the tag is supposed to be removed but never actually removes it from the database, so when the cache is invalidated or reloaded, the tag reappears.

Some more experiments:

  • I can add and remove new tags.
  • I asked someone else to add a bogus tag to a problem. I was able to remove that.
  • I asked other people to remove the tag. It reappeared (which should prove that I'm not "shadow-banned")
  • I added a bogus tag to F and let it sit there for an hour or so (the fact that it remained should prove that no human is re-adding the tags: anyone like that would also remove "schedules"). I was able to remove that.

In my opinion, no problem in that contest is "constructive". D might be under a very generous interpretation.

Full text and comments »

  • Vote: I like it
  • +120
  • Vote: I do not like it

By -is-this-fft-, history, 10 months ago, In English

Part 1: [Tutorial] My way of understanding Dinitz's ("Dinic's") algorithm
Part 2: [Tutorial] Minimum cost (maximum) flow
Part 3: [Tutorial] More about minimum cost flows: potentials and Dinitz

Introduction

There is a section in our ICPC notebook from 2019 called "min cost dinic". This blog started as an attempt to dissect what was written in there and understand why it works. In time, I needed to refer to many general ideas about flow, so it developed into a more general treatment of (min-cost) flow problems and spawned an entire separate blog about maximum flow and Dinitz. Even after splitting the blog, this blog was still too long, so I split it yet again. This blog will deal with the basic ideas of minimum cost flow; there will be a part 3, where I will generalize to a Dinitz-like algorithm and also talk a bit about something called potentials.

This blog is somewhat more technical and formal than I would ideally write, but there is a reason: there are things that one might overlook if they were not careful, and it is easy to gloss over something important. I remember well some misconceptions I used to have about flow back in the day. These algorithms are "fragile" — I don't mean that in a negative way, but they exist because a number of interesting facts come together, and being too informal may mean that we don't fully appreciate just how incredible it is.

Definition 1. Given a flow network $$$(G, c, s, t)$$$ and a cost function $$$a \colon\, E \to \mathbb{R}$$$ (in the following I will abbreviate this to "given a flow network $$$(G, c, a, s, t)$$$") the cost of a flow $$$f$$$ in $$$G$$$ is

$$$ \mathrm{cost}(f) = \sum_{e \in E} f(e) a(e). $$$

The minimum cost maximum flow problem is to find a flow $$$f$$$ in $$$G$$$ such that the value of the flow is maximized, and among all possible solutions with maximum value, to find the one with minimum cost.

The aim of this blog is to prove the correctness of the "classical" minimum cost maximum flow algorithm (I have no idea if it has a name...):

Algorithm 2. Given a flow network $$$(G, c, a, s, t)$$$ without negative cycles.

  • Initialize $$$f$$$ by setting $$$f(e) \gets 0$$$ for all edges.
  • While $$$t$$$ is reachable from $$$s$$$ in $$$G_f$$$:
    • Let $$$P$$$ be a shortest path from $$$s$$$ to $$$t$$$.
    • Let $$$\gamma$$$ be the minimum capacity among the edges of $$$P$$$ in $$$G_f$$$.
    • Define $$$\Delta f$$$ by putting $$$\gamma > 0$$$ flow on every edge in $$$P$$$.
    • Let $$$f \gets f + \Delta f$$$.
  • Return $$$f$$$.

Here and in the following, "shortest path" always means shortest with respect to edge costs (and not number of edges like in Edmonds-Karp).

Thanks to adamant and brunovsky for reviewing this blog.

Full text and comments »

  • Vote: I like it
  • +220
  • Vote: I do not like it

By -is-this-fft-, 11 months ago, In English

Part 1: [Tutorial] My way of understanding Dinitz's ("Dinic's") algorithm
Part 2: [Tutorial] Minimum cost (maximum) flow
Part 3: [Tutorial] More about minimum cost flows: potentials and Dinitz

Introduction

My philosophy on writing blogs is that I only write blogs when I feel that I have something new and interesting to contribute. This is usually not the algorithm itself, of course, but it can be that I recently found or heard of a new perspective or I figured out a way to explain something in a new way. This is in contrast to my dear colleagues at GeeksForGeeks who write the most low-effort (and often wrong) tutorial on everything and then sit at the top of Google results using their somehow world-class SEO.

This is one such case. I have an idea how to explain Dinitz's algorithm in a way I haven't heard of before (but it is not very hard to come up with, so this is not a claim to originality). In addition, this blog is a multi-parter, and in my next blog it is very convenient to start from the notions I establish here today.

The approach I am going to take is explaining the idea from a perspective of "the space of all possible changes". At any given step of the algorithm, we have a flow. We can choose a way to change the flow from this "space" and apply the change. If we detect there is no possible increasing change to the flow, the algorithm is terminated. The space of changes is such that any valid flow can be reached from any valid flow by applying just one change. This idea is almost geometrical and can in fact be thought of as wandering around within a convex shape in high-dimensional space.

I believe that this viewpoint gives a broader and more natural understanding of the algorithm and the "big picture" compared to more narrow explanations that look like "well we have flow and sometimes we add reverse edges and mumble mumble blocking flow". In addition this grounding allows us to see Dinitz as a direct improvement on Edmonds-Karp instead of them being merely vaguely reminiscent of one another.

Full text and comments »

  • Vote: I like it
  • +280
  • Vote: I do not like it

By -is-this-fft-, history, 12 months ago, In English

Why?

Weak pretests discourage guessing.

There are many contestants who only have a faint idea of why their solution works at all. The problem style of recent years has contributed to this. If the solution boils down to "check this elegant condition, answer YES or NO based on that", then it is very much possible to solve problems by guessing. Especially with full feedback.

Remember 1375C - Element Extermination? The solution is to check if $$$a_1 < a_n$$$. Tons of people under the contest announcement were excited to explain that this is how to solve the problem. Only a few could explain why it worked.

This is also a situation where it is easy for a high-rated author or coordinator to deceive themselves. If you already know such a solution, then it may seem very easy. The implementation is easy and if there is a clean observation you have to check, making that observation seems easy in hindsight. And the large number of solves in the contest "confirms" that it is easy. But in fact, the problem is only easy to guess, not easy to solve.

In the good old days it was not seen as a big violation if pretests were weak. It was normal. Right now weak pretests cause an uproar because people are not expecting it. If weak pretests were the norm, it would not be such an issue.

Make weak pretests the norm again. It will improve the quality of standings. Thanks.

Full text and comments »

  • Vote: I like it
  • -63
  • Vote: I do not like it

By -is-this-fft-, history, 13 months ago, In English

Introduction

Instead of algorithms and math, I'll be doing something completely different here. There is something that has bothered me for a long time: people not understanding the tools they use to participate in contests and thus failing to take real advantage of them. And it all goes back to not understanding the command line.

Some of you might say, "well surely, the command line is something obsolete and you're only clinging to it because you [insert ridiculous amateur-psychology here]?" No! The command line is not obsolete, and unless competitive programming becomes so mainstream that software vendors start writing professional-grade tools aimed specifically at us, it won't be.

Full text and comments »

  • Vote: I like it
  • +443
  • Vote: I do not like it

By -is-this-fft-, history, 14 months ago, In English

Background

When Catalog was first introduced, I asked for a way to sort by blog creation time to be able to check for new blogs. Sadly, this has not been implemented, at least the way I imagined it (there is a time filter though). So, I decided to implement it myself. The result is an userscript.

Features

I think the concept is pretty self-explanatory. Some notes:

  • You can click on the table headers to sort by a field. By default, we sort descending based on creation time.
  • Whether or not you are using the table view is saved in a cookie. If you have turned on table view, it will still be on the next time you visit the page, so you can use it as a "default" if you want to.
  • The script is disabled when in management mode because you can't really do management in this view.
  • Open source, so you can customize it if you want.

Installation

  • Install an userscript manager to your browser. I used TamperMonkey, I can't guarantee anything for other userscript managers.
  • Find the "create new script" option and paste in the code. The code is available here.

PS: the code is very quick-and-dirty, don't judge!

Full text and comments »

  • Vote: I like it
  • +98
  • Vote: I do not like it

By -is-this-fft-, history, 15 months ago, In English

Introduction

There are a number of "small" algorithms and facts that come up again and again in problems. I feel like there I have had to explain them many times, partly because there are no blogs about them. On the other hand, writing a blog about them is also weird because there is not that much to be said. To settle these things "once and for all", I decided to write my own list about about common "small tricks" that don't really warrant a full tutorial because they can be adequately explained in a paragraph or two and there often isn't really anything to add except for padding.

This blog is partly inspired by adamant's blog from a few years ago. At first, I wanted to mimic adamant's blog structure exactly, but I found myself wanting to write longer paragraphs and using just bolded sentences as section headers got messy. Still, each section is short enough that it would not make much sense to write separate blogs about each of these things.

All of this is well-known to some people, I don't claim ownership or novelty of any idea here.

Full text and comments »

  • Vote: I like it
  • +804
  • Vote: I do not like it

By -is-this-fft-, history, 15 months ago, In English

At the time of writing, it has -513.

Now usually, it would mean the round has some issues. But in this case, I don't think this is the reason. An announcement that has this many downvotes is usually accompanied by a lot of complaints in the comment section. Today we have none of that. Sure there are a few people who didn't like this or that, but there is no avalanche of negative comments that would usually accompany a -500 announcement. (So if you plan to reply to this blog with some complaint about the round — is it realistic that enough people downvoted it to get to -500 and nearly no one left a comment?)

Is someone botting?

Full text and comments »

  • Vote: I like it
  • +272
  • Vote: I do not like it

By -is-this-fft-, history, 17 months ago, In English

For some reason, everyone refers to it with a different name which creates a lot of confusion. This debate comes up any time someone mentions any of the names, so let's have a poll to settle this thing once and for all.


Consider the following problem.

Problem. There is an array $$$A$$$ of length $$$10^{18}$$$, initially filled with zeros. You are given $$$q$$$ queries of the following form:

  • Given $$$l$$$ and $$$r$$$, find $$$A[l] + A[l + 1] + \cdots + A[r]$$$.
  • Given $$$i$$$ and $$$x$$$, set $$$A[i] \gets x$$$.

What do you call the data structure that solves the problem above in time $$$O(q \log C)$$$?

Details if you're not sure what I mean
  • Compressed segment tree
  • Dynamic segment tree
  • Implicit segment tree
  • Sparse segment tree
  • Lazy creation segment tree
  • I don't know this data structure
  • I'm green or below

Full text and comments »

  • Vote: I like it
  • +111
  • Vote: I do not like it

By -is-this-fft-, history, 17 months ago, In English

Introduction

New Year's traditional masquerade of colors and ranks is going to end pretty soon, and before it goes away, I decided to download a snapshot of everyone's true ratings and current handle colors to see how everyone's using the feature. I regret that I only have one snapshot; I only managed to push myself to finish this today. It would have been really cool to see how people's usage of the feature has evolved — perhaps an initial surge of masquerading as a nutella, then slowly coming to one's senses and opting for something milder?

Data was collected on January 9th, at about 01:00 UTC. I collected the data by scraping the pages of the "rating" tab (sorry for making a lot of requests!) — the table contains people's actual ratings, and by looking at the title-text of people's handles, I could also get their possibly-magicked ranks. I only included the active people because it was already taking a lot of time, and I believe we got most of the interesting parts anyway. Sadly this excludes magic-using unrated and inactive users, but I think they won't affect the statistics much.

The data

First, let's look at the basic data: how many people of a given rank are masquerading as another given rank?

Down: true rank, right: magic rank N P S E CM M IM GM IGM LGM
Newbie705542092534073942451251361361565
Pupil1621208793146195101624345430
Specialist21327746712916180655738329
Expert26071606202161111624740308
Candidate Master9948443518864317271690
Master763466402313117311261
International Master189129812498313
Grandmaster182218141234326620
International Grandmaster10516444201827
Legendary Grandmaster42234320041

Now that we have the data, it's time to analyze it!

Full text and comments »

  • Vote: I like it
  • +275
  • Vote: I do not like it

By -is-this-fft-, history, 17 months ago, In English

I generally don't like to give much advice on "how to practice", but recently I have been thinking a lot about this and I realized that there is something that I believe affects a lot of people (though it is admittedly hard to tell) that barely gets talked about: self-deception.

Self-deception is a devious thing. It can be very hard to detect, yet can be fatal to improvement.

This blog is mainly directed to greys, greens and cyans that can't get out of these categories. Most examples are given with that in mind. That being said, I believe others have something to think about as well.

This is my first blog on such "psychological issues" and I don't know if I'll ever make another one. It's certainly a difficult topic to write about because for most of the blog, the message is "your practice is not as good as you think it is" or even "you aren't as good as you think you are" which can of course be a hard thing to hear. Thus, I have to somehow be very gentle while getting the point across, and I'm not sure how well I can even do either of these things individually. I also want to stress that nothing in this blog applies to everyone and nothing in this blog applies to everything you do.

Read this blog carefully. More than once I have explained these ideas and people just walk away with what they wanted to hear about "how to practice". I don't want to see this again.

I want to thank poomas, nor, SlavicG and OmniMan for reviewing this blog.

My sad story

To start, I'll give a personal story as to not seem like I'm talking from a high horse. Self-deception can affect anyone.

Full text and comments »

  • Vote: I like it
  • +1292
  • Vote: I do not like it

By -is-this-fft-, 17 months ago, In English

Introduction

Most of us are familiar with the union-find data structure and the two optimizations one can do on it: union by rank and path compression. Most tutorials will introduce these methods and state that if both optimizations are used, the complexity will be $$$\mathcal{O}((n + m) \alpha(n))$$$. Here, $$$n$$$ and $$$m$$$ are the number of vertices and queries respectively. What is $$$\alpha$$$? Generally, we are told that $$$\alpha$$$ is some mysterious function that grows very slowly: $$$\alpha(N) < 5$$$ where $$$N$$$ is the number of subsets of particles in the observable universe and whatnot. What it is is usually not clarified very much. And there is typically no proof (or intuition and other such justifications) as to why the complexity is like that: usually we're told that the complexity analysis is quite long and complicated. At the time of writing, both cp-algorithms and Wikipedia are like that.

After telling students more or less the same thing for the $$$N$$$-th time, I finally decided to actually delve into the proof and see what is so difficult about it. My main motivation was to find out:

  • How does one even analyze the time complexity of path compression — it seems quite chaotic?
  • How does the Ackermann function of all things show up here? Why is it related to finding connected components or shortening paths on a tree?

In this blog, I will show the proof first published in Efficiency of a Good But Not Linear Set Union Algorithm by R. E. Tarjan, published in Journal of the ACM, Volume 22, Issue 2 in 1975. I will attempt to give a reader-friendly exposition of Tarjan's proof. Besides proving an upper bound on the complexity of the union-find data structure, in the article a lower bound is also calculated. I will not delve into the latter proof in this blog. In addition to the article, I decided to add examples of how to calculate worse upper bounds, because the techniques involved are similar and I believe it gives a better understanding of the ideas if we don't immediately jump to the hardest case.

Having read the proof, I believe it is not at all very long and complicated. Although it must've taken serious brainpower to come up with the techniques involved, the proof itself is a perfectly sensible chain of reasoning. The proof appears to be roughly 4 pages long at first glance, but 1 page of it is spent on the case without union-by-rank and 1 page is spent on listing various properties of the Ackermann function. The actual proof is therefore about 2 pages long, which I would consider short (the pages seem to be A5 with a rather large font). Furthermore, it is elementary: it's not one of those things that you can only understand if you have taken 3 semesters of algebraic topology. I can of course understand why this proof is not usually given in DSU lectures: knowing the proof is not particularly useful for solving problems and other very slow-growing upper bounds can be derived easily. But at least I think the difficulty of the argument should not be an excuse.

Prerequisites: comfortable with union-find and math.

Full text and comments »

  • Vote: I like it
  • +448
  • Vote: I do not like it

By -is-this-fft-, history, 23 months ago, In English

Today, I want to talk a bit about the relationship between two well-known methods for calculating the $$$n$$$-th term of a linear recurrence: generating functions and matrix exponentiation. More specifically, I want to explore how these two things are related to one another or how they can be thought of as variants of one another.

This blog shouldn't be treated as a tutorial. If you know these algorithms, I doubt that I can teach you to solve some new problem in this blog. Rather, I'm kind of thinking out loud. I think it is a good idea to try to reflect and find connections between different things we have learned. This kind of thinking tends to strengthen the understanding of both methods and makes it easier to improve upon them to solve problems.

And I'm sure that what I'm about to present is already quite well-known to some people ;). But when I realized this it was cool to me, so I want to share it.

Problem. You have two sequences $$$b_1, b_2, \ldots, b_m$$$ (where $$$b_m \ne 0$$$) and $$$c_0, c_1, \ldots, c_{m - 1}$$$. The infinite sequence $$$a_0, a_1, a_2, \ldots$$$ is defined as

$$$ a_i = \begin{cases} c_i & \text{if } i < m \\ \sum_{k = 1}^m a_{i - k} b_k & \text{otherwise.} \end{cases} $$$

Calculate $$$a_n$$$.

As a simple example, you can consider the following simple variant: the Fibonacci sequence is defined by $$$F_0 = 0$$$, $$$F_1 = 1$$$ and $$$F_i = F_{i - 1} + F_{i - 2}$$$ for any $$$i \ge 2$$$. Calculate $$$F_n$$$.

Full text and comments »

  • Vote: I like it
  • +313
  • Vote: I do not like it

By -is-this-fft-, history, 2 years ago, In English

Preface

When the problem came out, someone asked for an explanation of how people come up with such solutions. I wanted to write this blog even before that, but I've just been busy. Anyway now, two months later, I finally had the time to put this blog together.

But the reason I really wanted to write this blog is that the solution to 1503C - Travelling Salesman Problem has a very much non-obvious greedy solution. From reading beginners' comments, it seems that a lot of people have this implicit belief that greedy algorithms are invented by guessing: for example, guessing an order in which to process things, guessing how to process things and then maybe proving correctness if you have time (but really skipping it, because who cares, right?). I think some contest sites even perpetuate this idea by presenting a greedy algorithm in the editorial without any kind of proof or explanation where this comes from. Even here, you may have tutorials that say "here's the algorithm" and after that "here's the proof". While it is convenient (and completely fine!) to present the solution in this form (algorithm and then proof), it's not necessarily how the solution was invented.

While you can solve problems this way (guessing and then proving (or not proving)) and I have solved problems this way, it is quite unusual for me. If someone were to think (perhaps implicitly) that greedy algorithms are invented like that, they'd be shocked when they read the solution to 1503C - Travelling Salesman Problem: how could anyone guess that, how could anyone come up with ordering the cities like this and then doing things like that?

And the answer is that you really can't. It would be very hard to guess this solution unless you already have some insights to the problem. That is roughly how I feel I solve problems: try to really understand the problem, gather insights about it. And at some point, you have found enough observations that you can piece together a solution. The solution isn't proved afterwards — the proof develops alongside the solution. It isn't always like this, but very often it is. (Of course, there is some guesswork involved, but it is more "I guess I should think about this thing", not "I guess the entire algorithm").

In the following, I'll try to show you what I mean.

The problem

Given $$$n$$$ cities and two arrays $$$a$$$ and $$$c$$$ of length $$$n$$$, you want to visit every city once, starting from city $$$1$$$ and returning to city $$$1$$$. The cost of going from $$$i$$$ to $$$j$$$ is

$$$\max(c_i, a_j - a_i),$$$

and you must minimize the total cost. We have $$$2 \le n \le 10^5$$$ and $$$0 \le a_i, c_i \le 10^9$$$.

Full text and comments »

  • Vote: I like it
  • +368
  • Vote: I do not like it

By -is-this-fft-, history, 2 years ago, In English

A week ago, I made this blog post. Earlier, I already wrote a small update to the blog, but I wanted to make a separate post to get more contribution properly announce what I did as well as to share my code.

Thanks to vjudge's suggestion, there is now a way to participate in CodeChef Cook-Off contests virtually on vjudge.

How to participate

  • Go to the list.
  • Open a contest by clicking on its name.
  • Click the "clone" button to create your own private copy of the contest.
  • Set the "begin time" to the time you'd like to start the contest, somewhere in the future.

Keep reading if you want to know how it works.

Full text and comments »

  • Vote: I like it
  • +93
  • Vote: I do not like it

By -is-this-fft-, history, 2 years ago, In English

Hi!

I like doing virtual contests. As it stands, I am quickly running out of contests (especially non-ancient ones) on Codeforces, Atcoder and CSAcademy. So, CodeChef would be the next logical place.

However, it seems that there is no option to do that on CodeChef. Since there are always new blogs about tools for competitive programming, I was wondering if someone had already made this or if someone knows of a service like this.

Some things I have looked into:

  • vjudge doesn't really fit the bill, it seems that it is intended more as a "create cross-platform mashups" type of thing. My issues:
    • You need to manually create the contest unless someone else has already done it.
    • Can't look at "the number of people who solved the problem up to this point in time" — this part is important to me.
    • Can't look at "scoreboard at this point in time" — this part is less important.
  • This project seems to be exactly what I'm looking for. However, 149.129.139.145 is down. I'm thinking about hosting this myself (probably not publicly), but I'm not sure I can. They use OAuth to authenticate (and use the bearer token in all subsequent requests to the CodeChef API), and I don't have valid credentials to do that. They do actually have an OAuth client secret pushed to the repo (:O), but when I try using that, the client id is invalid.
  • Writing one myself — it probably wouldn't be very hard to do, but I kind of run into the same problem as above: I can't access the CodeChef API. Web-scraping is also an option, but it isn't exactly pleasant, especially on CodeChef.

To summarize, I have two questions:

  • Does anyone know of such a service?
  • Does anyone have experience with getting access to CodeChef API? Do they just give access to anyone who asks? Do they reply to emails if I write?

Update

Thanks to the recommendation of vjudge, I have found a solution: to fetch ranklists from Codechef and upload them as replays on vjudge. As proof of concept, I have uploaded replays of the last contest:

I tried participating in Div. 1 this morning and was satisfied.

The Div. 2 and Div. 3 ranklists are truncated, featuring only the top 2000 participants. This is a vjudge limitation. Maybe it is better to randomly sample the participants, to get a better feel of "number of solvers"? Let me know what you think.

To participate in them virtually, you need to press the clone button on vjudge: you can then set the start date of the clone to the future and participate. You will be able to see the ranklist as it was at that point, any time during your participation, just like you can on Codeforces gyms and virtuals.

I will try to upload all Cook-Offs to vjudge in the coming days.

Full text and comments »

  • Vote: I like it
  • +127
  • Vote: I do not like it

By -is-this-fft-, 2 years ago, In English

I wanted to suggest some improvements for the blog system. Some of these things have probably been said before. Some of these things should be pretty easy to implement, others are harder and might require a complete rework. I tried to put the easy things first.

Full text and comments »

  • Vote: I like it
  • +432
  • Vote: I do not like it

By -is-this-fft-, history, 3 years ago, In English

There was a blog today with a similar title, and although it was (I think, not really sure as I didn't get a good look) some kind of satire/joke blog, I wanted to post some serious advice about things I (don't) like to see when people are PMing me. However I decided that this wouldn't be in the spirit of the blog and decided to make a separate blog. And now that blog is deleted anyway.

While it's not a bad thing to write messages to more experienced users, most people who write me do things that make them very difficult to deal with. These are my opinions, but somewhy I feel that many reds will agree. Some things here feel silly to write because they are so obvious. But I'm only writing them because people regularly mess them up.

Full text and comments »

  • Vote: I like it
  • +746
  • Vote: I do not like it