-is-this-fft-'s blog

By -is-this-fft-, 2 months ago,

Introduction

Instead of algorithms and math, I'll be doing something completely different here. There is something that has bothered me for a long time: people not understanding the tools they use to participate in contests and thus failing to take real advantage of them. And it all goes back to not understanding the command line.

Some of you might say, "well surely, the command line is something obsolete and you're only clinging to it because you [insert ridiculous amateur-psychology here]?" No! The command line is not obsolete, and unless competitive programming becomes so mainstream that software vendors start writing professional-grade tools aimed specifically at us, it won't be.

• +443

By -is-this-fft-, history, 3 months ago,

Background

When Catalog was first introduced, I asked for a way to sort by blog creation time to be able to check for new blogs. Sadly, this has not been implemented, at least the way I imagined it (there is a time filter though). So, I decided to implement it myself. The result is an userscript.

Features

I think the concept is pretty self-explanatory. Some notes:

• You can click on the table headers to sort by a field. By default, we sort descending based on creation time.
• Whether or not you are using the table view is saved in a cookie. If you have turned on table view, it will still be on the next time you visit the page, so you can use it as a "default" if you want to.
• The script is disabled when in management mode because you can't really do management in this view.
• Open source, so you can customize it if you want.

Installation

• Install an userscript manager to your browser. I used TamperMonkey, I can't guarantee anything for other userscript managers.
• Find the "create new script" option and paste in the code. The code is available here.

PS: the code is very quick-and-dirty, don't judge!

• +98

By -is-this-fft-, history, 4 months ago,

Introduction

There are a number of "small" algorithms and facts that come up again and again in problems. I feel like there I have had to explain them many times, partly because there are no blogs about them. On the other hand, writing a blog about them is also weird because there is not that much to be said. To settle these things "once and for all", I decided to write my own list about about common "small tricks" that don't really warrant a full tutorial because they can be adequately explained in a paragraph or two and there often isn't really anything to add except for padding.

This blog is partly inspired by adamant's blog from a few years ago. At first, I wanted to mimic adamant's blog structure exactly, but I found myself wanting to write longer paragraphs and using just bolded sentences as section headers got messy. Still, each section is short enough that it would not make much sense to write separate blogs about each of these things.

All of this is well-known to some people, I don't claim ownership or novelty of any idea here.

• +804

By -is-this-fft-, history, 4 months ago,

At the time of writing, it has -513.

Now usually, it would mean the round has some issues. But in this case, I don't think this is the reason. An announcement that has this many downvotes is usually accompanied by a lot of complaints in the comment section. Today we have none of that. Sure there are a few people who didn't like this or that, but there is no avalanche of negative comments that would usually accompany a -500 announcement. (So if you plan to reply to this blog with some complaint about the round — is it realistic that enough people downvoted it to get to -500 and nearly no one left a comment?)

Is someone botting?

• +272

By -is-this-fft-, history, 6 months ago,

For some reason, everyone refers to it with a different name which creates a lot of confusion. This debate comes up any time someone mentions any of the names, so let's have a poll to settle this thing once and for all.

Consider the following problem.

Problem. There is an array $A$ of length $10^{18}$, initially filled with zeros. You are given $q$ queries of the following form:

• Given $l$ and $r$, find $A[l] + A[l + 1] + \cdots + A[r]$.
• Given $i$ and $x$, set $A[i] \gets x$.

What do you call the data structure that solves the problem above in time $O(q \log C)$?

Details if you're not sure what I mean
• Compressed segment tree
• Dynamic segment tree
• Implicit segment tree
• Sparse segment tree
• Lazy creation segment tree
• I don't know this data structure
• I'm green or below

• +111

By -is-this-fft-, history, 6 months ago,

Introduction

New Year's traditional masquerade of colors and ranks is going to end pretty soon, and before it goes away, I decided to download a snapshot of everyone's true ratings and current handle colors to see how everyone's using the feature. I regret that I only have one snapshot; I only managed to push myself to finish this today. It would have been really cool to see how people's usage of the feature has evolved — perhaps an initial surge of masquerading as a nutella, then slowly coming to one's senses and opting for something milder?

Data was collected on January 9th, at about 01:00 UTC. I collected the data by scraping the pages of the "rating" tab (sorry for making a lot of requests!) — the table contains people's actual ratings, and by looking at the title-text of people's handles, I could also get their possibly-magicked ranks. I only included the active people because it was already taking a lot of time, and I believe we got most of the interesting parts anyway. Sadly this excludes magic-using unrated and inactive users, but I think they won't affect the statistics much.

The data

First, let's look at the basic data: how many people of a given rank are masquerading as another given rank?

Down: true rank, right: magic rank N P S E CM M IM GM IGM LGM
Newbie705542092534073942451251361361565
Pupil1621208793146195101624345430
Specialist21327746712916180655738329
Expert26071606202161111624740308
Candidate Master9948443518864317271690
Master763466402313117311261
International Master189129812498313
Grandmaster182218141234326620
International Grandmaster10516444201827
Legendary Grandmaster42234320041

Now that we have the data, it's time to analyze it!

• +275

By -is-this-fft-, history, 6 months ago,

I generally don't like to give much advice on "how to practice", but recently I have been thinking a lot about this and I realized that there is something that I believe affects a lot of people (though it is admittedly hard to tell) that barely gets talked about: self-deception.

Self-deception is a devious thing. It can be very hard to detect, yet can be fatal to improvement.

This blog is mainly directed to greys, greens and cyans that can't get out of these categories. Most examples are given with that in mind. That being said, I believe others have something to think about as well.

This is my first blog on such "psychological issues" and I don't know if I'll ever make another one. It's certainly a difficult topic to write about because for most of the blog, the message is "your practice is not as good as you think it is" or even "you aren't as good as you think you are" which can of course be a hard thing to hear. Thus, I have to somehow be very gentle while getting the point across, and I'm not sure how well I can even do either of these things individually. I also want to stress that nothing in this blog applies to everyone and nothing in this blog applies to everything you do.

Read this blog carefully. More than once I have explained these ideas and people just walk away with what they wanted to hear about "how to practice". I don't want to see this again.

I want to thank poomas, nor, SlavicG and OmniMan for reviewing this blog.

To start, I'll give a personal story as to not seem like I'm talking from a high horse. Self-deception can affect anyone.

• +1292

By -is-this-fft-, 6 months ago,

Introduction

Most of us are familiar with the union-find data structure and the two optimizations one can do on it: union by rank and path compression. Most tutorials will introduce these methods and state that if both optimizations are used, the complexity will be $\mathcal{O}((n + m) \alpha(n))$. Here, $n$ and $m$ are the number of vertices and queries respectively. What is $\alpha$? Generally, we are told that $\alpha$ is some mysterious function that grows very slowly: $\alpha(N) < 5$ where $N$ is the number of subsets of particles in the observable universe and whatnot. What it is is usually not clarified very much. And there is typically no proof (or intuition and other such justifications) as to why the complexity is like that: usually we're told that the complexity analysis is quite long and complicated. At the time of writing, both cp-algorithms and Wikipedia are like that.

After telling students more or less the same thing for the $N$-th time, I finally decided to actually delve into the proof and see what is so difficult about it. My main motivation was to find out:

• How does one even analyze the time complexity of path compression — it seems quite chaotic?
• How does the Ackermann function of all things show up here? Why is it related to finding connected components or shortening paths on a tree?

In this blog, I will show the proof first published in Efficiency of a Good But Not Linear Set Union Algorithm by R. E. Tarjan, published in Journal of the ACM, Volume 22, Issue 2 in 1975. I will attempt to give a reader-friendly exposition of Tarjan's proof. Besides proving an upper bound on the complexity of the union-find data structure, in the article a lower bound is also calculated. I will not delve into the latter proof in this blog. In addition to the article, I decided to add examples of how to calculate worse upper bounds, because the techniques involved are similar and I believe it gives a better understanding of the ideas if we don't immediately jump to the hardest case.

Having read the proof, I believe it is not at all very long and complicated. Although it must've taken serious brainpower to come up with the techniques involved, the proof itself is a perfectly sensible chain of reasoning. The proof appears to be roughly 4 pages long at first glance, but 1 page of it is spent on the case without union-by-rank and 1 page is spent on listing various properties of the Ackermann function. The actual proof is therefore about 2 pages long, which I would consider short (the pages seem to be A5 with a rather large font). Furthermore, it is elementary: it's not one of those things that you can only understand if you have taken 3 semesters of algebraic topology. I can of course understand why this proof is not usually given in DSU lectures: knowing the proof is not particularly useful for solving problems and other very slow-growing upper bounds can be derived easily. But at least I think the difficulty of the argument should not be an excuse.

Prerequisites: comfortable with union-find and math.

• +448

By -is-this-fft-, history, 12 months ago,

Today, I want to talk a bit about the relationship between two well-known methods for calculating the $n$-th term of a linear recurrence: generating functions and matrix exponentiation. More specifically, I want to explore how these two things are related to one another or how they can be thought of as variants of one another.

This blog shouldn't be treated as a tutorial. If you know these algorithms, I doubt that I can teach you to solve some new problem in this blog. Rather, I'm kind of thinking out loud. I think it is a good idea to try to reflect and find connections between different things we have learned. This kind of thinking tends to strengthen the understanding of both methods and makes it easier to improve upon them to solve problems.

And I'm sure that what I'm about to present is already quite well-known to some people ;). But when I realized this it was cool to me, so I want to share it.

Problem. You have two sequences $b_1, b_2, \ldots, b_m$ (where $b_m \ne 0$) and $c_0, c_1, \ldots, c_{m - 1}$. The infinite sequence $a_0, a_1, a_2, \ldots$ is defined as

$a_i = \begin{cases} c_i & \text{if } i < m \\ \sum_{k = 1}^m a_{i - k} b_k & \text{otherwise.} \end{cases}$

Calculate $a_n$.

As a simple example, you can consider the following simple variant: the Fibonacci sequence is defined by $F_0 = 0$, $F_1 = 1$ and $F_i = F_{i - 1} + F_{i - 2}$ for any $i \ge 2$. Calculate $F_n$.

• +313

By -is-this-fft-, history, 13 months ago,

Preface

When the problem came out, someone asked for an explanation of how people come up with such solutions. I wanted to write this blog even before that, but I've just been busy. Anyway now, two months later, I finally had the time to put this blog together.

But the reason I really wanted to write this blog is that the solution to 1503C - Travelling Salesman Problem has a very much non-obvious greedy solution. From reading beginners' comments, it seems that a lot of people have this implicit belief that greedy algorithms are invented by guessing: for example, guessing an order in which to process things, guessing how to process things and then maybe proving correctness if you have time (but really skipping it, because who cares, right?). I think some contest sites even perpetuate this idea by presenting a greedy algorithm in the editorial without any kind of proof or explanation where this comes from. Even here, you may have tutorials that say "here's the algorithm" and after that "here's the proof". While it is convenient (and completely fine!) to present the solution in this form (algorithm and then proof), it's not necessarily how the solution was invented.

While you can solve problems this way (guessing and then proving (or not proving)) and I have solved problems this way, it is quite unusual for me. If someone were to think (perhaps implicitly) that greedy algorithms are invented like that, they'd be shocked when they read the solution to 1503C - Travelling Salesman Problem: how could anyone guess that, how could anyone come up with ordering the cities like this and then doing things like that?

And the answer is that you really can't. It would be very hard to guess this solution unless you already have some insights to the problem. That is roughly how I feel I solve problems: try to really understand the problem, gather insights about it. And at some point, you have found enough observations that you can piece together a solution. The solution isn't proved afterwards — the proof develops alongside the solution. It isn't always like this, but very often it is. (Of course, there is some guesswork involved, but it is more "I guess I should think about this thing", not "I guess the entire algorithm").

In the following, I'll try to show you what I mean.

The problem

Given $n$ cities and two arrays $a$ and $c$ of length $n$, you want to visit every city once, starting from city $1$ and returning to city $1$. The cost of going from $i$ to $j$ is

$\max(c_i, a_j - a_i),$

and you must minimize the total cost. We have $2 \le n \le 10^5$ and $0 \le a_i, c_i \le 10^9$.

• +368

By -is-this-fft-, history, 13 months ago,

A week ago, I made this blog post. Earlier, I already wrote a small update to the blog, but I wanted to make a separate post to get more contribution properly announce what I did as well as to share my code.

Thanks to vjudge's suggestion, there is now a way to participate in CodeChef Cook-Off contests virtually on vjudge.

How to participate

• Go to the list.
• Open a contest by clicking on its name.
• Click the "clone" button to create your own private copy of the contest.
• Set the "begin time" to the time you'd like to start the contest, somewhere in the future.

Keep reading if you want to know how it works.

• +93

By -is-this-fft-, history, 13 months ago,

Hi!

I like doing virtual contests. As it stands, I am quickly running out of contests (especially non-ancient ones) on Codeforces, Atcoder and CSAcademy. So, CodeChef would be the next logical place.

However, it seems that there is no option to do that on CodeChef. Since there are always new blogs about tools for competitive programming, I was wondering if someone had already made this or if someone knows of a service like this.

Some things I have looked into:

• vjudge doesn't really fit the bill, it seems that it is intended more as a "create cross-platform mashups" type of thing. My issues:
• You need to manually create the contest unless someone else has already done it.
• Can't look at "the number of people who solved the problem up to this point in time" — this part is important to me.
• Can't look at "scoreboard at this point in time" — this part is less important.
• This project seems to be exactly what I'm looking for. However, 149.129.139.145 is down. I'm thinking about hosting this myself (probably not publicly), but I'm not sure I can. They use OAuth to authenticate (and use the bearer token in all subsequent requests to the CodeChef API), and I don't have valid credentials to do that. They do actually have an OAuth client secret pushed to the repo (:O), but when I try using that, the client id is invalid.
• Writing one myself — it probably wouldn't be very hard to do, but I kind of run into the same problem as above: I can't access the CodeChef API. Web-scraping is also an option, but it isn't exactly pleasant, especially on CodeChef.

To summarize, I have two questions:

• Does anyone know of such a service?

Update

Thanks to the recommendation of vjudge, I have found a solution: to fetch ranklists from Codechef and upload them as replays on vjudge. As proof of concept, I have uploaded replays of the last contest:

I tried participating in Div. 1 this morning and was satisfied.

The Div. 2 and Div. 3 ranklists are truncated, featuring only the top 2000 participants. This is a vjudge limitation. Maybe it is better to randomly sample the participants, to get a better feel of "number of solvers"? Let me know what you think.

To participate in them virtually, you need to press the clone button on vjudge: you can then set the start date of the clone to the future and participate. You will be able to see the ranklist as it was at that point, any time during your participation, just like you can on Codeforces gyms and virtuals.

I will try to upload all Cook-Offs to vjudge in the coming days.

• +127

By -is-this-fft-, 17 months ago,

I wanted to suggest some improvements for the blog system. Some of these things have probably been said before. Some of these things should be pretty easy to implement, others are harder and might require a complete rework. I tried to put the easy things first.

• +432

By -is-this-fft-, history, 21 month(s) ago,

There was a blog today with a similar title, and although it was (I think, not really sure as I didn't get a good look) some kind of satire/joke blog, I wanted to post some serious advice about things I (don't) like to see when people are PMing me. However I decided that this wouldn't be in the spirit of the blog and decided to make a separate blog. And now that blog is deleted anyway.

While it's not a bad thing to write messages to more experienced users, most people who write me do things that make them very difficult to deal with. These are my opinions, but somewhy I feel that many reds will agree. Some things here feel silly to write because they are so obvious. But I'm only writing them because people regularly mess them up.

• +746

By -is-this-fft-, history, 21 month(s) ago,

See this and this for example.

It seems that sometimes after writing an equation on a separate line with Latex like [dollar][dollar][equation][dollar][dollar], the next equation written inline like [dollar][equation][dollar] doesn't work.

From some experiments it seems like Codeforces internally changes single dollar signs to triple dollar signs if there is a match, and uses those to delimit Latex. I suspect the double dollar signs screw up the parser and after an inline equation, the single dollar signs won't get converted.

• +45

By -is-this-fft-, history, 22 months ago,

There have been some questions about this comment (and its parents) and it seems different people understand the expression differently. So I wanted to create a poll.

If your answer is "depends on the context", vote based on the context of that comment.

• +52

By -is-this-fft-, history, 2 years ago,

I wanted to create this blog because there will probably not be really much discussion of solutions in the award ceremony (currently in progress). Also people who aren't finalists themselves can't see that anyway.

I'm interested to hear approaches of the higher-scoring finalists. (We, TÜ Randomized Algorithms did not do anything interesting :D)

A bit on the negative side: this seemed to be not be a 4-hour contest at all. The problem was kinda complex, I felt that I spent all the contest swamped in implementation, not being able to implement even 1/10 of the ideas I had. Well, it's also partially our fault for coming to troll in the finals with a 2-person team :P.

Also, did anyone manage to use the visualizer?

• +87

By -is-this-fft-, history, 2 years ago,

Introduction

Recently there have been several suggestions that there is significant rating inflation now due to COVID-19. The typical hypothesis seems to be along the lines of:

Because people stay at home more now, there is a lot of new inexperienced users who add a lot of rating to the system, pushing "old" users upwards without actual improvement.

A simplified and a bit more general hypothesis that I'm going to test:

Because of the virus I'm gaining more rating than usual without working more.

But most of these comments are based on anecdotal evidence, which is not worth much. I wanted to see if there is any truth to these claims. Specifically, this blog is an implementation (although I modified the approach a bit) of the following comment by geckods:

• +340

By -is-this-fft-, history, 2 years ago,

Hi.

Sometimes I use Codeforces groups for private trainings. On occassion, this means importing (via Polygon) problems from some other contests (that have public test data and are free to be used like that).

But when importing test cases "from the files" or "from the archive", this happens a lot:
(By the way I have no idea why it happens so frequently. My best guess is sometimes that OI-style contests have to repeat test cases in different subtasks. But I have also seen this with ICPC-style.)

Anyway, my question is:

• Why does Polygon even care if there are multiple equal tests? It doesn't seem to care if there are two equal tests from generators.
• Why does it only show one error at a time, and a seemingly random pair at that (It doesn't seem like it is the "lexicographically smallest" pair)?
• Can I somehow turn this validation off? There are some (probably more important) checks that you can turn off.

(I guess I should be smarter and write a local script to remove the duplicates).

• +32

By -is-this-fft-, history, 3 years ago,

The Open Competition is an annual week-long competition for Estonian high school students which serves as the first contest of the season. Just as in last year, we are welcoming international participants as well. The contest started on 14 October, 08:00 EEST and lasts a week, ending at 21 October, 00:00 EEST.

The contest consists of 7 problems with scoring distribution 20-30-40-50-60-60-60. The problems have partial scoring. The first five problems are "classical" problems of increasing difficulty, while the last two slots are reserved for output-only, interactive, out of scope, approximation, optimization — just in general "weird" problems. The problems are mostly aimed at Div. 2 level participants.

Problem statements will be available in Estonian, English and Russian. However, to register, you will need to navigate through a bit of Estonian.

• Registration (Google Translate works well; foreign participants should sign up in the "other" category and enter the country name in the "school/institution" field)
• Ranking
• The contest itself

Please note that you might not appear on the scoreboard immediately after registering.

• +16

By -is-this-fft-, history, 3 years ago,

(Huawei Marathon 2)

• +42

By -is-this-fft-, history, 3 years ago,

Inspired by people losing their mind over dp[mask][-1] here...

Motivation

Sometimes you need to implement an array with negative indices. For example you may want to have an array dp[i] which stores values for all $i$, $-10^5 \le i \le 10^5$. One way is to make dp an (unordered) map. But this is slow and can get TLE in certain problems. Another way is this:

const int MAX_N = 2e5 + 20;
const int ZERO = 1e5 + 5;
int dp [MAX_N];


and instead of calling dp[i] you call dp[ZERO + i]. If I want to access dp[-5] for example, I access dp[ZERO - 5]. But this is tedious and error-prone. It's very easy to make a mistake by forgetting to write ZERO somewhere.

The solution

Do the above, but do it like this!

const int MAX_N = 2e5 + 20;
const int ZERO = 1e5 + 5;
int dp_ [MAX_N];

int& dp (int idx) { // the '&' is very important!
return dp_[ZERO + idx];
}


A very important part is that dp doesn't return an int; it returns an int& — a reference to the given position in the array. This means that it's very convenient to use. We can assign to dp(i), modify dp(i) etc.

int main () {
int main () {
dp(-300) = 7;
dp(40) = 5;
dp(-300) += 777; // All of those are fine!
cout << dp(40) << " " << dp(-300) << endl;

swap(dp(40), dp(-300)); // Even this works!
cout << dp(40) << " " << dp(-300) << endl;
}


• +121

By -is-this-fft-, history, 3 years ago,

Introduction

This is a tutorial/exploration of problems that can be solved using the "DFS tree" of a graph.

For a way too long time, I didn't really understand how and why the classical algorithm for finding bridges works. It felt like many tutorials didn't really explain how it works, kind of just mentioned it in passing and quickly just moved on to implementation. The day someone explained what the DFS tree is, I finally understood it properly. Before, it took me ages to implement bridge-finding properly, and I always had to look up some detail. Now I can implement it at typing speed.

But more importantly, I began to see how the same techniques can be used to solve graph problems that have more or less nothing to do with bridges. The thing is, when you have a black box, you can only ever use it as a black box. But if you have a box that you understand well, you can take it into pieces, repurpose the pieces for completely different things, all without getting lost.

In my opinion, the DFS tree one of the most useful techniques for solving structural problems about graphs that I know of. Also, sometimes there are questionable greedy algorithms whose correctness becomes obvious once you think about the DFS tree.

• +565

By -is-this-fft-, history, 3 years ago,

Sometimes we see problems where a seemingly naive algorithm — for example simple brute force — is actually the correct solution. Mostly, I mean problems where, due to some clever observations, the complexity of the brute force algorithm is greatly reduced.

For example, in a recent contest we had 1168B - Good Triple. You can notice that any string of length at least 9 contains a "good triple", which means a brute force is sufficient here and runs in $O(n)$.

Or 1028F - Make Symmetrical where you can notice that on any given circle, there are not too many lattice points.

Randomized input is also a good source of these. In 896C - Willem, Chtholly and Seniorious you can observe that after a bit of time, most adjacent elements of the array are equal and write something seemingly naive based on that.

What are some other examples of problems where a stupid brute force is actually the correct solution?

• +54

By -is-this-fft-, history, 3 years ago,

The Baltic Olympiad in Informatics will be held in Tartu, Estonia from April 27 to May 2. Both days of the contest will feature an online mirror contest. We invite you to take part.

Both online mirrors will start one hour after the corresponding onsite contests. Both contests last 5 hours.

Solutions will be accepted in the following programming languages: C++, Java and Python. We have separate time limits for C++/Java and Python. Tasks have IOI-style batched partial scoring. Problem statements will be available in English and a multitude of languages spoken in the participating countries.