Codeforces contests results and Data Analysis

Revision en1, by ilyakor, 2017-08-01 14:51:04

Introduction

In the recent tourist's blogpost about a new interesting strategy for some types of contests, there was an interesting comment from ftiasch. Apparently, some participants try to artificially boost their rating with the following strategy: solve problem C, if it goes fast and nice — submit and continue with the contest, otherwise don't submit anything. The assumption is, other problems in the contest are created by the same author, so if one performs good on problem C, they are more likely to perform good on other problems.

In the comment thread, some people expressed doubts about this strategy. Reading their comments, I wondered: is there any point in discussing this with groundless arguments, when there is large volume of historical data of contest results, and we can just analyse them to understand if the strategy really works. Luckily, Codeforces even provides API, so there is no need to implement custom scrapers.

Collecting and cleaning up data

For the dataset, I've used all the contests with CF rules (it doesn't make sense to evaluate the strategy on other rulesets). Even among CF-rule contests, there is a big variety, in particular, sometimes there are more than 5 problems. For the sake of the analysis, I've truncated all the problems to the standard "A-B-C-D-E" format (discarding FGH when they are present). Also, I've removed from the standings all the users with 0 points, to reduce noise.

Next question is, over which users to compute statistics. It is very likely that things work differently for Div2 users than for Div1 ones. Also, orange users are likely to have more noisy and inconsistent results than red and nutella ones. That's why I restricted stats computation only to users with rating >= 2400 at the time of scraping (note though that all users are considered when computing contest-level stats).

Verifying if the strategy works

First look at the data

Let's look at the scraped dataset. We have 695 contests (many of those are Div2 though, so they don't affect our stats). We have 107975 total users, but only 383 of them are red.

Let's see how red coders perform on different problems. First, what are the success rates for each problem? Here is the data (95%-conf intervals are printed):

A B C D E
0.91 ± 0.00 0.80 ± 0.01 0.59 ± 0.01 0.35 ± 0.01 0.15 ± 0.00

As we can see, the rates are even monotonic (so the popular claim "E is much easier than E!11" isn't true on average). What other stats are there besides success rates? An interesting one is how often a user's solution places among the top solutions in the contest (e.g. by score). The data looks as follows (with "top" defined as top-10%):

A B C D E
0.38 ± 0.01 0.38 ± 0.01 0.37 ± 0.01 0.29 ± 0.01 0.14 ± 0.00

An interesting observation is that for problem E, top-10% rate is almost the same as accept rate, while for problem A, these rates are very different.

A very naive attempt

TODO

Tags strategy, data analysis, api

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en3 English ilyakor 2017-08-02 23:12:15 5049
en2 English ilyakor 2017-08-01 15:39:56 6337 (published)
en1 English ilyakor 2017-08-01 14:51:04 3434 Initial revision (saved to drafts)