Wielomian's blog

By Wielomian, 3 years ago, In English

Hello, fellow Codeforces fans!

During my training I select problems by their Problem Rating. But recently I noticed something peculiar about this feature. Namely, problems that have the same ratings may actually differ in hardness! The difference in hardness cannot be, in my opinion, explained by the rounding-to-the-nearest-100 rule, as my percieved difference should be sometimes in the order of 200-250. In my opinion this is connected to the Divisions system.

Suppose that I want to solve a 1900 rated problem from the Problemset. I noticed that in such situation I'm able to solve such a problem in around 15 mins if it comes from Div3, around 30 mins if it comes from Div2 and up to 50 mins for Div1 problems. Those times include implementation time.

My explaination for that is that a 1900 rated problem in Div3 is either F or G, in Div2 it's D or E and in Div1 it's A or B. If most participants solve problems in the A,B,C,... order, div3 participants have to solve 5 or 6 problems simply to reach this problem, let alone solve it during the round. Meanwhile, div1 participants would basically start by solving this problem.

In other words, if a round duration is 2 hrs, a (strong) div3 participant solving F or G will have some 40 minutes for this problem (as they spend say 80 minutes solving A-E/F). But div1 participant have full 2 hrs to solve their A / B problem, since they start the round with them. What I mean to say, is that the Problem Ratings of late-div3 problems is inflated -- most participants don't even manage to reach those problems, so they have a high "during the competition" failure rate and in consequence they seem harded than they really are. On the other hand, rating of early-div1 problems is deflated -- even if many participants will struggle for say over an hour, it will be counted as successfully solved "during the competition" thus decreasing the Problem Rating.

Other explaination that comes to my mind is that I'm simply biased and the Problem Rating System is working well. For example, maybe when I see "div3" next to the problem statement I'm more relaxed about the problem and the solution comes faster. And for "div1" problems I anticipate some nasty trick and overcomplicate solutions? Different bias could be a confirmation bias -- if I solved a div1 1900 problem in 15 minutes I would subconciously think something like "well that was obviously an outlier" while in fact on average all 1900 problems are similarly hard?

Do you think that there is something to it or am I simply too fast with conclusion? Do you think the system should take into account more parameters than "during-competition solvers rating" (e.g. problem number on the list, average time to submit the problem, or something else)? Do you think that Problem Ratings may be artificially influenced by cheaters or unofficial div1 participants? Please, comment you opinion because good Problem Rating System is very important for effective training.

Happy problem solving and many ACs, Wielomian

  • Vote: I like it
  • +117
  • Vote: I do not like it

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

What about 1575B - Building an Amusement Park? Why does it have rating $$$2300$$$ (it's way more difficult imo)? I think this is because the problem ratings were updated later and the solves count was higher.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +19 Vote: I do not like it

    I wonder how does one use the formula for problem rating for team-based competition, maybe that is the clue of this apparent low rating?

    Also, I recently saw a video by Algorithms Live! about geometric sweeps so for me this problem was rather easy (I didn't take part in the round though). Here is a Youtube link to the aforementioned video if you were interested. So maybe it's a well known trick?

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +17 Vote: I do not like it

    In particular, the problem ratings of COMPFEST 13 are completely broken. There is one problem that only had 8 accepteds during the round, with multiple LGMs failing to solve it, and it only has $$$2400$$$ rate. And several other problems are rated 2100/2200 which IMO are much much more harder than 2400 rated div1 problems.

»
3 years ago, # |
  Vote: I like it +20 Vote: I do not like it

This is my experience as well (ie. div3 problem ratings are inflated).

On top of this, I feel that 1600 rated problems from two years ago are as hard as 1400 problems today. That is, rating for older problems is inflated.

So to offset these issues I sort by most recent and skip div3 problems.

»
3 years ago, # |
  Vote: I like it +76 Vote: I do not like it

Didn't read the blog but it's always been broken, people put too much faith into it.

  • »
    »
    3 years ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    Sure it's not perfect but still a much better estimator of problem hardness than number of solves. I put faith into it because I need to somehow select relevant problems for training xd

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Agree. Maybe we should set a offset according to div 1/2/3. -200~-100 to div3, +100~+200 to div1 imo

»
3 years ago, # |
  Vote: I like it 0 Vote: I do not like it

In general, div3 ratings are totally and completely broken. You should never trust them. This is probably because rating of unofficial participants are not taken into account, or even if they are, there are very few reds and oranges who participate in div3.

On the other end of the spectrum are team competitions. Because of the longer duration, and the fact that people participate in teams, the ratings are a bit on the lower side.

Also, in div1 rounds, since more reds and oranges participate, and they have more time to solve 1C/1D, problem ratings from if the same problems were in div2 are slightly lower. But not by much, I think the difference is 100 most of the times or 200 at the most.

That being said, if we remove div3 rounds, I have found problem ratings to be consistent, that is an $$$X$$$ rated problem feels about as hard as any other $$$X$$$-rated problem.