shart23's blog

By shart23, history, 15 months ago, translation, In English

There was a problem 1400E - Clear the Multiset in Educational Codeforces Round 94 (Rated for Div. 2) (yesterday's contest) and it got difficulty 2200.

Why is it strange? It wouldn't be, IF...

Get look at this 448C - Painting Fence. Yeah, these problems are same! In fact, even editorials for them are same.

As you've already guessed difficulties are different. Actually 448C - Painting Fence got only 1900!

It looks like that such a huge difference is a result of their positions in problemset (E and C exactly).

Of course, this situation is rare enough to be a huge problem. But more tasks more similar situations. Anyway for now it's just a strange fact.

Also I want to thank sevlll as a person who found 448C - Painting Fence.

What do you think about this?

 
 
 
 
  • Vote: I like it
  • +102
  • Vote: I do not like it

»
15 months ago, # |
  Vote: I like it +14 Vote: I do not like it

Really strange, yeah

»
15 months ago, # |
  Vote: I like it +6 Vote: I do not like it

Auto comment: topic has been translated by shart23 (original revision, translated revision, compare)

»
15 months ago, # |
  Vote: I like it +24 Vote: I do not like it

Yes, it is clear that C will try to solve more people than E, and they will have more time to do it. In any case, it's a pretty cool fact, but it seems even funnier to me that the author originally made an English text with Russian names of tasks. I in no way want to humiliate the author, I also had similar failures, it's just really funny :)

  • »
    »
    15 months ago, # ^ |
      Vote: I like it +11 Vote: I do not like it

    Yeah, the article firstly had only russian version, because I use russian version of codeforces. But it also strange that problems included by id don't translated to english with english version of cf. Maybe it need to be fixed. MikeMirzayanov

»
15 months ago, # |
Rev. 2   Vote: I like it +55 Vote: I do not like it

Many problems with later positions have a very high difficulty despite not being very hard. 571D - Campus has 3100 but is quite easy. However getting it right in 2 hours together with all the previous problems is very hard, indeed there are I think 4 ACs during the contest.

They are just artifacts of the way the difficulty system works.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it -145 Vote: I do not like it

    Pleassse, don't say "easy" about problems. For you maybe it quite easy, but for many others not. Just say "easier than it's difficulty says".

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it +73 Vote: I do not like it

      I didn't say "easy", I said "quite easy". And I think the meaning of "easy" was obvious from the context.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    It is same problem as described below. See rating the top users are 2600-2760. So 3100 is extrapolation error.

    • »
      »
      »
      15 months ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      I don't think it is. The argument below would predict the problem to have a lower rating, wouldn't it?

      • »
        »
        »
        »
        15 months ago, # ^ |
        Rev. 4   Vote: I like it -8 Vote: I do not like it

        CF formula designed to "boost" for top place in competition in other way there is no point for top users to write the contest. When only small amount of people solving task they will raise their rating not matter how high it is now. When there is no people to compete with than rating will be higher. If you add Div.1 people to Edu Div.2 task will have lower rating. But it is in general. The only correct answer it is an Error rating. To calculate correct task rating we need people with rating higher than task level to compete with.

      • »
        »
        »
        »
        15 months ago, # ^ |
        Rev. 4   Vote: I like it -8 Vote: I do not like it

        To fight against inflation (the main joke goes here :) ) "Choose a group of the most rated (before the round) participants and decide that their total rating shouldn’t change." For this round group size is 104. And only 4 of those people solved D. So when rating calculated for task D then 4 people receive rating of other 100 people who lose their rating "against task B". Boost size is ~105.

»
15 months ago, # |
Rev. 3   Vote: I like it +18 Vote: I do not like it

It is rating inflation. When 448C take place Div.1 started from 1700. So 1700 was rating for top users from div.2. See rating change. And if you calculate rating for task.  you will see that rating calculated correct based on CF formula. But it has limitation based on user rating in competition.

So it's calculated correct but not relevant to nowaday.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it +14 Vote: I do not like it

    So you want to say, that old problems are harder the new with similar difficulty?

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it +16 Vote: I do not like it

      old problems at the top of div2 yes. For other rating it require more research.

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      The correct answer is. When task rating is higher than rating of people in competition than it calculated incorrect (CF formula is bad for extrapolation). All we know is 448C was too hard for 1700 and 1400E is near ok for 2100.

»
15 months ago, # |
  Vote: I like it 0 Vote: I do not like it

I'm pretty sure ratings are completely based on the # of people who solve the problem during the contest, so later problems are usually more inflated as most people start from the beginning, leaving less time for the end problems.

  • »
    »
    15 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    There is explanation from Mike. Problem rating is based on the same formula as rating system. It depends on the number of people who solved the problem and their rating and also people who not solved the problem and their rating. If you add more people who not solved the problem to contest rating of problem will grow.

»
15 months ago, # |
  Vote: I like it +11 Vote: I do not like it

One thing I want to add here. Apart from one being C and the other being E, the problem E version has $$$a_i = 0$$$. My submission 90938786 passed the tests during the contest (where I did not handle the case). But after system testing, it got Wrong Answer. I guess many other contestants did similar mistake. That might have contributed a bit to the rating of this problem.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it -28 Vote: I do not like it

    I checked it. There are no much falls during system tests, so you're wrong.

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it +13 Vote: I do not like it

      I checked too: 200 solutions failed in system tests and 150+ solutions got hacked.

      • »
        »
        »
        »
        15 months ago, # ^ |
          Vote: I like it -24 Vote: I do not like it

        It's not much (only 7%).

        • »
          »
          »
          »
          »
          15 months ago, # ^ |
          Rev. 2   Vote: I like it +5 Vote: I do not like it

          7% of what excuse me?

          There were only 750 successful submissions during the contest

          350/1100 does not seem like 7%

»
15 months ago, # |
  Vote: I like it -7 Vote: I do not like it

There are also some other weird things also about problem rating.1400D - Zigzags from educational round 94 has difficulty of 1900. So, if an expert solves 4 problems in time, he should obviously get positive rating change. Isn't it? But, there are experts who solved 4 problems and got negative rating change. Why?

  • »
    »
    15 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Task rating 1900 is after ceiling. Task D rating without ceiling is ~1821 (according to my calculation). Rating to solve 4 tasks (with maximum penalty) without rating change is 1786 (you can check it here).

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Then, isn't D worth of 1786 at most. Maybe it's less. Cause there are participants who solved D but couldn't manage to solve 4 problems. Then, D would be at max 1800. Isn't it?

      • »
        »
        »
        »
        15 months ago, # ^ |
          Vote: I like it +16 Vote: I do not like it

        No if those participants have rating less than 1786. There are many participant with rating <1700 who solved D but not solved B.

»
15 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Maybe at that time,everyone has a lower rating than now. So the difficulty is calculated lower.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    U mean at that time problems with same rating as of now is tougher compared to now ??

    • »
      »
      »
      15 months ago, # ^ |
      Rev. 2   Vote: I like it +8 Vote: I do not like it

      I don't think so. In fact, sometimes I observe that some old (say from round 200 to 400) contest problems are overrated. (In fact, I just solved 899E - Segments Removal which is 2000 rated which took me less than 10 minutes to think and around 20 minutes to code which is definitely much faster than I do recent 2000 rated problems.)

      I also think that gaining rate now is as tough (if not tougher) as ever. This observation is based on the virtuals I have given.

»
15 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Although the solution to both the problems are same but reducing 1400E to 448C is not straight-forward, I'm still not sure how can I prove that both the problems are basically same.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    You can just submit your code of any of that problems to another. And it will pass, so they're equal.

  • »
    »
    15 months ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    Proof By AC:).

    • »
      »
      »
      15 months ago, # ^ |
        Vote: I like it -9 Vote: I do not like it

      AC = accepted (for people, who didn't know)

  • »
    »
    15 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    First note that in 1400E, the order in which operations are performed is irrelevant. Give disjoint and nested the obvious meaning for type-1 operations (those that remove one occurrence of each number from $$$l$$$ to $$$r$$$).

    Given two type-1 operations $$$(l_1,\, r_1)$$$ and $$$(l_2,\, r_2)$$$, if they are not disjoint then the pair of operations $$$(\min{\{l_1, l_2\}},\, \max{\{r_1, r_2\}})$$$ and $$$(\max{\{l_1, l_2\}},\, \min{\{r_1, r_2\}})$$$ is equivalent, but is nested. By repeatedly making such substitutions for many pairs of type-1 operations, it is possible to arrive at a sequence of operations in which every pair of type-1 operations is either disjoint or nested.

    Given such a sequence, it is easy to see that the nesting relationship allows one to describe the type-1 operations as several rooted trees. Then, we can translate a type-1 operation $$$(l,r)$$$ at depth $$$d$$$ in its tree in 1400E-speak to "paint a horizontal strip covering planks $$$l$$$ through $$$r$$$ between height $$$d-1$$$ and $$$d$$$" in 448C-speak. A type-2 operation $$$(i, x)$$$ gets translated to "paint a vertical strip covering the highest $$$x$$$ meters of plank $$$i$$$ that are not already painted."

    The reduction of 448C to 1400E should be easy using similar ideas.

»
15 months ago, # |
  Vote: I like it +1 Vote: I do not like it

This is certainly due to their position in the problemset, many people would not have had the time to solve it and some might not have even opened it because it is "E" and not "C".

My opinion about problem ratings: These are calculated based on the rating of participants, since nowadays participation is more than ever, so codeforces has more data to process and to calculate problem ratings. Since larger is the data more is the accuracy, so i think nowadays calculation of problem ratings is much more accurate.

»
15 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Thinking a lot about this now

»
15 months ago, # |
  Vote: I like it +3 Vote: I do not like it

Problems on Codeforces are rating-inflated.