Everule's blog

By Everule, history, 11 days ago, In English

It seems many people struggle to understand what a proof even looks like, for greedy problems. The issue is the polarized forms of proof in CP and academic math, which may seem to be overly rigorous and difficult to during contest. However there is spectrum of what a proof even is. It is important to understand, that you cannot truly write and verify a proof within yourself, or within an entity (the entirety of codeforces for example). It is necessary to have judgement on what level of rigor is sufficient.

If you want to tell yourself that you just don't have math background and therefore can't prove and will perpetually use it as excuse this blog is not for you.

Academic proofs usually tend to be as rigorous as possible, and are carefully verified by other experts in the field, to be objectively certain of its correctness. Clearly that is not a level of rigor you need while solving a codeforces problem. You only need to prove it to yourself.

For this, there are certain heuristics you can use to solve these problems with proofs almost as fast as just solving as you do now, but with more certainty of correctness, which saves you time on implementing incorrect solutions or getting WAs and not knowing if your implementation or idea is bugged. Not only do these strategies help you prove your solutions, they also make it much faster to find approaches that could reasonably work, making you much faster at being able to solve, and you're not at the mercy of the tests while practicing.

On top of that, on problems with multistep solutions, finding the solution is exponential in number of steps, but by proving you can optimize it to linear. Most of my practice has been done without writing a single line of code, because I'm sufficiently experienced in knowing whether my solution is correct or not.

A lot of professional mathematicians believe there are 3 big segments of ability in any particular mathematical domain.

  • Level 1 : You either cannot make sensible claims, or cannot back your claims up to anyone else with anything more than heuristic arguments.
  • Level 2 : You're able to make claims and prove them, but you don't see the relation between your abstract writing and the intuition it works.
  • Level 3 : You don't need to even write down the proof anymore, and at this point you can tell if you can prove something or not.

IT IS NOT POSSIBLE TO REACH LEVEL 3 WITHOUT EXPERIENCE IN LEVEL 2

For example, you might see a proof like. You have a set $$$S$$$ of $$$n$$$ elements. Find the largest subset $$$T \subseteq S$$$ such that of size $$$k$$$, or $$$|T| = k$$$, and the sum of elements in $$$T$$$ is maximised, or $$$\sum_{t \in T} t$$$ is maximised.

It is quite obvious to simply pick the $$$k$$$ largest elements in $$$T$$$. You might think a formal proof would require you to do the following. Let $$$T$$$ be your solution, and let $$$T*$$$ be the optimal solution. If $$$T = T*$$$, our solution is optimal. Let $$$T \not= T*$$$. Then let us consider the symmetric difference of $$$T$$$ and $$$T*$$$. Notice that all elements in $$$T$$$ and not in $$$T*$$$ are larger than those in $$$T*$$$ and not in $$$T$$$. Therefore $$$sum(T) \ge sum(T*)$$$, and is optimal.

But this is not really a proof you need in Competitive programming. This is necessary for math, because everything relies on the basic axioms, and all arguments should derive from it. However, in most cases you don't need to prove a self-evident claim like this. It's useful, but not necessary to understand the formal reasoning of a proof of something really simple, so you can apply it to cases where you don't have as strong intuition as picking the $$$k$$$ largest. I can go on a tangent about how intuition is derived from the primal instincts of man, and is optimized for real life scenarios. For an interesting read on that, you can try this.

But effectively, what I'm trying to say that this claim is not obvious to you because its easy to prove. Its because the claim has applied for humans long enough that its "common sense" to our brain. Really the easiest way to improve in Competitive Programming is the art of developing intuition of something you can't quite intuit anything about. It's also why I enjoy easier problems that I struggle in. It usually means I had to develop intuition in something I didn't really have good intuition for. I personally believe that is something that helps in all areas of thinking, but you might disagree.

I hope that we're in agreement on what a proof should do.

An easy way to sanity check your proof is. Think of every step you took while proving. If you wanted to justify to an opponent you were correct, what would the opponent try to argue against. If you think you can't justify your reasoning over his to a jury without reasonable doubt, your proof needs to be more specific.

I will now show how to write effective proofs, and why they're still formal, despite not having the illusory abstract variables describing everything.

I've decided to make this part interactive. You can send me any problem that you might have guessed, or any constructive you did not understand how to derive, and I will reply to it (if I can solve it). If I think its sufficiently educational, I will add it to the blog.

Let's try this atcoder problem. You've binary string made with $$$AB$$$. You can change some pair of adjacent characters to "AB". Find if you can make the given string a palindrome.

Solution

This form of reasoning is applicable to solving problems, and not just proving your solutions.

Let's start with this unnecessarily hated problem for example click.

Create $$$n$$$ by $$$n$$$ matrix $$$M$$$, where every number from $$$1$$$ to $$$n^2$$$ appears once, such that the number of different values of $$$|x - y|$$$ for adjacent elements in $$$M$$$ is maximized.

Solution
 
 
 
 
  • Vote: I like it
  • +322
  • Vote: I do not like it

»
11 days ago, # |
Rev. 2   Vote: I like it +21 Vote: I do not like it

Hello.

I found these two problems hard to prove without resorting to math. (I think)

https://codeforces.com/contest/999/problem/D

https://codeforces.com/contest/1779/problem/C

Thanks for the blog!

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +16 Vote: I do not like it

    For the first problem

    Hint 1
    Hint 2

    For the next problem, I guess you want proof for delaying as much as possible.

    Proof
»
11 days ago, # |
  Vote: I like it +11 Vote: I do not like it

You've binary string

skull_emoji

(Nice blow btg)

»
11 days ago, # |
  Vote: I like it +11 Vote: I do not like it

I seem to struggle a lot with number theory constructive problems. For example,

https://codeforces.com/contest/1734/problem/E

https://codeforces.com/problemset/problem/1748/D

Thanks for the blog.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +15 Vote: I do not like it

    Neither of them require much number theory, their observations come from elsewhere. However if you find number theory unintuitive, it could be bottleneck. You can read this.

    Hint 1
    Hint 2

    The other problem is less number theory, and more focused on what the constraints is.

    Hint 1
»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

May I know how would one approach the proof for a question like this? https://open.kattis.com/problems/caching I tend to struggle with these qns, as I just test out my intuition and hope it works out. Thanks!

  • »
    »
    11 days ago, # ^ |
      Vote: I like it +15 Vote: I do not like it

    It is hard to think of globally optimal solution. But its easy to prove that certain decision locally. Let $$$C$$$ be the set of elements in cache currently. Assume $$$|C| = c$$$ and we have to add new element $$$x$$$ s.t. $$$x \notin C$$$.

    We only need to think of which is the optimal element to remove. Let us assume we remove some element $$$a \in C$$$, and that $$$a$$$ appears before some other element $$$b \in C$$$ that we didn't remove. Let us show that we can make the solution not worse by removing $$$b$$$ instead. By swapping $$$a$$$ and $$$b$$$, we can have the rest of the cache behave the same until we reach $$$a$$$. Instead of removing something and adding $$$a$$$, we already have $$$a$$$. Since not having an element can only worsen the solution by at most one cache read (because we can just add it now which would make it same as optimal), the solution doesn't worsen.

    Therefore removing the one that appears the latest is optimal.

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

Hey, thanks for the blog!

Could you help me with a intuitive proof for the greedy strategy solution in this FB Hacker Cup problem about a card game.

  • »
    »
    11 days ago, # ^ |
    Rev. 4   Vote: I like it +14 Vote: I do not like it

    The most important thing to notice is that only the set of best cards at each step matters. For step 1 its just the cards that $$$A$$$ has. The next most important thing to notice is that, for a team, the best card order is high card from team, low card from team, low card from opponent, high card from opponent. This order is justifiable, as replacing any card with a better card will not worsen the team's answer, You can prove this by simply playing the same cards as the worse case.

    For when you can improve a single card and always do better, it means that for a set, if the $$$k$$$ th best in set $$$1$$$ is at least as good than the $$$k$$$ th best in set 2, set 1 is at least as good as set 2 from construction of previous claim.

    Now you simply need to prove you can find solution that is "best", i.e. previous condition holds for all other sets. I think its a relatively easy exchange argument.

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

https://codeforces.com/problemset/problem/1780/B

Note that I proved it but I'm interested in a formal proof :).

  • »
    »
    10 days ago, # ^ |
      Vote: I like it +5 Vote: I do not like it

    Note that the gcd of an array equals the gcd of its prefix sum. Thus, the problem is equivalent to selecting some prefix sums of the original array ($$$a$$$) and maximizing their gcd. Note that you must select the entire sum, i.e. $$$a_1+\cdots+a_n$$$, which we call $$$S$$$. The answer is the biggest factor of $$$S$$$ which also divides another prefix sum, thus it is $$$\max\gcd(S,a_1+\cdots+a_i)$$$ for $$$1\le i\le n-1$$$.

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you prove the problem 1731F — Function Sum? Like, authors did provide the solution, but did not provide why it works.

  • »
    »
    11 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I don't see why this problem needs polynomial interpolation, its sufficiently simple to find the polynomial directly by looking at each index $$$i$$$, then counting the number of indices to left that are less than $$$a[i]$$$ and number of indices to right greater than $$$a[i]$$$. Its easy enough to calculate for fixed $$$a[i]$$$, and then its sum of this polynomial from $$$1$$$ to $$$k$$$. Just use Faulhauber's Formula which also proves that this sum is polynomial. The stupidest brute force way to write it is $$$O(n^5)$$$ so its fast enough.

    • »
      »
      »
      10 days ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      If you go deeper into faulhauber's formulas, you'll find out that computing them isn't easy. That's exactly the reason for using interpolation, so that you're able to implicitly compute them

»
11 days ago, # |
Rev. 2   Vote: I like it 0 Vote: I do not like it

Hello, I am excited to find out how you proving https://codeforces.com/contest/1475/problem/F. Thank you!

  • »
    »
    11 days ago, # ^ |
    Rev. 2   Vote: I like it +5 Vote: I do not like it

    First notice that doing the operation is commutative, and also that doing it twice undo's the previous operation. So the only thing that matters is the set of chosen rows and columns. Also notice that doing it on all rows and all columns is like doing nothing. Create variables $$$r_i$$$, whether you do operation on row $$$i$$$, and similarly define $$$c_j$$$.

    So you have constraints like $$$a_{i,j} \bigoplus r_i \bigoplus c_j = b_{i, j}$$$. This tells you that $$$r_i \bigoplus c_j = a_{i, j} \bigoplus b_{i, j}$$$. These form relations in a graph of $$$r_i$$$ and $$$c_j$$$, with $$$n + m$$$ nodes, and you can easily check if there exists solution that satisifies it, since knowing one element in each component defines the rest uniquely.

»
11 days ago, # |
Rev. 3   Vote: I like it +3 Vote: I do not like it

Great blog as always, i'd love to hear your thoughts on this problem and this problem

I personally couldn't prove all parts that my solution uses in latter problem, just noticed the pattern.

»
11 days ago, # |
  Vote: I like it 0 Vote: I do not like it

Problem C from today's code forces round? Can you prove it is NP hard?

»
11 days ago, # |
  Vote: I like it +31 Vote: I do not like it

Can you prove this problem? For which $$$n$$$ does there exist $$$x\neq y$$$ such that $$$x\oplus nx=y\oplus ny$$$?

  • »
    »
    11 days ago, # ^ |
    Rev. 3   Vote: I like it +21 Vote: I do not like it

    All odd $$$n$$$ work, and they are the only $$$n$$$ that work. For a proof of sufficiency, take the smallest power of $$$2$$$ greater than $$$n$$$, say $$$2^k$$$. Then $$$(x, y) = (2^k - 1, 2^k + 1)$$$ works. For necessity, note that you can build $$$x$$$ bit by bit from $$$x \oplus nx$$$ if $$$n$$$ is even.

»
11 days ago, # |
  Vote: I like it +21 Vote: I do not like it

An aid for the ``three levels of ability'': Terry Tao and 'Cheating Strategically'. Terence Tao refers to the three stages as pre-rigorous, rigorous, and post-rigorous.

»
11 days ago, # |
  Vote: I like it -28 Vote: I do not like it

This is nonsense

»
11 days ago, # |
  Vote: I like it +3 Vote: I do not like it

Hi,

Would you say that proof by induction and proof by contradiction are the most common ways to do proof for competitive programming? Also here is a question, I developed a "proof" that the optimal solution will have a 0, but wanted to verify its correctness.

https://codeforces.com/problemset/problem/1667/A

My Proof

Thanks for the blog!

  • »
    »
    10 days ago, # ^ |
      Vote: I like it +2 Vote: I do not like it

    Yes, it's completely valid proof, because there no arbitrary decision or wishfulness.

    This the level of confidence you should have while reasoning on all problems.

»
11 days ago, # |
Rev. 2   Vote: I like it +4 Vote: I do not like it

nice blog

»
11 days ago, # |
  Vote: I like it +5 Vote: I do not like it

Thanks for the blog Everule. Can you tell me how to prove solution of Movie Festival 2 cses problem. I have been trying to get a formal proof for quite a long time but got nowhere.

  • »
    »
    10 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    If you are downvoting then you must be expert in proving these kind of problems. Please share your knowledge.

»
11 days ago, # |
  Vote: I like it +7 Vote: I do not like it

Thank you for this post.

I struggled for quite some time on both these problems, but i was unable to prove why the intuitive solution works for both of them.

1685B — Linguistics

1539D — PriceFixed

»
10 days ago, # |
Rev. 2   Vote: I like it +14 Vote: I do not like it

How to estimate the time complexity of the editorial solution of 1732D1 - Balance (Easy version)? The best bound would be some bound which is independent of the magnitude of x. Proving a bound of $$$O(q \log q)$$$ would be nice.

»
10 days ago, # |
  Vote: I like it 0 Vote: I do not like it
»
10 days ago, # |
Rev. 2   Vote: I like it +23 Vote: I do not like it

Fantastic blog, very well written and inspiring! Thank you, Everule.

Proving one's own solution before submitting is a commonly overlooked aspect in CP, especially by beginners. I'd say this blog is more important and valuable, than most "How to practice" and "How to reach XYZ rating" blogs. Also the decision to turn this blog into an interactive session, is a really nice learning experience for the community.

And finally, as a Codeforces connoisseur, I'd like to nominate this blog for the peltorator's best blog post prize.

»
6 days ago, # |
  Vote: I like it 0 Vote: I do not like it

You have a set S of n elements. Find the largest subset T

I don't follow this argument with S = [0,1,2,3,4,5], n = 6, k = 3, T = [0,1,2] and T* = [3,4,5]. How is sum(T) >= sum(T*)?

  • »
    »
    6 days ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Why is T = [0, 1, 2]?

    • »
      »
      »
      6 days ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      I'm a beginner, so I probably read it wrong. But that's how I read it: We have S = [0,1,2,3,4,5], n = 6, k = 3 T* = [3,4,5] is optimal. First let's pick T = T*, T is optimal. Now let's pick T != T*, why not [0,1,2]?

»
6 days ago, # |
  Vote: I like it +14 Vote: I do not like it

I feel like Problem E from yesterday's contest belongs here, since many people (including myself) solved it without proving the construction always works.

  • »
    »
    6 days ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Assume you already found the lower bound of size of a subsequence to have xor $$$x$$$ (which is $$$2$$$ except a subsequence with $$$x$$$ itself), let try to prove you can achieve this lower bound. (or upper bound on $$$k$$$)

    First, you must have at least one element have $$$x$$$'s highest bit equal to $$$1$$$, let call it $$$a$$$. if you want to achieve lower bound, you must take another element $$$b$$$ s.t. $$$b = a \text{ xor } x$$$, you can notice that this kind of pairing is one to one, so different $$$a$$$ will match to different $$$b$$$, and consider comparing $$$a$$$ and $$$b$$$ from highest bit, they will have same bits until $$$x$$$'s highest bit, where $$$a$$$ have $$$1$$$ and $$$b$$$ have $$$0$$$ at that bit, so $$$a$$$ always greater than $$$b$$$, and for every $$$a \le n$$$, you can find some $$$b$$$ s.t. $$$b < a \le n$$$, so you can make #$$$[1 \le i \le n, i \text{ have x's highest bit equal to 1}]$$$ different subsequence with xor equal to $$$x$$$, which is the upper bound of $$$k$$$.