Edu Round 93 Problem C accepted with n^2 solution ?? - Codeforces

→ Pay attention

Before contest
Pinely Round 4 (Div. 1 + Div. 2)
34:03:40
Register now »

*has extra registration

→ Streams

Atcoder ABC #364 Solution Discussion

By aryanc403

Before stream 09:13:40

Codeforces Pinely Round 4 (Div 1 + Div 2) Solution Discussion

By Shayan

Before stream 37:08:40

View all →

→ Top rated

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

Countries | Cities | Organizations

→ Top contributors

#	User	Contrib.
1	awoo	161
1	maomao90	161
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
10	TheScrasse	142

View all →

→ Find user

→ Recent actions

Detailed →

nur_riyad's blog

Edu Round 93 Problem C accepted with n^2 solution ??

By nur_riyad, history, 4 years ago, In English

In English

### O(n^2) solution link

But if I just erase first three lines of that solution, then solution time changes from 4s+ to 1.2s why?

+102

nur_riyad
4 years ago
49

Comments

Comments (45)

Show archived | Write comment?

»

4 years ago, # |

Vote: I like it

0

Vote: I do not like it

Just wait for the open hacking phase to finish. You will have your answer.

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+3

Vote: I do not like it

I tried with worst-case but it passes through.

→ Reply

»

4 years ago, # |

Vote: I like it

-39

Vote: I do not like it

I think the solution is relying on the third line #pragma GCC optimization ("unroll-loops"). I guess GCC will be able to optimize it so that it's only O(n) (or something similar in run time). I think this actually might be much harder to hack than expected.

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+10

Vote: I do not like it

#pragma GCC optimize("Ofast") #pragma GCC target("avx,avx2,fma") These commands are important, you can check that it passes without "unroll-loops".

→ Reply

»

»

BohdanPastuschak

4 years ago, # ^ |

← Rev. 3 →

Vote: I like it

+25

Vote: I do not like it

Obviously, compiler can't optimize this to $$$O(n)$$$ (see runtime — almost second), it is just very fast $$$O(n^2)$$$. And I bet it won't be hacked — you can check by yourself, this code runs less than a second on maxtest.

→ Reply

»

»

»

XORring-Samurai

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

How does this happen?? Why the solution could pass even the worst test case?

→ Reply

»

»

»

»

BohdanPastuschak

4 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

C++ is fast , especially with this type of operations and #pragma GCC optimize("Ofast"). Without any optimizations it runs 3.5 seconds.

→ Reply

»

»

»

»

»

XORring-Samurai

4 years ago, # ^ |

← Rev. 3 →

Vote: I like it

+1

Vote: I do not like it

Quite a few times in CP, we deal with problems where the naive solution gives O(n^2) time complexity. Owing to the constraints which are usually of the order of 10^5 and 2 seconds, we optimize our code to O(n). But I had this question, if we can solve all of these problems using the O(n^2) naive solution and this optimization.

→ Reply

»

»

»

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Not all of them.

→ Reply

»

»

»

»

»

»

»

XORring-Samurai

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

So, is the above mentioned case, an exception?

→ Reply

»

»

»

»

»

4 years ago, # ^ |

Vote: I like it

+1

Vote: I do not like it

I wonder, why this not always work then? what it depends on exactly.

→ Reply

»

»

»

»

»

»

BohdanPastuschak

4 years ago, # ^ |

Vote: I like it

+1

Vote: I do not like it

Depends on many things such as type of operations and whether it can be optimized by compiler.

→ Reply

»

»

»

»

»

»

»

4 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

So can this sometimes make the program even slower? Because otherwise I would include it every time and it could probably make the program faster.

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+15

Vote: I do not like it

how can it do it in only O(n)? That is better than most people's optimal solution(which is O(nlog(n))?) Oh, and also, the compiler cannot optimize comparisons in that case. Because it knows that it has to continue through the loops to get the answer(there is no pattern found, or no case where it is useless to continue the loop anymore). If what you are saying is true, then we would have a sort that works in O(n) lol. That would be better than the best sort which works in O(nlog(n)) with any numbers/values/datatype.

→ Reply

»

4 years ago, # |

Vote: I like it

+3

Vote: I do not like it

I tried all 6 possible cases with first three line and found that this line "#pragma GCC optimize("Ofast") " must be included for soln to pass

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+1

Vote: I do not like it

Only that is not enough, check

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

+2

Vote: I do not like it

89966688 It is enough

→ Reply

»

»

»

»

4 years ago, # ^ |

Vote: I like it

+1

Vote: I do not like it

Wow, didn't know that 64-bit was that fast. Thanks.

→ Reply

»

4 years ago, # |

← Rev. 2 →

Vote: I like it

+5

Vote: I do not like it

It is fast because,

Code is branchless(CPU can easily predict what comes next)
AVX enables vector instruction, which in this case gives 4x boost by combining and calculating 4 ints at same time
Ofast also gives significant boost, but why idk

→ Reply

»

»

4 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

how can we actively try to write such code,

i found the youtube video about

return condition false * answer + condition true * otherans //zero/false will cancel it

and instead of if (x>4) return true else return false

writing return x>4

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Most of the time compiler optimizes branches, mostly try to avoid false*ans1+true*ans2, use ?: instead

→ Reply

»

4 years ago, # |

← Rev. 3 →

Vote: I like it

0

Vote: I do not like it

I think the only solution to this is that if O(n) is expected, then make the time limit at most 0.8 second. It's 20 times slower than the intended solution.

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Then,it will be very unfair for python users

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

-12

Vote: I do not like it

Limit C++ only then.

→ Reply

»

»

»

4 years ago, # ^ |

← Rev. 2 →

Vote: I like it

+5

Vote: I do not like it

Making the n up to $$$3*10^5$$$ is enough. Most problems which its intended solution is $$$O(n)$$$ or $$$O(nlog_2(n))$$$ are $$$3*10^5$$$ which would obviously make that $$$O(n^2)$$$ solution to fail. I wonder why this time it is only $$$10^5$$$...

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+8

Vote: I do not like it

I think increasing max N is typically better.
Increasing max N 10 times would increase n^2 solution time 100 times.

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

-14

Vote: I do not like it

Probably it will fail Python. The thing that there is a time limit equality among all the languages, must be changed, imho.

→ Reply

»

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Probably it will fail Python

Less likely than with decreasing time limit

→ Reply

»

4 years ago, # |

Vote: I like it

+1

Vote: I do not like it

I got TLE during the contest(using O(n^2) solution). But using your first 3 lines , it's accepted now.

→ Reply

»

4 years ago, # |

Vote: I like it

+9

Vote: I do not like it

From what I see with this pragma code begins to use SSE instructions, so I assume it basically processes four values at a time

→ Reply

»

4 years ago, # |

Vote: I like it

+1

Vote: I do not like it

if i use those 3 lines every solution,, can i get advantage all time or not??

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+3

Vote: I do not like it

Sometimes, it mostly depends on structure of code

→ Reply

»

4 years ago, # |

Vote: I like it

+1

Vote: I do not like it

LoL, quite unexpected, but cool!
Turns out you can do order of 10^10 operations in a second on CodeForces servers.

→ Reply

»

»

4 years ago, # ^ |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

Not accurate. $$$10^{10}$$$ operations would theoretically take around 1800 ms or more in that problem with the same conditions. The number of operations is specifically in worst case is $$$n*(n-1)/2$$$ which is around $$$5*10^9$$$ operations.

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

More or less accurate:
The number of loop iterations is 5 * 10^9,
but each loop iteration consists of several operations: ans+=(ll)(a[j]-a[i-1]==j-i+1);
and then there are also operations to do looping: one increment and one comparison per iteration.

→ Reply

»

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

Oh, yes you are right. So that is two array accesses and one comparison and increment in each operation. You are right!

→ Reply

»

»

»

»

4 years ago, # ^ |

Vote: I like it

+8

Vote: I do not like it

Yes, but in this case compiler uses AVX instruction set to calculate 4 iterations at same time, if you comment #pragma GCC target("avx,avx2,fma") time will jump to 4 seconds.

→ Reply

»

4 years ago, # |

← Rev. 2 →

Vote: I like it

0

Vote: I do not like it

pragma added to default template....lol

Also, I wonder how many people lost score on this because of failing to break this....

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

98 hacking attempts were made.

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

That's massive....I am glad that I do not participate in hacking...

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

You don't lose or gain score in open hacking.

→ Reply

»

»

»

4 years ago, # ^ |

Vote: I like it

0

Vote: I do not like it

I see...my bad then...

→ Reply

»

4 years ago, # |

Vote: I like it

0

Vote: I do not like it

Also see this problem, the brute-force solution also passed using command-lines.

Command-lines do make your solution faster, it might pass $$$10^{10}$$$ operations within seconds, it's almost impossible to hack because it's about the basic-level optimization.

→ Reply

»

4 years ago, # |

Vote: I like it

-10

Vote: I do not like it

i have also used the above concept but was getting tle then i read this blog and used those 3 lines but still i am getting tle

Your code here...
#pragma GCC optimize("Ofast")
#pragma GCC target("avx,avx2,fma")
#pragma GCC optimization ("unroll-loops")
#include<bits/stdc++.h>
using namespace std;
int main()
{
    int i,j,n,t,ans;string s;
    cin>>t;
    while(t--)
    {
        cin>>n;ans=0;cin>>s;
        int a[n+1],p[n+1];
        p[0]=0;
        for(i=1;i<=n;i++)
        {
        a[i]=s[i-1]-'0';
        //if(a[i]==1)
        //ans++;
        p[i]=p[i-1]+a[i];
        }
        for(i=1;i<=n;i++)
        for(j=1;j<=n;j++)
        {
            if(i==j){
            if((p[i]-p[i-1])==i-j+1)
            ans++;}
 
        else if((j-i+1)==(p[j]-p[i-1])&&(i<j))
        ans++;
        }
        cout<<ans<<endl;
    }
}

→ Reply

»

»

4 years ago, # ^ |

Vote: I like it

+3

Vote: I do not like it

It won't work with any $$$O(n^2)$$$ solution, it needs to be coded in very specific way, so compiler can understand and optimize it.

→ Reply