Omega1's blog

By Omega1, history, 9 years ago, In English

Today I solved problem E. A Simple task that was given at last contest,and I find something intersting and new for me, posible and for other users.Why if I declare k,l,st,dr as a global variable I get TLE on test 9 and if I declare as a local variable my code past all tests?

Can someone explain why?

k=x;
for (l=26;l>=1;l--) {
    st=lower_bound(g[l].begin(),g[l].end(),x)-g[l].begin();
    dr=upper_bound(g[l].begin(),g[l].end(),y)-g[l].begin();
    for (;st<dr;st++) {
         g[l][st]=k; k++;
    }
}

get TLE;

int k=x;
for (int l=26;l>=1;l--) {
    int st=lower_bound(g[l].begin(),g[l].end(),x)-g[l].begin();
    int dr=upper_bound(g[l].begin(),g[l].end(),y)-g[l].begin();
    for (;st<dr;st++) {
         g[l][st]=k; k++;
    }
}

past all tests.

  • Vote: I like it
  • +4
  • Vote: I do not like it

| Write comment?
»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

May be it's some compiler optimization effects...

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

Between the two submissions you linked, you also changed cin to scanf.

When viewing a submission, you can use the Compare button on the right to check that.

  • »
    »
    9 years ago, # ^ |
      Vote: I like it +8 Vote: I do not like it

    Look at this code no differents, also TLE on test 9.

    • »
      »
      »
      9 years ago, # ^ |
      Rev. 2   Vote: I like it +12 Vote: I do not like it

      Indeed, and I get the same behavior locally.

      For example, if we take the passing solution (12126668) make the loop variable l global (12126676), the time used raises from ~1.7 seconds to ~4.3 seconds, more than 2x!

      The effect is the same when using GNU C++11 (local, global) and GNU C++ (local, global). There is no such effect when using Visual Studio (local, global), and the runtime is ~2.9 seconds — something in between.

      As to why that happens, my guess is the following.

      When variable l is local, it can not normally be altered by other code, and that allows the compiler to store it implicitly in a register, which speeds up things.

      On the other hand, when variable l is global, the compiler can not prove that global state is not affected between executing the inner loop instructions (perhaps trying to be multithread-aware), and so has to store and access it explicitly, recalculating the address of g[l][st] before every write.

      The most readable disassembly I've got is with objdump -Sl <exefile>. Here is what looks as the inner loop at lines 41-43.

      C++ source:

                      for (;st<dr;st++) {
                          g[l][st]=k; k++;
                      }
      

      Local version:

        402022:	89 3c b9             	mov    %edi,(%ecx,%edi,4)
        402025:	83 c7 01             	add    $0x1,%edi
        402028:	39 c7                	cmp    %eax,%edi
        40202a:	75 f6                	jne    402022 <_main+0x192>
      

      Global version:

        402046:	8b 3d 74 e8 41 00    	mov    0x41e874,%edi
        40204c:	8d 3c 7f             	lea    (%edi,%edi,2),%edi
        40204f:	8b 3c bd 40 60 40 00 	mov    0x406040(,%edi,4),%edi
        402056:	89 04 17             	mov    %eax,(%edi,%edx,1)
        402059:	83 c0 01             	add    $0x1,%eax
        40205c:	83 c2 04             	add    $0x4,%edx
        40205f:	39 c8                	cmp    %ecx,%eax
        402061:	75 e3                	jne    402046 <_main+0x1b6>
      
      • »
        »
        »
        »
        9 years ago, # ^ |
          Vote: I like it +8 Vote: I do not like it

        Thanks for answer Gassa,now I understand why this small change make my code ran faster,now I will use only locals variables in for and while,because it's really make difference.

»
9 years ago, # |
  Vote: I like it 0 Vote: I do not like it

I've tested so amazing !!!

»
9 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

http://codeforces.com/contest/558/submission/12119277
i confused things you are right and im wrong :D