Блог пользователя Errichto

Автор Errichto, 8 лет назад, По-английски

Both a participant and a contest organizer sometimes wants to stress test their solution with the brute force. Sometimes random tests are quite weak and one needs many thousands (and sometimes millions) of them to become sure about the correctness. The thing is that running a program is quite slow itself, what may hurt if the computation part is fast. On my laptop running a C++ program with empty main() one thousand times takes 1.3s, what doesn't satisfy me. How to make it faster?

I recently prepared a problem with a binary grid (SRM 699, div1-hard TwoSquares) and I wanted to be very careful about the correctness. I wrote slow solutions in C++ and the intended one in Java. Only then I realized how slow usual stress testing is. If they all were in one language, I would quite easily get everything into one program with classes (structs) and I would just run it once, without any overheads. But since the languages were different, I had to rewrite one solution, what not only requires time, but also is a possible place for a mistake.

Does running a program on Windows take the similar amount of time? Is it possible to run a program on Ubuntu or Windows faster than in a milisecond? Given two programs in C++, how to automatically merge them into one program and test them (this feature would be awesome in Polygon I think, where stress testing is able to process only thousands of tests too)?

Regarding my last question (automatically merging two programs) some idea is to wrap everything except includes into a struct and creating one global instance of such a struct. It should be automatically zero-ed (all global variables should be 0, as the user intended).

code

The good thing is that the memory doesn't become huge if we process many tests (or does it?). But mnbvmar noticed that it won't work for static variables. Any way to make it work? Any other ideas?

  • Проголосовать: нравится
  • +58
  • Проголосовать: не нравится

»
8 лет назад, # |
  Проголосовать: нравится +10 Проголосовать: не нравится

for static's you probably may create two cpps, each looks like this:

namespace { //anon
 //program1
  int main() {}
}

int public_main1() {
  return main();
}
  • »
    »
    8 лет назад, # ^ |
    Rev. 3   Проголосовать: нравится +3 Проголосовать: не нравится

    Hmm, the problem I noticed was that static variables inside functions should be reset between the tested's main function calls, as in (pardon this artificial example):

    #include <bits/stdc++.h>
    using namespace std;
    
    struct S {
      int func(int input) {
        static int x = 0;
        return input + (++x);
      }
    } S_global;
    
    int main() {
      for (int test = 0; test < 1000; ++test) {
        S s1 = S_global;
        cout << s1.func(0);   // we want to make it ALWAYS zero.
      }
    }
    

    Thanks for noticing a problem with another static meaning (that is, variables or functions with static linkage). For others: method described in the post fails on the following code:

    int y;
    static int f(x) { return x + 2 * y; }
    
    int main() {
      int x;
      cin >> x >> y;
      cout << f(x) << endl;
    }
    
»
8 лет назад, # |
  Проголосовать: нравится +62 Проголосовать: не нравится

I suggest an alternative solution: just use multitest. Your program considers the input as concatenation of several test cases and returns answers to all of them together. Then you check the equality of big outputs together. There is a problem when your checker is harder then diff, but usually it's enough.

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится -75 Проголосовать: не нравится

    I don't know the opinion of others, but I completely hate this WA #2 on all the problems.

    • »
      »
      »
      8 лет назад, # ^ |
        Проголосовать: нравится +5 Проголосовать: не нравится

      Multitests on Codeforces are quite rare. I would imagine -XraY- meant it only as a problem testing device, not for use in contest :)

      • »
        »
        »
        »
        8 лет назад, # ^ |
          Проголосовать: нравится +18 Проголосовать: не нравится

        Actually, you can find it in almost every program my team and I write. It's too slow for me to copy and paste a dozen of test cases every time I find a bug.

        • »
          »
          »
          »
          »
          8 лет назад, # ^ |
            Проголосовать: нравится +5 Проголосовать: не нравится

          Thanks for pointing that out, I stand corrected. Are the problems you've written available in English? I would love to see them.

          • »
            »
            »
            »
            »
            »
            8 лет назад, # ^ |
              Проголосовать: нравится +31 Проголосовать: не нравится

            I think there's a communication problem here. From what I understood, his team writes multitest code even when solving single-test contests like CF (because a multitest program can solve a single case too, and it's easier for him to debug).

            He did not answer question about posing multitest problems because that wasn't even in his mind when he suggested writing multitest code.

            (Personally, I always forget to zero some global variable when I'm writing multitest code, and end up looking for bugs that don't exist.)

            • »
              »
              »
              »
              »
              »
              »
              8 лет назад, # ^ |
              Rev. 2   Проголосовать: нравится +15 Проголосовать: не нравится

              You are absolutely right, I misunderstood him.

              On a side note, I can definitely relate... I've wasted a lot of valuable minutes on initialisation bugs during GCJ.

              Edit: It seems like this whole thread is full of misunderstanding...

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +3 Проголосовать: не нравится

    That's one hardcore approach. It's harder to write such a code, isn't it? You must care about global variables and arrays. I didn't know anybody does it by default (at least in C++). I would say it's one more possibility to make a bug.

    It's too slow for me to copy and paste a dozen of test cases every time I find a bug.

    I store tests in files in1 in2 in3 ... and I run them with bash loop if there are many of them.

    Still, I appreciate completely new idea.

    • »
      »
      »
      8 лет назад, # ^ |
        Проголосовать: нравится +10 Проголосовать: не нравится

      If you introduce a bug while implementing multitest, stress testing should find it, isn't it? By the way, I never use non-constant global variables in my solutions. If I can't stuff all of them in the main function, I create a struct to hold them.

      If multitest by itself is not enough and your computer have multiple cores, you can try to use multithreading. It is much easier in C++11 than before since we have std::thread now.

      • »
        »
        »
        »
        8 лет назад, # ^ |
          Проголосовать: нравится +5 Проголосовать: не нравится

        -XraY- said he usually does it even in contests, but I'd say one doesn't always stress-test their solution during the contest — this is why I'm surprised he uses multiple test cases by default. During the contest preparation, do you say all testers/authors should write their programs like that?

        I guess it's a good habit not to use non-constant global variables by the way, though I doubt I will switch, because global arrays are quite comfortable.

        I often use many threads to test my solution(s), what makes everything two times faster. It isn't much compared to the issue I described, where computing part is hundreds times faster than opening a program.

        • »
          »
          »
          »
          »
          8 лет назад, # ^ |
            Проголосовать: нравится +15 Проголосовать: не нравится

          I do it to keep good habits.

          If solution needs to be stress-tested, I think authors should write it without global variables. I sometimes do a conversion in Educational Rounds for others solutions. I also run them under asan or msan to find more bugs.

          • »
            »
            »
            »
            »
            »
            8 лет назад, # ^ |
              Проголосовать: нравится 0 Проголосовать: не нравится

            +1 for ASan (especially in combination with appropriately sized vectors instead of global arrays). I'd also add UBSan for overflow- and basic array bounds checking.

        • »
          »
          »
          »
          »
          4 года назад, # ^ |
            Проголосовать: нравится 0 Проголосовать: не нравится

          Errichto How do you use threads to test the solution and not cause multiple threads to interfere with each other? Can you share your code ?

»
8 лет назад, # |
Rev. 2   Проголосовать: нравится -14 Проголосовать: не нравится

Deleted. Went off topic

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +18 Проголосовать: не нравится

    It seems that u completely missed the point

  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +5 Проголосовать: не нравится

    A tip:

    every time you compile your code, it takes about 1.3 seconds.

    You know what ? you can use precompiled headers to speed up the compilation. You can learn more about it here and here. In my case, g++ takes around 0.8s to compile with out precompiled headers, but with precompiled headers, it takes 0.2s.

»
8 лет назад, # |
  Проголосовать: нравится +28 Проголосовать: не нравится

You can make the struct templatized (e.g. template <int N> struct { ... }), that way static variables will be different when you use different template parameter. Then, you could write some kind of recursive template which invokes your function and then calls itself with a different value of template parameter. Template depth can't be very large but you can, for example, call it like this 100 times and then re-start the program. It should hopefully reduce the overhead from re-running the binary for each test case.

  • »
    »
    8 лет назад, # ^ |
    Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

    You may increase allowed depth using compiler options (if you have access to them).

    And you may run the template in tree-like way run<i> -> run<2i> and run<2i+1>, that way you'll have 2^{allowed_depth} runs in one file (with appropriate break condition)

»
8 лет назад, # |
  Проголосовать: нравится +25 Проголосовать: не нравится

On my laptop running a C++ program with empty main() takes 1.3s, what doesn't satisfy me.

I've re-read the statement several times to make sure I understand it correctly. How is it even possible? What is it doing all these 1.3s? I would start with debugging what exactly is going on — maybe smth is wrong with the system on your laptop.

For comparison (this is desktop, but I don't think it should be 1000 times faster):

$ time ./a.out

real	0m0.001s
user	0m0.000s
sys	0m0.001s
  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +10 Проголосовать: не нравится

    Oh, I meant to write "running it 1000 times takes 1.3s". This is why later I ask "is it possible to run ... faster than in a milisecond". It's strange that nobody noticed it before. Thanks (corrected now).

»
8 лет назад, # |
Rev. 2   Проголосовать: нравится +10 Проголосовать: не нравится

There have been some prior work on this in the context of fuzzing. AFL does a neat thing with changing the program to call fork() at the beginning of main and doing independent runs in forked children, often winning a factor ~10 runtime. See e.g. http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html.

Edit: in your case this would take the form of something like:

int main() {
  for (;;) {
    // generate the next set of test data, maybe write to a string that you rebind stdin to
    if (fork() == 0) {
      // freopen stdout or otherwise check that the output is reasonable
      solve();
      return 0;
    }
    int status;
    wait(&status);
    assert(WIFEXITED(status));
    assert(WEXITSTATUS(status) == 0);
  }
}
  • »
    »
    8 лет назад, # ^ |
      Проголосовать: нравится +5 Проголосовать: не нравится

    ... and if you want you can even get some free multithreading out of this by leaving out, say, the first 10 wait calls.