How to stress test two programs millions times? Help

#	User	Rating
1	ecnerwala	3649
2	Benq	3581
3	orzdevinwang	3570
4	Geothermal	3569
4	cnnfls_csy	3569
6	tourist	3565
7	maroonrk	3531
8	Radewoosh	3521
9	Um_nik	3482
10	jiangly	3468

#	User	Contrib.
1	maomao90	174
2	awoo	164
3	adamant	161
4	TheScrasse	159
5	nor	158
6	maroonrk	156
7	-is-this-fft-	152
8	SecondThread	147
9	orz	146
10	pajenegod	145

Both a participant and a contest organizer sometimes wants to stress test their solution with the brute force. Sometimes random tests are quite weak and one needs many thousands (and sometimes millions) of them to become sure about the correctness. The thing is that running a program is quite slow itself, what may hurt if the computation part is fast. On my laptop running a C++ program with empty main() one thousand times takes 1.3s, what doesn't satisfy me. How to make it faster?

I recently prepared a problem with a binary grid (SRM 699, div1-hard TwoSquares) and I wanted to be very careful about the correctness. I wrote slow solutions in C++ and the intended one in Java. Only then I realized how slow usual stress testing is. If they all were in one language, I would quite easily get everything into one program with classes (structs) and I would just run it once, without any overheads. But since the languages were different, I had to rewrite one solution, what not only requires time, but also is a possible place for a mistake.

Does running a program on Windows take the similar amount of time? Is it possible to run a program on Ubuntu or Windows faster than in a milisecond? Given two programs in C++, how to automatically merge them into one program and test them (this feature would be awesome in Polygon I think, where stress testing is able to process only thousands of tests too)?

Regarding my last question (automatically merging two programs) some idea is to wrap everything except includes into a struct and creating one global instance of such a struct. It should be automatically zero-ed (all global variables should be 0, as the user intended).

code

#include <bits/stdc++.h>
using namespace std;

Struct P1 {
	int s, t[100000];
	int main() {
		int n;
		scanf("%d", &n);
		for(int i = 0; i < n; ++i) {
			scanf("%d", &t[i]);
			s += t[i];
		}
		printf("%d\n", s);
		return 0;
	}
} P1_global;

struct P2 {
	// something similar
} P2_global;

int main() {
	for(int rep = 0; rep < 1000; ++rep) {
		// generate the test and either print it to the file or put it to some stream
		auto p1 = P1_global;
		p1.main();
		auto p2 = P2_global;
		p2.main();
		// compare outputs
	}
}

The good thing is that the memory doesn't become huge if we process many tests (or does it?). But mnbvmar noticed that it won't work for static variables. Any way to make it work? Any other ideas?

#include <bits/stdc++.h> using namespace std; struct S { int func(int input) { static int x = 0; return input + (++x); } } S_global; int main() { for (int test = 0; test < 1000; ++test) { S s1 = S_global; cout << s1.func(0); // we want to make it ALWAYS zero. } }

int main() { for (;;) { // generate the next set of test data, maybe write to a string that you rebind stdin to if (fork() == 0) { // freopen stdout or otherwise check that the output is reasonable solve(); return 0; } int status; wait(&status); assert(WIFEXITED(status)); assert(WEXITSTATUS(status) == 0); } }

Comments (25)

Write comment?

riadwaw

8 years ago, # |

+10

for static's you probably may create two cpps, each looks like this:

namespace { //anon
 //program1
  int main() {}
}

int public_main1() {
  return main();
}

→ Reply

mnbvmar

8 years ago, # ^ |

← Rev. 3 →

Hmm, the problem I noticed was that static variables inside functions should be reset between the tested's main function calls, as in (pardon this artificial example):

Thanks for noticing a problem with another static meaning (that is, variables or functions with static linkage). For others: method described in the post fails on the following code:

int y;
static int f(x) { return x + 2 * y; }

int main() {
  int x;
  cin >> x >> y;
  cout << f(x) << endl;
}

Ah, yeah, I missed the key problems with static

-XraY-

+62

I suggest an alternative solution: just use multitest. Your program considers the input as concatenation of several test cases and returns answers to all of them together. Then you check the equality of big outputs together. There is a problem when your checker is harder then diff, but usually it's enough.

Upsolver

-75

I don't know the opinion of others, but I completely hate this WA #2 on all the problems.

gongy

Multitests on Codeforces are quite rare. I would imagine -XraY- meant it only as a problem testing device, not for use in contest :)

+18

Actually, you can find it in almost every program my team and I write. It's too slow for me to copy and paste a dozen of test cases every time I find a bug.

Thanks for pointing that out, I stand corrected. Are the problems you've written available in English? I would love to see them.

ffao

+31

I think there's a communication problem here. From what I understood, his team writes multitest code even when solving single-test contests like CF (because a multitest program can solve a single case too, and it's easier for him to debug).

He did not answer question about posing multitest problems because that wasn't even in his mind when he suggested writing multitest code.

(Personally, I always forget to zero some global variable when I'm writing multitest code, and end up looking for bugs that don't exist.)

← Rev. 2 →

+15

You are absolutely right, I misunderstood him.

On a side note, I can definitely relate... I've wasted a lot of valuable minutes on initialisation bugs during GCJ.

Edit: It seems like this whole thread is full of misunderstanding...

Errichto

That's one hardcore approach. It's harder to write such a code, isn't it? You must care about global variables and arrays. I didn't know anybody does it by default (at least in C++). I would say it's one more possibility to make a bug.

It's too slow for me to copy and paste a dozen of test cases every time I find a bug.

I store tests in files in1 in2 in3 ... and I run them with bash loop if there are many of them.

Still, I appreciate completely new idea.

halyavin

If you introduce a bug while implementing multitest, stress testing should find it, isn't it? By the way, I never use non-constant global variables in my solutions. If I can't stuff all of them in the main function, I create a struct to hold them.

If multitest by itself is not enough and your computer have multiple cores, you can try to use multithreading. It is much easier in C++11 than before since we have std::thread now.

-XraY- said he usually does it even in contests, but I'd say one doesn't always stress-test their solution during the contest — this is why I'm surprised he uses multiple test cases by default. During the contest preparation, do you say all testers/authors should write their programs like that?

I guess it's a good habit not to use non-constant global variables by the way, though I doubt I will switch, because global arrays are quite comfortable.

I often use many threads to test my solution(s), what makes everything two times faster. It isn't much compared to the issue I described, where computing part is hundreds times faster than opening a program.

I do it to keep good habits.

If solution needs to be stress-tested, I think authors should write it without global variables. I sometimes do a conversion in Educational Rounds for others solutions. I also run them under asan or msan to find more bugs.

simonlindholm

+1 for ASan (especially in combination with appropriately sized vectors instead of global arrays). I'd also add UBSan for overflow- and basic array bounds checking.

rav.gupta

4 years ago, # ^ |

Errichto How do you use threads to test the solution and not cause multiple threads to interfere with each other? Can you share your code ?

likecs

-14

Deleted. Went off topic

i.trofimow

It seems that u completely missed the point

achaitanyasai

A tip:

every time you compile your code, it takes about 1.3 seconds.

You know what ? you can use precompiled headers to speed up the compilation. You can learn more about it here and here. In my case, g++ takes around 0.8s to compile with out precompiled headers, but with precompiled headers, it takes 0.2s.

KADR

+28

You can make the struct templatized (e.g. template <int N> struct { ... }), that way static variables will be different when you use different template parameter. Then, you could write some kind of recursive template which invokes your function and then calls itself with a different value of template parameter. Template depth can't be very large but you can, for example, call it like this 100 times and then re-start the program. It should hopefully reduce the overhead from re-running the binary for each test case.

You may increase allowed depth using compiler options (if you have access to them).

And you may run the template in tree-like way run<i> -> run<2i> and run<2i+1>, that way you'll have 2^{allowed_depth} runs in one file (with appropriate break condition)

ilyakor

+25

On my laptop running a C++ program with empty main() takes 1.3s, what doesn't satisfy me.

I've re-read the statement several times to make sure I understand it correctly. How is it even possible? What is it doing all these 1.3s? I would start with debugging what exactly is going on — maybe smth is wrong with the system on your laptop.

For comparison (this is desktop, but I don't think it should be 1000 times faster):

$ time ./a.out

real	0m0.001s
user	0m0.000s
sys	0m0.001s

Oh, I meant to write "running it 1000 times takes 1.3s". This is why later I ask "is it possible to run ... faster than in a milisecond". It's strange that nobody noticed it before. Thanks (corrected now).

There have been some prior work on this in the context of fuzzing. AFL does a neat thing with changing the program to call fork() at the beginning of main and doing independent runs in forked children, often winning a factor ~10 runtime. See e.g. http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html.

Edit: in your case this would take the form of something like:

... and if you want you can even get some free multithreading out of this by leaving out, say, the first 10 wait calls.

Errichto's blog