Whistle's blog

By Whistle, history, 8 years ago, In English

I heard many top-guys say that when their solutions get WA for reasons they can't find, they usually generate tests and compare the output on a bruteforce solution with the output on their fast solution. Do you have a script to do that, or each time you generate test and try it on both solution? Could you elaborate on when and how you do it?

  • Vote: I like it
  • +12
  • Vote: I do not like it

»
8 years ago, # |
Rev. 2   Vote: I like it +7 Vote: I do not like it

Here is a rudimentary and fast way to do it in Python:

import os

while 1:
        # os.system('./generator ' + str(seed) + ' > input.txt')
        os.system('./generator > input.txt')
        os.system('./bruteforce < input.txt > bruteforce.txt')
        os.system('./fast_solution < input.txt > fast_solution.txt')
        if open('bruteforce.txt').read() != open('fast_solution.txt').read():
                print 'WA'
                exit(0)


If you are on windows then I think you have to change the " ./program " for "program_name.exe".

This code assumes you have a generator, a bruteforce solution and a fast one already written. Also, it only works when solutions are unique (or, at least, when you know your fast solution will write the same as the bf program).

So, you let it run for some seconds, if it doesn't print WA then you have some guarantee that your solution is quite reasonable. If it does, then you have the input and both outputs written and ready to be checked.

»
8 years ago, # |
Rev. 2   Vote: I like it +14 Vote: I do not like it

I took an algorithm course in Coursera there they mentioned a technique called Stress testing , The following text is taken from one of their readings.

1.A few small manual tests.

2.A test for each possible type of answer (smallest answer, biggest answer, answer doesn't exist, etc.)

3.Test for time/memory limit: generate a test with the largest possible size of input ("max test"), run your program, measure time and memory consumption.

3.Tests for corner cases: Smallest possible "n": the length of the input sequence or string, the number of queries, etc.

4.Equal numbers, equal letters in the string, more generally, equal objects in the input. Any two objects that are not restricted to be different in the problem statement can be equal.

5.Largest numbers in the input allowed by the problem statement — for example, to test for integer overflow.

6.Degenerate cases like empty set, three points on one line, a tree which consists of just one chain of nodes.

If after all of that you're sure that all answers are correct, but your solution is not accepted in the testing system, and you don't know what is the test case where your solution fails, there's the last resort called stress testing — a very efficient technique that will find you a test case for which your solution fails probably in 95% of cases not covered by the previously mentioned test types.

A stress test consists of four parts:

  1. The solution you want to test.
  1. A different, possibly trivial and very slow, but easy to implement and obviously correct solution to the problem.
  1. A test generator.
  1. An infinite loop in which a new test is generated, it is fed into both solutions, then the results are compared, and if they differ, the test and both answers are output, and the program stops, otherwise the loop repeats.

The idea behind stress testing is that if you have two correct solutions, and the answer to the problem is unique, then for any possible test case they are guaranteed to give the same answer. If, however, at least one of the solutions is incorrect, then with very high probability there exists a test on which their answers differ. The only case when it is not so is when there is a common mistake in both solutions, but that is very unlikely (unless the mistake is somewhere in the input/output routines which are common to both solutions — check for that separately). Indeed, if one solution is correct and the other is wrong, then there obviously exists a test case on which they differ. If both are wrong, but the bugs are different, then most probably there exists a test on which one solution gives a correct answer and another gives wrong answer, so they differ.

Taken from the course Algorithmic Toolbox , University of California, San Diego & Higher School of Economics