Good afternoon, Codeforces!
Today I'll tell you about a new feature of the Polygon system, which is used to prepare all Codeforces rounds. Of course, the system is open to any user: many contests for other competitions and training camps are prepared there as well.
Besides the author's solution, the tests and the statement, a problem has two more key elements, both of them programs: the validator and the checker.
The validator is a program that reads a test and reports whether it satisfies the constraints of the problem. Validators must be absolutely formal: a validator accepts a test if and only if the test meets all conditions of the problem and can be safely added to the test set. You can easily write validators using the testlib.h library. Sometimes authors neglect validators (which never happens in Codeforces contests), and that threatens the validity of the tests. Since Codeforces contests include hacks, a correct validator becomes even more important: naturally, every hack is validated before it reaches a contestant's solution. Most problems have relatively simple validators, but when a problem imposes additional conditions (for example, that a solution exists for the test), the validator becomes much more complex.
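To make "absolutely formal" concrete, here is a minimal sketch in plain Python rather than testlib.h. The input format is a hypothetical one invented for illustration: the first line holds n (1 to 100), the second line holds n integers from 1 to 1000 separated by single spaces, and every line ends with a newline. The names parse_int and is_valid are likewise assumptions, not part of Polygon or testlib.h.

```python
import re

# Canonical integer token: no sign, no leading zeros (hypothetical format).
INT_RE = re.compile(r"0|[1-9][0-9]*")

def parse_int(token, lo, hi):
    """Return the integer if token is canonical and within [lo, hi], else None."""
    if not INT_RE.fullmatch(token):
        return None
    value = int(token)
    return value if lo <= value <= hi else None

def is_valid(text):
    """Strictly validate: line 1 is n (1..100), line 2 has n integers in 1..1000."""
    if not text.endswith("\n"):
        return False  # the last line must be terminated
    lines = text[:-1].split("\n")
    if len(lines) != 2:
        return False  # no missing or extra lines
    n = parse_int(lines[0], 1, 100)
    if n is None:
        return False
    tokens = lines[1].split(" ")  # exactly one space between numbers
    if len(tokens) != n:
        return False
    return all(parse_int(t, 1, 1000) is not None for t in tokens)
```

The point of the sketch is that whitespace, leading zeros and trailing lines are rejected just as firmly as out-of-range values, so a test that passes can be added to the test set without surprises.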
The checker is a program that receives the test, the output of the participant's solution and the output of the jury's solution, and determines whether the participant's output is correct. Unfortunately, errors in a checker often lead to serious consequences. Not every problem lets you simply compare the two outputs. For example, in problem 234H - Merging Two Decks the checker uses a Cartesian tree. If the problem requires the output to contain a certificate, it is a good idea to structure the checker around a single function that is applied to both streams: readAnswer(ans)/readAnswer(ouf). You can easily write checkers using the testlib.h library.
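The readAnswer(ans)/readAnswer(ouf) idea can be sketched as follows, again in plain Python rather than testlib.h. The problem is a hypothetical one chosen for illustration: print k followed by k distinct 1-based indices whose values sum to a target, or -1 if no such subset exists. The names read_answer, check and BadAnswer are assumptions. The verdict strings follow the ones used in problem.xml, plus "fail" for a defect on the jury's side.

```python
class BadAnswer(Exception):
    """Raised when an answer stream holds a malformed or incorrect certificate."""

def read_answer(tokens, a, target):
    """Parse one answer stream and verify its certificate against the input.

    Returns True if a valid subset was given, False if the stream says -1.
    """
    try:
        values = [int(t) for t in tokens]
    except ValueError:
        raise BadAnswer("non-integer token") from None
    if values == [-1]:
        return False
    if not values or values[0] != len(values) - 1:
        raise BadAnswer("count does not match the number of indices")
    indices = values[1:]
    if len(set(indices)) != len(indices):
        raise BadAnswer("repeated index")
    if any(not 1 <= i <= len(a) for i in indices):
        raise BadAnswer("index out of range")
    if sum(a[i - 1] for i in indices) != target:
        raise BadAnswer("subset sum differs from target")
    return True

def check(a, target, participant_output, jury_answer):
    """Return a verdict string: 'ok', 'wrong-answer' or 'fail'."""
    try:
        jury_found = read_answer(jury_answer.split(), a, target)
    except BadAnswer:
        return "fail"  # the jury's own answer is broken
    try:
        participant_found = read_answer(participant_output.split(), a, target)
    except BadAnswer:
        return "wrong-answer"
    if jury_found and not participant_found:
        return "wrong-answer"
    if participant_found and not jury_found:
        return "fail"  # the participant found a solution the jury missed
    return "ok"
```

Because the same function verifies both streams, the jury's answer is held to the same standard as the participant's, and a valid participant certificate that contradicts the jury's -1 exposes an error in the author's solution instead of being silently rejected.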
These programs are usually tested either manually from the command line or indirectly, by adding wrong solutions and temporarily adding invalid tests. In fact, authors often neglect to test checkers and validators at all. This way of testing is inconvenient, and the tests are not saved. When two authors work together, a co-author cannot see the tests on which the validator or checker was tested, or rerun them after the validator or checker has been fixed.
The updated version of Polygon has improved immensely! We've added a convenient tool for testing the validator and the checker.
Tests for the test validator
This tool is now available in Polygon. The tests are displayed on the Validator page. You can easily add several tests at once, separated by a special marker. For each added test, specify the expected validator verdict (valid or invalid). You can also run these tests to review the results. The validator tests are a full-fledged part of the problem and are included in the problem package. For this reason we have updated the problem.xml format; here is an example of the more detailed validator description (the binary and testset elements are optional):
<validator>
    <source path="files/v.cpp" type="cpp.g++"/>
    <binary path="files/v.exe" type="exe.win32"/>
    <testset>
        <test-count>2</test-count>
        <input-path-pattern>validator-tests/%02d</input-path-pattern>
        <tests>
            <test verdict="valid"/>
            <test verdict="invalid"/>
        </tests>
    </testset>
</validator>
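Since the validator tests are formally described in problem.xml, a script can enumerate them. The sketch below uses Python's standard xml.etree.ElementTree on the validator fragment above. It assumes that test files are numbered from 1 in document order and that the path pattern is a printf-style format; neither is stated explicitly here. The function name validator_tests is hypothetical.

```python
import xml.etree.ElementTree as ET

# The validator fragment from the example above.
VALIDATOR_XML = """
<validator>
    <source path="files/v.cpp" type="cpp.g++"/>
    <binary path="files/v.exe" type="exe.win32"/>
    <testset>
        <test-count>2</test-count>
        <input-path-pattern>validator-tests/%02d</input-path-pattern>
        <tests>
            <test verdict="valid"/>
            <test verdict="invalid"/>
        </tests>
    </testset>
</validator>
"""

def validator_tests(xml_text):
    """Return a list of (input_path, expected_verdict) pairs."""
    root = ET.fromstring(xml_text)
    testset = root.find("testset")
    if testset is None:
        return []  # the testset element is optional
    pattern = testset.findtext("input-path-pattern")
    tests = testset.findall("tests/test")
    # Assumption: files are numbered from 1, in document order.
    return [(pattern % i, t.get("verdict")) for i, t in enumerate(tests, start=1)]

for path, verdict in validator_tests(VALIDATOR_XML):
    print(path, verdict)
```

A runner built on this list would execute the validator on each file and compare the outcome with the expected verdict.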
If the validator has no tests, you will receive a traditional Polygon warning.
Tests for the checker
The situation here is similar. Creating such tests is a little more complicated, since you need to provide not only the input but also the output (playing the role of the participant's output) and the answer (playing the role of the output of the author's solution). You do not have to create tests for standard checkers. Here is an example of the updated checker description from problem.xml (the binary, copy and testset elements are optional):
<checker name="std::wcmp.cpp" type="testlib">
    <source path="files/check.cpp" type="cpp.g++"/>
    <binary path="check.exe" type="exe.win32"/>
    <copy path="check.cpp" type="cpp.g++"/>
    <testset>
        <test-count>4</test-count>
        <input-path-pattern>checker-tests/%02d</input-path-pattern>
        <output-path-pattern>checker-tests/%02d.o</output-path-pattern>
        <answer-path-pattern>checker-tests/%02d.a</answer-path-pattern>
        <tests>
            <test verdict="ok"/>
            <test verdict="wrong-answer"/>
            <test verdict="wrong-answer"/>
            <test verdict="presentation-error"/>
        </tests>
    </testset>
</checker>
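Each checker test consists of three files plus an expected verdict. As a rough sketch, the testset part of the fragment above can be expanded into such records with Python's standard xml.etree.ElementTree. It assumes numbering from 1 and printf-style patterns, and it treats a mismatch between test-count and the number of test elements as an error; these are assumptions, and checker_tests is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# The testset part of the checker fragment from the example above.
CHECKER_XML = """
<checker name="std::wcmp.cpp" type="testlib">
    <source path="files/check.cpp" type="cpp.g++"/>
    <testset>
        <test-count>4</test-count>
        <input-path-pattern>checker-tests/%02d</input-path-pattern>
        <output-path-pattern>checker-tests/%02d.o</output-path-pattern>
        <answer-path-pattern>checker-tests/%02d.a</answer-path-pattern>
        <tests>
            <test verdict="ok"/>
            <test verdict="wrong-answer"/>
            <test verdict="wrong-answer"/>
            <test verdict="presentation-error"/>
        </tests>
    </testset>
</checker>
"""

def checker_tests(xml_text):
    """Return a list of (input, output, answer, expected_verdict) tuples."""
    testset = ET.fromstring(xml_text).find("testset")
    if testset is None:
        return []  # the testset element is optional
    tests = testset.findall("tests/test")
    declared = int(testset.findtext("test-count"))
    if declared != len(tests):
        # Assumption: the declared count must match the listed tests.
        raise ValueError(f"test-count is {declared}, but {len(tests)} tests are listed")
    patterns = [
        testset.findtext(name + "-path-pattern")
        for name in ("input", "output", "answer")
    ]
    return [
        tuple(p % i for p in patterns) + (t.get("verdict"),)
        for i, t in enumerate(tests, start=1)
    ]
```

For the fragment above, the first record is the input checker-tests/01, the output checker-tests/01.o and the answer checker-tests/01.a, with the expected verdict ok.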
If a non-standard checker has no tests, you will receive a traditional Polygon warning.
As you can see, all changes to problem.xml are backward compatible. Running these tests is built into the problem deployment scripts doall.bat/doall.sh. When you import a problem, you can check manually or automatically that the tests pass, since they are formally described in problem.xml and are contained in the problem package.
I believe that this tool will make life easier for problem authors and help them avoid possible errors.