New feature of Polygon: tests for checkers and validators

Good afternoon, Codeforces!

Today I'll tell you about a new feature of the Polygon system, which is used to prepare all Codeforces rounds. Of course, the system is open to any user: many contests for other competitions and training camps are prepared there as well.

Besides the author's solution, the tests and the statement, a problem has two more key elements, both of them programs: the validator and the checker.

The validator is a program that reads a test and reports whether it satisfies the constraints stated in the problem. Validators must be absolutely formal: a validator accepts a test if and only if the test meets every condition of the problem and can safely be added to the test set. You can easily write validators using the testlib.h library. Authors sometimes neglect validators (this never happens in Codeforces contests), which threatens the validity of the tests. Since Codeforces contests have hacks, a correct validator matters even more: naturally, every hack is validated before it reaches a contestant's solution. Most problems have relatively simple validators, but when a problem imposes additional conditions (for example, that a solution exists for the test), the validator becomes much more complex.
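
To make "absolutely formal" concrete, here is a minimal sketch written in Python rather than with testlib.h. It assumes a hypothetical problem whose input is a single integer n (1 ≤ n ≤ 100) on one line; the function name, the bounds and the format are all invented for illustration, and a real Polygon validator would be a C++ program built on testlib.h.

```python
import re

# Hypothetical problem: the input is one integer n (1 <= n <= 100)
# on a single line, terminated by exactly one newline, nothing else.
MIN_N, MAX_N = 1, 100


def is_valid_test(data: str) -> bool:
    """Return True only if `data` matches the input format exactly."""
    # No sign, no leading zeros, no surrounding spaces, one trailing "\n".
    match = re.fullmatch(r"(0|[1-9][0-9]*)\n", data)
    if match is None:
        return False
    return MIN_N <= int(match.group(1)) <= MAX_N
```

With this sketch `is_valid_test("5\n")` is accepted, while `"5"` (no end of line), `" 5\n"` (extra space), `"05\n"` (leading zero) and `"101\n"` (out of range) are all rejected. Rejecting everything that is not exactly the stated format is the point: a test that merely "looks fine" must not pass.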

The checker is a program that receives the test, the output of the participant's code and the output of the jury's code, and determines whether the participant's output is correct. Unfortunately, errors in checkers often lead to serious consequences. Not every problem lets you simply compare the two outputs. For example, in problem 234H - Merging Two Decks the checker uses a Cartesian tree. If the problem statement requires a certificate, it is a good idea to structure the checker around readAnswer(ans)/readAnswer(ouf), that is, one routine that reads and verifies an answer and is applied to both the jury's and the participant's output. You can easily write checkers using the testlib.h library.
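
The readAnswer(ans)/readAnswer(ouf) idea can be sketched as follows. This is an illustration in Python, not testlib.h code, for a hypothetical problem: given integers a_1..a_n and a target s, print k followed by k distinct 1-based indices whose elements sum to s, or -1 if no such subset exists. The problem, the names `read_answer` and `check`, and the use of a "fail" verdict for a faulty jury output are assumptions made for this example.

```python
class InvalidOutput(Exception):
    """Raised when an output is malformed or its certificate is wrong."""


def read_answer(text, a, s):
    """Parse one output and verify its certificate against the test.

    Returns True if a valid subset was printed, False if the output
    claims that no subset exists ("-1").
    """
    tokens = text.split()
    if tokens == ["-1"]:
        return False
    try:
        nums = [int(t) for t in tokens]
    except ValueError:
        raise InvalidOutput("non-integer token")
    if not nums or nums[0] != len(nums) - 1:
        raise InvalidOutput("wrong number of indices")
    idx = nums[1:]
    if len(set(idx)) != len(idx) or any(i < 1 or i > len(a) for i in idx):
        raise InvalidOutput("indices must be distinct and in range")
    if sum(a[i - 1] for i in idx) != s:
        raise InvalidOutput("subset does not sum to s")
    return True


def check(a, s, participant_out, jury_out):
    """Return 'ok', 'wrong-answer', or 'fail' (the jury output is at fault)."""
    try:
        jury_found = read_answer(jury_out, a, s)
    except InvalidOutput:
        return "fail"
    try:
        participant_found = read_answer(participant_out, a, s)
    except InvalidOutput:
        return "wrong-answer"
    if jury_found and not participant_found:
        return "wrong-answer"
    if participant_found and not jury_found:
        return "fail"  # the participant found what the jury missed
    return "ok"
```

Because the same routine verifies both outputs, the checker never trusts the jury's certificate blindly: for a = [3, 5, 7] and s = 10, `check(a, 10, "2 1 3", "-1")` reports "fail" instead of silently rejecting a correct participant.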

These programs are usually tested either manually from the command line or indirectly, by adding wrong solutions and temporarily adding invalid tests. This way of testing is inconvenient, and the tests are not saved; in fact, authors often skip testing checkers and validators altogether. When two authors cooperate, a co-author cannot view the tests on which the validator or checker was tried, or rerun them after the validator or checker has been corrected.

The updated version of Polygon has improved immensely! We’ve made a convenient means for testing the validator and the checker.

Tests for the test validator

This tool is now available in Polygon. The tests are displayed on the Validator page. You can easily add several tests at once, separated by a special marker. For each added test, indicate the expected validator verdict (valid or invalid). You can also run these tests to review the results. Validator tests are a full-fledged part of the problem and are included in the problem package. For this reason we've updated the problem.xml format; here is an example of the more detailed validator description (the binary and testset elements are optional):


<validator>
    <source path="files/v.cpp" type="cpp.g++"/>
    <binary path="files/v.exe" type="exe.win32"/>
    <testset>
        <test-count>2</test-count>
        <input-path-pattern>validator-tests/%02d</input-path-pattern>
        <tests>
            <test verdict="valid"/>
            <test verdict="invalid"/>
        </tests>
    </testset>
</validator>
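
As a rough illustration of how such a description could be consumed by a script, the sketch below expands the testset into (input path, expected verdict) pairs using Python's standard xml.etree.ElementTree. The function name is hypothetical, and the assumption that tests are numbered from 1 in document order is mine, not something stated in the format description.

```python
import xml.etree.ElementTree as ET

VALIDATOR_XML = """
<validator>
    <source path="files/v.cpp" type="cpp.g++"/>
    <testset>
        <test-count>2</test-count>
        <input-path-pattern>validator-tests/%02d</input-path-pattern>
        <tests>
            <test verdict="valid"/>
            <test verdict="invalid"/>
        </tests>
    </testset>
</validator>
"""


def list_validator_tests(xml_text):
    """Return [(input_path, expected_verdict), ...]; [] if no testset."""
    root = ET.fromstring(xml_text.strip())
    testset = root.find("testset")
    if testset is None:  # the testset element is optional
        return []
    pattern = testset.findtext("input-path-pattern")
    tests = testset.findall("tests/test")
    # Assumption: tests are numbered from 1 in document order.
    return [(pattern % i, t.get("verdict")) for i, t in enumerate(tests, 1)]
```

For the sample above, `list_validator_tests(VALIDATOR_XML)` yields the paths `validator-tests/01` and `validator-tests/02` paired with the verdicts `valid` and `invalid`.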

If the validator has no tests, you will receive the traditional Polygon warning.

Tests for Checker

The situation is similar. Creating such tests is a little more complicated, since you have to provide not only the input but also the output (standing in for the participant's output) and the answer (standing in for the output of the author's solution). You do not have to create tests for standard checkers. Here is an example of the updated checker description from problem.xml (the binary, copy and testset elements are optional):


<checker name="std::wcmp.cpp" type="testlib">
    <source path="files/check.cpp" type="cpp.g++"/>
    <binary path="check.exe" type="exe.win32"/>
    <copy path="check.cpp" type="cpp.g++"/>
    <testset>
        <test-count>4</test-count>
        <input-path-pattern>checker-tests/%02d</input-path-pattern>
        <output-path-pattern>checker-tests/%02d.o</output-path-pattern>
        <answer-path-pattern>checker-tests/%02d.a</answer-path-pattern>
        <tests>
            <test verdict="ok"/>
            <test verdict="wrong-answer"/>
            <test verdict="wrong-answer"/>
            <test verdict="presentation-error"/>
        </tests>
    </testset>
</checker>
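
A checker test consists of three files, so a script reading this description has to expand three path patterns per test. The sketch below does that with Python's standard xml.etree.ElementTree and also compares test-count with the number of listed tests. The function name, the numbering from 1 and the consistency check are assumptions made for illustration, not documented Polygon behaviour.

```python
import xml.etree.ElementTree as ET

CHECKER_XML = """
<checker name="std::wcmp.cpp" type="testlib">
    <source path="files/check.cpp" type="cpp.g++"/>
    <testset>
        <test-count>4</test-count>
        <input-path-pattern>checker-tests/%02d</input-path-pattern>
        <output-path-pattern>checker-tests/%02d.o</output-path-pattern>
        <answer-path-pattern>checker-tests/%02d.a</answer-path-pattern>
        <tests>
            <test verdict="ok"/>
            <test verdict="wrong-answer"/>
            <test verdict="wrong-answer"/>
            <test verdict="presentation-error"/>
        </tests>
    </testset>
</checker>
"""


def list_checker_tests(xml_text):
    """Return one dict per test with input/output/answer paths and verdict."""
    root = ET.fromstring(xml_text.strip())
    testset = root.find("testset")
    if testset is None:  # the testset element is optional
        return []
    patterns = {
        "input": testset.findtext("input-path-pattern"),
        "output": testset.findtext("output-path-pattern"),
        "answer": testset.findtext("answer-path-pattern"),
    }
    tests = testset.findall("tests/test")
    declared = int(testset.findtext("test-count"))
    if declared != len(tests):
        raise ValueError(f"test-count is {declared}, found {len(tests)} tests")
    result = []
    # Assumption: tests are numbered from 1 in document order.
    for i, test in enumerate(tests, 1):
        entry = {kind: pattern % i for kind, pattern in patterns.items()}
        entry["verdict"] = test.get("verdict")
        result.append(entry)
    return result
```

For the sample above, the first entry is `{"input": "checker-tests/01", "output": "checker-tests/01.o", "answer": "checker-tests/01.a", "verdict": "ok"}`.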

If a non-standard checker has no tests, you will receive the traditional Polygon warning.

Conclusion

As you can see, all changes in problem.xml are backwards compatible. Running these tests is built into the problem deployment scripts doall.bat/doall.sh. When you import a problem, you can check manually or automatically whether the tests pass, since they are formally described in problem.xml and are contained in the problem package.

I believe that this tool will make life easier for problem authors and help them avoid likely errors.

Sincerely, Ivan.

Comments (7)

xa.mohsen

11 years ago

-19

Great Work :)

ahmed_aly

+12

I have something to say about Polygon.

In 2011 I was a judge and problem setter in the ACM ICPC Arab regional contest, and we didn't use any specific system to prepare the problems, and it was really painful.

In 2012 I was the chief judge, and we used Polygon, which made our life much easier with more features and safety checks, and I want to thank everyone who worked on this amazing system.

In 2013 I'll be the chief judge again, and definitely I'll use Polygon.

I also have a suggestion: since Polygon is being used to prepare huge contests, I think some people (like me) might be worried about security issues, so it would be a good idea to support secure browsing (https).

I_love_natalia

11 years ago

-16

What if I told you that Polygon has much worse security issues than using http instead of https?

I think it's not a good idea to say in public what these issues are, but I think you should report them to someone (if you haven't already).

MikeMirzayanov

Please write me a private message with the details.

Jacob

7 years ago

+30

For a newly created problem it appears that validator tests require the checker to run. To me it seems quite strange, because the validator supposedly only runs on the test inputs.

7 years ago

Thanks, I added an issue: https://github.com/Codeforces/polygon-issue-tracking/issues/157 It will be resolved soon.

Fefer_Ivan's blog
