By MikeMirzayanov, history, 4 years ago, translation, In English,

What are generators?

Generators are helper programs that output test. Most programming task usually has a large input (for example, an array of up to 2 * 105 elements, a tree of up to 105 vertices), so it's not possible to add all the tests manually. In these cases, generators come to the rescue.

If you are writing a generator in C++, it is recommended to use the testlib.h library.

Types of generators

There are two types of generators:

  1. Single-test generator: output exactly one test in a single run. Usually, to generate several tests, one must run the generator several times with different command line parameters. Such generators output the test to the standard output stream (to the screen).
  2. Multiple-test generator: output many tests in a single run. Such generators output tests to files (one file for each test).

An example single-test generator with testlib.h

The following generator output a pair of integers with each element from 1 to n — where n is a command line parameter passed to the generator.

#include "testlib.h"
#include <iostream>

using namespace std;

int main(int argc, char* argv[])
{
    registerGen(argc, argv, 1);
    int n = atoi(argv[1]);
    cout << rnd.next(1, n) << " ";
    cout << rnd.next(1, n) << endl;
}

Why testlib.h?

On the surface, it seems that testlib.h is not necessary to write a generator. This is not true. Almost all generators need to generate random values, and it is tempted to use rand(). However, this is a bad practice. A basic principle of writing generators is that: a generator must output the same test when compiled by any compiler on any platform if it is run in the same way (using the same command line parameters). When using rand() or C++11 classes like mt19937/uniform_int_distribution, your program will output different tests when compiled with different compilers.

The random value generator in testlib.h ensures that the same value is generated regardless of the (test) generator and platform. Besides, testlib.h has various conveniences for generating tests, for example, rnd.next("[a-z]{1,10}") will return a random word of length 1 to 10 from letters a to z.

Translator's note: There are more issues with using rand() aside from the above one. Refer to this blog for a detailed explanation about these issues.

Available method

To initialize a testlib generator, the first line of your generator must be of the form registerGen(argc, argv, 1); (where 1 is the version of the random number generator used). After that, it will be possible to use the rnd object, which will be initialized with a hash from all the command line arguments. Thus, the output of g 100 and g.exe "100" will be the same, while g 100 0 will be different.

rnd is of type random_t. That is, you can create your own generator, but this is not necessary in most of the cases.

rnd has many useful member functions. Here are some examples:

Call Return value
rnd.next(4) An equiprobable random integer from 0 to 3 (inclusive)
rnd.next(4, 100) An equiprobable random integer from 4 to 100 (inclusive)
rnd.next(10.0) An equiprobable random real number in the half-interval [0;10)
rnd.next("one|two|three") An equiprobable random word out of 'one', 'two' and 'three'
rnd.next("[1-9][0-9]{99}") An equiprobable random 100-digit number as a string
rnd.wnext(4,t) wnext is a method of obtaining an uneven distribution (with a biased expectation), the parameter t denotes the number of calls to the maximum operation for similar next calls. For example rnd.wnext(3, 1) is equivalent to max(rnd.next(3), rnd.next(3)), and rnd.wnext(4, 2) is equivalent to max(rnd.next(4), max(rnd.next(4), rnd.next(4))). If t < 0, then -t will find the minimum. If t = 0, then wnext is equivalent to next.
rnd.any(container) A random element of the container container (with random access via an iterator), for example, it works for std::vector and std::string

Also, please do not use std::random_shuffle, use the shuffle from testlib.h instead. It also takes two iterators, but works using rnd.

Translator's note: If my understanding is correct, rnd.wnext is defined as follows:

Example: generating an undirected tree

Below is the code of an undirected tree generator that takes two parameters — the number of vertices and the 'elongation' of the tree. For example:

  • For n = 10, t = 1000, a path graph (degree of all vertices are at most 2) is likely to be generated
  • For n = 10, t =  - 1000, a star graph (there's only one non-leaf vertex in the tree) is likely to be generated.
registerGen(argc, argv, 1);

int n = atoi(argv[1]);
int t = atoi(argv[2]);

vector<int> p(n);

/* setup parents for vertices 1..n-1 */
for(int i = 0; i < n; ++i)
    if (i > 0)
        p[i] = rnd.wnext(i, t);

printf("%d\n", n);

/* shuffle vertices 1..n-1 */
vector<int> perm(n);
for(int i = 0; i < n; ++i)
    perm[i] = i;
shuffle(perm.begin() + 1, perm.end());

/* put edges considering shuffled vertices */
vector<pair<int,int> > edges;
for (int i = 1; i < n; i++)
    if (rnd.next(2))
        edges.push_back(make_pair(perm[i], perm[p[i]]));
    else
        edges.push_back(make_pair(perm[p[i]], perm[i]));

/* shuffle edges */
shuffle(edges.begin(), edges.end());

for (int i = 0; i + 1 < n; i++)
    printf("%d %d\n", edges[i].first + 1, edges[i].second + 1);

How to write a multiple-test generator?

A multiple-test generator in one execution can output more than one test. Tests by such a generator are output to files. In the generator on testlib.h it is enough to write startTest(test_index) before the test output. This will re-open (freopen) the standard output stream to a file named test_index.

Please note that if you are working with the Polygon system, in this case, you need to write something like multigen a b c > {4-10} in the script (if it is assumed that starting the multiple-test generator will return tests 4, 5, 6, 7, 8, 9, and 10).

Other notes about generators

  • Strictly follow the format of the test — spaces and line breaks should be placed correctly. The test should end with a line feed. For example, if the test consists of a single number, then output it as cout << rnd.next (1, n) << endl; — with a line feed at the end.
  • If the test size is large, it is prefered to use printf instead of cout — this will improve the performance of the generator.
  • It is better to use cout to output long long, but if you want printf, then use the I64 constant (for example, printf(I64, x);).
  • Please be aware about various cases of C++ undefined behavior. For example, in the first example generator above, if the two cout commands are combined into one, the order of the rnd.next function calls is not defined.

Translator's note: about the third point, using lld constant with printf to output long long used to be problematic in the past, but is no longer an issue now.

Further examples

Further examples of generators can be found in the release notes or directly at the GitHub repository.

 
 
 
 
  • Vote: I like it
  • +79
  • Vote: I do not like it

»
4 years ago, # |
Rev. 3   Vote: I like it 0 Vote: I do not like it

thanks a lot, but this post is in Russian not English!

»
4 years ago, # |
  Vote: I like it -8 Vote: I do not like it

Translation of the page

Generators — these support programs in the programming tasks that output tests. Not always manual tests in the problem small enough to only do them. In this case, come to the aid and generators. If you write a generator in C ++, using testlib.h — a good choice.

Types of generators There are two types of generators: conventional and multigeneratory.

The first one it is run exactly one output test. Typically, to generate several tests, such a generator is necessary to run multiple times with different command-line options. Such generators output the test to the standard output (the screen). Multigeneratory a single run many tests at once withdrawn. Such generators output test files (one file — one test). An example of a simple conventional generator testlib.h Displays a pair of numbers from 1 to n , where n — the parameter passed start the generator.

include "testlib.h"

include

using namespace STD ;

int main ( int argc , char * argv []) { registerGen ( argc , argv , 1 ); int n = atoi ( argv [ 1 ]); cout << rnd . next ( 1 , n ) << "" ; cout << rnd . next ( 1 , n ) << endl ; } Why testlib.h? When inattentive glance it seems that almost testlib.h not need to write the generator. In fact this is not true. Almost every generator need to be able to get a random value, and there is a great temptation to use rand () . Do not do that. The basic principle of writing generator: generator should output the same test when compiling any compiler on any platform, if it is started the same way . When using the rand () , or C ++ classes such as 11 mt19937 / uniform_int_distribution your program will display different tests after compiling different compilers.

The random values ​​in testlib.h ensures that will generate the same regardless of the generator and the platform. In addition, there are different testlib.h convenient for the test generation, for example, rnd.next ("[AZ] {1.10}") returns a random word length from 1 to 10 of the letters from a to z .

What have testlib.h? To initialize testlib-generator, the first line of your generator should be in the form registerGen (argc, argv, 1) (where 1 — is the version used a random number generator). You can then use the object rnd , which is initialized hash of all the command line arguments. Thus, the result of an output of 100 g and g.exe "100" will be the same, and at 100 g 0 will be different.

Object rnd has type random_t , meaning you can create and own generator, but this is usually not necessary.

An object rnd has many useful member functions. Here are some examples:

Challenge What is he doing rnd.next (4) equiprobable integer random number from 0 to 3 inclusive rnd.next (4, 100); equiprobable integer random number from 4 to 100 inclusive rnd.next (10.0) equiprobable real random number in the interval [0,10) rnd.next ("one | two | three") equiprobable random one word from three one, two and three rnd.next ("[1-9] [0-9] {99}") equiprobable random 100-digit number as a string rnd.wnext (4, t) wnext — a means of obtaining nonequiprobability distribution (with offset math expectancy), the parameter t represents the number of calls the operation "maximum" for a call to next, for example rnd.wnext (3, 1) is equivalent to the max (rnd.next (3), rnd.next ( 3)), and rnd.wnext (4, 2) is equivalent to max (rnd.next (4), max (rnd.next (4), rnd.next (4))). If T  <0 , then  -  T will be found at least. If T  = 0 , the equivalent wnext next. rnd.any (container) returns a random element container container (random access iterator on), for example, works for std :: vector and std :: string In addition, do not use STD :: random_shuffle , use the shuffle of testlib. It also takes two iterators, but works using rnd .

Example: Generate an undirected tree Below is the basic code generator undirected tree, which takes two parameters — the number of vertices and the degree of its elongation . For example, when n  = 10,  T  = 1000 certainly be generated circuit, and when n  = 10,  T  = — 1000 certainly be generated chamomile (asterisk).

registerGen ( argc , argv , 1 );

int n = atoi ( argv [ 1 ]); int T = atoi ( argv [ 2 ]);

Vector P ( n );

/ * Setup parents for vertices 1..n-1 * / Forn ( i , n ) if ( i > 0 ) P [ i ] = rnd . wnext ( i , T );

printf ( "% D \ n" , n );

/ * Shuffle vertices 1..n-1 * / Vector Perm ( n ); Forn ( i , n ) Perm [ i ] = i ; shuffle ( Perm . begin () + 1 , Perm . end ()) ;

/ * Put edges considering shuffled vertices */ vector < pair < int , int > > edges ; for ( int i = 1 ; i < n ; i ++) if ( rnd . next ( 2 )) edges . push_back ( make_pair ( perm [ i ], perm [ p [ i ]])); else edges . push_back ( make_pair ( perm [ p [ i ]], perm [ i ]));

/ * * Shuffle Edges / shuffle ( Edges . begin (), Edges . end ());

for ( int i = 0 ; i + 1 < n ; i ++) printf ( "% D% D \ n" , Edges [ i ]. first + 1 , Edges [ i ]. Second + 1 ); How to write multigenerator? Multigenerator per performance can bring more than one test. Tests such generator output to a file. The generator on testlib.h enough to write a test output StartTest (test_index) . This will lead to rediscovery (freopen) standard output to a file named test_index . Note that in the Polygon in this case the script should write something like MultiGen abc> {4-10} (if it is assumed that the launch multigeneratora return tests 4, 5, 6, 7, 8, 9 and 10).

What else to pay attention? Strictly follow the format of the test — spaces, line breaks must be perfectly satisfied. The test should end with a newline. For example, if the test consists of a single number, the output it as cout << rnd.next (1, n) << endl; — with line break at the end. If the test output can be quite large, the preferred printf (not cout ) — it will improve performance. It is better to deduce long long by cout , but if you want to printf , then use the constant I64 (eg, printf (I64, x); ). You must not forget the various cases of uncertain behavior of the language C ++. For example, in the above example, the generator can not combine the two teams cout one because Then the order of function calls rnd.next not defined. More examples

»
7 months ago, # |
  Vote: I like it +37 Vote: I do not like it

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by adamant (previous revision, new revision, compare).

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by dalex (previous revision, new revision, compare).

»
7 weeks ago, # |
  Vote: I like it 0 Vote: I do not like it

The link to problems with rand() seems to be broken. If you could fix it, that would be very much appreciated.