TCR trick: Line hashes

Hey guys. I wanted to share a neat trick regarding the Team Contest Reference (TCR) that we're allowed to bring to ICPC contests, for example. I got the idea after spending many minutes debugging a Convex Hull implementation I wrote directly from my TCR in the middle of a regional contest. I knew the implementation in the TCR was correct, but I had probably made a typo somewhere in the code when I was inputting it, but it was impossible to find.

The idea is to store hashes of the code in the TCR. For example, for each code snippet in the TCR we could store its MD5 sum. Then after typing up some snippet from the TCR, we can check that the MD5 sum of the file matches the MD5 sum in the TCR. This is however not very beneficial, since we can only check whether or not our code is identical. We don't get any information about where the error was made.

Another idea is to store the MD5 sum of each line of each code snippet in the TCR. This can take up much space, but the first couple of characters from the MD5 sum should be enough. After typing up a code snippet, we can go through each line and check that the MD5 sum matches, allowing us to easily find the line where the input error was made. When the code snippet has many lines, this will take a long time, and is therefore not practical either.

A much better idea is to do the following. After the ith line of the code we store the MD5 sum of the first i lines of the code. Then the last line contains the MD5 sum of the whole file, which can be used for quickly checking if all the code is identical. If it isn't, we can use binary search to find the first input error in our code. That is, we first compute the MD5 sum of the first half of the lines of the code and check if it matches the corresponding MD5 sum in the TCR. If it doesn't, then the input error was made in the first half of the code. Otherwise, it was made in the second half of the code. We do this again until we narrow the range down to the one line where the first input error occurred.

I've used this in a contest and it was really helpful. Just being able to validate that the code is correct gives you much more confidence, but being able to quickly locate the error can be a real timesaver. I use Vim, so the workflow I use is to visually select the lines that I want to check, then execute the following in command mode:

:'<,'>w !sed 's/\s*//g' | md5sum

Notice that I ignore all whitespace when computing the MD5 sum (the same has to be done in the TCR). With this in the command history and using the 'gv' shortcut to reselect previous selection, it takes no time to locate the input errors, even in long snippets.

I have this in my TCR, which you can find here.

Comments (3)

Write comment?

niklasb

10 years ago, # |

← Rev. 2 →

+11

Nice idea. You could also save the snippet in a file and print all the prefix hashes at once:

f=test.txt; for i in $(seq 1 `wc -l < $f`); do echo -n "$i "; head -n $i $f | sed 's/\s//g' | md5sum; done

→ Reply

SuprDewd

10 years ago, # ^ |

That should save some work, cool!

frankrlt

8 years ago, # |

+12

Very useful idea to check printed code against his md5 hash. I found very useful too your TCR, thanks for sharing it!

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

#	User	Contrib.
1	maomao90	174
2	awoo	164
2	adamant	164
4	TheScrasse	160
5	nor	159
6	maroonrk	156
7	SecondThread	150
8	-is-this-fft-	149
9	pajenegod	146
10	BledDest	144

SuprDewd's blog