Anti Hashing. - Codeforces

→ Pay attention

Before contest
Codeforces Round 940 (Div. 2) and CodeCraft-23
3 days
Register now »

*has extra registration

→ Top rated

#	User	Rating
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

Countries | Cities | Organizations

View all →

→ Top contributors

#	User	Contrib.
1	maomao90	174
2	adamant	164
2	awoo	164
4	TheScrasse	160
5	nor	159
6	maroonrk	156
7	-is-this-fft-	150
7	SecondThread	150
9	orz	146
10	pajenegod	145

View all →

→ Find user

→ Recent actions

Detailed →

Rajan_sust's blog

Anti Hashing.

By Rajan_sust, history, 6 years ago, In English

The problem was set in acm icpc preliminary contest 2017 in Dhaka site. Problem Link : E.Anti Hash

Problem is : you will given a string S of length N consisting of lowercase letters (a-z) only.Also given a base B and mod-value M for doing polynomial hashing. Note : B and M are both prime.

Your task is to find another string T, satisfying all of the following constraints: Length of T is exactly N. T consists of only lowercase letters (a-z). T and S have the same hash value that means, collision happens. For hashing in both case you have to use B and M.

Any idea? Thanks in advance.

Rajan_sust
6 years ago
3

Comments (2)

Show archived | Write comment?

Rezwan.Arefin01

6 years ago, # |

For the subtask with M ≤ 10⁶, you can have atmost 10⁶ different hashes. So, if you generate more than 10⁶ random strings, there is high probability that you'll hit one with same hash as the given string. However, you can't generate string naively. You can start with a string and try changing letters randomly.

For M < 2³¹ I don't know what to do.

→ Reply

Z0RR0

6 years ago, # |

← Rev. 2 →

+11

Generate 2 sets of random strings of size ceil(N/2) and floor(N/2). Lets denote them as set A and set B. Let the size of each set is X. You can generate random strings by changing just a single random character of previous string as this will change the hash value randomly. Now generate their hashes. Now on, we only will deal with the hash values, forget the strings.

For each hash value of set B, we can say we need a desired hash from set A such that we can make the final hash value. Say hash value of an element of set B is H2. So we need another hash value from set A, H1 such that this equation satisfies ( H1 + (H2 << ceil(N/2)) ) % M = final_hash. Here shift operator denotes the polynomial shifting, hope it is understandable.

Now, for the first H2 of set B, we will not be able to find the desired H1 with a probability (1-X/M). Now, as our desired hash value is not found in set A for the first element, for the second element we can assume it's desired hash value will not collide with the first one's. So, now the sample space is reduced by one. So, this will also not match with a probability (1-X/(M-1)). For the third element the probability is (1-X/(M-2)) and so on. This is the birthday paradox. With a good enough size of set A and set B the probability of not finding desired pair becomes 0. Like when X = 1e4, chance of not finding is more than 95%, when X = 1e5, chance of not finding is less than 1%. But when you take X = 1e6, chance of not finding is way way close to 0.

I might miss some calculation here, but this is the idea.

→ Reply