Help needed with a hard problem

#	User	Rating
1	tourist	3880
2	jiangly	3669
3	ecnerwala	3654
4	Benq	3627
5	orzdevinwang	3612
6	Geothermal	3569
6	cnnfls_csy	3569
8	jqdai0815	3532
9	Radewoosh	3522
10	gyh20	3447

#	User	Contrib.
1	awoo	161
2	maomao90	160
3	adamant	156
4	maroonrk	153
5	-is-this-fft-	148
5	atcoder_official	148
5	SecondThread	148
8	Petr	147
9	nor	144
9	TheScrasse	144

Hello everyone, I am stuck at this problem. I think we can use a Trie. But I am unable to handle the special character '?'. Please help me with this problem. Thanks in advance :).

There are N words in a dictionary such that each word is of fixed length M, and consists of only lower case english alphabets. A query word Q also has length M. Query word also contains only lower case english alphabets but at some places instead of an alphabet it has "?". match_count of Q, is the count of words in the dictionary (already populated with words of same length M) and contains the same english letters (excluding the letter that can be in the position of ?) in the same position as the letters are there in the query word Q. Your are given the query word Q and you have to find the match_count.

Input Format

First line contains two space seperated integers N and M denoting the number of words in the dictionary and the length of each word
Next N line contains the words in the dictionary
Next line Q contains the number of query words
Next Q lines contain one query word each


Output
For each query word output the match_count

Constrains
1<=N<=5x10^4
1<=M<=7
1<=Q<=10^5

Time Limit - 4 secs

Sample Input:
5 3
cat
map
bat
man
pen
4
?at
ma?
?a?
??n


Sample Output
2
2
4
2

Problem Link

Comments (4)

Show archived | Write comment?

Slamur

4 years ago, # |

← Rev. 4 →

Length of words $$$M$$$ is small ($$$\le$$$ $$$7$$$), so you can do next approach:

For each word generate $$$2^M$$$ binary masks, where $$$0$$$ at the $$$i$$$-th bit means that letter is replaced by '?' and $$$1$$$ means that $$$i$$$-th letter of the word used.

For each mask you can calculate hash of transformed word — so you would have $$$O(N \cdot 2^M)$$$ hashes. Now for each query you need simply take all hashes for mask equal to mask of query (depending on positions '?') and look at the count of hashes equal to hash of the query.

For reducing amount of needed memory you can do next: group all queries by their mask and process only one mask at the time — for each mask you calculate $$$O(N)$$$ hashes of input words, after you answer on all queries with the same mask.

Total time complexity is $$$O(N \cdot 2^M + Q)$$$.

→ Reply

Romok007

4 years ago, # ^ |

Thanks a lot for your help. I coded it and it was accepted :)

ChenKaifeng

For every word in the dictionary, list all its possible ways of being replaced by ? and insert it into a Trie, than query as a usual trie.

Then the complexity will be $$$O(n\cdot 2^m+q\cdot m)$$$, I think it will pass.

Time Complexity wise its totally fine. Maybe it can cause issues with memory

Romok007's blog