### Блог пользователя teja349

Автор teja349, история, 23 месяца назад, ,

Assume there are 2 patterns P1 and P2 such that P1 is a substring of P2 which is nether prefix nor suffix.Then how does a KMP type on string T work
Eg:
P1=of
P2=sofa
T=sofa
in this if we run aho corasick .will it detect string "of" or not

And also any good tutorial link on it will be apreciated
PS:I havent implemented it yet.I was studying it and I got this doubt as I read about its implementation

•
• +8
•

 » 23 месяца назад, # |   0 Auto comment: topic has been updated by teja349 (previous revision, new revision, compare).
 » 23 месяца назад, # |   +3 Yes, it will. In Aho-Corasick algorithm you create a trie nodes of which correspond to a prefix of some pattern. And you have suffix links for each node (similar to failure function in KMP). To find all the patterns which can end at a particular position in the string, you keep moving up using the suffix links as long as possible from the node you are currently at.When traversing the text "sofa", when you reach the third position, the corresponding node in the trie corresponds to "sof". But this node has a suffix link to the node for "of". So when you move up the suffix link from the former to the latter, you will detect pattern P1. :)
•  » » 23 месяца назад, # ^ | ← Rev. 2 →   +3 But,isnt it like only when there is a mismatch we go to suffix link so after reaching "sof".when we see "a" is present in continuation "sof" in trie .Then why will we see the suffix link.We only take suffix link at "sofa",am I wrong??
•  » » » 23 месяца назад, # ^ |   +3 No, you always traverse along suffix links to get all the patterns which end at that position, not only when there is a mismatch. I think what you are referring to (going to suffix link when there is a mismatch), that happens when you want to go the corresponding node in the trie after traversing the next character of the text.
•  » » » » 23 месяца назад, # ^ |   +6 I am not able to underastand what you are saying.i dont get by what u mean traversing all suffix links Can you provide ur implementation for aho corasick
•  » » 23 месяца назад, # ^ |   +3 You should build two different types of upward links: failure links (this always goes to the closest node that is also a valid prefix) and output links. Traversing output links from a node to the root should visit a subset of the nodes that are visited by doing the same with failure links. The sequence of output links should only visit those nodes in which we found an occurrence of a pattern.
 » 23 месяца назад, # | ← Rev. 2 →   +14 Hey, I think the best "tutorial" on Aho-Corasick algorithm is the original paper. Seriously, it is really well written, thorough in explanations, and has detailed pseudocode. Look for "Efficient string matching: An aid to bibliographic search". (This link works for me.)
 » 6 месяцев назад, # | ← Rev. 2 →   +10 I apologize for the bump (and I'm sure OP has figured out his issues by now). However, when I was trying to implement Aho-Corasick I had the same concern as OP, ran across this post, and couldn't find an implementation I liked. I was using the description here, but they don't explain how to actually match on the text (which is the cause of confusion for the OP), nor do they have code for jumping to leafs (which is necessary for proper complexity).So for the sake of what me from 12 hours ago would have wanted, here's my implementation (for 475D, extending upon the emaxx code: https://ideone.com/YIbEn5It contains the missing parts from emaxx that are needed for a full implementation.Aho-Corasick just by itself can be found here: https://gist.github.com/Chillee/69141ce899e9bc65291f8b120ee57b90
 » 6 месяцев назад, # |   0 Here is my implementation of aho-corasick: https://ideone.com/N3hMtKI learnt it from here: https://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf Implementation part and some optimizations is well described here: https://e-maxx-eng.appspot.com/string/aho_corasick.html.
•  » » 4 месяца назад, # ^ |   0 Hmm, I don't think your implementation works.https://ideone.com/1nQiE3
•  » » » 4 месяца назад, # ^ | ← Rev. 3 →   0 Actually I forgot to initialize the links. Added two extra lines. Here is the updated implementation : https://ideone.com/6FDgeP. Thanks for pointing it out. Hope that it'll work.
•  » » » » 4 месяца назад, # ^ |   0 One of the links is broken.Please update it.Thanks in advance.
•  » » » » 4 месяца назад, # ^ |   0 Hmm, I'm not really where it fails, but it's still not working on a aho-corasick problem. http://codeforces.com/contest/963/submission/41003199
•  » » » » » 4 месяца назад, # ^ | ← Rev. 2 →   +13 Finally realized my mistake. I have made two more changes to your code to get AC. Here's the link : https://ideone.com/pE42ue. Sorry if I caused you any trouble. I don't think there's any more mistakes.
•  » » » » » » 3 месяца назад, # ^ |   +13 Don't worry about it. The only reason I was even testing out your implementation is because I thought your method of generating leaflinks was nicer than mine :)There were a couple small things that didn't really make sense to me in your code, so here's my version (it's mostly the same, just with some conditions collapsed): https://ideone.com/v75Jg8