Regular Expression problem -- better solution?

Revision en1, by mouse_wireless, 2018-01-09 21:29:38

I've recently come across the following problem in a real life scenario:

You are given a regular expression with two special characters: * matches any sequence of characters (including the empty sequence) and ? matches any one character.

For example, a?b matches acb, but it does not match abc, accb or ab. a*?b however does match accb. abc*f?h matches abcdefgh.

You have to write a program that checks if a pattern and a string match.

Obviously, we can write an O(n^2) algorithm which looks like this:

bool matches(const char *pat, const char *s) {
  if (*pat == 0)
    return (*s == 0);
  if ((*pat == '?' || *pat == *s) && (*s != 0))
    return matches(pat + 1, s + 1);
  if (*pat == '*')
    return ((*s != 0) && matches(pat, s + 1)) || matches(pat + 1, s);
  return false;
}

For the scenario I have encountered, O(n^2) is more than enough, however I've been wondering for a while if a linear time algorithm (or anyway something better than quadratic) exists. I've been giving it some thought and can't seem to come up with anything. It looks like so it seems like some linear algorithm should exist, right?

Tags string, regex, regexp, algorithm complexity

History

 
 
 
 
Revisions
 
 
  Rev. Lang. By When Δ Comment
en1 English mouse_wireless 2018-01-09 21:29:38 1225 Initial revision (published)