Pawan's blog

By Pawan, history, 10 months ago, In English,

PROBLEM:

"Given an array of strings, find the string which is made up of maximum number of other strings contained in the same array."

INPUT :

[ “rat”, ”cat”, “abc”, “xyz”, “abcxyz”, “ratcatabc”, “xyzcatratabc” ]

OUTPUT:

“xyzcatratabc”

Explaination :

“abcxyz” contains 2 other strings
“ratcatabc” contains 3 other strings,
“xyzcatratabc” contains 4 other strings"

I want to know what is the best time complexity with can solve this problem and how ?
I shall be highly thankful to codeforces community!

 
 
 
 
  • Vote: I like it  
  • +17
  • Vote: I do not like it  

»
10 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Maybe Aho-corasick

»
10 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you share the problem link?

  • »
    »
    10 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    it is an interview question which I came across!

»
10 months ago, # |
Rev. 4   Vote: I like it 0 Vote: I do not like it

I can think of a solution with string matching and DAG longest path with approximate time complexity O(n * m), where m = sum of all strings' length, and n = number of strings. (you will also need some extra memory) Firstly, you need a linear string matching algorithm, I would use KMP (https://es.wikipedia.org/wiki/Algoritmo_Knuth-Morris-Pratt). Also be familiar with DAG longest path problem (http://www.geeksforgeeks.org/find-longest-path-directed-acyclic-graph/).

For each string A do the following:

For every string B with smaller length than A:

Use KMP to find B occurrences in A. Now imagine A as a DAG (https://en.wikipedia.org/wiki/Directed_acyclic_graph) where each index is a node. So for each occurence of B in index i, store an edge E(i -> i + B.length). This is where you need extra memory.

After matching all strings B smaller than A, find longest path in A from index 0 to index A.length — 1. If there is no such path, A cannot be made up entirely of other strings. Otherwise, longest path value is the biggest number of strings that can form A. So the string that has the biggest result is your answer.

»
10 months ago, # |
  Vote: I like it 0 Vote: I do not like it

This problem can easily be solved in linear time using the Aho-Corasick Algorithm.

  1. Build a trie with all the strings as words in the data structure.
  2. Use the Aho-Corasick algorithm to build a string matching automaton.
  3. Run the automaton on each string of the dictionary and check the number of matches.

Total complexity: time and space.