### ShivanshJ's blog

By ShivanshJ, history, 2 months ago, Many of my friends asked this interesting OA problem to me:

Given an array of strings $A$ of size $N$ and two integers $B$ and $C$.

Let $D$ be the number of strings of length $C$ that contains exactly $B$ of the strings in $A$ as substring.

Return $D$ mod $10^9+9$.

Constraints

$1 \le N \le 6$

$1 \le |A[i]| \le 50$

All the $N$ strings are distinct

$0 \le B \le N$

$1 \le C \le 50.$

Note that if a substring belonging to set $A$ occurs multiple times in my resulting string, it is counted only once.

My approach:

Let $Z$ be the size of the alphabet $(26)$.

Let $dp[i][j][k][m]$ denote the number of strings satisfying the constraints:

1) It is of length $i$.

2) The longest suffix present in it which is a proper prefix of some string belonging to set $A$ is the substring $[k...i]$ and the string whose proper prefix is the $j^{th}$ string in set $A$, in case of multiple such strings in $A$ choose the one with longest length. In case no such suffix exists, we can put a fake "empty string" at index $0$ in set $A$ (rest of the strings are numbered from $1$ to $N$) and assume that substring is $[i+1 , i]$.

3) The substrings (belonging to set $A$) which has already been occurred is denoted by mask $m$, more formally, if the $w^{th}$ string in $A$ has already occurred, then $w^{th}$ bit of $m$ is set, otherwise its not.

I'll write transitions via forward style dp, if I'm adding the $(i+1)^{th}$ character, then, it might "complete" some substrings, by this I mean, some suffix which was a proper prefix of some string in $A$ before adding character will now be a complete string belonging to set $A$. Note that all such strings will be the suffix of that longest suffix.

So, some new bits in mask $m$ will be set, All this can be calculated, since we already know the longest suffix, in fact lets precalculate, $info[i][j][k]$ which gives a tuple $(bitmask, L, idx)$. If we consider the prefix of length $j$ of $i^{th}$ string in set $A$ and add character $k$ at the end, $w^{th}$ bit in bitmask is set iff, entire $w^{th}$ string in $A$ occurs as a substring in that prefix after adding character $k$, $L$ denotes the length of the longest suffix in resulting string (after adding character $k$) that is a proper prefix of $idx^{th}$ string in set $A$, this precomputation can be done naively in $O(N*C*Z*N)$.

So, after adding $(i+1)^{th}$ character (denote it by $c$), new mask is $(m | info[j][i-k+1][c])$, new length of longest suffix is $info[j][i-k+1][c]$ so, add its contribution towards state $dp[i+1][info[j][i-k+1][c]][i+2-info[j][i-k+1][c]][(m | info[j][i-k+1][c])]$.

I think the complexity would be $O(C*N*C*2^{N}*Z)$, which might pass. However, I think I'm overkilling it and there has to be simpler solution. I'm not even sure whether my solution is correct or not. dp, Comments (10)
 » Auto comment: topic has been updated by ShivanshJ (previous revision, new revision, compare).
 » Auto comment: topic has been updated by ShivanshJ (previous revision, new revision, compare).
 » 2 months ago, # | ← Rev. 2 →   Edit: incorrect
•  » » How do you ensure that we count all strings that occur? For example, suppose that you expect the strings to appear in the order 1, 2, 3, 4, 5, 6 and are counting occurrences where four of the strings appear. How are you ensuring that string 5 never appears before any of the first four strings?
•  » » » I do not, thank you for correcting me.
 » A simpler solution (in the sense of requiring less original thought after applying a standard algorithm; the underlying idea is probably basically the same) that has the same time complexity as yours is to build an Aho-Corasick automaton on the $N$ input strings. For each node in the automaton, create a bitmask indicating which of the $N$ input strings could end in this node.Then, run a DP where your state is the number of characters used, a mask indicating which of the $N$ strings have appeared in the string you're building, and the node of the Aho-Corasick automaton corresponding to the string you've built so far. To transition, iterate over the next letter in the string, transition to the corresponding Aho-Corasick node, and add any input strings appearing in that node to your bitmask.There are $O(C2^N N |A[i]|)$ states and $26$ transitions per state, giving the same complexity as your solution.
•  » » Thanks! I don't know Aho Corasick, time for me to learn this!
•  » » It's literally the same solution though, is it not?
 » 2 months ago, # | ← Rev. 3 →   I am little confused in understanding the task. (Edit: I got it now , so we need to find A_i as substring of assumed Good string S not in other way. I thought we need to find number of substring in A_i or in total A) yup N <= 6, DP bitmask seems the good intuitive way here.
 » 2 months ago, # | ← Rev. 2 →   oh! i have a solution of O(|A|)