Task. Extend TrieMatching
algorithm so that it handles correctly cases when one of the patterns is a prefix of another one.
Input Format. The first line of the input contains a string Text, the second line contains an integer n, each of the following n lines contains a pattern from Patterns = {p1, . . . , pn}.
Constraints. 1 ≤ n ≤ 100; 1 ≤ |pi| ≤ 100 for all 1 ≤ i ≤ n; pi’s contain only symbols A, C, G, T; no pi is a prefix of pj for all 1 ≤ i ̸= j ≤ n; it can be the case that pi is a prefx of pj for some i, j.
Output Format. All starting positions in Text where a string from Patterns appears as a substring in increasing order (assuming that Text is a 0-based array of symbols). If more than one pattern appears starting at position i, output i once.
Sample 1.
Input
AAA
1
AA
Output
0 1
Sample 2.
Input
ACATA
3
AT
A
AG
Output
0 2 4
List<Node> buildTrie(List<String> patterns) {
List<Node> trie = new ArrayList<>();
trie.add(new Node());
for (String pattern : patterns) {
int curr = 0;
for (int i = 0; i < pattern.length(); i++) {
char symbol = pattern.charAt(i);
Node n = trie.get(curr);
int idx = letterToIndex(symbol);
if (n.next[idx] != Node.NA) {
curr = n.next[idx];
} else {
trie.add(new Node());
n.next[idx] = trie.size() - 1;
curr = trie.size() - 1;
}
}
trie.get(curr).patternEnd = true;
}
return trie;
}
int prefixTrieMatching(String text, int start, List<Node> trie) {
for (int i = start, v = 0; i < text.length(); i++) {
char symbol = text.charAt(i);
int idx = letterToIndex(symbol);
v = trie.get(v).next[idx];
if (v == Node.NA)
return -1;
else if (trie.get(v).patternEnd)
return start;
}
return -1;
}
List <Integer> solve (String text, int n, List <String> patterns) {
List <Integer> result = new ArrayList <Integer> ();
List<Node> trie = buildTrie(patterns);
for (int i = 0; i < text.length(); i++) {
int pos = prefixTrieMatching(text, i, trie);
if (pos >= 0)
result.add(pos);
}
return result;
}
Compile with javac TrieMatchingExtended.java
and run with java TrieMatchingExtended
.
This is only for discussion and communication. Please don't use this for submission of assignments.