How to represent a trie in a 2D array?

№	Пользователь	Рейтинг
1	ecnerwala	3648
2	Benq	3580
3	orzdevinwang	3570
4	cnnfls_csy	3569
5	Geothermal	3568
6	tourist	3565
7	maroonrk	3530
8	Radewoosh	3520
9	Um_nik	3481
10	jiangly	3467

№	Пользователь	Вклад
1	maomao90	174
2	awoo	164
2	adamant	164
4	TheScrasse	159
4	nor	159
6	maroonrk	156
7	-is-this-fft-	150
8	SecondThread	147
9	orz	146
10	pajenegod	145

I have seen some people's code to represent a trie in a 2D array. For example, this is my old code to solve Autocomplete. However, I forgot how it works!

Can someone explain how to code a 2D array that represents a trie? I know that one dimension of the array represents the alphabet, and that the values inside the trie array are the node numbers, where the first letter you add is node number 1. Still, I'm confused where to insert values into the trie array, and how strings with the same prefixes would be distinguished in the tire.

Thanks.

#include <bits/stdc++.h>
#define rd(s) freopen(s, "r", stdin);
#define wt(s) freopen(s, "w", stdout);
using namespace std;

const int MAX=1e6+2, ALF=26+2;
int t, trie[MAX][ALF];

int main()
{
    //rd("test.txt");

    scanf("%d", &t);
    for (int i=1; i<=t; ++i) {
        int ans=0, n, numm1=1;
        for (int _=0; _<MAX; ++_) for (int __=0; __<ALF; ++__) trie[_][__] = 0;
        scanf("%d", &n);
        for (int j=0; j<n; ++j) {
            int node=0, idx=0;
            string word; cin>>word;
            while (trie[node][word[idx]-'a'] && idx < word.length()) {
                node=trie[node][word[idx]-'a'];
                ++ans; ++idx;
            }
            if (idx != word.length()) ++ans;
            //printf("ans is %d\n", ans);
            for (; idx<word.length(); ++idx) {
                trie[node][word[idx]-'a'] = numm1;
                //printf("new path from %d to %d via %c\n", node, numm1, word[idx]);
                node=numm1;
                ++numm1;
            }
        }
        printf("Case #%d: %d\n", i, ans);
    }
}

Комментарии (7)

Показать архивные | Написать комментарий?

arknave

7 лет назад, # |

Each row in your array represents a node in the trie structure. Node 0 is the root.

When you want to insert a new string, begin at node 0. Let c be the first character of your string, and see if trie[0][c] is valid or not. If it is not valid "allocate" a new node by incrementing the size of your trie and set trie[0][c] to the newly allocated node.

Here's a simple example. I'm using -1 to signify that there is no node there.

Initially:

0: -1 -1 -1 ...
1: -1 -1 -1 ...
2: -1 -1 -1 ...
3: -1 -1 -1 ...

Size of the trie is 1 (because of the root node). Now let's insert string "ABC". Then since trie[0]['A'] is -1, we need to allocate a new node and set trie[0]['A'] to it. Then our structure is

0:  1 -1 -1 ...
1: -1 -1 -1 ...
2: -1 -1 -1 ...
3: -1 -1 -1 ...

Now we are currently at node 1 and need to insert "BC". Following this procedure twice, our trie looks like

0:  1 -1 -1 ...
1: -1  2 -1 ...
2: -1 -1  3 ...
3: -1 -1 -1 ...

→ Ответить

wag

7 лет назад, # ^ |

← Rev. 2 →

By the way, check out my new standup

trentinitaco

8 месяцев назад, # ^ |

What if we want to store 'AB' and 'BB', on the second row don't we need two separation versions of 'B'?

XGptw

i just have a small doubt is it true that if we implement a trie in this way using predefined sizes for arrays then it is more faster than pointer oriented code ... .

bhikkhu

Lets not make it complicated. First, think of how you can represent a tree using an array. If you can do that you can do for trie as well since a trie is also a tree.

Now suppose tree is a binary tree. Normally you represent a tree as Node {int val; Node *l, *r} Where l, r are pointers to the children. Now suppose you don't do dynamic memory allocation and replace l, r with indices of Nodes in array. First you can check whether l points to a children. If not, then you will use an unused Node of our predefined array. Similarly for r.

For a trie, suppose a string has length L. Then in the worst case, for each suffix we need to create a linked list. So total size would be L+L-1+....+1=O(L^2). But if alphabet is just lowercase(a-z) then, some suffixes will have same starting characters and memory will be less than L^2.

shadycoderr

4 года назад, # |

how would you know we have reached at the end of the word?

For ex: we need to store 2 strings 1) "abc" 2) "ab"

mujtaba1747

4 года назад, # ^ |

We'd have to maintain another array of type : bool end[MAX][ALF] for that.

Блог пользователя maximaxi