Skip to content

Latest commit

 

History

History
191 lines (140 loc) · 7.86 KB

README.md

File metadata and controls

191 lines (140 loc) · 7.86 KB

@kamilmielnik/trie

Version License Node version

Test Coverage Prettier

Trie data structure implemented in TypeScript:

Table of contents

Installation

npm

npm install @kamilmielnik/trie --save

yarn

yarn add @kamilmielnik/trie

See full API Docs - generated by typedoc.

Good to know:

  • all objects are mutable
  • every class, interface, type, constant and function is exported
  • all exports are named (there is no default export)

There are 2 ways to use the API.

Create Trie instance and use its methods.

Example

import { Trie } from '@kamilmielnik/trie';

const trie = new Trie();
trie.add('master');
trie.add('mask');
trie.hasPrefix('man'); // false
trie.hasPrefix('mas'); // true
trie.has('mas');       // false
trie.remove('mas');    // false
trie.has('master');    // true
trie.serialize();      // "(m(a(s(t(e(r))k))))"
trie.remove('master'); // true
trie.serialize();      // "(m(a(s(k))))"

Manipulate existing instances of Node with functions.

Example

The following example works identically to the object-oriented example above.

import { add, has, hasPrefix, Node, remove, serialize } from '@kamilmielnik/trie';

const root: Node = {};

add(root, 'master');
add(root, 'mask');
hasPrefix(root, 'man'); // false
hasPrefix(root, 'mas'); // true
has(root, 'mas');       // false
remove(root, 'mas');    // false
has(root, 'master');    // true
serialize(root);        // "(m(a(s(t(e(r))k))))"
remove(root, 'master'); // true
serialize(root);        // "(m(a(s(k))))"

Examples

Load dictionary from file

import { fromArray, Node } from '@kamilmielnik/trie';
import fs from 'fs';

const fromFile = (filepath: string): Node => {
  const file = fs.readFileSync(filepath, 'utf-8');
  // Assuming file contains 1 word per line
  const words = file.split('\n').filter(Boolean);
  const node = fromArray(words);
  return node;
};

Serialize Node to a file

import { Trie } from '@kamilmielnik/trie';
import fs from 'fs';

const toFile = (filepath: string, trie: Trie): void => {
  const serialized = trie.serialize();
  fs.writeFileSync(filepath, serialized);
};

Load serialized Node from a file

import { deserialize, Node } from '@kamilmielnik/trie';
import fs from 'fs';

const fromFile = (filepath: string): Node => {
  const serialized = fs.readFileSync(filepath, 'utf-8');
  const node = deserialize(serialized);
  return node;
};

Find all words with given prefix

import { find, Node, toArray } from '@kamilmielnik/trie';

const findWordsWithPrefix = (node: Node, prefix: string): string[] => {
  const prefixNode = find(node, prefix) || {};
  const descendants = toArray(prefixNode, { prefix, sort: true, wordsOnly: true });
  const words = descendants.map(({ prefix: word }) => word);
  return words;
};

Serialization and compression

This package can be used to efficiently serialize and compress dictionaries. It reaches 54.79 compression ratio (98.17% space saving) for Polish dictionary when combined with 7-Zip at ultra compression level.

Language 🇺🇸 en-US 🇬🇧 en-GB 🇵🇱 pl-PL
Name TWL06 SOWPODS SJP.PL
Source Download Download Download
Words count 178,691 267,753 3,045,878
File size [B] 1,941,856 2,974,773 42,838,508
File size (serialized) [B] (-29.75%) 1,364,242 (-31.57%) 2,035,697 (-56.33%) 18,705,990
File size (7z) [B] (-80.46%) 379,483 (-81.04%) 563,913 (-87.58%) 5,320,397
File size (serialized + 7z) [B] (-89.94%) 195,299 (-90.40%) 285,430 (-98.17%) 781,875

Performance

add, find, has, hasPrefix, remove are very fast - $O(\log_2 n)$ (millions of operations per second).

image


deserialize, fromArray, serialize, toArray are slow - $O(n)$ (few operations per second).

image