https://github.com/antalsz/text-set

Use a DAFSA (aka a DAWG) to implement a set of strings

# text-set

Use a [DAFSA](https://en.wikipedia.org/wiki/DAFSA) (directed acyclic finite
state automaton; aka a DAWG, a directed acyclic word graph) to implement a set of strings.
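To make the idea concrete, here is a small hand-built DAFSA in Python. It is only an illustration and not this project's API: the names `TRANSITIONS`, `ACCEPTING`, and `contains` are hypothetical. The set is {"tap", "taps", "top", "tops"}; because "ta" and "to" are followed by exactly the same suffixes, both lead to one shared state, so the automaton needs 5 states where a plain trie would need 8 nodes.

```python
# Hand-built DAFSA for the set {"tap", "taps", "top", "tops"}.
# States are integers; each state maps a character to the next state.
# (Illustrative names only; not taken from text-set.)
TRANSITIONS = {
    0: {"t": 1},
    1: {"a": 2, "o": 2},  # "ta" and "to" share everything that follows
    2: {"p": 3},
    3: {"s": 4},
    4: {},
}
ACCEPTING = {3, 4}  # reached after "tap"/"top" and "taps"/"tops"


def contains(word: str) -> bool:
    """Follow one transition per character; accept only in an accepting state."""
    state = 0
    for ch in word:
        nxt = TRANSITIONS[state].get(ch)
        if nxt is None:
            return False
        state = nxt
    return state in ACCEPTING
```

Membership costs one lookup per character, independent of how many strings the set holds.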

The DAFSA is built with Algorithm 1 from the paper “Incremental
Construction of Minimal Acyclic Finite-State Automata” by Jan Daciuk, Stoyan
Mihov, Bruce W. Watson, and Richard E. Watson, published in 2000 in
_Computational Linguistics_ 26(1), pp. 3-16. Available online at
.
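The following Python sketch shows the general shape of incremental construction from sorted input, which is how I understand Algorithm 1 of that paper. It is not this project's code: `Node`, `Dafsa`, `build`, and `count_states` are hypothetical names, and keying the register on Python object identity (`id`) is a simplification chosen for brevity. Words must arrive in strictly increasing order; after each word, the part of the previous word's path that the new word does not share is merged into already-registered equivalent states.

```python
class Node:
    def __init__(self):
        self.final = False
        self.edges = {}  # char -> Node

    def signature(self):
        # Two states are equivalent when they agree on finality and on every
        # outgoing (label, target) pair. Targets are already canonical when
        # this is called, so comparing them by identity is enough.
        return (self.final,
                tuple(sorted((c, id(n)) for c, n in self.edges.items())))


class Dafsa:
    def __init__(self):
        self.root = Node()
        self.register = {}   # signature -> canonical Node
        self.unchecked = []  # (parent, char, child) along the last word's path
        self.previous = None

    def insert(self, word):
        if self.previous is not None and word <= self.previous:
            raise ValueError("words must be inserted in strictly increasing order")
        prev = self.previous or ""
        # Length of the prefix shared with the previously inserted word.
        common = 0
        for a, b in zip(word, prev):
            if a != b:
                break
            common += 1
        # Everything below the shared prefix can no longer change: minimize it.
        self._minimize(common)
        node = self.unchecked[-1][2] if self.unchecked else self.root
        for ch in word[common:]:
            child = Node()
            node.edges[ch] = child
            self.unchecked.append((node, ch, child))
            node = child
        node.final = True
        self.previous = word

    def finish(self):
        # Minimize the path of the last word once no more words will come.
        self._minimize(0)

    def _minimize(self, down_to):
        while len(self.unchecked) > down_to:
            parent, ch, child = self.unchecked.pop()
            # Reuse an equivalent registered state, or register this one.
            canonical = self.register.setdefault(child.signature(), child)
            parent.edges[ch] = canonical

    def __contains__(self, word):
        node = self.root
        for ch in word:
            node = node.edges.get(ch)
            if node is None:
                return False
        return node.final


def build(words):
    dafsa = Dafsa()
    for w in sorted(set(words)):
        dafsa.insert(w)
    dafsa.finish()
    return dafsa


def count_states(dafsa):
    # Count distinct states reachable from the root.
    seen, stack = set(), [dafsa.root]
    while stack:
        node = stack.pop()
        if id(node) not in seen:
            seen.add(id(node))
            stack.extend(node.edges.values())
    return len(seen)
```

For example, `build(["tap", "taps", "top", "tops"])` yields an automaton in which `"tops" in dafsa` is true and `"to" in dafsa` is false. Sorting the input up front is what lets each state be finalized as soon as the next word diverges from it, so the automaton never grows to full trie size before being minimized.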

The ENABLE wordlist, in `ENABLE-wordlist.txt`, was downloaded from [Peter
Norvig’s page about _Natural Language Corpus Data_](https://norvig.com/ngrams/).