https://github.com/daac-tools/crawdad
🦞 Rust library of natural language dictionaries using character-wise double-array tries.
- Host: GitHub
- URL: https://github.com/daac-tools/crawdad
- Owner: daac-tools
- License: apache-2.0
- Created: 2022-03-20T14:22:49.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2023-02-20T13:23:26.000Z (almost 2 years ago)
- Last Synced: 2024-11-18T22:52:52.906Z (3 months ago)
- Topics: cjk-characters, data-structures, double-array, no-std, rust, search, trie
- Language: Rust
- Homepage: https://docs.rs/crawdad
- Size: 3.77 MB
- Stars: 28
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE-APACHE
# 🦞 Crawdad: ChaRActer-Wise Double-Array Dictionary
[![Crates.io](https://img.shields.io/crates/v/crawdad)](https://crates.io/crates/crawdad)
[![Documentation](https://docs.rs/crawdad/badge.svg)](https://docs.rs/crawdad)
![Build Status](https://github.com/daac-tools/crawdad/actions/workflows/rust.yml/badge.svg)
[![Slack](https://img.shields.io/badge/join-chat-brightgreen?logo=slack)](https://join.slack.com/t/daac-tools/shared_invite/zt-1pwwqbcz4-KxL95Nam9VinpPlzUpEGyA)
## Overview
Crawdad is a library of natural language dictionaries using character-wise double-array tries.
The implementation is optimized for strings of multibyte characters,
so it enables fast text processing on languages such as Japanese or Chinese.

For example, on a large Japanese dictionary of IPADIC+Neologd, Crawdad has a better time-space tradeoff than the other Rust libraries compared.
![](./figures/neologd.svg)
The detailed experimental settings and other results are available on [Wiki](https://github.com/daac-tools/crawdad/wiki/Performance-Comparison).
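
To give a rough intuition for why a character-wise trie suits multibyte text, the sketch below counts how many transitions a key would need if trie edges were labeled with UTF-8 bytes versus Unicode characters. The helper names `bytewise_steps` and `charwise_steps` are hypothetical and are not part of Crawdad; this only illustrates the general idea and makes no claim about Crawdad's internals.

```rust
/// Transitions needed to spell `key` if each trie edge is labeled with one UTF-8 byte.
/// (Hypothetical helper for illustration; not part of Crawdad.)
fn bytewise_steps(key: &str) -> usize {
    key.len() // `str::len` returns the length in bytes
}

/// Transitions needed if each trie edge is labeled with one Unicode scalar value (`char`).
/// (Hypothetical helper for illustration; not part of Crawdad.)
fn charwise_steps(key: &str) -> usize {
    key.chars().count()
}
```

For a Japanese key such as `"世界中"`, the byte-wise count is three times the character-wise count, whereas for ASCII keys the two counts are equal.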
### What it can do
- **Key-value mapping**: Crawdad stores a set of string keys, each mapped to an arbitrary integer value.
- **Exact match**: Crawdad supports fast lookup of an input key.
- **Common prefix search**: Crawdad supports fast *common prefix search*, which can be used to enumerate all keys appearing in a text.
### Data structures
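
The three operations listed above (key-value mapping, exact match, and common prefix search) can be sketched with a naive character-wise trie. Everything in this block is an assumption made for illustration: `NaiveTrie`, its methods, and `find_all` are hypothetical names, the nodes use a `HashMap` rather than a double-array layout, and none of this is Crawdad's actual API. Here each key is mapped to its index in the input slice.

```rust
use std::collections::HashMap;

/// Naive character-wise trie (hypothetical; not Crawdad's layout or API).
#[derive(Default)]
struct NaiveTrie {
    children: HashMap<char, NaiveTrie>,
    value: Option<u32>,
}

impl NaiveTrie {
    /// Key-value mapping: maps each key to its index in `keys`.
    fn from_keys(keys: &[&str]) -> Self {
        let mut root = NaiveTrie::default();
        for (i, key) in keys.iter().enumerate() {
            let mut node = &mut root;
            for c in key.chars() {
                node = node.children.entry(c).or_default();
            }
            node.value = Some(i as u32);
        }
        root
    }

    /// Exact match: returns the value stored for `key`, if any.
    fn exact_match(&self, key: &str) -> Option<u32> {
        let mut node = self;
        for c in key.chars() {
            node = node.children.get(&c)?;
        }
        node.value
    }

    /// Common prefix search: returns `(value, length in chars)` for every
    /// stored key that is a prefix of `text`.
    fn common_prefix_search(&self, text: &[char]) -> Vec<(u32, usize)> {
        let mut hits = Vec::new();
        let mut node = self;
        for (i, c) in text.iter().enumerate() {
            match node.children.get(c) {
                Some(next) => node = next,
                None => break,
            }
            if let Some(v) = node.value {
                hits.push((v, i + 1));
            }
        }
        hits
    }
}

/// Enumerates all key occurrences in `text` as `(value, start, end)`,
/// with offsets counted in characters, by searching from every position.
fn find_all(trie: &NaiveTrie, text: &str) -> Vec<(u32, usize, usize)> {
    let chars: Vec<char> = text.chars().collect();
    let mut out = Vec::new();
    for start in 0..chars.len() {
        for (v, len) in trie.common_prefix_search(&chars[start..]) {
            out.push((v, start, start + len));
        }
    }
    out
}
```

Running a common prefix search from every starting position, as `find_all` does, is how the search can enumerate all keys appearing in a text. A double-array trie replaces the per-node hash maps with compact arrays, which this sketch does not attempt to show.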
Crawdad provides two trie implementations:
- `crawdad::Trie` is a standard trie form that often provides the fastest queries.
- `crawdad::MpTrie` is a minimal-prefix trie form that is memory-efficient for long strings.
## Slack
We have a Slack workspace for developers and users to ask questions and discuss a variety of topics.
* https://daac-tools.slack.com/
* Please get an invitation from [here](https://join.slack.com/t/daac-tools/shared_invite/zt-1pwwqbcz4-KxL95Nam9VinpPlzUpEGyA).
## License
Licensed under either of
* Apache License, Version 2.0
  ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
* MIT license
  ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.
## Acknowledgment
The initial version of this software was developed by LegalOn Technologies, Inc.,
but it is not an officially supported LegalOn Technologies product.
## Contribution
See [the guidelines](./CONTRIBUTING.md).