https://github.com/danieldk/sentencepiece
Rust binding for the sentencepiece library
- Host: GitHub
- URL: https://github.com/danieldk/sentencepiece
- Owner: danieldk
- License: other
- Created: 2020-01-22T15:30:22.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2025-04-21T16:13:50.000Z (30 days ago)
- Last Synced: 2025-04-23T04:17:12.057Z (28 days ago)
- Topics: rust, sentencepiece
- Language: Rust
- Homepage:
- Size: 282 KB
- Stars: 20
- Watchers: 2
- Forks: 7
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE-APACHE
README
# sentencepiece
This Rust crate is a binding for the
[sentencepiece](https://github.com/google/sentencepiece) unsupervised
text tokenizer. The [crate
documentation](https://docs.rs/sentencepiece/) is available
online.

## `libsentencepiece` dependency

This crate depends on the `sentencepiece` C++ library. By default,
this dependency is treated as follows:

* If `sentencepiece` could be found with `pkg-config`, the crate will
link against the library found through `pkg-config`. **Warning:**
dynamic linking only works correctly with sentencepiece 0.1.95
or later, due to
[a bug in earlier versions](https://github.com/google/sentencepiece/issues/579).
* Otherwise, the crate's build script will do a static build of the
`sentencepiece` library. This requires that `cmake` is available.

If you wish to override this behavior, the `sentencepiece-sys` crate
offers two features:

* `system`: always attempt to link to the `sentencepiece` library
found with `pkg-config`.
* `static`: always do a static build of the `sentencepiece` library
and link against that.
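
The selection rules above can be summarized in a small, self-contained
sketch. This is an illustrative model only, not the actual build script of
`sentencepiece-sys`: the names `FeatureOverride`, `LinkPlan`, and
`choose_link_plan` are hypothetical, and the two error cases (a forced
`system` link when `pkg-config` finds nothing, and a static build without
`cmake`) are assumptions about how such failures would surface.

```rust
/// Which override feature is enabled, if any (hypothetical model type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FeatureOverride {
    None,
    System,
    Static,
}

/// The resulting way of obtaining the library (hypothetical model type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LinkPlan {
    /// Link against the library located through `pkg-config`.
    SystemLibrary,
    /// Build `sentencepiece` from source and link it statically.
    StaticBuild,
}

/// Models the decision described in this section.
/// Assumption: a plan that cannot be carried out is reported as an error.
fn choose_link_plan(
    feature: FeatureOverride,
    found_by_pkg_config: bool,
    cmake_available: bool,
) -> Result<LinkPlan, &'static str> {
    // Without an override, prefer the system library when it is found.
    let want_system = match feature {
        FeatureOverride::System => true,
        FeatureOverride::Static => false,
        FeatureOverride::None => found_by_pkg_config,
    };

    if want_system {
        if found_by_pkg_config {
            Ok(LinkPlan::SystemLibrary)
        } else {
            Err("`system` requested, but pkg-config did not find sentencepiece")
        }
    } else if cmake_available {
        Ok(LinkPlan::StaticBuild)
    } else {
        Err("static build requires cmake")
    }
}
```

For example, under this model
`choose_link_plan(FeatureOverride::None, false, true)` yields
`Ok(LinkPlan::StaticBuild)`, which corresponds to the default fallback when
`pkg-config` does not find the library.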