https://github.com/wintercore/staubsauger
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/wintercore/staubsauger
- Owner: WinterCore
- Created: 2024-02-28T13:38:55.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-15T17:59:18.000Z (almost 2 years ago)
- Last Synced: 2025-02-14T20:31:20.964Z (over 1 year ago)
- Language: Rust
- Size: 3.91 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Staubsauger
### References
- https://medium.com/filament-ai/making-text-search-learn-from-feedback-4fe210fd87b0
- https://medium.com/@oduguwadamilola40/byte-pair-encoding-the-tokenization-algorithm-powering-large-language-models-5055fbdc0153
- https://medium.com/@abhishekjainindore24/all-about-tokenization-stop-words-stemming-and-lemmatization-in-nlp-1620ffaf0f87
- https://medium.com/@hsinhungw/understanding-byte-pair-encoding-fd196ebfe93f