https://github.com/jkrukowski/swift-sentencepiece
Use SentencePiece in Swift for tokenization and detokenization.
- Host: GitHub
- URL: https://github.com/jkrukowski/swift-sentencepiece
- Owner: jkrukowski
- License: MIT
- Created: 2024-11-28T13:51:07.000Z
- Default Branch: main
- Last Pushed: 2025-06-26T16:59:43.000Z
- Last Synced: 2025-10-01T08:51:39.538Z
- Topics: sentencepiece, tokenization
- Language: Swift
- Homepage:
- Size: 2.43 MB
- Stars: 15
- Watchers: 1
- Forks: 5
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# `swift-sentencepiece`
Use [SentencePiece](https://github.com/google/sentencepiece) in Swift for tokenization and detokenization. It wraps `v0.2.0` of the original library.
## Installation
Add the package to your `Package.swift` file. In the package `dependencies`, add:
```swift
dependencies: [
    .package(url: "https://github.com/jkrukowski/swift-sentencepiece", from: "0.0.3")
]
```
In the target `dependencies`, add:
```swift
dependencies: [
    .product(name: "SentencepieceTokenizer", package: "swift-sentencepiece")
]
```
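For orientation, the two fragments above might sit in a complete manifest roughly as follows. This is a minimal sketch, not the project's own manifest: the package and target name `MyApp`, the choice of an executable target, and the tools version `5.9` are all assumptions made for illustration. Depending on the platforms the library supports, a `platforms:` clause may also be needed.

```swift
// swift-tools-version:5.9
import PackageDescription

// "MyApp" is a hypothetical package/target name used only for this sketch.
let package = Package(
    name: "MyApp",
    dependencies: [
        // Package-level dependency, as shown in the first fragment above.
        .package(url: "https://github.com/jkrukowski/swift-sentencepiece", from: "0.0.3")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Target-level dependency, as shown in the second fragment above.
                .product(name: "SentencepieceTokenizer", package: "swift-sentencepiece")
            ]
        )
    ]
)
```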
## Usage
### Encoding and decoding
```swift
import SentencepieceTokenizer
// load tokenizer from file
let tokenizer = try SentencepieceTokenizer(modelPath: "/path/to/sentencepiece.model")
// encode text
let encoded = tokenizer.encode("Hello, world!")
print(encoded)
// decode tokens
let decoded = tokenizer.decode([35378, 4, 8999, 38])
print(decoded)
```
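The snippet above can be extended into a small round trip that feeds the encoder's output back into the decoder and handles a failed model load. This is a sketch under a few assumptions: the model path is a placeholder, the value returned by `encode` is assumed to be accepted directly by `decode`, and `try` is placed on both calls in case they are throwing (the snippet above does not show this; if they are not, Swift only emits a warning).

```swift
import SentencepieceTokenizer

// Placeholder path: replace with the location of your own model file.
let modelPath = "/path/to/sentencepiece.model"

do {
    // Loading the model can fail, for example when the file does not exist.
    let tokenizer = try SentencepieceTokenizer(modelPath: modelPath)

    // Encode text into token ids.
    // `try` is used defensively; it is an assumption that this call may throw.
    let ids = try tokenizer.encode("Hello, world!")
    print("ids:", ids)

    // Decode the same ids back into text
    // (assumes encode's result type matches decode's parameter type).
    let text = try tokenizer.decode(ids)
    print("text:", text)
} catch {
    print("Tokenizer error: \(error)")
}
```

Whether the decoded text matches the input exactly depends on the model's normalization rules, so treat the round trip as a sanity check and not as a guarantee.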
## Command Line Demo
To run the command line demo, use the following command:
```bash
swift run sentencepiece-cli --model-path <model-path> [--text <text>]
```
Command line options:
```bash
--model-path <model-path>
--text <text>           (default: Hello, world!)
-h, --help              Show help information.
```
## Code Formatting
This project uses [swift-format](https://github.com/swiftlang/swift-format). To format the code run:
```bash
swift format . -i -r --configuration .swift-format
```
## Acknowledgements
This project wraps the original [SentencePiece](https://github.com/google/sentencepiece) implementation.