https://github.com/AuracleTech/jayce
jayce is a tokenizer 🌌
- Host: GitHub
- URL: https://github.com/AuracleTech/jayce
- Owner: AuracleTech
- Created: 2022-03-26T14:48:04.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2024-03-16T16:22:43.000Z (10 months ago)
- Last Synced: 2024-09-18T01:05:22.076Z (4 months ago)
- Topics: tokenizer
- Language: Rust
- Homepage:
- Size: 251 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
Awesome Lists containing this project
- awesome-blazingly-fast - jayce - jayce is a blazing fast tokenizer 🌌 (Rust)
README
# jayce
jayce is a tokenizer 🌌
##### Example
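```rust
use jayce::{Duo, Tokenizer};
use std::sync::OnceLock;

const SOURCE: &str = "Excalibur = 5000$; // Your own language!";

// Assumes `Duo<T>` is generic over the kind type, here `&'static str`.
fn duos() -> &'static Vec<Duo<&'static str>> {
    // Built once on first use, then shared for the rest of the program.
    static DUOS: OnceLock<Vec<Duo<&'static str>>> = OnceLock::new();
    DUOS.get_or_init(|| {
        // Each Duo pairs a token kind with a regex. The final `bool` appears
        // to control whether a match is emitted as a token: the `false`
        // entries (whitespace, comments, newlines) are absent from the result.
        vec![
            Duo::new("whitespace", r"^[^\S\n]+", false),
            Duo::new("commentLine", r"^//(.*)", false),
            Duo::new("commentBlock", r"^/\*(.|\n)*?\*/", false),
            Duo::new("newline", r"^\n", false),
            Duo::new("price", r"^[0-9]+\$", true),
            Duo::new("semicolon", r"^;", true),
            Duo::new("operator", r"^=", true),
            Duo::new("name", r"^[a-zA-Z_]+", true),
        ]
    })
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut tokenizer = Tokenizer::new(SOURCE, duos());

    // `consume` yields one token at a time until the source is exhausted.
    while let Some(token) = tokenizer.consume()? {
        println!("{:?}", token);
    }

    Ok(())
}
```

##### Result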
```rust,ignore
Token { kind: "name", value: "Excalibur", pos: (1, 1) }
Token { kind: "operator", value: "=", pos: (1, 11) }
Token { kind: "price", value: "5000$", pos: (1, 13) }
Token { kind: "semicolon", value: ";", pos: (1, 18) }
```

##### Info
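`Tokenizer::consume` returns a `Result` wrapping an `Option` of `Token`:

1. `Ok(Some(token))`: a match was found
2. `Ok(None)`: the end of the source was reached
3. `Err(error)`: an error occurred

`Tokenizer::consume_all` returns a `Result` wrapping a `Vec` of `Token`:

1. `Ok(tokens)`: all matched tokens, collected in a `Vec`
2. `Err(error)`: an error occurred

##### Performances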
initialization in ~`3 nanoseconds`
tokenization of [Yuumi](https://github.com/AuracleTech/yuumi) in ~`4 milliseconds`

##### Features
- `generic-simd`
- `runtime-dispatch-simd`: enabled by default. To disable it, modify `Cargo.toml` as follows:

```toml
jayce = { version = "X.X.X", default-features = false }
```
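
##### Handling `consume` results

The return values listed under Info follow a common Rust pattern: `Result` carries failure, `Option` carries "nothing left". The sketch below illustrates that pattern with a hypothetical stand-in, `MiniTokenizer`, written with the standard library only. It is not jayce's implementation: the type, its hard-coded rules, and the `String` error type are all assumptions made for illustration.

```rust
// Hypothetical stand-in for illustration only; NOT jayce's implementation.
#[derive(Debug, PartialEq)]
struct Token<'a> {
    kind: &'static str,
    value: &'a str,
}

struct MiniTokenizer<'a> {
    source: &'a str,
    cursor: usize, // byte offset of the next unread character
}

impl<'a> MiniTokenizer<'a> {
    fn new(source: &'a str) -> Self {
        Self { source, cursor: 0 }
    }

    /// Three-way contract: Ok(Some(token)), Ok(None), or Err(error).
    fn consume(&mut self) -> Result<Option<Token<'a>>, String> {
        let source: &'a str = self.source;

        // Skip whitespace without emitting a token for it.
        let rest = &source[self.cursor..];
        let trimmed = rest.trim_start();
        self.cursor += rest.len() - trimmed.len();

        // Nothing left to read: end of source.
        let Some(first) = trimmed.chars().next() else {
            return Ok(None);
        };

        // Hard-coded rules stand in for regex-based matching.
        let (kind, len) = if first.is_ascii_alphabetic() || first == '_' {
            let len = trimmed
                .find(|c: char| !(c.is_ascii_alphabetic() || c == '_'))
                .unwrap_or(trimmed.len());
            ("name", len)
        } else if first == '=' {
            ("operator", 1)
        } else if first == ';' {
            ("semicolon", 1)
        } else {
            // No rule matches: report an error instead of skipping input.
            return Err(format!("no match at byte offset {}", self.cursor));
        };

        let value = &trimmed[..len];
        self.cursor += len;
        Ok(Some(Token { kind, value }))
    }

    /// Collects every token, stopping at the first error.
    fn consume_all(&mut self) -> Result<Vec<Token<'a>>, String> {
        let mut tokens = Vec::new();
        while let Some(token) = self.consume()? {
            tokens.push(token);
        }
        Ok(tokens)
    }
}

fn main() {
    let mut tokenizer = MiniTokenizer::new("sword = shield;");
    match tokenizer.consume_all() {
        Ok(tokens) => println!("{} tokens: {:?}", tokens.len(), tokens),
        Err(error) => eprintln!("tokenization failed: {error}"),
    }
}
```

The `while let Some(token) = self.consume()?` loop covers all three cases in one line: `?` propagates an `Err`, `Ok(None)` ends the loop, and `Ok(Some(token))` yields the next token.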