Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/stevencyb/gotokenizer
A golang helper that provides a regex based tokenizer.
https://github.com/stevencyb/gotokenizer
Last synced: 28 days ago
JSON representation
A golang helper that provides a regex based tokenizer.
- Host: GitHub
- URL: https://github.com/stevencyb/gotokenizer
- Owner: StevenCyb
- License: mit
- Created: 2024-04-09T09:27:35.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-04-09T11:32:49.000Z (9 months ago)
- Last Synced: 2024-06-20T14:09:08.521Z (7 months ago)
- Language: Go
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# gotokenizer
[![GitHub release badge](https://badgen.net/github/release/StevenCyb/gotokenizer/latest?label=Latest&logo=GitHub)](https://github.com/StevenCyb/gotokenizer/releases/latest)
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/StevenCyb/gotokenizer/ci-test.yml?label=Tests&logo=GitHub)
![GitHub](https://img.shields.io/github/license/StevenCyb/gotokenizer)This library provide a basic regex based tokenizer.
## Installation
To install the library, you can use the following command:
```sh
go get github.com/StevenCyb/gotokenizer
```## How to use
```go
package mainimport (
"fmt"
// Import the package
"gotokenizer/pkg/tokenizer"
)func main() {
// Define token types that you want to classify.
var (
WordType tokenizer.Type = "WORD"
SeparatorType tokenizer.Type = "SEPARATOR"
SkipType tokenizer.Type = "SKIP"
)input := "hello,world"
// Create a tokenizer
tokenizer := tokenizer.New(
// Input to parse
input,
// Token that should be skipped e.g. spaces.
SkipType,
// Specs for token types and how to identify them.
[]*tokenizer.Spec{
tokenizer.NewSpec(`^\s+`, SkipType),
tokenizer.NewSpec("^[a-z]+", WordType),
tokenizer.NewSpec("^,", SeparatorType),
})for {
// Get next token
token, err := tokenizer.GetNextToken()
if err != nil {
panic(err)
} else if token == nil {
// At the end
break
}// Print result (see image below)
fmt.Printf("\x1B[32m%s\033[0m -> \x1B[31m%s\033[0m\n", token.Type, token.Value)
}
}
```