https://github.com/stevencyb/gotokenizer
A Golang helper that provides a regex-based tokenizer.
- Host: GitHub
- URL: https://github.com/stevencyb/gotokenizer
- Owner: StevenCyb
- License: MIT
- Created: 2024-04-09T09:27:35.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-09T11:32:49.000Z (over 1 year ago)
- Last Synced: 2024-06-20T14:09:08.521Z (about 1 year ago)
- Language: Go
- Homepage:
- Size: 19.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# gotokenizer
[Latest release](https://github.com/StevenCyb/gotokenizer/releases/latest)

This library provides a basic regex-based tokenizer.
## Installation
To install the library, you can use the following command:
```sh
go get github.com/StevenCyb/gotokenizer
```

## How to use
```go
package main

import (
	"fmt"

	// Import the tokenizer package.
	"github.com/StevenCyb/gotokenizer/pkg/tokenizer"
)

func main() {
	// Define the token types that you want to classify.
	var (
		WordType      tokenizer.Type = "WORD"
		SeparatorType tokenizer.Type = "SEPARATOR"
		SkipType      tokenizer.Type = "SKIP"
	)

	input := "hello,world"

	// Create a tokenizer.
	tok := tokenizer.New(
		// Input to parse.
		input,
		// Token type that should be skipped, e.g. whitespace.
		SkipType,
		// Specs that map a regex to each token type.
		[]*tokenizer.Spec{
			tokenizer.NewSpec(`^\s+`, SkipType),
			tokenizer.NewSpec("^[a-z]+", WordType),
			tokenizer.NewSpec("^,", SeparatorType),
		})

	for {
		// Get the next token.
		token, err := tok.GetNextToken()
		if err != nil {
			panic(err)
		} else if token == nil {
			// End of input reached.
			break
		}

		// Print the token type and value (colored when run in a terminal).
		fmt.Printf("\x1B[32m%s\033[0m -> \x1B[31m%s\033[0m\n", token.Type, token.Value)
	}
}
```
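Assuming the tokenizer applies the specs in the order given and drops tokens of the skip type (as described above), running the example on `"hello,world"` should print one line per token, roughly like this (ANSI colors omitted):

```
WORD -> hello
SEPARATOR -> ,
WORD -> world
```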