https://github.com/stagas/tokenizer-next

iterator based tokenizer for writing parsers
iterable iterator lexer parser parsing regexp regexp-match tokenizer

Host: GitHub
URL: https://github.com/stagas/tokenizer-next
Owner: stagas
License: mit
Created: 2021-11-01T19:07:39.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2022-08-04T05:48:41.000Z (over 2 years ago)
Last Synced: 2024-12-11T11:01:02.248Z (about 1 month ago)
Topics: iterable, iterator, lexer, parser, parsing, regexp, regexp-match, tokenizer
Language: TypeScript
Homepage:
Size: 940 KB
Stars: 2
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

        


# tokenizer-next

iterator based tokenizer for writing parsers

Install with one of:

- `npm i tokenizer-next`
- `pnpm add tokenizer-next`
- `yarn add tokenizer-next`

## API

### createTokenizer(regexps)

Create a `TokenizerFactory` for the given RegExps. Source: `src/index.ts#L19`

To capture, RegExps must use a [named group](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions/Groups_and_Ranges#using_named_groups).

```ts
import { createTokenizer } from 'tokenizer-next'

const tokenize = createTokenizer(
  /(?<ident>[a-z]+)/, // named groups determine token `group`
  /(?<number>[0-9]+)/
)
```

- `regexps` (`RegExp[]`) – RegExps to match.
- Returns `TokenizerFactory`.
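
To show why each RegExp needs a named group, here is a minimal, self-contained sketch of one way several patterns could be merged and the matching group name recovered. It does not use tokenizer-next; `sketchTokenize` and `SketchMatch` are hypothetical names, the sketch assumes each RegExp has exactly one uniquely named group, and the library's actual implementation may differ.

```ts
// Hypothetical sketch, not the library's implementation.
interface SketchMatch {
  group: string
  value: string
  index: number
}

function sketchTokenize(input: string, ...regexps: RegExp[]): SketchMatch[] {
  // Join all patterns into one alternation; matchAll requires the `g` flag.
  const combined = new RegExp(regexps.map(r => r.source).join('|'), 'g')
  const tokens: SketchMatch[] = []
  for (const match of input.matchAll(combined)) {
    // Only the named group of the alternative that matched is defined.
    for (const [group, value] of Object.entries(match.groups ?? {})) {
      if (value !== undefined) {
        tokens.push({ group, value, index: match.index ?? 0 })
      }
    }
  }
  return tokens
}

const sketchTokens = sketchTokenize(
  'hello 123',
  /(?<ident>[a-z]+)/,
  /(?<number>[0-9]+)/
)
console.log(sketchTokens)
// => [
//   {group: 'ident', value: 'hello', index: 0},
//   {group: 'number', value: '123', index: 6}
// ]
```

Text that matches none of the patterns (the space in this input) is simply skipped in this sketch.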

### Token

Token interface. Source: `src/match-to-token/dist/types/token.d.ts#L5`. The members below are declared in the same file; the line is given after each name.

- `constructor(value, group, source)` (L23): `new Token()` returns `Token`
  - `value`: `string`
  - `group`: `string`
  - `source`: `RegExpMatchArrayLike`
- `group` (L10): `string` – The group it matched.
- `source` (L22): `RegExpMatchArrayLike` – The input string.
- `index` (L18)
- `value` (L14)
- `as(value, group)` (L25) => `Token`
  - `value`: `string`
  - `group`: `string`
- `is(group, value)` (L24) => `boolean`
  - `group`: `string`
  - `value`: `string`
- `create(value, group, source)` (L6) => `Token`
  - `value`: `string`
  - `group`: `string`
  - `source`: `RegExpMatchArrayLike`

### RegExpMatchArrayLike

Source: `src/match-to-token/dist/types/index.d.ts#L2`

- `index` (L3): `number`
- `input` (L4): `string`

### Token (type alias)

`MatchToken & string`. Source: `src/match-to-token/dist/types/index.d.ts#L6`
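
The reference above lists `is(group, value)` and `as(value, group)` on `Token` without describing their behavior. The sketch below shows one plausible reading, useful when writing a parser: `is` as a predicate on group and value, `as` as a way to derive a re-labelled token. `SketchToken` is a hypothetical stand-in, the optional parameters are an assumption, and the real class in match-to-token may behave differently.

```ts
// Hypothetical stand-in for Token; the semantics of is/as are assumed.
class SketchToken {
  readonly value: string
  readonly group: string
  readonly index: number

  constructor(value: string, group: string, index: number) {
    this.value = value
    this.group = group
    this.index = index
  }

  // Assumed: true when the group matches and, if given, the value matches too.
  is(group: string, value?: string): boolean {
    return this.group === group && (value === undefined || this.value === value)
  }

  // Assumed: returns a new token at the same index with a new value and group.
  as(value: string, group: string = this.group): SketchToken {
    return new SketchToken(value, group, this.index)
  }
}

const plus = new SketchToken('+', 'op', 4)
console.log(plus.is('op')) // => true
console.log(plus.is('op', '-')) // => false

const unaryPlus = plus.as('+', 'unary')
console.log(unaryPlus.group, unaryPlus.index) // => unary 4
```

In this sketch `as` leaves the original token untouched, so a parser can reinterpret a token without losing what the tokenizer produced.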
        
### TokenizerCallableIterable

Can be called to return the next [Token](https://github.com/stagas/match-to-token#token), or can be used as an Iterable in for-of and spread operations. Source: `src/index.ts#L74`

- Type: `(() => Token) & Iterable<Token>`

### TokenizerFactory

Source: `src/index.ts#L67`

- `(input)` – Create a `TokenizerCallableIterable` for the given input string.



```ts
// using next()
const next = tokenize('hello 123')
console.log(next()) // => {group: 'ident', value: 'hello', index: 0}
console.log(next()) // => {group: 'number', value: '123', index: 6}
console.log(next()) // => undefined

// using for of
for (const token of tokenize('hello 123')) {
  console.log(token)
  // => {group: 'ident', value: 'hello', index: 0}
  // => {group: 'number', value: '123', index: 6}
}

// using spread
const tokens = [...tokenize('hello 123')]
console.log(tokens)
// => [
//   {group: 'ident', value: 'hello', index: 0},
//   {group: 'number', value: '123', index: 6}
// ]
```



- `input` (`string`) – The string to tokenize.
- Returns `TokenizerCallableIterable`.
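
The "callable iterable" shape, a function that is also an `Iterable`, can be sketched in a few lines. This is an illustration of the pattern only: `CallableIterable` and `makeCallableIterable` are hypothetical names, and sharing one cursor between calls and iteration is a choice made for this sketch, not something the reference above states about tokenizer-next.

```ts
// Hypothetical sketch of the pattern; names are illustrative only.
type CallableIterable<T> = (() => T | undefined) & Iterable<T>

function makeCallableIterable<T>(items: readonly T[]): CallableIterable<T> {
  let position = 0
  // Calling the function returns the next item, or undefined when exhausted.
  const next = (): T | undefined => items[position++]
  // Attaching Symbol.iterator lets the same function work in for-of and spread.
  return Object.assign(next, {
    *[Symbol.iterator](): Generator<T> {
      while (position < items.length) yield items[position++] as T
    },
  })
}

const nextWord = makeCallableIterable(['let', 'x', '=', '1'])
const firstWord = nextWord() // 'let'
const restWords = [...nextWord] // ['x', '=', '1']
const afterEnd = nextWord() // undefined
console.log(firstWord, restWords, afterEnd)
```

A hand-written parser can mix both styles this way: pull one token with a call to decide what to parse, then iterate over the remainder.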

    

## Credits

- [match-to-token](https://npmjs.org/package/match-to-token) by [stagas](https://github.com/stagas) – transform a RegExp named group match to a more useful object

## Contributing

[Fork](https://github.com/stagas/tokenizer-next/fork) or [edit](https://github.dev/stagas/tokenizer-next) and submit a PR.

All contributions are welcome!

## License

MIT © 2022 [stagas](https://github.com/stagas)