https://github.com/wgrape/lexer

A lexical analyzer based on DFA that is built using JS and supports multi-language extensions

dfa javascript lexer lexical-analysis lexical-analyzer

It is a lexical analyzer based on DFA, built with JS, that supports multi-language extensions. For a quick overview and hands-on experience, please check the [online website](https://wgrape.github.io/lexer/).


Documentation: Chinese / English


Contents

- [1、Background](#1)
  - [(1) Situation](#11)
  - [(2) Task](#12)
  - [(3) Solution](#13)
- [2、Features](#2)
  - [(1) Complete lexical analysis](#21)
  - [(2) Support multi-language extension](#22)
  - [(3) Provide state flow log](#23)
- [3、Get project](#3)
- [4、Usage](#4)
  - [(1) In your project](#41)
  - [(2) Web preview and testing](#42)
- [5、Contributions](#5)
  - [(1) Project Statistics](#51)
  - [(2) Source code explanation](#52)
  - [(3) Content contribution](#53)
  - [(4) Release version](#54)
  - [(5) Q&A](#55)
- [6、License](#6)

## 1、Background

### (1) Situation

Most lexical analyzers are tightly coupled to a specific language and have a relatively large code base, which makes it hard to focus on the essential principles of lexical analysis.

### (2) Task

To focus on the working principles of a lexical analyzer without being distracted by the small differences between languages, the idea of a ```lexer``` project that is completely decoupled from any language was born.

### (3) Solution

```lexer``` decouples the lexical analyzer from the language through the following two files:

- ```src/lexer.js``` is the core of the lexical analyzer, within 300 lines, including ```ISR``` and ```DFA```
- ```src/lang/{lang}-define.js``` is the language extension of the lexical analyzer. It supports different languages, such as ```src/lang/c-define.js```
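The split described above can be illustrated with a minimal sketch. It is not the project's actual code: the `langDefine` object shape, its `classify` helper, and the `tokenize` function are hypothetical names chosen for illustration. The point is that the driver loop knows nothing about any language, while all language knowledge lives in the definition object.

```js
// Hypothetical language definition: everything language-specific lives here.
// (Illustrative shape only; not the format of src/lang/{lang}-define.js.)
const langDefine = {
    // Map a character to a character class used by the transition table
    classify(ch) {
        if (/[0-9]/.test(ch)) return "digit";
        if (/[A-Za-z_]/.test(ch)) return "letter";
        if (/\s/.test(ch)) return "space";
        return "other";
    },
    // DFA transition table: state -> character class -> next state
    transitions: {
        start: { digit: "number", letter: "identifier" },
        number: { digit: "number" },
        identifier: { letter: "identifier", digit: "identifier" },
    },
    // Accepting states and the token type each one produces
    accepting: { number: "Number", identifier: "Identifier" },
};

// Language-independent driver: it only follows the table it is given.
function tokenize(input, define) {
    const tokens = [];
    let i = 0;
    while (i < input.length) {
        if (define.classify(input[i]) === "space") { i++; continue; }
        let state = "start";
        let j = i;
        // Follow transitions for as long as the table allows. Every non-start
        // state in this table is accepting, so this yields the longest match.
        while (j < input.length) {
            const next = (define.transitions[state] || {})[define.classify(input[j])];
            if (next === undefined) break;
            state = next;
            j++;
        }
        if (j === i) {
            // No transition from the start state: emit a one-character token
            tokens.push({ type: "Unknown", value: input[i] });
            i++;
        } else {
            tokens.push({ type: define.accepting[state], value: input.slice(i, j) });
            i = j;
        }
    }
    return tokens;
}
```

Under these assumptions, supporting another language means writing a new definition object and passing it to the same `tokenize` function, without touching the driver.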

## 2、Features

### (1) Complete lexical analysis

```lexer``` covers every step of lexical analysis, from reading the input character sequence to producing each ```token```, and provides 12 token types that suit most language extensions.

### (2) Support multi-language extension

```lexer``` supports different language extensions, such as ```Python``` and ```Go```. To learn how to build a language extension, see [Contributions](#5). The following extensions can be previewed online:

- C: A popular programming language. [Click here](https://wgrape.github.io/lexer/?lang=c) to see its lexical analysis
- SQL: A popular database query language. [Click here](https://wgrape.github.io/lexer/?lang=sql) to see its lexical analysis
- Goal: A goal parser problem from LeetCode. [Click here](https://wgrape.github.io/lexer/?lang=goal) to see its lexical analysis

### (3) Provide state flow log

The core mechanism of a lexical analyzer is the state flow of a ```DFA```. For this reason, ```lexer``` records a detailed state flow log to support the following:

- Debug mode
- Automatically generate ```DFA``` state flow diagram

## 3、Get project

After running ```git clone```, the project is ready to use: it has no dependencies and needs no extra installation steps.

## 4、Usage

### (1) In your project

If you need to use ```lexer``` in your own project, such as a code editor, use one of the following approaches.

#### Using NPM
```
npm install chain-lexer
```

```js
// Load the package installed from NPM
const chainLexer = require('chain-lexer');

// Analyze a C snippet with the C lexer
let lexer = chainLexer.cLexer;
let stream = "int a = 10;";
lexer.start(stream);
let parsedTokens = lexer.DFA.result.tokens;

// Analyze a SQL statement with the SQL lexer
lexer = chainLexer.sqlLexer;
stream = "select * from test where id >= 10;";
lexer.start(stream);
parsedTokens = lexer.DFA.result.tokens;
```

#### Using Script
Import the ```package/{lang}-lexer.min.js``` file, then access the ```lexer``` variable to get the lexical analyzer object, and access ```lexer.DFA.result.tokens``` to get the ```tokens```.

```js
// 1. The code that needs lexical analysis
let stream = "int a = 10;";

// 2. Start lexical analysis
lexer.start(stream);

// 3. After the lexical analysis is done, get the generated tokens
let parsedTokens = lexer.DFA.result.tokens;

// 4. Do what you want to do
parsedTokens.forEach((token) => {
    // ... ...
});
```

As described in [Provide state flow log](#23), accessing ```flowModel.result.paths``` returns the detailed state flow logs inside ```lexer```. The data format is as follows:

```js
[
    {
        state: 0,      // current state
        ch: "a",       // character read
        nextSstate: 2, // next state
        match: true,   // whether the transition matched
        end: false,    // whether this is the last character
    },
    // ... ...
]
```
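One possible way to use these logs for a state flow diagram is to turn each entry into an edge of a Graphviz DOT graph. The sketch below assumes the entry format shown above and treats entries with `match: false` as non-transitions, which is an assumption. The `pathsToDot` function and the sample data are hypothetical, and this is not necessarily how the project itself generates diagrams.

```js
// Hypothetical helper: convert state flow log entries into Graphviz DOT text.
// Assumes each entry has the fields state, ch, nextSstate and match.
function pathsToDot(paths) {
    const edges = new Set(); // a Set removes repeated transitions
    for (const p of paths) {
        if (!p.match) continue; // assumption: unmatched entries are not edges
        // JSON.stringify quotes and escapes the character for the edge label
        edges.add(`  ${p.state} -> ${p.nextSstate} [label=${JSON.stringify(p.ch)}];`);
    }
    return ["digraph DFA {", ...edges, "}"].join("\n");
}

// Sample data made up for illustration
const samplePaths = [
    { state: 0, ch: "a", nextSstate: 2, match: true, end: false },
    { state: 2, ch: "b", nextSstate: 2, match: true, end: false },
    { state: 2, ch: "b", nextSstate: 2, match: true, end: true },
];

console.log(pathsToDot(samplePaths));
```

For the sample data, this prints a graph with two edges, `0 -> 2` labeled `"a"` and `2 -> 2` labeled `"b"`, which can be rendered with any Graphviz-compatible tool.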

### (2) Web preview and testing

To preview how ```lexer``` works in real time, and to debug and test it, there is an ```index.html``` file in the root directory of this project. Open it directly in your browser and enter some code; the page automatically outputs the ```Token``` list generated by ```lexer```, as shown in the figure below.

```c
int a = 10;
int b =20;
int c = 20;

float f = 928.2332;
char b = 'b';

if(a == b){
printf("Hello, World!");
}else if(b!=c){
printf("Hello, World! Hello, World!");
}else{
printf("Hello!");
}
```

![img](https://user-images.githubusercontent.com/35942268/137584888-28a1ce09-3474-4158-8e6f-ccbdb8614930.gif)

Alternatively, check the [online website](https://wgrape.github.io/lexer/).

## 5、Contributions

### (1) Project Statistics

### (2) Source code explanation
For documentation on source code development, project design, unit testing, automated testing, development specifications, and how to build extensions for different languages, please read the [source code explanation](/doc/explain.md).

### (3) Content contribution
- Add new features
- Add more language extensions as ```/src/lang/{lang}-define.js```

### (4) Release version
The project is released with version numbers of the form ```A-B-C```. For the release log, check the [CHANGELOG](./CHANGELOG.md) or the [release record](https://github.com/WGrape/lexer/releases).

- ```A```: Major upgrade
- ```B```: Minor upgrade
- ```C```: bug fix / features / ...

### (5) Q&A
If you have any problems or questions, please [submit an issue](https://github.com/WGrape/lexer/issues/new)

## 6、License

![GitHub](https://img.shields.io/github/license/WGrape/lexer)