{"id":15567569,"url":"https://github.com/wgrape/lexer","last_synced_at":"2025-04-05T08:06:17.616Z","repository":{"id":43575881,"uuid":"399891437","full_name":"WGrape/lexer","owner":"WGrape","description":"A lexical analyzer based on DFA that is built using JS and supports multi-language extensions / 一个基于DFA的支持多语言扩展的JS版开源词法分析器","archived":false,"fork":false,"pushed_at":"2023-03-19T08:44:38.000Z","size":6727,"stargazers_count":341,"open_issues_count":0,"forks_count":23,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-29T07:04:24.137Z","etag":null,"topics":["dfa","javascript","lexer","lexical-analysis","lexical-analyzer"],"latest_commit_sha":null,"homepage":"https://wgrape.github.io/lexer/","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WGrape.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-08-25T16:45:43.000Z","updated_at":"2025-03-25T08:33:32.000Z","dependencies_parsed_at":"2024-06-19T00:19:37.140Z","dependency_job_id":"38f5fcd3-afef-482d-af3b-b5facde2a67d","html_url":"https://github.com/WGrape/lexer","commit_stats":{"total_commits":86,"total_committers":4,"mean_commits":21.5,"dds":0.2674418604651163,"last_synced_commit":"eb0daa168b9108ecc79f530930a50b727c910917"},"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WGrape%2Flexer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WGrape%2Flexer/tags","releases_url":"https://r
epos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WGrape%2Flexer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WGrape%2Flexer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WGrape","download_url":"https://codeload.github.com/WGrape/lexer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247305933,"owners_count":20917208,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dfa","javascript","lexer","lexical-analysis","lexical-analyzer"],"created_at":"2024-10-02T17:11:48.221Z","updated_at":"2025-04-05T08:06:17.587Z","avatar_url":"https://github.com/WGrape.png","language":"JavaScript","readme":"\u003cp align=\"center\"\u003e\n\u003cimg width=\"300\" alt=\"image\" src=\"https://user-images.githubusercontent.com/35942268/226105723-86eb3042-fb51-4f51-b382-b833a952e9a2.png\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n   \u003ca href=\"https://www.oscs1024.com/project/oscs/WGrape/lexer?ref=badge_small\" alt=\"OSCS Status\"\u003e\u003cimg src=\"https://www.oscs1024.com/platform/badge/WGrape/lexer.svg?size=small\"/\u003e\u003c/a\u003e\n    \u003cimg src=\"https://img.shields.io/badge/JavaScript-ES5+-blue.svg\"\u003e\n    \u003cimg src=\"https://img.shields.io/npm/dt/chain-lexer.svg\"\u003e\n    \u003ca href=\"https://app.travis-ci.com/github/WGrape/lexer\"\u003e\u003cimg src=\"https://app.travis-ci.com/WGrape/lexer.svg?branch=main\"\u003e\u003c/a\u003e\n    \u003cimg alt=\"GitHub release (latest by date)\" 
src=\"https://img.shields.io/github/v/release/wgrape/lexer\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Document-中文/English-orange.svg\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/License-MIT-green.svg\"\u003e   \n\u003c/p\u003e\n\n\u003cdiv align=\"center\"\u003e    \n    \u003cp\u003eA DFA-based lexical analyzer written in JavaScript that supports multi-language extensions. To try it quickly, visit the \u003ca href=\"https://wgrape.github.io/lexer/\"\u003eonline website\u003c/a\u003e\u003c/p\u003e\n    \u003cp\u003eDocumentation: \u003ca href=\"/README.zh-CN.md\"\u003e中文\u003c/a\u003e / \u003ca href=\"/README.md\"\u003eEnglish\u003c/a\u003e\u003c/p\u003e\n\u003c/div\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eContents\u003c/summary\u003e\n\n- [1、Background](#1)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(1) Situation](#11)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(2) Task](#12)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(3) Solution](#13)\n- [2、Features](#2)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(1) Complete lexical analysis](#21)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(2) Support multi-language extension](#22)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(3) Provide state flow log](#23)\n- [3、Get project](#3)\n- [4、Usage](#4)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(1) In your project](#41)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(2) Web preview and testing](#42)\n- [5、Contributions](#5)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(1) Project Statistics](#51)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(2) Source code explanation](#52)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(3) Content contribution](#53)\n- \u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(4) Release version](#54)\n- 
\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;[(5) Q\u0026A](#55)\n- [6、License](#6)\n\n\u003c/details\u003e\n\n## \u003cspan id=\"1\"\u003e1、Background\u003c/span\u003e\n\n### \u003cspan id=\"11\"\u003e(1) Situation\u003c/span\u003e\n\nMost lexical analyzers are tightly coupled to a specific language and have relatively large codebases, which makes it hard to focus on the essential principles of lexical analysis.\n\n### \u003cspan id=\"12\"\u003e(2) Task\u003c/span\u003e\n\nTo focus on how a lexical analyzer works without being distracted by the small differences between languages, the idea arose of building ```lexer```, a project completely decoupled from any language.\n\n### \u003cspan id=\"13\"\u003e(3) Solution\u003c/span\u003e\n\n```lexer``` decouples the lexical analyzer from the language by means of the following two files\n\n- ```src/lexer.js``` is the core of the lexical analyzer, under 300 lines, including ```ISR``` and ```DFA```\n- ```src/lang/{lang}-define.js``` is the language extension of the lexical analyzer. Each supported language has its own file, such as ```src/lang/c-define.js```\n\n## \u003cspan id=\"2\"\u003e2、Features\u003c/span\u003e\n\n### \u003cspan id=\"21\"\u003e(1) Complete lexical analysis\u003c/span\u003e\n\nFrom reading the input character sequence to generating ```token``` output, ```lexer``` covers every step of lexical analysis and provides 12 token types that suit most language extensions\n\n\u003cimg width=\"850\" alt=\"\" src=\"https://user-images.githubusercontent.com/35942268/137583888-8c12a85c-4af7-4288-942f-d2a2fcfe30c6.png\"\u003e\n\n### \u003cspan id=\"22\"\u003e(2) Support multi-language extension\u003c/span\u003e\n\n```lexer``` supports different language extensions such as ```Python```, ```Go```, etc. 
To learn how to write a language extension, see [Contributions](#5)\n\n- C: A popular programming language; [click here](https://wgrape.github.io/lexer/?lang=c) to see its lexical analysis\n- SQL: A popular database query language; [click here](https://wgrape.github.io/lexer/?lang=sql) to see its lexical analysis\n- Goal: A goal parser problem from LeetCode; [click here](https://wgrape.github.io/lexer/?lang=goal) to see its lexical analysis\n\n### \u003cspan id=\"23\"\u003e(3) Provide state flow log\u003c/span\u003e\n\nThe core mechanism of a lexical analyzer is the state flow of its ```DFA```. For this reason, ```lexer``` records a detailed state flow log that supports the following needs\n\n- Debug mode\n- Automatically generate ```DFA``` state flow diagram\n\n\u003cimg width=\"700\" src=\"https://user-images.githubusercontent.com/35942268/136378451-e025fffd-425d-43f1-8a58-454a1011e9c3.png\" /\u003e\n\n## \u003cspan id=\"3\"\u003e3、Get project\u003c/span\u003e\n\nAfter ```git clone```, the project is ready to use: it needs no dependencies and no extra installation steps\n\n## \u003cspan id=\"4\"\u003e4、Usage\u003c/span\u003e\n\n### \u003cspan id=\"41\"\u003e(1) In your project\u003c/span\u003e\n\nTo use ```lexer``` in your own project (a code editor, for example), choose one of the following methods. \n\n#### Using NPM\n```\nnpm install chain-lexer\n```\n\n```js\n// The package exposes one lexer per language, such as cLexer and sqlLexer\nconst chainLexer = require('chain-lexer');\nlet lexer = chainLexer.cLexer;\n\n// Analyze a C snippet and read the generated tokens\nlet stream = \"int a = 10;\";\nlexer.start(stream);\nlet parsedTokens = lexer.DFA.result.tokens;\n\n// Switch to the SQL lexer and analyze a query\nlexer = chainLexer.sqlLexer;\nstream = \"select * from test where id \u003e= 10;\";\nlexer.start(stream);\nparsedTokens = lexer.DFA.result.tokens;\n```\n\n#### Using Script\nImport the ```package/{lang}-lexer.min.js``` file, then use the ```lexer``` variable to access the lexical analyzer object, and read ```lexer.DFA.result.tokens``` to get the ```tokens```\n\n```js\n// 1. The code that needs lexical analysis\nlet stream = \"int a = 10;\";\n\n// 2. 
Start lexical analysis\nlexer.start(stream);\n\n// 3. After the lexical analysis is done, get the generated tokens\nlet parsedTokens = lexer.DFA.result.tokens;\n\n// 4. Do what you want to do\nparsedTokens.forEach((token) =\u003e {\n    // ... ...\n});\n```\n\nAs described in [Provide state flow log](#23), reading ```flowModel.result.paths``` returns the detailed state flow log recorded inside ```lexer```. The data format is as follows\n\n```js\n[\n    {\n        state: 0, // current state\n        ch: \"a\", // character read\n        nextSstate: 2, // next state\n        match: true, // whether the character matched\n        end: false, // whether this is the last character\n    },\n    // ... ...\n]\n```\n\n### \u003cspan id=\"42\"\u003e(2) Web preview and testing\u003c/span\u003e\n\nTo preview how ```lexer``` works in real time, and to debug and test it, open the ```index.html``` file in the root directory of this project directly in your browser. After you enter some code, the page automatically outputs the ```Token``` list generated by ```lexer```, as shown in the figure below\n\n```c\nint a = 10;\nint b =20;\nint c = 20;\n\nfloat f = 928.2332;\nchar b = 'b';\n\nif(a == b){\n    printf(\"Hello, World!\");\n}else if(b!=c){\n    printf(\"Hello, World! 
Hello, World!\");\n}else{\n    printf(\"Hello!\");\n}\n```\n\n![img](https://user-images.githubusercontent.com/35942268/137584888-28a1ce09-3474-4158-8e6f-ccbdb8614930.gif)\n\nAlternatively, try the [online website](https://wgrape.github.io/lexer/)\n\n## \u003cspan id=\"5\"\u003e5、Contributions\u003c/span\u003e\n\n### \u003cspan id=\"51\"\u003e(1) Project Statistics\u003c/span\u003e\n\n\u003ca href=\"https://starchart.cc/WGrape/lexer\"\u003e\u003cimg src=\"https://starchart.cc/WGrape/lexer.svg\" width=\"700\"\u003e\u003c/a\u003e\n\n### \u003cspan id=\"52\"\u003e(2) Source code explanation\u003c/span\u003e\nFor documentation on source code development, project design, unit testing, automated testing, development specifications, and how to write extensions for different languages, please read the [source code explanation](/doc/explain.md)\n\n### \u003cspan id=\"53\"\u003e(3) Content contribution\u003c/span\u003e\n- Add more new features\n- Add more extensions ```/src/lang/{lang}-define.js```\n\n### \u003cspan id=\"54\"\u003e(4) Release version\u003c/span\u003e\nReleases use version numbers of the form ```A-B-C```. For the release log, see the [CHANGELOG](./CHANGELOG.md) or the [release record](https://github.com/WGrape/lexer/releases)\n\n- ```A```: Major upgrade\n- ```B```: Minor upgrade\n- ```C```: bug fix / features / ...\n\n### \u003cspan id=\"55\"\u003e(5) Q\u0026A\u003c/span\u003e\nIf you have any problems or questions, please [submit an issue](https://github.com/WGrape/lexer/issues/new)\n\n## \u003cspan id=\"6\"\u003e6、License\u003c/span\u003e\n\n![GitHub](https://img.shields.io/github/license/WGrape/lexer)\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwgrape%2Flexer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwgrape%2Flexer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwgrape%2Flexer/lists"}