https://github.com/abhisheksoni27/lexer.js

A lexer and longest common sequence finder (between JS source code files)

javascript lexer longest-common-substring npm

Host: GitHub
URL: https://github.com/abhisheksoni27/lexer.js
Owner: abhisheksoni27
License: mit
Created: 2017-12-29T06:46:46.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2018-01-09T17:23:10.000Z (over 8 years ago)
Last Synced: 2026-05-17T09:55:53.084Z (25 days ago)
Topics: javascript, lexer, longest-common-substring, npm
Language: JavaScript
Homepage:
Size: 467 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# lexer.js

A lexical analyzer and longest common shared sequence finder for a list of JS files.

[![Build Status](https://travis-ci.org/abhisheksoni27/lexer.js.svg?branch=master)](https://travis-ci.org/abhisheksoni27/lexer.js)

# Table of Contents

* [What It Does](#what-it-does)

* [Installation](#installation)

* [Running lexer.js](#running-lexerjs)

    - [JSON configuration](#json-configuration)

    - [CSV configuration](#csv-configuration)

* [Result](#result)

* [Options](#options)

* [Running Examples](#running-examples)

    - [Test GitHub project](#test-github-project)

* [Tests](#tests)

* [Known Issues](https://github.com/abhisheksoni27/lexer.js/wiki/Known-Issues)

# What it does

Suppose you have two files that define the same function but differ in the statement that calls it:

**`test1.js`**



```js
function add(a, b){
    return a + b;
}

const sum = add(11, 11);
```

**`test2.js`**



```js
function add(a, b){
    return a + b;
}

let sum = add(11, 11);
```

The longest common shared sequence between these two files is the *entire function definition.*

**LCSS**

```js
function add(a, b){
    return a + b;
}
```

![Results](https://raw.githubusercontent.com/abhisheksoni27/lexer.js/master/src/assets/result.png)
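
To make the idea concrete, here is a minimal sketch of how such a shared sequence could be found once two files are turned into tokens. It is not lexer.js's actual implementation: `tokenize` is a hypothetical, deliberately crude regex tokenizer, and `longestCommonRun` is a textbook dynamic-programming longest-common-substring search that handles only two token arrays.

```js
// Hypothetical, crude tokenizer: identifiers, integers and single punctuation
// characters. A real lexer (including lexer.js's) handles far more cases.
function tokenize(source) {
    return source.match(/[A-Za-z_$][\w$]*|\d+|[^\s\w]/g) || [];
}

// Longest run of consecutive tokens that appears in both arrays.
// prev[j] holds the length of the common run ending at a[i - 2] and b[j - 1].
function longestCommonRun(a, b) {
    let best = 0;
    let end = 0;
    let prev = new Array(b.length + 1).fill(0);
    for (let i = 1; i <= a.length; i++) {
        const curr = new Array(b.length + 1).fill(0);
        for (let j = 1; j <= b.length; j++) {
            if (a[i - 1] === b[j - 1]) {
                curr[j] = prev[j - 1] + 1;
                if (curr[j] > best) {
                    best = curr[j];
                    end = i;
                }
            }
        }
        prev = curr;
    }
    return a.slice(end - best, end);
}

const file1 = 'function add(a, b){ return a + b; } const sum = add(1, 2);';
const file2 = 'function add(a, b){ return a + b; } let sum = add(1, 2);';

console.log(longestCommonRun(tokenize(file1), tokenize(file2)).join(' '));
// prints: function add ( a , b ) { return a + b ; }
```

Comparing tokens rather than characters means that whitespace and formatting differences between the files do not break a match.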

# Installation

### CLI

You can install the module via ***npm***, which ships with node.js. (If you don't have it yet, download node for your OS from [nodejs.org](https://nodejs.org).) Requires `node.js > 6.x`.

```bash

$ npm install -g lexer.js

```

Or, if you prefer ***yarn***

```bash

$ yarn global add lexer.js

```

That's it. 🎉

### As module

In your application:

```js

const lexerJS = require('lexer.js');

const result = lexerJS(files, options);

```

`files`: An array of files.

`options`: See [options](#options). This parameter is *(ironically)* optional.

# Running lexer.js

To run it on your own set of files, you can either list the files in a CSV or JSON configuration file, or pass them as command-line arguments like this:

```bash

lexer.js test1.js test2.js

```

### JSON configuration

The **JSON** must have a key named `files`, and its value should be an array of the *paths* of the files you want to test on.

```json
{
    "files": [
        "./example/test1.js",
        "./example/test2.js"
    ]
}
```
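
If you have many files, you may prefer to generate this configuration rather than write it by hand. The sketch below is one way to do that; `buildConfig` is a hypothetical helper (not part of lexer.js), and the directory and file names are placeholders.

```js
const path = require('path');

// Hypothetical helper: keeps only .js files and turns them into the
// relative paths expected under the `files` key.
function buildConfig(dir, names) {
    const files = names
        .filter(name => path.extname(name) === '.js')
        .map(name => './' + path.posix.join(dir, name));
    return { files };
}

const config = buildConfig('example', ['test1.js', 'test2.js', 'notes.txt']);

// JSON.stringify always emits strictly valid JSON.
console.log(JSON.stringify(config, null, 4));
```

Redirect the output to a file such as `test.json` and pass that file to lexer.js.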

### CSV configuration

The **CSV** config file has only one column, whose header is `filename`. Each subsequent line should contain the path of one source code file.

```csv
filename,
./example/test1.js,
./example/test2.js
```

# Result

The result contains the **longest common `shared` sequence** found between the set of files. The default format is JSON, but can be configured. (See options below.)

It also assigns a score to each subsequence using the following formula:

```js

score = log2(count) * log2(total)

```

**count**: Total number of occurrences of the subsequence

**total**: Total number of tokens in the subsequence
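
As a worked example, the formula can be written as a small function. The name `score` and the sample numbers are illustrative only; they are not taken from lexer.js's source or output.

```js
// score = log2(count) * log2(total)
function score(count, total) {
    return Math.log2(count) * Math.log2(total);
}

// A subsequence of 8 tokens that occurs 4 times: log2(4) * log2(8) = 2 * 3
console.log(score(4, 8)); // 6

// A subsequence that occurs only once scores 0, because log2(1) = 0
console.log(score(1, 8)); // 0
```

Both factors are logarithmic, so the score rewards subsequences that are both long and frequent, and it drops to zero when a subsequence occurs only once or is a single token long.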

# Options

#### **`-o`** Output Mode `[default: "json"]`

lexer.js supports **JSON** as well as **CSV** output. JSON is the default output format if you do not specify one during invocation.

```bash

lexer.js test.json -o csv

```

> Output would now be a `CSV` file. To know what that file would contain, check out [result](#result).

#### **`-s`** Save Tokens `[default: false]`

This is a boolean option, which when set, saves the tokens for each test file.

```bash

lexer.js test.json -s

```

> It will generate a tokens folder, and save individual `tokens` for each file in that directory.

#### **`-f`** Output File Name `[default: "result"]`

```bash

lexer.js test.json -f YayTheResultsYay

```

***Note:*** If you provide a file name with an extension, such as `art.json`, the output mode is determined from that extension, and the output mode flag (`-o`), if passed, is overridden.
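
The precedence described in the note can be sketched as follows. `resolveOutput` is a hypothetical function written for illustration, not lexer.js's own code. In particular, appending the mode as an extension when the name has none, and ignoring extensions other than `.json` and `.csv`, are assumptions.

```js
const path = require('path');

// Hypothetical helper: decide the output file and mode from -f and -o.
function resolveOutput(fileName, mode) {
    fileName = fileName || 'result'; // documented default for -f
    mode = mode || 'json';           // documented default for -o
    const ext = path.extname(fileName).slice(1).toLowerCase();
    if (ext === 'json' || ext === 'csv') {
        // An explicit extension wins over the -o flag.
        return { file: fileName, mode: ext };
    }
    // Assumption: otherwise the mode supplies the extension.
    return { file: fileName + '.' + mode, mode: mode };
}

console.log(resolveOutput('art.json', 'csv'));
// { file: 'art.json', mode: 'json' }
console.log(resolveOutput('YayTheResultsYay'));
// { file: 'YayTheResultsYay.json', mode: 'json' }
```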

# Running Examples

The [examples](https://) directory contains a minimal example set that you can run lexer.js on. To do so, clone the repo, open a terminal, and run:

```bash

npm install

```

This will download the dependencies. Then, run:

```bash

lexer.js test.json

```

This assumes that you already have lexer.js installed. If you don't, you can directly invoke the node script as follows:

```bash

node index.js test.json

```

As always, you must have `node` installed.

## Test GitHub project

The repo also contains a script to test lexer.js on any GitHub project. The script does the following:

1. Finds all JS files in a project.

2. Selects a file that has more than **n** commits (**n** is configurable).

3. Downloads the file at that point in time (when that commit was made).

4. Generates a configuration file for lexer.js.

5. Runs lexer.js with that config file.

To run it, open a terminal and run (assuming you are inside the project directory):

```bash

node runGitHubExamples.js -t TOKEN --owner OWNERNAME --repo REPONAME -n 20

```

**`OWNERNAME`**: Owner of the repo `[default: prettier]`

**`REPONAME`**: Name of the repo `[default: prettier]`

**`n`**: Minimum commits the selected file must have `[default: 10]`

**`t`**: GitHub OAuth token `[Mandatory]`

The results are saved in `result.json`. The command line [options](#options) for **lexer.js** can also be passed.

The GitHub OAuth token is a required argument. You can generate a Personal Access Token for testing here: [GitHub - Personal Access Token](https://github.com/settings/tokens)
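
For readers curious about what steps 2 to 4 of the script involve, here is a rough sketch. It is not the code in `runGitHubExamples.js`: the helper names, the `./versions/` directory, and the reading that one copy of the file is downloaded per commit are all assumptions. Only the two URL shapes (the GitHub REST commits endpoint with a `path` filter, and `raw.githubusercontent.com`) are real. The sketch performs no network requests.

```js
// Hypothetical helpers; the real script may be structured differently.

// GitHub REST API endpoint listing the commits that touched one file.
function commitsUrl(owner, repo, filePath) {
    return 'https://api.github.com/repos/' + owner + '/' + repo +
        '/commits?path=' + encodeURIComponent(filePath);
}

// Raw contents of a file as it existed at a given commit.
function rawUrl(owner, repo, sha, filePath) {
    return 'https://raw.githubusercontent.com/' + owner + '/' + repo +
        '/' + sha + '/' + filePath;
}

// Skip files with n commits or fewer; otherwise plan one download per commit
// and build a lexer.js JSON config that points at the local copies.
function planRun(owner, repo, filePath, shas, n) {
    if (shas.length <= n) return null;
    const name = filePath.split('/').pop();
    const downloads = shas.map(sha => ({
        url: rawUrl(owner, repo, sha, filePath),
        saveAs: './versions/' + sha + '-' + name // assumed local layout
    }));
    return { downloads: downloads, config: { files: downloads.map(d => d.saveAs) } };
}

// 'abc123' and 'def456' are placeholder commit SHAs.
const plan = planRun('prettier', 'prettier', 'src/index.js', ['abc123', 'def456'], 1);
console.log(commitsUrl('prettier', 'prettier', 'src/index.js'));
console.log(JSON.stringify(plan.config, null, 4));
```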

# Tests

To run tests, clone the repo (that green button above the repo contents) and run the following command:

```bash

npm install && npm run test

```

This will first download the dependencies, and then run the tests (using `mocha`) and output the result.
