https://github.com/vinicq/falsegreen-js

Find JS/TS tests that pass green without protecting anything. Deterministic AST scanner for js/ts/tsx/jsx/mts/cts. Sibling of falsegreen (Python).
Host: GitHub
URL: https://github.com/vinicq/falsegreen-js
Owner: vinicq
License: mit
Created: 2026-06-22T18:41:27.000Z (9 days ago)
Default Branch: main
Last Pushed: 2026-06-22T20:21:37.000Z (9 days ago)
Last Synced: 2026-06-22T20:24:52.622Z (9 days ago)
Topics: false-positive, javascript, jest, static-analysis, test-smells, testing, typescript, vitest
Language: TypeScript
Size: 109 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
- Security: SECURITY.md

# falsegreen-js

[![CI](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml/badge.svg)](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml)
[![npm](https://img.shields.io/npm/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![Node](https://img.shields.io/node/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![Downloads](https://img.shields.io/npm/dm/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
[![Docs](https://img.shields.io/badge/docs-online-blue.svg)](https://vinicq.github.io/falsegreen-docs/)

Find JavaScript/TypeScript unit tests that give false positives: green tests that
protect nothing, and tests that pass while asserting the wrong thing. Deterministic
AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq/falsegreen)
(the Python scanner); same contract, JS/TS rule set.

Covers `.js`, `.jsx`, `.ts`, `.tsx`, `.mjs`, `.cjs`, `.mts`, `.cts`.

**The falsegreen family** (install the one for your stack):

| Tool | Stack | Install | Package |
|---|---|---|---|
| [falsegreen](https://github.com/vinicq/falsegreen) | Python / pytest | `pip install falsegreen` | [PyPI](https://pypi.org/project/falsegreen/) |
| **falsegreen-js** | JS / TS | `npm i -D falsegreen-js` (`npx falsegreen-js`) | [npm](https://www.npmjs.com/package/falsegreen-js) |
| [robotframework-falsegreen](https://github.com/vinicq/robotframework-falsegreen) | Robot Framework | `pip install robotframework-falsegreen` | [PyPI](https://pypi.org/project/robotframework-falsegreen/) |
| [falsegreen-skill](https://github.com/vinicq/falsegreen-skill) | semantic LLM pass | `npx falsegreen-skill analyze ` | [npm](https://www.npmjs.com/package/falsegreen-skill) |

## Why

A test can be green and still protect nothing: an empty body, an assertion that is
never reached, `expect(x).toBe(x)`, `expect(value)` with no matcher, a focused
`it.only` that silently parks the rest of the suite, a `findByText` that is never
awaited. AI-generated tests produce these in bulk. This tool flags the mechanical
patterns a parser can prove, before they reach review.
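
To make the failure mode concrete, here is a minimal, self-contained sketch. It does not use Jest or Vitest: `expect` below is a hypothetical stand-in written for this illustration (not any runner's real API), and `applyDiscount` is an invented function with a deliberate bug. Two of the three "tests" stay green even though the code is wrong.

```ts
// Hypothetical stand-in for a runner's expect(); illustration only.
function expect<T>(actual: T) {
  return {
    toBe(expected: T): void {
      if (!Object.is(actual, expected)) {
        throw new Error(`expected ${String(expected)}, received ${String(actual)}`);
      }
    },
  };
}

// Invented code under test, with a deliberate bug:
// the discount is added instead of subtracted.
function applyDiscount(price: number, percent: number): number {
  return price + (price * percent) / 100;
}

// Runs a test body and reports whether it passed (did not throw).
function passes(body: () => void): boolean {
  try {
    body();
    return true;
  } catch {
    return false;
  }
}

const total = applyDiscount(200, 10); // buggy: 220, the intended value is 180

// Self-comparison (C7 in the catalog): true for any value of `total`.
const selfCompare = passes(() => expect(total).toBe(total));

// No matcher (JS2): expect() is called, but nothing is ever compared.
const noMatcher = passes(() => {
  expect(total);
});

// A real check: compares against the intended value, so it catches the bug.
const realCheck = passes(() => expect(total).toBe(180));

console.log({ selfCompare, noMatcher, realCheck });
```

The first two results are `true` (green) and only the third is `false`: the bug is visible only to the test whose expected value is independent of the code under test.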

## Install

```bash
npm install -D falsegreen-js
```

## Usage

```bash
npx falsegreen-js                 # scan cwd
npx falsegreen-js src test        # scan paths
npx falsegreen-js --staged        # only test files staged in git (pre-commit)
npx falsegreen-js --json          # machine-readable output (alias for --format json)
npx falsegreen-js --format sarif  # SARIF 2.1.0 for GitHub code scanning
npx falsegreen-js --format junit  # JUnit XML for CI test reporters
npx falsegreen-js --output report.json   # write to a file
npx falsegreen-js --output .falsegreen/  # write report. into a directory
npx falsegreen-js --config-audit  # audit Jest/Vitest config (project-layer PL codes)
npx falsegreen-js --disable C7,JS3
npx falsegreen-js --enable D8,M2   # re-activate off/opt-in codes at catalog severity
```

Each finding is reported with its pyramid level (unit / integration / e2e, read from the file's imports) and a one-line fix hint, and the summary breaks the findings down by level and lists the most common fixes. `--output` takes a file or a directory: an extension-less or trailing-slash path (e.g. `.falsegreen/`) receives `report.` for the chosen format. Reports are run artifacts; keep the output directory gitignored.

### Output formats

`--format text|json|sarif|junit` (default `text`; `--json` stays as an alias for `--format json`). The formats match the [Python sibling](https://github.com/vinicq/falsegreen) concept for concept, so a pipeline can swap one scanner for the other.

- **`sarif`**: SARIF 2.1.0. One rule per code present, one result per finding, with `error` for high-severity findings, `warning` for low, and `note` for off. Result tags carry the judgment (J1-J6), the risk group (`risk:effectiveness`...), and the level (`level:high`). Upload it to GitHub code scanning to see findings inline on the PR.

- **`junit`**: JUnit XML. High-severity findings become ``, everything else ``, so a CI test reporter surfaces them as a failing suite.
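
The severity-to-level mapping described for `sarif` can be sketched as follows. This is an illustration, not the scanner's source: the `Finding` shape and its field names are assumptions made for the example, and the result tags (judgment, risk group, level) are left out because their exact format is not documented here. The property names on the result object (`ruleId`, `level`, `message.text`, `locations[].physicalLocation`) follow the SARIF 2.1.0 result schema.

```ts
type Severity = "high" | "low" | "off";
type SarifLevel = "error" | "warning" | "note";

// Hypothetical finding shape; field names are assumptions for this sketch.
interface Finding {
  code: string;
  severity: Severity;
  file: string;
  line: number;
  hint: string;
}

// high -> error, low -> warning, off -> note, as described above.
function toSarifLevel(severity: Severity): SarifLevel {
  switch (severity) {
    case "high":
      return "error";
    case "low":
      return "warning";
    case "off":
      return "note";
  }
}

// Builds one SARIF result object for one finding.
function toSarifResult(finding: Finding) {
  return {
    ruleId: finding.code,
    level: toSarifLevel(finding.severity),
    message: { text: finding.hint },
    locations: [
      {
        physicalLocation: {
          artifactLocation: { uri: finding.file },
          region: { startLine: finding.line },
        },
      },
    ],
  };
}

const example = toSarifResult({
  code: "C7",
  severity: "high",
  file: "test/user.test.ts",
  line: 12,
  hint: "compare against an independent expected value",
});

console.log(JSON.stringify(example, null, 2));
```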

### Baseline (ratchet)

Adopting the scanner on a large codebase without fixing every legacy finding at once:

```bash
npx falsegreen-js --write-baseline   # record current findings to .falsegreen-baseline.json, exit 0
npx falsegreen-js --baseline         # report and fail only on findings not in the baseline
```

`--baseline [PATH]` and `--write-baseline [PATH]` default to `.falsegreen-baseline.json`. A finding's identity is a content fingerprint (`sha1` of relative path + code + detail, no line number), so it survives unrelated line shifts in the file. Commit the baseline, then let CI block only on net-new findings. (The fingerprint omits the source snippet the Python scanner folds in, since the js scanner does not carry one; two findings with the same code and detail in one file share an id.)
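
A minimal sketch of the fingerprint idea, under stated assumptions: the fields are joined with a NUL separator before hashing, `detail` is whatever descriptive string the scanner attaches to a finding, and the baseline is held as an in-memory `Set`. The real scanner's byte layout and baseline file format may differ; only the principle (hash of path + code + detail, no line number) comes from the description above.

```ts
import { createHash } from "node:crypto";

// The three fields that make up a finding's identity. No line number.
interface FindingIdentity {
  relativePath: string;
  code: string;
  detail: string;
}

// Assumption: fields are joined with a NUL separator before hashing.
function fingerprint(finding: FindingIdentity): string {
  return createHash("sha1")
    .update([finding.relativePath, finding.code, finding.detail].join("\0"))
    .digest("hex");
}

// Keeps only the findings whose fingerprint is not in the baseline.
function netNew<T extends FindingIdentity>(current: T[], baseline: Set<string>): T[] {
  return current.filter((finding) => !baseline.has(fingerprint(finding)));
}

// A finding recorded at line 10 when the baseline was written...
const recorded = {
  relativePath: "test/user.test.ts",
  code: "C7",
  detail: "expect(user.id).toBe(user.id)",
  line: 10,
};
const baseline = new Set([fingerprint(recorded)]);

// ...later shifted to line 42 by an unrelated edit, plus one new finding.
const shifted = { ...recorded, line: 42 };
const fresh = {
  relativePath: "test/user.test.ts",
  code: "JS2",
  detail: "expect(x)",
  line: 50,
};

const blocking = netNew([shifted, fresh], baseline);
console.log(blocking.map((finding) => finding.code));
```

Because `line` is not part of the hash, the shifted finding still matches the baseline and only the `JS2` finding is reported as net-new.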

`--config-audit` is a separate mode: instead of scanning test files, it reads the Jest/Vitest config (`package.json` `jest` field, `jest.config.*`, `vitest.config.*`) and reports the project-layer ways a suite stays green by configuration: `PL10` (`passWithNoTests` passes an empty or filtered-to-nothing run), `PL7` (no `coverageThreshold` / `coverage.thresholds`), `PL8` (`bail` stops the run early). The per-file scan cannot see config.

For the layer no static scan reaches (does a green test fail when the code is wrong?), run a **mutation tester** like [Stryker](https://stryker-mutator.io/). falsegreen-js is the cheap pre-filter on every commit; mutation testing is the deeper audit.

Exit code: `0` clean, `10` low-confidence only, `20` high-confidence present. Wire it
into CI or a pre-commit hook and let exit `20` block the commit.
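
One way a wrapper script might interpret those exit codes is sketched below. The function names and the `strict` option are hypothetical, and treating any other exit code (for example a crash) as blocking is a design choice of this sketch, not documented scanner behavior.

```ts
type ScanOutcome = "clean" | "low-only" | "high-present" | "unknown";

// Maps the documented exit codes to a named outcome.
function classifyExit(code: number): ScanOutcome {
  switch (code) {
    case 0:
      return "clean";
    case 10:
      return "low-only";
    case 20:
      return "high-present";
    default:
      return "unknown";
  }
}

// Default policy: block on high-confidence findings only.
// With strict = true, low-confidence findings block as well.
// Unknown codes block, so a crashed scan is never read as a pass (assumption).
function shouldBlock(code: number, strict = false): boolean {
  const outcome = classifyExit(code);
  if (outcome === "clean") return false;
  if (outcome === "low-only") return strict;
  return true;
}

for (const code of [0, 10, 20]) {
  console.log(code, classifyExit(code), shouldBlock(code));
}
```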

Suppress a single finding inline:

```ts
expect(user.id).toBe(user.id); // falsegreen: ignore[C7]
expect(x);                     // falsegreen: ignore
```
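
How such a directive could be read from a source line is sketched below. This is a line-based regex illustration, not the scanner's parser: accepting a comma-separated list inside the brackets and tolerating extra whitespace are both assumptions, and a plain regex would also match the directive inside a string literal, which a real implementation reading comments from the AST would avoid.

```ts
// null = no directive, "all" = bare `ignore`, otherwise the listed codes.
type Suppression = null | "all" | string[];

const DIRECTIVE = /\/\/\s*falsegreen:\s*ignore(?:\[([^\]]*)\])?/;

function parseSuppression(line: string): Suppression {
  const match = DIRECTIVE.exec(line);
  if (match === null) return null;
  const list = match[1];
  if (list === undefined) return "all";
  // Assumption: several codes may be listed, separated by commas.
  return list
    .split(",")
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
}

function isSuppressed(line: string, code: string): boolean {
  const suppression = parseSuppression(line);
  if (suppression === null) return false;
  return suppression === "all" || suppression.includes(code);
}

const scoped = "expect(user.id).toBe(user.id); // falsegreen: ignore[C7]";
const blanket = "expect(x);                     // falsegreen: ignore";
const plain = "expect(x).toBe(1);";

console.log(isSuppressed(scoped, "C7"), isSuppressed(scoped, "JS2"));
console.log(isSuppressed(blanket, "JS2"), isSuppressed(plain, "C7"));
```

The scoped form silences only the listed code, so a different finding on the same line is still reported; the bare form silences every code on that line.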

## Runner coverage

Runner-agnostic. The assertion and test vocabulary spans Jest, Vitest, Mocha + Chai,
Jasmine, AVA, `node:test`, tap, Cypress, Playwright, Testing Library
(`@testing-library/*` with `jest-dom` / `jasmine-dom` matchers and `user-event`),
and Vue Test Utils (`mount`/`wrapper.find`/`flushPromises`/`nextTick`).

`expect().matcher()`, chai `expect().to`, `assert`, `x.should`, and AVA `t.is` all
count as real assertions, so a Mocha or AVA test is not mistaken for one that never
checks anything.

Note: component files (`.vue`, `.svelte`, `.astro`, `.marko`) and templates (`.html`)
are not test files. Tests for those frameworks are written in `.spec`/`.test` files in
the eight extensions above, which is what the scanner reads.

## Test levels (the pyramid)

falsegreen-js scans tests at every level of the pyramid. Discovery is level-agnostic - it
reads any test file - but a few codes are read in light of the level, so a valid pattern at
one level is not flagged at another.

- **Unit:** a function or component with its boundaries doubled. The oracle is `expect`.
- **Integration (API and database):** API tests through supertest / chai-http
  (`request(app).get("/").expect(200)`, recognized as an assertion) or `fetch`, and database
  tests through Prisma / TypeORM / Knex against a real datastore. These cross the I/O
  boundary on purpose, so the response or row IS the verification at that level.
- **E2E:** Cypress (`.cy.*`) and Playwright (`.e2e.*`). `cy.get().should(...)` and
  `expect(page).toHaveURL(...)` are the oracle; a visible element is a real check here, not a
  weak one.

A real API or database call inside a test that claims to be a unit test is itself the smell
(mystery guest, environment coupling), not the level of the test. C23 flags the hard-coded
file path or URL form.
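
A rough sketch of how a level could be inferred from a file's name and imports, using only the libraries and file suffixes named above. The marker lists, the function name, and the order of the checks are assumptions for illustration; the scanner's actual rules are not documented here and may differ.

```ts
type Level = "unit" | "integration" | "e2e";

// Assumed marker lists, built from the libraries named in this section.
const E2E_IMPORTS = ["cypress", "@playwright/test", "playwright"];
const INTEGRATION_IMPORTS = ["supertest", "chai-http", "@prisma/client", "typeorm", "knex"];

// Matches `.cy.*` and `.e2e.*` across the eight scanned extensions.
const E2E_FILE = /\.(cy|e2e)\.[cm]?[jt]sx?$/;

// `imports` is the list of module specifiers the test file imports.
function inferLevel(fileName: string, imports: string[]): Level {
  if (E2E_FILE.test(fileName)) return "e2e";
  if (imports.some((name) => E2E_IMPORTS.includes(name))) return "e2e";
  if (imports.some((name) => INTEGRATION_IMPORTS.includes(name))) return "integration";
  return "unit";
}

console.log(inferLevel("login.cy.ts", []));
console.log(inferLevel("api.test.ts", ["supertest", "vitest"]));
console.log(inferLevel("sum.test.ts", ["vitest"]));
```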

## Case catalog

Codes shared with `falsegreen` (Python) keep the same id, so cross-language results
line up in the research. `JS*` codes are ecosystem-specific.

| Code | Confidence | What it flags |
|---|---|---|
| C2  | high | test with no check at all (empty body) |
| C2b | low  | test calls code but asserts nothing |
| C5  | high | always-true check (`expect(true).toBe(true)`, `assert(1)`) |
| C6  | low  | weak check: only verifies something came back (`toBeTruthy`/`toBeDefined`, `length > 0`) |
| C7  | high | compares a thing to itself (`expect(x).toBe(x)`) |
| C44 | high | numeric tautology: a length compared so the result is always true (`x.length >= 0`) |
| C20 | high | assertion in unreachable code (after a `return`/`throw`/`process.exit`, a `break`, a both-arms-terminating `if`, or an exhaustive `switch`); it never runs |
| C23 | low  | reads a real file at a literal path, or a hard-coded URL (mystery guest) |
| C8  | low  | exact equality on a float (use `toBeCloseTo`) |
| C9  | low  | `toThrow()` with no error type or message; accepts any error |
| C16 | low  | result depends on `Date.now`, `Math.random`, or a fixed timer |
| C18 | low  | compares `String(x)` / `JSON.stringify(x)` / `` `${x}` `` to a literal (formatting, not value) |
| C21 | low  | every assertion is conditional; none runs unconditionally |
| C37 | low  | duplicate case in `it.each`/`test.each`; the same scenario runs twice |
| C48 | low  | dark patch: the test flips a test-mode flag (`process.env.NODE_ENV = "test"`, `process.env.TESTING`, a `TESTING` flag) then asserts, exercising the product's test-only branch |
| CC  | low  | commented-out assertion |
| JS1 | high | focused test (`it.only` / `fit`) silently skips the rest of the suite |
| JS2 | high | `expect(x)` with no matcher; the assertion never runs |
| JS3 | low  | snapshot is the only assertion |
| JS4 | low  | skipped test (`it.skip` / `xit` / `it.todo`) never runs |
| JS5 | low  | async query/event not awaited (`findBy*` / `waitFor` / `user-event`) |
| JS6 | high | empty `describe`/`suite`; the suite is green but runs nothing |
| JS7 | low  | assertion inside a non-awaited `setTimeout`/`then` callback; may run after the test ends |
| JS8 | low  | mocks the unit under test (`jest.mock`/`vi.mock` of an imported module asserted directly) |
| JS9 | high | assertion in a dead branch (`if(false)` / `if(true){}else`); never runs |
| JS11 | low | `try/catch` swallows the assertion: a failing `expect` is caught, test stays green |
| JS13 | low | query (`getBy*`/`queryBy*`) as a loose statement; its result is never asserted |
| JS15 | low | inappropriate assertion: comparison wrapped in a boolean (`expect(a===b).toBe(true)`), blind failure message |
| JS17 | low | commented-out test block (`// it(...)` / `// test(...)`); disabled, no longer runs |
| JS18 | low | test takes a `done` callback instead of async/await; a mistimed `done` passes early |
| JS21 | high | matcher referenced but never called (`expect(x).toBe` with no `()`); the assertion never runs |
| JS22 | high | empty `it.each`/`test.each` table; generated with zero cases, never runs |
| JS23 | high | `expect.assertions(N)` with fewer unconditional `expect()` calls than N; the guard can never be met |
| JS24 | low  | Cypress query (`cy.get`/`cy.find`/`cy.contains`) as a loose statement with no terminating `.should`/`.and` and no `expect` in `.then`; its result is never asserted |

Each code carries a judgment tag (J1-J6) shared with the
[falsegreen-skill](https://github.com/vinicq/falsegreen-skill) semantic framework.

### Opt-in: maintainability group (default off)

These are **not** false-green - the test still protects something - so they are off by
default. Enable them with `--diagnostics`, or per code via config `severity`. They are a
"plus" for test-code health, mirroring falsegreen's diagnostic/coupling groups.

| Code | Group | What it flags |
|---|---|---|
| D1 | diagnostic | assertion roulette: many assertions in one test |
| D3 | diagnostic | duplicate assert: the same assertion repeated |
| D4 | diagnostic | `it.each`/`test.each` without titled cases (index-only) |
| D6 | diagnostic | `console.*` in a test body |
| D7 | diagnostic | anonymous test: empty or missing description |
| D8 | diagnostic | magic number: a bare numeric literal as the expected value |
| M2 | coupling | test body exceeds the line-count threshold |

```bash
npx falsegreen-js --diagnostics      # include D*/M* as warnings
```

### Deliberately not implemented

Some catalog codes were reviewed and left out, on purpose:

- **JS19** (`toBe` on an object/array literal): `expect(x).toBe({...})` compares by reference,
  so it always fails. That is the false-red axis (a test that always fails), the opposite of
  what this scanner looks for, and out of scope on principle.
- **JS20** (a Promise compared without `resolves`/`rejects`): telling that a value is a
  Promise needs type information the AST does not carry, so it would be too noisy.
- **JS12** (a floating promise whose `expect` is never returned): already covered by JS7.
- **JS16** (`async` test with no `expect.assertions(n)`): the *absence* of a guard is not a
  smell on its own; flagging it would fire on most async tests. The implemented sibling is
  `JS23`, which fires on a present-but-unsatisfiable guard: `expect.assertions(N)` with a
  numeric `N` higher than the unconditional `expect()` calls that can run, so the count can
  never be met.
- **JS14** (a giant inline snapshot): a readability and review-noise concern, not a
  false-green one. The snapshot still protects, so it belongs to the diagnostic group and is
  better served by `eslint-plugin-jest` (`no-large-snapshots`) as an opt-in lint rule.
- **JS10** (any conditional in a test body): handled by `eslint-plugin-jest`
  (`no-conditional-in-test`); JS9 and C21 already cover the false-green subset.
- **C1** (an assertion under an `if`/`for` that may not run): redundant once C21 and JS9
  exist, and high-FP on its own. C21 already fires the actual false-green case, where
  *every* assertion is conditional and the test can pass with nothing checked. A test that
  mixes a conditional assertion with an unconditional one is not false-green: the
  unconditional assertion still protects. JS9 covers the dead-branch form (`if(false)`).
  Flagging every conditional assertion (C1's full scope) is the linter concern JS10 already
  names (`no-conditional-in-test`), so C1 would add noise without a new false-green signal.

### What carries over from falsegreen, what does not

Ported (same concept): C2, C2b, C5, C7, C8, C16, C44, C48, CC.

Python-only, not applicable to JS/TS: pytest collection rules (C4 family), `pytest.raises`
breadth (C9/C19/C27/C28), fixtures and `os.environ`/global-state codes (C23/C24/C29),
sklearn/torch/tensorflow metric and seed codes (C33, parts of C16), xfail (C25), and the
xunit/`self.assert*` codes. These have no JS equivalent or need a different signal.

JS/TS-only (new here): JS1-JS5 above. The `describe.only`/skip, snapshot, no-matcher,
and not-awaited patterns are specific to the JS test runners and Testing Library.

## Configuration

Optional. `falsegreen.json`, `.falsegreenrc.json`, or a `"falsegreen"` key in
`package.json`:

```json
{
  "disable": ["C8"],
  "exclude": ["**/legacy/**"],
  "severity": { "JS3": "off", "C16": "high" }
}
```

Precedence: CLI `--disable` > CLI `--enable` > config `disable`/`severity` > catalog default. `--enable ` re-activates listed off or opt-in codes at their catalog severity (it flips a default-off code on; it cannot raise a code above catalog). A code passed to both `--enable` and `--disable` stays off — `--disable` wins.
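
The precedence chain can be read as a small resolution function. The sketch below is an illustration under assumptions: the `CatalogEntry`, `Config`, and `Cli` shapes are hypothetical, the catalog severities used in the example calls are made up for the demonstration, and checking config `disable` before config `severity` is a choice of this sketch, since both sit on the same tier above.

```ts
type Severity = "high" | "low" | "off";

// Hypothetical shapes for this sketch.
interface CatalogEntry {
  severity: "high" | "low"; // severity the catalog assigns to the code
  optIn: boolean;           // true for default-off codes such as the D*/M* group
}
interface Config {
  disable?: string[];
  severity?: Record<string, Severity>;
}
interface Cli {
  disable: string[];
  enable: string[];
}

function effectiveSeverity(
  code: string,
  catalog: CatalogEntry,
  config: Config,
  cli: Cli,
): Severity {
  // 1. CLI --disable wins over everything, including --enable.
  if (cli.disable.includes(code)) return "off";
  // 2. CLI --enable restores the catalog severity; it cannot raise above it.
  if (cli.enable.includes(code)) return catalog.severity;
  // 3. Config: disable first (assumption), then a per-code severity override.
  if (config.disable?.includes(code)) return "off";
  const configured = config.severity?.[code];
  if (configured !== undefined) return configured;
  // 4. Catalog default: opt-in codes start off.
  return catalog.optIn ? "off" : catalog.severity;
}

// The config from the example above.
const config: Config = {
  disable: ["C8"],
  severity: { JS3: "off", C16: "high" },
};
const noFlags: Cli = { disable: [], enable: [] };
const low: CatalogEntry = { severity: "low", optIn: false };

console.log(effectiveSeverity("C16", low, config, noFlags));
console.log(effectiveSeverity("JS3", low, config, { disable: [], enable: ["JS3"] }));
console.log(effectiveSeverity("JS3", low, config, { disable: ["JS3"], enable: ["JS3"] }));
```

The three calls show a config override raising `C16`, `--enable` restoring `JS3` to its catalog severity despite the config, and `--disable` winning when the same code is passed to both flags.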

## Scope and honesty

This is a static scanner. It owns what the structure proves. Two things it does not
decide: whether the expected value contradicts the intended behavior, and whether the
test re-implements the production logic. Those are semantic and belong to the
`falsegreen-skill` LLM pass. Precision over recall: a softened heuristic that misses a
case is preferred to one that flags correct code.

Measured against the [Open Catalog of Test Smells](https://test-smell-catalog.readthedocs.io/) (517 documented smells), only the false-green slice is in scope. What stays out, on purpose: **brittleness / false-red** (sensitive equality, brittle assertions - the opposite axis), **hygiene / maintainability** (assertion roulette, magic numbers, long tests - linter territory, a few surfaced as opt-in diagnostics), and **slow, design, naming, duplication, runtime/culture** (none about whether the test protects). The boundary is deliberate: where a smell has a statically provable false-green form, that form is a code here - uncontrolled `Date.now`/`Math.random` is `C16`, a hard-coded path or URL is `C23`, an assertion that may never run is `C21`/`C20`, and JS-specific forms (focused tests, missing matchers) are the `JS*` codes. See [CREDITS.md](CREDITS.md) for the full cross-walk.

## References

The catalog is grounded in the test-smell literature. Direct influences: the
rotten-green-test work that names this whole family (Delplanque et al., ICSE 2019),
the founding test-smell refactoring catalog (van Deursen et al., XP 2001), the
JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022 - the academic
precedent for the focused-test and snapshot codes; Oliveira et al., SBES 2024/2025),
and the detection-tool baselines (tsDetect, Peruma et al., 2020). Full list and the
code-to-source mapping in [CREDITS.md](CREDITS.md).

## Status

The rule set is a deterministic core; the full JS/TS smell catalog is tracked as
research in the private audit hub. See [STATUS.md](STATUS.md) for the current version
and rule coverage. Issues and PRs welcome.

## License

MIT, Vinicius Queiroz.

## Contributors ✨

Thanks to the people who keep false-green tests out of real suites ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

[![All Contributors](https://img.shields.io/badge/all_contributors-2-orange.svg?style=flat-square)](#contributors-)

  

    

      
- Vinicius Queiroz: 💻 📖 🤔 🚧 🚇 ⚠️ 🔬
- Home Seller: 💻 📖 ⚠️ 🚇

    

  

New contributors are added automatically; the table also recognizes non-code work (docs, ideas, infrastructure, tests, research) via the [all-contributors](https://allcontributors.org) spec.