https://github.com/vinicq/falsegreen-js

Find JS/TS tests that pass green without protecting anything. Deterministic AST scanner for js/ts/tsx/jsx/mts/cts. Sibling of falsegreen (Python).

# falsegreen-js

[![CI](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml/badge.svg)](https://github.com/vinicq/falsegreen-js/actions/workflows/ci.yml)
[![npm](https://img.shields.io/npm/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![Node](https://img.shields.io/node/v/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![Downloads](https://img.shields.io/npm/dm/falsegreen-js.svg)](https://www.npmjs.com/package/falsegreen-js)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](CONTRIBUTING.md)
[![Docs](https://img.shields.io/badge/docs-online-blue.svg)](https://vinicq.github.io/falsegreen-docs/)

Find JavaScript/TypeScript unit tests that give false positives: green tests that
protect nothing, and tests that pass while asserting the wrong thing. Deterministic
AST scan, no code execution. Sibling of [`falsegreen`](https://github.com/vinicq/falsegreen)
(the Python scanner); same contract, JS/TS rule set.

Covers `.js`, `.jsx`, `.ts`, `.tsx`, `.mjs`, `.cjs`, `.mts`, `.cts`.

**The falsegreen family** (install the one for your stack):

| Tool | Stack | Install | Package |
|---|---|---|---|
| [falsegreen](https://github.com/vinicq/falsegreen) | Python / pytest | `pip install falsegreen` | [PyPI](https://pypi.org/project/falsegreen/) |
| **falsegreen-js** | JS / TS | `npm i -D falsegreen-js` (`npx falsegreen-js`) | [npm](https://www.npmjs.com/package/falsegreen-js) |
| [robotframework-falsegreen](https://github.com/vinicq/robotframework-falsegreen) | Robot Framework | `pip install robotframework-falsegreen` | [PyPI](https://pypi.org/project/robotframework-falsegreen/) |
| [falsegreen-skill](https://github.com/vinicq/falsegreen-skill) | semantic LLM pass | `npx falsegreen-skill analyze ` | [npm](https://www.npmjs.com/package/falsegreen-skill) |

## Why

A test can be green and still protect nothing: an empty body, an assertion that is
never reached, `expect(x).toBe(x)`, `expect(value)` with no matcher, a focused
`it.only` that silently parks the rest of the suite, a `findByText` that is never
awaited. AI-generated tests produce these in bulk. This tool flags the mechanical
patterns a parser can prove, before they reach review.
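
To make the failure mode concrete, here is a minimal sketch of two of these patterns staying green over a broken function. `miniExpect`, `add`, and `isGreen` are hypothetical names made up for this illustration; `miniExpect` only stands in for a runner's `expect` so the snippet runs without Jest or Vitest.

```typescript
// Minimal stand-in for a test runner's `expect` (hypothetical, illustration only).
function miniExpect<T>(actual: T) {
  return {
    toBe(expected: T): void {
      if (!Object.is(actual, expected)) {
        throw new Error(`expected ${String(expected)}, got ${String(actual)}`);
      }
    },
  };
}

// Hypothetical production bug: subtracts instead of adding.
function add(a: number, b: number): number {
  return a - b;
}

// Returns true when the test body finishes without throwing, i.e. the test is green.
function isGreen(body: () => void): boolean {
  try {
    body();
    return true;
  } catch {
    return false;
  }
}

// C7: the result is compared to itself, so the bug cannot make it fail.
const selfCompare = isGreen(() => miniExpect(add(2, 3)).toBe(add(2, 3)));

// JS2: no matcher is called, so nothing is asserted.
const noMatcher = isGreen(() => {
  miniExpect(add(2, 3));
});

// A check against an independent expected value is the only one that goes red.
const realCheck = isGreen(() => miniExpect(add(2, 3)).toBe(5));
```

Here `selfCompare` and `noMatcher` are both `true` (green) although `add` is wrong, while `realCheck` is `false`.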

## Install

```bash
npm install -D falsegreen-js
```

## Usage

```bash
npx falsegreen-js # scan cwd
npx falsegreen-js src test # scan paths
npx falsegreen-js --staged # only test files staged in git (pre-commit)
npx falsegreen-js --json # machine-readable output (alias for --format json)
npx falsegreen-js --format sarif # SARIF 2.1.0 for GitHub code scanning
npx falsegreen-js --format junit # JUnit XML for CI test reporters
npx falsegreen-js --output report.json # write to a file
npx falsegreen-js --output .falsegreen/ # write report. into a directory
npx falsegreen-js --config-audit # audit Jest/Vitest config (project-layer PL codes)
npx falsegreen-js --disable C7,JS3 # turn off specific codes
npx falsegreen-js --enable D8,M2 # re-activate off/opt-in codes at catalog severity
```

Each finding is reported with its pyramid level (unit / integration / e2e, read from the file's imports) and a one-line fix hint, and the summary breaks the findings down by level and lists the most common fixes. `--output` takes a file or a directory: an extension-less or trailing-slash path (e.g. `.falsegreen/`) receives `report.` for the chosen format. Reports are run artifacts; keep the output directory gitignored.

### Output formats

`--format text|json|sarif|junit` (default `text`; `--json` stays as an alias for `--format json`). These match the [Python sibling](https://github.com/vinicq/falsegreen) concept for concept, so a pipeline can swap one scanner for the other.

- **`sarif`**: SARIF 2.1.0. One rule per code present, one result per finding, with `error` for high-severity findings, `warning` for low, and `note` for off. Result tags carry the judgment (J1-J6), the risk group (`risk:effectiveness`...), and the level (`level:high`). Upload it to GitHub code scanning to see findings inline on the PR.
- **`junit`**: JUnit XML. High-severity findings become ``, everything else ``, so a CI test reporter surfaces them as a failing suite.
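
The severity-to-level mapping for SARIF can be sketched as below. This is not the scanner's code: the `Finding` shape and `toSarifResult` are hypothetical, and placing the tags in the result's `properties.tags` bag, as well as the exact tag spelling for the judgment, are assumptions. Only the `error` / `warning` / `note` mapping comes from the description above.

```typescript
type Severity = "high" | "low" | "off";

// Hypothetical finding shape, for illustration only.
interface Finding {
  code: string;      // e.g. "C7"
  severity: Severity;
  file: string;      // path relative to the repo root
  line: number;      // 1-based line number
  detail: string;
  judgment: string;  // e.g. "J1"
  risk: string;      // e.g. "effectiveness"
}

// Mapping described above: high -> error, low -> warning, off -> note.
const SARIF_LEVEL: Record<Severity, "error" | "warning" | "note"> = {
  high: "error",
  low: "warning",
  off: "note",
};

function toSarifResult(f: Finding) {
  return {
    ruleId: f.code,
    level: SARIF_LEVEL[f.severity],
    message: { text: f.detail },
    locations: [
      {
        physicalLocation: {
          artifactLocation: { uri: f.file },
          region: { startLine: f.line },
        },
      },
    ],
    // Assumption: tags are carried in the SARIF property bag of the result.
    properties: { tags: [f.judgment, `risk:${f.risk}`, `level:${f.severity}`] },
  };
}

const sampleFinding: Finding = {
  code: "C7",
  severity: "high",
  file: "test/user.test.ts",
  line: 12,
  detail: "compares a thing to itself",
  judgment: "J1",
  risk: "effectiveness",
};
const sampleResult = toSarifResult(sampleFinding);
```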

### Baseline (ratchet)

Adopting the scanner on a large codebase without fixing every legacy finding at once:

```bash
npx falsegreen-js --write-baseline # record current findings to .falsegreen-baseline.json, exit 0
npx falsegreen-js --baseline # report and fail only on findings not in the baseline
```

`--baseline [PATH]` and `--write-baseline [PATH]` default to `.falsegreen-baseline.json`. A finding's identity is a content fingerprint (`sha1` of relative path + code + detail, no line number), so it survives unrelated line shifts in the file. Commit the baseline, then let CI block only on net-new findings. (The fingerprint omits the source snippet the Python scanner folds in, since the js scanner does not carry one; two findings with the same code and detail in one file share an id.)
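
A minimal sketch of the ratchet idea, under stated assumptions: the scanner hashes the identity fields with `sha1`, while this sketch uses the raw joined key to stay dependency-free, and the field separator, the `ScanFinding` shape, and the sample findings are all hypothetical.

```typescript
// Hypothetical finding shape, for illustration only.
interface ScanFinding {
  file: string;   // relative path
  code: string;
  detail: string;
  line: number;   // deliberately NOT part of the identity
}

// Identity ignores the line number, so a finding that merely moves keeps its key.
// Assumption: the real tool applies sha1 to these fields; the separator here is made up.
function identityKey(f: ScanFinding): string {
  return [f.file, f.code, f.detail].join("\u0000");
}

// Keep only the findings whose identity is absent from the baseline.
function netNew(current: ScanFinding[], baselineKeys: ReadonlySet<string>): ScanFinding[] {
  return current.filter((f) => !baselineKeys.has(identityKey(f)));
}

const baselineKeys = new Set(
  [{ file: "test/user.test.ts", code: "C7", detail: "expect(user.id).toBe(user.id)", line: 12 }].map(
    (f) => identityKey(f),
  ),
);

const currentFindings: ScanFinding[] = [
  // Same finding as in the baseline, shifted from line 12 to line 40: still known.
  { file: "test/user.test.ts", code: "C7", detail: "expect(user.id).toBe(user.id)", line: 40 },
  // Not in the baseline: this is the one CI would block on.
  { file: "test/user.test.ts", code: "JS2", detail: "expect(x)", line: 41 },
];

const blockingFindings = netNew(currentFindings, baselineKeys);
```

Only the `JS2` finding ends up in `blockingFindings`. The sketch also shows the caveat from the parenthetical: two findings with the same file, code, and detail collapse into one key.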

`--config-audit` is a separate mode: instead of scanning test files, it reads the Jest/Vitest config (`package.json` `jest` field, `jest.config.*`, `vitest.config.*`) and reports the project-layer ways a suite stays green by configuration:

- `PL10`: `passWithNoTests` passes an empty or filtered-to-nothing run.
- `PL7`: no `coverageThreshold` / `coverage.thresholds` is set.
- `PL8`: `bail` stops the run early.

The per-file scan cannot see config.
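
As a rough illustration of what a config audit looks at, the sketch below checks a Jest-shaped config object for the three conditions named above. `JestLikeConfig` and `auditJestConfig` are hypothetical names, the sketch ignores the Vitest layout, and the scanner's real conditions may differ (for example in how it treats `bail: 0`).

```typescript
// Subset of Jest's config relevant to the three PL codes (illustration only).
interface JestLikeConfig {
  passWithNoTests?: boolean;
  coverageThreshold?: Record<string, unknown>;
  bail?: boolean | number;
}

function auditJestConfig(config: JestLikeConfig): string[] {
  const codes: string[] = [];
  // PL10: an empty or filtered-to-nothing run still passes.
  if (config.passWithNoTests === true) codes.push("PL10");
  // PL7: no coverage threshold is configured at all.
  if (config.coverageThreshold === undefined) codes.push("PL7");
  // PL8: the run stops early (true, or a positive failure count).
  if (config.bail === true || (typeof config.bail === "number" && config.bail > 0)) {
    codes.push("PL8");
  }
  return codes;
}

const riskyCodes = auditJestConfig({ passWithNoTests: true, bail: 1 });
const cleanCodes = auditJestConfig({ coverageThreshold: { global: { lines: 80 } } });
```

With these inputs `riskyCodes` holds all three codes and `cleanCodes` is empty.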

For the layer no static scan reaches (does a green test fail when the code is wrong?), run a **mutation tester** like [Stryker](https://stryker-mutator.io/). falsegreen-js is the cheap pre-filter on every commit; mutation testing is the deeper audit.

Exit code: `0` clean, `10` low-confidence only, `20` high-confidence present. Wire it
into CI or a pre-commit hook and let exit `20` block the commit.
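
A wrapper script could interpret the exit code along these lines. The function names are hypothetical, and blocking on unexpected codes is a choice of this sketch (so that a crashed scan is not read as a pass), not documented behavior.

```typescript
type Verdict = "clean" | "low-confidence only" | "high-confidence present" | "unknown";

// Map the documented exit codes to a readable verdict.
function readExitCode(code: number): Verdict {
  switch (code) {
    case 0:
      return "clean";
    case 10:
      return "low-confidence only";
    case 20:
      return "high-confidence present";
    default:
      return "unknown";
  }
}

// Block on exit 20, and (assumption of this sketch) on any undocumented code.
function shouldBlockCommit(code: number): boolean {
  return code !== 0 && code !== 10;
}
```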

Suppress a single finding inline:

```ts
expect(user.id).toBe(user.id); // falsegreen: ignore[C7]
expect(x); // falsegreen: ignore
```
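
How such a comment might be read is sketched below. This is not the scanner's parser: the regular expression, the function names, and the comma-separated list inside the brackets are assumptions. The sketch treats a bare `ignore` as covering every code on the line, which is what the second example above suggests.

```typescript
type Suppression =
  | { kind: "none" }
  | { kind: "all" }
  | { kind: "codes"; codes: string[] };

// Parse a trailing `// falsegreen: ignore` or `// falsegreen: ignore[CODE,...]` comment.
function parseSuppression(sourceLine: string): Suppression {
  const match = /\/\/\s*falsegreen:\s*ignore(?:\[([^\]]*)\])?/.exec(sourceLine);
  if (match === null) return { kind: "none" };
  const list = match[1];
  // No bracket list: the comment applies to every code on this line.
  if (list === undefined) return { kind: "all" };
  const codes = list
    .split(",")
    .map((c) => c.trim())
    .filter((c) => c.length > 0);
  return { kind: "codes", codes };
}

// True when a finding with `code` on this line should be dropped.
function isSuppressed(sourceLine: string, code: string): boolean {
  const s = parseSuppression(sourceLine);
  return s.kind === "all" || (s.kind === "codes" && s.codes.indexOf(code) !== -1);
}
```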

## Runner coverage

Runner-agnostic. The assertion and test vocabulary spans Jest, Vitest, Mocha + Chai,
Jasmine, AVA, `node:test`, tap, Cypress, Playwright, Testing Library
(`@testing-library/*` with `jest-dom` / `jasmine-dom` matchers and `user-event`),
and Vue Test Utils (`mount`/`wrapper.find`/`flushPromises`/`nextTick`).
`expect().matcher()`, chai `expect().to`, `assert`, `x.should`, and AVA `t.is` all
count as real assertions, so a Mocha or AVA test is not mistaken for one that never
checks anything.

Note: component files (`.vue`, `.svelte`, `.astro`, `.marko`) and templates (`.html`)
are not test files. Tests for those frameworks are written in `.spec`/`.test` files in
the eight extensions above, which is what the scanner reads.

## Test levels (the pyramid)

falsegreen-js scans tests at every level of the pyramid. Discovery is level-agnostic - it
reads any test file - but a few codes are read in light of the level, so a valid pattern at
one level is not flagged at another.

- **Unit:** a function or component with its boundaries doubled. The oracle is `expect`.
- **Integration (API and database):** API tests through supertest / chai-http
(`request(app).get("/").expect(200)`, recognized as an assertion) or `fetch`, and database
tests through Prisma / TypeORM / Knex against a real datastore. These cross the I/O
boundary on purpose, so the response or row IS the verification at that level.
- **E2E:** Cypress (`.cy.*`) and Playwright (`.e2e.*`). `cy.get().should(...)` and
`expect(page).toHaveURL(...)` are the oracle; a visible element is a real check here, not a
weak one.

A real API or database call inside a test that claims to be a unit test is itself the smell
(mystery guest, environment coupling), not the level of the test. C23 flags the hard-coded
file path or URL form.
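
A level classifier along the lines described above might look like this. It is a sketch, not the scanner's heuristic: the package lists and the file-suffix rule are assumptions assembled from the tools named in this section, and `classifyLevel` is a hypothetical name.

```typescript
type PyramidLevel = "unit" | "integration" | "e2e";

// Assumed module specifiers for the tools named above.
const E2E_IMPORTS = ["cypress", "@playwright/test"];
const INTEGRATION_IMPORTS = ["supertest", "chai-http", "@prisma/client", "typeorm", "knex"];

function classifyLevel(fileName: string, importedModules: string[]): PyramidLevel {
  // `.cy.*` and `.e2e.*` suffixes on any of the eight supported extensions.
  const isE2eFile = /\.(cy|e2e)\.[cm]?[jt]sx?$/.test(fileName);
  if (isE2eFile || importedModules.some((m) => E2E_IMPORTS.indexOf(m) !== -1)) {
    return "e2e";
  }
  if (importedModules.some((m) => INTEGRATION_IMPORTS.indexOf(m) !== -1)) {
    return "integration";
  }
  // Nothing crosses the I/O boundary as far as the imports show.
  return "unit";
}
```

For example, `classifyLevel("login.cy.ts", [])` yields `"e2e"`, a file importing `supertest` yields `"integration"`, and a file importing only `vitest` yields `"unit"`.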

## Case catalog

Codes shared with `falsegreen` (Python) keep the same id, so cross-language results
line up in the research. `JS*` codes are ecosystem-specific.

| Code | Confidence | What it flags |
|---|---|---|
| C2 | high | test with no check at all (empty body) |
| C2b | low | test calls code but asserts nothing |
| C5 | high | always-true check (`expect(true).toBe(true)`, `assert(1)`) |
| C6 | low | weak check — only verifies something came back (`toBeTruthy`/`toBeDefined`, `length > 0`) |
| C7 | high | compares a thing to itself (`expect(x).toBe(x)`) |
| C44 | high | numeric tautology — a length compared so the result is always true (`x.length >= 0`) |
| C20 | high | assertion in unreachable code (after a `return`/`throw`/`process.exit`, a `break`, a both-arms-terminating `if`, or an exhaustive `switch`) — it never runs |
| C23 | low | reads a real file at a literal path, or a hard-coded URL (mystery guest) |
| C8 | low | exact equality on a float (use `toBeCloseTo`) |
| C9 | low | `toThrow()` with no error type or message — accepts any error |
| C16 | low | result depends on `Date.now`, `Math.random`, or a fixed timer |
| C18 | low | compares `String(x)` / `JSON.stringify(x)` / `` `${x}` `` to a literal (formatting, not value) |
| C21 | low | every assertion is conditional — none runs unconditionally |
| C37 | low | duplicate case in `it.each`/`test.each` — the same scenario runs twice |
| C48 | low | dark patch — the test flips a test-mode flag (`process.env.NODE_ENV = "test"`, `process.env.TESTING`, a `TESTING` flag) then asserts, exercising the product's test-only branch |
| CC | low | commented-out assertion |
| JS1 | high | focused test (`it.only` / `fit`) silently skips the rest of the suite |
| JS2 | high | `expect(x)` with no matcher — the assertion never runs |
| JS3 | low | snapshot is the only assertion |
| JS4 | low | skipped test (`it.skip` / `xit` / `it.todo`) never runs |
| JS5 | low | async query/event not awaited (`findBy*` / `waitFor` / `user-event`) |
| JS6 | high | empty `describe`/`suite` — the suite is green but runs nothing |
| JS7 | low | assertion inside a non-awaited `setTimeout`/`then` callback — may run after the test ends |
| JS8 | low | mocks the unit under test (`jest.mock`/`vi.mock` of an imported module asserted directly) |
| JS9 | high | assertion in a dead branch (`if(false)` / `if(true){}else`) — never runs |
| JS11 | low | `try/catch` swallows the assertion — a failing `expect` is caught, test stays green |
| JS13 | low | query (`getBy*`/`queryBy*`) as a loose statement — its result is never asserted |
| JS15 | low | inappropriate assertion — comparison wrapped in a boolean (`expect(a===b).toBe(true)`), blind failure message |
| JS17 | low | commented-out test block (`// it(...)` / `// test(...)`) — disabled, no longer runs |
| JS18 | low | test takes a `done` callback instead of async/await — a mistimed `done` passes early |
| JS21 | high | matcher referenced but never called (`expect(x).toBe` with no `()`) — the assertion never runs |
| JS22 | high | empty `it.each`/`test.each` table — generated with zero cases, never runs |
| JS23 | high | `expect.assertions(N)` with fewer unconditional `expect()` calls than N — the guard can never be met |
| JS24 | low | Cypress query (`cy.get`/`cy.find`/`cy.contains`) as a loose statement with no terminating `.should`/`.and` and no `expect` in `.then` — its result is never asserted |

Each code carries a judgment tag (J1-J6) shared with the
[falsegreen-skill](https://github.com/vinicq/falsegreen-skill) semantic framework.

### Opt-in: maintainability group (default off)

These are **not** false-green - the test still protects something - so they are off by
default. Enable them with `--diagnostics`, or per code via config `severity`. They are a
"plus" for test-code health, mirroring falsegreen's diagnostic/coupling groups.

| Code | Group | What it flags |
|---|---|---|
| D1 | diagnostic | assertion roulette — many assertions in one test |
| D3 | diagnostic | duplicate assert — the same assertion repeated |
| D4 | diagnostic | `it.each`/`test.each` without titled cases (index-only) |
| D6 | diagnostic | `console.*` in a test body |
| D7 | diagnostic | anonymous test — empty or missing description |
| D8 | diagnostic | magic number — a bare numeric literal as the expected value |
| M2 | coupling | test body exceeds the line-count threshold |

```bash
npx falsegreen-js --diagnostics # include D*/M* as warnings
```

### Deliberately not implemented

Some catalog codes were reviewed and left out, on purpose:

- **JS19** (`toBe` on an object/array literal): `expect(x).toBe({...})` compares by reference,
so it always fails. That is the false-red axis (a test that always fails), the opposite of
what this scanner looks for, and out of scope on principle.
- **JS20** (a Promise compared without `resolves`/`rejects`): telling that a value is a
Promise needs type information the AST does not carry, so it would be too noisy.
- **JS12** (a floating promise whose `expect` is never returned): already covered by JS7.
- **JS16** (`async` test with no `expect.assertions(n)`): the *absence* of a guard is not a
smell on its own; flagging it would fire on most async tests. The implemented sibling is
`JS23`, which fires on a present-but-unsatisfiable guard: `expect.assertions(N)` with a
numeric `N` higher than the unconditional `expect()` calls that can run, so the count can
never be met.
- **JS14** (a giant inline snapshot): a readability and review-noise concern, not a
false-green one. The snapshot still protects, so it belongs to the diagnostic group and is
better served by `eslint-plugin-jest` (`no-large-snapshots`) as an opt-in lint rule.
- **JS10** (any conditional in a test body): handled by `eslint-plugin-jest`
(`no-conditional-in-test`); JS9 and C21 already cover the false-green subset.
- **C1** (an assertion under an `if`/`for` that may not run): redundant once C21 and JS9
exist, and high-FP on its own. C21 already fires the actual false-green case, where
*every* assertion is conditional and the test can pass with nothing checked. A test that
mixes a conditional assertion with an unconditional one is not false-green: the
unconditional assertion still protects. JS9 covers the dead-branch form (`if(false)`).
Flagging every conditional assertion (C1's full scope) is the linter concern JS10 already
names (`no-conditional-in-test`), so C1 would add noise without a new false-green signal.

### What carries over from falsegreen, what does not

Ported (same concept): C2, C2b, C5, C7, C8, C16, C44, C48, CC.

Python-only, not applicable to JS/TS: pytest collection rules (C4 family), `pytest.raises`
breadth (C9/C19/C27/C28), fixtures and `os.environ`/global-state codes (C23/C24/C29),
sklearn/torch/tensorflow metric and seed codes (C33, parts of C16), xfail (C25), and the
xunit/`self.assert*` codes. These have no JS equivalent or need a different signal.

JS/TS-only (new here): the `JS*` codes above, starting with JS1-JS5. The `describe.only`/skip,
snapshot, no-matcher, and not-awaited patterns are specific to the JS test runners and Testing Library.

## Configuration

Optional. `falsegreen.json`, `.falsegreenrc.json`, or a `"falsegreen"` key in
`package.json`:

```json
{
  "disable": ["C8"],
  "exclude": ["**/legacy/**"],
  "severity": { "JS3": "off", "C16": "high" }
}
```

Precedence: CLI `--disable` > CLI `--enable` > config `disable`/`severity` > catalog default. `--enable` re-activates the listed off or opt-in codes at their catalog severity (it flips a default-off code on; it cannot raise a code above its catalog severity). A code passed to both `--enable` and `--disable` stays off, because `--disable` wins.
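
The precedence chain can be sketched as a single resolver. The types and names below are hypothetical, and the relative order of config `disable` and config `severity` within the same tier is an assumption of this sketch.

```typescript
type Level = "high" | "low" | "off";

// Hypothetical catalog entry: the severity a code has when active, and whether it is opt-in.
interface CatalogEntry {
  severity: "high" | "low";
  defaultOff: boolean;
}

interface SeverityOptions {
  cliDisable: string[];
  cliEnable: string[];
  configDisable: string[];
  configSeverity: Partial<Record<string, Level>>;
}

function effectiveSeverity(code: string, entry: CatalogEntry, options: SeverityOptions): Level {
  // 1. CLI --disable always wins.
  if (options.cliDisable.indexOf(code) !== -1) return "off";
  // 2. CLI --enable restores the catalog severity, never more.
  if (options.cliEnable.indexOf(code) !== -1) return entry.severity;
  // 3. Config: disable first, then an explicit severity (order within this tier is assumed).
  if (options.configDisable.indexOf(code) !== -1) return "off";
  const configured = options.configSeverity[code];
  if (configured !== undefined) return configured;
  // 4. Catalog default: opt-in codes stay off.
  return entry.defaultOff ? "off" : entry.severity;
}

const js3Entry: CatalogEntry = { severity: "low", defaultOff: false };
const js3Options: SeverityOptions = {
  cliDisable: [],
  cliEnable: ["JS3"],
  configDisable: [],
  configSeverity: { JS3: "off" },
};
// "low": --enable outranks the config's "off".
const js3Resolved = effectiveSeverity("JS3", js3Entry, js3Options);
```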

## Scope and honesty

This is a static scanner. It owns what the structure proves. Two things it does not
decide: whether the expected value contradicts the intended behavior, and whether the
test re-implements the production logic. Those are semantic and belong to the
`falsegreen-skill` LLM pass. Precision over recall: a softened heuristic that misses a
case is preferred to one that flags correct code.

Measured against the [Open Catalog of Test Smells](https://test-smell-catalog.readthedocs.io/) (517 documented smells), only the false-green slice is in scope. What stays out, on purpose: **brittleness / false-red** (sensitive equality, brittle assertions - the opposite axis), **hygiene / maintainability** (assertion roulette, magic numbers, long tests - linter territory, a few surfaced as opt-in diagnostics), and **slow, design, naming, duplication, runtime/culture** (none about whether the test protects). The boundary is deliberate: where a smell has a statically provable false-green form, that form is a code here - uncontrolled `Date.now`/`Math.random` is `C16`, a hard-coded path or URL is `C23`, an assertion that may never run is `C21`/`C20`, and JS-specific forms (focused tests, missing matchers) are the `JS*` codes. See [CREDITS.md](CREDITS.md) for the full cross-walk.

## References

The catalog is grounded in the test-smell literature. Direct influences: the
rotten-green-test work that names this whole family (Delplanque et al., ICSE 2019),
the founding test-smell refactoring catalog (van Deursen et al., XP 2001), the
JS/TS empirical studies (Jorge, UFCG 2023; Silva, PUC Minas 2022 - the academic
precedent for the focused-test and snapshot codes; Oliveira et al., SBES 2024/2025),
and the detection-tool baselines (tsDetect, Peruma et al., 2020). Full list and the
code-to-source mapping in [CREDITS.md](CREDITS.md).

## Status

The rule set is a deterministic core; the full JS/TS smell catalog is tracked as
research in the private audit hub. See [STATUS.md](STATUS.md) for the current version
and rule coverage. Issues and PRs welcome.

## License

MIT, Vinicius Queiroz.

## Contributors ✨

Thanks to the people who keep false-green tests out of real suites ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

[![All Contributors](https://img.shields.io/badge/all_contributors-2-orange.svg?style=flat-square)](#contributors-)



| Contributor | Contributions |
|---|---|
| Vinicius Queiroz | 💻 📖 🤔 🚧 🚇 ⚠️ 🔬 |
| Home Seller | 💻 📖 ⚠️ 🚇 |

New contributors are added automatically; the table also recognizes non-code work (docs, ideas, infrastructure, tests, research) via the [all-contributors](https://allcontributors.org) spec.