{"id":27376266,"url":"https://github.com/bert-w/sqomplexity","last_synced_at":"2026-02-07T00:01:40.375Z","repository":{"id":206852459,"uuid":"673956385","full_name":"bert-w/sqomplexity","owner":"bert-w","description":"SQompLexity is a Node.js program that assigns a complexity score to SQL SELECT queries, based on a data and cognitive complexity score.","archived":false,"fork":false,"pushed_at":"2024-12-24T10:43:59.000Z","size":1077,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"2.x","last_synced_at":"2025-08-16T09:00:06.785Z","etag":null,"topics":["complexity","metric","mysql","sql"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bert-w.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-08-02T20:07:42.000Z","updated_at":"2025-07-29T08:40:16.000Z","dependencies_parsed_at":"2024-02-26T19:39:59.279Z","dependency_job_id":"7586540d-afa4-4bd7-8b00-aea657c88e7b","html_url":"https://github.com/bert-w/sqomplexity","commit_stats":null,"previous_names":["bert-w/sqomplexity"],"tags_count":15,"template":false,"template_full_name":null,"purl":"pkg:github/bert-w/sqomplexity","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bert-w%2Fsqomplexity","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bert-w%2Fsqomplexity/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bert-w%2Fsqomplexity/releases","manifests_url":"https://repos.ecosyste.m
s/api/v1/hosts/GitHub/repositories/bert-w%2Fsqomplexity/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bert-w","download_url":"https://codeload.github.com/bert-w/sqomplexity/tar.gz/refs/heads/2.x","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bert-w%2Fsqomplexity/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29181265,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-06T23:15:33.022Z","status":"ssl_error","status_checked_at":"2026-02-06T23:15:09.128Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["complexity","metric","mysql","sql"],"created_at":"2025-04-13T12:36:12.628Z","updated_at":"2026-02-07T00:01:40.359Z","avatar_url":"https://github.com/bert-w.png","language":"JavaScript","readme":"# SQompLexity\n[![Build Status](https://github.com/bert-w/sqomplexity/actions/workflows/tests.yml/badge.svg)](https://github.com/bert-w/sqomplexity/actions)\n[![NPM Version](http://img.shields.io/npm/v/sqomplexity.svg?style=flat)](https://www.npmjs.org/package/sqomplexity)\n[![NPM Downloads](https://img.shields.io/npm/dm/sqomplexity.svg?style=flat)](https://npmcharts.com/compare/sqomplexity?minimal=true)\n[![Install Size](https://packagephobia.now.sh/badge?p=sqomplexity)](https://packagephobia.now.sh/result?p=sqomplexity)\n```txt\n   _____   ____                            _     
             _  _          \n  / ____| / __ \\                          | |                (_)| |         \n | (___  | |  | |  ___   _ __ ___   _ __  | |      ___ __  __ _ | |_  _   _ \n  \\___ \\ | |  | | / _ \\ | '_ ` _ \\ | '_ \\ | |     / _ \\\\ \\/ /| || __|| | | |\n  ____) || |__| || (_) || | | | | || |_) || |____|  __/ \u003e  \u003c | || |_ | |_| |\n |_____/  \\___\\_\\ \\___/ |_| |_| |_|| .__/ |______|\\___|/_/\\_\\|_| \\__| \\__, |\n                                   | |                                 __/ |\n     Calculate complexity scores   |_|   for SQL queries              |___/ \n     \n```\nSQompLexity is a metric that assigns a complexity score to SQL queries. It is specifically tailored to work with\nMySQL queries, but other dialects of SQL will likely work as well. It needs no knowledge of the database schema and\nquantifies each query in a vacuum.\n\n## Installation\n```shell\nnpm i sqomplexity\n```\n\n## Demo\nhttps://bert-w.github.io/sqomplexity/\n\n## Usage instructions\n### Execution in Node (v16, v18, v20)\n```js\nimport { Sqomplexity } from 'sqomplexity';\n\n(async () =\u003e {\n    const sqomplexity = new Sqomplexity([\n        \"SELECT * FROM users\",\n    ]);\n    \n    console.log(\n        await sqomplexity.score()\n    );\n    \n    // Result: [ 2.40625 ]\n})();\n```\nSee [examples/node.js](examples/node.js) for a full example.\n\n### Execution in a browser\nUse the precompiled [dist/sqomplexity.umd.js](dist/sqomplexity.umd.js) file:\n```html\n\u003cscript src=\"sqomplexity.umd.js\"\u003e\u003c/script\u003e\n\u003cscript\u003e\n    (async() =\u003e {\n        // The UMD build exposes the `$sqomplexity` global constructor.\n\n        console.log(\n            await (new window.$sqomplexity('SELECT * FROM users')).score()\n        )\n\n        // Result: [ 7.876953 ]\n    })();\n\u003c/script\u003e\n```\nSee [examples/browser.html](examples/browser.html) for a full example.\n\n### Execution as a Stand-alone CLI application\nUse the 
precompiled [dist/sqomplexity.js](dist/sqomplexity.js) containing all required code in a single file.\n\nOptions:\n```shell\nnode sqomplexity.js --help\n\nArguments:\n  queries                  one or multiple SQL queries (space separated or quoted)\n\nOptions:\n  -V, --version            output the version number\n  -f, --files              assumes the given arguments/queries are filepaths, and it will read the contents from them.\n                           Every file is expected to contain 1 query; if not, their complexity is summed\n  -b, --base64             assumes the given arguments/queries are base64 encoded\n  -s, --score              output only the complexity score. -1 will be returned if an error occurs\n  -w, --weights \u003cweights\u003e  takes a path to a json file that defines a custom set of weights\n  -a, --all                returns all data including the AST\n  -p, --pretty-print       output JSON with indentation and newlines (default: false)\n  -h, --help               display help for command\n```\nSee [examples/cli.sh](examples/cli.sh) for various examples.\n\n## Explanation of the complexity metric\n\nThe scoring of an SQL query is based on two major components:\n\n**Data complexity** (see prefix **D** in the table below), also called [_Computational complexity_](https://en.wikipedia.org/wiki/Computational_complexity), which takes into account elements like the _number of rows_\nthat a query operates on (relatively speaking), the _computation paths_ a query may take, and the usage of\n_table indexes_ (_indices_). All of these determine the computational cost of a certain component.\n\n**Cognitive complexity** (see prefix **C** in the table below), which describes the mental effort and the concepts a\nperson must understand in order to parse the query. 
This includes components like understanding of [_First-order logic_](https://en.wikipedia.org/wiki/First-order_logic),\nunderstanding of _grouping_, _filtering_ and _sorting_ (common SQL concepts), and [_Domain knowledge_](https://en.wikipedia.org/wiki/Domain_knowledge)\nlike the context of the query compared to its database schema.\n\n### Complexity indicators\n| Code                 | Explanation                                                                                    |\n|----------------------|------------------------------------------------------------------------------------------------|\n| *Indexing behavior*  |                                                                                                |\n| D1-A                 | No possibility to affect the chosen index                                                      |\n| D1-B                 | Low possibility to affect the chosen index                                                     |\n| D1-C                 | High possibility to affect the chosen index                                                    |\n|                      |                                                                                                |\n| *Running time*       |                                                                                                |\n| D2-A                 | $O(0)$ (negligible) running time w.r.t. the number of rows                                     |\n| D2-B                 | $O(1)$ (constant) running time w.r.t. the number of rows                                       |\n| D2-C                 | $O(\\log n)$ (logarithmic) running time w.r.t. the number of rows                               |\n| D2-D                 | $O(n)$ (linear) running time w.r.t. the number of rows                                         |\n| D2-E                 | $O(n \\log n)$ (linearithmic) running time w.r.t. 
the number of rows                            |\n| D2-F                 | $O(x)$ (highly variable) running time w.r.t. the number of rows                                |\n|                      |                                                                                                |\n| *Relational algebra* |                                                                                                |\n| C1                   | Requires understanding of *projection* (selection of columns)                                  |\n| C2                   | Requires understanding of *selection* (e.g. boolean logic like (in)equalities and comparisons) |\n| C3                   | Requires understanding of *composition* (multiple tables, column relations, set theory)        |\n| C4                   | Requires understanding of *grouping*                                                           |\n| C5                   | Requires understanding of *aggregation*                                                        |\n|                      |                                                                                                |\n| *Programming*        |                                                                                                |\n| C6                   | Requires understanding of *data types* (e.g. 
integers, decimals, booleans, dates, times)       |\n| C7                   | Requires understanding of variable *scopes*                                                    |\n| C8                   | Requires understanding of *nesting*                                                            |\n|                      |                                                                                                |\n| *Usage*              |                                                                                                |\n| C9-A                 | One parameter                                                                                  |\n| C9-B                 | Low amount of parameters                                                                       |\n| C9-C                 | High amount of parameters                                                                      |\n| C10                  | Requires understanding of the *database schema*                                                |\n| C11                  | Requires understanding of the *RDBMS* toolset (e.g. function support and differences)          |\n\nWhat follows is the assignment of each of these indicators to components of an SQL query. The table below shows the\nresult of this process. 
The presence and combination of these indicators determine a final weighting for each\ncomponent, namely **Low**, **Medium** or **High**.\n\n### Complexity scoring\n| Component                   | Data Complexity | By            | Cognitive Complexity | By                            |\n|-----------------------------|-----------------|---------------|----------------------|-------------------------------|\n| **Clause:SELECT**           | Low             | D1-A, D2-D    | Low                  | C1, C6, C9-B, C10             |\n| **Clause:FROM**             | Medium          | D1-B, D2-D    | Low                  | C3, C7, C9-A, C10             |\n| **Clause:JOIN**             | Medium          | D1-C, D2-F    | Medium               | C2, C3, C7, C9-B, C10         |\n| **Clause:WHERE**            | High            | D1-C, D2-C/D  | Medium               | C2, C6, C9-B, C10             |\n| **Clause:GROUP BY**         | High            | D1-C, D2-D/E  | High                 | C2, C4, C5, C9-B, C10         |\n| **Clause:HAVING**           | Medium          | D1-A, D2-D    | High                 | C2, C4, C5, C9-C, C10         |\n| **Clause:ORDER BY**         | Low             | D1-C, D2-D/E  | Medium               | C6, C9-B, C10                 |\n| **Clause:LIMIT**            | Low             | D1-A, D2-B    | Low                  | C9-A                          |\n| **Clause:OFFSET**           | Low             | D1-A, D2-B    | Low                  | C9-A                          |\n| **Expression:Table**        | Medium          | D1-B, D2-A    | Medium               | C9-A, C10                     |\n| **Expression:Column**       | Medium          | D1-B, D2-A    | Medium               | C6, C9-A, C10                 |\n| **Expression:String**       | Low             | D1-A, D2-A    | Low                  | C6, C9-A                      |\n| **Expression:Number**       | Low             | D1-A, D2-A    | Low                  | C6, C9-A                      
|\n| **Expression:Null**         | Low             | D1-A, D2-A    | Low                  | C6, C9-A                      |\n| **Expression:Star**         | Low             | D1-A, D2-A    | Low                  | C1, C9-A                      |\n| **Expression:Unary**        | Low             | D1-A, D2-A    | Medium               | C2, C6, C9-A                  |\n| **Expression:Binary**       | Low             | D1-A, D2-A    | Medium               | C2, C6, C9-B                  |\n| **Expression:Function**     | High            | D1-B, D2-D    | Medium               | C6, C9-A, C11                 |\n| **Expression:List**         | Low             | D1-C, D2-A    | Low                  | C6, C9-C                      |\n| **Expression:Agg-Function** | High            | D1-B, D2-F    | High                 | C4, C5, C9-A, C10, C11        |\n| **Operator**                | Low             | D1-C, D2-A    | Medium               | C2, C6, C9-B                  |\n| **Emergent:Cycle**          | Medium          | D1-B, D2-F    | High                 | C2, C3, C9-C, C10             |\n| **Emergent:Mixed-Style**    | None            | D1-A, D2-A    | Medium               | C9-C                          |\n| **Emergent:Subquery**       | High            | D1-C, D2-F    | High                 | C1, C2, C3, C7, C8, C9-C, C10 |\n| **Emergent:Variety**        | None            | D1-A, D2-A    | Medium               | C9-C                          |\n\n### Calculation\nEach query that passes through SQompLexity is parsed into an Abstract Syntax Tree (AST), which provides the backbone of\nthe algorithm that sums up the weights. 
Each query is traversed fully (including subqueries), and the scores are summed\nto result in a final SQompLexity score for any given SQL query.\n\nThe numerical weights for each of the groups are as follows:\n\n| **Category**         | **Numerical Score** |\n|----------------------|---------------------|\n| Data Complexity      | 50%                 |\n| Cognitive Complexity | 50%                 |\n|                      |                     |\n| Low                  | 1.0                 |\n| Medium               | 1.25                |\n| High                 | 1.5                 |\n\nThe equal contribution of both _Data Complexity_ and _Cognitive Complexity_ is arbitrary, and research could still be done\nto develop a distribution that more fairly approaches a general sense of _complexity_.\n\nSimilarly, the weights of _Low_, _Medium_ and _High_ are set to some sensible defaults. It is necessary, though, for all\nweights to be greater than or equal to 1, since multiplication may take place during the algorithm.\n\n## Project Origin\nThis is a product of my master's thesis on complexity progression and correlations on Stack Overflow. For this study, I developed an SQL complexity metric to be used on question and answer data from Stack Overflow.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbert-w%2Fsqomplexity","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbert-w%2Fsqomplexity","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbert-w%2Fsqomplexity/lists"}