{"id":34034132,"url":"https://github.com/largecats/sparksql-formatter","last_synced_at":"2026-04-05T01:31:37.145Z","repository":{"id":49194588,"uuid":"273904982","full_name":"largecats/sparksql-formatter","owner":"largecats","description":"A SparkSQL formatter based on https://github.com/zeroturnaround/sql-formatter, with customizations and extra features.","archived":false,"fork":false,"pushed_at":"2024-11-07T01:21:55.000Z","size":354,"stargazers_count":14,"open_issues_count":1,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-12-15T10:47:38.475Z","etag":null,"topics":["formatter","python","query-language","sparksql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/largecats.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-06-21T13:00:48.000Z","updated_at":"2024-11-07T01:21:59.000Z","dependencies_parsed_at":"2024-11-07T02:25:24.955Z","dependency_job_id":"fa307923-30b0-4547-b839-baea5b62c643","html_url":"https://github.com/largecats/sparksql-formatter","commit_stats":{"total_commits":179,"total_committers":2,"mean_commits":89.5,"dds":"0.23463687150837986","last_synced_commit":"530108a37c938e02f8b6de3ef753f562eb5367f5"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/largecats/sparksql-formatter","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/largecats%2Fsparksql-formatter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lar
gecats%2Fsparksql-formatter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/largecats%2Fsparksql-formatter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/largecats%2Fsparksql-formatter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/largecats","download_url":"https://codeload.github.com/largecats/sparksql-formatter/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/largecats%2Fsparksql-formatter/sbom","scorecard":{"id":579065,"data":{"date":"2025-08-11","repo":{"name":"github.com/largecats/sparksql-formatter","commit":"f7f8204f1491e46749eb116dbecdf4f83f91a7af"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.2,"checks":[{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":1,"reason":"Found 2/14 approved changesets -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Binary-Artifacts","score":10,"reason":"no 
binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 26 are checked with a SAST 
tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-20T18:40:24.666Z","repository_id":49194588,"created_at":"2025-08-20T18:40:24.667Z","updated_at":"2025-08-20T18:40:24.667Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31421869,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-05T00:25:07.052Z","status":"ssl_error","status_checked_at":"2026-04-05T00:25:05.923Z","response_time":60,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["formatter","python","query-language","sparksql"],"created_at":"2025-12-13T19:33:43.507Z","updated_at":"2026-04-05T01:31:37.130Z","avatar_url":"https://github.com/largecats.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sparksqlformatter\nA [SparkSQL](http://spark.apache.org/docs/latest/sql-ref.html) formatter in Python based on [sql-formatter](https://github.com/zeroturnaround/sql-formatter) and its fork [sql-formatter-plus](https://github.com/kufii/sql-formatter-plus), with customizations and extra features.\n\nUsed by: [pysqlformatter](https://github.com/largecats/pyspark-sql-formatter).\n\n- [sparksqlformatter](#sparksqlformatter)\n- 
[Installation](#installation)\n  - [Install using pip](#install-using-pip)\n  - [Install from source](#install-from-source)\n- [Compatibility](#compatibility)\n- [Usage](#usage)\n  - [Use as command-line tool](#use-as-command-line-tool)\n  - [Use as Python library](#use-as-python-library)\n- [Style configurations](#style-configurations)\n\n# Installation\n\n## Install using pip\n```\npip install sparksqlformatter\n```\n\n## Install from source\n1. Download the source code.\n2. Navigate to the source code directory.\n3. Run `python setup.py install` or `pip install .`.\n\n# Compatibility\nSupports Python 2.7 and 3.6+.\n\n# Usage\n`sparksqlformatter` can be used as either a command-line tool or a Python library.\n\n## Use as command-line tool\n```\nusage: sparksqlformatter [-h] [-f FILES [FILES ...]] [-i] [--style STYLE]\n\nFormatter for SparkSQL queries.\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -f FILES [FILES ...], --files FILES [FILES ...]\n                        Paths to files to format.\n  -i, --in-place        Format the files in place.\n  --style STYLE         Style configurations for SparkSQL. Can be a path to a style config file or a dictionary.\n```\n\n**Style**   \n\nThe `--style` argument specifies the formatting style. Supported attributes are listed in [style configurations](#style-configurations).\n\nThere are two ways to specify style:  \n* Path to a style config file. E.g.,\n```\n$ sparksqlformatter --style=\"\u003cpath_to_config_file\u003e\" -f \u003cpath_to_file1\u003e \u003cpath_to_file2\u003e\n```\nThe style config file should have a `[sparksqlformatter]` section with key-value pairs specifying attributes. E.g.,\n```\n[sparksqlformatter]\nreservedKeywordUppercase = False\nlinesBetweenQueries = 2\n```\n* Dictionary of configurations expressed as key-value pairs. 
E.g.,\n```\n$ sparksqlformatter --style=\"{'reservedKeywordUppercase': False}\" -f \u003cpath_to_file1\u003e \u003cpath_to_file2\u003e\n```\n\n## Use as Python library\n\nCall `sparksqlformatter.api.format_query()` to format a query given as a string:\n```\n\u003e\u003e\u003e from sparksqlformatter import api\n\u003e\u003e\u003e query = 'select c1 from t0'\n\u003e\u003e\u003e api.format_query(query)\n'SELECT\\n    c1\\nFROM\\n    t0'\n```\nCall `sparksqlformatter.api.format_file()` to format the queries in a file:\n```\n\u003e\u003e\u003e from sparksqlformatter import api\n\u003e\u003e\u003e api.format_file(\u003cpath_to_file\u003e, inPlace=False)\n...\n```\n\n**Style**   \n\nFormatting style can be specified via the `style` parameter of the API format functions.\n\nSimilar to the command-line tool, there are two ways to specify configurations when using `sparksqlformatter` as a Python library:   \n* Path to a style config file\n```\n\u003e\u003e\u003e from sparksqlformatter import api\n\u003e\u003e\u003e style = '\u003cpath_to_config_file\u003e'\n\u003e\u003e\u003e query = 'select c1 FROM t0'\n\u003e\u003e\u003e api.format_query(query, style)\n...\n```\n* Dictionary\n```\n\u003e\u003e\u003e from sparksqlformatter import api\n\u003e\u003e\u003e style = {'reservedKeywordUppercase': False}\n\u003e\u003e\u003e query = 'select c1 FROM t0'\n\u003e\u003e\u003e api.format_query(query, style)\n'select\\n    c1\\nfrom\\n    t0'\n```\n\n# Style configurations\n\n**`topLevelKeywords`**   \n\nA list of keywords that should start a query block when formatting. 
E.g.,\n```sql\nSELECT\n    [block]\nFROM\n    [block]\n```\nDefault to\n```python\nTOP_LEVEL_KEYWORDS = [\n    'ADD', 'AFTER', 'ALTER COLUMN', 'ALTER TABLE', 'CREATE TABLE', 'CROSS JOIN', 'DELETE FROM', 'EXCEPT',\n    'FETCH FIRST', 'FROM', 'GROUP BY', 'GO', 'HAVING', 'INNER JOIN', 'INSERT INTO', 'INSERT', 'JOIN',\n    'LEFT JOIN', 'LEFT OUTER JOIN', 'LIMIT', 'MODIFY', 'ORDER BY', 'OUTER JOIN', 'PARTITION BY', 'RIGHT JOIN',\n    'RIGHT OUTER JOIN', 'SELECT', 'SET CURRENT SCHEMA', 'SET SCHEMA', 'SET', 'UPDATE', 'VALUES', 'WHERE'\n]\n```\n\n**`topLevelKeywordsNoIndent`**   \n\nA list of top-level keywords that should not be indented when formatting. E.g., `UNION` in\n```sql\nSELECT\n    ...\nFROM\n    ...\nUNION\nSELECT\n    ...\nFROM\n    ...\n```\nDefault to\n```Python\nTOP_LEVEL_KEYWORDS_NO_INDENT = ['INTERSECT', 'INTERSECT ALL', 'MINUS', 'UNION', 'UNION ALL']\n```\n\n**`newlineKeywords`**   \n\nA list of keywords that should start a newline when formatting. E.g., `LEFT JOIN` in\n```sql\nSELECT\n    ...\nFROM\n    t0\n    LEFT JOIN t1 ...\n    LEFT JOIN t2 ...\n```\nNote that this is less restrictive than `topLevelKeywords`, since top-level keywords always start a newline.\nDefault to\n```python\nNEWLINE_KEYWORDS = [\n    'AND', 'ELSE', 'LATERAL', 'ON', 'OPTIONS', 'OR', 'PARTITIONED BY', 'THEN', 'USING', 'WHEN', 'XOR'\n]\n```\n\n**`stringTypes`**   \n\nA list of character pairs that enclose strings in the query language. Default to\n```python\n['\"\"', \"''\", '{}', '``']\n```\n\n**`openParens`**   \n\nA list of strings that behave as opening parentheses in the query language regarding block indent level. Default to\n```python\n['(', '[', 'CASE']\n```\n\n**`closeParens`**   \n\nA list of strings that behave as closing parentheses in the query language regarding block indent level. Default to\n```python\n[')', ']', 'END']\n```\n\n**`lineCommentTypes`**   \n\nA list of prefixes to comments in the query language. 
Default to\n```python\n['--']\n```\n\n**`reservedKeywordUppercase`**   \n\nA boolean indicating whether reserved keywords should be converted to uppercase when formatting. Default to `True`.\n\n**`linesBetweenQueries`**   \n\nAn integer that specifies the number of blank lines to put between (sub-)queries when formatting. E.g., with `linesBetweenQueries = 1`,\n```sql\nWITH t0 AS (\n    ...\n),\n\nt1 AS (\n    ...\n)\n\nSELECT\n    ...\nFROM\n    ...\n```\n\n**`specialWordChars`**   \n\nA list of characters that require special handling when formatting. Default to `[]`.\n\n**`indent`**   \n\nA string that specifies one indent. Default to four blanks:\n```python\n'    '\n```\n\n**`inlineMaxLength`**    \n\nMaximum length of an inline block. Default to `120`.\n\n**`splitOnComma`**    \n\nIf true, when a comma-separated list in a `GROUP BY` or `ORDER BY` clause is too long to fit on one line, split it so that each element is on its own line.\nOtherwise, split only at `inlineMaxLength`.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flargecats%2Fsparksql-formatter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flargecats%2Fsparksql-formatter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flargecats%2Fsparksql-formatter/lists"}