{"id":30772189,"url":"https://github.com/query-farm/python-sql-manipulation","last_synced_at":"2025-09-05T00:52:47.684Z","repository":{"id":303324722,"uuid":"1015105997","full_name":"Query-farm/python-sql-manipulation","owner":"Query-farm","description":"A Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from SQL WHERE clauses and filter SQL statements based on column availability.","archived":false,"fork":false,"pushed_at":"2025-07-07T02:42:11.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-07T03:10:08.726Z","etag":null,"topics":["predicate-filtering","predicate-logic","predicate-pushdown","sql","sqlglot"],"latest_commit_sha":null,"homepage":"https://query.farm","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Query-farm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-07T02:39:47.000Z","updated_at":"2025-07-07T02:43:16.000Z","dependencies_parsed_at":"2025-07-07T03:10:12.010Z","dependency_job_id":"e63069ed-7f7c-472a-a713-d0987701237e","html_url":"https://github.com/Query-farm/python-sql-manipulation","commit_stats":null,"previous_names":["query-farm/python-sql-manipulation"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Query-farm/python-sql-manipulation","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Query-farm%2Fpython-sql-manipulation","tags_url
":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Query-farm%2Fpython-sql-manipulation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Query-farm%2Fpython-sql-manipulation/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Query-farm%2Fpython-sql-manipulation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Query-farm","download_url":"https://codeload.github.com/Query-farm/python-sql-manipulation/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Query-farm%2Fpython-sql-manipulation/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273695250,"owners_count":25151484,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["predicate-filtering","predicate-logic","predicate-pushdown","sql","sqlglot"],"created_at":"2025-09-05T00:52:45.081Z","updated_at":"2025-09-05T00:52:47.666Z","avatar_url":"https://github.com/Query-farm.png","language":"Python","readme":"# [Query.Farm](https://query.farm) SQL Manipulation\n\nA Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.org)\n\n### Column Filtering with Complex Expressions\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation 
import transforms\n\nsql = '''\nSELECT * FROM users\nWHERE age \u003e 18\n  AND (status = 'active' OR role = 'admin')\n  AND department IN ('engineering', 'sales')\n'''\n\n# Parse the statement first\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references(\n    statement=statement,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age \u003e 18 AND role = 'admin'\nprint(result.sql())\n```\n\n## Features\n\n- **Predicate Removal**: Safely remove specific predicates from complex `SQL WHERE` clauses while preserving logical structure\n- **Column Filtering**: Filter SQL statements to only include predicates referencing allowed columns\n- **Intelligent Logic Handling**: Properly handles `AND/OR` logic, nested expressions, `CASE` statements, and parentheses\n- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation\n- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)\n\n## Installation\n\n```bash\npip install query-farm-sql-manipulation\n```\n\n## Requirements\n\n- Python \u003e= 3.12\n- SQLGlot \u003e= 26.33.0\n\n## Quick Start\n\n### Basic Predicate Removal\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# Parse a SQL statement\nsql = 'SELECT * FROM data WHERE x = 1 AND y = 2'\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Find the predicate you want to remove\npredicates = list(statement.find_all(sqlglot.expressions.Predicate))\ntarget_predicate = predicates[0]  # x = 1\n\n# Remove the predicate\ntransforms.remove_expression_part(target_predicate)\n\n# Result: SELECT * FROM data WHERE y = 2\nprint(statement.sql())\n```\n\n### Column-Name Based Filtering\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# 
Parse SQL statement first\n# Single quotes mark SQL string literals; DuckDB reads double quotes as identifiers\nsql = \"SELECT * FROM data WHERE color = 'red' AND size \u003e 10 AND type = 'car'\"\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Filter to only include predicates with allowed columns\nallowed_columns = {\"color\", \"type\"}\n\nfiltered = transforms.filter_column_references(\n    statement=statement,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM data WHERE color = 'red' AND type = 'car'\nprint(filtered.sql())\n```\n\n## API Reference\n\n### `remove_expression_part(child: sqlglot.Expression) -\u003e None`\n\nRemoves the specified SQLGlot expression from its parent, respecting logical structure.\n\n**Parameters:**\n- `child`: The SQLGlot expression to remove\n\n**Raises:**\n- `ValueError`: If the expression cannot be safely removed\n\n**Supported Parent Types:**\n- `AND`/`OR` expressions: Replaces parent with the remaining operand\n- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty\n- `Parentheses`: Recursively removes the parent\n- `NOT` expressions: Removes the entire NOT expression\n- `CASE` statements: Removes conditional branches\n\n### `filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -\u003e sqlglot.Expression`\n\nFilters a SQL statement to remove predicates containing columns that don't match the selector criteria.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n- `selector`: A callable that takes a Column and returns True if it should be preserved, False if it should be removed\n\n**Returns:**\n- Filtered SQLGlot expression with non-matching columns removed\n\n**Raises:**\n- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns\n\n### `where_clause_contents(statement: sqlglot.expressions.Expression) -\u003e sqlglot.expressions.Expression | None`\n\nExtracts the contents of the WHERE clause from a SQLGlot 
expression.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to extract from\n\n**Returns:**\n- The contents of the WHERE clause, or None if no WHERE clause exists\n\n### `filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -\u003e sqlglot.Expression`\n\nFilters out predicates that have column references on the right side of comparisons.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n\n**Returns:**\n- Filtered SQLGlot expression with right-side column reference predicates removed\n\n## Examples\n\n### Complex Logic Handling\n\nThe library intelligently handles complex logical expressions:\n\n```python\n# Original: (x = 1 AND y = 2) OR z = 3\n# Remove y = 2: x = 1 OR z = 3\n\n# Original: NOT (x = 1 AND y = 2)\n# Remove x = 1: NOT y = 2 (which becomes y \u003c\u003e 2)\n\n# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END\n# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END\n```\n\n### Column Filtering with Complex Expressions\n\n```python\nsql = '''\nSELECT * FROM users\nWHERE age \u003e 18\n  AND (status = 'active' OR role = 'admin')\n  AND department IN ('engineering', 'sales')\n'''\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references_statement(\n    sql=sql,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age \u003e 18 AND role = 'admin'\n```\n\n### Error Handling\n\nThe library will raise `ValueError` when predicates cannot be safely removed:\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# This will raise ValueError because x = 1 is part of a larger expression\nsql = \"SELECT * FROM data WHERE result = (x = 1)\"\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Cannot remove x = 1 because it's used as a value, not a predicate\n# This would raise ValueError if attempted\n```\n\n## 
Supported SQL Constructs\n\n- **Logical Operators**: `AND`, `OR`, `NOT`\n- **Comparison Operators**: `=`, `\u003c\u003e`, `\u003c`, `\u003e`, `\u003c=`, `\u003e=`, `LIKE`, `IN`, `IS NULL`, etc.\n- **Complex Expressions**: `CASE` statements, subqueries, function calls\n- **Nested Logic**: Parentheses and nested boolean expressions\n- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot\n\n## Testing\n\nRun the test suite:\n\n```bash\npytest src/query_farm_sql_manipulation/test_transforms.py\n```\n\nThe test suite includes comprehensive examples of:\n- Basic predicate removal scenarios\n- Complex logical expression handling\n- Error cases and edge conditions\n- Column filtering with various SQL constructs\n\n## Contributing\n\nThis project uses:\n- **Rye** for dependency management\n- **pytest** for testing\n- **mypy** for type checking\n- **ruff** for linting\n\n## Author\n\nThis Python module was created by [Query.Farm](https://query.farm).\n\n## License\n\nMIT Licensed.\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquery-farm%2Fpython-sql-manipulation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquery-farm%2Fpython-sql-manipulation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquery-farm%2Fpython-sql-manipulation/lists"}