{"id":31648943,"url":"https://github.com/gigapi/duckdb-gigapi-extension","last_synced_at":"2025-10-07T07:03:34.952Z","repository":{"id":300454465,"uuid":"1006211851","full_name":"gigapi/duckdb-gigapi-extension","owner":"gigapi","description":"Experimental Extension for GigaPI Catalog","archived":false,"fork":false,"pushed_at":"2025-06-29T21:14:01.000Z","size":105,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-09-28T03:44:39.240Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gigapi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":["lmangani"]}},"created_at":"2025-06-21T18:36:32.000Z","updated_at":"2025-07-18T02:46:10.000Z","dependencies_parsed_at":"2025-06-21T21:29:20.348Z","dependency_job_id":null,"html_url":"https://github.com/gigapi/duckdb-gigapi-extension","commit_stats":null,"previous_names":["lmangani/duckdb-gigapi-extension","gigapi/duckdb-gigapi-extension"],"tags_count":0,"template":false,"template_full_name":"duckdb/extension-template","purl":"pkg:github/gigapi/duckdb-gigapi-extension","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gigapi%2Fduckdb-gigapi-extension","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gigapi%2Fduckdb-gigapi-extension/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gigapi%2Fduckdb-gigapi-extension/releases","manifests_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gigapi%2Fduckdb-gigapi-extension/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gigapi","download_url":"https://codeload.github.com/gigapi/duckdb-gigapi-extension/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gigapi%2Fduckdb-gigapi-extension/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278734416,"owners_count":26036404,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-07T02:00:06.786Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-07T07:02:01.985Z","updated_at":"2025-10-07T07:03:34.946Z","avatar_url":"https://github.com/gigapi.png","language":"C++","funding_links":["https://github.com/sponsors/lmangani"],"categories":[],"sub_categories":[],"readme":"# \u003cimg src=\"https://github.com/user-attachments/assets/5b0a4a37-ecab-4ca6-b955-1a2bbccad0b4\" /\u003e\n\n# \u003cimg src=\"https://github.com/user-attachments/assets/74a1fa93-5e7e-476d-93cb-be565eca4a59\" height=25 /\u003e GigAPI DuckDB Extension\n\nThis extension provides transparent, metadata-driven query support for [GigAPI](https://github.com/gigapi)\n\n## Overview\n\nThe `gigapi` extension seamlessly rewrites DuckDB SQL queries using GigAPI metadata indices.\n\n- If an index is found, the extension 
dynamically rewrites the query to read the specific data files (e.g., Parquet files on S3) relevant to the query's time range and other filters.\n- If no index is found, the query is passed on to DuckDB's default planner, allowing you to work with regular tables as usual.\n\n## Configuration\n\nTo use this extension, you must first configure a secret in DuckDB to store the connection details for your Redis instance. The extension will look for a `redis` type secret with the name `gigapi`.\n\n### Creating the Secret\n\nYou can create the secret using the following SQL command. Replace the values for `host`, `port`, and `password` with your Redis instance's details.\n\n```sql\nCREATE SECRET gigapi (\n    TYPE redis,\n    HOST 'localhost',\n    PORT '6379',\n    PASSWORD 'your-password'\n);\n```\n\n**Parameters:**\n\n- `TYPE`: Must be `redis`.\n- `HOST`: The hostname or IP address of your Redis server. (Default: 'localhost')\n- `PORT`: The port number for your Redis server. (Default: '6379')\n- `PASSWORD`: The password for your Redis server. (Optional)\n\n\n## Usage: `gigapi()` Table Function\n\nThe primary way to use the extension is via the `gigapi()` table function. You pass a complete SQL query as a string to this function. The extension will then rewrite it using the metadata from Redis and execute it.\n\n### Example\n\n```sql\n-- Load the extension\nINSTALL gigapi FROM community;\nLOAD gigapi;\n\n-- Create the Redis secret for the GigAPI backend\nCREATE SECRET gigapi (\n    TYPE redis,\n    HOST '127.0.0.1',\n    PORT '6379',\n    PASSWORD ''\n);\n\n-- Use the gigapi() table function to run a query\nSELECT * FROM gigapi('SELECT * FROM my_measurement WHERE time \u003e now() - interval ''1 hour''');\n```\n\nBehind the scenes, the extension will perform the following steps:\n1. Parse the inner `SELECT` query.\n2. Check Redis for a key named `giga:idx:ts:my_measurement`.\n3. If the key exists, extract the time range from the `WHERE` clause.\n4. 
Fetch the relevant list of data files from the Redis sorted set.\n5. Rewrite the query to be `SELECT * FROM read_parquet(['file1.parquet', 'file2.parquet', ...]) WHERE time \u003e now() - interval '1 hour'`.\n6. Pass the rewritten query to the DuckDB planner for execution.\n\n## Transparent Query Hijacking with `GIGAPI`\n\nAs a more powerful alternative to the `gigapi()` table function, you can use the `GIGAPI` keyword at the beginning of any query. This will trigger the same query rewriting logic but allows you to use standard SQL syntax without wrapping your query in a string.\n\n### How it Works\nAny query prefixed with `GIGAPI ` will be automatically intercepted by the extension's query planner. The planner then rewrites the query based on the metadata found in Redis, just like the `gigapi()` function does.\n\n### Example\n\n```sql\n-- The same query, but using the GIGAPI keyword for transparent hijacking.\nGIGAPI SELECT * FROM my_measurement WHERE time \u003e now() - interval '1 hour';\n```\n\nBehind the scenes, the extension performs the same steps as the `gigapi()` table function, rewriting the query to read from specific data files before execution. If a query is not prefixed with `GIGAPI`, it will be handled by DuckDB's default planner.\n\n## Developer Information\n\n### Dry Run Function\n\nFor debugging and development, the extension provides a scalar function `gigapi_dry_run(sql_query)` that shows you how a query would be rewritten without actually connecting to Redis. 
Instead, it substitutes a dummy list of Parquet files for the file list that Redis would return.\n\n**Example:**\n```sql\nSELECT gigapi_dry_run('SELECT * FROM my_table WHERE value \u003e 10');\n```\n\n**Output:**\n```\nSELECT * FROM read_parquet(['dummy/file1.parquet', 'dummy/file2.parquet']) WHERE (\"value\" \u003e 10)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgigapi%2Fduckdb-gigapi-extension","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgigapi%2Fduckdb-gigapi-extension","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgigapi%2Fduckdb-gigapi-extension/lists"}