{"id":28233921,"url":"https://github.com/patricktrainer/duckdb-webhook-gateway","last_synced_at":"2025-07-13T19:08:10.674Z","repository":{"id":292897701,"uuid":"981661640","full_name":"patricktrainer/duckdb-webhook-gateway","owner":"patricktrainer","description":"DuckDB as both a storage mechanism and a computational engine for processing webhooks","archived":false,"fork":false,"pushed_at":"2025-05-12T18:25:10.000Z","size":5657,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-11T23:08:57.970Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/patricktrainer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-11T16:06:41.000Z","updated_at":"2025-05-12T18:25:11.000Z","dependencies_parsed_at":"2025-05-12T18:47:13.891Z","dependency_job_id":null,"html_url":"https://github.com/patricktrainer/duckdb-webhook-gateway","commit_stats":null,"previous_names":["patricktrainer/duckdb-webhook-gateway"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/patricktrainer/duckdb-webhook-gateway","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patricktrainer%2Fduckdb-webhook-gateway","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patricktrainer%2Fduckdb-webhook-gateway/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patricktrainer%2Fduckdb-webhook-gateway/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patricktrainer%2Fduckdb-webhook-gateway/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/patricktrainer","download_url":"https://codeload.github.com/patricktrainer/duckdb-webhook-gateway/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/patricktrainer%2Fduckdb-webhook-gateway/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265191214,"owners_count":23725283,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-18T21:10:49.133Z","updated_at":"2025-07-13T19:08:10.667Z","avatar_url":"https://github.com/patricktrainer.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DuckDB Webhook Gateway\n\nA powerful webhook processing system using DuckDB as both a storage mechanism and a computational engine.\n\n## Overview\n\nDuckDB Webhook Gateway is a flexible, high-performance system for processing, transforming, and routing webhook events. 
Unlike traditional webhook handlers that require custom code for each integration, this gateway uses SQL as a universal interface for data transformation and filtering.\n\nDuckDB serves as both the storage layer and the computational engine, enabling complex data operations directly on incoming webhook payloads without requiring intermediate ETL processes.\n\n### Key Features\n\n- **Dynamic webhook registration** with SQL-defined transformations and filtering\n- **Webhook-specific reference tables** for enriching webhook data with lookups\n- **Runtime-registered Python UDFs** for custom transformations beyond SQL\n- **Ad-hoc SQL query capabilities** for analytics, debugging, and auditing\n- **Thread-safe DuckDB operations** for reliable concurrent processing\n- **Built-in audit trail** of all webhook events (raw and transformed)\n- **Interactive webhook testing** with visualization of transformed data\n\n![webhookui](etc/duckdb-webhook-ui.gif)\n\n## Real-World Use Cases\n\n### 1. DevOps Event Router\n\nRoute GitHub or GitLab events to different services based on content:\n- Send PR events to code review tools\n- Route issues with security tags to security teams\n- Trigger CI/CD pipelines for specific branch events\n- Extract JIRA keys from commit messages to update tickets\n\n### 2. E-commerce Order Processing\n\nProcess incoming orders from multiple platforms:\n- Transform order payloads from different sources into a consistent format\n- Enrich orders with customer data from reference tables\n- Apply business rules via SQL filters (e.g., fraud detection)\n- Route high-value orders to priority fulfillment\n\n### 3. IoT Data Processing\n\nManage streams of IoT device data:\n- Filter out readings below sensor thresholds\n- Transform raw sensor data into actionable metrics\n- Enrich events with device metadata from reference tables\n- Trigger alerts based on anomaly detection\n\n### 4. 
Marketing Automation\n\nProcess webhook events from marketing platforms:\n- Transform campaign performance data into standardized formats\n- Join events with customer segments from reference tables\n- Filter for high-value conversion events\n- Route customer actions to appropriate teams based on behavior\n\n### 5. Financial Transaction Processing\n\nHandle payment webhook events with precision:\n- Transform transaction data from payment processors\n- Apply complex compliance and validation rules\n- Enrich transactions with account metadata\n- Route suspicious transactions for manual review\n\n## Installation and Setup\n\n### Prerequisites\n- Python 3.8+\n- pip\n\n### Standard Installation\n1. Clone the repository:\n```bash\ngit clone https://github.com/patricktrainer/duckdb-webhook-gateway.git\ncd duckdb-webhook-gateway\n```\n\n2. Install dependencies:\n```bash\npip install -r requirements.txt\n```\n\n3. Install the package in development mode:\n```bash\npip install -e .\n```\n\n4. Run the application:\n```bash\npython -m src.app\n```\n\nThe server will start on `http://localhost:8000`.\n\n### Docker Installation\n\nThe application can be deployed using Docker for easier setup and deployment.\n\n#### Prerequisites\n- Docker\n- Docker Compose\n\n#### Steps to Run with Docker\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/patricktrainer/duckdb-webhook-gateway.git\ncd duckdb-webhook-gateway\n```\n\n2. Configure environment (optional):\n   Edit the `docker-compose.yml` file to set your preferred API key and other configurations.\n\n3. Build and start the containers:\n```bash\ndocker-compose up -d\n```\n\n4. 
Access the application:\n   - Frontend: http://localhost:80\n   - Backend API: http://localhost:8000\n\n#### Docker Environment Variables\n\nYou can customize the Docker deployment by editing the environment variables in `docker-compose.yml`:\n\n- `WEBHOOK_GATEWAY_API_KEY`: API key for authenticating with the webhook gateway\n\n#### Data Persistence\n\nThe Docker setup uses a named volume (`webhook-data`) to persist the DuckDB database file across container restarts. You can manage this volume using standard Docker commands:\n\n```bash\n# List all volumes\ndocker volume ls\n\n# Inspect the webhook data volume\ndocker volume inspect duckdb-webhook-gateway_webhook-data\n\n# Backup the volume data\ndocker run --rm -v duckdb-webhook-gateway_webhook-data:/source -v $(pwd):/backup alpine tar -czvf /backup/webhook-data-backup.tar.gz -C /source .\n```\n\n## How It Works\n\n1. **Webhook Registration**: Define source paths, destination URLs, and SQL transformations.\n2. **Event Reception**: The gateway receives webhook events at the registered paths.\n3. **Data Transformation**: DuckDB applies the SQL transformations to the incoming JSON.\n4. **Filtering**: Optional SQL filter clauses determine which events to forward.\n5. **Enrichment**: Reference tables and custom UDFs can be used to enrich the data.\n6. **Delivery**: Transformed events are forwarded to destination endpoints.\n7. **Auditing**: All raw and transformed events are stored for analysis and replay.\n8. 
**Testing \u0026 Visualization**: Built-in webhook tester allows viewing both raw and transformed data.\n\n## Running Tests\n\nThe project includes a comprehensive test suite covering all aspects of the application:\n\n```bash\n# Install dev dependencies\npip install -e \".[dev]\"\n\n# Run all tests\npytest\n\n# Run tests with verbose output\npytest -v\n\n# Run specific test modules\npytest tests/test_db_manager.py\n\n# Run unit tests without integration tests\npytest -k \"not integration\"\n\n# Run only integration tests\npytest -k \"integration\"\n```\n\n## Example Usage\n\nHere are some examples of how to use the DuckDB Webhook Gateway:\n\n### 1. Registering a webhook\n\n```bash\n# Inside the single-quoted JSON body, each SQL single quote is written as '\\'' so the shell passes it through literally\ncurl -X POST http://localhost:8000/register \\\n  -H \"X-API-Key: default_key\" \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"source_path\": \"/github-events\",\n    \"destination_url\": \"https://example.com/webhook-handler\",\n    \"transform_query\": \"SELECT repository.name AS repo_name, sender.login AS sender, type AS event_type FROM {{payload}}\",\n    \"filter_query\": \"type IN ('\\''PushEvent'\\'', '\\''PullRequestEvent'\\'')\",\n    \"owner\": \"team-a\"\n  }'\n```\n\n### 2. Uploading a reference table\n\n```bash\n# Create a users.csv file:\n# user_id,username,department,role\n# 1,john_doe,engineering,developer\n# 2,jane_smith,product,manager\n# 3,bob_jones,engineering,devops\n\ncurl -X POST http://localhost:8000/upload_table \\\n  -H \"X-API-Key: default_key\" \\\n  -F \"webhook_id=\u003cwebhook_id\u003e\" \\\n  -F \"table_name=users\" \\\n  -F \"description=User information for enriching webhook data\" \\\n  -F \"file=@users.csv\"\n```\n\n### 3. 
Registering a Python UDF\n\n```bash\ncurl -X POST http://localhost:8000/register_udf \\\n  -H \"X-API-Key: default_key\" \\\n  -F \"webhook_id=a2b392f3-8cf7-43d5-936c-322d64c9f07e\" \\\n  -F \"function_name=extract_jira_key\" \\\n  -F 'function_code=def extract_jira_key(text: str) -\u003e str:\n    \"\"\"Extract JIRA issue keys from text\"\"\"\n    import re\n    if not text:\n        return None\n    match = re.search(r\"[A-Z]+-\\d+\", text)\n    return match.group(0) if match else None'\n```\n\n### 4. Testing Webhooks via the UI\n\nThe gateway includes a built-in webhook testing UI that allows you to:\n\n1. Select any registered webhook\n2. Craft custom JSON payloads\n3. Send test webhooks\n4. View complete processing results including:\n   - Original API response\n   - Raw payload as received\n   - Transformed data after SQL processing\n   - Delivery status and response details\n\nTo access the webhook tester:\n\n1. Open the web UI at `http://localhost:80` (or `http://localhost:8000` if using direct Python install)\n2. Navigate to the \"Webhook Tester\" section\n3. Select a webhook and customize your payload\n4. View the processed results in the tabbed interface\n\n### 5. 
Example admin queries\n\n```bash\n# Get all events for a specific source path\n# (the SQL string literal uses single quotes; DuckDB reads double-quoted text as an identifier)\ncurl -X POST http://localhost:8000/query \\\n  -H \"X-API-Key: default_key\" \\\n  -F \"query=SELECT r.id, r.timestamp, r.source_path, r.payload, t.success, t.response_code FROM raw_events r LEFT JOIN transformed_events t ON r.id = t.raw_event_id WHERE r.source_path = '/github-events' ORDER BY r.timestamp DESC LIMIT 10\"\n\n# Get success rate by webhook\ncurl -X POST http://localhost:8000/query \\\n  -H \"X-API-Key: default_key\" \\\n  -F 'query=SELECT w.source_path, COUNT(t.id) as total, SUM(CASE WHEN t.success THEN 1 ELSE 0 END) as success_count, CAST(SUM(CASE WHEN t.success THEN 1 ELSE 0 END) AS FLOAT) / COUNT(t.id) as success_rate FROM webhooks w JOIN transformed_events t ON w.id = t.webhook_id GROUP BY w.source_path'\n```\n\n## System Architecture\n\nThe system is built around DuckDB's unique capabilities:\n\n1. **JSON Processing**: Uses DuckDB's JSON functions to query directly against webhook payloads\n2. **In-Memory Processing**: Leverages DuckDB's high-performance query engine\n3. **Temporary Views**: Creates temporary views of payload data for transformation\n4. **User-Defined Functions**: Extends SQL capabilities with custom Python functions\n5. **Thread Safety**: Manages concurrent operations with query locks\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the Apache License Version 2.0 - see the LICENSE file for details.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatricktrainer%2Fduckdb-webhook-gateway","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpatricktrainer%2Fduckdb-webhook-gateway","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpatricktrainer%2Fduckdb-webhook-gateway/lists"}