{"id":38508196,"url":"https://github.com/hyperskill/querysight","last_synced_at":"2026-01-17T06:18:07.312Z","repository":{"id":274071107,"uuid":"829586518","full_name":"hyperskill/querysight","owner":"hyperskill","description":"ClickHouse Log-Driven dbt Project Enhancer","archived":false,"fork":false,"pushed_at":"2025-03-10T16:20:49.000Z","size":214,"stargazers_count":5,"open_issues_count":1,"forks_count":0,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-03-10T17:31:48.006Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hyperskill.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-16T18:25:52.000Z","updated_at":"2025-03-10T16:20:53.000Z","dependencies_parsed_at":"2025-02-10T17:27:21.025Z","dependency_job_id":"44807762-2383-4583-a360-3ddde6eca9d3","html_url":"https://github.com/hyperskill/querysight","commit_stats":null,"previous_names":["hyperskill/querysight"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/hyperskill/querysight","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperskill%2Fquerysight","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperskill%2Fquerysight/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperskill%2Fquerysight/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperskill%2Fquerysight/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/owners/hyperskill","download_url":"https://codeload.github.com/hyperskill/querysight/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hyperskill%2Fquerysight/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28502149,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-17T04:31:57.058Z","status":"ssl_error","status_checked_at":"2026-01-17T04:31:45.816Z","response_time":85,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-17T06:18:04.058Z","updated_at":"2026-01-17T06:18:07.291Z","avatar_url":"https://github.com/hyperskill.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# QuerySight: ClickHouse Log-Driven dbt Project Enhancer\n\nQuerySight helps optimize dbt projects by analyzing ClickHouse query logs, identifying inefficiencies, and suggesting improvements. 
It maps the logged queries to the models in your dbt project, so its recommendations can target specific models.\n\n## Key Features\n\n- 🔍 **Advanced Query Analysis**\n  - Parse and analyze ClickHouse query logs\n  - Track query frequency, duration, and memory usage patterns\n  - Filter queries by users, types, and custom criteria\n  - Intelligent pattern detection and categorization\n\n- 📊 **dbt Integration**\n  - Map queries to dbt models for coverage analysis\n  - Track model dependencies and relationships\n  - Identify unused or inefficient models\n  - Generate model-specific optimization recommendations\n\n- 🤖 **AI-Powered Optimization**\n  - Smart recommendations using OpenAI integration\n  - Pattern-based performance improvement suggestions\n  - Model-specific optimization strategies\n  - Best practices enforcement\n\n- 💾 **Performance \u0026 Usability**\n  - Intelligent caching system for faster repeated analysis\n  - Batch processing for large query logs\n  - Progress tracking with rich CLI interface\n  - Flexible output formats (CLI, JSON)\n\n## Prerequisites\n\n- Python 3.10+\n- ClickHouse database instance\n- OpenAI API key (optional, for AI-powered recommendations)\n- dbt project (recommended, for dbt integration features)\n\n## Installation\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/hyperskill/querysight.git\ncd querysight\n```\n\n2. 
Install dependencies:\n```bash\npython -m venv venv\nsource venv/bin/activate  # (or `venv\\Scripts\\activate` on Windows)\npip install -r requirements.txt\n```\n\n## Configuration\n\nCreate a `.env` file with your configuration (or copy from `.env.example`):\n\n```bash\n# ClickHouse connection. QuerySight needs read-only permissions on the system schema and on user schemas\nCLICKHOUSE_HOST=localhost\nCLICKHOUSE_PORT=9000\nCLICKHOUSE_USER=default\nCLICKHOUSE_PASSWORD=your_password\nCLICKHOUSE_DATABASE=default\n\n# OpenAI API Key (optional, only needed for AI-powered suggestions)\nOPENAI_API_KEY=your_openai_key\n\n# Optional dbt Configuration\nDBT_PROJECT_PATH=/path/to/dbt/project\n```\n\n## Usage\n\n### Analysis Command\n\n```bash\npython querysight.py analyze [OPTIONS]\n\nAnalysis Options:\n  --days INTEGER              Analysis timeframe [default: 7]\n  --focus [queries|models]    Analysis focus [default: queries]\n  --min-frequency INTEGER     Minimum query frequency [default: 5]\n  --min-duration INTEGER      Minimum query duration in ms\n  --sample-size INTEGER       Sample size for pattern analysis\n  --batch-size INTEGER        Batch size for processing\n\nFiltering Options:\n  --include-users TEXT       Include specific users (comma-separated)\n  --exclude-users TEXT       Exclude specific users (comma-separated)\n  --query-kinds TEXT         Filter by query kinds (SELECT,INSERT,etc)\n  --select-patterns TEXT     Filter specific patterns by pattern_id (pattern_ids are created at the first analysis step; select the patterns of interest in later steps)\n  --select-tables TEXT       Filter specific tables\n  --select-models TEXT       Filter specific dbt models\n\nOutput Options:\n  --sort-by TEXT            Sort by [frequency|duration|memory]\n  --page-size INTEGER       Results per page [default: 20]\n\nCache Options:\n  --cache / --no-cache      Use cached data [default: True]\n  --force-reset            Force cache reset\n\nAnalysis Level:\n  
--level TEXT             Analysis depth [data_collection|pattern_analysis|dbt_integration|optimization]\n  --dbt-project TEXT       dbt project path\n```\n\n### Export Command\n\nExport analysis results to JSON format:\n\n```bash\npython querysight.py export [OPTIONS]\n  --output TEXT    Output file path [default: stdout]\n```\n\n## Docker Support\n\nRun QuerySight in a containerized environment:\n\n```bash\n# Using docker-compose\ndocker-compose up --build\n\n# Or with Docker directly\ndocker build -t querysight .\ndocker run -it --network host \\\n  -v ~/.ssh:/root/.ssh:ro \\\n  -v /path/to/dbt:/app/dbt_project:ro \\\n  -v ./logs:/app/logs \\\n  -v ./.cache:/app/.cache \\\n  --env-file .env \\\n  querysight analyze --days 7\n```\n\n## Project Structure\n\n```\nquerysight/\n├── querysight.py           # Main CLI interface\n├── utils/\n│   ├── ai_suggester.py     # AI-powered recommendations\n│   ├── cache_manager.py    # Query cache management\n│   ├── data_acquisition.py # ClickHouse data fetching\n│   ├── dbt_analyzer.py     # dbt project analysis\n│   ├── dbt_mapper.py       # Query to model mapping\n│   ├── filtering.py        # Query filtering logic\n│   ├── models.py           # Data models\n│   └── sql_parser.py       # SQL parsing utilities\n├── tests/              # Test suite\n└── docker/            # Docker configuration\n```\n\n## Contributing\n\n1. Fork the repository\n2. Create your feature branch (`git checkout -b feature/amazing-feature`)\n3. Commit your changes (`git commit -m 'Add some amazing feature'`)\n4. Push to the branch (`git push origin feature/amazing-feature`)\n5. 
Open a Pull Request\n\n## License\n\nThis project is licensed under the MIT License - see the `LICENSE` file for details.\n\n## Acknowledgments\n\n- Built with [ClickHouse](https://clickhouse.com/) integration\n- Powered by [OpenAI](https://openai.com/) for intelligent recommendations\n- Integrates with [dbt](https://www.getdbt.com/) for data transformation analysis\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyperskill%2Fquerysight","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhyperskill%2Fquerysight","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhyperskill%2Fquerysight/lists"}