{"id":28918308,"url":"https://github.com/emllm/qry","last_synced_at":"2026-04-29T04:39:09.608Z","repository":{"id":300361630,"uuid":"1005975668","full_name":"emllm/qry","owner":"emllm","description":"ULTRA-FAST FILE SEARCH AND PROCESSING ENGINE","archived":false,"fork":false,"pushed_at":"2025-06-21T08:14:58.000Z","size":46,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-21T09:18:35.231Z","etag":null,"topics":["llm","mcp","process","qry","query","search","sql"],"latest_commit_sha":null,"homepage":"https://emllm.github.io/qry/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emllm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-21T07:54:22.000Z","updated_at":"2025-06-21T08:15:01.000Z","dependencies_parsed_at":"2025-06-21T09:18:37.995Z","dependency_job_id":"e63c6cb3-f688-4425-b664-aab091835378","html_url":"https://github.com/emllm/qry","commit_stats":null,"previous_names":["emllm/qry"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/emllm/qry","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emllm%2Fqry","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emllm%2Fqry/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emllm%2Fqry/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emllm%2Fqry/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/owners/emllm","download_url":"https://codeload.github.com/emllm/qry/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emllm%2Fqry/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261224290,"owners_count":23126930,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llm","mcp","process","qry","query","search","sql"],"created_at":"2025-06-22T02:02:22.079Z","updated_at":"2026-04-29T04:39:09.602Z","avatar_url":"https://github.com/emllm.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# qry\n\n[![CI](https://github.com/emllm/qry/actions/workflows/ci.yml/badge.svg)](https://github.com/emllm/qry/actions/workflows/ci.yml)\n[![License: Apache-2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)\n[![Python](https://img.shields.io/badge/python-3.8%2B-blue)](pyproject.toml)\n\nUltra-fast file search and metadata extraction tool.\n\n## Features\n\n- Fast filesystem search with optional depth, date, and size filters\n- **Optimized performance** — caching, parallel processing, date-based directory pruning\n- CLI modes: `search`, `interactive`, `batch`, `version`\n- HTTP API (FastAPI) for JSON and HTML search responses\n- Metadata extraction for matched files (size, timestamps, content type)\n- **Streaming results** — Ctrl+C stops search mid-way and outputs what was found so far\n- **Smart directory exclusions** — `.git`, `.venv`, `__pycache__`, `dist`, `node_modules` skipped by default\n- **YAML, 
JSON, and paths output** — machine-readable output for piping into other tools\n- **Python API** — `import qry; qry.search(...)` for use in other applications\n- **Regex search** — `--regex` flag for pattern matching in filenames and content\n- **Size filtering** — `--min-size` / `--max-size` with human-readable units (1k, 10MB, 1G)\n- **Sort results** — `--sort name|size|date`\n- **Content preview** — `--preview` shows matching line with context for content search\n- **Date filtering** — `--last-days`, `--after-date`, `--before-date` for time-based searches\n- **Parallel search** — `-w/--workers` for multi-threaded directory processing\n- **Priority-based search** — searches important directories first (src/, tests/), then lower priority (cache/, .git/)\n- **Incremental results** — shows results as they're found with timeout-based priority fallback\n\n## Installation\n\n### Poetry (recommended)\n\n```bash\npoetry install --with dev\n```\n\n### pip (minimal)\n\n```bash\npip install -r requirements.txt\n```\n\n## Quick start\n\n```bash\n# search in current directory (filename match, YAML output)\npoetry run qry \"invoice\"\n\n# search file contents with preview snippet\npoetry run qry \"def search\" -c -P --path ./qry\n\n# regex search, sorted by name\npoetry run qry \"\\.py$\" -r --sort name --path .\n\n# pipe-friendly output for shell pipelines\npoetry run qry \"TODO\" -c -o paths | xargs grep -n \"FIXME\"\n\n# show version and engines\npoetry run qry version\n```\n\n## CLI usage\n\n```bash\nqry search [query ...] 
[-f] [-c] [-r] [-P] [--type EXT1,EXT2] [--scope PATH | --path PATH]\n           [--depth N] [--last-days N] [--limit N] [--min-size SIZE] [--max-size SIZE]\n           [--sort name|size|date] [--exclude DIR] [--no-exclude]\n           [--output yaml|json|paths]\n\nqry interactive\nqry batch \u003cinput_file\u003e [--output-file FILE] [--format text|json|csv] [--workers N]\nqry version\n```\n\n### Search mode flags\n\n| Flag | Long form | Searches |\n|------|-----------|----------|\n| (none) | | filename (default) |\n| `-f` | `--filename` | filename only |\n| `-c` | `--content` | file contents |\n| `-r` | `--regex` | treat query as regular expression |\n\n### Filtering flags\n\n| Flag | Description |\n|------|-------------|\n| `-t EXT` | Filter by file type (comma-separated) |\n| `-d N` | Max directory depth |\n| `-l N` | Limit results (0 = unlimited, default) |\n| `--last-days N` | Files modified in last N days |\n| `--after-date YYYY-MM-DD` | Files modified after date |\n| `--before-date YYYY-MM-DD` | Files modified before date |\n| `-w N` | Workers for parallel search (default: 4) |\n| `--min-size SIZE` | Minimum file size (e.g. `1k`, `10MB`, `1G`) |\n| `--max-size SIZE` | Maximum file size (e.g. 
`100k`, `5MB`) |\n| `-e DIR` | Exclude extra directory (repeatable, comma-separated) |\n| `--no-exclude` | Disable all default exclusions |\n\n### Output flags\n\n| Flag | Description |\n|------|-------------|\n| `-o yaml` | YAML output (default) |\n| `-o json` | JSON output |\n| `-o paths` | One path per line — pipe-friendly |\n| `-P` | `--preview` — show matching line with context (with `-c`) |\n| `--sort` | Sort results by `name`, `size`, or `date` |\n\nDefault excluded directories: `.git` `.venv` `__pycache__` `dist` `node_modules` `.tox` `.mypy_cache`\n\n### Priority-based search\n\nWhen priority mode is enabled, directories are searched in order of importance:\n\n| Priority | Value | Directories |\n|----------|-------|-------------|\n| SOURCE | 100 | `src/`, `source/`, `lib/`, `code/` |\n| PROJECT | 90 | `tests/`, `test/`, `docs/`, `scripts/`, `examples/` |\n| CONFIG | 80 | `config/`, `.config/`, `settings/` |\n| MAIN | 70 | `main/`, `app/`, `core/`, `server/`, `client/` |\n| MODULES | 60 | `modules/`, `module/`, `components/`, `packages/`, `plugins/` |\n| UTILS | 50 | `utils/`, `helpers/`, `tools/` |\n| BUILD | 40 | `build/`, `dist/`, `out/`, `target/`, `release/` |\n| CACHE | 30 | `cache/`, `__pycache__/`, `node_modules/`, `.pytest_cache/` |\n| TEMP | 20 | `temp/`, `tmp/`, `.tmp/` |\n| GENERATED | 10 | `generated/`, `compiled/`, `bin/`, `obj/` |\n| EXCLUDED | 0 | `.git/`, `.svn/`, `.venv/`, `venv/`, `.idea/`, `.vscode/` |\n\nThis ensures that important directories (source code) are searched first, while cache and temporary directories are searched last.\n\n### Examples\n\n```bash\n# search by filename (default)\npoetry run qry \"invoice\"\n\n# search inside file contents — press Ctrl+C to stop early\npoetry run qry \"def search\" -c\npoetry run qry \"TODO OR FIXME\" -c --type py --path ./src\n\n# regex search for Python files\npoetry run qry \"\\.py$\" -r --sort name -s qry/\n\n# content search with preview snippet\npoetry run qry \"search\" -c -P --sort 
name -s qry/ -d 2\n\n# filter by file size\npoetry run qry \"\" --min-size 10k --max-size 1MB --sort size\n\n# JSON output for piping\npoetry run qry \"invoice\" -o json | jq '.results[]'\n\n# pipe-friendly: one path per line\npoetry run qry \"TODO\" -c -o paths | xargs grep -n \"FIXME\"\npoetry run qry \"invoice\" -o paths | xargs -I{} cp {} /backup/\n\n# exclude extra directories\npoetry run qry \"config\" -e build -e \".cache\"\n\n# disable all exclusions (search everything)\npoetry run qry \"config\" --no-exclude\n\n# combine scope/depth/type/date\npoetry run qry \"invoice OR faktura\" --scope /data/docs --depth 3\npoetry run qry search \"report\" --type pdf,docx --last-days 7\npoetry run qry batch queries.txt --format json --output-file results.json\n\n# date filtering examples\npoetry run qry \"report\" --last-days 30          # files modified in last 30 days\npoetry run qry \"invoice\" --after-date 2026-01-01    # files after Jan 1, 2026\npoetry run qry \"invoice\" --before-date 2025-12-31  # files before Dec 31, 2025\npoetry run qry \"report\" --after-date 2026-01-01 --before-date 2026-02-01  # date range\n```\n\n## Python API\n\nUse `qry` directly from Python — no subprocess needed:\n\n```python\nimport qry\n\n# Return all matching file paths as a list\nfiles = qry.search(\"invoice\", scope=\"/data/docs\", mode=\"content\", depth=3)\n\n# Stream results one at a time (memory-efficient, supports Ctrl+C)\nfor path in qry.search_iter(\"TODO\", scope=\"./src\", mode=\"content\"):\n    print(path)\n\n# Regex search with sorting\npy_files = qry.search(r\"test_.*\\.py$\", scope=\".\", regex=True, sort_by=\"name\")\n\n# Size filtering — find large files\nbig = qry.search(\"\", scope=\".\", min_size=1024*1024, sort_by=\"size\")\n\n# Custom exclusions\nfiles = qry.search(\"config\", exclude_dirs=[\".git\", \"build\", \".venv\"])\n```\n\nParameters for both `qry.search()` and `qry.search_iter()`:\n\n| Parameter | Type | Default | Description 
|\n|-----------|------|---------|-------------|\n| `query_text` | str | — | Text to search for |\n| `scope` | str | `\".\"` | Directory to search |\n| `mode` | str | `\"filename\"` | `\"filename\"`, `\"content\"`, or `\"both\"` |\n| `depth` | int\\|None | `None` | Max directory depth |\n| `file_types` | list\\|None | `None` | Extensions to include, e.g. `[\"py\",\"txt\"]` |\n| `exclude_dirs` | list\\|None | `None` | Dir names to skip (None = use defaults) |\n| `max_results` | int | unlimited | Hard cap on results |\n| `min_size` | int\\|None | `None` | Minimum file size in bytes |\n| `max_size` | int\\|None | `None` | Maximum file size in bytes |\n| `regex` | bool | `False` | Treat query as regular expression |\n| `sort_by` | str\\|None | `None` | Sort by `\"name\"`, `\"size\"`, or `\"date\"` |\n| `date_range` | tuple\\|None | `None` | Date range as `(start_date, end_date)` datetime tuples |\n\nExample with date filtering:\n\n```python\nfrom datetime import datetime, timedelta\nimport qry\n\n# Files modified in last 30 days\nend_date = datetime.now()\nstart_date = end_date - timedelta(days=30)\nfiles = qry.search(\"invoice\", date_range=(start_date, end_date))\n\n# Files modified in 2026\nfiles = qry.search(\"report\", date_range=(datetime(2026, 1, 1), datetime(2026, 12, 31)))\n```\n\n## HTTP API usage\n\nRun server:\n\n```bash\npoetry run qry-api --host 127.0.0.1 --port 8000\n```\n\nMain endpoints:\n\n- `GET /api/search`\n- `GET /api/search/html`\n- `GET /api/engines`\n- `GET /api/health`\n- OpenAPI docs: `GET /api/docs`\n\n## Development\n\n### Run tests\n\n```bash\npoetry run pytest -q\n```\n\n### Useful make targets\n\n```bash\nmake install\nmake test\nmake lint\nmake type-check\nmake run-api\n```\n\n## Project structure\n\n- `qry/cli/` – CLI commands and interactive mode\n- `qry/api/` – FastAPI application and routes\n- `qry/core/` – core data models\n- `qry/engines/` – search engine implementations\n- `qry/web/` – HTML renderer/templates integration\n- 
`tests/` – test suite\n\n## Additional docs\n\n- Usage examples: [EXAMPLES.md](EXAMPLES.md)\n\n## License\n\nApache License 2.0 - see [LICENSE](LICENSE) for details.\n\n## Author\n\nCreated by **Tom Sapletta** - [tom@sapletta.com](mailto:tom@sapletta.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femllm%2Fqry","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femllm%2Fqry","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femllm%2Fqry/lists"}