{"id":29066999,"url":"https://github.com/byteakp/devsearch","last_synced_at":"2026-04-13T17:33:11.081Z","repository":{"id":293832532,"uuid":"985153759","full_name":"byteakp/Devsearch","owner":"byteakp","description":"Flask-based search engine with hybrid retrieval (BM25 + Sentence Transformers) and Gemini API integration for code-specific and general-purpose searches.","archived":false,"fork":false,"pushed_at":"2025-05-17T11:55:13.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-17T12:33:09.533Z","etag":null,"topics":["engine","engineering","flask","js","python","web"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/byteakp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-17T07:05:57.000Z","updated_at":"2025-05-17T12:01:52.000Z","dependencies_parsed_at":"2025-05-17T12:43:16.945Z","dependency_job_id":null,"html_url":"https://github.com/byteakp/Devsearch","commit_stats":null,"previous_names":["byteakp/devsearch"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/byteakp/Devsearch","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byteakp%2FDevsearch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byteakp%2FDevsearch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byteakp%2FDevsearch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byteakp%2FDevsearch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/byteakp","download_url":"https://codeload.github.com/byteakp/Devsearch/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/byteakp%2FDevsearch/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":262235802,"owners_count":23279571,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["engine","engineering","flask","js","python","web"],"created_at":"2025-06-27T10:10:09.465Z","updated_at":"2026-04-13T17:33:06.053Z","avatar_url":"https://github.com/byteakp.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dual-Mode AI Search Engine\n\n**Version**: 1.0 (May 2025)\nA Flask-based search engine with hybrid retrieval (BM25 + Sentence Transformers) and Gemini API integration for code-specific and general-purpose searches. Features smart autocomplete, adjustable scoring, and highlighted results.\n\n## Features\n- **Dual Modes**: Code-specific (e.g., code snippets, error logs) and General-purpose (e.g., articles, topics) search with Auto-Detect.\n- **Hybrid Retrieval**: Combines BM25 (keyword-based) and Sentence Transformer (semantic) scores via Reciprocal Rank Fusion (RRF).\n- **AI Insights**: Gemini API provides query completions and contextual suggestions.\n- **Web Interface**: Flask UI with search bar, mode selection, and sliders for sparse/dense scoring weights.\n- **Performance**: \u003c300ms query response for \u003c10k documents; ~1-2 min indexing for ~1k documents.\n\n\n## Requirements\n- Python 3.8+\n- Virtual environment (recommended)\n- Dependencies: `flask`, `sentence-transformers`, `rank-bm25`, Gemini API key\n- Install via: `pip install -r requirements.txt`\n\n## Setup\n1. **Clone the Repository**:\n   ```bash\n   git clone \u003crepository-url\u003e\n   cd search_project\n   ```\n2. **Set Up Virtual Environment**:\n   ```bash\n   python -m venv venv\n   source venv/bin/activate  # On Windows: venv\\Scripts\\activate\n   ```\n3. **Install Dependencies**:\n   ```bash\n   pip install -r requirements.txt\n   ```\n4. **Configure Gemini API**:\n   - Add your Gemini API key to `config.py` or as an environment variable (`GEMINI_API_KEY`).\n\n## Usage\n### 1. Build the Search Index\nIndex documents (code files like `.py`, `.js`, `.log` or text like `.txt`, `.md`):\n```bash\npython build_index_script.py --data_path ./sample_data/ --index_file search_index.pkl\n```\n- **Options**:\n  - `--data_path`: Document directory (default: `./sample_data/`).\n  - `--index_file`: Index save path (default: `search_index.pkl`).\n  - `--model_name`: Sentence Transformer model (default: `all-MiniLM-L6-v2`). Use `NONE` to disable dense embeddings.\n  - `--force_re_embed`: Recompute embeddings.\n  - `--log_level`: Logging level (DEBUG, INFO, WARNING, ERROR; default: INFO).\n\n### 2. Run the Search Application\nStart the Flask app:\n```bash\npython app.py\n```\n- Access at `http://localhost:5001` (check terminal for exact address).\n- UI includes search bar, mode selector (Code, General, Auto-Detect), and scoring sliders.\n\n### 3. Perform Searches\n- **Code Search**:\n  - Mode: Select \"Code\" or use Auto-Detect.\n  - Queries: Code snippets (e.g., `def example_function():`), errors (e.g., `TypeError: 'NoneType' object is not subscriptable`), or API issues (e.g., `Flask route not found`).\n  - Results: Code, logs, or stack traces with highlighted matches; AI suggestions for fixes.\n- **General Search**:\n  - Mode: Select \"General\" or use Auto-Detect.\n  - Queries: Topics (e.g., `machine learning basics`), questions (e.g., `What is BM25?`), or keywords (e.g., `search engine optimization`).\n  - Results: Articles or tools with highlighted snippets; AI summaries.\n- **Auto-Detect**: Automatically selects mode based on query (code syntax vs. natural language).\n- **Smart Autocomplete**: Gemini API suggests completions tailored to mode.\n\n### 4. Customize Search\n- Adjust sparse (`sparse_weight`) and dense (`dense_weight`) scoring via UI sliders (0 to 1).\n- Optimize indexing/search speed by disabling dense embeddings (`--model_name NONE`).\n- Tune `max_chunk_size` and `chunk_overlap` in `search_engine.py` for precision vs. speed.\n\n## Performance\n- **Query Speed**: \u003c300ms for \u003c10k documents on mid-tier hardware (8GB RAM, 4-core CPU).\n- **Indexing**: ~1-2 minutes for ~1k documents.\n- **Tips**:\n  - Use BM25 (sparse) for faster keyword searches.\n  - Disable dense embeddings for quicker indexing.\n  - Adjust chunk sizes in `search_engine.py` for performance.\n\n## Project Structure\n```\nsearch_project/\n├── app.py                # Flask web app\n├── build_index_script.py # Index builder\n├── search_engine.py      # Core search logic\n├── sample_data/          # Sample documents\n├── templates/            # HTML templates\n├── static/               # CSS/JS for UI\n└── requirements.txt      # Dependencies\n```\n\n## Contributing\n- Report issues or suggest features via the repository's issue tracker.\n- Submit pull requests with clear descriptions of changes.\n\n## License\nMIT License. See `LICENSE` for details.\n\n## Screenshots\nRefer to the repository's `/docs` folder or issue tracker for UI screenshots and demo visuals.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbyteakp%2Fdevsearch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbyteakp%2Fdevsearch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbyteakp%2Fdevsearch/lists"}