{"id":34501348,"url":"https://github.com/tr-3n/smartsearch-ai","last_synced_at":"2026-05-01T22:33:46.580Z","repository":{"id":289755975,"uuid":"972287841","full_name":"TR-3N/smartsearch-ai","owner":"TR-3N","description":"SmartSearchAI is a live semantic search engine powered by Streamlit for the UI, SerpAPI for real-time web search, and SentenceTransformers with FAISS for fast semantic similarity matching. It allows users to ask natural language queries and get intelligent, web-sourced answers without relying on a static dataset.","archived":false,"fork":false,"pushed_at":"2025-12-16T20:11:34.000Z","size":53,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-12-16T21:51:02.502Z","etag":null,"topics":["artificial-intelligence","data-science","deployment","faiss","machine-learning","nlp","pandas","scikit-learn","sentence-transformers","streamlit"],"latest_commit_sha":null,"homepage":"https://smartsearch-ai.streamlit.app/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TR-3N.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-04-24T20:41:28.000Z","updated_at":"2025-12-16T20:11:38.000Z","dependencies_parsed_at":"2025-04-24T22:48:09.358Z","dependency_job_id":null,"html_url":"https://github.com/TR-3N/smartsearch-ai","commit_stats":null,"previous_names":["tr-3n/smartsearch-ai"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:
github/TR-3N/smartsearch-ai","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TR-3N%2Fsmartsearch-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TR-3N%2Fsmartsearch-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TR-3N%2Fsmartsearch-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TR-3N%2Fsmartsearch-ai/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TR-3N","download_url":"https://codeload.github.com/TR-3N/smartsearch-ai/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TR-3N%2Fsmartsearch-ai/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27992996,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-24T02:00:07.193Z","response_time":83,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","data-science","deployment","faiss","machine-learning","nlp","pandas","scikit-learn","sentence-transformers","streamlit"],"created_at":"2025-12-24T02:01:08.858Z","updated_at":"2025-12-24T02:01:49.863Z","avatar_url":"https://github.com/TR-3N.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🔍 SmartSearchAI\n\nSmartSearchAI is a **live semantic search engine** that 
lets users ask natural language questions and returns semantically ranked results sourced directly from the web, with **no static dataset required**.\n\nIt uses:\n- 🌐 [SerpAPI](https://serpapi.com/) to fetch real-time Google Search results\n- 🤖 [SentenceTransformers](https://www.sbert.net/) to embed queries and result snippets\n- 🧠 Cosine similarity to rerank results by **semantic closeness**, not just keyword overlap\n- 🖥️ [Streamlit](https://streamlit.io/) for an intuitive web interface\n- 🌐 [Flask](https://flask.palletsprojects.com/) to expose a simple `/search` JSON API\n\n---\n\n## ✨ Features\n\n- 🔎 Real-time web search via SerpAPI (Google Search API)  \n- 🧠 Semantic reranking of results using SentenceTransformer embeddings and cosine similarity  \n- 🧩 Clean separation of concerns:\n  - `Flask` backend: `/search` endpoint returning JSON\n  - `Streamlit` frontend: user UI calling the backend\n- 📄 Modern web UI with sidebar navigation and custom styling  \n- 🔐 API key loaded securely from `.env` (not committed)\n\n---\n\n## 📸 Demo Idea\n\nFor a demo, try queries where semantics matter more than exact wording, for example:\n\n- “cheap ways to exercise at home” vs “low-cost home workout ideas”  \n- “how to fix python environment not found” vs “python venv activation error in powershell”  \n\nYou can highlight that top results stay relevant even when keywords don’t match exactly, because ranking is based on embedding similarity.\n\n---\n\n## 🚀 Getting Started\n\n### 1. Clone the repo\n\n```\ngit clone https://github.com/TR-3N/smartsearch-ai.git\ncd smartsearch-ai\n```\n\n\u003e If you’re using a different remote or fork, adjust the URL accordingly.\n\n### 2. 
Create and activate a virtual environment\n\n```\npython -m venv smartsearch_env\n```\n\nOn **Windows (PowerShell)**:\n\n```\n# If needed, allow scripts just for this session:\nSet-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass\n\n.\\smartsearch_env\\Scripts\\Activate.ps1\n```\n\nOn **Windows (cmd.exe)**:\n\n```\nsmartsearch_env\\Scripts\\activate.bat\n```\n\nOn **macOS / Linux**:\n\n```\nsource smartsearch_env/bin/activate\n```\n\n### 3. Install dependencies\n\n```\npip install -r requirements.txt\npip install sentence-transformers streamlit-extras streamlit-option-menu\n```\n\n`requirements.txt` covers the core stack (Streamlit, Flask, CORS, dotenv, requests, etc.), and `sentence-transformers` plus the Streamlit extras are installed explicitly.\n\n---\n\n## 🔑 SerpAPI Setup\n\n1. Go to https://serpapi.com/ and sign up (free plan available).  \n2. Get your SerpAPI key.  \n3. Create a `.env` file in the project root:\n\n```\nSERPAPI_KEY=your_serpapi_key_here\n```\n\n\u003e Important: `.env` is listed in `.gitignore` and **must not** be committed.\n\nThe backend reads `SERPAPI_KEY` in `search_engine.py` and raises a clear error if it is missing.\n\n---\n\n## 🧪 Running the App (Backend + Frontend)\n\nThe app runs as two processes: a Flask API and a Streamlit UI.\n\n### 1. Start the Flask semantic backend\n\nIn a terminal with the virtualenv activated:\n\n```\npython app.py\n```\n\nThis starts Flask at `http://127.0.0.1:5000` and exposes:\n\n- `POST /search`: accepts JSON `{\"query\": \"...\", \"top_k\": 5}` and returns a list of result objects, each with a semantic `score`.\n\nLeave this terminal running.\n\n### 2. Start the Streamlit frontend\n\nOpen a **second** terminal, activate the same virtualenv, `cd` into the project folder, then run:\n\n```\nstreamlit run streamlit_app.py\n```\n\nOpen `http://localhost:8501` in your browser.\n\n- Use the **Home** page to enter a natural language query.  
\n- Streamlit calls the Flask `/search` endpoint, which:\n  - Calls SerpAPI to fetch organic Google results.  \n  - Computes embeddings for the query and each result (`title + snippet`) using `all-MiniLM-L6-v2`.  \n  - Computes cosine similarity and sorts results by semantic `score`.  \n- The UI displays the top results with title, description, link, and (optionally) the semantic score.\n\n---\n\n## 🧠 How the Semantic Ranking Works\n\nInside `search_engine.py`, the `SemanticSearch` class:\n\n1. Reads `SERPAPI_KEY` from environment variables.  \n2. Calls `https://serpapi.com/search` with the user query and retrieves `organic_results`.  \n3. For each result, builds a text string from `title` and `snippet` and passes it to `utils.py`.  \n4. `utils.py`:\n   - Loads the `all-MiniLM-L6-v2` SentenceTransformer model.  \n   - Provides:\n     - `clean_text(text)` – basic normalization\n     - `get_embedding(text)` – returns a dense embedding\n     - `cosine_similarity(a, b)` – similarity between query and result embeddings.\n5. Results are scored by cosine similarity, sorted descending, and returned to Streamlit.\n\nThis makes SmartSearchAI behave differently from a classic keyword engine: it can understand paraphrases and re-order SerpAPI’s results based on **meaning** rather than position alone. 
\n\n---\n\n## 📁 Project Structure\n\n```\n.\n├── app.py               # Flask API: /search endpoint\n├── search_engine.py     # SemanticSearch class (SerpAPI + embeddings)\n├── streamlit_app.py     # Streamlit UI\n├── utils.py             # SentenceTransformer model + helpers\n├── requirements.txt     # Core Python dependencies\n├── .env                 # Contains SERPAPI_KEY (not committed)\n├── .gitignore\n└── README.md\n```\n\n---\n\n## 📌 Future Improvements\n\nSome ideas you can implement next:\n\n- Add OpenAI / GPT (or another LLM) to **summarize** the top results into one concise answer.  \n- Show both “Original Google rank” and “Semantic rank” side-by-side in the UI for demo purposes.  \n- Cache SerpAPI responses and embeddings to speed up repeated queries.\n\n---\n\n## 🛡️ License\n\nThis project is open-source and available under the MIT License.\n\n---\n\n## 🙋‍♂️ Author\n\n**Shahil Sinha**\n\nFeel free to reach out on LinkedIn or open issues / PRs on this repo if you want to contribute or suggest improvements!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftr-3n%2Fsmartsearch-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftr-3n%2Fsmartsearch-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftr-3n%2Fsmartsearch-ai/lists"}