https://github.com/ki-ian/semantic-search

Upload PDFs, ask natural-language questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Gemini LLMs — all wrapped in a clean Gradio interface. Docker-ready and deployable on Hugging Face Spaces.

Host: GitHub
URL: https://github.com/ki-ian/semantic-search
Owner: KI-IAN
License: mit
Created: 2025-09-22T07:34:43.000Z (10 months ago)
Default Branch: main
Last Pushed: 2025-09-25T13:50:08.000Z (9 months ago)
Last Synced: 2025-09-25T15:38:14.863Z (9 months ago)
Topics: chromadb, docker, docker-compose, gemini-api, gradio, huggingface-spaces, langchain, nvidia-nim-api, python
Language: Python
Homepage: https://huggingface.co/spaces/frkhan/semantic-search-app
Size: 13.7 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

---
title: Semantic Search App
emoji: 📄🔗🧠❓🔗🤖
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# Semantic Search App (📄 → 🔗 → 🧠 → ❓ → 🔗 → 🤖)

Upload a PDF, ask questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Google LLMs — all wrapped in a clean Gradio interface.

🔗 **Live Demo**: [Semantic Search App](https://huggingface.co/spaces/frkhan/semantic-search-app)

### 📖 Read the Full Story

Want to learn more about the journey behind building this project? Check out the full story on Medium:

- [**When the Credits Ran Out, Curiosity Didn’t: A Journey into LLMs, AI Agents & RAG**](https://frkhan.medium.com/when-the-credits-ran-out-curiosity-didnt-a-journey-into-llms-ai-agents-rag-6fcd5299c49a)

---

### 🚀 Features

- 📄 Upload and process PDF documents
- 🔍 Perform semantic search using vector embeddings
- 🤖 Get answers from powerful LLMs (NVIDIA or Google Gemini)
- 🧠 Uses LangChain + ChromaDB for retrieval
- 📈 Integrated with Langfuse for tracing and observability
- 🧰 Docker-ready and Hugging Face Spaces–compatible

---

### 🛠️ Tech Stack

| Component | Purpose |
|------------------|-----------------------------------------|
| LangChain | Orchestration of embedding + LLM calls |
| ChromaDB | Vector database for semantic retrieval |
| NVIDIA / Gemini | Embedding + LLM APIs |
| Gradio | Interactive UI |
| Langfuse | Tracing and Observability |
| Docker | Containerized deployment |

---

## 📦 Installation

### Option 1: Run Locally

```bash
git clone https://github.com/KI-IAN/semantic-search.git
cd semantic-search
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file in the root directory:

```env
GOOGLE_API_KEY=your_google_api_key
NVIDIA_API_KEY=your_nvidia_api_key
CHROMA_DIR=./chroma_db
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
```

Then run:

```bash
python app.py
```

---

### Option 2: Run with Docker

```bash
# Live environment: docker-compose.yml is picked up automatically
docker-compose up --build

# Or, with the newer Docker Compose v2 syntax:
docker compose up --build
```

Access the app at http://localhost:12100

---

```bash
# Local development: docker-compose.dev.yml reflects code changes without rebuilding the container
docker-compose -f docker-compose.dev.yml up --build

# Or, with the newer Docker Compose v2 syntax:
docker compose -f docker-compose.dev.yml up --build
```

Access the app at http://localhost:12100

---

### Option 3: Deploy on Hugging Face Spaces

1. Create a new Space and choose Gradio as the SDK.
2. Upload your project files (including `app.py`, `Dockerfile`, and `requirements.txt`). Do not upload `.env`; secrets are set in the next step instead.
3. Set the following in the Space's “Secrets” tab:
   - `GOOGLE_API_KEY`
   - `NVIDIA_API_KEY`
   - (Optional) `CHROMA_DIR` (defaults to `./chroma_db`)
4. Hugging Face will auto-detect and launch the app via Gradio.

---

## 🔑 Getting API Keys

To use this app, you'll need API keys for both **Gemini** and **NVIDIA NIM**. Here's how to obtain them:

### 🌐 Gemini API Key
Gemini is Google's family of generative AI models. To get an API key:

1. Visit the [Google AI Studio](https://aistudio.google.com/api-keys).
2. Sign in with your Google account.
3. Click **"Create API Key"** and copy the key shown.
4. Use this key in your `.env` file or configuration as `GOOGLE_API_KEY`.

> Note: Gemini API access may be limited based on region or account eligibility. See the Gemini API [rate limits](https://ai.google.dev/gemini-api/docs/rate-limits).

### 🚀 NVIDIA NIM API Key
NIM (NVIDIA Inference Microservices) provides hosted models via REST APIs. To get started:

1. Go to the [NVIDIA API Catalog](https://build.nvidia.com/?integrate_nim=true&hosted_api=true&modal=integrate-nim).
2. Choose a model (e.g., `nim-gemma`, `nim-mistral`, etc.) and click **"Get API Key"**.
3. Sign in or create an NVIDIA account if prompted.
4. Copy your key and use it as `NVIDIA_API_KEY` in your environment.

> Tip: You can test NIM endpoints directly in the browser before integrating.

---

Once you have both keys, store them securely and never commit them to version control.

---

### 🧪 How to Use

1. **Upload a PDF**: drag and drop your document.
2. **Click “📄 Process Document”**: the app will split, embed, and store the content.
3. **Enter a query**, for example:
   - “What are the key findings?”
   - “Summarize the methodology.”
   - “What does the report say about climate change?”
4. **Click “🔍 Ask a Question”**: get semantic search results and an LLM-generated answer.
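
To make the "embed, store, then search" flow concrete, here is a minimal sketch of similarity-based retrieval using only the standard library. It is not the app's code: the app relies on LangChain, ChromaDB, and hosted embedding models, whereas the `embed`, `cosine`, and `top_k` helpers below are hypothetical stand-ins that use word counts in place of real embeddings.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[tuple[float, str]]:
    """Rank stored chunks by similarity to the query and return the best k."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Hypothetical document chunks, as if produced by the "Process Document" step.
chunks = [
    "The methodology uses a randomized controlled trial.",
    "Key findings show a rise in average temperature.",
    "Appendix A lists the survey questions.",
]
results = top_k("What are the key findings?", chunks, k=1)
print(results[0][1])  # Key findings show a rise in average temperature.
```

In the real pipeline the retrieved chunks would then be passed to the LLM as context for the final answer; that step is omitted here.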

---

### ⚙️ Configuration
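All secrets are loaded from `.env` or the Hugging Face “Secrets” tab:

| Variable            | Description                                                |
|---------------------|------------------------------------------------------------|
| GOOGLE_API_KEY      | Gemini LLM API key                                         |
| NVIDIA_API_KEY      | NVIDIA LLM API key                                         |
| CHROMA_DIR          | Path to store Chroma vector DB (defaults to `./chroma_db`) |
| LANGFUSE_PUBLIC_KEY | Langfuse public key                                        |
| LANGFUSE_SECRET_KEY | Langfuse secret key                                        |
| LANGFUSE_HOST       | Langfuse host URL                                          |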
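
As an illustration of how these variables might be read at startup, the sketch below builds a settings object from the environment, failing early if a required key is missing and falling back to `./chroma_db` for `CHROMA_DIR`. The `Settings` class and `load_settings` function are hypothetical and are not taken from the project's `config.py`; treating both API keys as required is also an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the variables listed in the table above."""
    google_api_key: str
    nvidia_api_key: str
    chroma_dir: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of environment variables (defaults to os.environ)."""
    env = os.environ if env is None else env
    # Assumption: both API keys are required; report all missing names at once.
    missing = [name for name in ("GOOGLE_API_KEY", "NVIDIA_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required variables: " + ", ".join(missing))
    return Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        nvidia_api_key=env["NVIDIA_API_KEY"],
        chroma_dir=env.get("CHROMA_DIR", "./chroma_db"),
    )


# Demo with placeholder values instead of real secrets.
settings = load_settings({"GOOGLE_API_KEY": "g-demo", "NVIDIA_API_KEY": "n-demo"})
print(settings.chroma_dir)  # ./chroma_db
```

Passing a plain dictionary, as in the demo, keeps the function easy to test without touching the real environment.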

---

### 🧩 Customization

- Switch between NVIDIA and Gemini embeddings in `process_pdf()`.
- Change the LLM model in `search_query()` (`bytedance/seed-oss-36b-instruct`, `gemini-2.5-pro`, etc.).
- Tune chunk size and overlap in `RecursiveCharacterTextSplitter`.
- Add dropdowns to the UI for model selection (optional).
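
To build intuition for how chunk size and overlap interact, here is a simplified fixed-window splitter. It is only a sketch: `RecursiveCharacterTextSplitter` is more sophisticated and prefers to break on separators such as paragraphs and sentences, while the hypothetical `chunk_text` below cuts purely by character position.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters, each sharing chunk_overlap
    characters with the previous window (simplified; not LangChain's algorithm)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2))
# ['abcd', 'cdef', 'efgh', 'ghij']
print(chunk_text("abcdefghij", chunk_size=4, chunk_overlap=0))
# ['abcd', 'efgh', 'ij']
```

Larger overlap keeps more shared context between neighbouring chunks at the cost of storing and embedding more text.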

---

### 📁 File Structure

```text
semantic-search/
├── .env
├── .github/
├── .gitignore
├── docker-compose.yml
├── docker-compose.dev.yml
├── Dockerfile
├── requirements.txt
├── app.py
└── config.py
```

> The `.env` file is not tracked in git. Use it only for local development, and never push it if it contains secrets.

---

## 📜 License

This project is open-source and distributed under the **[MIT License](https://opensource.org/licenses/MIT)**. Feel free to use, modify, and distribute it with attribution.

---

## 🤝 Acknowledgements

- [LangChain](https://www.langchain.com) — Powerful framework for orchestrating LLMs, embeddings, and retrieval pipelines.
- [ChromaDB](https://www.trychroma.com/) — Fast and flexible open-source vector database for semantic search.
- [NVIDIA AI Endpoints](https://build.nvidia.com/models) — Hosted LLM and embedding APIs including Seed OSS and NV-Embed.
- [Google Gemini](https://aistudio.google.com/welcome) — Robust multimodal LLM platform offering text embeddings and chat models.
- [Gradio](https://www.gradio.app) — Simple and elegant Python library for building machine learning interfaces.
- [PyMuPDF](https://pymupdf.readthedocs.io) — Lightweight PDF parser for fast and accurate text extraction.
- [Docker](https://www.docker.com) — Containerization platform for reproducible deployment across environments.
- [Hugging Face Spaces](https://huggingface.co/spaces) — Free hosting platform for ML demos with secret management and GPU support.
- [Langfuse](https://langfuse.com/) for providing excellent observability tools.
