https://github.com/ki-ian/semantic-search

Upload PDFs, ask natural-language questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Gemini LLMs, all wrapped in a clean Gradio interface. Docker-ready and deployable on Hugging Face Spaces.

chromadb docker docker-compose gemini-api gradio huggingface-spaces langchain nvidia-nim-api python


---
title: Semantic Search App
emoji: 📄🔗🧠❓🔗🤖
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.46.1
app_file: app.py
pinned: false
---

# Semantic Search App (📄 → 🔗 → 🧠 → ❓ → 🔗 → 🤖)

Upload a PDF, ask questions, and get context-aware answers powered by LangChain, ChromaDB, and NVIDIA/Google LLMs, all wrapped in a clean Gradio interface.

🔗 **Live Demo**: [Semantic Search App](https://huggingface.co/spaces/frkhan/semantic-search-app)

### 📖 Read the Full Story

Want to learn more about the journey behind building this project? Check out the full story on Medium:

- [**When the Credits Ran Out, Curiosity Didn’t: A Journey into LLMs, AI Agents & RAG**](https://frkhan.medium.com/when-the-credits-ran-out-curiosity-didnt-a-journey-into-llms-ai-agents-rag-6fcd5299c49a)

---

### 🚀 Features

- 📄 Upload and process PDF documents
- 🔍 Perform semantic search using vector embeddings
- 🤖 Get answers from powerful LLMs (NVIDIA or Google Gemini)
- 🧠 Uses LangChain + ChromaDB for retrieval
- 📈 Integrated with Langfuse for tracing and observability
- 🧰 Docker-ready and Hugging Face Spaces–compatible

---

### 🛠️ Tech Stack

| Component | Purpose |
|------------------|-----------------------------------------|
| LangChain | Orchestration of embedding + LLM calls |
| ChromaDB | Vector database for semantic retrieval |
| NVIDIA / Gemini | Embedding + LLM APIs |
| Gradio | Interactive UI |
| Langfuse | Tracing and Observability |
| Docker | Containerized deployment |

---

## 📦 Installation

### Option 1: Run Locally

```bash
git clone https://github.com/KI-IAN/semantic-search.git
cd semantic-search
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file in the root directory:

```env
GOOGLE_API_KEY=your_google_api_key
NVIDIA_API_KEY=your_nvidia_api_key
CHROMA_DIR=./chroma_db
LANGFUSE_PUBLIC_KEY=your_langfuse_public_key
LANGFUSE_SECRET_KEY=your_langfuse_secret_key
LANGFUSE_HOST=https://cloud.langfuse.com
```

Then run:

```bash
python app.py
```

---

### Option 2: Run with Docker

```bash
# Run in the live environment (uses docker-compose.yml by default)
docker-compose up --build

# Or, with the newer Docker Compose CLI:
docker compose up --build
```

Access the app at http://localhost:12100

For local development, use `docker-compose.dev.yml` so that code changes are reflected without rebuilding the Docker container:

```bash
# Run in the local (development) environment
docker-compose -f docker-compose.dev.yml up --build

# Or, with the newer Docker Compose CLI:
docker compose -f docker-compose.dev.yml up --build
```

Access the app at http://localhost:12100
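
When scripting around the container, it can help to wait until the published port accepts connections before opening the UI. The helper below is a minimal standard-library sketch, not part of the project; `wait_for_port` is a hypothetical name, and the default port is the `12100` mentioned above. It only confirms that something accepts TCP connections on that port, not that the Gradio app has finished loading.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 12100, timeout: float = 60.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(0.5)
    return False
```

For example, calling `wait_for_port()` after `docker compose up --build -d` blocks for up to a minute and returns `True` as soon as the port is reachable.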

---

### Option 3: Deploy on Hugging Face Spaces

1. Create a new Space and choose Gradio as the SDK.
2. Upload your project files (including `app.py`, `Dockerfile`, and `requirements.txt`). Leave `.env` out if it contains secrets; set those values in the Secrets tab instead.
3. Set secrets in the “Secrets” tab:
   - `GOOGLE_API_KEY`
   - `NVIDIA_API_KEY`
   - (Optional) `CHROMA_DIR` (defaults to `./chroma_db`)
4. Hugging Face will auto-detect and launch the app via Gradio.

---

## 🔑 Getting API Keys

To use this app, you'll need API keys for both **Gemini** and **NVIDIA NIM**. Here's how to obtain them:

### 🌐 Gemini API Key
Gemini is Google's family of generative AI models. To get an API key:

1. Visit the [Google AI Studio](https://aistudio.google.com/api-keys).
2. Sign in with your Google account.
3. Click **"Create API Key"** and copy the key shown.
4. Use this key in your `.env` file or configuration as `GEMINI_API_KEY`.

> Note: Gemini API access may be limited based on region or account eligibility. Check the Gemini API [Rate Limits here](https://ai.google.dev/gemini-api/docs/rate-limits)

### 🚀 NVIDIA NIM API Key
NIM (NVIDIA Inference Microservices) provides hosted models via REST APIs. To get started:

1. Go to the [NVIDIA API Catalog](https://build.nvidia.com/?integrate_nim=true&hosted_api=true&modal=integrate-nim).
2. Choose a model (e.g., `nim-gemma` or `nim-mistral`) and click **"Get API Key"**.
3. Sign in or create an NVIDIA account if prompted.
4. Copy your key and use it as `NVIDIA_API_KEY` in your environment.

> Tip: You can test NIM endpoints directly in the browser before integrating.

---

Once you have both keys, store them securely and never commit them to version control.

---

### 🧪 How to Use

1. Upload a PDF: drag and drop your document.
2. Click “📄 Process Document”: the app will split, embed, and store the content.
3. Enter a query: ask a question like:
   - “What are the key findings?”
   - “Summarize the methodology.”
   - “What does the report say about climate change?”
4. Click “🔍 Ask a Question”: get semantic search results and an LLM-generated answer.
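
The process-then-ask flow above can be illustrated with a toy, dependency-free sketch. This is not the app's code: word counts stand in for the NVIDIA/Gemini embeddings, a plain list stands in for ChromaDB, and `embed`, `cosine`, `top_k`, and `build_prompt` are hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved chunks and the question into one LLM prompt."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# "Process Document": chunks that would come from the uploaded PDF.
chunks = [
    "Key finding: sea levels rose faster during the last decade.",
    "The methodology combines satellite data with tide gauge records.",
    "Funding was provided by several public research agencies.",
]

# "Ask a Question": retrieve the closest chunk, then build the prompt for the LLM.
question = "What methodology does the report use?"
print(build_prompt(question, top_k(question, chunks, k=1)))
```

In a real pipeline the embedding model captures meaning rather than exact word overlap, which is what makes the search semantic; the ranking-by-similarity step is the same idea.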

---

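### ⚙️ Configuration
All secrets are loaded from `.env` or the Hugging Face Secrets tab:

| Variable            | Description                                                |
|---------------------|------------------------------------------------------------|
| GOOGLE_API_KEY      | Gemini LLM API key                                         |
| NVIDIA_API_KEY      | NVIDIA LLM API key                                         |
| CHROMA_DIR          | Path to store Chroma vector DB (defaults to `./chroma_db`) |
| LANGFUSE_PUBLIC_KEY | Langfuse public key                                        |
| LANGFUSE_SECRET_KEY | Langfuse secret key                                        |
| LANGFUSE_HOST       | Langfuse host URL (e.g. `https://cloud.langfuse.com`)      |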
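
As a rough illustration of how such variables might be read at startup, here is a minimal sketch that uses only the standard library. The `Settings` class and `load_settings` function are hypothetical, not taken from the project's `config.py`; the sketch covers only the two API keys and `CHROMA_DIR`, and it assumes both keys are required, as stated in the API key section.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    google_api_key: str
    nvidia_api_key: str
    chroma_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: both API keys are required, so fail early with a clear message.
    missing = [name for name in ("GOOGLE_API_KEY", "NVIDIA_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(
        google_api_key=env["GOOGLE_API_KEY"],
        nvidia_api_key=env["NVIDIA_API_KEY"],
        # CHROMA_DIR is optional and falls back to ./chroma_db.
        chroma_dir=env.get("CHROMA_DIR") or "./chroma_db",
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching real secrets.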

---

### 🧩 Customization

Switch between NVIDIA and Gemini embeddings in process_pdf()

Change LLM model in search_query() (bytedance/seed-oss-36b-instruct, gemini-2.5-pro, etc.)

Tune chunk size and overlap in RecursiveCharacterTextSplitter

Add dropdowns to UI for model selection (optional)
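
To get a feel for what chunk size and overlap control, the sketch below splits text with a fixed-size sliding window. It is a simplification, not LangChain's algorithm: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences. The function name `split_with_overlap` is hypothetical; the parameter names mirror the splitter's `chunk_size` and `chunk_overlap`.

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(split_with_overlap("0123456789", chunk_size=4, chunk_overlap=2))
# → ['0123', '2345', '4567', '6789']
print(split_with_overlap("0123456789", chunk_size=4, chunk_overlap=0))
# → ['0123', '4567', '89']
```

Larger overlap preserves more context across chunk boundaries at the cost of more chunks to embed and store.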

---

### 📁 File Structure

```text
semantic-search/
├── .env
├── .github/
├── .gitignore
├── docker-compose.yml
├── docker-compose.dev.yml
├── Dockerfile
├── requirements.txt
├── app.py
└── config.py
```

> The `.env` file is not tracked in Git. Use it only for local development, and never push it if it contains secrets.

---

## 📜 License

This project is open-source and distributed under the **[MIT License](https://opensource.org/licenses/MIT)**. Feel free to use, modify, and distribute it with attribution.

---

## 🀝 Acknowledgements

- [LangChain](https://www.langchain.com) β€” Powerful framework for orchestrating LLMs, embeddings, and retrieval pipelines.
- [ChromaDB](https://www.trychroma.com/) β€” Fast and flexible open-source vector database for semantic search.
- [NVIDIA AI Endpoints](https://build.nvidia.com/models) β€” Hosted LLM and embedding APIs including Seed OSS and NV-Embed.
- [Google Gemini](https://aistudio.google.com/welcome) β€” Robust multimodal LLM platform offering text embeddings and chat models.
- [Gradio](https://www.gradio.app) β€” Simple and elegant Python library for building machine learning interfaces.
- [PyMuPDF](https://pymupdf.readthedocs.io) β€” Lightweight PDF parser for fast and accurate text extraction.
- [Docker](https://www.docker.com) β€” Containerization platform for reproducible deployment across environments.
- [Hugging Face Spaces](https://huggingface.co/spaces) β€” Free hosting platform for ML demos with secret management and GPU support.
- [Langfuse](https://langfuse.com/) for providing excellent observability tools.