https://github.com/saba-gul/celebrity-vector-twin
- Host: GitHub
- URL: https://github.com/saba-gul/celebrity-vector-twin
- Owner: Saba-Gul
- License: mit
- Created: 2024-11-12T04:22:45.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-11-16T07:01:02.000Z (11 months ago)
- Last Synced: 2025-01-13T16:28:15.964Z (9 months ago)
- Language: Jupyter Notebook
- Size: 376 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Vector Twin 🎭
**Find Your Celebrity Look-Alike with AI-Powered Vector Search**

Vector Twin is a fun and interactive platform that uses AI to match user-uploaded images with their closest celebrity look-alikes.

---
## 📋 Features
- **Celebrity Look-Alike Matching:** Upload your image and find your top matches with similarity scores.
- **Real-Time Results:** Get results in seconds, thanks to optimized backend processing.
- **Scalable Design:** Built to handle high-dimensional embeddings efficiently.
- **User-Friendly Interface:** Simple and intuitive design powered by Streamlit.

---
## 🛠️ Tech Stack
### **Frontend**
- **[Streamlit](https://streamlit.io/):** For building a responsive and interactive user interface.

### **Backend**
- **[FastAPI](https://fastapi.tiangolo.com/):** For handling API requests and backend processing.

### **Database**
- **[Qdrant](https://qdrant.tech/):** A high-performance vector search database for querying embeddings.

### **Authentication**
- **[Supabase](https://supabase.com/):** To secure user data with authentication and role management.

### **Machine Learning**
- **[InceptionResNetV1 pre-trained on FaceNet](https://github.com/davidsandberg/facenet):** Used for generating 512-dimensional image embeddings.
  - InceptionResNetV1 is a deep convolutional neural network that excels in generating facial embeddings for face recognition tasks. The model is pre-trained on FaceNet, which allows it to efficiently compute embeddings that can be used for similarity search.

---
## 🚀 How It Works
1. **User Upload:** Upload an image through the Streamlit frontend.
2. **Embedding Generation:** The image is processed to generate a high-dimensional vector using InceptionResNetV1 (pre-trained on FaceNet).
3. **Similarity Search:** The vector is compared with celebrity embeddings stored in Qdrant using cosine similarity.
4. **Results Display:** The top matches with their similarity scores are displayed in the frontend.
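
The ranking in steps 3 and 4 can be illustrated with a minimal sketch. In the project itself Qdrant performs the search; the code below only shows what "top matches by cosine similarity" means. The function names, the celebrity labels, and the 3-dimensional toy vectors (standing in for 512-dimensional embeddings) are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_matches(query, gallery, k=3):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in gallery.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: real embeddings would have 512 dimensions.
gallery = {
    "celebrity_a": [1.0, 0.0, 0.0],
    "celebrity_b": [0.6, 0.8, 0.0],
    "celebrity_c": [0.0, 0.0, 1.0],
}
matches = top_matches([0.9, 0.1, 0.0], gallery, k=2)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

A brute-force scan like this is linear in the number of stored vectors, which is one reason a dedicated vector database is used for larger collections.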

---
## 🧰 Installation and Setup
### Clone the Repository
```bash
git clone https://github.com/saba-gul/celebrity-vector-twin.git
cd celebrity-vector-twin
```

### Create a Virtual Environment
```bash
python -m venv myenv
source myenv/bin/activate # On Windows: myenv\Scripts\activate
```

### Install Dependencies
```bash
pip install -r requirements.txt
```

### Set Up Environment Variables
Create a `.env` file in the root directory and add the following variables:
```plaintext
QDRANT_API_URL=
QDRANT_API_KEY=
SUPABASE_URL=
SUPABASE_KEY=
```
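
As a rough illustration of how a service might consume these four variables, the sketch below checks that each one is set before startup. It assumes the `.env` file has already been loaded into the process environment (for example by Docker Compose or a loader such as python-dotenv); the README does not say which mechanism the project uses, and `load_settings` is a hypothetical helper, not part of the repository.

```python
import os

# Variable names taken from the .env template above.
REQUIRED_VARS = ("QDRANT_API_URL", "QDRANT_API_KEY", "SUPABASE_URL", "SUPABASE_KEY")


def load_settings(env=None):
    """Collect the required settings, failing early if any is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


# Example with placeholder values passed in explicitly instead of os.environ.
example_env = {
    "QDRANT_API_URL": "http://localhost:6333",
    "QDRANT_API_KEY": "example-key",
    "SUPABASE_URL": "https://example.supabase.co",
    "SUPABASE_KEY": "example-key",
}
settings = load_settings(example_env)
```

Failing at startup with the full list of missing names is usually easier to debug than a connection error that surfaces later in a request handler.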

---
## 📂 Project Structure
```plaintext
vector-twin/
├── backend/
│   ├── app.py            # FastAPI backend
│   ├── auth.py           # Supabase authentication logic
│   └── embeddings.py     # FaceNet embedding generation
├── frontend/
│   └── app.py            # Streamlit app
├── requirements.txt      # Python dependencies
├── README.md             # Project documentation
├── .env                  # Environment variables
└── docker-compose.yml
```

---
## 🔧 Configuration
### Qdrant Vector Collection
- **Collection Name:** `celebrity_embeddings`
- **Vector Size:** `512`
- **Distance Metric:** `Cosine`

The collection is preloaded with celebrity embeddings for similarity searches.
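
If the collection ever needs to be recreated, the settings above map onto Qdrant's REST endpoint for creating a collection (`PUT /collections/{name}`). The sketch below only builds that request with the standard library and does not send it. The helper name, the `http://localhost:6333` base URL, and the placeholder key are assumptions for illustration; the project may well use the `qdrant-client` package instead.

```python
import json
import urllib.request


def build_create_collection_request(base_url, api_key, name="celebrity_embeddings"):
    """Build (but do not send) a request that creates the vector collection."""
    # Vector size and distance metric as listed in the configuration above.
    body = {"vectors": {"size": 512, "distance": "Cosine"}}
    url = f"{base_url.rstrip('/')}/collections/{name}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json", "api-key": api_key},
    )


# Hypothetical local instance and placeholder key.
request = build_create_collection_request("http://localhost:6333", "example-key")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; in practice the base URL and key would come from `QDRANT_API_URL` and `QDRANT_API_KEY`.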
---
## 🐳 Running with Docker Compose
### Build and Run Services
```bash
docker-compose up --build
```

### Access the Application
- **Frontend**: `http://127.0.0.1:8501`
- **Backend**: `http://127.0.0.1:8500`

---
## 🎯 Usage
1. Navigate to the Streamlit frontend (`http://127.0.0.1:8501`).
2. Upload your image to find matches.
3. View the top results with similarity scores!

---
## 📊 Dataset
The dataset used to create embeddings for celebrity look-alike matching is the [Pins Face Recognition Dataset](https://www.kaggle.com/datasets/hereisburak/pins-face-recognition) available on Kaggle. This dataset contains a wide variety of celebrity face images used to generate embeddings for facial recognition.

### Embedding Generation Model
- **Model:** [InceptionResNetV1 pre-trained on FaceNet](https://github.com/davidsandberg/facenet)
- **Embedding Size:** `512`
- **Distance Metric:** Cosine similarity for comparing vectors.

---