https://github.com/adityaexp/comicfinder

🤖 AI-based manhwa recommender using OpenAI embeddings + Streamlit UI.
ai comic-recommender cosine-similarity embeddings manga manhua manhwa nlp openai python recommender-system semantic-search streamlit webtoon


# 🧠 ComicFinder

ComicFinder is an AI-powered content-based recommendation system built using Python and OpenAI Embeddings.
It helps users discover semantically similar manga, manhwa, manhua, and webtoons based on natural language descriptions, genres, or titles, which makes it useful for fans seeking personalized recommendations beyond keyword search.

![ComicFinder Preview: Streamlit interface for manga recommendation](asset/img.jpg)

---

## 💻 Live Demo of ComicFinder
**https://comicfinder.streamlit.app/**

---

## 🚀 Features of ComicFinder

- 🔍 Recommends similar manga, manhwa, manhua, and webtoons based on descriptions or titles
- 📦 Uses the precomputed `clean_embeddings.npy` file for fast results
- 🧠 Generates embeddings with the OpenAI Embeddings API
- ⚡ Fast cosine similarity search for real-time recommendations
- 🖥️ Clean Streamlit-based frontend
- 📁 Organized data and scripts for easy retraining or extension
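
The cosine similarity search listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `app.py` or `scripts/recommend.py`: the function name `top_k_similar` is hypothetical, the toy 2-D vectors stand in for the real contents of `clean_embeddings.npy`, and it assumes no embedding row is all zeros.

```python
import numpy as np

def top_k_similar(query_vec, embeddings, k=5):
    """Return (indices, scores) of the k rows most similar to query_vec."""
    # Normalize rows and query so that a dot product equals cosine similarity.
    emb_norm = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q_norm = query_vec / np.linalg.norm(query_vec)
    scores = emb_norm @ q_norm
    # Sort by descending similarity and keep the k best matches.
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Toy stand-in for np.load("data/clean_embeddings.npy") (one row per title).
embeddings = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
# Toy stand-in for the embedding of the user's query text.
query = np.array([1.0, 0.1])

idx, scores = top_k_similar(query, embeddings, k=2)
print(idx, scores)
```

Because the embeddings are precomputed, only the query needs to be embedded at request time; the ranking itself is a single matrix-vector product.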

---

## 📁 Project Structure

```
comic-recommender/
├── app.py                        # Main application script
├── data/
│   ├── data.csv                  # Original manhwa dataset
│   ├── clean_data.csv            # Cleaned and preprocessed data
│   └── clean_embeddings.npy      # Git-ignored; must be downloaded separately
├── scripts/
│   ├── clean_dataset.py          # Data cleaning script
│   ├── generate_embeddings.py    # Embedding generation
│   └── recommend.py              # CLI version of the similarity-based recommender
├── .env                          # Stores API keys
├── requirements.txt              # Python dependencies
└── README.md                     # You're here!
```

---

# 🔧 How to Install and Run ComicFinder Locally
```
git clone https://github.com/AdityaEXP/ComicFinder.git
cd ComicFinder

# Optional: Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate

pip install -r requirements.txt
streamlit run app.py
```

---
## 📌 Example Use Cases
- Find romance manhwa similar to *What's Wrong with Secretary Kim?*
- Get fantasy webtoon recommendations with strong male leads
- Discover hidden manga gems with character development arcs
- Replace genre filters with AI-powered natural language queries

---

# 📥 Download Embedding File
Since `clean_embeddings.npy` is large, it's not included in this repo.
[📦 Download clean_embeddings.npy](https://drive.google.com/file/d/1toZRablb8yCVhFrICdU1jQtmYh20mZ9P/view?usp=sharing)
Alternatively, you can generate `clean_embeddings.npy` yourself with your own OpenAI API key; each full generation costs around $0.02.
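
After downloading or regenerating the file, a quick sanity check can catch a mismatched or corrupted download early. The sketch below is an assumption-laden illustration: `check_embeddings` is a hypothetical helper, and it assumes the file holds one embedding vector per row of `data/clean_data.csv`, which this README does not state explicitly.

```python
import numpy as np

def check_embeddings(embeddings, n_rows):
    """Raise ValueError if the matrix cannot line up with n_rows records."""
    if embeddings.ndim != 2:
        raise ValueError(f"expected a 2-D array, got {embeddings.ndim}-D")
    if embeddings.shape[0] != n_rows:
        raise ValueError(f"{embeddings.shape[0]} vectors for {n_rows} rows")
    if not np.isfinite(embeddings).all():
        raise ValueError("embeddings contain NaN or infinite values")
    return embeddings.shape

# Toy example: 3 records with 4-dimensional vectors.
shape = check_embeddings(np.zeros((3, 4)), n_rows=3)
print(shape)
```

In practice the first argument would come from `np.load("data/clean_embeddings.npy")` and `n_rows` from the number of records in `data/clean_data.csv`.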

---

# 🔐 Environment Variables
Create a `.env` file in the project root containing your OpenAI API key:
```
OPENAIKEY=sk-xxxxxx
```
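
For reference, one way a script could pick up this key is sketched below using only the standard library. How `app.py` actually loads it is not shown here (many projects use the `python-dotenv` package instead), so `parse_env` and `get_openai_key` are hypothetical helpers; only the variable name `OPENAIKEY` comes from this README.

```python
import os
from pathlib import Path

def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def get_openai_key(env_path=".env"):
    """Prefer the process environment, then fall back to the .env file."""
    key = os.environ.get("OPENAIKEY")
    if not key and Path(env_path).is_file():
        key = parse_env(Path(env_path).read_text()).get("OPENAIKEY")
    if not key:
        raise RuntimeError("OPENAIKEY not found in the environment or .env")
    return key
```

Failing fast with a clear error when the key is missing is usually easier to debug than an authentication error from the API later on. Keep `.env` out of version control so the key is never committed.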

---

# 📜 License
MIT: free to use, modify, and distribute.

---

# 🤝 Author
Aditya
🛠️ AI + Python + Web3 Enthusiast

---

## 📚 Dataset Source and Preprocessing
This project uses data inspired by or adapted from the following Kaggle dataset:

**📊 [Kaggle - Manhwa and Webtoon Dataset](https://www.kaggle.com/datasets/victorsoeiro/manga-manhwa-and-manhua-dataset/data)**
Credit to **Victor Soeiro** for compiling and sharing this dataset.