https://github.com/adityaexp/comicfinder
AI-based manhwa recommender using OpenAI embeddings + Streamlit UI.
ai comic-recommender cosine-similarity embeddings manga manhua manhwa nlp openai python recommender-system semantic-search streamlit webtoon
Last synced: 11 months ago
- Host: GitHub
- URL: https://github.com/adityaexp/comicfinder
- Owner: AdityaEXP
- Created: 2025-07-12T05:11:28.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-07-12T09:45:56.000Z (12 months ago)
- Last Synced: 2025-07-12T10:27:38.614Z (12 months ago)
- Topics: ai, comic-recommender, cosine-similarity, embeddings, manga, manhua, manhwa, nlp, openai, python, recommender-system, semantic-search, streamlit, webtoon
- Language: Python
- Homepage: https://comicfinder.streamlit.app
- Size: 15.8 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ComicFinder
ComicFinder is an AI-powered, content-based recommendation system built with Python and OpenAI embeddings.
It helps users discover semantically similar manga, manhwa, manhua, and webtoons from natural-language descriptions, genres, or titles, which makes it useful for fans who want personalized recommendations beyond keyword search.

---
## Live Demo of ComicFinder
**https://comicfinder.streamlit.app/**

---
## Features of ComicFinder
- Recommends similar manga, manhwa, manhua, and webtoons based on descriptions or titles
- Uses the precomputed `clean_embeddings.npy` file for fast results
- Generates embeddings with the OpenAI Embeddings API
- Runs a fast cosine similarity search for real-time recommendations
- Clean Streamlit-based frontend
- Organized data and scripts for easy retraining or extension
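
The cosine similarity search mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py`: the function name `top_k_similar` is hypothetical, and it assumes the embeddings form a 2-D array with one row per title and no all-zero rows.

```python
import numpy as np


def top_k_similar(query_vec, embeddings, k=5):
    """Return (indices, scores) of the k rows most similar to query_vec.

    Hypothetical helper for illustration; assumes no row is all zeros.
    """
    query = np.asarray(query_vec, dtype=np.float64)
    matrix = np.asarray(embeddings, dtype=np.float64)
    # Normalise to unit length so a dot product equals cosine similarity.
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    scores = matrix @ query
    k = min(k, len(scores))
    # Sort by descending similarity and keep the first k positions.
    top = np.argsort(-scores)[:k]
    return top, scores[top]


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real embeddings.
    toy = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]])
    indices, sims = top_k_similar([1.0, 0.0, 0.0], toy, k=2)
    print(indices, sims)
```

In a real run, `embeddings` would presumably come from `np.load("data/clean_embeddings.npy")`, with row `i` matching row `i` of `clean_data.csv`; that alignment is an assumption here.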
---
## Project Structure
```
comic-recommender/
├── app.py                      # Main application script
├── data/
│   ├── data.csv                # Original manhwa dataset
│   ├── clean_data.csv          # Cleaned and preprocessed data
│   └── clean_embeddings.npy    # Ignored by Git; must be downloaded separately
├── scripts/
│   ├── clean_dataset.py        # Data cleaning script
│   ├── generate_embeddings.py  # Embedding generation
│   └── recommend.py            # Similarity-based recommendations (CLI version)
├── .env                        # Stores API keys
├── requirements.txt            # Python dependencies
└── README.md                   # You're here!
```
---
## How to Install and Run ComicFinder Locally
```
git clone https://github.com/AdityaEXP/ComicFinder.git
cd ComicFinder
# Optional: create and activate a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
# Before the first run, place clean_embeddings.npy in data/ and create the .env file
streamlit run app.py
```
---
## Example Use Cases
- Find romance manhwa similar to *What's Wrong with Secretary Kim?*
- Get fantasy webtoon recommendations with strong male leads
- Discover hidden manga gems with character development arcs
- Replace genre filters with AI-powered natural language queries
---
## Download Embedding File
`clean_embeddings.npy` is large, so it is not included in this repo. Download it and place it in the `data/` folder:
[Download clean_embeddings.npy](https://drive.google.com/file/d/1toZRablb8yCVhFrICdU1jQtmYh20mZ9P/view?usp=sharing)

Alternatively, generate `clean_embeddings.npy` yourself with your own OpenAI API key. Each generation run costs around $0.02.
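
Regenerating the file might look roughly like the sketch below. It is not the contents of `scripts/generate_embeddings.py`: the helper names are hypothetical, the model name `text-embedding-3-small` is an assumption (the README does not say which model is used), and the OpenAI call assumes the `openai` package at version 1.0 or later.

```python
import os

import numpy as np


def embed_texts(texts, embed_batch, batch_size=100):
    """Embed texts in batches and stack the vectors into one float32 array.

    `embed_batch` is any callable mapping a list of strings to a list of vectors.
    """
    vectors = []
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        vectors.extend(embed_batch(chunk))
    return np.asarray(vectors, dtype=np.float32)


def make_openai_embedder(model="text-embedding-3-small"):
    """Build an embed_batch callable backed by the OpenAI API.

    Assumptions: openai>=1.0 is installed, the key is in the OPENAIKEY
    environment variable, and the model name is a guess, not taken from the repo.
    """
    from openai import OpenAI  # imported lazily so the helper above has no hard dependency

    client = OpenAI(api_key=os.environ["OPENAIKEY"])

    def embed_batch(chunk):
        response = client.embeddings.create(model=model, input=chunk)
        return [item.embedding for item in response.data]

    return embed_batch
```

With a list of descriptions in hand, `np.save("data/clean_embeddings.npy", embed_texts(descriptions, make_openai_embedder()))` would write the file; `descriptions` is a placeholder for whichever text column of `clean_data.csv` is embedded.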
---
## Environment Variables
Create a `.env` file in the project root containing your OpenAI API key:
```
OPENAIKEY=sk-xxxxxx
```
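
Reading that key from Python could be sketched as below. The function name `load_openai_key` is hypothetical, and the use of `python-dotenv` is an assumption; if it is not installed, the sketch falls back to variables already exported in the shell.

```python
import os


def load_openai_key(env_path=".env"):
    """Return the key stored under OPENAIKEY, or raise if it is missing."""
    try:
        # Assumption: python-dotenv is installed; it copies .env entries into os.environ.
        from dotenv import load_dotenv

        load_dotenv(env_path)
    except ImportError:
        # Without python-dotenv, rely on variables already set in the environment.
        pass
    key = os.getenv("OPENAIKEY")
    if not key:
        raise RuntimeError("OPENAIKEY is not set; add it to .env or export it")
    return key
```

Failing early with a clear message avoids a less obvious authentication error later, when the first embedding request is made.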
---
## License
MIT License. Free to use, modify, and distribute.

---
## Author
Aditya
AI + Python + Web3 Enthusiast

---
## Dataset Source and Preprocessing
This project uses data inspired by or adapted from the following Kaggle dataset:
**[Kaggle - Manhwa and Webtoon Dataset](https://www.kaggle.com/datasets/victorsoeiro/manga-manhwa-and-manhua-dataset/data)**
Credit to **Victor Soeiro** for compiling and sharing this dataset.