https://github.com/ryogrid/anime-illust-image-searcher
Anime Style Illustration Specific Image Search App with ViT Tagger x BM25/Doc2Vec
- Host: GitHub
- URL: https://github.com/ryogrid/anime-illust-image-searcher
- Owner: ryogrid
- License: mit
- Created: 2024-10-06T12:26:27.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-02T02:10:56.000Z (8 months ago)
- Last Synced: 2025-04-22T19:08:52.420Z (6 months ago)
- Topics: anime, anime-style, bm25, deep-learning, doc2vec, gensim, illustration, image-search, machine-learning, onnxruntime, python, pytorch, search-engine, streamlit, transformer, vector-search, vision-transformer
- Language: Python
- Homepage:
- Size: 346 KB
- Stars: 8
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Anime Style Illustration Specific Image Search App with ViT Tagger x BM25/Doc2Vec
## What's This?
- Anime style illustration specific image search app using ML techniques
- Can also be used for photos, but flexible photo search is already offered by Google Photos and similar services :)
- Search capabilities of cloud photo album services are poor for illustration image files for some reason
- So, I wrote these simple scripts

## Method
- Searches for images matching query texts on a latent semantic representation vector space, combined with BM25
- Vectors are generated with an embedding model: a tagger that uses a Vision Transformer (ViT) internally, plus Doc2Vec
- Scores calculated with [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) are used in combination
- An internal re-ranking method is also introduced
- Assumption: users refine their queries asymptotically based on the top search results and eventually find appropriate queries
- If you want to know the details of the method, please read webui.py :)
- Doc2Vec is mainly used to cover tagging precision
- Simple search logic could be implemented with BM25 alone
- But you can also search with tags that are difficult for the tagger, because the index data is composed of vectors generated with a Doc2Vec model
- implemented with the Gensim library
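As a rough illustration, the BM25 side of the scoring above can be sketched with the standard library alone. The tag documents and file names here are made up, and the real app additionally blends in Doc2Vec similarity (see webui.py):

```python
import math
from collections import Counter

# Hypothetical index: one tag list per image (the real index comes from tags-wd-tagger.txt)
docs = {
    "a.png": ["1girl", "smile", "dragon"],
    "b.png": ["1boy", "sword"],
    "c.png": ["1girl", "sky"],
}

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tag document against the query terms with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(tags) for tags in docs.values()) / n
    df = Counter(t for tags in docs.values() for t in set(tags))  # document frequency
    scores = {}
    for path, tags in docs.items():
        tf = Counter(tags)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(tags) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[path] = score
    return scores

scores = bm25_scores(["1girl", "dragon"], docs)
best = max(scores, key=scores.get)  # "a.png" matches both query tags
```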
- (The Web UI is implemented with Streamlit)

## Usage
- (Confirmed working environment)
- Windows 11 Pro 64bit 23H2 x86_64
- Python 3.12.7
- pip 22.0.4
- $ pip install -r requirements.txt
- (If you want to use a GPU from tagging.py, please install the appropriate PyTorch PyPI modules as below)
- $ pip install torch==2.5.1+cu121 torchaudio==2.5.1+cu121 torchvision==0.20.1+cu121 --index-url https://download.pytorch.org/whl/cu121
- The CUDA version should match your PC environment; change the --index-url option for your CUDA version
- $ python tagging.py --dir "IMAGE FILES CONTAINED DIR PATH"
- The script searches the directory structure recursively :)
- This takes quite a while...
- About 1.7 sec/file on a middle-spec desktop PC (GPU not used)
- AMD Ryzen 7 5700X 8-Core Processor 4.50 GHz
- About 0.5 sec/file with a GPU
- GeForce GTX 1660 SUPER
- Released in 2019/10
- VRAM: 6GB
- CUDA cores: 1408 units
- Core frequency: 1785MHz (boost mode)
- Theoretical peak FLOPS (fp32): about 5.02 TFLOPS
- Paths and tags of image files are saved to tags-wd-tagger.txt
- $ python genmodel.py
- This takes quite a while...
- $ streamlit run webui.py
- The search app opens in your web browser

## Use Character Image Feature Based Reranking Mode (Optional)
- Reranking based on similarity calculation with the [Quantized CCIP (Contrastive Anime Character Image Pre-Training) model](https://huggingface.co/deepghs/ccip_onnx)
- When the index data described below exists, this mode becomes selectable in webui.py
- **Additional index data preparation is needed**
- $ python gen_cfeatures.py --dir "IMAGE FILES CONTAINED DIR PATH"
- If the 'onnxruntime-gpu' module does not work, please uninstall it and install the normal 'onnxruntime'...
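Conceptually, this reranking sorts the search hits by the similarity of their character feature vectors to a reference image. A minimal sketch, with made-up 3-dimensional vectors standing in for real CCIP embeddings:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def rerank(hits, features, reference):
    """Sort search hits by character-feature similarity to the reference vector."""
    return sorted(hits, key=lambda path: cosine(features[path], reference), reverse=True)

# Made-up vectors; real CCIP embeddings are much higher-dimensional
features = {"a.png": [1.0, 0.0, 0.0], "b.png": [0.0, 1.0, 0.0], "c.png": [0.9, 0.1, 0.0]}
reference = [1.0, 0.05, 0.0]
order = rerank(["a.png", "b.png", "c.png"], features, reference)
```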
- Best of luck!

## Index Data Updating
- When you get and store new image files, you should update the index data so that the new files can be hit in searches on webui.py
- Procedure
- 1. Back up all files generated by the scripts in this repo!
- Model files in your home directory are the exception :)
- 2. $ python tagging.py --dir "IMAGE FILES CONTAINED DIR PATH" **--after "YYYY-MM-DD"**
- The --dir param doesn't have to be changed
- Adding the --after option is needed. Please specify a date after the last index data creation or update
- Tagging targets are filtered by the specified date: YYYY-MM-DD <= added date (ctime attribute)
- 3. $ python genmodel.py --update
- Optional
- 4. $ python gen_cfeatures.py --dir "IMAGE FILES CONTAINED DIR PATH" **--after "YYYY-MM-DD"**
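The --after filtering described above can be sketched like this. This is a guess at the logic, using the file creation time reported by os.stat; the actual filtering lives in tagging.py:

```python
import os
from datetime import datetime

def added_on_or_after(path, after_str):
    """True if the file's creation time (ctime) is on or after the YYYY-MM-DD cutoff."""
    cutoff = datetime.strptime(after_str, "%Y-%m-%d")
    created = datetime.fromtimestamp(os.stat(path).st_ctime)
    return created >= cutoff

# e.g. re-tag only files added after the last index update:
# targets = [p for p in all_paths if added_on_or_after(p, "2025-01-01")]
```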
- That's all!

## Usage (Binary Package for Windows on the Release Page)
- Same as above, except that you don't need to execute Python and the execution path (current path) is a little bit different :)
- First, unzip the package and launch Command Prompt or PowerShell :)
- $ cd anime-illust-image-searcher-pkg
- $ .\cmd_run\cmd_run.exe tagging --dir "IMAGE FILES CONTAINED DIR PATH"
- $ .\cmd_run\cmd_run.exe genmodel
- Same as above :)
- $ .\run_webui.exe
- The search app opens in your web browser!

## Tips (Attention)
- Words (tags) that did not appear during tagging cannot be used in queries
- Solution
- Search for the words you want to use in tags-wd-tagger.txt with grep, an editor, or similar to check whether they exist
- If they exist, there is no problem. If not, you should think of similar words and search for them in the same manner :)
- **Specifying Each Tag's Weight (format -> TAG:WEIGHT, WEIGHT should be an integer)**
- Examples
- "girl:3 dragon"
- "girl:2 boy:3"
- **Exclude tag marking**
- **A weight specification starting with '-' indicates that images with that tag should be excluded**
- **ex: "girl boy:-3"**
- **Images tagged 'boy' are removed from the results. The numerical weight value is ignored but can't be omitted :)**
- **Required tag marking**
- **A weight specification starting with '+' indicates that the tag is required**
- **ex: "girl:+3 dragon"**
- **Images not tagged 'girl' are removed from the results**
- **The weight value is NOT ignored when calculating scores**
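A small sketch of how such a query string could be parsed. The helper name is hypothetical; the actual parsing is in webui.py:

```python
def parse_query(query: str):
    """Split 'TAG[:WEIGHT]' tokens into weights, excluded tags, and required tags."""
    weights, excluded, required = {}, set(), set()
    for token in query.split():
        tag, sep, weight = token.partition(":")
        if not sep:                    # bare tag -> default weight 1
            weights[tag] = 1
        elif weight.startswith("-"):   # 'boy:-3' -> exclude; numeric value ignored
            excluded.add(tag)
        elif weight.startswith("+"):   # 'girl:+3' -> required, weight still counts
            required.add(tag)
            weights[tag] = int(weight)
        else:
            weights[tag] = int(weight)
    return weights, excluded, required

weights, excluded, required = parse_query("girl:+3 dragon boy:-3")
```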
- **Search Result Exporting feature**
- You can export the list of file paths hit by a search
- Pressing the 'Export' button saves the list as a text file in the path where the Web UI was executed
- The file name is the query text with a timestamp, and the contents are line-break delimited
- Some viewer tools such as [Irfan View](https://www.irfanview.com/) can load image files with passing a text file contains path list :)
- Irfan View can slideshow also. It's nice :)
- On Windows, the character code is Shift_JIS. On other OSes, it is UTF-8
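The export behavior described above can be sketched as follows. The function name and file-name pattern are assumptions, not the app's exact implementation:

```python
import sys
from datetime import datetime

def export_results(paths, query, out_dir="."):
    """Write hit file paths to a timestamped text file, one path per line."""
    encoding = "shift_jis" if sys.platform == "win32" else "utf-8"
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    out_path = f"{out_dir}/{query}_{stamp}.txt"
    with open(out_path, "w", encoding=encoding) as f:
        f.write("\n".join(paths))
    return out_path
```

A list exported this way can be fed directly to a viewer such as IrfanView.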
- Character code of file paths
- If a file path contains characters that can't be converted to Unicode or UTF-8, the scripts may output an error message when processing that file
- But it doesn't mean that your script usage is wrong, though these files are ignored or not displayed in the Web UI :|
- This is a problem of the current implementation. When you use the scripts on Windows and the character code of directory/file names isn't UTF-8, the problem may occur

## Information Related to Copyrights
- Tagger
- [this code](https://huggingface.co/spaces/SmilingWolf/wd-tagger/blob/main/app.py) was used as a reference when implementing the tagger script
- ["WD EVA02-Large Tagger v3" model](https://huggingface.co/SmilingWolf/wd-eva02-large-tagger-v3) is used for image file tagging
- **I thank SmilingWolf for their great work**
- Character visual similarity calculation
- [this code](https://huggingface.co/spaces/deepghs/ccip/blob/f7d50a4f5dd3d4681984187308d70839ff0d3f5b/ccip.py) was used as a reference when implementing the model execution
- [Quantized CCIP(Contrastive anime Character Image Pre-training) model](https://huggingface.co/deepghs/ccip_onnx) is used
- **I thank the deepghs community members for their great work**

## For Busy People
- **Tagging using a Google Colab environment!**
- 1. Make preprocessed data with [utility/make_tensor_files.py](./utility/make_tensor_files.py)
- 2. Zip the output dir
- 3. Upload the zipped file to Google Drive
- 4. Use a Google Colab environment like [this](https://github.com/ryogrid/ryogridJupyterNotebooks/blob/master/tagging_colab-241104-T4-with-Tensor-files.ipynb)
- 5. Get tags-wd-tagger.txt and replace the file paths in it to match where your image files exist :)
- 6. Execute genmodel.py!

## TODO
- [ ] Search on latent representations generated by a CLIP model
- **This method was already tried, but precision was not good because currently available public CLIP models do not fit anime style illustrations :|**
- If CLIP models fine-tuned on anime style illustration images become available, this method would be better than the current one
- [x] Weight specification for tags, like the prompt format of Stable Diffusion Web UI
- The previous implementation used all tags equally. But there are many cases where users want to emphasize specific tags and can't get appropriate results without that!
- [x] Fix a bug: some types of tags in tags-wd-tagger.txt couldn't be used in queries
- [x] Incremental index updating as image files increase
- [x] Similar image search by specifying an image file
- This is practically realized by the 'Character Image Feature Based Reranking Mode' :)
- [x] Feature for exporting the list of found files
- As a text file. Once you have the list, many other tools and viewers you like can be used :)
- [x] Making a binary package of this app that doesn't require building a Python environment

## Screenshots of Demo
- I used about 1000 image files collected from [Irasutoya](https://www.irasutoya.com/), which offers free image materials, as the example search target
- Note: image materials from Irasutoya have restrictions on commercial use
- Partial tagging result: [./tagging_example.txt](/tagging_example.txt)
- The generation script was executed on Windows
- File paths in the linked file have been partially masked
- Search "standing"
- 
- Search "standing animal"
- 
- Image info page
- 
- Slideshow feature
- Auto slide with a 5 sec period (loop)
- 