Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/zer0int/clip-xai-gui
CLIP GUI - XAI app ~ explainable (and guessable) AI with ViT & ResNet models
https://github.com/zer0int/clip-xai-gui
attention attention-visualization clip game gradient-ascent gui image-to-text vision-transformer vit xai
Last synced: 3 months ago
JSON representation
CLIP GUI - XAI app ~ explainable (and guessable) AI with ViT & ResNet models
- Host: GitHub
- URL: https://github.com/zer0int/clip-xai-gui
- Owner: zer0int
- Created: 2024-03-23T21:31:46.000Z (10 months ago)
- Default Branch: CLIP-vision
- Last Pushed: 2024-09-13T19:29:36.000Z (5 months ago)
- Last Synced: 2024-09-14T10:05:51.617Z (5 months ago)
- Topics: attention, attention-visualization, clip, game, gradient-ascent, gui, image-to-text, vision-transformer, vit, xai
- Language: Python
- Homepage:
- Size: 3.46 MB
- Stars: 9
- Watchers: 2
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
![CLIP-gui-banner2](https://github.com/zer0int/CLIP-XAI-GUI/assets/132047210/208fce6e-221b-4ff3-b7ee-3795a97b4fb6)
### Change 4/May/2024:
- Added AMP (Automatic Mixed Precision); uses torch.cuda.amp / autocast + GradScaler
- ViT models are now much smaller - ViT-L/14 fits into 24 GB VRAM!
- Just do "python run_clipapp-amp.py" to launch the GUI / use AMP for a CLIP 'opinion'.![before-after](https://github.com/zer0int/CLIP-XAI-GUI/assets/132047210/11b6b703-3f64-42df-9177-31143834d6c2)
-----
## CLIP GUI - XAI app ~ explainable (and guessable) ViT & ResNet
This is a GUI for OpenAI's CLIP ViT and ResNet models, where you can:
- Upload an image, get a CLIP 'opinion' (text) about the image
- --> Gradient Ascent -> optimize text embeddings for cosine similarity with image embedding -> tokenizer -> CLIP 'opinion' words
- Guess where CLIP was 'looking' for a given predicted word by setting a ROI (optional) & see what CLIP was 'looking' at
- --> "GradCAM" - like heatmap of salient features / attention visualization## Installation & Running
- **Prerequisite**: [OpenAI CLIP](https://github.com/openai/CLIP)
- Check / install `requirements.txt`
- From the console, use "python run_clipapp.py" -> GUI- Default CLIP ViT-B/32 takes ~15 seconds to generate an 'opinion' (RTX 4090), 4 GB VRAM.
- Gigantic models >> 24 GB VRAM can use NVIDIA Driver CUDA SysMem Fallback Policy to run, but largest models ~ 30 Minutes for 1 opinion (not recommended)
- You can get a smaller model's "opinion" and force that on a bigger model (should work for all >=6 GB VRAM), or add your own words to visualize.
- Check the console to see what CLIP is "MatMulling" about while you wait to get a CLIP opinion.
- Click the image to place a ROI and "guess where CLIP was looking" (gamification, optional).
- Images and texts are saved to the "clipapp" subfolder.
- Check out the examples in "image-examples" to get started with some interesting images (hilarious 'opinion', typographic attack vulnerability, ...).
- Use square images for best results## Credits / Built On
- [OpenAI / CLIP](https://github.com/openai/CLIP)
- ViT heatmaps built on: [Transformer-MM-Explainability](https://github.com/hila-chefer/Transformer-MM-Explainability)
- ResNet heatmaps built on: [GradCAM Visualization](https://github.com/kevinzakka/clip_playground)
- Original CLIP Gradient Ascent Script: Used with permission by Twitter / X: [@advadnoun](https://twitter.com/advadnoun)
- Special thanks to GPT-4 for coding 90% of the run_clipapp.py =)## Warning about Bias and Fairness in CLIP Output
CLIP 'opinions' may contain biased rants (especially when non-English text is in the image), slurs, and profanity. Use responsibly / at your own discretion.
For more information, refer to the [CLIP Model Card](https://github.com/openai/CLIP/blob/main/model-card.md).## Known Issues
- No threading, scripts that invoke models run on main thread (check console to verify thread is not *actually* hanging)## Examples
![Screenshot 2024-03-23 163731](https://github.com/zer0int/CLIP-XAI-GUI/assets/132047210/17f4bc5f-51e3-4c87-96b5-682a5fcaa794)
![example_git](https://github.com/zer0int/CLIP-XAI-GUI/assets/132047210/170b20e2-9ce1-4b12-bb86-706af89db156)