https://github.com/zer0int/clip-generative-adversarial
Projected Gradient Descent (PGD), inverted and amplified -> prompt & generate images with CLIP
- Host: GitHub
- URL: https://github.com/zer0int/clip-generative-adversarial
- Owner: zer0int
- Created: 2024-07-27T11:24:34.000Z (10 months ago)
- Default Branch: CLIP-vision
- Last Pushed: 2024-07-28T18:15:11.000Z (10 months ago)
- Last Synced: 2025-01-02T17:33:25.140Z (5 months ago)
- Topics: adversarial, adversarial-attacks, adversarial-examples, ai, clip, generative, generative-ai, pgd, vision-transformer, xai
- Language: Python
- Homepage:
- Size: 14.4 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## CLIP-generative-adversarial
### Who needs a Diffusion U-Net or Transformer if you can just use the 'text encoder'...? 😉
- Uses Projected Gradient Descent (PGD), which is commonly used in adversarial robustness training
- Inverts the objective and amplifies the perturbations so they become visible to humans
- Perturbs the image towards (!) the prompt by default, making CLIP a self-generative AI 🙃
- Also plots the run's 'success' (cosine similarity of the original vs. the perturbed image)
- You can change the default behavior to "classic" adversarial example generation
- See code comments for details and instructions; a minimal sketch of the loop is shown after this list
------
- If you ever wondered why some strange word in a prompt for, say, Stable Diffusion works, now you can find out!
- For Stable Diffusion V1, the sole text encoder "guide" is CLIP ViT-L/14 (the model set by default in my code).
- The difference between *this* and feature activation maximization: here, the whole model's output is used to guide the image towards a text prompt.
- To visualize individual 'neurons' (features) in CLIP ViT, see my other repo: [zer0int/CLIP-ViT-visualization](https://github.com/zer0int/CLIP-ViT-visualization)
- To get a CLIP opinion of what CLIP 'thinks' of an image, see: [zer0int/CLIP-XAI-GUI](https://github.com/zer0int/CLIP-XAI-GUI)
------
Requires: torch, torchvision, numpy, PIL, matplotlib, OpenCV, and optionally skimage.
Requires [OpenAI/CLIP](https://github.com/openai/CLIP).