Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Segment Anything combined with CLIP
- Host: GitHub
- URL: https://github.com/Curt-Park/segment-anything-with-clip
- Owner: Curt-Park
- License: apache-2.0
- Created: 2023-04-06T15:33:18.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-02-19T14:20:58.000Z (11 months ago)
- Last Synced: 2024-08-03T23:23:45.315Z (6 months ago)
- Topics: colab-notebook, huggingface-spaces, machine-learning, nlp-machine-learning, segmentation-model
- Language: Python
- Homepage: https://huggingface.co/spaces/curt-park/segment-anything-with-clip
- Size: 16.8 MB
- Stars: 322
- Watchers: 1
- Forks: 22
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-segment-anything-extensions - Repo
- Awesome-Segment-Anything - **Segment Anything with Clip** (Application / Image Detection/Segmentation)
README
# Segment Anything with Clip
[[HuggingFace Space](https://huggingface.co/spaces/curt-park/segment-anything-with-clip)] | [[COLAB](https://colab.research.google.com/github/Curt-Park/segment-anything-with-clip/blob/main/colab.ipynb)] | [[Demo Video](https://youtu.be/vM7MfAc3BdQ)]

Meta released [a new foundation model for segmentation tasks](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/).
It aims to resolve downstream segmentation tasks with prompt engineering, such as foreground/background points, bounding box, mask, and free-formed text.
However, the text-prompt capability has not been released yet. As an alternative, I took the following steps:
1. Get all object proposals generated by SAM (Segment Anything Model).
2. Crop the object regions by bounding boxes.
3. Get cropped images' features and a query feature from [CLIP](https://openai.com/research/clip).
4. Calculate the similarity between image features and the query feature.
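The four steps above can be sketched with plain NumPy stand-ins (the actual project uses SAM for proposals and CLIP for features; `crop_by_bbox` and `rank_by_similarity` below are illustrative helpers, not part of this repository):

```python
import numpy as np

def crop_by_bbox(image: np.ndarray, bbox: tuple) -> np.ndarray:
    """Step 2: crop an object region given an (x, y, w, h) box,
    the format SAM's automatic mask generator reports as "bbox"."""
    x, y, w, h = bbox
    return image[y : y + h, x : x + w]

def rank_by_similarity(crop_feats: np.ndarray, query_feat: np.ndarray) -> np.ndarray:
    """Steps 3-4: cosine-similarity scores of each crop's feature
    against a single query feature, normalized with a softmax."""
    crop_feats = crop_feats / np.linalg.norm(crop_feats, axis=1, keepdims=True)
    query_feat = query_feat / np.linalg.norm(query_feat)
    logits = crop_feats @ query_feat            # one score per crop
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                      # softmax over crops

# Toy inputs standing in for an image, SAM proposals, and CLIP features.
image = np.zeros((100, 100, 3), dtype=np.uint8)
crop = crop_by_bbox(image, (10, 20, 30, 40))    # 30 px wide, 40 px tall

feats = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([1.0, 0.0])
scores = rank_by_similarity(feats, query)
best = int(np.argmax(scores))                   # index of the best-matching crop
```

In the real pipeline, `feats` and `query` come from CLIP's image and text encoders, and the highest-scoring crop is the region matching the text query.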
```python
# How to get the similarity.
# Assumes `model, preprocess = clip.load(...)` from the CLIP package,
# `crop` is a PIL image of one object proposal, and `texts` is a list
# of query strings.
preprocessed_img = preprocess(crop).unsqueeze(0)
tokens = clip.tokenize(texts)
logits_per_image, _ = model(preprocessed_img, tokens)
similarity = logits_per_image.softmax(-1)
```

## How to run locally
[Anaconda](https://www.anaconda.com/) is required before starting the setup.
```bash
make env
conda activate segment-anything-with-clip
make setup
```

```bash
# This starts the Gradio server.
make run
```
Open http://localhost:7860/ in your browser.
![](https://user-images.githubusercontent.com/14961526/232016821-dda192c1-1095-4086-adb8-e6a9f44b685f.png)

## Follow-up Works
- [Fast Segment Everything](https://huggingface.co/spaces/Annotation-AI/fast-segment-everything): A re-implementation of the *Everything* algorithm in an iterative manner, better suited to CPU-only environments. It shows results comparable to the original *Everything* with about 1/5 as many inferences (e.g. 1024 vs. 200), and takes under 10 seconds to search for masks on a `CPU upgrade` instance (8 vCPU, 32 GB RAM) of Hugging Face Spaces.
- [Fast Segment Everything with Text Prompt](https://huggingface.co/spaces/Annotation-AI/fast-segment-everything-with-text-prompt): This example, based on Fast Segment Everything, provides a text prompt that generates an attention map for the area you want to focus on.
- [Fast Segment Everything with Image Prompt](https://huggingface.co/spaces/Annotation-AI/fast-segment-everything-with-image-prompt): This example, based on Fast Segment Everything, provides an image prompt that generates an attention map for the area you want to focus on.
- [Fast Segment Everything with Drawing Prompt](https://huggingface.co/spaces/Annotation-AI/fast-segment-everything-with-drawing-prompt): This example, based on Fast Segment Everything, provides a drawing prompt that generates an attention map for the area you want to focus on.

## References
- https://github.com/facebookresearch/segment-anything
- https://github.com/openai/CLIP