Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/sovit-123/sam_molmo_whisper
An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.
molmo segment-anything-model segmentanythingmodel vlm whisper
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/sovit-123/sam_molmo_whisper
- Owner: sovit-123
- License: apache-2.0
- Created: 2024-10-10T01:43:26.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-10-16T16:58:56.000Z (2 months ago)
- Last Synced: 2024-10-17T13:38:21.111Z (2 months ago)
- Topics: molmo, segment-anything-model, segmentanythingmodel, vlm, whisper
- Language: Jupyter Notebook
- Homepage:
- Size: 5.99 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SAM_Molmo_Whisper
A simple integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.

Capabilities:
* Segment objects with **SAM2.1** using point prompts.
* Points can be obtained by **prompting Molmo** with natural language. Molmo accepts input either from the **text box (typing)** or from **Whisper via microphone (speech to text)**.

**Run the Gradio demo using**:
```
python app.py
```

https://github.com/user-attachments/assets/66a0620e-ede3-4018-8ee7-f261790747cb
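
The hand-off from Molmo to SAM2.1 described in the capabilities list depends on turning Molmo's text answer into pixel coordinates. The sketch below is one way this could look; it is not taken from `app.py`. It assumes Molmo replies with `<point x=".." y="..">` or `<points x1=".." y1=".." ...>` tags whose coordinates are percentages (0 to 100) of the image width and height. The function name `molmo_points_to_pixels` is hypothetical.

```python
import re

# Assumption: Molmo answers with <point ...> or <points ...> tags whose
# x/y attributes are percentages (0-100) of the image width and height.
_TAG = re.compile(r"<points?\b([^>]*)>")
_ATTR = re.compile(r'\b([xy])(\d*)="(\d+(?:\.\d+)?)"')


def molmo_points_to_pixels(text, width, height):
    """Hypothetical helper: extract (x, y) pixel coordinates from Molmo text."""
    pixels = []
    for attrs in _TAG.findall(text):
        xs, ys = {}, {}
        for axis, idx, value in _ATTR.findall(attrs):
            # A single <point> has no index suffix; treat it as index 0.
            (xs if axis == "x" else ys)[int(idx or 0)] = float(value)
        for idx in sorted(xs):
            if idx in ys:  # keep only complete (x, y) pairs
                pixels.append(
                    (round(xs[idx] / 100 * width), round(ys[idx] / 100 * height))
                )
    return pixels


if __name__ == "__main__":
    reply = '<points x1="10.0" y1="20.0" x2="50.0" y2="60.0" alt="cats">cats</points>'
    print(molmo_points_to_pixels(reply, width=200, height=100))
```

The resulting list can then be passed to SAM2.1 as point prompts, typically with a foreground label for each point.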
## Installing Requirements
Install PyTorch, Hugging Face Transformers, and the rest of the base requirements.
```
pip install -r requirements.txt
```

**Install SAM2:**
*It is highly recommended to clone SAM2 into a directory outside this project directory and run the installation commands there*.
```
git clone https://github.com/facebookresearch/sam2.git && cd sam2
pip install -e .
```
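
After both installation steps, a quick check along these lines can confirm that the packages are importable before launching the demo. This is a minimal sketch and not part of the repository. The import names `torch`, `transformers`, `gradio`, and `sam2` are assumptions about what `requirements.txt` and the SAM2 install provide, and `missing_modules` is a hypothetical helper.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names; adjust them to match requirements.txt.
    required = ["torch", "transformers", "gradio", "sam2"]
    missing = missing_modules(required)
    print("Missing:", ", ".join(missing) if missing else "none")
```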