https://github.com/scholarchen20/ustxtseg
Weakly-Supervised Medical Image Segmentation with Simple Text Cues
image-processing image-segmentation medical multimodal python pytorch segment-anything ultrasound-imaging
- Host: GitHub
- URL: https://github.com/scholarchen20/ustxtseg
- Owner: ScholarChen20
- License: apache-2.0
- Created: 2025-11-09T14:47:26.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-11-09T14:59:11.000Z (8 months ago)
- Last Synced: 2025-11-09T16:22:49.105Z (8 months ago)
- Topics: image-processing, image-segmentation, medical, multimodal, python, pytorch, segment-anything, ultrasound-imaging
- Language: Python
- Homepage: https://github.com/ScholarChen20/USTxtSeg
- Size: 1.43 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
README
[//]: # (# SimTxtSeg: Weakly-Supervised Medical Image Segmentation with Simple Text Cues)
[//]: # (Paper : [arxiv](https://arxiv.org/abs/2406.19364), has been accepted by ***MICCAI2024✨***)
[//]: # ()
[//]: # (by Yuxin Xie, Tao Zhou, Yi Zhou, Geng Chen)
## 🙋 Introduction
[//]: # (Our contribution consists of two key components: an effective Textual-to-Visual Cue Converter that produces visual prompts from text prompts on medical images, and a text-guided segmentation model with Text-Vision Hybrid Attention that fuses text and image features. We evaluate our framework on two medical image segmentation tasks: colonic polyp segmentation and MRI brain tumor segmentation, and achieve consistent state-of-the-art performance.)
## 🚀 Updates
[//]: # (* `[2024.07.07]` We are excited to release : ✅dataset and ✅TVCC code.)
[//]: # (* `[2024.09.25]` We are excited to release : ✅TVHA code.)
## 📖 Dataset Preparation
* Dataset Download
  1. Polyp Dataset: [PolypGen](https://www.synapse.org/#!Synapse:syn26376615/wiki/613312) (data_C1 to data_C6 are used), [others](https://github.com/DengPingFan/PraNet) (including CVC-300 (60 samples), CVC-ClinicDB (612 samples), CVC-ColonDB (380 samples), ETIS-LaribPolypDB (196 samples), Kvasir (100 samples), Kvasir-SEG (900 samples))
  2. Brain Tumor Dataset: [kaggle_3m](https://www.kaggle.com/datasets/nikhilroxtomar/brain-tumor-segmentation)
  3. ISIC Dataset: [ISIC](https://challenge.isic-archive.com/data/#2019)
* For TVCC, we avoid the cost of handcrafted prompts by using GPT-4 to generate a concise sentence of at most 20 words. Before training, convert your dataset to **ODVG** format so that regions and phrases are precisely aligned. **COCO**-format labels are also required for validation and testing.
```
python util/mask2odvg.py
python util/mask2coco.py
```
* For the TVHA segmentation model, binary masks are sufficient.
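The mask-to-ODVG conversion mentioned above can be sketched as follows. This is a minimal illustration, not the repository's `util/mask2odvg.py`. The helper names `mask_to_bbox` and `mask_to_odvg_record` are hypothetical, and the field names follow the ODVG grounding layout described in the MMDetection documentation as I understand it, so check them against the actual script before relying on them. The sketch also assumes one foreground region per mask and a phrase that appears verbatim in the caption.

```python
import json

import numpy as np


def mask_to_bbox(mask):
    """Return the tight [x1, y1, x2, y2] box around the mask foreground, or None if the mask is empty.

    x2 and y2 are exclusive (one past the last foreground pixel).
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return [int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1]


def mask_to_odvg_record(filename, mask, caption, phrase):
    """Build one ODVG-style grounding record (assumed layout) from a binary mask and its text."""
    bbox = mask_to_bbox(mask)
    if bbox is None:
        return None  # nothing to ground in an empty mask
    start = caption.find(phrase)
    if start < 0:
        raise ValueError("phrase must occur verbatim in caption")
    height, width = mask.shape[:2]
    return {
        "filename": filename,
        "height": int(height),
        "width": int(width),
        "grounding": {
            "caption": caption,
            "regions": [
                {
                    "bbox": bbox,
                    "phrase": phrase,
                    # character span of the phrase inside the caption
                    "tokens_positive": [[start, start + len(phrase)]],
                }
            ],
        },
    }
```

Each record would typically be written as one line of a JSON Lines file, for example with `json.dumps(record)`.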
## ⚡ Quick Start
### 1. Environment
Clone the whole repository and install the dependencies.
```
conda create -n USTxtSeg python=3.11
conda activate USTxtSeg
git clone https://github.com/ScholarChen20/USTxtSeg.git
cd USTxtSeg
pip install -r requirements.txt
```
To install mmdet, see [mmdet_get_started_中文](https://github.com/open-mmlab/mmdetection/blob/cfd5d3a985b0249de009b67d04f37263e11cdf3d/docs/zh_cn/get_started.md) or [mmdet_get_started_english](https://github.com/open-mmlab/mmdetection/blob/cfd5d3a985b0249de009b67d04f37263e11cdf3d/docs/en/get_started.md).
### 2. For TVCC
Download the Swin-T weights `swin_tiny_patch4_window7_224.pth`: https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth

Download the Grounding DINO checkpoint:
```
wget https://download.openmmlab.com/mmdetection/v3.0/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det/grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth
```
Then use the config files to pretrain TVCC. The polyp, brain tumor, and ISIC datasets are supported.
```
cd TVCC/polyp_grounding_dino
./tools/dist_train.sh TVCC/polyp_grounding_dino/config/GroundingDINO_Polyp_PhraseGrounding_config.py n  # replace n with the number of GPUs to use
```
TVCC evaluation:
```
# Single GPU
python tools/test.py config_path ckpt_path
# 4 GPUs
./tools/dist_test.sh config_path ckpt_path 4
```
Visualize the visual cues:
```
python tools/image_demo.py \
image_path \
config_path \
--weights weight_path \
--texts 'xxx'
```
### 3. Pseudo Masks Generation
Click the links below to download the checkpoint for the corresponding model type.
* default or vit_h: [ViT-H SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)
* vit_l: [ViT-L SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth)
* vit_b: [ViT-B SAM model](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth) & [SAM-Med2d](https://drive.google.com/file/d/1ARiB5RkSsWmAB_8mqWnwDF8ZKTtFwsjl/view?usp=drive_link)

Use the SAM and TVCC checkpoints to generate the pseudo masks.
```
cd TVCC/polyp_grounding_dino
python TVCC_Sam.py
```
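As a rough illustration of this step, the sketch below prompts SAM with bounding boxes (such as those predicted by TVCC) and merges the resulting masks into one binary pseudo mask. It is not the repository's `TVCC_Sam.py`, whose details may differ. The function names `load_sam_predictor` and `boxes_to_pseudo_mask` are hypothetical, and taking the union of the per-box masks is an assumption. The SAM calls used (`sam_model_registry`, `SamPredictor.set_image`, `SamPredictor.predict`) come from the `segment_anything` package.

```python
import numpy as np


def load_sam_predictor(checkpoint_path, model_type="vit_h", device="cpu"):
    """Build a SamPredictor from a downloaded SAM checkpoint."""
    # Imported lazily so boxes_to_pseudo_mask can be used without SAM installed.
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
    sam.to(device=device)
    return SamPredictor(sam)


def boxes_to_pseudo_mask(predictor, image_rgb, boxes_xyxy):
    """Union the masks predicted for each box prompt into one binary pseudo mask.

    image_rgb: HxWx3 uint8 array in RGB order.
    boxes_xyxy: iterable of (x1, y1, x2, y2) boxes in pixel coordinates.
    Returns an HxW uint8 array with 255 for foreground and 0 for background.
    """
    predictor.set_image(image_rgb)
    pseudo_mask = np.zeros(image_rgb.shape[:2], dtype=bool)
    for box in boxes_xyxy:
        masks, _scores, _low_res = predictor.predict(
            box=np.asarray(box, dtype=np.float32),
            multimask_output=False,  # one mask per box prompt
        )
        pseudo_mask |= masks[0].astype(bool)
    return pseudo_mask.astype(np.uint8) * 255
```

With a real checkpoint, `load_sam_predictor("sam_vit_h_4b8939.pth")` would supply the `predictor` argument. Any object exposing the same `set_image` and `predict` methods also works, which makes the merging logic easy to check in isolation.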
### 4. USTxtSeg with TVHA
Use the pseudo masks and text prompts to supervise the segmentation model.
```
python train.py
python test.py
```
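Training in this stage consumes triplets of image, pseudo mask, and text prompt. The sketch below shows one way such triplets might be assembled before they are handed to a data loader. It is an illustration only: the function `pair_training_samples`, the directory layout, and the stem-keyed prompt dictionary are all assumptions, and `train.py` may organize its inputs differently.

```python
from pathlib import Path


def pair_training_samples(image_dir, mask_dir, prompts):
    """Return (image_path, mask_path, text) triplets for images with both a pseudo mask and a prompt.

    prompts: dict mapping a file stem (e.g. "case_001") to its text prompt.
    Images lacking either a mask or a prompt are skipped.
    """
    masks = {p.stem: p for p in Path(mask_dir).iterdir() if p.is_file()}
    samples = []
    for image_path in sorted(Path(image_dir).iterdir()):
        stem = image_path.stem
        if image_path.is_file() and stem in masks and stem in prompts:
            samples.append((image_path, masks[stem], prompts[stem]))
    return samples
```

Matching on the file stem lets images and masks use different extensions (for example `.jpg` images with `.png` masks).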
## 🎯 Results
**Comparison experiments and Ablation study:**
**Visualization**
## 🗓️ Ongoing
- [x] paper release
[//]: # (- [x] dataset release)
[//]: # (- [x] TVCC pretrain and test code release)
[//]: # (- [x] SimTxtSeg with TVHA model release.)
[//]: # ()
## 🎫 License
This project is released under the Apache 2.0 license.
## 💘 Acknowledgements
* mmdetection: https://github.com/open-mmlab/mmdetection/tree/main
* GroundingDINO: https://github.com/IDEA-Research/GroundingDINO
* Segment Anything: https://github.com/facebookresearch/segment-anything
[//]: # ()
## ✒️ Citation
[//]: # (If you find this repository useful, please consider citing this paper:)
[//]: # (```)
[//]: # (```)
[//]: # (## 📬 Contact)
[//]: # (If you have any question, please feel free to contact silver_iris@163.com.)