{"id":51186284,"url":"https://github.com/syrax90/mask2former_tensorflow2","last_synced_at":"2026-06-27T10:04:05.350Z","repository":{"id":336131833,"uuid":"1065783000","full_name":"syrax90/mask2former_tensorflow2","owner":"syrax90","description":"Mask2Former TensorFlow","archived":false,"fork":false,"pushed_at":"2026-03-21T15:33:46.000Z","size":5299,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-22T05:58:25.898Z","etag":null,"topics":["computer-vision","machine-learning","mask2former","tensorflow","tensorflow2"],"latest_commit_sha":null,"homepage":"https://syrax90.github.io/mask2former_tensorflow2/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/syrax90.png","metadata":{"files":{"readme":"ReadMe.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-28T12:23:12.000Z","updated_at":"2026-03-21T15:33:28.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/syrax90/mask2former_tensorflow2","commit_stats":null,"previous_names":["syrax90/mask2former_tensorflow2"],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/syrax90/mask2former_tensorflow2","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/syrax90%2Fmask2former_tensorflow2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/syrax90%2Fmask2former_tensorflow2/tags","releases_url":"https://repos.e
cosyste.ms/api/v1/hosts/GitHub/repositories/syrax90%2Fmask2former_tensorflow2/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/syrax90%2Fmask2former_tensorflow2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/syrax90","download_url":"https://codeload.github.com/syrax90/mask2former_tensorflow2/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/syrax90%2Fmask2former_tensorflow2/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34848983,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-27T02:00:06.362Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","machine-learning","mask2former","tensorflow","tensorflow2"],"created_at":"2026-06-27T10:04:04.642Z","updated_at":"2026-06-27T10:04:05.343Z","avatar_url":"https://github.com/syrax90.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mask2Former with TensorFlow\n\nThis project is an implementation of **Mask2Former** using the TensorFlow framework. The goal is to provide a clear explanation of how Mask2Former works and demonstrate how the model can be implemented with TensorFlow.   
\n[Mask2Former TensorFlow](https://github.com/syrax90/mask2former_tensorflow2)\n\n## About Mask2Former\n\nMask2Former is a model designed for image segmentation tasks, including instance and panoptic segmentation.\n\u003e [**Masked-attention Mask Transformer for Universal Image Segmentation**](https://arxiv.org/abs/2112.01527),\n\u003e Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexander Kirillov, Rohit Girdhar\n\u003e *arXiv preprint ([arXiv:2112.01527](https://arxiv.org/abs/2112.01527))*\n\nTo understand instance or panoptic segmentation better, consider the example below, where multiple objects, whether of the same or different classes, are identified as separate instances, each with its own segmentation mask (and the probability of belonging to a certain class):\n\n![Instance segmentation picture](images/readme/my_photo_with_masks.jpg)\n\nThe current implementation of Mask2Former supports two backbones:\n- **ResNet-50** (default): ImageNet pre-trained, \"caffe\"-style BGR preprocessing.\n- **MobileNetV4ConvMedium**: lightweight pure-convolution backbone from `tf-models-official`, suitable for mobile/edge deployment. Trained from scratch with [0, 1] RGB preprocessing.\n\n## Installation, Dependencies, and Requirements\n\nThe project has been tested on \u003cstrong\u003eUbuntu 24.04.2 LTS\u003c/strong\u003e with \u003cstrong\u003envcr.io/nvidia/tensorflow:25.02-tf2-py3 container using TensorFlow 2.17.0\u003c/strong\u003e. It may work on other operating systems and TensorFlow versions (older than 2.17.0), but we cannot guarantee compatibility.\n\nIf you don't use the container, you need to install the dependencies yourself:\n\n- Python 3.12.3\n- All dependencies are listed in `requirements.txt`.\n- Use `setup.sh` to install all dependencies on Linux.\n\n\u003e \u003cstrong\u003eNote:\u003c/strong\u003e A GPU with CUDA support is highly recommended to speed up training.\n\n## Datasets\n\nThe code supports datasets in the COCO format. 
We recommend creating your own dataset to better understand the full training cycle, including data preparation. [LabelMe](https://github.com/wkentaro/labelme) is a good tool for this. You don’t need a large dataset or many classes to begin training and see results. This makes it easier to experiment and learn without requiring powerful hardware.  \nAlternatively, you can use the original [COCO dataset](https://cocodataset.org/#home), which contains 80 object categories. You can also train on your own large dataset, as the model is well suited to this.\n\nFor performance, the dataset is stored in TFRecord format. TensorFlow can read TFRecord files in parallel, and the format is compatible with TensorFlow graph mode. To use the dataset, follow these steps:\n\n1) Convert your COCO dataset to TFRecord files:\n\n```bash\npython convert_coco_to_tfrecord.py \\\n  --images_root /path/to/images \\\n  --annotations /path/to/instances_train.json \\\n  --output /path/to/out/train.tfrecord \\\n  --num_shards 4\n```\n\n```bash\npython convert_coco_to_tfrecord.py \\\n  --images_root /path/to/images \\\n  --annotations /path/to/instances_val.json \\\n  --output /path/to/out/test.tfrecord \\\n  --num_shards 4\n```\n\nFor Panoptic Segmentation:\n\n```bash\npython convert_coco_to_tfrecord.py \\\n  --images_root /path/to/images \\\n  --annotations /path/to/panoptic_train.json \\\n  --panoptic_masks_root /path/to/panoptic_masks \\\n  --output /path/to/out/panoptic_train.tfrecord \\\n  --num_shards 4\n```\n\n2) Set the corresponding paths in the `config.py` file:\n\n```python\nself.tfrecord_dataset_directory_path  = 'path/to/tfrecords/train/directory'\nself.tfrecord_test_path = 'path/to/tfrecords/test/directory'\n```\n\nFor Panoptic Segmentation:\n\n```python\nself.use_panoptic_dataset = True\nself.tfrecord_panoptic_dataset_directory_path = 'path/to/tfrecords/panoptic_train/directory'\nself.tfrecord_panoptic_test_path = 
'path/to/tfrecords/panoptic_test/directory'\n```\n\n## Configuration\n\nAll configuration parameters are defined in the `config.py` file within the `Mask2FormerConfig` class.\n\n(Optionally) Set the path to your COCO root directory:\n\n```python\nself.coco_root_path = '/path/to/your/coco/root/directory'\n```\n\nSet the path to your COCO training dataset:\n\n```python\nself.tfrecord_dataset_directory_path  = 'path/to/tfrecords/directory'\n```\n\nSet the path to the dataset's annotation file:  \n\n```python\nself.train_annotation_path = f'{self.coco_root_path}/annotations/instances_train2017.json'\n```\n\nFor Panoptic Segmentation:\n\n```python\n# Panoptic Segmentation\nself.use_panoptic_dataset = True\nself.panoptic_train_annotation_path = f'{self.coco_root_path}/annotations/panoptic_train2017.json'\nself.tfrecord_panoptic_dataset_directory_path = 'path/to/tfrecords/panoptic_train/directory'\nself.tfrecord_panoptic_test_path = 'path/to/tfrecords/panoptic_test/directory'\n```\n\nThe remaining parameters are largely self-explanatory:\n\n```python\n# Image parameters\nself.img_height = 480\nself.img_width = 480\n\n# Transformer architecture parameters\nself.transformer_input_channels = 256 # Channel dimension for transformer inputs\nself.num_decoder_layers = 6           # Number of transformer decoder layers\nself.num_heads = 8                    # Number of attention heads\nself.dim_feedforward = 1024           # Feed-forward network dimension\n\n# Backbone selection: \"resnet50\" (default) or \"mobilenetv4\"\nself.backbone_type = \"resnet50\"\n\n# If load_previous_model = True: load the previous model weights.\nself.load_previous_model = False\nself.lr = 0.0001\nself.batch_size = 16\n# If load_previous_model = True, the code will look for the latest checkpoint in this directory or use this path if it is a specific checkpoint file.\nself.model_path = './checkpoints'  # example for specific checkpoint: self.model_path = './checkpoints/ckpt-5'\n\n# Save the model weights every 
save_iter epochs:\nself.save_iter = 1\nself.approx_coco_train_size = 118287\n# Number of epochs\nself.epochs = 100\n\n# Testing configuration\nself.test_model_path = './checkpoints'  # example for specific checkpoint: self.test_model_path = './checkpoints/ckpt-5'\nself.score_threshold = 0.5\n\n# Accumulation mode\nself.use_gradient_accumulation_steps = False\nself.accumulation_steps = 8\n\n# Dataset options\nself.tfrecord_test_path = f'{self.coco_root_path}/tfrecords/test'  # Path to TFRecord test dataset directory. Used for mAP calculation.\nself.augment = True\nself.shuffle_buffer_size = 4096  # TFRecord dataset shuffle buffer size. Set to None to disable shuffling\nself.warmup_steps = 10000\n\n# Whether to print the model summary at the beginning of training\nself.show_model_summary = False\n```\n\n## Docker files\n\nThe Docker files are available in the `docker` directory. `nvcr.io/nvidia/tensorflow:25.02-tf2-py3` doesn't contain all the required dependencies, so we provide custom containers depending on the chosen backbone:\n\n- **`docker/my-tf`**: Use this container for the default **ResNet50** backbone.\n- **`docker/my-tf-mobile`**: Use this container when using the **MobileNetV4** backbone. It installs additional dependencies (like `tf-models-official`) and applies specific patches needed for the lightweight mobile setup.\n\n## Training\n\nTo start training, run:\n\n```bash\npython train.py\n```\n\nUsing the container:\n\n```bash\ndocker run --rm --ipc host --gpus all -v /path/to/Mask2Former/directory:/opt/project -v /path/to/datasets/Cocodataset2017:/path/to/datasets/Cocodataset2017 -w /opt/project --entrypoint=  my-tf:latest python train.py\n```\n\nModel weights are saved in the `checkpoints` directory every `cfg.save_iter` epochs.\n\nTo resume training from a saved checkpoint:\n\n1) Set the configuration parameter `load_previous_model` to `True`:\n\n```python\nself.load_previous_model = True\n```\n\n2) Set the path to the previously saved model. 
By default, the latest checkpoint will be used:\n\n```python\nself.model_path = './checkpoints'  # example for specific checkpoint: self.model_path = './checkpoints/ckpt-5'\n```\n\n## Testing\n\nTo test the model:\n\n1) Move your test images into the `/images/test` directory.\n\n2) In the config file, set the path to the model weights you want to test. By default, the latest checkpoint will be used:\n\n```python\nself.test_model_path = './checkpoints'  # example for specific checkpoint: self.test_model_path = './checkpoints/ckpt-5'\n```\n\n3) Run the test script:\n\n```bash\npython test.py\n```\n\nUsing the container:\n\n```bash\ndocker run --rm --ipc host --gpus all -v /path/to/Mask2Former/directory:/opt/project -v /path/to/datasets/Cocodataset2017:/path/to/datasets/Cocodataset2017 -w /opt/project --entrypoint=  my-tf:latest python test.py\n```\n\nOutput images with masks and class labels will be saved in the `/images/res` directory.\n\n## Dataset Evaluation\n\nYou can inspect the data fed to the model before training to verify that the masks, classes, and scales are applied correctly.\n\nThe `test_dataset.py` script generates images with instance or panoptic masks and their corresponding category labels. The outputs are saved in `images/dataset_test`.\n\nBy default, it processes the first 200 randomly selected images. To change or remove this limit, edit `test_dataset.py`.\n\n## Test mAP\n\nYou can measure the model's accuracy by computing mAP on the test dataset.\n\n1) Set the path to the test dataset in the config file:\n\n```python\nself.tfrecord_test_path = 'path/to/tfrecords/test/directory'  # Path to TFRecord test dataset directory. 
Used for mAP calculation.\n```\n\nFor Panoptic Segmentation:\n\n```python\n# Panoptic Segmentation\nself.use_panoptic_dataset = True\nself.tfrecord_panoptic_test_path = 'path/to/tfrecords/panoptic_test/directory'\n```\n\n2) Run the test mAP script:\n\n```bash\npython test_map.py\n```\n\n## Tasks for the near future\n\n- Add support for multi-GPU training.\n\n## Thank you\n\nWe appreciate your interest and contributions toward improving this project. Happy learning and using Mask2Former!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsyrax90%2Fmask2former_tensorflow2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsyrax90%2Fmask2former_tensorflow2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsyrax90%2Fmask2former_tensorflow2/lists"}