{"id":23102750,"url":"https://github.com/dataxujing/mask_rcnn_keras","last_synced_at":"2026-05-01T09:31:42.428Z","repository":{"id":112305628,"uuid":"251602850","full_name":"DataXujing/Mask_RCNN_keras","owner":"DataXujing","description":":bug::bug: Tutorial on training Mask RCNN with Keras, covering training-set construction, config file changes, training, and inference","archived":false,"fork":false,"pushed_at":"2020-03-31T12:58:42.000Z","size":75446,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-03T21:32:18.963Z","etag":null,"topics":["instance-segmentation","keras","mask-rcnn"],"latest_commit_sha":null,"homepage":"https://github.com/matterport/Mask_RCNN","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DataXujing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-03-31T12:56:11.000Z","updated_at":"2022-05-19T18:01:35.000Z","dependencies_parsed_at":"2023-05-12T18:15:53.562Z","dependency_job_id":null,"html_url":"https://github.com/DataXujing/Mask_RCNN_keras","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DataXujing/Mask_RCNN_keras","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataXujing%2FMask_RCNN_keras","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataXujing%2FMask_RCNN_keras/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataXujing%2FMask_RCNN_keras/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataXujing
%2FMask_RCNN_keras/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DataXujing","download_url":"https://codeload.github.com/DataXujing/Mask_RCNN_keras/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DataXujing%2FMask_RCNN_keras/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32492115,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["instance-segmentation","keras","mask-rcnn"],"created_at":"2024-12-17T00:00:26.241Z","updated_at":"2026-05-01T09:31:42.410Z","avatar_url":"https://github.com/DataXujing.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Training Mask R-CNN on Your Own Dataset\n\n## Mask R-CNN for Object Detection and Segmentation\n\n**徐静**\n\n\n\npaper: [Mask R-CNN](https://arxiv.org/abs/1703.06870)\n\nA Python 3, Keras, and TensorFlow implementation of Mask R-CNN, built on a Feature Pyramid Network (FPN) and a ResNet101 backbone.\n\n\nThis repository provides:\n\n* A Mask R-CNN network based on FPN and ResNet101\n* Training on annotations in the COCO format\n* Pre-trained models\n* Jupyter notebooks that visualize the whole training and inference process\n* Multi-GPU parallel training\n* The AP evaluation metric\n* An example of training on your own dataset\n\n\nFile overview:\n\n* [demo.ipynb](samples/demo.ipynb) A demo that segments your own data with a model pre-trained on MS COCO\n* [train_shapes.ipynb](samples/shapes/train_shapes.ipynb) How to train Mask R-CNN on your own dataset, using the toy 
Shapes dataset as an example\n* ([model.py](mrcnn/model.py), [utils.py](mrcnn/utils.py), [config.py](mrcnn/config.py)): The Mask R-CNN implementation\n* [inspect_data.ipynb](samples/coco/inspect_data.ipynb) Visualizes each step of data preprocessing\n* [inspect_model.ipynb](samples/coco/inspect_model.ipynb) Visualizes the model training process\n* [inspect_weights.ipynb](samples/coco/inspect_weights.ipynb) Visualizes the trained model\n\n### 1. Data Preparation\n\nAnnotate the images with labelme, then convert the annotations to COCO-style data. See the labelme tutorial for that process; it is not repeated here. The annotated training data consists of specialized medical digestive endoscopy images. Because of data security and privacy concerns, the annotated dataset will not be provided or released.\n\n\n### 2. Override the config and dataset Classes and Build the Training Pipeline\n\nSee `./samples/water/coco.py` for details.\n\n+ Override the config class\n\n```\nclass CocoConfig(Config):\n    \"\"\"Configuration for training on MS COCO.\n    Derives from the base Config class and overrides values specific\n    to the COCO dataset.\n    \"\"\"\n    # Give the configuration a recognizable name\n    NAME = \"coco\"\n\n    # We use a GPU with 12GB memory, which can fit two images.\n    # Adjust down if you use a smaller GPU.\n    # IMAGES_PER_GPU = 8\n    IMAGES_PER_GPU = 2\n\n\n    # Number of GPUs to use (default is 1; set to 8 to train on 8 GPUs)\n    GPU_COUNT = 1\n\n    # Number of classes (including background)\n    NUM_CLASSES = 1 + 2  # background + 2 custom classes (COCO itself has 80)\n\n    # Backbone network architecture\n    # Supported values are: resnet50, resnet101\n    BACKBONE = \"resnet50\"\n\n    # Input image resizing\n    # Random crops of size 512x512\n    IMAGE_RESIZE_MODE = \"crop\"\n    IMAGE_MIN_DIM = 512\n    IMAGE_MAX_DIM = 512\n    IMAGE_MIN_SCALE = 2.0\n\n    # Length of square anchor side in pixels\n    RPN_ANCHOR_SCALES = (8, 16, 32, 64, 128)\n\n```\n\nFor a description of the config options, see `./mrcnn/config.py`.\n\n+ Override the dataset class\n\n```\nclass CocoDataset(utils.Dataset):\n    def load_coco(self, dataset_dir, subset, class_ids=None, class_map=None, return_coco=False):\n        \"\"\"Load a subset of the COCO dataset.\n        dataset_dir: The root directory of the COCO dataset.\n        subset: What to load (train, val)\n        class_ids: If provided, only loads images that have the given classes.\n        class_map: TODO: Not implemented yet. 
Supports mapping classes from\n            different datasets to the same class ID.\n        return_coco: If True, returns the COCO object.\n        \"\"\"\n\n        # Change the annotation and image paths below to match your own dataset\n        # coco = COCO(\"{}/annotations/instances_{}{}.json\".format(dataset_dir, subset, year))\n        coco = COCO(\"{}/annotations/annotations_{}.json\".format(dataset_dir, subset))\n        # image_dir = \"{}/{}\".format(dataset_dir, subset)\n        image_dir = \"{}/images\".format(dataset_dir)\n\n\n        # Load all classes or a subset?\n        # Note: remove the __background__ category from the annotation file\n        if not class_ids:\n            # All classes\n            class_ids = sorted(coco.getCatIds())\n\n        # All images or a subset?\n        if class_ids:\n            image_ids = []\n            for id_ in class_ids:\n                image_ids.extend(list(coco.getImgIds(catIds=[id_])))\n            # Remove duplicates\n            image_ids = list(set(image_ids))\n        else:\n            # All images\n            image_ids = list(coco.imgs.keys())\n\n        # Add classes\n        for i in class_ids:\n            self.add_class(\"coco\", i, coco.loadCats(i)[0][\"name\"])\n\n        # Add images\n        for i in image_ids:\n            self.add_image(\n                \"coco\", image_id=i,\n                path=os.path.join(image_dir, coco.imgs[i]['file_name']),\n                width=coco.imgs[i][\"width\"],\n                height=coco.imgs[i][\"height\"],\n                annotations=coco.loadAnns(coco.getAnnIds(\n                    imgIds=[i], catIds=class_ids, iscrowd=None)))\n        if return_coco:\n            return coco\n\n   \n    def load_mask(self, image_id):\n        \"\"\"Load instance masks for the given image.\n\n        Different datasets use different ways to store masks. 
This\n        function converts the different mask format to one format\n        in the form of a bitmap [height, width, instances].\n\n        Returns:\n        masks: A bool array of shape [height, width, instance count] with\n            one mask per instance.\n        class_ids: a 1D array of class IDs of the instance masks.\n        \"\"\"\n        # If not a COCO image, delegate to parent class.\n        image_info = self.image_info[image_id]\n        if image_info[\"source\"] != \"coco\":\n            return super(CocoDataset, self).load_mask(image_id)\n\n        instance_masks = []\n        class_ids = []\n        annotations = self.image_info[image_id][\"annotations\"]\n        # Build mask of shape [height, width, instance_count] and list\n        # of class IDs that correspond to each channel of the mask.\n        for annotation in annotations:\n            class_id = self.map_source_class_id(\n                \"coco.{}\".format(annotation['category_id']))\n            if class_id:\n                m = self.annToMask(annotation, image_info[\"height\"],\n                                   image_info[\"width\"])\n                # Some objects are so small that they're less than 1 pixel area\n                # and end up rounded out. Skip those objects.\n                if m.max() \u003c 1:\n                    continue\n                # Is it a crowd? If so, use a negative class ID.\n                if annotation['iscrowd']:\n                    # Use negative class ID for crowds\n                    class_id *= -1\n                    # For crowd masks, annToMask() sometimes returns a mask\n                    # smaller than the given dimensions. 
If so, resize it.\n                    if m.shape[0] != image_info[\"height\"] or m.shape[1] != image_info[\"width\"]:\n                        m = np.ones([image_info[\"height\"], image_info[\"width\"]], dtype=bool)\n                instance_masks.append(m)\n                class_ids.append(class_id)\n\n        # Pack instance masks into an array\n        if class_ids:\n            # Use the builtin bool: the np.bool alias was removed in recent NumPy releases\n            mask = np.stack(instance_masks, axis=2).astype(bool)\n            class_ids = np.array(class_ids, dtype=np.int32)\n            return mask, class_ids\n        else:\n            # Call super class to return an empty mask\n            return super(CocoDataset, self).load_mask(image_id)\n\n    # def image_reference(self, image_id):\n    #     \"\"\"Return a link to the image in the COCO Website.\"\"\"\n    #     info = self.image_info[image_id]\n    #     if info[\"source\"] == \"coco\":\n    #         return \"http://cocodataset.org/#explore?id={}\".format(info[\"id\"])\n    #     else:\n    #         super(CocoDataset, self).image_reference(image_id)\n\n    # The following two functions are from pycocotools with a few changes.\n\n    def annToRLE(self, ann, height, width):\n        \"\"\"\n        Convert annotation which can be polygons or uncompressed RLE to RLE.\n        :return: RLE (run-length encoded mask)\n        \"\"\"\n        segm = ann['segmentation']\n        if isinstance(segm, list):\n            # polygon -- a single object might consist of multiple parts\n            # we merge all parts into one mask rle code\n            rles = maskUtils.frPyObjects(segm, height, width)\n            rle = maskUtils.merge(rles)\n        elif isinstance(segm['counts'], list):\n            # uncompressed RLE\n            rle = maskUtils.frPyObjects(segm, height, width)\n        else:\n            # rle\n            rle = ann['segmentation']\n        return rle\n\n    def annToMask(self, ann, height, width):\n        \"\"\"\n        Convert annotation which can be polygons, uncompressed 
RLE, or RLE to binary mask.\n        :return: binary mask (numpy 2D array)\n        \"\"\"\n        rle = self.annToRLE(ann, height, width)\n        m = maskUtils.decode(rle)\n        return m\n\n```\n\n\n### 3. Single-Image Inference with OpenCV\n\nSee `./samples/water/detection.py`. The original Mask R-CNN uses skimage and matplotlib to process data; here the data is processed with OpenCV instead, and the final detection results are saved and output through OpenCV.\n\n\n### 4. Video Inference with OpenCV\n\nSee `./samples/water/detection_video.py`. The original Mask R-CNN does not include source code for video inference; here video inference with a trained Mask R-CNN model is implemented on top of OpenCV, and the inference frame rate (FPS) is computed.\n\n### 5. Demo\n\nTrain the model\n\n```\n# Train a new model starting from pre-trained COCO weights\npython3 coco.py train --dataset=./myData/coco/ --model=coco\n\n# Train a new model starting from pre-trained ImageNet weights\npython3 coco.py train --dataset=./myData/coco/ --model=imagenet\n\n# Resume training (option 1): from a specific weights file\npython3 coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5\n\n# Resume training (option 2): from the last trained weights\npython3 coco.py train --dataset=/path/to/coco/ --model=last\n\n```\n\nRun inference\n\n```\n# Image\npython3 detection.py\n\n# Video\npython3 detection_video.py\n\n```\n\n![](assets/street.png)\n\n\n### Citation\n\n```\n@misc{matterport_maskrcnn_2017,\n  title={Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow},\n  author={Waleed Abdulla},\n  year={2017},\n  publisher={Github},\n  journal={GitHub repository},\n  howpublished={\\url{https://github.com/matterport/Mask_RCNN}},\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataxujing%2Fmask_rcnn_keras","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdataxujing%2Fmask_rcnn_keras","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataxujing%2Fmask_rcnn_keras/lists"}