{"id":21298911,"url":"https://github.com/rathod-shubham/segmentanything","last_synced_at":"2025-03-15T17:47:33.761Z","repository":{"id":176951337,"uuid":"659441179","full_name":"RATHOD-SHUBHAM/SegmentAnything","owner":"RATHOD-SHUBHAM","description":"SAM is a deep learning model (transformer based). When we give an image as input to the Segment Anything Model, it first passes through an image encoder and produces a one-time embedding for the entire image. The downsampling happens using 2D convolutional layers. Then the model concatenates it with the image embedding to get the final vector. ","archived":false,"fork":false,"pushed_at":"2023-09-11T14:36:41.000Z","size":31619,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-22T07:43:19.575Z","etag":null,"topics":["deep-learning","encoder-decoder","instance-segmentation","neural-network","object-detection","sam","segment-anything","segment-anything-model","segmentation","segmentation-models","semantic-segmentation","transformer"],"latest_commit_sha":null,"homepage":"","language":"Jupyter 
Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/RATHOD-SHUBHAM.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-06-27T21:01:59.000Z","updated_at":"2023-09-11T05:43:43.000Z","dependencies_parsed_at":"2023-11-21T07:13:48.156Z","dependency_job_id":null,"html_url":"https://github.com/RATHOD-SHUBHAM/SegmentAnything","commit_stats":{"total_commits":23,"total_committers":2,"mean_commits":11.5,"dds":"0.26086956521739135","last_synced_commit":"87d755964a2adaba45af1eed2bc22b8f240589df"},"previous_names":["rathod-shubham/segmentanything"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RATHOD-SHUBHAM%2FSegmentAnything","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RATHOD-SHUBHAM%2FSegmentAnything/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RATHOD-SHUBHAM%2FSegmentAnything/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/RATHOD-SHUBHAM%2FSegmentAnything/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/RATHOD-SHUBHAM","download_url":"https://codeload.github.com/RATHOD-SHUBHAM/SegmentAnything/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243769948,"owners_count":20345216,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repos
itories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","encoder-decoder","instance-segmentation","neural-network","object-detection","sam","segment-anything","segment-anything-model","segmentation","segmentation-models","semantic-segmentation","transformer"],"created_at":"2024-11-21T14:58:21.490Z","updated_at":"2025-03-15T17:47:33.739Z","avatar_url":"https://github.com/RATHOD-SHUBHAM.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SegmentAnything\n\nSAM is a deep learning model (transformer based). \n\nSAM takes prompts from the user indicating which area to segment. As of the current release, we can provide three kinds of prompt to SAM:\n\n* By clicking on a point\n* By drawing a bounding box\n* By drawing a rough mask on an object\n\n\nThe model has three main components:\n\n* An image encoder.\n* A prompt encoder.\n* A mask decoder.\n\nWhen we give an image as input to the Segment Anything Model, it first passes through an image encoder and produces a one-time embedding for the entire image. \n\n\nThere is also a prompt encoder for points and boxes. \n\n* For points, the x \u0026 y coordinates, along with a foreground or background label, become input to the encoder. \n* For boxes, the bounding box coordinates become the input to the encoder, and as for the text (not released at the time of writing this), the tokens become the input.\n\nIn case we provide a mask as input, it directly goes through a downsampling stage.\\\nThe downsampling happens using 2D convolutional layers. The model then adds the downsampled mask embedding element-wise to the image embedding. \n\nThe prompt embedding and the image embedding then pass through a lightweight decoder that creates the final segmentation mask. 
\n\nThe output is a set of candidate valid masks, each with a confidence score.\n\n\n**Installation**\n\n\u003e The code requires python\u003e=3.8, as well as pytorch\u003e=1.7 and torchvision\u003e=0.8. \n\nInstalling both PyTorch and TorchVision with CUDA support is strongly recommended.\n\nInstall Segment Anything:\n\n  ` pip install git+https://github.com/facebookresearch/segment-anything.git `\n\n  ` pip install opencv-python pycocotools matplotlib onnxruntime onnx `\n\n---\n\nExample of using SAM with prompts and of automatically generating masks:\n\n![Screenshot 2023-06-27 at 7 40 42 PM](https://github.com/RATHOD-SHUBHAM/SegmentAnything/assets/58945964/4c21c252-687d-4992-828b-a56278a6fb93)\n\n***\n\n# Yolo NAS + SAM\n\n*Yolo NAS model explored [here](https://github.com/RATHOD-SHUBHAM/OOD_YOLONAS).*\n\nSteps:\n\n1. Install the necessary libraries and frameworks: You will need OpenCV, SuperGradients (the `super-gradients` package), and SAM, which are required for object detection and segmentation.\n\n2. Download the YOLO NAS model: You can download the YOLO NAS model from the official website or from GitHub. This model is trained on the COCO dataset, which includes a large number of object classes.\n\n3. Download the Segment Anything model: You can download the Segment Anything model from GitHub. This model is trained to segment objects from an image.\n\n4. Load the YOLO NAS model: Use SuperGradients (PyTorch) to load the YOLO NAS model into your project.\n\n5. Load the Segment Anything model: Use the `segment_anything` package (PyTorch) to load the Segment Anything model into your project.\n\n6. Load the input image: Load the input image that you want to perform object detection on.\n\n7. Perform object detection: Use the YOLO NAS model to detect objects in the input image. This will give you a list of bounding boxes and confidence scores for each object detected.\n\n8. Segment the objects: Use the Segment Anything model to segment the objects detected in the input image.\n\n9. 
Provide Bounding Box Coordinates obtained from YOLO NAS to SAM.\n\n10. Visualize the results: Visualize the results of object detection and segmentation by drawing bounding boxes around the objects detected and coloring the segmented objects.\n\n![Screenshot 2023-06-27 at 8 06 06 PM](https://github.com/RATHOD-SHUBHAM/SegmentAnything/assets/58945964/d2eb91a1-47cd-45b9-9d74-75f0ba70aa58)\n\n\n***\n\nKaggle Notebook: https://www.kaggle.com/code/gibborathod/segmentanything?scriptVersionId=130225073\n\n***\n\n# Object Detection + Mask\n\n![my_image](https://github.com/RATHOD-SHUBHAM/SegmentAnything/assets/58945964/56c7208d-f880-4b3a-9580-e6525f30f65d)\n\n***\n\n![my_image](https://github.com/RATHOD-SHUBHAM/SegmentAnything/assets/58945964/ef51b544-cb8e-4ed4-81cc-e365bdb08698)\n\n***\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frathod-shubham%2Fsegmentanything","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frathod-shubham%2Fsegmentanything","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frathod-shubham%2Fsegmentanything/lists"}