{"id":17719819,"url":"https://github.com/lilydaytoy/openpvsg","last_synced_at":"2025-04-24T07:37:53.583Z","repository":{"id":209093095,"uuid":"529166321","full_name":"LilyDaytoy/OpenPVSG","owner":"LilyDaytoy","description":"Benchmarking Panoptic Video Scene Graph Generation (PVSG), CVPR'23","archived":false,"fork":false,"pushed_at":"2024-04-30T17:07:04.000Z","size":4113,"stargazers_count":53,"open_issues_count":9,"forks_count":3,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-05-02T06:19:39.312Z","etag":null,"topics":["scene-graph","scene-graph-generation","scene-understanding","video-understanding"],"latest_commit_sha":null,"homepage":"https://jingkang50.github.io/PVSG/","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LilyDaytoy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-08-26T07:52:29.000Z","updated_at":"2024-05-01T21:09:23.000Z","dependencies_parsed_at":"2024-04-30T11:10:28.364Z","dependency_job_id":"daf2bc8b-bfbd-45ba-abf1-e6fad6742d0c","html_url":"https://github.com/LilyDaytoy/OpenPVSG","commit_stats":null,"previous_names":["jingkang50/openpvsg"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LilyDaytoy%2FOpenPVSG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LilyDaytoy%2FOpenPVSG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LilyDaytoy%2FOpenPVSG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/LilyDaytoy%2FOpenPVSG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LilyDaytoy","download_url":"https://codeload.github.com/LilyDaytoy/OpenPVSG/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246290648,"owners_count":20753730,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["scene-graph","scene-graph-generation","scene-understanding","video-understanding"],"created_at":"2024-10-25T15:09:16.291Z","updated_at":"2025-03-31T09:31:35.753Z","avatar_url":"https://github.com/LilyDaytoy.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Panoptic Video Scene Graph Generation\n\u003cp align=\"center\"\u003e\n  \u003c!-- \u003cimg src=\"./assets/psgtr_long.gif\" align=\"center\" width=\"80%\"\u003e --\u003e\n\nhttps://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a\n\n\u003cvideo controls\u003e\n  \u003csource src=\"[https://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a](https://github.com/Jingkang50/OpenPVSG/assets/17070708/54a0f4c4-daca-4168-8460-95eb4cf8b85a)\" type=\"video/mp4\"\u003e\n  Your browser does not support the video tag.\n\u003c/video\u003e\n\n  \u003cp align=\"center\"\u003e\n  \u003ca href=\"https://arxiv.org/abs/2311.17058\" target='_blank'\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Paper-CVPR%202023-b31b1b?style=flat-square\"\u003e\n  \u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  \u003ca 
href=\"https://jingkang50.github.io/PVSG/\" target='_blank'\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Page-jingkang50/PVSG-228c22?style=flat-square\"\u003e\n  \u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  \u003ca href=\"https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EpHpnXP-ta9Nu1wD6FwkDWAB0LxY8oE9VNqsgv6ln-i8QQ?e=fURefF\" target='_blank'\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Data-PVSGDataset-334b7f?style=flat-square\"\u003e\n  \u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  \u003ca href=\"https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EgvpTfCTMudLpxw-h0_BVdcBAHacUaAQD-u9OvkUlpaDBg?e=LXnqaX\" target='_blank'\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Data-QuickView-7de5f6?style=flat-square\"\u003e\n  \u003c/a\u003e\n  \u0026nbsp;\u0026nbsp;\u0026nbsp;\n  \u003ca href=\"https://github.com/LilyDaytoy/OpenPVSG\" target='_blank'\u003e\n    \u003cimg src=\"https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2FLilyDaytoy%2FPVSG\u0026count_bg=%23FFA500\u0026title_bg=%23555555\u0026icon=\u0026icon_color=%23E7E7E7\u0026title=visitors\u0026edge_flat=true\"\u003e\n  \u003c/p\u003e\n  \u003c/a\u003e\n  \u003cp align=\"center\"\u003e\n  \u003cfont size=5\u003e\u003cstrong\u003ePanoptic Video Scene Graph Generation\u003c/strong\u003e\u003c/font\u003e\n    \u003cbr\u003e\n        \u003ca href=\"https://jingkang50.github.io/\"\u003eJingkang Yang\u003c/a\u003e,\n        \u003ca href=\"https://lilydaytoy.github.io/\"\u003eWenxuan Peng\u003c/a\u003e,\n        \u003ca href=\"https://lxtgh.github.io/\"\u003eXiangtai Li\u003c/a\u003e,\u003cbr\u003e\n        \u003ca href=\"https://scholar.google.com/citations?user=G8DPsoUAAAAJ\u0026amp;hl=zh-CN\"\u003eZujin Guo\u003c/a\u003e,\n        \u003ca href=\"https://cliangyu.com/\"\u003e Liangyu Chen\u003c/a\u003e,\n        \u003ca href=\"https://brianboli.com/\"\u003eBo Li\u003c/a\u003e,\n        
\u003ca href=\"https://www.linkedin.com/in/zheng-ma-4201223a/?originalSubdomain=hk\"\u003eZheng Ma\u003c/a\u003e,\u003cbr\u003e\n        \u003ca href=\"https://kaiyangzhou.github.io/\"\u003eKaiyang Zhou\u003c/a\u003e,\n        \u003ca href=\"https://bmild.github.io/\"\u003eWayne Zhang\u003c/a\u003e,\n        \u003ca href=\"https://www.mmlab-ntu.com/person/ccloy/\"\u003eChen Change Loy\u003c/a\u003e,\n        \u003ca href=\"https://liuziwei7.github.io/\"\u003eZiwei Liu\u003c/a\u003e,\n    \u003cbr\u003e\n  S-Lab, Nanyang Technological University \u0026 SenseTime Research\n  \u003c/p\u003e\n\u003c/p\u003e\n\n---\n## What is PVSG Task?\n\u003cstrong\u003eThe Panoptic Video Scene Graph Generation (PVSG) Task\u003c/strong\u003e aims to interpret a complex scene video with a dynamic scene graph representation, with each node in the scene graph grounded by its pixel-accurate segmentation mask tube in the video.\n\n| ![pvsg.jpg](assets/teaser.png) |\n|:--:|\n| \u003cb\u003eGiven a video, PVSG models need to generate a dynamic (temporal) scene graph that is grounded by panoptic mask tubes.\u003c/b\u003e|\n\n\n## The PVSG Dataset\nWe carefully collect 400 videos, each featuring dynamic scenes and rich in logical reasoning content. On average, these videos are 76.5 seconds long (5 FPS). 
The collection comprises 289 videos from VidOR, 55 videos from EpicKitchen, and 56 videos from Ego4D.\n\nPlease access the dataset via this [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/EpHpnXP-ta9Nu1wD6FwkDWAB0LxY8oE9VNqsgv6ln-i8QQ?e=fURefF), and place the downloaded zip files as shown below.\n```\n├── assets\n├── checkpoints\n├── configs\n├── data\n├── data_zip\n│   ├── Ego4D\n│   │   ├── ego4d_masks.zip\n│   │   └── ego4d_videos.zip\n│   ├── EpicKitchen\n│   │   ├── epic_kitchen_masks.zip\n│   │   └── epic_kitchen_videos.zip\n│   ├── VidOR\n│   │   ├── vidor_masks.zip\n│   │   └── vidor_videos.zip\n│   └── pvsg.json\n├── datasets\n├── models\n├── scripts\n├── tools\n├── utils\n├── .gitignore\n├── environment.yml\n└── README.md\n```\nPlease run `unzip_and_extract.py` to unzip the files and extract frames from the videos. If you unzip the files manually, use `unzip -j xxx.zip` to discard the directory paths stored in the archive. Your `data` directory should then look like this:\n```\ndata\n├── ego4d\n│   ├── frames\n│   ├── masks\n│   └── videos\n├── epic_kitchen\n│   ├── frames\n│   ├── masks\n│   └── videos\n├── vidor\n│   ├── frames\n│   ├── masks\n│   └── videos\n└── pvsg.json\n```\n\nWe suggest exploring `./tools/Visualize_Dataset.ipynb` to quickly get familiar with the PVSG dataset.\n\n## Get Started\nWe use `conda` to set up the environment and manage our dependencies.\n\nOur developers ran the experiments with `CUDA 10.1`.\n\nYou can specify the appropriate `cudatoolkit` version to install on your machine in the `environment.yml` file, and then run the following to create the `conda` environment:\n```bash\nconda env create -f environment.yml\nconda activate openpvsg\n```\nThen manually install the following dependencies.\n```bash\n# Install mmcv\npip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html\nconda install -c conda-forge pycocotools\npip install mmdet==2.25.0\n\n# already within 
environment.yml\npip install timm\npython -m pip install scipy\npip install git+https://github.com/cocodataset/panopticapi.git\n\n# for unitrack\npip install imageio==2.6.1\npip install lap==0.4.0\npip install cython_bbox==0.1.3\n\n# for vps\npip install seaborn\npip install ftfy\npip install regex\n\n# If you're using wandb for logging\npip install wandb\nwandb login\n```\n\nDownload the [pretrained models](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) for tracking if you are interested in the IPS+Tracking solution.\n\n\n## Training and Testing\n**IPS+Tracking \u0026 Relation Modeling**\n```bash\n# Train IPS\nsh scripts/train/train_ips.sh\n# Run tracking and save query features\nsh scripts/utils/prepare_qf_ips.sh\n# Prepare for relation modeling\nsh scripts/utils/prepare_rel_set.sh\n# Train relation models\nsh scripts/train/train_relation.sh\n# Test\nsh scripts/test/test_relation_full.sh\n```\n\n**VPS \u0026 Relation Modeling**\n```bash\n# Train VPS\nsh scripts/train/train_vps.sh\n# Save query features\nsh scripts/utils/prepare_qf_vps.sh\n# Prepare for relation modeling\nsh scripts/utils/prepare_rel_set.sh\n# Train relation models\nsh scripts/train/train_relation.sh\n# Test\nsh scripts/test/test_relation_full.sh\n```\n## Model Zoo\nMethod    | M2F ckpt | vanilla | filter |  conv |  transformer |\n---       | ---  | ---  | ---  | ---  | ---  |\nmask2former_ips | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | 
[link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) |\nmask2former_vps | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) | [link](https://entuedu-my.sharepoint.com/:f:/g/personal/jingkang001_e_ntu_edu_sg/ErwH2H27bJpAg9xpaTa49fkB3IJkiLJ6AEFuxUHYKMI1dQ?e=9XINcP) |\n\n## Citation\nIf you find our repository useful for your research, please consider citing our paper:\n```bibtex\n@inproceedings{yang2023pvsg,\n    author = {Yang, Jingkang and Peng, Wenxuan and Li, Xiangtai and Guo, Zujin and Chen, Liangyu and Li, Bo and Ma, Zheng and Zhou, Kaiyang and Zhang, Wayne and Loy, Chen Change and Liu, Ziwei},\n    title = {Panoptic Video Scene Graph Generation},\n    booktitle = {CVPR},\n    year = {2023},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flilydaytoy%2Fopenpvsg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flilydaytoy%2Fopenpvsg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flilydaytoy%2Fopenpvsg/lists"}