{"id":27948313,"url":"https://github.com/pku-alignment/safe-sora","last_synced_at":"2025-05-07T14:57:33.417Z","repository":{"id":244291381,"uuid":"812545829","full_name":"PKU-Alignment/safe-sora","owner":"PKU-Alignment","description":"SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).","archived":false,"fork":false,"pushed_at":"2024-08-20T10:51:13.000Z","size":2578,"stargazers_count":31,"open_issues_count":1,"forks_count":5,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-05-07T14:57:29.086Z","etag":null,"topics":["alignment","human-preferences","large-vision-models","text-to-video-generation"],"latest_commit_sha":null,"homepage":"https://sites.google.com/view/safe-sora","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PKU-Alignment.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-09T07:49:27.000Z","updated_at":"2025-04-25T15:03:58.000Z","dependencies_parsed_at":"2024-06-13T22:10:11.680Z","dependency_job_id":"1fe301f5-75c2-4374-9684-5535fd09cfd2","html_url":"https://github.com/PKU-Alignment/safe-sora","commit_stats":null,"previous_names":["pku-alignment/safe-sora"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Fsafe-sora","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Fsafe-sora/tags","releases_url":
"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Fsafe-sora/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PKU-Alignment%2Fsafe-sora/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PKU-Alignment","download_url":"https://codeload.github.com/PKU-Alignment/safe-sora/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252902623,"owners_count":21822257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alignment","human-preferences","large-vision-models","text-to-video-generation"],"created_at":"2025-05-07T14:57:32.785Z","updated_at":"2025-05-07T14:57:33.404Z","avatar_url":"https://github.com/PKU-Alignment.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003c!-- markdownlint-disable first-line-h1 --\u003e\n\u003c!-- markdownlint-disable html --\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/images/SafeSora.png\" alt=\"SafeSora Logo\" width=\"60%\"/\u003e\n\u003c/div\u003e\n\u003ch1 align=\"center\"\u003eTowards Safety Alignment of Text2Video Generation \u003c/h1\u003e\n\n[![Code 
License](https://img.shields.io/badge/Code%20License-Apache_2.0-green.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCIgd2lkdGg9IjI0IiBoZWlnaHQ9IjI0IiBmaWxsPSIjZmZmZmZmIj48cGF0aCBmaWxsLXJ1bGU9ImV2ZW5vZGQiIGQ9Ik0xMi43NSAyLjc1YS43NS43NSAwIDAwLTEuNSAwVjQuNUg5LjI3NmExLjc1IDEuNzUgMCAwMC0uOTg1LjMwM0w2LjU5NiA1Ljk1N0EuMjUuMjUgMCAwMTYuNDU1IDZIMi4zNTNhLjc1Ljc1IDAgMTAwIDEuNUgzLjkzTC41NjMgMTUuMThhLjc2Mi43NjIgMCAwMC4yMS44OGMuMDguMDY0LjE2MS4xMjUuMzA5LjIyMS4xODYuMTIxLjQ1Mi4yNzguNzkyLjQzMy42OC4zMTEgMS42NjIuNjIgMi44NzYuNjJhNi45MTkgNi45MTkgMCAwMDIuODc2LS42MmMuMzQtLjE1NS42MDYtLjMxMi43OTItLjQzMy4xNS0uMDk3LjIzLS4xNTguMzEtLjIyM2EuNzUuNzUgMCAwMC4yMDktLjg3OEw1LjU2OSA3LjVoLjg4NmMuMzUxIDAgLjY5NC0uMTA2Ljk4NC0uMzAzbDEuNjk2LTEuMTU0QS4yNS4yNSAwIDAxOS4yNzUgNmgxLjk3NXYxNC41SDYuNzYzYS43NS43NSAwIDAwMCAxLjVoMTAuNDc0YS43NS43NSAwIDAwMC0xLjVIMTIuNzVWNmgxLjk3NGMuMDUgMCAuMS4wMTUuMTQuMDQzbDEuNjk3IDEuMTU0Yy4yOS4xOTcuNjMzLjMwMy45ODQuMzAzaC44ODZsLTMuMzY4IDcuNjhhLjc1Ljc1IDAgMDAuMjMuODk2Yy4wMTIuMDA5IDAgMCAuMDAyIDBhMy4xNTQgMy4xNTQgMCAwMC4zMS4yMDZjLjE4NS4xMTIuNDUuMjU2Ljc5LjRhNy4zNDMgNy4zNDMgMCAwMDIuODU1LjU2OCA3LjM0MyA3LjM0MyAwIDAwMi44NTYtLjU2OWMuMzM4LS4xNDMuNjA0LS4yODcuNzktLjM5OWEzLjUgMy41IDAgMDAuMzEtLjIwNi43NS43NSAwIDAwLjIzLS44OTZMMjAuMDcgNy41aDEuNTc4YS43NS43NSAwIDAwMC0xLjVoLTQuMTAyYS4yNS4yNSAwIDAxLS4xNC0uMDQzbC0xLjY5Ny0xLjE1NGExLjc1IDEuNzUgMCAwMC0uOTg0LS4zMDNIMTIuNzVWMi43NXpNMi4xOTMgMTUuMTk4YTUuNDE4IDUuNDE4IDAgMDAyLjU1Ny42MzUgNS40MTggNS40MTggMCAwMDIuNTU3LS42MzVMNC43NSA5LjM2OGwtMi41NTcgNS44M3ptMTQuNTEtLjAyNGMuMDgyLjA0LjE3NC4wODMuMjc1LjEyNi41My4yMjMgMS4zMDUuNDUgMi4yNzIuNDVhNS44NDYgNS44NDYgMCAwMDIuNTQ3LS41NzZMMTkuMjUgOS4zNjdsLTIuNTQ3IDUuODA3eiI+PC9wYXRoPjwvc3ZnPgo=)](docs/CODE_LICENCE)\n[![Data 
License](https://img.shields.io/badge/Data%20License-CC%20BY--NC%204.0-red.svg?logo=data:image/svg+xml;base64,PHN2ZyB4bWxucz0iaHR0cDovL3d3dy53My5vcmcvMjAwMC9zdmciIHZpZXdCb3g9IjAgMCAyNCAyNCIgd2lkdGg9IjI0IiBoZWlnaHQ9IjI0IiBmaWxsPSIjZmZmZmZmIj48cGF0aCBmaWxsLXJ1bGU9ImV2ZW5vZGQiIGQ9Ik0xMi43NSAyLjc1YS43NS43NSAwIDAwLTEuNSAwVjQuNUg5LjI3NmExLjc1IDEuNzUgMCAwMC0uOTg1LjMwM0w2LjU5NiA1Ljk1N0EuMjUuMjUgMCAwMTYuNDU1IDZIMi4zNTNhLjc1Ljc1IDAgMTAwIDEuNUgzLjkzTC41NjMgMTUuMThhLjc2Mi43NjIgMCAwMC4yMS44OGMuMDguMDY0LjE2MS4xMjUuMzA5LjIyMS4xODYuMTIxLjQ1Mi4yNzguNzkyLjQzMy42OC4zMTEgMS42NjIuNjIgMi44NzYuNjJhNi45MTkgNi45MTkgMCAwMDIuODc2LS42MmMuMzQtLjE1NS42MDYtLjMxMi43OTItLjQzMy4xNS0uMDk3LjIzLS4xNTguMzEtLjIyM2EuNzUuNzUgMCAwMC4yMDktLjg3OEw1LjU2OSA3LjVoLjg4NmMuMzUxIDAgLjY5NC0uMTA2Ljk4NC0uMzAzbDEuNjk2LTEuMTU0QS4yNS4yNSAwIDAxOS4yNzUgNmgxLjk3NXYxNC41SDYuNzYzYS43NS43NSAwIDAwMCAxLjVoMTAuNDc0YS43NS43NSAwIDAwMC0xLjVIMTIuNzVWNmgxLjk3NGMuMDUgMCAuMS4wMTUuMTQuMDQzbDEuNjk3IDEuMTU0Yy4yOS4xOTcuNjMzLjMwMy45ODQuMzAzaC44ODZsLTMuMzY4IDcuNjhhLjc1Ljc1IDAgMDAuMjMuODk2Yy4wMTIuMDA5IDAgMCAuMDAyIDBhMy4xNTQgMy4xNTQgMCAwMC4zMS4yMDZjLjE4NS4xMTIuNDUuMjU2Ljc5LjRhNy4zNDMgNy4zNDMgMCAwMDIuODU1LjU2OCA3LjM0MyA3LjM0MyAwIDAwMi44NTYtLjU2OWMuMzM4LS4xNDMuNjA0LS4yODcuNzktLjM5OWEzLjUgMy41IDAgMDAuMzEtLjIwNi43NS43NSAwIDAwLjIzLS44OTZMMjAuMDcgNy41aDEuNTc4YS43NS43NSAwIDAwMC0xLjVoLTQuMTAyYS4yNS4yNSAwIDAxLS4xNC0uMDQzbC0xLjY5Ny0xLjE1NGExLjc1IDEuNzUgMCAwMC0uOTg0LS4zMDNIMTIuNzVWMi43NXpNMi4xOTMgMTUuMTk4YTUuNDE4IDUuNDE4IDAgMDAyLjU1Ny42MzUgNS40MTggNS40MTggMCAwMDIuNTU3LS42MzVMNC43NSA5LjM2OGwtMi41NTcgNS44M3ptMTQuNTEtLjAyNGMuMDgyLjA0LjE3NC4wODMuMjc1LjEyNi41My4yMjMgMS4zMDUuNDUgMi4yNzIuNDVhNS44NDYgNS44NDYgMCAwMDIuNTQ3LS41NzZMMTkuMjUgOS4zNjdsLTIuNTQ3IDUuODA3eiI+PC9wYXRoPjwvc3ZnPgo=)](docs/DATA_LICENCE)\n\n\u003c!-- [[`📕 Paper`](https://arxiv.org/abs/2307.04657)] --\u003e\n[[`🏠 Project Homepage`](https://sites.google.com/view/safe-sora)]\n[[`📕 Paper`](https://arxiv.org/abs/2406.14477)]\n[[`🤗 SafeSora 
Datasets`](https://huggingface.co/datasets/PKU-Alignment/SafeSora)]\n[[`🤗 SafeSora Label`](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Label)]\n[[`🤗 SafeSora Evaluation`](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Eval)]\n[[`BibTeX`](#citation)]\n\nSafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs). It currently contains three types of data:\n\n- A [classification dataset](#multi-label-classification-dataset) of 57k+ Text-Video pairs, including multi-label classification of 12 harm labels for their text prompts and text-video pairs.\n- A [human preference dataset](#human-preference-dataset) of 51k+ instances in the text-to-video generation task, containing comparative relationships in terms of helpfulness and harmlessness, as well as four sub-dimensions of helpfulness.\n- An [evaluation dataset](#evaluation-dataset) containing 600 human-written prompts, with 300 being safety-neutral and another 300 constructed according to 12 harm categories as red-team prompts.\n\nIn the future, we will also open-source some baseline alignment algorithms that utilize these datasets.\n\n\u003c!-- ## What's New --\u003e\n\n### Table of Contents  \u003c!-- omit in toc --\u003e \u003c!-- markdownlint-disable heading-increment --\u003e\n\n- [Dataset Release](#dataset-release)\n  - [Multi-label Classification Dataset](#multi-label-classification-dataset)\n  - [Human Preference Dataset](#human-preference-dataset)\n  - [Evaluation Dataset](#evaluation-dataset)\n- [Data Access](#data-access)\n- [Citation](#citation)\n- [License](#license)\n\n## Dataset Release\n\n### Multi-label Classification Dataset\n\nThe multi-label classification dataset contains 57k+ text-video pairs, each labeled with 12 harm tags.\nWe perform multi-label classification on individual prompts as well as the combination of prompts and the 
videos generated from those prompts. These 12 harm tags are defined as:\n\n- S1: `Adult, Explicit Sexual Content`\n- S2: `Animal Abuse`\n- S3: `Child Abuse`\n- S4: `Crime`\n- S5: `Debated Sensitive Social Issue`\n- S6: `Drug, Weapons, Substance Abuse`\n- S7: `Insulting, Hateful, Aggressive Behavior`\n- S8: `Violence, Injury, Gory Content`\n- S9: `Racial Discrimination`\n- S10: `Other Discrimination (Excluding Racial)`\n- S11: `Terrorism, Organized Crime`\n- S12: `Other Harmful Content`\n\n\u003c!-- For more details, please see here. --\u003e\n\nThe distribution of these 12 categories is shown below:\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/images/data_ratio.png\" alt=\"Data Ratio\" width=\"85%\"/\u003e\n\u003c/div\u003e\n\nIn our dataset, nearly half of the prompts are safety-critical, while the rest are safety-neutral. Some prompts come from real online users; the remainder were added by researchers for balance.\n\nFor more information, please refer to the **Hugging Face Page**: [PKU-Alignment/SafeSora-Label](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Label).\n\u003c!-- - **Data Card**: [SafeSora Label](data/SafeSora-Label). --\u003e\n\n### Human Preference Dataset\n\nThe human preference dataset contains over 51,000 comparisons, each data point comprising a user input and two generated videos. Human preferences along the `helpfulness` and `harmlessness` dimensions were obtained through the heuristic-based annotation process described below.\n\nAdditionally, because of a pre-annotation step, human preferences on four helpfulness sub-dimensions are also included. 
These sub-dimensions are:\n\n- `Instruction Following`\n- `Correctness`\n- `Informativeness`\n- `Aesthetics`\n\nThe annotation process is shown in the figure below:\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"docs/images/annotation_pipeline.png\" alt=\"Annotation Process\" width=\"85%\"/\u003e\n\u003c/div\u003e\n\nFor more information, please refer to the **Hugging Face Page**: [PKU-Alignment/SafeSora](https://huggingface.co/datasets/PKU-Alignment/SafeSora).\n\n### Evaluation Dataset\n\nThe evaluation dataset contains 600 human-written prompts: 300 safety-neutral prompts and 300 red-teaming prompts. The 300 red-teaming prompts are constructed from the 12 harm categories. These prompts do not appear in the training set and are reserved for researchers to generate videos for model evaluation.\n\nFor more information, please refer to the **Hugging Face Page**: [PKU-Alignment/SafeSora-Eval](https://huggingface.co/datasets/PKU-Alignment/SafeSora-Eval).\n\n## Data Access\n\nThe datasets are available on the Hugging Face Datasets Hub.\nThe recommended way to download them is with the Hugging Face CLI:\n\n```bash\n# Multi-label Classification Dataset: SafeSora-Label\nhuggingface-cli download --repo-type dataset --local-dir-use-symlinks False --resume-download PKU-Alignment/SafeSora-Label --local-dir ./SafeSora-Label\n\n# Human Preference Dataset: SafeSora\nhuggingface-cli download --repo-type dataset --local-dir-use-symlinks False --resume-download PKU-Alignment/SafeSora --local-dir ./SafeSora\n\n# Evaluation Dataset: SafeSora-Eval\nhuggingface-cli download --repo-type dataset --local-dir-use-symlinks False --resume-download PKU-Alignment/SafeSora-Eval --local-dir ./SafeSora-Eval\n```\n\nThe downloaded data mainly consists of two parts: `config-train.json.gz` and `config-test.json.gz` are the data configurations, and `videos.tar.gz` is the compressed package of videos. 
Please extract the package before use.\n\n```bash\ntar -xzvf videos.tar.gz\n```\n\nEach data point in the dataset includes a user prompt, the potential harmful category of the user prompt, a generated video, and the annotation results of the harmful category for the Text-Video pair. In the config, each video has a `video_path` giving its location relative to the videos folder. This relative location follows a fixed rule: `videos/prompt_id/video_id`.\n\n**Note:** The `videos.tar.gz` file in the SafeSora-Label and SafeSora preference datasets is the same, so if you have previously downloaded `videos.tar.gz`, you can use the same video folder and only need to download the config files separately.\n\nWe also provide a script that quickly loads each dataset as a Torch Dataset:\n\n```python\nfrom safe_sora.datasets import VideoDataset, PairDataset, PromptDataset\n\n# Multi-label Classification Dataset\nlabel_data = VideoDataset.load(\"path/to/config\", video_dir=\"path/to/video_dir\")\n\n# Human Preference Dataset\npref_data = PairDataset.load(\"path/to/config\", video_dir=\"path/to/video_dir\")\n\n# Evaluation Dataset\neval_data = PromptDataset.load(\"path/to/config\", video_dir=\"path/to/video_dir\")\n```\n\n## Citation\n\nIf you find the SafeSora dataset family useful in your research, please cite the following paper:\n\n```bibtex\n@misc{dai2024safesora,\n      title={SafeSora: Towards Safety Alignment of Text2Video Generation via a Human Preference Dataset},\n      author={Josef Dai and Tianle Chen and Xuyao Wang and Ziran Yang and Taiye Chen and Jiaming Ji and Yaodong Yang},\n      year={2024},\n      eprint={2406.14477},\n      archivePrefix={arXiv},\n      primaryClass={cs.CV}\n}\n```\n\n\u003c!-- ## Acknowledgment  omit in toc --\u003e\n\n## License\n\nThe SafeSora dataset and its family are released under the CC BY-NC 4.0 License.\nThe code is released under the Apache License 2.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpku-alignment%2Fsafe-sora","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpku-alignment%2Fsafe-sora","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpku-alignment%2Fsafe-sora/lists"}