{"id":19137476,"url":"https://github.com/dsaurus/threestudio-4dfy","last_synced_at":"2025-09-01T13:43:56.605Z","repository":{"id":210723717,"uuid":"726116188","full_name":"DSaurus/threestudio-4dfy","owner":"DSaurus","description":null,"archived":false,"fork":false,"pushed_at":"2024-01-12T15:26:14.000Z","size":203,"stargazers_count":44,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"clean_branch","last_synced_at":"2025-05-06T20:13:46.273Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DSaurus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-01T15:10:28.000Z","updated_at":"2025-03-05T16:51:55.000Z","dependencies_parsed_at":"2024-01-12T20:29:29.311Z","dependency_job_id":"def26dee-e9f5-4b90-a5d4-3e8546a2ffe4","html_url":"https://github.com/DSaurus/threestudio-4dfy","commit_stats":null,"previous_names":["dsaurus/threestudio-4dfy"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/DSaurus/threestudio-4dfy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DSaurus%2Fthreestudio-4dfy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DSaurus%2Fthreestudio-4dfy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DSaurus%2Fthreestudio-4dfy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DSaurus%2Fthreestudio-4dfy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/host
s/GitHub/owners/DSaurus","download_url":"https://codeload.github.com/DSaurus/threestudio-4dfy/tar.gz/refs/heads/clean_branch","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DSaurus%2Fthreestudio-4dfy/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273136462,"owners_count":25051999,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-01T02:00:09.058Z","response_time":120,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T06:38:32.124Z","updated_at":"2025-09-01T13:43:56.563Z","avatar_url":"https://github.com/DSaurus.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 4D-fy threestudio extension\n\u003cimg src=\"https://github.com/DSaurus/threestudio-4dfy/assets/24589363/390d55ae-8e8d-4e06-9da6-6aebb98c431d\" width=\"\" height=\"200\"\u003e\n\u003cimg src=\"https://github.com/DSaurus/threestudio-4dfy/assets/24589363/9e9983a2-c5e4-4717-be4a-c22f92d8852c\" width=\"\" height=\"200\"\u003e\n\u003cimg src=\"https://github.com/DSaurus/threestudio-4dfy/assets/24589363/7031fe93-0b62-4d18-bc36-b3cfe4611c12\" width=\"\" height=\"200\"\u003e\n\u003cimg src=\"https://github.com/DSaurus/threestudio-4dfy/assets/24589363/13dd8eb8-105f-4fe9-95ad-fc2cae45e64e\" width=\"\" height=\"200\"\u003e\n\n| [Project Page](https://sherwinbahmani.github.io/4dfy/) | 
[Paper](https://arxiv.org/abs/2311.17984) | [User Study Template](https://github.com/victor-rong/video-generation-study) |\n\nThis is the 4D-fy extension for threestudio. The original implementation can be found at https://github.com/sherwinbahmani/4dfy. We thank the authors for their contribution to the 3D generation community. To use it, first install [threestudio](https://github.com/threestudio-project/threestudio) and the [threestudio-mvdream](https://github.com/DSaurus/threestudio-mvdream) extension, then install this extension in the `custom` directory. If you want to run 4D-fy on a 24 GB GPU, you additionally need to install the [threestudio-stable-nerf-renderer](https://github.com/DSaurus/threestudio-stable-nerf-renderer) extension.\n\n**Note: running the 3rd stage under low VRAM may currently result in decreased performance; we are working on it.**\n\n## Installation\n\n```\ncd custom\ngit clone https://github.com/DSaurus/threestudio-4dfy\n\n# If you have a 24/40/48 GB GPU, also install the stable-nerf-renderer extension\ngit clone https://github.com/DSaurus/threestudio-stable-nerf-renderer\n```\n\n## Quickstart\n\nThe model is trained in 3 stages, and each stage has its own config file. 
After a stage finishes, start the next stage from its last checkpoint by passing it as system.weights.\n\n```sh\nseed=0\ngpu=0\nexp_root_dir=/path/to\n\n# If you have a 24/40/48 GB GPU, you can use the low_vram config files:\n\n# Stage 1\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_1_low_vram.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\"\n\n# Stage 2\n# ckpt=/path/to/fourdfy_stage_1/a_dog_riding_a_skateboard@timestamp/ckpts/last.ckpt\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_2_low_vram.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\" system.weights=$ckpt\n\n# Stage 3\n# ckpt=/path/to/fourdfy_stage_2/a_dog_riding_a_skateboard@timestamp/ckpts/last.ckpt\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_3_low_vram.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\" system.weights=$ckpt\n\n\n# If you have an 80 GB GPU, you can use the original config files:\n\n# Stage 1\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_1.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\"\n\n# Stage 2\n# ckpt=/path/to/fourdfy_stage_1/a_dog_riding_a_skateboard@timestamp/ckpts/last.ckpt\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_2.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\" system.weights=$ckpt\n\n# Stage 3\n# ckpt=/path/to/fourdfy_stage_2/a_dog_riding_a_skateboard@timestamp/ckpts/last.ckpt\n# python launch.py --config custom/threestudio-4dfy/configs/fourdfy_stage_3.yaml --train --gpu $gpu exp_root_dir=$exp_root_dir seed=$seed system.prompt_processor.prompt=\"a dog riding a skateboard\" system.weights=$ckpt\n```\n\n## Memory 
Usage\nWe provide low_vram config files for 24/40/48 GB GPUs because the model was originally trained on an 80 GB GPU. To reduce memory further, try the following options:\n- Disable VSD guidance and increase multi-view guidance to compensate by setting data.single_view.prob_single_view_video=1.0 and data.prob_multi_view=0.75\n- Reduce the number of ray samples with system.renderer.num_samples_per_ray=256 or system.renderer.num_samples_per_ray=128\n- Reduce the rendering resolution for the video model with data.single_view.width_vid=144 and data.single_view.height_vid=80 (or even data.single_view.width_vid=72 and data.single_view.height_vid=40)\n- Use mixed precision with trainer.precision=16-mixed\n- Enable memory-efficient attention with system.guidance_video.enable_memory_efficient_attention=true\n- Reduce the number of frames with data.single_view.num_frames=8\n- Reduce the hash grid capacity in system.geometry.pos_encoding_config, e.g., system.geometry.pos_encoding_config.n_levels=8. Note that this requires retraining the first two stages.\n\n## More tips\n- **More motion**. To increase the motion, increase the weight of the video SDS loss, e.g., system.loss.lambda_sds_video=0.3 or system.loss.lambda_sds_video=0.5.\n\n## Credits\n\nThis code is built on the [threestudio-project](https://github.com/threestudio-project/threestudio) and [MVDream-threestudio](https://github.com/bytedance/MVDream-threestudio). 
Thanks to the maintainers for their contribution to the community!\n\n## Citing\n\nIf you find 4D-fy helpful, please consider citing:\n\n```\n@article{bah20234dfy,\n  author = {Bahmani, Sherwin and Skorokhodov, Ivan and Rong, Victor and Wetzstein, Gordon and Guibas, Leonidas and Wonka, Peter and Tulyakov, Sergey and Park, Jeong Joon and Tagliasacchi, Andrea and Lindell, David B.},\n  title = {4D-fy: Text-to-4D Generation Using Hybrid Score Distillation Sampling},\n  journal = {arXiv},\n  year = {2023},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdsaurus%2Fthreestudio-4dfy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdsaurus%2Fthreestudio-4dfy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdsaurus%2Fthreestudio-4dfy/lists"}