{"id":13717711,"url":"https://github.com/xiadingZ/video-caption.pytorch","last_synced_at":"2025-05-07T07:32:19.090Z","repository":{"id":50323397,"uuid":"116533463","full_name":"xiadingZ/video-caption.pytorch","owner":"xiadingZ","description":"pytorch implementation of video captioning","archived":false,"fork":false,"pushed_at":"2019-08-19T11:25:58.000Z","size":100901,"stargazers_count":398,"open_issues_count":25,"forks_count":128,"subscribers_count":11,"default_branch":"master","last_synced_at":"2024-05-23T06:49:41.673Z","etag":null,"topics":["deep-learning","pytorch","video-captioning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/xiadingZ.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-01-07T03:14:44.000Z","updated_at":"2024-05-14T14:07:33.000Z","dependencies_parsed_at":"2022-09-24T11:01:54.244Z","dependency_job_id":null,"html_url":"https://github.com/xiadingZ/video-caption.pytorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiadingZ%2Fvideo-caption.pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiadingZ%2Fvideo-caption.pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiadingZ%2Fvideo-caption.pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/xiadingZ%2Fvideo-caption.pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/xiadingZ","download_url":"https://codeload.github.com/xiadingZ
/video-caption.pytorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224567999,"owners_count":17332835,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","pytorch","video-captioning"],"created_at":"2024-08-03T00:01:25.940Z","updated_at":"2024-11-14T05:32:23.187Z","avatar_url":"https://github.com/xiadingZ.png","language":"Python","funding_links":[],"categories":["Tutorials \u0026 books \u0026 examples｜教程 \u0026 书籍 \u0026 示例"],"sub_categories":["Other libraries｜其他库:"],"readme":"# pytorch implementation of video captioning\r\n\r\nInstalling pytorch and the python packages with Anaconda is recommended.\r\n\r\n## requirements\r\n\r\n- cuda\r\n- pytorch 0.4.0\r\n- python3\r\n- ffmpeg (can be installed with anaconda)\r\n\r\n### python packages\r\n\r\n- tqdm\r\n- pillow\r\n- pretrainedmodels\r\n- nltk\r\n\r\n## Data\r\n\r\nMSR-VTT. The test videos don't have captions, so I split train-video into train/val/test. 
Extract the files and put them in the `./data/` directory\r\n\r\n- train-video: [download link](https://drive.google.com/file/d/1Qi6Gn_l93SzrvmKQQu-drI90L-x8B0ly/view?usp=sharing)\r\n- test-video: [download link](https://drive.google.com/file/d/10fPbEhD-ENVQihrRvKFvxcMzkDlhvf4Q/view?usp=sharing)\r\n- json info of train-video: [download link](https://drive.google.com/file/d/1LcTtsAvfnHhUfHMiI4YkDgN7lF1-_-m7/view?usp=sharing)\r\n- json info of test-video: [download link](https://drive.google.com/file/d/1Kgra0uMKDQssclNZXRLfbj9UQgBv-1YE/view?usp=sharing)\r\n\r\n\r\n## Options\r\n\r\nAll default options are defined in opt.py or the corresponding code file; change them to your liking.\r\n\r\n## Acknowledgements\r\nSome code refers to [ImageCaptioning.pytorch](https://github.com/yunjey/pytorch-tutorial/tree/master/tutorials/03-advanced/image_captioning)\r\n\r\n## Usage\r\n\r\n### (Optional) c3d features\r\nYou can use [video-classification-3d-cnn-pytorch](https://github.com/kenshohara/video-classification-3d-cnn-pytorch) to extract features from the videos. \r\n\r\n### Steps\r\n\r\n1. preprocess videos and labels\r\n\r\n```bash\r\npython prepro_feats.py --output_dir data/feats/resnet152 --model resnet152 --n_frame_steps 40  --gpu 4,5\r\n\r\npython prepro_vocab.py\r\n```\r\n\r\n2. train a model\r\n\r\n```bash\r\n\r\npython train.py --gpu 0 --epochs 3001 --batch_size 300 --checkpoint_path data/save --feats_dir data/feats/resnet152 --model S2VTAttModel  --with_c3d 1 --c3d_feats_dir data/feats/c3d_feats --dim_vid 4096\r\n```\r\n\r\n3. 
test\r\n\r\n    opt_info.json will be in the same directory as the saved model.\r\n\r\n```bash\r\npython eval.py --recover_opt data/save/opt_info.json --saved_model data/save/model_1000.pth --batch_size 100 --gpu 1\r\n```\r\n\r\n## TODO\r\n- lstm\r\n- beam search\r\n- reinforcement learning\r\n- dataparallel (broken in pytorch 0.4)\r\n\r\n\r\n## Acknowledgements\r\nSome code refers to [ImageCaptioning.pytorch](https://github.com/ruotianluo/ImageCaptioning.pytorch)\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FxiadingZ%2Fvideo-caption.pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FxiadingZ%2Fvideo-caption.pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FxiadingZ%2Fvideo-caption.pytorch/lists"}