{"id":20063255,"url":"https://github.com/markdtw/vqa-winner-cvprw-2017","last_synced_at":"2025-05-05T17:32:34.189Z","repository":{"id":93067275,"uuid":"102872812","full_name":"markdtw/vqa-winner-cvprw-2017","owner":"markdtw","description":"Pytorch implementation of winner from VQA Challenge Workshop in CVPR'17","archived":false,"fork":false,"pushed_at":"2019-02-08T00:04:45.000Z","size":26,"stargazers_count":163,"open_issues_count":4,"forks_count":38,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-09T02:04:51.717Z","etag":null,"topics":["pytorch","visual-question-answering"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/markdtw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-09-08T14:55:00.000Z","updated_at":"2025-01-23T05:56:36.000Z","dependencies_parsed_at":"2023-06-04T15:15:29.529Z","dependency_job_id":null,"html_url":"https://github.com/markdtw/vqa-winner-cvprw-2017","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markdtw%2Fvqa-winner-cvprw-2017","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markdtw%2Fvqa-winner-cvprw-2017/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markdtw%2Fvqa-winner-cvprw-2017/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markdtw%2Fvqa-winner-cvprw-2017/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/markdtw","down
load_url":"https://codeload.github.com/markdtw/vqa-winner-cvprw-2017/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252542388,"owners_count":21764959,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pytorch","visual-question-answering"],"created_at":"2024-11-13T13:41:23.967Z","updated_at":"2025-05-05T17:32:33.834Z","avatar_url":"https://github.com/markdtw.png","language":"Python","funding_links":[],"categories":["Paper implementations｜论文实现","Paper implementations"],"sub_categories":["Other libraries｜其他库:","Other libraries:"],"readme":"# 2017 VQA Challenge Winner (CVPR'17 Workshop)\npytorch implementation of [Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge by Teney et al](https://arxiv.org/pdf/1708.02711.pdf).\n\n![Model architecture](https://i.imgur.com/phBHIqZ.png)\n\n## Prerequisites\n- python 3.6+\n- numpy\n- [pytorch](http://pytorch.org/) 0.4\n- [tqdm](https://pypi.python.org/pypi/tqdm)\n- [nltk](http://www.nltk.org/install.html)\n- [pandas](https://pandas.pydata.org/)\n\n\n## Data\n- [VQA 2.0](http://visualqa.org/download.html)\n- [COCO 36 features pretrained resnet model](https://github.com/peteanderson80/bottom-up-attention#pretrained-features)\n- [GloVe pretrained Wikipedia+Gigaword word embedding](https://nlp.stanford.edu/projects/glove/)\n\n\n## Preparation\n- To download and extract vqav2, glove, and pretrained visual features:\n  ```bash\n  bash scripts/download_extract.sh\n  ```\n- To prepare data for training:\n  ```bash\n  python scripts/preproc.py\n  ```\n- 
The structure of the `data/` directory should look like this:\n  ```\n  - data/\n    - zips/\n      - v2_XXX...zip\n      - ...\n      - glove...zip\n      - trainval_36.zip\n    - glove/\n      - glove...txt\n      - ...\n    - v2_XXX.json\n    - ...\n    - trainval_resnet...tsv\n    (The above are files created after executing scripts/download_extract.sh)\n    - tokenizers/\n      - ...\n    - dict_ans.pkl\n    - dict_q.pkl\n    - glove_pretrained_300.npy\n    - train_qa.pkl\n    - val_qa.pkl\n    - train_vfeats.pkl\n    - val_vfeats.pkl\n    (The above are files created after executing scripts/preproc.py)\n  ```\n\n## Train\nUse default parameters:\n```bash\nbash scripts/train.sh\n```\n\n## Notes\n- Major refactor (especially of data preprocessing), tested with pytorch 0.4.1 and python 3.6\n- Training for 20 epochs reaches around 50% training accuracy (the model seems buggy in my implementation).\n- After all the preprocessing, the `data/` directory may grow to 38G or more\n- Parts of `preproc.py` and `utils.py` are based on [this repo](https://github.com/hengyuan-hu/bottom-up-attention-vqa)\n\n\n## Resources\n- [The paper](https://arxiv.org/pdf/1708.02711.pdf).\n- [Their CVPR Workshop slides](http://cs.adelaide.edu.au/~Damien/Research/VQA-Challenge-Slides-TeneyAnderson.pdf).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkdtw%2Fvqa-winner-cvprw-2017","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarkdtw%2Fvqa-winner-cvprw-2017","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkdtw%2Fvqa-winner-cvprw-2017/lists"}