{"id":13737688,"url":"https://github.com/fartashf/vsepp","last_synced_at":"2025-05-08T15:30:52.678Z","repository":{"id":22923216,"uuid":"95817438","full_name":"fartashf/vsepp","owner":"fartashf","description":"PyTorch Code for the paper \"VSE++: Improving Visual-Semantic Embeddings with Hard Negatives\"","archived":false,"fork":false,"pushed_at":"2021-12-08T21:38:15.000Z","size":53,"stargazers_count":487,"open_issues_count":0,"forks_count":125,"subscribers_count":15,"default_branch":"master","last_synced_at":"2024-08-04T03:11:05.951Z","etag":null,"topics":["bmvc","negatives","paper","pytorch","vse"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fartashf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-06-29T20:35:17.000Z","updated_at":"2024-07-07T11:48:28.000Z","dependencies_parsed_at":"2022-08-07T10:16:25.269Z","dependency_job_id":null,"html_url":"https://github.com/fartashf/vsepp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fartashf%2Fvsepp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fartashf%2Fvsepp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fartashf%2Fvsepp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fartashf%2Fvsepp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fartashf","download_url":"https://codeload.github.com/fartashf/vsepp/tar.gz/refs/heads/master","host":{"name":"GitHub","url
":"https://github.com","kind":"github","repositories_count":224742198,"owners_count":17362229,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bmvc","negatives","paper","pytorch","vse"],"created_at":"2024-08-03T03:01:57.339Z","updated_at":"2024-11-15T06:30:47.519Z","avatar_url":"https://github.com/fartashf.png","language":"Python","funding_links":[],"categories":["Python","Paper implementations｜论文实现","Paper implementations"],"sub_categories":["Other libraries｜其他库:","Other libraries:"],"readme":"# Improving Visual-Semantic Embeddings with Hard Negatives\n\nCode for the image-caption retrieval methods from\n**[VSE++: Improving Visual-Semantic Embeddings with Hard Negatives](https://arxiv.org/abs/1707.05612)**\n*, F. Faghri, D. J. Fleet, J. R. Kiros, S. Fidler, Proceedings of the British Machine Vision Conference (BMVC), 2018. (BMVC Spotlight)*\n\n## Dependencies\nWe recommend using Anaconda for the following packages.\n\n* Python 2.7 (Checkout branch `python3`)\n* [PyTorch](http://pytorch.org/) (\u003e0.2) (Checkout branch `pytorch4.1`)\n* [NumPy](http://www.numpy.org/) (\u003e1.12.1)\n* [TensorBoard](https://github.com/TeamHG-Memex/tensorboard_logger)\n* [pycocotools](https://github.com/cocodataset/cocoapi)\n* [torchvision]()\n* [matplotlib]()\n\n\n* Punkt Sentence Tokenizer:\n```python\nimport nltk\nnltk.download()\n\u003e d punkt\n```\n\n## Download data\n\nDownload the dataset files and pre-trained models. We use splits produced by [Andrej Karpathy](http://cs.stanford.edu/people/karpathy/deepimagesent/). 
The precomputed image features are from [here](https://github.com/ryankiros/visual-semantic-embedding/) and [here](https://github.com/ivendrov/order-embedding). To use full image encoders, download the images from their original sources [here](http://nlp.cs.illinois.edu/HockenmaierGroup/Framing_Image_Description/KCCA.html), [here](http://shannon.cs.illinois.edu/DenotationGraph/) and [here](http://mscoco.org/).\n\n```bash\nwget http://www.cs.toronto.edu/~faghri/vsepp/vocab.tar\nwget http://www.cs.toronto.edu/~faghri/vsepp/data.tar\nwget http://www.cs.toronto.edu/~faghri/vsepp/runs.tar\n```\n\nWe refer to the path of extracted files for `data.tar` as `$DATA_PATH` and \nfiles for `runs.tar` as `$RUN_PATH`. Extract `vocab.tar` to the `./vocab` \ndirectory.\n\n*Update: The vocabulary was originally built using all sets (including test set \ncaptions). Please see issue #29 for details. Please consider not using test set \ncaptions if building upon this project.*\n\n## Evaluate pre-trained models\n\n```bash\npython -c \"\\\nfrom vocab import Vocabulary\nimport evaluation\nevaluation.evalrank('$RUN_PATH/coco_vse++/model_best.pth.tar', data_path='$DATA_PATH', split='test')\"\n```\n\nTo do cross-validation on MSCOCO, pass `fold5=True` with a model trained using \n`--data_name coco`.\n\n## Training new models\nRun `train.py`:\n\n```bash\npython train.py --data_path \"$DATA_PATH\" --data_name coco_precomp --logger_name runs/coco_vse++ --max_violation\n```\n\nArguments used to train pre-trained models:\n\n| Method    | Arguments |\n| :-------: | :-------: |\n| VSE0      | `--no_imgnorm` |\n| VSE++     | `--max_violation` |\n| Order0    | `--measure order --use_abs --margin .05 --learning_rate .001` |\n| Order++   | `--measure order --max_violation` |\n\n\n## Reference\n\nIf you find this code useful, please cite the following paper:\n\n    @article{faghri2018vse++,\n      title={VSE++: Improving Visual-Semantic Embeddings with Hard Negatives},\n      author={Faghri, Fartash 
and Fleet, David J and Kiros, Jamie Ryan and Fidler, Sanja},\n      booktitle = {Proceedings of the British Machine Vision Conference ({BMVC})},\n      url = {https://github.com/fartashf/vsepp},\n      year={2018}\n    }\n\n## License\n\n[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffartashf%2Fvsepp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffartashf%2Fvsepp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffartashf%2Fvsepp/lists"}