{"id":19186815,"url":"https://github.com/donydchen/image-caption-cpp","last_synced_at":"2026-06-19T19:01:41.237Z","repository":{"id":95783575,"uuid":"53315697","full_name":"donydchen/image-caption-cpp","owner":"donydchen","description":"A data driven query expansion approach for image caption, implemented in cpp","archived":false,"fork":false,"pushed_at":"2017-10-07T09:56:11.000Z","size":686,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-23T03:25:26.991Z","etag":null,"topics":["bleu","cpp","image-captioning"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/donydchen.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-07T10:10:20.000Z","updated_at":"2019-07-22T08:38:55.000Z","dependencies_parsed_at":"2023-03-21T21:32:46.537Z","dependency_job_id":null,"html_url":"https://github.com/donydchen/image-caption-cpp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/donydchen/image-caption-cpp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donydchen%2Fimage-caption-cpp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donydchen%2Fimage-caption-cpp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donydchen%2Fimage-caption-cpp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donydchen%2Fimage-caption-cpp/manifests","own
er_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/donydchen","download_url":"https://codeload.github.com/donydchen/image-caption-cpp/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/donydchen%2Fimage-caption-cpp/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34544413,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bleu","cpp","image-captioning"],"created_at":"2024-11-09T11:16:49.249Z","updated_at":"2026-06-19T19:01:41.227Z","avatar_url":"https://github.com/donydchen.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Image Caption\n\n\nAutomatically generating descriptions for images is a fundamental problem in artificial intelligence that connects computer vision and natural language processing. The research has great practical significance, for instance in helping visually impaired people better understand the content of images on the web.\n\nThis project implements a data-driven approach for automatically generating English captions for given images. 
The core algorithm of this project is implemented mainly in C++, while the image features are extracted with Caffe and the sentence vectors are computed with Word2Vec.\n\n![samples](pics/1.jpg)\n\n\n## Algorithm Details\n\nThis project implements a data-driven query expansion approach, which can be roughly described in the following three steps:\n\n* It begins with a **nearest neighbor algorithm**, which finds the set of training images nearest to the input image; the training set contains images paired with their captions.\n* **Sentence vectors** are extracted from the captions of the nearest images using Word2Vec, and then averaged to produce a sentence vector for the input image.\n* The averaged vector is then used to **rerank** the captions of the K nearest neighbors, and the caption closest to it is **borrowed** as the generated caption for the given image.\n\nThe main process is illustrated as follows:\n\n![algorithm](pics/2.jpg)\n\n\n## Auxiliary Module\n\nThis project also provides two auxiliary modules, which should be useful for other similar image captioning projects. Both are self-contained and can be used independently.\n\n* **BLEU**: An algorithm for evaluating the quality of text that has been machine-translated from one natural language to another. Check out the subfolder `bleu/` for more details.\n* **Webpage Autogenerator**: A tool that generates a webpage from the given images, captions and BLEU scores. 
Check out the subfolder `result/` for more details.\n\n\n## Results\n\n### Quantitative Results\n\n![data result](pics/3.jpg)\n\n### Presentation Webpage\n\n![UI result](pics/4.jpg)\n\n\n## How to Run \n\nRun the following commands:\n\n```sh\nbash setup.sh   # download the required files\nbash run.sh     # compile the project\n./demo          # generate captions, run the BLEU test, and generate the presentation webpage\n```\n\nOnce everything finishes, the final result is available in `result/index.html`.\n\n\n## References\n\n1. V. Ordonez, G. Kulkarni, and T. Berg. Im2Text: Describing images using 1 million captioned photographs. In NIPS, 2011.\n2. J. Devlin, S. Gupta, R. Girshick, M. Mitchell, and C. L. Zitnick. Exploring nearest neighbor approaches for image captioning. arXiv preprint arXiv:1505.04467, 2015.\n3. J. Mao, W. Xu, Y. Yang, J. Wang, and A. Yuille. Deep captioning with multimodal recurrent neural networks (m-RNN). arXiv preprint arXiv:1412.6632, 2014.\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonydchen%2Fimage-caption-cpp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdonydchen%2Fimage-caption-cpp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdonydchen%2Fimage-caption-cpp/lists"}