Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

awesome-image-captioning

A curated list of image captioning and related area resources. :-)
https://github.com/zhjohnchan/awesome-image-captioning

Papers
- Survey
  - A Comprehensive Survey of Deep Learning for Image Captioning - Hossain M et al, `arXiv preprint 2018`.
- Before
  - I2t: Image parsing to text description - Yao B Z et al, `P IEEE 2011`.
  - Im2Text: Describing Images Using 1 Million Captioned Photographs - Ordonez V et al, `NIPS 2011`. [[project web]](http://vision.cs.stonybrook.edu/~vicente/sbucaptions/)
  - Deep Captioning with Multimodal Recurrent Neural Networks - Mao J et al, `arXiv preprint 2014`.
- 2015
  - Show and Tell: A Neural Image Caption Generator - Vinyals O et al, `CVPR 2015`. [[code]](https://github.com/karpathy/neuraltalk) [[code]](https://github.com/zsdonghao/Image-Captioning)
  - Deep Visual-Semantic Alignments for Generating Image Descriptions - Karpathy A et al, `CVPR 2015`. [[project web]](http://cs.stanford.edu/people/karpathy/deepimagesent/) [[code]](https://github.com/karpathy/neuraltalk)
  - Long-term Recurrent Convolutional Networks for Visual Recognition and Description - Donahue J et al, `CVPR 2015`. [[code]](https://github.com/BVLC/caffe/pull/2033) [[project web]](http://jeffdonahue.com/lrcn/)
  - Guiding the Long-Short Term Memory Model for Image Caption Generation - Jia X et al, `ICCV 2015`.
  - Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images - Mao J et al, `ICCV 2015`. [[code]](https://github.com/mjhucla/NVC-Dataset)
  - Expressing an Image Stream with a Sequence of Natural Sentences - Park C C et al, `NIPS 2015`. [[code]](https://github.com/cesc-park/CRCN)
  - Show, Attend and Tell: Neural Image Caption Generation with Visual Attention - Xu K et al, `ICML 2015`. [[project]](http://kelvinxu.github.io/projects/capgen.html) [[code]](https://github.com/yunjey/show-attend-and-tell-tensorflow) [[code]](https://github.com/kelvinxu/arctic-captions)
  - Order-Embeddings of Images and Language - Vendrov I et al, `arXiv preprint 2015`. [[code]](https://github.com/ivendrov/order-embedding)
  - Learning FRAME Models Using CNN Filters for Knowledge Visualization - Lu Y et al, `arXiv preprint 2015`. [[code]](http://www.stat.ucla.edu/~yang.lu/project/deepFrame/doc/deepFRAME_1.1.zip)
  - Aligning where to see and what to tell: image caption with region-based attention and scene factorization - Jin J et al, `arXiv preprint 2015`.
  - Mind’s Eye: A Recurrent Visual Representation for Image Caption Generation - Chen X et al, `CVPR 2015`.
- 2016
  - Generating captions without looking beyond objects - Heuer H et al, `arXiv preprint 2016`.
  - Image captioning with semantic attention - You Q et al, `CVPR 2016`. [[code]](https://github.com/chapternewscu/image-captioning-with-semantic-attention)
  - DenseCap: Fully Convolutional Localization Networks for Dense Captioning - Johnson J et al, `CVPR 2016`. [[code]](https://github.com/jcjohnson/densecap)
  - What value do explicit high level concepts have in vision to language problems? - Wu Q et al, `CVPR 2016`.
  - Deep Compositional Captioning: Describing Novel Object Categories without Paired Training Data - Hendricks L A et al, `CVPR 2016`. [[code]](https://github.com/LisaAnne/DCC)
  - SPICE: Semantic Propositional Image Caption Evaluation - Anderson P et al, `ECCV 2016`. [[code]](https://github.com/peteanderson80/SPICE)
  - Image Captioning with Deep Bidirectional LSTMs - Wang C et al, `ACMMM 2016`. [[code]](https://github.com/deepsemantic/image_captioning)
  - Image Caption Generation with Text-Conditional Semantic Attention - Zhou L et al, `arXiv preprint 2016`. [[code]](https://github.com/LuoweiZhou/e2e-gLSTM-sc)
  - DeepDiary: Automatic Caption Generation for Lifelogging Image Streams - Fan C et al, `arXiv preprint 2016`.
  - Learning to generalize to new compositions in image understanding - Atzmon Y et al, `arXiv preprint 2016`.
  - Bootstrap, Review, Decode: Using Out-of-Domain Textual Data to Improve Image Captioning - Chen W et al, `arXiv preprint 2016`. [[code]](https://github.com/wenhuchen/Semi-Supervised-Image-Captioning)
  - Recurrent Image Captioner: Describing Images with Spatial-Invariant Transformation and Attention Filtering - Liu H et al, `arXiv preprint 2016`.
  - Multimodal Pivots for Image Caption Translation - Hitschler J et al, `ACL 2016`.
- 2017
  - Captioning Images with Diverse Objects - Venugopalan S et al, `CVPR 2017`. [[code]](https://github.com/vsubhashini/noc)
  - Top-down Visual Saliency Guided by Captions - Ramanishka V et al, `CVPR 2017`. [[code]](https://github.com/VisionLearningGroup/caption-guided-saliency)
  - Self-Critical Sequence Training for Image Captioning - Rennie S J et al, `CVPR 2017`. [[code]](https://github.com/ruotianluo/self-critical.pytorch)
  - Dense Captioning with Joint Inference and Visual Context - Yang L et al, `CVPR 2017`. [[code]](https://github.com/linjieyangsc/densecap)
  - Skeleton Key: Image Captioning by Skeleton-Attribute Decomposition - Wang Y et al, `CVPR 2017`. [[code]](https://github.com/feiyu1990/Skeleton-key)
  - A Hierarchical Approach for Generating Descriptive Image Paragraphs - Krause J et al, `CVPR 2017`. [[code]](https://github.com/InnerPeace-Wu/im2p-tensorflow)
  - Deep Reinforcement Learning-based Image Captioning with Embedding Reward - Ren Z et al, `CVPR 2017`.
  - Incorporating Copying Mechanism in Image Captioning for Learning Novel Objects - Yao T et al, `CVPR 2017`.
  - Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning - Lu J et al, `CVPR 2017`. [[code]](https://github.com/jiasenlu/AdaptiveAttention)
  - Attend to You: Personalized Image Captioning with Context Sequence Memory Networks - Park C C et al, `CVPR 2017`. [[code]](https://github.com/cesc-park/attend2u)
  - SCA-CNN: Spatial and channel-wise attention in convolutional networks for image captioning - Chen L et al, `CVPR 2017`. [[code]](https://github.com/zjuchenlong/sca-cnn.cvpr17)
  - Bidirectional Beam Search: Forward-Backward Inference in Neural Sequence Models for Fill-In-The-Blank Image Captioning - Sun Q et al, `CVPR 2017`.
  - Areas of Attention for Image Captioning - Pedersoli M et al, `ICCV 2017`.
  - Boosting Image Captioning with Attributes - Yao T et al, `ICCV 2017`.
  - Improved Image Captioning via Policy Gradient Optimization of SPIDEr - Liu S et al, `ICCV 2017`.
  - Towards Diverse and Natural Image Descriptions via a Conditional GAN - Dai B et al, `ICCV 2017`. [[code]](https://github.com/doubledaibo/gancaption_iccv2017)
  - Paying Attention to Descriptions Generated by Image Captioning Models - Tavakoli H R et al, `ICCV 2017`.
  - Show, Adapt and Tell: Adversarial Training of Cross-domain Image Captioner - Chen T H et al, `ICCV 2017`. [[code]](https://github.com/tsenghungchen/show-adapt-and-tell)
  - Image Caption with Global-Local Attention - Li L et al, `AAAI 2017`.
  - Reference Based LSTM for Image Captioning - Chen M et al, `AAAI 2017`.
  - Attention Correctness in Neural Image Captioning - Liu C et al, `AAAI 2017`.
  - Text-guided Attention Model for Image Captioning - Mun J et al, `AAAI 2017`. [[code]](https://github.com/JonghwanMun/TextguidedATT)
  - Contrastive Learning for Image Captioning - Dai B et al, `NIPS 2017`. [[code]](https://github.com/doubledaibo/clcaption_nips2017)
  - Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge - Vinyals O et al, `TPAMI 2017`. [[code]](https://github.com/tensorflow/models/tree/master/im2txt)
  - MAT: A Multimodal Attentive Translator for Image Captioning - Liu C et al, `arXiv preprint 2017`.
  - Actor-Critic Sequence Training for Image Captioning - Zhang L et al, `arXiv preprint 2017`.
  - What is the Role of Recurrent Neural Networks (RNNs) in an Image Caption Generator? - Tanti M et al, `arXiv preprint 2017`. [[code]](https://github.com/mtanti/rnn-role)
  - Self-Guiding Multimodal LSTM - when we do not have a perfect training dataset for image captioning - Xian Y et al, `arXiv preprint 2017`.
  - Phrase-based Image Captioning with Hierarchical LSTM Model - Tan Y H et al, `arXiv preprint 2017`.
  - Show-and-Fool: Crafting Adversarial Examples for Neural Image Captioning - Chen H et al, `arXiv preprint 2017`.
  - An Empirical Study of Language CNN for Image Captioning - Gu J et al, `ICCV 2017`.
- 2018
  - Neural Baby Talk - Lu J et al, `CVPR 2018`. [[code]](https://github.com/jiasenlu/NeuralBabyTalk)
  - Convolutional Image Captioning - Aneja J et al, `CVPR 2018`.
  - Learning to Evaluate Image Captioning - Cui Y et al, `CVPR 2018`. [[code]](https://github.com/richardaecn/cvpr18-caption-eval)
  - Discriminability Objective for Training Descriptive Captions - Luo R et al, `CVPR 2018`. [[code]](https://github.com/ruotianluo/DiscCaptioning)
  - SemStyle: Learning to Generate Stylised Image Captions using Unaligned Text - Mathews A et al, `CVPR 2018`.
  - Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering - Anderson P et al, `CVPR 2018`. [[code]](https://github.com/peteanderson80/bottom-up-attention)
  - Unpaired Image Captioning by Language Pivoting - Gu J et al, `ECCV 2018`.
  - Recurrent Fusion Network for Image Captioning - Jiang W et al, `ECCV 2018`.
  - Exploring Visual Relationship for Image Captioning - Yao T et al, `ECCV 2018`.
  - Rethinking the Form of Latent States in Image Captioning - Dai B et al, `ECCV 2018`. [[code]](https://github.com/doubledaibo/2dcaption_eccv2018)
  - Boosted Attention: Leveraging Human Attention for Image Captioning - Chen S et al, `ECCV 2018`.
  - "Factual" or "Emotional": Stylized Image Captioning with Adaptive Learning and Attention - Chen T et al, `ECCV 2018`.
  - Learning to Guide Decoding for Image Captioning - Jiang W et al, `AAAI 2018`.
  - Stack-Captioning: Coarse-to-Fine Learning for Image Captioning - Gu J et al, `AAAI 2018`. [[code]](https://github.com/gujiuxiang/Stack-Captioning)
  - Temporal-difference Learning with Sampling Baseline for Image Captioning - Chen H et al, `AAAI 2018`.
  - Partially-Supervised Image Captioning - Anderson P et al, `NeurIPS 2018`.
  - A Neural Compositional Paradigm for Image Captioning - Dai B et al, `NeurIPS 2018`.
  - Defoiling Foiled Image Captions - Wang J et al, `NAACL 2018`.
  - Punny Captions: Witty Wordplay in Image Descriptions - Chandrasekaran A et al, `NAACL 2018`. [[code]](https://github.com/purvaten/punny_captions)
  - Object Counts! Bringing Explicit Detections Back into Image Captioning - Wang J et al, `NAACL 2018`.
  - Conceptual Captions: A Cleaned, Hypernymed, Image Alt-text Dataset For Automatic Image Captioning - Sharma P et al, `ACL 2018`. [[code]](https://github.com/google-research-datasets/conceptual-captions)
  - Attacking visual language grounding with adversarial examples: A case study on neural image captioning - Chen H et al, `ACL 2018`. [[code]](https://github.com/IBM/Image-Captioning-Attack)
  - simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions - Liu et al, `EMNLP 2018`. [[code]](https://github.com/lancopku/simNet)
  - Improved Image Captioning with Adversarial Semantic Alignment - Melnyk I et al, `arXiv preprint 2018`.
  - CNN+CNN: Convolutional Decoders for Image Captioning - Wang Q et al, `arXiv preprint 2018`.
- 2019
  - SC-RANK: Improving Convolutional Image Captioning with Self-Critical Learning and Ranking Metric-based Reward - Yan et al, `BMVC 2019`.
  - Unsupervised Image Captioning - Yang F et al, `CVPR 2019`. [[code]](https://github.com/fengyang0317/unsupervised_captioning)
  - Engaging Image Captioning Via Personality - Shuster K et al, `CVPR 2019`.
  - Pointing Novel Objects in Image Captioning - Li Y et al, `CVPR 2019`.
  - Auto-Encoding Scene Graphs for Image Captioning - Yang X et al, `CVPR 2019`.
  - Context and Attribute Grounded Dense Captioning - Yin G et al, `CVPR 2019`.
  - Look Back and Predict Forward in Image Captioning - Qin Y et al, `CVPR 2019`.
  - Self-critical n-step Training for Image Captioning - Gao J et al, `CVPR 2019`.
  - Intention Oriented Image Captions with Guiding Objects - Zheng Y et al, `CVPR 2019`.
  - Describing like humans: on diversity in image captioning - Wang Q et al, `CVPR 2019`.
  - Adversarial Semantic Alignment for Improved Image Captions - Dognin P et al, `CVPR 2019`.
  - MSCap: Multi-Style Image Captioning With Unpaired Stylized Text - Gao L et al, `CVPR 2019`.
  - Good News, Everyone! Context driven entity-aware captioning for news images - Biten A F et al, `CVPR 2019`. [[code]](https://github.com/furkanbiten/GoodNews)
  - Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning - Kim D et al, `CVPR 2019`. [[code]](https://github.com/Dong-JinKim/DenseRelationalCaptioning)
  - Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions - Cornia M et al, `CVPR 2019`. [[code]](https://github.com/aimagelab/show-control-and-tell)
  - Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables - Xu Y et al, `CVPR 2019`.
  - Meta Learning for Image Captioning - Li N et al, `AAAI 2019`.
  - Learning Object Context for Dense Captioning - Li X et al, `AAAI 2019`.
  - Hierarchical Attention Network for Image Captioning - Wang W et al, `AAAI 2019`.
  - Deliberate Residual based Attention Network for Image Captioning - Gao L et al, `AAAI 2019`.
  - Connecting Language to Images: A Progressive Attention-Guided Network for Simultaneous Image Captioning and Language Grounding - Song L et al, `AAAI 2019`.
  - Dense Procedure Captioning in Narrated Instructional Videos - Shi B et al, `ACL 2019`.
  - Informative Image Captioning with External Sources of Information - Zhao S et al, `ACL 2019`.
  - Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning - Fan Z et al, `ACL 2019`.
  - Image Captioning with Unseen Objects - Demirel et al, `BMVC 2019`.
  - Look and Modify: Modification Networks for Image Captioning - Sammani et al, `BMVC 2019`. [[code]](https://github.com/fawazsammani/look-and-modify)
  - Show, Infer and Tell: Contextual Inference for Creative Captioning - Khare et al, `BMVC 2019`. [[code]](https://github.com/ankit1khare/Show_Infer_and_Tell-CIC)
  - Hierarchy Parsing for Image Captioning - Yao T et al, `ICCV 2019`.
  - Entangled Transformer for Image Captioning - Li G et al, `ICCV 2019`.
  - Attention on Attention for Image Captioning - Huang L et al, `ICCV 2019`. [[code]](https://github.com/husthuaan/AoANet)
  - Reflective Decoding Network for Image Captioning - Ke L et al, `ICCV 2019`.
  - Learning to Collocate Neural Modules for Image Captioning - Yang X et al, `ICCV 2019`.
  - Image Captioning: Transforming Objects into Words - Herdade S et al, `NeurIPS 2019`.
  - Adaptively Aligned Image Captioning via Adaptive Attention Time - Huang L et al, `NeurIPS 2019`. [[code]](https://github.com/husthuaan/AAT)
  - Variational Structured Semantic Inference for Diverse Image Captioning - Chen F et al, `NeurIPS 2019`.
  - Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations - Liu F et al, `NeurIPS 2019`. [[code]](https://github.com/fenglinliu98/MIA)
  - Image Captioning with Compositional Neural Module Networks - Tian J et al, `IJCAI 2019`.
  - Exploring and Distilling Cross-Modal Information for Image Captioning - Liu F et al, `IJCAI 2019`.
  - Swell-and-Shrink: Decomposing Image Captioning by Transformation and Summarization - Wang H et al, `IJCAI 2019`.
  - Hornet: a hierarchical offshoot recurrent network for improving person re-ID via image captioning - Yan S et al, `IJCAI 2019`.
  - Image Captioning with Very Scarce Supervised Data: Adversarial Semi-Supervised Learning Approach - Kim D J et al, `EMNLP 2019`.
  - TIGEr: Text-to-Image Grounding for Image Caption Evaluation - Jiang M et al, `EMNLP 2019`.
  - REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning - Jiang M et al, `EMNLP 2019`.
  - Decoupled Box Proposal and Featurization with Ultrafine-Grained Semantic Labels Improve Image Captioning and Visual Question Answering - Changpinyo S et al, `EMNLP 2019`.
  - Compositional Generalization in Image Captioning - Nikolaus M et al, `CoNLL 2019`. [[code]](https://github.com/mitjanikolaus/compositional-image-captioning)
  - Improving Image Captioning with Conditional Generative Adversarial Nets - Chen C et al, `AAAI 2019`.
  - Fast, Diverse and Accurate Image Captioning Guided By Part-of-Speech - Deshpande A et al, `CVPR 2019`.
- 2020
  - MemCap: Memorizing Style Knowledge for Image Captioning - Zhao et al, `AAAI 2020`.
  - Unified Vision-Language Pre-Training for Image Captioning and VQA - Zhou L et al, `AAAI 2020`.
  - Show, Recall, and Tell: Image Captioning with Recall Mechanism - Wang L et al, `AAAI 2020`.
  - Reinforcing an Image Caption Generator using Off-line Human Feedback - Seo P H et al, `AAAI 2020`.
  - Interactive Dual Generative Adversarial Networks for Image Captioning - Liu et al, `AAAI 2020`.
  - Feature Deformation Meta-Networks in Image Captioning of Novel Objects - Cao et al, `AAAI 2020`.
  - Joint Commonsense and Relation Reasoning for Image and Video Captioning - Hou et al, `AAAI 2020`.
  - Normalized and Geometry-Aware Self-Attention Network for Image Captioning - Guo L et al, `CVPR 2020`.
  - Object Relational Graph with Teacher-Recommended Learning for Video Captioning - Zhang Z et al, `CVPR 2020`.
  - Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs - Chen S et al, `CVPR 2020`.
  - X-Linear Attention Networks for Image Captioning - Pan et al, `CVPR 2020`.
  - Improving Image Captioning with Better Use of Caption - Shi Z et al, `ACL 2020`.
  - Cross-modal Coherence Modeling for Caption Generation - Alikhani M et al, `ACL 2020`.
  - Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA - Kim H et al, `ACL 2020`.
  - Length-Controllable Image Captioning - Deng C et al, `ECCV 2020`.
  - Captioning Images Taken by People Who Are Blind - Gurari D et al, `ECCV 2020`.
  - Towards Unique and Informative Captioning of Images - Wang Z et al, `ECCV 2020`.
  - Learning Visual Representations with Caption Annotations - Sariyildiz M et al, `ECCV 2020`.
  - Comprehensive Image Captioning via Scene Graph Decomposition - Zhong Y et al, `ECCV 2020`.
  - SODA: Story Oriented Dense Video Captioning Evaluation Framework - Fujita S et al, `ECCV 2020`.
  - TextCaps: a Dataset for Image Captioning with Reading Comprehension - Sidorov O et al, `ECCV 2020`.
  - Compare and Reweight: Distinctive Image Captioning Using Similar Images Sets - Wang J et al, `ECCV 2020`.
  - Learning to Generate Grounded Visual Captions without Localization Supervision - Ma C et al, `ECCV 2020`.
  - Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards - Yang X et al, `ECCV 2020`.
  - Learning Modality Interaction for Temporal Sentence Localization and Event Captioning in Videos - Chen S et al, `ECCV 2020`.
  - CapWAP: Image Captioning with a Purpose - Fisch A et al, `EMNLP 2020`.
  - X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers - Cho J et al, `EMNLP 2020`.
  - Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning - Fang Z et al, `EMNLP 2020`.
  - Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements - Li Y et al, `EMNLP 2020`.
  - Diverse Image Captioning with Context-Object Split Latent Spaces - Mahajan S et al, `NeurIPS 2020`.
  - RATT: Recurrent Attention to Transient Tasks for Continual Image Captioning - Chiaro R et al, `NeurIPS 2020`.
Dataset
- 2020
Image Captioning Challenge
- 2020
  - Microsoft COCO Image Captioning
  - Google AI Blog: Conceptual Captions
Popular Implementations
- TensorFlow
  - tensorflow/models/im2txt
Licenses
- Others
  - CC0
  - Zhihong Chen
Change Log
- here