Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-scene-graph
A curated list of scene graph generation and related area resources. :-)
https://github.com/mqjyl/awesome-scene-graph
-
Scene Graph Generation
-
Spatio-Temporal Scene Graph
- Video Relation Detection with Spatio-Temporal Graph - Xufeng Qian _et al_, `ACM MM 2019`.
- Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph - Yao-Hung Hubert Tsai _et al_, `CVPR 2019`. [[code]](https://github.com/yaohungt/Gated-Spatio-Temporal-Energy-Graph)
- Video Visual Relation Detection via Multi-modal Feature Fusion - Xu Sun _et al_, `ACM MM 2019`.
- Action Genome: Actions as Composition of Spatio-temporal Scene Graphs - Jingwei Ji _et al_, `arXiv 2019`.
-
2-D Scene Graph
- Unbiased Scene Graph Generation from Biased Training - Kaihua Tang _et al_, `CVPR 2020`. [[code]](https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch)
- One-shot Scene Graph Generation - Yuyu Gao _et al_, `MM 2020`.
- Hierarchical Visual Relationship Detection - Xu Sun _et al_, `ACM MM 2019`.
- Part-Aware Interactive Learning for Scene Graph Generation - Hongshuo Tian _et al_, `MM 2020`.
- Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation - Zih-Siou Hung _et al_, `T-PAMI 2020`.
- Leveraging Auxiliary Text for Deep Recognition of Unseen Visual Relationships - Gal Sadeh Kenigsfield _et al_, `ICLR 2020`.
- Weakly Supervised Visual Semantic Parsing - Alireza Zareian _et al_, `CVPR 2020`.
- GPS-Net: Graph Property Sensing Network for Scene Graph Generation - Xin Lin _et al_, `CVPR 2020`. [[code]](https://github.com/taksau/GPS-Net)
- Deep Generative Probabilistic Graph Neural Networks for Scene Graph Generation - Mahmoud Khademi _et al_, `AAAI 2020`.
- PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation - Shaotian Yan _et al_, `MM 2020`.
- HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation - Meng Wei _et al_, `MM 2020`.
- Unbiased Scene Graph Generation via Rich and Fair Semantic Extraction - Bin Wen _et al_, `ARXIV 2020`.
- Long-tail Visual Relationship Recognition with a Visiolinguistic Hubless Loss - Sherif Abdelkarim _et al_, `ARXIV 2020`.
- Bridging Knowledge Graphs to Generate Scene Graphs - Alireza Zareian _et al_, `ARXIV 2020`.
- NODIS: Neural Ordinary Differential Scene Understanding - Cong Yuren _et al_, `ARXIV 2020`.
- AVR: Attention based Salient Visual Relationship Detection - Jianming Lv _et al_, `ARXIV 2020`.
- Large-Scale Visual Relationship Understanding - Ji Zhang _et al_, `AAAI 2019`. [[code]](https://github.com/facebookresearch/Large-Scale-VRD)
- Learning to Compose Dynamic Tree Structures for Visual Contexts - Kaihua Tang _et al_, `CVPR 2019 Oral`. [[code]](https://github.com/KaihuaTang/VCTree-Scene-Graph-Generation)
- Counterfactual Critic Multi-Agent Training for Scene Graph Generation - Long Chen _et al_, `ICCV 2019 Oral`.
- On Exploring Undetermined Relationships for Visual Relationship Detection - Yibing Zhan _et al_, `CVPR 2019`. [[code]](https://github.com/Atmegal/MFURLN-CVPR-2019-relationship-detection-method)
- Exploring Context and Visual Pattern of Relationship for Scene Graph Generation - Wenbin Wang _et al_, `CVPR 2019`.
- Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation - Hongdong Zheng _et al_, `arXiv 2019`.
- The Limited Multi-Label Projection Layer - Brandon Amos _et al_, `arXiv 2019`. [[code]](https://github.com/locuslab/lml)
- Detecting Visual Relationships Using Box Attention - Alexander Kolesnikov _et al_, `ICCVW 2019`.
- Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction - Apoorva Dornadula _et al_, `ICCVW 2019`.
- Attention-Translation-Relation Network for Scalable Scene Graph Generation - Nikolaos Gkanatsios _et al_, `ICCVW 2019`. [[code]](https://github.com/deeplab-ai/atr-net)
- Attentive Relational Networks for Mapping Images to Scene Graphs - Mengshi Qi _et al_, `CVPR 2019`.
- Visual Relation Detection with Multi-Level Attention - Sipeng Zheng _et al_, `ACM MM 2019`.
- Visual Relationship Recognition via Language and Position Guided Attention - Hao Zhou _et al_, `ICASSP 2019`.
- Relationship Detection Based on Object Semantic Inference and Attention Mechanisms - Liang Zhang _et al_, `ICMR 2019`.
- Natural Language Guided Visual Relationship Detection - Wentong Liao _et al_, `CVPR 2019`.
- Knowledge-Embedded Routing Network for Scene Graph Generation - Tianshui Chen _et al_, `CVPR 2019`. [[code]](https://github.com/yuweihao/KERN)
- Soft Transfer Learning via Gradient Diagnosis for Visual Relationship Detection - Diqi Chen _et al_, `WACV 2019`.
- Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation - Ivan Donadello _et al_, `IJCNN 2019`. [[code]](https://github.com/ivanDonadello/Visual-Relationship-Detection-LTN)
- Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition - Mohammed Haroon Dupty _et al_, `arXiv 2019`.
- Relational Reasoning using Prior Knowledge for Visual Captioning - Jingyi Hou _et al_, `arXiv 2019`.
- Attention-Translation-Relation Network for Scalable Scene Graph Generation - Nikolaos Gkanatsios _et al_, `ICCV 2019`.
- Detecting Unseen Visual Relations Using Analogies - Julia Peyre _et al_, `ICCV 2019`.
- BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection - Hedi Ben-younes _et al_, `AAAI 2019`. [[code]](https://github.com/Cadene/block.bootstrap.pytorch)
- On Class Imbalance and Background Filtering in Visual Relationship Detection - Alessio Sarullo _et al_, `arXiv 2019`.
- Support Relation Analysis for Objects in Multiple View RGB-D Images - Peng Zhang _et al_, `IJCAIW QR 2019`.
- Improving Visual Relation Detection using Depth Maps - Sahand Sharifzadeh _et al_, `arXiv 2019`. [[code]](https://github.com/Sina-Baharlou/Depth-VRD)
- MR-NET: Exploiting Mutual Relation for Visual Relationship Detection - Yi Bin _et al_, `AAAI 2019`.
- Scene Graph Prediction with Limited Labels - Vincent S. Chen _et al_, `ICCV 2019`. [[code]](https://github.com/vincentschen/limited-label-scene-graphs)
- Graphical Contrastive Losses for Scene Graph Parsing - Ji Zhang _et al_, `CVPR 2019`. [[code]](https://github.com/NVIDIA/ContrastiveLosses4VRD)
- Neural Message Passing for Visual Relationship Detection - Yue Hu _et al_, `ICML LRG Workshop 2019`. [[code]](https://github.com/PhyllisH/NMP)
- PANet: A Context Based Predicate Association Network for Scene Graph Generation - Yunian Chen _et al_, `ICME 2019`.
- Visual Relationship Detection with Relative Location Mining - Hao Zhou _et al_, `ACM MM 2019`.
- Visual relationship detection based on bidirectional recurrent neural network - Yibo Dai _et al_, `Multimedia Tools and Applications 2019`.
- Exploring the Semantics for Visual Relationship Detection - Wentong Liao _et al_, `arXiv 2019`.
- Optimising the Input Image to Improve Visual Relationship Detection - Noel Mizzi _et al_, `arXiv 2019`.
- Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection - Nikolaos Gkanatsios _et al_, `arXiv 2019`.
- Learning Effective Visual Relationship Detector on 1 GPU - Yichao Lu _et al_, `arXiv 2019`.
- Graph R-CNN for Scene Graph Generation - Jianwei Yang _et al_, `ECCV 2018`. [[code]](https://github.com/jwyang/graph-rcnn.pytorch)
- LinkNet: Relational Embedding for Scene Graph - Sanghyun Woo _et al_, `NIPS 2018`. [[code]](https://github.com/jiayan97/linknet-pytorch)
- Generating Triples with Adversarial Networks for Scene Graph Construction - Matthew Klawonn _et al_, `AAAI 2018`.
- Scene Graph Generation Based on Node-Relation Context Module - Xin Lin _et al_, `ICONIP 2018`.
- Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation - Yikang Li _et al_, `ECCV 2018`. [[code]](https://github.com/yikang-li/FactorizableNet)
- Neural Motifs: Scene Graph Parsing with Global Context - Rowan Zellers _et al_, `CVPR 2018`. [[code]](https://github.com/rowanz/neural-motifs)
- Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition - Guojun Yin _et al_, `ECCV 2018`. [[code]](https://github.com/gjyin91/ZoomNet)
- Deep Structured Learning for Visual Relationship Detection - Yaohui Zhu _et al_, `AAAI 2018`.
- Visual Relationship Detection Using Joint Visual-Semantic Embedding - Binglin Li _et al_, `ICPR 2018`.
- Object Relation Detection Based on One-shot Learning - Li Zhou _et al_, `arXiv 2018`.
- A Problem Reduction Approach for Visual Relationships Detection - Toshiyuki Fukuzawa _et al_, `ECCVW 2018`.
- An Interpretable Model for Scene Graph Generation - Ji Zhang _et al_, `arXiv 2018`.
- Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features - Xu Yang _et al_, `ECCV 2018`. [[code]](https://github.com/yangxuntu/vrd)
- Visual Relationship Detection with Deep Structural Ranking - Kongming Liang _et al_, `AAAI 2018`. [[code]](https://github.com/GriffinLiang/vrd-dsr)
- Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction - Roei Herzig _et al_, `NIPS 2018`. [[code]](https://github.com/shikorab/SceneGraph)
- Visual Relationship Detection with Language prior and Softmax - Jaewon Jung _et al_, `IPAS 2018`. [[code]](https://github.com/pranoyr/visual-relationship-detection)
- Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation - François Plesse _et al_, `ICME 2018`.
- Context-Dependent Diffusion Network for Visual Relationship Detection - Zhen Cui _et al_, `ACM MM 2018`. [[code]](https://github.com/pranoyr/visual-relationship-detection)
- Region-Object Relevance-Guided Visual Relationship Detection - Yusuke Goutsu _et al_, `BMVC 2018`.
- Deep Image Understanding Using Multilayered Contexts - Donghyeop Shin _et al_, `MPE 2018`.
- Scene Graph Generation via Conditional Random Fields - Weilin Cong _et al_, `arXiv 2018`.
- Scene Graph Generation by Iterative Message Passing - Danfei Xu _et al_, `CVPR 2017`. [[code]](https://github.com/danfeiX/scene-graph-TF-release)
- Scene Graph Generation from Objects, Phrases and Region Captions - Yikang Li _et al_, `ICCV 2017`. [[code]](https://github.com/yikang-li/MSDN)
- ViP-CNN: Visual Phrase Guided Convolutional Neural Network - Yikang Li _et al_, `CVPR 2017`.
- Towards Context-Aware Interaction Recognition for Visual Relationship Detection - Bohan Zhuang _et al_, `ICCV 2017`.
- Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues - Bryan A. Plummer _et al_, `ICCV 2017`. [[code]](https://github.com/BryanPlummer/pl-clc)
- Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection - Xiaodan Liang _et al_, `CVPR 2017`. [[code]](https://github.com/nexusapoorvacus/DeepVariationStructuredRL)
- Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation - Ruichi Yu _et al_, `ICCV 2017`.
- Visual Translation Embedding Network for Visual Relation Detection - Hanwang Zhang _et al_, `CVPR 2017`. [[code]](https://github.com/YANYANYEAH/vtranse)
- Pixels to Graphs by Associative Embedding - Alejandro Newell _et al_, `NIPS 2017`. [[code]](https://github.com/princeton-vl/px2graph)
- PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN - Hanwang Zhang _et al_, `ICCV 2017`. [[code]](https://github.com/jpeyre/unrel)
- Visual relationship detection with object spatial distribution - Yaohui Zhu _et al_, `ICME 2017`.
- On Support Relations and Semantic Scene Graphs - Michael Ying Yang _et al_, `ISPRS 2017`.
- Improving Visual Relationship Detection using Semantic Modeling of Scene Descriptions - Stephan Baier _et al_, `ISWC 2017`.
- Recurrent Visual Relationship Recognition with Triplet Unit - Kento Masui _et al_, `ISM 2017`.
- Recognition using visual phrases - Mohammad Amin Sadeghi _et al_, `CVPR 2011`.
- Memory-Based Network for Scene Graph with Unbalanced Relations - Weitao Wang _et al_, `MM 2020`.
- Visual Spatial Attention Network for Relationship Detection - Chaojun Han _et al_, `ACM MM 2019`.
- Scene Graph Generation with External Knowledge and Image Reconstruction - Jiuxiang Gu _et al_, `CVPR 2019`. [[code]](https://github.com/arxrean/SGG_Ex_RC)
- Learning Prototypes for Visual Relationship Detection - François Plesse _et al_, `CBMI 2018`.
- Detecting Visual Relationships with Deep Relational Networks - Bo Dai _et al_, `CVPR 2017`. [[code]](https://github.com/doubledaibo/drnet_cvpr2017)
- Recurrent Visual Relationship Recognition with Triplet Unit for Diversity - Kento Masui _et al_, `IJSC 2018`.
-
Datasets
- Weakly-Supervised Learning of Visual Relations - Julia Peyre _et al_, `ICCV 2017`. [[download]](https://www.di.ens.fr/willow/research/unrel/)
- Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions - Johanna Wald _et al_, `CVPR 2020`. [[download]](https://3dssg.github.io/)
- The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale - Alina Kuznetsova _et al_, `IJCV 2018`. [[download]](https://storage.googleapis.com/openimages/web/index.html)
- Image Retrieval using Scene Graphs - Justin Johnson _et al_, `CVPR 2015`. [[download]](http://imagenet.stanford.edu/internal/jcjohns/scene_graphs/sg_dataset.zip)
- Recognition Using Visual Phrases - Ali Farhadi _et al_, `CVPR 2011`. [[download]](http://vision.cs.uiuc.edu/phrasal/)
- VrR-VG: Refocusing Visually-Relevant Relationships - Yuanzhi Liang _et al_, `ICCV 2019`. [[download]](http://vrr-vg.com/)
- SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects - Anja Belz _et al_, `INLG 2018`. [[download]](https://github.com/muskata/SpatialVOC2K)
- Image Description using Visual Dependency Representations - Desmond Elliott _et al_, `EMNLP 2013`. [[download]]
- Combining geometric, textual and visual features for predicting prepositions in image descriptions - Arnau Ramisa _et al_, `EMNLP 2015`. [[download]]
- SynthRel0: Towards a Diagnostic Dataset for Relational Representation Learning - Daniel Dorda _et al_, `ICCVW 2019`. [[download]]
- Indoor Segmentation and Support Inference from RGBD Images - Nathan Silberman _et al_, `ECCV 2012`. [[download]](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html)
- 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera - Iro Armeni _et al_, `ICCV 2019`. [[download]](https://3dscenegraph.stanford.edu/)
- Visual Relationship Detection with Language Priors - Cewu Lu _et al_, `ECCV 2016 Oral`. [[download]](https://cs.stanford.edu/people/ranjaykrishna/vrd/)
- Annotating Objects and Relations in User-Generated Videos - Xindi Shang _et al_, `ACM MM 2019`. [[download]](https://xdshang.github.io/docs/vidor.html)
- SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition - Kaiyu Yang _et al_, `ICCV 2019`. [[download]](https://drive.google.com/drive/folders/125fgCq-1YYfKOAxRxVEdmnyZ7sKWlyqZ?usp=sharing)
-
3-D Scene Graph
- 3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents - Ue-Hwan Kim _et al_, `IEEE transactions on cybernetics 2019`. [[code]](https://github.com/Uehwan/3-D-Scene-Graph)
- 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera - Iro Armeni _et al_, `ICCV 2019`. [[code]](https://github.com/StanfordVL/3DSceneGraph)
- Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning - Paul Gay _et al_, `ACCV 2018`. [[code]](https://github.com/paulgay/VGfM)
-
Generate Scene Graph from Textual Description
- Scene Graph Parsing as Dependency Parsing - Yu-Siang Wang _et al_, `NAACL 2018`. [[code]](https://github.com/vacancy/SceneGraphParser)
- Scene Graph Parsing by Attention Graph - Martin Andrews _et al_, `NIPS 2018`.
-
Other Works
- Relationship Prediction for Scene Graph Generation - Uzair Navid Iftikhar _et al_, `2019`.
- Joint Learning of Scene Graph Generation and Reasoning for Visual Question Answering Mid-term report - Arka Sadhu _et al_, `2019`.
- Joint Embeddings of Scene Graphs and Images - Eugene Belilovsky _et al_, `2020`.
-
-
Related High-level Vision-and-Language Tasks
-
Image Retrieval
- Learning visual features for relational CBIR - Nicola Messina _et al_, `MIR 2019`.
- Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval - Sijin Wang _et al_, `WACV 2020`.
- Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset - Sahana Ramnath _et al_, `ICCVW 2019`.
- Compact Scene Graphs for Layout Composition and Patch Retrieval - Subarna Tripathi _et al_, `CVPRW 2019`.
- Revisiting Visual Grounding - Erik Conser _et al_, `ACL 2019`.
- Learning Relationship-aware Visual Features - Nicola Messina _et al_, `ECCVW 2018`. [[code]](https://github.com/mesnico/learning-relationship-aware-visual-features)
- Image retrieval by dense caption reasoning - Xinru Wei _et al_, `VCIP 2017`.
- Image retrieval using scene graphs - Justin Johnson _et al_, `CVPR 2015`.
- Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval - Sebastian Schuster _et al_, `EMNLP 2015`.
- Deep Learning of Binary Hash Codes for Fast Image Retrieval - Kevin Lin _et al_, `CVPRW 2015`. [[code]](https://github.com/kevinlin311tw/caffe-cvprw15)
- Beyond instance-level image retrieval: Leveraging captions to learn a global visual representation for semantic retrieval - Albert Gordo _et al_, `CVPR 2017`.
- PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval - Weixun Zhou _et al_, `ISPRS 2018`. [[download]](https://sites.google.com/view/zhouwx/dataset)
- Large-Scale Image Retrieval with Attentive Deep Local Features - Hyeonwoo Noh _et al_, `ICCV 2017`. [[download]](https://github.com/tensorflow/models/tree/master/research/delf)
- Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval - Tobias Weyand _et al_, `CVPR 2020`. [[download]](https://github.com/cvdfoundation/google-landmark)
- Representation Learning for Visual-Relational Knowledge Graphs - Daniel Oñoro-Rubio _et al_, `ARXIV 2017`.
-
Image Caption
- Exploring Semantic Relationships for Image Captioning without Parallel Data - Fenglin Liu _et al_, `ICDM 2019`.
- Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations - Ranjay Krishna _et al_, `IJCV 2016`. [[download]](http://visualgenome.org/)
- Learning visual relationship and context-aware attention for image captioning - Junbo Wang _et al_, `Pattern Recognition 2020`.
- Object Relational Graph with Teacher-Recommended Learning for Video Captioning - Ziqi Zhang _et al_, `CVPR 2020`.
- Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs - Shizhe Chen _et al_, `CVPR 2020`. [[code]](https://github.com/cshizhe/asg2cap)
- Joint Commonsense and Relation Reasoning for Image and Video Captioning - Jingyi Hou _et al_, `AAAI 2020`.
- Auto-Encoding Scene Graphs for Image Captioning - Xu Yang _et al_, `CVPR 2019`. [[code]](https://github.com/yangxuntu/SGAE)
- Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning - Dong-Jin Kim _et al_, `CVPR 2019`. [[code]](https://github.com/Dong-JinKim/DenseRelationalCaptioning)
- Visual Semantic Reasoning for Image-Text Matching - Kunpeng Li _et al_, `ICCV 2019`. [[code]](https://github.com/KunpengLi1994/VSRN)
- Unpaired Image Captioning via Scene Graph Alignments - Jiuxiang Gu _et al_, `ICCV 2019`. [[code]](https://github.com/gujiuxiang/unpaired_image_captioning)
- Expressing Visual Relationships via Language - Hao Tan _et al_, `ACL 2019`. [[code]](https://github.com/airsplay/VisualRelationships)
- On the Role of Scene Graphs in Image Captioning - Dalin Wang _et al_, `ACL 2019`.
- Adversarial Adaptation of Scene Graph Models for Understanding Civic Issues - Shanu Kumar _et al_, `WWW 2019`. [[code]](https://github.com/Sshanu/civic_issue_dataset)
- Aligning Linguistic Words and Visual Semantic Units for Image Captioning - Longteng Guo _et al_, `ACM MM 2019`. [[code]](https://github.com/ltguo19/VSUA-Captioning)
- Better Understanding Hierarchical Visual Relationship for Image Caption - Zheng-cong Fei _et al_, `NeurIPS 2019 workshop on New In ML`.
- Visual Relationship Embedding Network for Image Paragraph Generation - Wenbin Che _et al_, `TMM 2019`.
- Know More Say Less: Image Captioning Based on Scene Graphs - Xiangyang Li _et al_, `TMM 2019`.
- Visual Relationship Attention for Image Captioning - Zongjian Zhang _et al_, `IJCNN 2019`.
- TPsgtR: Neural-Symbolic Tensor Product Scene-Graph-Triplet Representation for Image Captioning - Chiranjib Sur _et al_, `ARXIV 2019`.
- Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators - Kuang-Huei Lee _et al_, `ARXIV 2019`.
- Relational Reasoning using Prior Knowledge for Visual Captioning - Jingyi Hou _et al_, `ARXIV 2019`.
- Exploring Visual Relationship for Image Captioning - Ting Yao _et al_, `ECCV 2018`.
- Improved Image Description Via Embedded Object Structure Graph and Semantic Feature Matching - Li Ren _et al_, `ISM 2018`.
- Sports Video Captioning by Attentive Motion Representation based Hierarchical Recurrent Neural Networks - Mengshi Qi _et al_, `2018`.
- Image Captioning and Visual Question Answering Based on Attributes and External Knowledge - Qi Wu _et al_, `TPAMI 2017`.
- SPICE: Semantic Propositional Image Caption Evaluation - Peter Anderson _et al_, `ECCV 2016`. [[code]](https://github.com/peteanderson80/SPICE)
- Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models - Bryan A. Plummer _et al_, `IJCV 2017`. [[download]](http://bryanplummer.com/Flickr30kEntities/)
- Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics - Micah Hodosh _et al_, `IJCAI 2013`. [[download]](http://academictorrents.com/details/9dea07ba660a722ae1008c4c8afdd303b6f6e53b)
- The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems - Michael Grubinger _et al_, `International Workshop OntoImage 2006`. [[download]](https://www.imageclef.org/photodata)
- Paragraph Generation Network with Visual Relationship Detection - Wenbin Che _et al_, `ACM MM 2018`.
-
Referring Expression Comprehension - Visual Grounding
- Cross-Modal Relationship Inference for Grounding Referring Expressions - Sibei Yang _et al_, `CVPR 2019`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/cmrin_models)
- Relationship-Embedded Representation Learning for Grounding Referring Expressions - Sibei Yang _et al_, `TPAMI 2020`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/cmrin_models)
- Referring Expression Comprehension with Semantic Visual Relationship and Word Mapping - Chao Zhang _et al_, `ACM MM 2019`.
- Learning to Relate from Captions and Bounding Boxes - Sarthak Garg _et al_, `ACL 2019`.
- Joint Visual Grounding with Language Scene Graphs - Daqing Liu _et al_, `ARXIV 2019`.
- Modeling Relationships in Referential Expressions With Compositional Modular Networks - Ronghang Hu _et al_, `CVPR 2017`. [[code]](https://github.com/ronghanghu/cmn)
- Graph-Structured Referring Expression Reasoning in The Wild - Sibei Yang _et al_, `CVPR 2020`. [[code]](https://github.com/sibeiyang/sgmn)
- Dynamic Graph Attention for Referring Expression Comprehension - Sibei Yang _et al_, `ICCV 2019`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/dga_models)
- Grounding Referring Expressions in Images by Variational Context - Hanwang Zhang _et al_, `CVPR 2018`. [[code]](https://github.com/yuleiniu/vc)
- Modeling Context in Referring Expressions - Licheng Yu _et al_, `ECCV 2016`. [[download]](https://github.com/lichengunc/refer)
-
Visual Question Answering
- DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue - Xiaoze Jiang _et al_, `AAAI 2020`. [[code]](https://github.com/JXZe/DualVD)
- Visual Query Answering by Entity-Attribute Graph Matching and Reasoning - Peixi Xiong _et al_, `CVPR 2019`.
- Relation-Aware Graph Attention Network for Visual Question Answering - Linjie Li _et al_, `ICCV 2019`. [[code]](https://github.com/linjieli222/VQA_ReGAT)
- Multi-interaction Network with Object Relation for Video Question Answering - Weike Jin _et al_, `ACM MM 2019`.
- An Empirical Study on Leveraging Scene Graphs for Visual Question Answering - Cheng Zhang _et al_, `BMVC 2019`. [[code]](https://github.com/czhang0528/scene-graphs-vqa)
- Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention - Shalini Ghosh _et al_, `ARXIV 2019`.
- R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering - Pan Lu _et al_, `SIGKDD 2018`. [[code]](https://github.com/BierOne/relation-vqa)
- Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering - Zhuoqian Yang _et al_, `ARXIV 2018`.
- VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases - Fereshteh Sadeghi _et al_, `CVPR 2015`. [[code]](https://github.com/fsadeghi/VisKE)
- Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction - Hyeonwoo Noh _et al_, `CVPR 2016`. [[code]](https://github.com/HyeonwooNoh/DPPnet)
- Ask Your Neurons: A Neural-based Approach to Answering Questions about Images - Mateusz Malinowski _et al_, `ICCV 2015`.
- VQA: Visual question answering - Aishwarya Agrawal _et al_, `ICCV 2015`. [[download]](https://visualqa.org/vqa_v1_download.html)
- Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering - Yash Goyal _et al_, `CVPR 2017`. [[download]](https://visualqa.org/download.html)
- Image Question Answering: A Visual Semantic Embedding Model and a New Dataset - Mengye Ren _et al_, `ICML 2015`, or [Exploring Models and Data for Image Question Answering](https://arxiv.org/abs/1505.02074) - Mengye Ren _et al_, `NIPS 2015`. [[download]](http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/)
- CRA-Net: Composed Relation Attention Network for Visual Question Answering - Liang Peng _et al_, `ACM MM 2019`.
-
Visual Reasoning
- Language-Conditioned Graph Networks for Relational Reasoning - Ronghang Hu _et al_, `ICCV 2019`. [[code]](https://github.com/ronghanghu/lcgn)
- Explainable and Explicit Visual Reasoning over Scene Graphs - Jiaxin Shi _et al_, `CVPR 2019`. [[code]](https://github.com/shijx12/XNM-Net)
- Referring Relationships - Ranjay Krishna _et al_, `CVPR 2018`. [[code]](https://github.com/StanfordVL/ReferringRelationships)
- Broadcasting Convolutional Network for Visual Relational Reasoning - Simyung Chang _et al_, `ECCV 2018`.
- Object level Visual Reasoning in Videos - Fabien Baradel _et al_, `ECCV 2018`. [[code]](https://github.com/fabienbaradel/object_level_visual_reasoning)
- GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering - Drew A. Hudson _et al_, `CVPR 2019`. [[download]](https://cs.stanford.edu/people/dorarad/gqa/index.html)
- CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning - Justin Johnson _et al_, `CVPR 2017`. [[download]](https://cs.stanford.edu/people/jcjohns/clevr/) [[code]](https://github.com/facebookresearch/clevr-dataset-gen)
- Differentiable Scene Graphs - Moshiko Raboh _et al_, `WACV 2020`.
- A simple neural network module for relational reasoning - Adam Santoro _et al_, `NIPS 2017`. [[code]](https://github.com/clvrai/Relation-Network-Tensorflow)
-
Image Generation - Content-based Image Retrieval(CBIR)
- PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph - Yikang Li _et al_, `NIPS 2019`. [[code]](https://github.com/yikang-li/PasteGAN)
- Specifying Object Attributes and Relations in Interactive Scene Generation - Oron Ashual _et al_, `ICCV 2019`. [[code]](https://github.com/ashual/scene_generation)
- Triplet-Aware Scene Graph Embeddings - Brigit Schroeder _et al_, `ICCVW 2019`.
- Heuristics for Image Generation from Scene Graphs - Subarna Tripathi _et al_, `ICLR 2019`.
- Interactive Image Generation Using Scene Graphs - Gaurav Mittal _et al_, `ICLR 2019`.
- Visual-Relation Conscious Image Generation from Structured-Text - Duc Minh Vo _et al_, `ARXIV 2019`.
- Using Scene Graph Context to Improve Image Generation - Subarna Tripathi _et al_, `ARXIV 2019`.
- Learning Canonical Representations for Scene Graph to Image Generation - Roei Herzig _et al_, `ARXIV 2019`.
- Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation - Hongdong Zheng _et al_, `ARXIV 2019`.
- Image Generation from Scene Graphs - Justin Johnson _et al_, `CVPR 2018`. [[code]](https://github.com/google/sg2im)
- Image Generation From Small Datasets via Batch Statistics Adaptation - Atsuhiro Noguchi _et al_, `ICCV 2019`. [[code]](https://github.com/nogu-atsu/small-dataset-image-generation)
- Text2Scene: Generating Compositional Scenes from Textual Descriptions - Fuwen Tan _et al_, `CVPR 2019`. [[code]](https://github.com/uvavision/Text2Scene)
- Unsupervised Cross-Domain Image Generation - Yaniv Taigman _et al_, `ICLR 2017 conference submission`. [[code]](https://github.com/yunjey/domain-transfer-network)
- Generative Visual Manipulation on the Natural Image Manifold - Jun-Yan Zhu _et al_, `ECCV 2016`. [[code]](https://github.com/junyanz/iGAN)
- Attribute2Image: Conditional Image Generation from Visual Attributes - Xinchen Yan _et al_, `ECCV 2016`. [[code]](https://github.com/xcyan/eccv16_attr2img)
- Microsoft COCO: Common Objects in Context - Tsung-Yi Lin _et al_, `ECCV 2014`. [[download]](http://cocodataset.org/#home)
-
Other Applications
- Semantic Image Manipulation Using Scene Graphs - Helisa Dhamo _et al_, `CVPR 2020`.
- SOGNet: Scene Overlap Graph Network for Panoptic Segmentation - Yibo Yang _et al_, `AAAI 2020`. [[code]](https://github.com/LaoYang1994/SOGNet)
- ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks - Chixiang Ma _et al_, `arXiv 2020`.
- Event Detection with Relation-Aware Graph Convolutional Neural Networks - Shiyao Cui _et al_, `arXiv 2020`.
- SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation - Yang Zhou _et al_, `ICCV 2019`. [[code]](https://github.com/yzhou359/3DIndoor-SceneGraphNet)
- Seq-SG2SL: Inferring Semantic Layout from Scene Graph Through Sequence to Sequence Learning - Boren Li _et al_, `ICCV 2019`.
- PlanIT: planning and instantiating indoor scenes with relation graph and spatial prior networks - Kai Wang _et al_, `TOG 2019`.
- Hierarchical Relational Networks for Group Activity Recognition and Retrieval - Mostafa S. Ibrahim _et al_, `ECCV 2018`. [[code]](https://github.com/mostafa-saad/hierarchical-relational-network)
- Scene Graphs for Interpretable Video Anomaly Classification - Nicholas F. Y. Chen _et al_, `NIPS 2018 ViGIL Workshop`.
- Towards a Domain Specific Language for a Scene Graph based Robotic World Model - Sebastian Blumenthal _et al_, `DSLRob 2013`.
-
-
Human-centric Relation
-
HCR Datasets
- Visual Semantic Role Labeling - Saurabh Gupta _et al_, `arXiv 2015`. [[download]](https://github.com/s-gupta/v-coco)
- TUHOI: Trento Universal Human Object Interaction Dataset - Dieu-Thu Le _et al_, `ACL 2014`. [[download]](http://disi.unitn.it/~dle/dataset/TUHOI.html)
- The "something something" video database for learning and evaluating visual common sense - Raghav Goyal _et al_, `ICCV 2017`. [[download]](https://20bn.com/datasets/something-something)
- HICO: A Benchmark for Recognizing Human-Object Interactions in Images - Yu-Wei Chao _et al_, `ICCV 2015`. [[download]](http://www-personal.umich.edu/~ywchao/hico/)
-
Human-Object Interaction(HOI)
- VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions - Oytun Ulutan _et al_, `CVPR 2020`. [[code]](https://github.com/ASMIftekhar/VSGNet)
- Learning Human-Object Interaction Detection using Interaction Points - Tiancai Wang _et al_, `CVPR 2020`. [[code]](https://github.com/vaesl/IP-Net)
- Detailed 2D-3D Joint Representation for Human-Object Interaction - Yong-Lu Li _et al_, `CVPR 2020`. [[code]](https://github.com/DirtyHarryLYL/DJ-RN)
- Cascaded Human-Object Interaction Recognition - Tianfei Zhou _et al_, `CVPR 2020`. [[code]](https://github.com/tfzhou/C-HOI)
- PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection - Yue Liao _et al_, `CVPR 2020`. [[code]](https://github.com/YueLiao/PPDM)
- Detecting Human-Object Interactions via Functional Generalization - Ankan Bansal _et al_, `AAAI 2020`.
- Classifying All Interacting Pairs in a Single Shot - Sanaa Chafik _et al_, `WACV 2020`.
- Visual-Semantic Graph Attention Network for Human-Object Interaction Detection - Zhijun Liang _et al_, `arXiv 2020`.
- Spatial Priming for Detecting Human-Object Interactions - Ankan Bansal _et al_, `arXiv 2020`.
- GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency - Dongming Yang _et al_, `arXiv 2020`.
- Reasoning About Human-Object Interactions Through Dual Attention Networks - Tete Xiao _et al_, `ICCV 2019`.
- Pose-aware Multi-level Feature Network for Human Object Interaction Detection - Bo Wan _et al_, `ICCV 2019`. [[code]](https://github.com/bobwan1995/PMFNet)
- Deep Contextual Attention for Human-Object Interaction Detection - Tiancai Wang _et al_, `ICCV 2019`.
- Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning - Lifeng Fan _et al_, `ICCV 2019`. [[code]](https://github.com/LifengFan/Human-Gaze-Communication)
- Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense - Yixin Chen _et al_, `ICCV 2019`. [[code]](https://github.com/yixchen/holistic_scene_human)
- Transferable Interactiveness Knowledge for Human-Object Interaction Detection - Yong-Lu Li _et al_, `CVPR 2019`. [[code]](https://github.com/DirtyHarryLYL/Transferable-Interactiveness-Network)
- Do Deep Neural Networks Model Nonlinear Compositionality in the Neural Representation of Human-Object Interactions? - Aditi Jha _et al_, `CCN 2019`.
- Detecting and Recognizing Human-Object Interactions - Georgia Gkioxari _et al_, `CVPR 2018`.
- Learning Human-Object Interactions by Graph Parsing Neural Networks - Siyuan Qi _et al_, `ECCV 2018`. [[code]](https://github.com/SiyuanQi/gpnn)
- Pairwise Body-Part Attention for Recognizing Human-Object Interactions - Hao-Shu Fang _et al_, `ECCV 2018`.
- iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection - Chen Gao _et al_, `BMVC 2018`. [[code]](https://github.com/vt-vl-lab/iCAN)
- Interact as You Intend: Intention-Driven Human-Object Interaction Detection - Bingjie Xu _et al_, `TMM 2018`.
- Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF - Tuan Do _et al_, `ESANN 2017`.
- Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering - Arun Mallya _et al_, `ECCV 2016`.
- Human Centred Object Co-Segmentation - Chenxia Wu _et al_, `arXiv 2016`.
- Recognising Human-Object Interaction via Exemplar based Modelling - Jian-Fang Hu _et al_, `ICCV 2013`.
- Learning person-object interactions for action recognition in still images - Vincent Delaitre _et al_, `NIPS 2011`.
- Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities - Bangpeng Yao _et al_, `CVPR 2010`.
- Discriminative models for static human-object interactions - Chaitanya Desai _et al_, `CVPRW 2010`.
- Grounded Human-Object Interaction Hotspots from Video - Tushar Nagarajan _et al_, `ICCV 2019`. [[code]](https://github.com/Tushar-N/interaction-hotspots)
- iMapper: Interaction-guided Joint Scene and Human Motion Mapping from Monocular Videos - Aron Monszpart _et al_, `SIGGRAPH 2019`.
- Causality Inspired Retrieval of Human-object Interactions from Video - Liting Zhou _et al_, `CBMI 2019`.
- Zero-Shot Generation of Human-Object Interaction Videos - Megha Nawhal _et al_, `arXiv 2019`.
- Forecasting Human Object Interaction: Joint Prediction of Motor Attention and Egocentric Activity - Miao Liu _et al_, `arXiv 2019`.
- Attend and Interact: Higher-Order Object Interactions for Video Understanding - Chih-Yao Ma _et al_, `CVPR 2018`.
- Learning to Detect Human-Object Interactions - Yu-Wei Chao _et al_, `WACV 2018`. [[code]](https://github.com/ywchao/ho-rcnn)
-
Person in Context(PIC)
- Visual Relationship Prediction via Label Clustering and Incorporation of Depth Information - Hsuan-Kung Yang _et al_, `ECCVW 2018`.
-
-
Workshops
-
Other Applications
- ECCV PIC 2018 Workshop
- ICCV SGRL 2019 Workshop
- ICML LRG 2019 Workshop - Structured Representations
-
-
Challenges
-
Other Applications
- Person in Context Challenge - [Dataset](http://picdataset.com/challenge/dataset/download/) - [Baseline](https://github.com/siliu-group/pic-challenge-baseline)
- ACM MM 2019 Video Relation Understanding (VRU) Challenge - [Dataset](https://xdshang.github.io/docs/vidor.html)
-
-
Licenses
-
Other Applications
-
Categories
Sub Categories
- 2-D Scene Graph: 97
- Image Retrieval: 43
- Human-Object Interaction(HOI): 40
- Image Caption: 31
- Other Applications: 20
- Datasets: 16
- Image Generation - Content-based Image Retrieval(CBIR): 16
- Visual Question Answering: 15
- Referring Expression Comprehension - Visual Grounding: 10
- Visual Reasoning: 9
- HCR Datasets: 7
- Spatio-Temporal Scene Graph: 4
- 3-D Scene Graph: 3
- Other Works: 3
- Generate Scene Graph from Textual Description: 2
- Person in Context(PIC): 1