awesome-scene-graph

A curated list of scene graph generation and related area resources. :-)
https://github.com/mqjyl/awesome-scene-graph

Related High-level Vision-and-Language Tasks
- Image Retrieval
  - Learning visual features for relational CBIR - Nicola Messina _et al_, `MIR 2019`.
  - Cross-modal Scene Graph Matching for Relationship-aware Image-Text Retrieval - Sijin Wang _et al_, `WACV 2020`.
  - Scene Graph based Image Retrieval -- A case study on the CLEVR Dataset - Sahana Ramnath _et al_, `ICCVW 2019`.
  - Compact Scene Graphs for Layout Composition and Patch Retrieval - Subarna Tripathi _et al_, `CVPRW 2019`.
  - Revisiting Visual Grounding - Erik Conser _et al_, `ACL 2019`.
  - Learning Relationship-aware Visual Features - Nicola Messina _et al_, `ECCVW 2018`. [[code]](https://github.com/mesnico/learning-relationship-aware-visual-features)
  - Image retrieval by dense caption reasoning - Xinru Wei _et al_, `VCIP 2017`.
  - Image retrieval using scene graphs - Justin Johnson _et al_, `CVPR 2015`.
  - Generating Semantically Precise Scene Graphs from Textual Descriptions for Improved Image Retrieval - Sebastian Schuster _et al_, `EMNLP 2015`.
  - Deep Learning of Binary Hash Codes for Fast Image Retrieval - Kevin Lin _et al_, `CVPRW 2015`. [[code]](https://github.com/kevinlin311tw/caffe-cvprw15)
  - Beyond instance-level image retrieval: Leveraging captions to learn a global visual representation for semantic retrieval - Albert Gordo _et al_, `CVPR 2017`.
  - PatternNet: A Benchmark Dataset for Performance Evaluation of Remote Sensing Image Retrieval - Weixun Zhou _et al_, `ISPRS 2018`. [[download]](https://sites.google.com/view/zhouwx/dataset)
  - Large-Scale Image Retrieval with Attentive Deep Local Features - Hyeonwoo Noh _et al_, `ICCV 2017`. [[download]](https://github.com/tensorflow/models/tree/master/research/delf)
  - Google Landmarks Dataset v2 -- A Large-Scale Benchmark for Instance-Level Recognition and Retrieval - Tobias Weyand _et al_, `CVPR 2020`. [[download]](https://github.com/cvdfoundation/google-landmark)
  - Representation Learning for Visual-Relational Knowledge Graphs - Daniel Oñoro-Rubio _et al_, `ARXIV 2017`.
- Image Caption
  - Exploring Semantic Relationships for Image Captioning without Parallel Data - Fenglin Liu _et al_, `ICDM 2019`.
  - Visual Genome: Connecting Language and Vision Using Crowdsourced Dense Image Annotations - Ranjay Krishna _et al_, `IJCV 2016`. [[download]](http://visualgenome.org/)
  - Learning visual relationship and context-aware attention for image captioning - Junbo Wang _et al_, `Pattern Recognition 2020`.
  - Object Relational Graph with Teacher-Recommended Learning for Video Captioning - Ziqi Zhang _et al_, `CVPR 2020`.
  - Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs - Shizhe Chen _et al_, `CVPR 2020`. [[code]](https://github.com/cshizhe/asg2cap)
  - Joint Commonsense and Relation Reasoning for Image and Video Captioning - Jingyi Hou _et al_, `AAAI 2020`.
  - Auto-Encoding Scene Graphs for Image Captioning - Xu Yang _et al_, `CVPR 2019`. [[code]](https://github.com/yangxuntu/SGAE)
  - Dense Relational Captioning: Triple-Stream Networks for Relationship-Based Captioning - Dong-Jin Kim _et al_, `CVPR 2019`. [[code]](https://github.com/Dong-JinKim/DenseRelationalCaptioning)
  - Visual Semantic Reasoning for Image-Text Matching - Kunpeng Li _et al_, `ICCV 2019`. [[code]](https://github.com/KunpengLi1994/VSRN)
  - Unpaired Image Captioning via Scene Graph Alignments - Jiuxiang Gu _et al_, `ICCV 2019`. [[code]](https://github.com/gujiuxiang/unpaired_image_captioning)
  - Expressing Visual Relationships via Language - Hao Tan _et al_, `ACL 2019`. [[code]](https://github.com/airsplay/VisualRelationships)
  - On the Role of Scene Graphs in Image Captioning - Dalin Wang _et al_, `ACL 2019`.
  - Adversarial Adaptation of Scene Graph Models for Understanding Civic Issues - Shanu Kumar _et al_, `WWW 2019`. [[code]](https://github.com/Sshanu/civic_issue_dataset)
  - Aligning Linguistic Words and Visual Semantic Units for Image Captioning - Longteng Guo _et al_, `ACM MM 2019`. [[code]](https://github.com/ltguo19/VSUA-Captioning)
  - Better Understanding Hierarchical Visual Relationship for Image Caption - Zheng-cong Fei _et al_, `NeurIPS 2019 workshop on New In ML`.
  - Visual Relationship Embedding Network for Image Paragraph Generation - Wenbin Che _et al_, `TMM 2019`.
  - Know More Say Less: Image Captioning Based on Scene Graphs - Xiangyang Li _et al_, `TMM 2019`.
  - Visual Relationship Attention for Image Captioning - Zongjian Zhang _et al_, `IJCNN 2019`.
  - TPsgtR: Neural-Symbolic Tensor Product Scene-Graph-Triplet Representation for Image Captioning - Chiranjib Sur _et al_, `ARXIV 2019`.
  - Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators - Kuang-Huei Lee _et al_, `ARXIV 2019`.
  - Relational Reasoning using Prior Knowledge for Visual Captioning - Jingyi Hou _et al_, `ARXIV 2019`.
  - Exploring Visual Relationship for Image Captioning - Ting Yao _et al_, `ECCV 2018`.
  - Improved Image Description Via Embedded Object Structure Graph and Semantic Feature Matching - Li Ren _et al_, `ISM 2018`.
  - Sports Video Captioning by Attentive Motion Representation based Hierarchical Recurrent Neural Networks - Mengshi Qi _et al_, `2018`.
  - Image Captioning and Visual Question Answering Based on Attributes and External Knowledge - Qi Wu _et al_, `TPAMI 2017`.
  - SPICE: Semantic Propositional Image Caption Evaluation - Peter Anderson _et al_, `ECCV 2016`. [[code]](https://github.com/peteanderson80/SPICE)
  - Flickr30k Entities: Collecting Region-to-Phrase Correspondences for Richer Image-to-Sentence Models - Bryan A. Plummer _et al_, `IJCV 2017`. [[download]](http://bryanplummer.com/Flickr30kEntities/)
  - Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics - Micah Hodosh _et al_, `IJCAI 2013`. [[download]](http://academictorrents.com/details/9dea07ba660a722ae1008c4c8afdd303b6f6e53b)
  - The IAPR TC-12 Benchmark: A New Evaluation Resource for Visual Information Systems - Michael Grubinger _et al_, `International Workshop OntoImage 2006`. [[download]](https://www.imageclef.org/photodata)
- Referring Expression Comprehension - Visual Grounding
  - Cross-Modal Relationship Inference for Grounding Referring Expressions - Sibei Yang _et al_, `CVPR 2019`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/cmrin_models)
  - Relationship-Embedded Representation Learning for Grounding Referring Expressions - Sibei Yang _et al_, `TPAMI 2020`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/cmrin_models)
  - Referring Expression Comprehension with Semantic Visual Relationship and Word Mapping - Chao Zhang _et al_, `ACM MM 2019`.
  - Learning to Relate from Captions and Bounding Boxes - Sarthak Garg _et al_, `ACL 2019`.
  - Joint Visual Grounding with Language Scene Graphs - Daqing Liu _et al_, `ARXIV 2019`.
  - Modeling Relationships in Referential Expressions With Compositional Modular Networks - Ronghang Hu _et al_, `CVPR 2017`. [[code]](https://github.com/ronghanghu/cmn)
  - Graph-Structured Referring Expression Reasoning in The Wild - Sibei Yang _et al_, `CVPR 2020`. [[code]](https://github.com/sibeiyang/sgmn)
  - Dynamic Graph Attention for Referring Expression Comprehension - Sibei Yang _et al_, `ICCV 2019`. [[code]](https://github.com/sibeiyang/sgmn/tree/master/lib/dga_models)
  - Grounding Referring Expressions in Images by Variational Context - Hanwang Zhang _et al_, `CVPR 2018`. [[code]](https://github.com/yuleiniu/vc)
  - Modeling Context in Referring Expressions - Licheng Yu _et al_, `ECCV 2016`. [[download]](https://github.com/lichengunc/refer)
- Visual Question Answering
  - DualVD: An Adaptive Dual Encoding Model for Deep Visual Understanding in Visual Dialogue - Xiaoze Jiang _et al_, `AAAI 2020`. [[code]](https://github.com/JXZe/DualVD)
  - Visual Query Answering by Entity-Attribute Graph Matching and Reasoning - Peixi Xiong _et al_, `CVPR 2019`.
  - Relation-Aware Graph Attention Network for Visual Question Answering - Linjie Li _et al_, `ICCV 2019`. [[code]](https://github.com/linjieli222/VQA_ReGAT)
  - Multi-interaction Network with Object Relation for Video Question Answering - Weike Jin _et al_, `ACM MM 2019`.
  - An Empirical Study on Leveraging Scene Graphs for Visual Question Answering - Cheng Zhang _et al_, `BMVC 2019`. [[code]](https://github.com/czhang0528/scene-graphs-vqa)
  - Generating Natural Language Explanations for Visual Question Answering using Scene Graphs and Visual Attention - Shalini Ghosh _et al_, `ARXIV 2019`.
  - R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering - Pan Lu _et al_, `SIGKDD 2018`. [[code]](https://github.com/BierOne/relation-vqa)
  - Scene Graph Reasoning with Prior Visual Relationship for Visual Question Answering - Zhuoqian Yang _et al_, `ARXIV 2018`.
  - VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases - Fereshteh Sadeghi _et al_, `CVPR 2015`. [[code]](https://github.com/fsadeghi/VisKE)
  - Image Question Answering using Convolutional Neural Network with Dynamic Parameter Prediction - Hyeonwoo Noh _et al_, `CVPR 2016`. [[code]](https://github.com/HyeonwooNoh/DPPnet)
  - Ask Your Neurons: A Neural-based Approach to Answering Questions about Images - Mateusz Malinowski _et al_, `ICCV 2015`.
  - VQA: Visual question answering - Aishwarya Agrawal _et al_, `ICCV 2015`. [[download]](https://visualqa.org/vqa_v1_download.html)
  - Making the V in VQA matter: Elevating the role of image understanding in Visual Question Answering - Yash Goyal _et al_, `CVPR 2017`. [[download]](https://visualqa.org/download.html)
  - Image Question Answering: A Visual Semantic Embedding Model and a New Dataset - Mengye Ren _et al_, `ICML 2015`. or [Exploring Models and Data for Image Question Answering](https://arxiv.org/abs/1505.02074) - Mengye Ren _et al_, `NIPS 2015`. [[download]](http://www.cs.toronto.edu/~mren/imageqa/data/cocoqa/)
- Visual Reasoning
  - Language-Conditioned Graph Networks for Relational Reasoning - Ronghang Hu _et al_, `ICCV 2019`. [[code]](https://github.com/ronghanghu/lcgn)
  - Explainable and Explicit Visual Reasoning over Scene Graphs - Jiaxin Shi _et al_, `CVPR 2019`. [[code]](https://github.com/shijx12/XNM-Net)
  - Referring Relationships - Ranjay Krishna _et al_, `CVPR 2018`. [[code]](https://github.com/StanfordVL/ReferringRelationships)
  - Broadcasting Convolutional Network for Visual Relational Reasoning - Simyung Chang _et al_, `ECCV 2018`.
  - Object level Visual Reasoning in Videos - Fabien Baradel _et al_, `ECCV 2018`. [[code]](https://github.com/fabienbaradel/object_level_visual_reasoning)
  - GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering - Drew A. Hudson _et al_, `CVPR 2019`. [[download]](https://cs.stanford.edu/people/dorarad/gqa/index.html)
  - CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning - Justin Johnson _et al_, `CVPR 2017`. [[download]](https://cs.stanford.edu/people/jcjohns/clevr/) [[code]](https://github.com/facebookresearch/clevr-dataset-gen)
  - Differentiable Scene Graphs - Moshiko Raboh _et al_, `WACV 2020`.
  - A simple neural network module for relational reasoning - Adam Santoro _et al_, `NIPS 2017`. [[code]](https://github.com/clvrai/Relation-Network-Tensorflow)
- Image Generation - Content-based Image Retrieval(CBIR)
  - PasteGAN: A Semi-Parametric Method to Generate Image from Scene Graph - Yikang Li _et al_, `NIPS 2019`. [[code]](https://github.com/yikang-li/PasteGAN)
  - Specifying Object Attributes and Relations in Interactive Scene Generation - Oron Ashual _et al_, `ICCV 2019`. [[code]](https://github.com/ashual/scene_generation)
  - Triplet-Aware Scene Graph Embeddings - Brigit Schroeder _et al_, `ICCVW 2019`.
  - Heuristics for Image Generation from Scene Graphs - Subarna Tripathi _et al_, `ICLR 2019`.
  - Interactive Image Generation Using Scene Graphs - Gaurav Mittal _et al_, `ICLR 2019`.
  - Visual-Relation Conscious Image Generation from Structured-Text - Duc Minh Vo _et al_, `ARXIV 2019`.
  - Using Scene Graph Context to Improve Image Generation - Subarna Tripathi _et al_, `ARXIV 2019`.
  - Learning Canonical Representations for Scene Graph to Image Generation - Roei Herzig _et al_, `ARXIV 2019`.
  - Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation - Hongdong Zheng _et al_, `ARXIV 2019`.
  - Image Generation from Scene Graphs - Justin Johnson _et al_, `CVPR 2018`. [[code]](https://github.com/google/sg2im)
  - Image Generation From Small Datasets via Batch Statistics Adaptation - Atsuhiro Noguchi _et al_, `ICCV 2019`. [[code]](https://github.com/nogu-atsu/small-dataset-image-generation)
  - Text2Scene: Generating Compositional Scenes from Textual Descriptions - Fuwen Tan _et al_, `CVPR 2019`. [[code]](https://github.com/uvavision/Text2Scene)
  - Unsupervised Cross-Domain Image Generation - Yaniv Taigman _et al_, `ICLR 2017 conference submission`. [[code]](https://github.com/yunjey/domain-transfer-network)
  - Generative Visual Manipulation on the Natural Image Manifold - Jun-Yan Zhu _et al_, `ECCV 2016`. [[code]](https://github.com/junyanz/iGAN)
  - Attribute2Image: Conditional Image Generation from Visual Attributes - Xinchen Yan _et al_, `ECCV 2016`. [[code]](https://github.com/xcyan/eccv16_attr2img)
  - Microsoft COCO: Common objects in context - Tsung-Yi Lin _et al_, `ECCV 2014`. [[download]](http://cocodataset.org/#home)
- Other Applications
  - Semantic Image Manipulation Using Scene Graphs - Helisa Dhamo _et al_, `CVPR 2020`.
  - SOGNet: Scene Overlap Graph Network for Panoptic Segmentation - Yibo Yang _et al_, `AAAI 2020`. [[code]](https://github.com/LaoYang1994/SOGNet)
  - ReLaText: Exploiting Visual Relationships for Arbitrary-Shaped Scene Text Detection with Graph Convolutional Networks - Chixiang Ma _et al_, `ARXIV 2020`.
  - Event Detection with Relation-Aware Graph Convolutional Neural Networks - Shiyao Cui _et al_, `ARXIV 2020`.
  - SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation - Yang Zhou _et al_, `ICCV 2019`. [[code]](https://github.com/yzhou359/3DIndoor-SceneGraphNet)
  - Seq-SG2SL: Inferring Semantic Layout from Scene Graph Through Sequence to Sequence Learning - Boren Li _et al_, `ICCV 2019`.
  - PlanIT: Planning and Instantiating Indoor Scenes with Relation Graph and Spatial Prior Networks - Kai Wang _et al_, `TOG 2019`.
  - Hierarchical Relational Networks for Group Activity Recognition and Retrieval - Mostafa S. Ibrahim _et al_, `ECCV 2018`. [[code]](https://github.com/mostafa-saad/hierarchical-relational-network)
  - Scene Graphs for Interpretable Video Anomaly Classification - Nicholas F. Y. Chen _et al_, `NIPS 2018 ViGIL Workshop`.
  - Towards a Domain Specific Language for a Scene Graph based Robotic World Model - Sebastian Blumenthal _et al_, `DSLRob 2013`.
Human-centric Relation
- HCR Datasets
  - Visual Semantic Role Labeling - Saurabh Gupta _et al_, `ARXIV 2015`. [[download]](https://github.com/s-gupta/v-coco)
  - A Benchmark for Recognizing Human-Object Interactions in Images - Dieu-Thu Le _et al_, `ACL 2014`. [[download]](http://disi.unitn.it/~dle/dataset/TUHOI.html)
  - The "something something" video database for learning and evaluating visual common sense - Raghav Goyal _et al_, `ICCV 2017`. [[download]](https://20bn.com/datasets/something-something)
  - HICO: A Benchmark for Recognizing Human-Object Interactions in Images - Yu-Wei Chao _et al_, `ICCV 2015`. [[download]](http://www-personal.umich.edu/~ywchao/hico/)
- Human-Object Interaction(HOI)
  - Interact as You Intend: Intention-Driven Human-Object Interaction Detection - Bingjie Xu _et al_, `TMM 2018`.
  - VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions - Oytun Ulutan _et al_, `CVPR 2020`. [[code]](https://github.com/ASMIftekhar/VSGNet)
  - Learning Human-Object Interaction Detection using Interaction Points - Tiancai Wang _et al_, `CVPR 2020`. [[code]](https://github.com/vaesl/IP-Net)
  - Detailed 2D-3D Joint Representation for Human-Object Interaction - Yong-Lu Li _et al_, `CVPR 2020`. [[code]](https://github.com/DirtyHarryLYL/DJ-RN)
  - Cascaded Human-Object Interaction Recognition - Tianfei Zhou _et al_, `CVPR 2020`. [[code]](https://github.com/tfzhou/C-HOI)
  - PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection - Yue Liao _et al_, `CVPR 2020`. [[code]](https://github.com/YueLiao/PPDM)
  - Detecting Human-Object Interactions via Functional Generalization - Ankan Bansal _et al_, `AAAI 2020`.
  - Classifying All Interacting Pairs in a Single Shot - Sanaa Chafik _et al_, `WACV 2020`.
  - Visual-Semantic Graph Attention Network for Human-Object Interaction Detection - Zhijun Liang _et al_, `ARXIV 2020`.
  - Spatial Priming for Detecting Human-Object Interactions - Ankan Bansal _et al_, `ARXIV 2020`.
  - GID-Net: Detecting Human-Object Interaction with Global and Instance Dependency - Dongming Yang _et al_, `ARXIV 2020`.
  - Reasoning About Human-Object Interactions Through Dual Attention Networks - Tete Xiao _et al_, `ICCV 2019`.
  - Pose-aware Multi-level Feature Network for Human Object Interaction Detection - Bo Wan _et al_, `ICCV 2019`. [[code]](https://github.com/bobwan1995/PMFNet)
  - Deep Contextual Attention for Human-Object Interaction Detection - Tiancai Wang _et al_, `ICCV 2019`.
  - Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning - Lifeng Fan _et al_, `ICCV 2019`. [[code]](https://github.com/LifengFan/Human-Gaze-Communication)
  - Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense - Yixin Chen _et al_, `ICCV 2019`. [[code]](https://github.com/yixchen/holistic_scene_human)
  - Transferable Interactiveness Knowledge for Human-Object Interaction Detection - Yong-Lu Li _et al_, `CVPR 2019`. [[code]](https://github.com/DirtyHarryLYL/Transferable-Interactiveness-Network)
  - Do Deep Neural Networks Model Nonlinear Compositionality in the Neural Representation of Human-Object Interactions? - Aditi Jha _et al_, `CCN 2019`.
  - Detecting and Recognizing Human-Object Interactions - Georgia Gkioxari _et al_, `CVPR 2018`.
  - Learning Human-Object Interactions by Graph Parsing Neural Networks - Siyuan Qi _et al_, `ECCV 2018`. [[code]](https://github.com/SiyuanQi/gpnn)
  - Pairwise Body-Part Attention for Recognizing Human-Object Interactions - Hao-Shu Fang _et al_, `ECCV 2018`.
  - iCAN: Instance-Centric Attention Network for Human-Object Interaction Detection - Chen Gao _et al_, `BMVC 2018`. [[code]](https://github.com/vt-vl-lab/iCAN)
  - Fine-grained Event Learning of Human-Object Interaction with LSTM-CRF - Tuan Do _et al_, `ESANN 2017`.
  - Learning Models for Actions and Person-Object Interactions with Transfer to Question Answering - Arun Mallya _et al_, `ECCV 2016`.
  - Human Centred Object Co-Segmentation - Chenxia Wu _et al_, `ARXIV 2016`.
  - Recognising Human-Object Interaction via Exemplar based Modelling - Jian-Fang Hu _et al_, `ICCV 2013`.
  - Learning person-object interactions for action recognition in still images - Vincent Delaitre _et al_, `NIPS 2011`.
  - Modeling Mutual Context of Object and Human Pose in Human-Object Interaction Activities - Bangpeng Yao _et al_, `CVPR 2010`.
  - Discriminative models for static human-object interactions - Chaitanya Desai _et al_, `CVPRW 2010`.
  - Grounded Human-Object Interaction Hotspots from Video - Tushar Nagarajan _et al_, `ICCV 2019`. [[code]](https://github.com/Tushar-N/interaction-hotspots)
  - iMapper: Interaction-guided Joint Scene and Human Motion Mapping from Monocular Videos - Aron Monszpart _et al_, `Siggraph 2019`.
  - Causality Inspired Retrieval of Human-object Interactions from Video - Liting Zhou _et al_, `CBMI 2019`.
  - Zero-Shot Generation of Human-Object Interaction Videos - Megha Nawhal _et al_, `ARXIV 2019`.
  - Forecasting Human Object Interaction: Joint Prediction of Motor Attention and Egocentric Activity - Miao Liu _et al_, `ARXIV 2019`.
  - Attend and Interact: Higher-Order Object Interactions for Video Understanding - Chih-Yao Ma _et al_, `CVPR 2018`.
  - Learning to Detect Human-Object Interactions - Yu-Wei Chao _et al_, `WACV 2018`. [[code]](https://github.com/ywchao/ho-rcnn)
- Person in Context(PIC)
  - Visual Relationship Prediction via Label Clustering and Incorporation of Depth Information - Hsuan-Kung Yang _et al_, `ECCVW 2018`.
Workshops
- Other Applications
  - ECCV PIC 2018 Workshop
  - ICCV SGRL 2019 Workshop
  - ICML LRG 2019 Workshop - Structured Representations
Challenges
- Other Applications
  - Person in Context Challenge - [Dataset](http://picdataset.com/challenge/dataset/download/) - [Baseline](https://github.com/siliu-group/pic-challenge-baseline)
  - ACM MM 2019 Video Relation Understanding (VRU) Challenge - [Dataset](https://xdshang.github.io/docs/vidor.html)
Scene Graph Generation
- 2-D Scene Graph
  - Unbiased Scene Graph Generation from Biased Training - Kaihua Tang _et al_, `CVPR 2020`. [[code]](https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch)
  - One-shot Scene Graph Generation - Yuyu Gao _et al_, `ACM MM 2020`.
  - Contextual Translation Embedding for Visual Relationship Detection and Scene Graph Generation - Zih-Siou Hung _et al_, `TPAMI 2020`.
  - Leveraging Auxiliary Text for Deep Recognition of Unseen Visual Relationships - Gal Sadeh Kenigsfield _et al_, `ICLR 2020`.
  - Weakly Supervised Visual Semantic Parsing - Alireza Zareian _et al_, `CVPR 2020`.
  - GPS-Net: Graph Property Sensing Network for Scene Graph Generation - Xin Lin _et al_, `CVPR 2020`. [[code]](https://github.com/taksau/GPS-Net)
  - Deep Generative Probabilistic Graph Neural Networks for Scene Graph Generation - Mahmoud Khademi _et al_, `AAAI 2020`.
  - PCPL: Predicate-Correlation Perception Learning for Unbiased Scene Graph Generation - Shaotian Yan _et al_, `ACM MM 2020`.
  - HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation - Meng Wei _et al_, `ACM MM 2020`.
  - Unbiased Scene Graph Generation via Rich and Fair Semantic Extraction - Bin Wen _et al_, `ARXIV 2020`.
  - Long-tail Visual Relationship Recognition with a Visiolinguistic Hubless Loss - Sherif Abdelkarim _et al_, `ARXIV 2020`.
  - Bridging Knowledge Graphs to Generate Scene Graphs - Alireza Zareian _et al_, `ARXIV 2020`.
  - NODIS: Neural Ordinary Differential Scene Understanding - Cong Yuren _et al_, `ARXIV 2020`.
  - AVR: Attention based Salient Visual Relationship Detection - Jianming Lv _et al_, `ARXIV 2020`.
  - Large-Scale Visual Relationship Understanding - Ji Zhang _et al_, `AAAI 2019`. [[code]](https://github.com/facebookresearch/Large-Scale-VRD)
  - Learning to Compose Dynamic Tree Structures for Visual Contexts - Kaihua Tang _et al_, `CVPR 2019 Oral`. [[code]](https://github.com/KaihuaTang/VCTree-Scene-Graph-Generation)
  - Counterfactual Critic Multi-Agent Training for Scene Graph Generation - Long Chen _et al_, `ICCV 2019 Oral`.
  - On Exploring Undetermined Relationships for Visual Relationship Detection - Yibing Zhan _et al_, `CVPR 2019`. [[code]](https://github.com/Atmegal/MFURLN-CVPR-2019-relationship-detection-method)
  - Exploring Context and Visual Pattern of Relationship for Scene Graph Generation - Wenbin Wang _et al_, `CVPR 2019`.
  - Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation - Hongdong Zheng _et al_, `arXiv 2019`.
  - The Limited Multi-Label Projection Layer - Brandon Amos _et al_, `arXiv 2019`. [[code]](https://github.com/locuslab/lml)
  - Detecting Visual Relationships Using Box Attention - Alexander Kolesnikov _et al_, `ICCVW 2019`.
  - Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction - Apoorva Dornadula _et al_, `ICCVW 2019`.
  - Attention-Translation-Relation Network for Scalable Scene Graph Generation - Nikolaos Gkanatsios _et al_, `ICCVW 2019`. [[code]](https://github.com/deeplab-ai/atr-net)
  - Attentive Relational Networks for Mapping Images to Scene Graphs - Mengshi Qi _et al_, `CVPR 2019`.
  - Visual Relation Detection with Multi-Level Attention - Sipeng Zheng _et al_, `ACM MM 2019`.
  - Visual Relationship Recognition via Language and Position Guided Attention - Hao Zhou _et al_, `ICASSP 2019`.
  - Relationship Detection Based on Object Semantic Inference and Attention Mechanisms - Liang Zhang _et al_, `ICMR 2019`.
  - Natural Language Guided Visual Relationship Detection - Wentong Liao _et al_, `CVPR 2019`.
  - Knowledge-Embedded Routing Network for Scene Graph Generation - Tianshui Chen _et al_, `CVPR 2019`. [[code]](https://github.com/yuweihao/KERN)
  - Soft Transfer Learning via Gradient Diagnosis for Visual Relationship Detection - Diqi Chen _et al_, `WACV 2019`.
  - Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation - Ivan Donadello _et al_, `IJCNN 2019`. [[code]](https://github.com/ivanDonadello/Visual-Relationship-Detection-LTN)
  - Visual Relationship Detection with Low Rank Non-Negative Tensor Decomposition - Mohammed Haroon Dupty _et al_, `arXiv 2019`.
  - Relational Reasoning using Prior Knowledge for Visual Captioning - Jingyi Hou _et al_, `arXiv 2019`.
  - Detecting Unseen Visual Relations Using Analogies - Julia Peyre _et al_, `ICCV 2019`.
  - BLOCK: Bilinear Superdiagonal Fusion for Visual Question Answering and Visual Relationship Detection - Hedi Ben-younes _et al_, `AAAI 2019`. [[code]](https://github.com/Cadene/block.bootstrap.pytorch)
  - On Class Imbalance and Background Filtering in Visual Relationship Detection - Alessio Sarullo _et al_, `arXiv 2019`.
  - Support Relation Analysis for Objects in Multiple View RGB-D Images - Peng Zhang _et al_, `IJCAIW QR 2019`.
  - Improving Visual Relation Detection using Depth Maps - Sahand Sharifzadeh _et al_, `arXiv 2019`. [[code]](https://github.com/Sina-Baharlou/Depth-VRD)
  - MR-NET: Exploiting Mutual Relation for Visual Relationship Detection - Yi Bin _et al_, `AAAI 2019`.
  - Scene Graph Prediction with Limited Labels - Vincent S. Chen _et al_, `ICCV 2019`. [[code]](https://github.com/vincentschen/limited-label-scene-graphs)
  - Graphical Contrastive Losses for Scene Graph Parsing - Ji Zhang _et al_, `CVPR 2019`. [[code]](https://github.com/NVIDIA/ContrastiveLosses4VRD)
  - Neural Message Passing for Visual Relationship Detection - Yue Hu _et al_, `ICML LRG Workshop 2019`. [[code]](https://github.com/PhyllisH/NMP)
  - PANet: A Context Based Predicate Association Network for Scene Graph Generation - Yunian Chen _et al_, `ICME 2019`.
  - Visual Relationship Detection with Relative Location Mining - Hao Zhou _et al_, `ACM MM 2019`.
  - Visual relationship detection based on bidirectional recurrent neural network - Yibo Dai _et al_, `Multimedia Tools and Applications 2019`.
  - Exploring the Semantics for Visual Relationship Detection - Wentong Liao _et al_, `arXiv 2019`.
  - Optimising the Input Image to Improve Visual Relationship Detection - Noel Mizzi _et al_, `arXiv 2019`.
  - Deeply Supervised Multimodal Attentional Translation Embeddings for Visual Relationship Detection - Nikolaos Gkanatsios _et al_, `arXiv 2019`.
  - Learning Effective Visual Relationship Detector on 1 GPU - Yichao Lu _et al_, `arXiv 2019`.
  - Graph R-CNN for Scene Graph Generation - Jianwei Yang _et al_, `ECCV 2018`. [[code]](https://github.com/jwyang/graph-rcnn.pytorch)
  - LinkNet: Relational Embedding for Scene Graph - Sanghyun Woo _et al_, `NIPS 2018`. [[code]](https://github.com/jiayan97/linknet-pytorch)
  - Generating Triples with Adversarial Networks for Scene Graph Construction - Matthew Klawonn _et al_, `AAAI 2018`.
  - Scene Graph Generation Based on Node-Relation Context Module - Xin Lin _et al_, `ICONIP 2018`.
  - Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation - Yikang Li _et al_, `ECCV 2018`. [[code]](https://github.com/yikang-li/FactorizableNet)
  - Neural Motifs: Scene Graph Parsing with Global Context - Rowan Zellers _et al_, `CVPR 2018`. [[code]](https://github.com/rowanz/neural-motifs)
  - Zoom-Net: Mining Deep Feature Interactions for Visual Relationship Recognition - Guojun Yin _et al_, `ECCV 2018`. [[code]](https://github.com/gjyin91/ZoomNet)
  - Deep Structured Learning for Visual Relationship Detection - Yaohui Zhu _et al_, `AAAI 2018`.
  - Visual Relationship Detection Using Joint Visual-Semantic Embedding - Binglin Li _et al_, `ICPR 2018`.
  - Object Relation Detection Based on One-shot Learning - Li Zhou _et al_, `arXiv 2018`.
  - A Problem Reduction Approach for Visual Relationships Detection - Toshiyuki Fukuzawa _et al_, `ECCVW 2018`.
  - An Interpretable Model for Scene Graph Generation - Ji Zhang _et al_, `arXiv 2018`.
  - Shuffle-Then-Assemble: Learning Object-Agnostic Visual Relationship Features - Xu Yang _et al_, `ECCV 2018`. [[code]](https://github.com/yangxuntu/vrd)
  - Visual Relationship Detection with Deep Structural Ranking - Kongming Liang _et al_, `AAAI 2018`. [[code]](https://github.com/GriffinLiang/vrd-dsr)
  - Mapping Images to Scene Graphs with Permutation-Invariant Structured Prediction - Roei Herzig _et al_, `NIPS 2018`. [[code]](https://github.com/shikorab/SceneGraph)
  - Visual Relationship Detection with Language prior and Softmax - Jaewon Jung _et al_, `IPAS 2018`. [[code]](https://github.com/pranoyr/visual-relationship-detection)
  - Visual Relationship Detection Based on Guided Proposals and Semantic Knowledge Distillation - François Plesse _et al_, `ICME 2018`.
  - Context-Dependent Diffusion Network for Visual Relationship Detection - Zhen Cui _et al_, `ACM MM 2018`. [[code]](https://github.com/pranoyr/visual-relationship-detection)
  - Region-Object Relevance-Guided Visual Relationship Detection - Yusuke Goutsu _et al_, `BMVC 2018`.
  - Deep Image Understanding Using Multilayered Contexts - Donghyeop Shin _et al_, `MPE 2018`.
  - Scene Graph Generation via Conditional Random Fields - Weilin Cong _et al_, `arXiv 2018`.
  - Scene Graph Generation by Iterative Message Passing - Danfei Xu _et al_, `CVPR 2017`. [[code]](https://github.com/danfeiX/scene-graph-TF-release)
  - Scene Graph Generation from Objects, Phrases and Region Captions - Yikang Li _et al_, `ICCV 2017`. [[code]](https://github.com/yikang-li/MSDN)
  - ViP-CNN: Visual Phrase Guided Convolutional Neural Network - Yikang Li _et al_, `CVPR 2017`.
  - Towards Context-Aware Interaction Recognition for Visual Relationship Detection - Bohan Zhuang _et al_, `ICCV 2017`.
  - Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues - Bryan A. Plummer _et al_, `ICCV 2017`. [[code]](https://github.com/BryanPlummer/pl-clc)
  - Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection - Xiaodan Liang _et al_, `CVPR 2017`. [[code]](https://github.com/nexusapoorvacus/DeepVariationStructuredRL)
  - Visual Relationship Detection with Internal and External Linguistic Knowledge Distillation - Ruichi Yu _et al_, `ICCV 2017`.
  - Visual Translation Embedding Network for Visual Relation Detection - Hanwang Zhang _et al_, `CVPR 2017`. [[code]](https://github.com/YANYANYEAH/vtranse)
  - Pixels to Graphs by Associative Embedding - Alejandro Newell _et al_, `NIPS 2017`. [[code]](https://github.com/princeton-vl/px2graph)
  - PPR-FCN: Weakly Supervised Visual Relation Detection via Parallel Pairwise R-FCN - Hanwang Zhang _et al_, `ICCV 2017`. [[code]](https://github.com/jpeyre/unrel)
  - Visual relationship detection with object spatial distribution - Yaohui Zhu _et al_, `ICME 2017`.
  - On Support Relations and Semantic Scene Graphs - Michael Ying Yang _et al_, `ISPRS 2017`.
  - Improving Visual Relationship Detection using Semantic Modeling of Scene Descriptions - Stephan Baier _et al_, `ISWC 2017`.
  - Recurrent Visual Relationship Recognition with Triplet Unit - Kento Masui _et al_, `ISM 2017`.
  - Recognition using visual phrases - Mohammad Amin Sadeghi _et al_, `CVPR 2011`.
  - Memory-Based Network for Scene Graph with Unbalanced Relations - Weitao Wang _et al_, `ACM MM 2020`.
  - Part-Aware Interactive Learning for Scene Graph Generation - Hongshuo Tian _et al_, `ACM MM 2020`.
  - Visual Spatial Attention Network for Relationship Detection - Chaojun Han _et al_, `ACM MM 2019`.
  - Scene Graph Generation with External Knowledge and Image Reconstruction - Jiuxiang Gu _et al_, `CVPR 2019`. [[code]](https://github.com/arxrean/SGG_Ex_RC)
  - Learning Prototypes for Visual Relationship Detection - François Plesse _et al_, `CBMI 2018`.
  - Detecting Visual Relationships with Deep Relational Networks - Bo Dai _et al_, `CVPR 2017`. [[code]](https://github.com/doubledaibo/drnet_cvpr2017)
  - Recurrent Visual Relationship Recognition with Triplet Unit for Diversity - Kento Masui _et al_, `IJSC 2018`.
- Datasets
  - Weakly-Supervised Learning of Visual Relations - Julia Peyre _et al_, `ICCV 2017`. [[download]](https://www.di.ens.fr/willow/research/unrel/)
  - Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions - Johanna Wald _et al_, `CVPR 2020`. [[download]](https://3dssg.github.io/)
  - The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale - Alina Kuznetsova _et al_, `IJCV 2018`. [[download]](https://storage.googleapis.com/openimages/web/index.html)
  - Image Retrieval using Scene Graphs - Justin Johnson _et al_, `CVPR 2015`. [[download]](http://imagenet.stanford.edu/internal/jcjohns/scene_graphs/sg_dataset.zip)
  - Recognition Using Visual Phrases - Ali Farhadi _et al_, `CVPR 2011`. [[download]](http://vision.cs.uiuc.edu/phrasal/)
  - VrR-VG: Refocusing Visually-Relevant Relationships - Yuanzhi Liang _et al_, `ICCV 2019`. [[download]](http://vrr-vg.com/)
  - SpatialVOC2K: A Multilingual Dataset of Images with Annotations and Features for Spatial Relations between Objects - Anja Belz _et al_, `INLG 2018`. [[download]](https://github.com/muskata/SpatialVOC2K)
  - Image Description using Visual Dependency Representations - Desmond Elliott _et al_, `EMNLP 2013`. [[download]]
  - Combining geometric, textual and visual features for predicting prepositions in image descriptions - Arnau Ramisa _et al_, `EMNLP 2015`. [[download]]
  - SynthRel0: Towards a Diagnostic Dataset for Relational Representation Learning - Daniel Dorda _et al_, `ICCVW 2019`. [[download]]
  - Indoor Segmentation and Support Inference from RGBD Images - Nathan Silberman _et al_, `ECCV 2012`. [[download]](https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html)
  - 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera - Iro Armeni _et al_, `ICCV 2019`. [[download]](https://3dscenegraph.stanford.edu/)
  - Visual Relationship Detection with Language Priors - Cewu Lu _et al_, `ECCV 2016 Oral`. [[download]](https://cs.stanford.edu/people/ranjaykrishna/vrd/)
  - SpatialSense: An Adversarially Crowdsourced Benchmark for Spatial Relation Recognition - Kaiyu Yang _et al_, `ICCV 2019`. [[download]](https://drive.google.com/drive/folders/125fgCq-1YYfKOAxRxVEdmnyZ7sKWlyqZ?usp=sharing)
- Spatio-Temporal Scene Graph
  - Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph - Yao-Hung Hubert Tsai _et al_, `CVPR 2019`. [[code]](https://github.com/yaohungt/Gated-Spatio-Temporal-Energy-Graph)
  - Video Visual Relation Detection via Multi-modal Feature Fusion - Xu Sun _et al_, `ACM MM 2019`.
  - Action Genome: Actions as Composition of Spatio-temporal Scene Graphs - Jingwei Ji _et al_, `arXiv 2019`.
- 3-D Scene Graph
  - 3-D Scene Graph: A Sparse and Semantic Representation of Physical Environments for Intelligent Agents - Ue-Hwan Kim _et al_, `IEEE Transactions on Cybernetics 2019`. [[code]](https://github.com/Uehwan/3-D-Scene-Graph)
  - 3D Scene Graph: A Structure for Unified Semantics, 3D Space, and Camera - Iro Armeni _et al_, `ICCV 2019`. [[code]](https://github.com/StanfordVL/3DSceneGraph)
  - Visual Graphs from Motion (VGfM): Scene understanding with object geometry reasoning - Paul Gay _et al_, `ACCV 2018`. [[code]](https://github.com/paulgay/VGfM)
- Generate Scene Graph from Textual Description
  - Scene Graph Parsing as Dependency Parsing - Yu-Siang Wang _et al_, `NAACL 2018`. [[code]](https://github.com/vacancy/SceneGraphParser)
  - Scene Graph Parsing by Attention Graph - Martin Andrews _et al_, `NIPS 2018`.
- Other Works
  - Relationship Prediction for Scene Graph Generation - Uzair Navid Iftikhar _et al_, `2019`.
  - Joint Learning of Scene Graph Generation and Reasoning for Visual Question Answering Mid-term report - Arka Sadhu _et al_, `2019`.
  - Joint Embeddings of Scene Graphs and Images - Eugene Belilovsky _et al_, `2020`.
Licenses
- Other Applications
  - CC0 - Youliang Jiang


