https://github.com/52cv/iccv-2021-papers

Host: GitHub
URL: https://github.com/52cv/iccv-2021-papers
Owner: 52CV
Created: 2021-07-26T01:56:55.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2022-04-11T05:49:35.000Z (about 4 years ago)
Last Synced: 2025-02-24T05:14:36.510Z (over 1 year ago)
Size: 5.62 MB
Stars: 253
Watchers: 11
Forks: 40
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# ICCV 2021: Latest News and Accepted Papers/Code



  



Official website: http://iccv2021.thecvf.com/home


Conference dates: October 11-17, 2021


# :exclamation::exclamation::exclamation::star2::star2::star2:📗📗📗All papers accepted to ICCV 2021 have been announced, 1612 in total. To download them, reply "paper" to the 【我爱计算机视觉】 official account.

# :exclamation::exclamation::exclamation::star2::star2::star2:All papers have been roughly sorted into categories; see the list below.

## Survey papers from past years, organized by category: ↘️[CV-Surveys](https://github.com/52CV/CV-Surveys) (under construction)

## 2022 papers organized by category

↘️[CVPR-2022-Papers](https://github.com/52CV/CVPR-2022-Papers)

↘️[WACV-2022-Papers](https://github.com/52CV/WACV-2022-Papers)

## 2021 papers organized by category

↘️[ICCV-2021-Papers](https://github.com/52CV/ICCV-2021-Papers)

↘️[CVPR-2021-Papers](https://github.com/52CV/CVPR-2021-Papers)

## 2020 papers organized by category

↘️[CVPR-2020-Papers](https://github.com/52CV/CVPR-2020-Papers)

↘️[ECCV-2020-Papers](https://github.com/52CV/ECCV-2020-Papers)

# Table of Contents

|:dog:|:mouse:|:hamster:|:tiger:|
|------|------|------|------|
|[65.Optical Flow Estimation](#65)| | | |
|[61.Metric Learning](#61)|[62.Open-Set Recognition](#62)|[63.Data Augmentation](#63)|[64.Anomaly Detection](#64)|
|[57.Image Matching](#57)|[58.Computational Photography (optics, geometry, light-field imaging, computational photography)](#58)|[59.Graph Neural Networks](#59)|[60.Federated Learning](#60)|
|[53.Vision Localization](#53)|[54.Sketch recognition](#54)|[55.Activity Recognition](#55)|[56.Dataset](#56)|
|[49.Human-Object Interaction](#49)|[50.Continual Learning](#50)|[51.View Synthesis](#51)|[52.Vision-and-Language](#52)|
|[45.Image Caption](#45)|[46.Defect Detection](#46)|[47.NAS](#47)|[48.6DoF](#48)|
|[41.Out-of-Distribution Detection(OOD)](#41)|[42.Visual Representations Learning](#42)|[43.Dense Prediction](#43)|[44.Human motion prediction](#44)|
|[37.Multitask Learning](#37)|[38.Weakly/Semi-Supervised/Self-supervised/Unsupervised Learning](#38)|[39.Incremental Learning](#39)|[40.Metric Learning](#40)|
|[33.Remote Sensing Images](#33)|[34.Image Super-Resolution](#34)|[35.Quantization/Pruning/Knowledge Distillation/Model Compression (incl. model expansion and optimization)](#35)|[36.SLAM/AR/VR/Robotics](#36)|
|[29.Image Retrieval](#29)|[30.Image Generation/synthesis](#30)|[31.Style Transfer](#31)|[32.Speech](#32)|
|[25.Medical Image](#25)|[26.Image Processing](#26)|[27.Multi-label image recognition](#27)|[28.Contrastive Learning](#28)|
|[21.Active Learning](#21)|[22.GAN](#22)|[23.Gaze Estimation](#23)|[24.Face](#24)|
|[17.3D](#17)|[18.Transformers](#18)|[19.Self-Driving Vehicles](#19)|[20.Adversarial Learning](#20)|
|[13.Image Segmentation](#13)|[14.Object Detection](#14)|[15.Object Tracking](#15)|[16.Re-Identification](#16)|
|[9.Video](#9)|[10.OCR](#10)|[11.Visual Question Answering](#11)|[12.Image/Fine-Grained Classification](#12)|
|[5.Few-Shot/Zero-Shot Learning; Domain Generalization/Adaptation](#5)|[6.Point Cloud](#6)|[7.Scene Graph Generation](#7)|[8.Human Pose Estimation](#8)|
|[1.Other](#1)|[2.Sign Language](#2)|[3.Image Clustering](#3)|[4.Neural rendering](#4)|



## 65.Optical Flow Estimation

* [Separable Flow: Learning Motion Cost Volumes for Optical Flow Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Separable_Flow_Learning_Motion_Cost_Volumes_for_Optical_Flow_Estimation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/feihuzhang/SeparableFlow)

* [High-Resolution Optical Flow from 1D Attention and Correlation](https://arxiv.org/abs/2104.13918)
:open_mouth:oral:star:[code](https://github.com/haofeixu/flow1d)

* [GyroFlow: Gyroscope-Guided Unsupervised Optical Flow Learning](https://arxiv.org/abs/2103.13725)
:star:[code](https://github.com/megvii-research/GyroFlow)

* [Sensor-Guided Optical Flow](https://arxiv.org/abs/2109.15321)
:star:[code](https://github.com/mattpoggi/sensor-guided-flow)



## 64.Anomaly Detection

* Surface anomaly detection

  * [DRÆM – A discriminatively trained reconstruction embedding for surface anomaly detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zavrtanik_DRAEM_-_A_Discriminatively_Trained_Reconstruction_Embedding_for_Surface_Anomaly_ICCV_2021_paper.pdf)
:star:[code](https://github.com/VitjanZ/DRAEM)

* Anomaly detection

  * [Weakly Supervised Temporal Anomaly Segmentation with Dynamic Time Warping](https://arxiv.org/abs/2108.06816)

  * [Learning Unsupervised Metaformer for Anomaly Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Learning_Unsupervised_Metaformer_for_Anomaly_Detection_ICCV_2021_paper.pdf)
Addresses the classification or localization of image anomalies



## 63.Data Augmentation

* [DivAug: Plug-In Automated Data Augmentation With Explicit Diversity Maximization](https://arxiv.org/abs/2103.14545)
:star:[code](https://github.com/warai-0toko/DivAug)

* [TrivialAugment: Tuning-Free Yet State-of-the-Art Data Augmentation](https://arxiv.org/abs/2103.10158)
:open_mouth:oral:star:[code](https://github.com/automl/trivialaugment)

* [Semantic Aware Data Augmentation for Cell Nuclei Microscopical Images With Artificial Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Naghizadeh_Semantic_Aware_Data_Augmentation_for_Cell_Nuclei_Microscopical_Images_With_ICCV_2021_paper.pdf)

* [A Simple Baseline for Semi-Supervised Semantic Segmentation With Strong Data Augmentation](https://arxiv.org/abs/2104.07256)



## 62.Open-Set Recognition

* [OpenGAN: Open-Set Recognition via Open Data Generation](https://arxiv.org/abs/2104.02939)
:trophy:Best Paper Honorable Mention

* [Conditional Variational Capsule Network for Open Set Recognition](https://arxiv.org/abs/2104.09159)
:star:[code](https://github.com/guglielmocamporese/cvaecaposr)



## 61.Metric Learning

* [Do Different Deep Metric Learning Losses Lead to Similar Learned Features?](https://openaccess.thecvf.com/content/ICCV2021/papers/Kobs_Do_Different_Deep_Metric_Learning_Losses_Lead_to_Similar_Learned_ICCV_2021_paper.pdf)
:star:[code](https://github.com/konstantinkobs/DML-analysis)

* [Learning With Memory-Based Virtual Classes for Deep Metric Learning](https://arxiv.org/abs/2103.16940)
:star:[code](https://github.com/navervision/MemVir)



## 60.Federated Learning

* [Federated Learning for Non-IID Data via Unified Feature Learning and Optimization Objective Alignment](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Federated_Learning_for_Non-IID_Data_via_Unified_Feature_Learning_and_ICCV_2021_paper.pdf)

* [Ensemble Attention Distillation for Privacy-Preserving Federated Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Gong_Ensemble_Attention_Distillation_for_Privacy-Preserving_Federated_Learning_ICCV_2021_paper.pdf)



## 59.Graph Neural Networks

* [Meta-Aggregator: Learning to Aggregate for 1-bit Graph Neural Networks](https://arxiv.org/abs/2109.12872) 

* [PoGO-Net: Pose Graph Optimization With Graph Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_PoGO-Net_Pose_Graph_Optimization_With_Graph_Neural_Networks_ICCV_2021_paper.pdf)
:star:[code](https://github.com/xxylii/PoGO-Net)

* [Dynamic Dual Gating Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Dynamic_Dual_Gating_Neural_Networks_ICCV_2021_paper.pdf)
:star:[code](https://github.com/lfr-0531/DGNet)



## 58.Computational Photography (optics, geometry, light-field imaging, computational photography)

* [An Asynchronous Kalman Filter for Hybrid Event Cameras](https://arxiv.org/abs/2012.05590)
:star:[code](https://github.com/ziweiWWANG/AKF)

* [4D Cloud Scattering Tomography](https://openaccess.thecvf.com/content/ICCV2021/papers/Ronen_4D_Cloud_Scattering_Tomography_ICCV_2021_paper.pdf)

* Snapshot compressive imaging

  * [Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging](https://arxiv.org/abs/2109.06548)
:star:[code](https://github.com/jianzhangcs/SCI3D)

* Light field

  * [Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance](https://arxiv.org/abs/2110.00698)

  * [Fast Light-Field Disparity Estimation With Multi-Disparity-Scale Cost Aggregation](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_Fast_Light-Field_Disparity_Estimation_With_Multi-Disparity-Scale_Cost_Aggregation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zcong17huang/FastLFnet)

  * [SeLFVi: Self-supervised Light-Field Video Reconstruction from Stereo Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Shedligeri_SeLFVi_Self-Supervised_Light-Field_Video_Reconstruction_From_Stereo_Video_ICCV_2021_paper.pdf)

  * [SIGNET: Efficient Neural Representation for Light Fields](https://openaccess.thecvf.com/content/ICCV2021/papers/Feng_SIGNET_Efficient_Neural_Representation_for_Light_Fields_ICCV_2021_paper.pdf)

  * Light field reconstruction

    * [Learning Dynamic Interpolation for Extremely Sparse Light Fields With Wide Baselines](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Learning_Dynamic_Interpolation_for_Extremely_Sparse_Light_Fields_With_Wide_ICCV_2021_paper.pdf)
:star:[code](https://github.com/MantangGuo/DI4SLF)

* Compressive imaging

  * [Time-Multiplexed Coded Aperture Imaging: Learned Coded Aperture and Pixel Exposures for Compressive Imaging Systems](https://arxiv.org/abs/2104.02820)

* Homography Estimation

  * [LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation](http://arxiv.org/abs/2106.04067)

* Computational imaging

  * [Extreme-Quality Computational Imaging via Degradation Framework](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Extreme-Quality_Computational_Imaging_via_Degradation_Framework_ICCV_2021_paper.pdf)

* Optical aberration correction

  * [Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution](https://arxiv.org/abs/2104.03078)
:star:[code](https://github.com/leehsiu/UABC)



## 57.Image Matching

* [Matching in the Dark: A Dataset for Matching Image Pairs of Low-light Scenes](https://arxiv.org/abs/2109.03585)

* Feature point matching

  * [P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_P2-Net_Joint_Description_and_Detection_of_Local_Features_for_Pixel_ICCV_2021_paper.pdf)
:star:[code](https://github.com/BingCS/P2-Net)

 



## 56.Dataset

* [Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm Under Mixed Illumination](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Large_Scale_Multi-Illuminant_LSMI_Dataset_for_Developing_White_Balance_Algorithm_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/DY112/LSMI-dataset)

* [FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters](https://openaccess.thecvf.com/content/ICCV2021/papers/Cheng_FloW_A_Dataset_and_Benchmark_for_Floating_Waste_Detection_in_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/ORCA-Uboat/FloW-Dataset)
A dataset and benchmark for floating waste detection in inland waters

* [FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting](https://openaccess.thecvf.com/content/ICCV2021/papers/Fan_FloorPlanCAD_A_Large-Scale_CAD_Drawing_Dataset_for_Panoptic_Symbol_Spotting_ICCV_2021_paper.pdf)
:house:[project](https://floorplancad.github.io/)

* Biomedical images

  * [BioFors: A Large Biomedical Image Forensics Dataset](https://arxiv.org/abs/2108.12961)

* 3D reconstruction

  * [Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction](https://arxiv.org/abs/2109.00512)
:sunflower:[dataset](https://github.com/facebookresearch/co3d)

* Aerial imagery datasets

  * [Beyond Road Extraction: A Dataset for Map Update using Aerial Images](https://arxiv.org/abs/2110.04690)
:star:[code](https://github.com/favyen/muno21):house:[project](https://favyen.com/muno21/)
A dataset for updating maps using aerial images

* Action recognition

  * [HAA500: Human-Centric Atomic Action Dataset with Curated Videos](https://arxiv.org/abs/2009.05224)

* Object recognition

  * [ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition](https://arxiv.org/abs/2104.03841)
:star:[code](https://github.com/microsoft/ORBIT-Dataset):sunflower:[dataset](https://city.figshare.com/articles/dataset/ORBIT_A_real-world_few-shot_dataset_for_teachable_object_recognition_collected_from_people_who_are_blind_or_low_vision/14294597)

* Lane detection

  * [VIL-100: A New Dataset and a Baseline Model for Video Instance Lane Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_VIL-100_A_New_Dataset_and_a_Baseline_Model_for_Video_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/yujun0-0/MMA-Net)

* Autonomous driving

  * [Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset](https://openaccess.thecvf.com/content/ICCV2021/papers/Ettinger_Large_Scale_Interactive_Motion_Forecasting_for_Autonomous_Driving_The_Waymo_ICCV_2021_paper.pdf)

* Vision-language datasets

  * [E-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks](https://openaccess.thecvf.com/content/ICCV2021/papers/Kayser_E-ViL_A_Dataset_and_Benchmark_for_Natural_Language_Explanations_in_ICCV_2021_paper.pdf)
:star:[code](https://github.com/maximek3/e-ViL)VL

* DeepFake detection

  * [KoDF: A Large-Scale Korean DeepFake Detection Dataset](https://arxiv.org/abs/2103.10094)
:sunflower:[dataset](https://deepbrainai-research.github.io/kodf/)

* High-quality video

  * [Seeing Dynamic Scene in the Dark: A High-Quality Video Dataset With Mechatronic Alignment](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Seeing_Dynamic_Scene_in_the_Dark_A_High-Quality_Video_Dataset_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/dvlab-research/SDSD) (video)



## 55.Activity Recognition

* [Selective Feature Compression for Efficient Activity Recognition Inference](https://arxiv.org/abs/2104.00179)

* Group activity recognition

  * [Spatio-Temporal Dynamic Inference Network for Group Activity Recognition](https://arxiv.org/abs/2108.11743)
:star:[code](https://github.com/JacobYuan7/DIN_GAR)

  * [GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer](https://arxiv.org/abs/2108.12630)
:star:[code](https://github.com/xueyee/GroupFormer) 

 



## 54.Sketch recognition

* [SketchLattice: Latticed Representation for Sketch Manipulation](https://arxiv.org/abs/2108.11636)

* [SketchAA: Abstract Representation for Abstract Sketches](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_SketchAA_Abstract_Representation_for_Abstract_Sketches_ICCV_2021_paper.pdf)



## 53.Vision Localization

* [Continual Learning for Image-Based Camera Localization](https://arxiv.org/abs/2108.09112)
:star:[code](https://github.com/AaltoVision/CL_HSCNet)

* [CrowdDriven: A New Challenging Dataset for Outdoor Visual Localization](https://arxiv.org/abs/2109.04527)
:sunflower:[dataset](http://mapillary.com/)

* [Pose Correction for Highly Accurate Visual Localization in Large-Scale Indoor Spaces](https://openaccess.thecvf.com/content/ICCV2021/papers/Hyeon_Pose_Correction_for_Highly_Accurate_Visual_Localization_in_Large-Scale_Indoor_ICCV_2021_paper.pdf)
:star:[code](https://github.com/JanghunHyeon/PCLoc)

* [Cross-Descriptor Visual Localization and Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Dusmanu_Cross-Descriptor_Visual_Localization_and_Mapping_ICCV_2021_paper.pdf)



## 52.Vision-and-Language

* [YouRefIt: Embodied Reference Understanding with Language and Gesture](https://arxiv.org/abs/2109.03413)
:open_mouth:oral:house:[project](https://yixchen.github.io/YouRefIt/)

* [VLGrammar: Grounded Grammar Induction of Vision and Language](https://arxiv.org/abs/2103.12975)
:star:[code](https://github.com/evelinehong/VLGrammar)

* [COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-Training for Vision-Language Representation](https://openaccess.thecvf.com/content/ICCV2021/papers/Wen_COOKIE_Contrastive_Cross-Modal_Knowledge_Sharing_Pre-Training_for_Vision-Language_Representation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/kywen1119/COOKIE)

* [Panoptic Narrative Grounding](https://openaccess.thecvf.com/content/ICCV2021/papers/Gonzalez_Panoptic_Narrative_Grounding_ICCV_2021_paper.pdf)
:open_mouth:oral:star:[code](https://github.com/BCV-Uniandes/PNG)

* [AESOP: Abstract Encoding of Stories, Objects, and Pictures](https://openaccess.thecvf.com/content/ICCV2021/papers/Ravi_AESOP_Abstract_Encoding_of_Stories_Objects_and_Pictures_ICCV_2021_paper.pdf)
:star:[code](https://github.com/Hareesh-Ravi/AESOP):tv:[video](https://www.youtube.com/watch?v=ygGzY1DSSMk)

* [Adaptive Hierarchical Graph Reasoning With Semantic Coherence for Video-and-Language Inference](https://arxiv.org/abs/2107.12270)

* Visual reasoning

  * [Interpretable Visual Reasoning via Induced Symbolic Space](https://arxiv.org/abs/2011.11603)

* Semantic navigation

  * [THDA: Treasure Hunt Data Augmentation for Semantic Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Maksymets_THDA_Treasure_Hunt_Data_Augmentation_for_Semantic_Navigation_ICCV_2021_paper.pdf)

* Vision-and-language navigation

  * [Airbert: In-domain Pretraining for Vision-and-Language Navigation](https://arxiv.org/abs/2108.09105)
:house:[project](https://airbert-vln.github.io/)

  * [Waypoint Models for Instruction-guided Navigation in Continuous Environments](https://arxiv.org/abs/2110.02207)
:open_mouth:oral:star:[code](https://github.com/jacobkrantz/VLN-CE):house:[project](https://jacobkrantz.github.io/waypoint-vlnce/):tv:[video](https://youtu.be/hrHj9-1xoio)

  * [The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Qi_The_Road_To_Know-Where_An_Object-and-Room_Informed_Sequential_BERT_for_ICCV_2021_paper.pdf)
:star:[code](https://github.com/YuankaiQi/ORIST)

  * [Vision-Language Navigation With Random Environmental Mixup](https://arxiv.org/abs/2106.07876)

* Vision-dialog navigation

  * [Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Self-Motivated_Communication_Agent_for_Real-World_Vision-Dialog_Navigation_ICCV_2021_paper.pdf)

* Visual navigation

  * [Pose Invariant Topological Memory for Visual Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Taniguchi_Pose_Invariant_Topological_Memory_for_Visual_Navigation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/jonkhler/s2cnn)

  * [Visual Graph Memory With Unsupervised Representation for Visual Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Kwon_Visual_Graph_Memory_With_Unsupervised_Representation_for_Visual_Navigation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/rllab-snu/Visual-Graph-Memory):house:[project](https://rllab-snu.github.io/projects/vgm/doc.html):tv:[video](https://www.youtube.com/watch?v=Uksb_kR80Hk)

* visual grounding

  * [InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring](https://arxiv.org/abs/2103.01128)
:star:[code](https://github.com/CurryYuan/InstanceRefer)

  * [TransVG: End-to-End Visual Grounding With Transformers](https://arxiv.org/abs/2104.08541)
:star:[code](https://github.com/djiajunustc/TransVG)

* Visual dialogue

  * [Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue](https://arxiv.org/abs/2106.15550)



## 51.View Synthesis

* [Out-of-boundary View Synthesis Towards Full-Frame Video Stabilization](https://arxiv.org/abs/2108.09041)
:star:[code](https://github.com/Annbless/OVS_Stabilization)

* [Deep 3D Mask Volume for View Synthesis of Dynamic Scenes](https://arxiv.org/abs/2108.13408)
:house:[project](https://cseweb.ucsd.edu//~viscomp/projects/ICCV21Deep/)

* [Embedding Novel Views in a Single JPEG Image](https://arxiv.org/abs/2108.13003)

* [Video Autoencoder: self-supervised disentanglement of static 3D structure and motion](https://arxiv.org/abs/2110.02951)
:open_mouth:oral:star:[code](https://github.com/zlai0/VideoAutoencoder/):house:[project](https://zlai0.github.io/VideoAutoencoder/#method_video):tv:[video](https://www.youtube.com/watch?v=UaJZd4FrM8E)

* [Geometry-Free View Synthesis: Transformers and No 3D Priors](https://arxiv.org/abs/2104.07652)
:star:[code](https://github.com/CompVis/geometry-free-view-synthesis)

* [Dynamic View Synthesis From Dynamic Monocular Video](https://arxiv.org/abs/2105.06468)
:house:[project](https://free-view-video.github.io/):tv:[video](https://youtu.be/j8CUzIR0f8M)

* [Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis](https://arxiv.org/abs/2104.00677)
:house:[project](https://www.ajayj.com/dietnerf):tv:[video](https://youtu.be/RF_3hsNizqw)

* [Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image](https://arxiv.org/abs/2012.09855)
:open_mouth:oral:star:[code](https://github.com/google-research/google-research/tree/master/infinite_nature):house:[project](https://infinite-nature.github.io/):tv:[video](https://www.youtube.com/watch?v=oXUf6anNAtc)

* [Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image](https://arxiv.org/abs/2012.09854)
:open_mouth:oral:star:[code](https://github.com/facebookresearch/worldsheet):house:[project](https://worldsheet.github.io/):tv:[video](https://youtu.be/j5aT3zRxFlk)



## 50.Continual Learning

* [Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data](https://arxiv.org/abs/2108.09020)
:star:[code](https://github.com/IntelLabs/continuallearning)

* [Continual Learning on Noisy Data Streams via Self-Purified Replay](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Continual_Learning_on_Noisy_Data_Streams_via_Self-Purified_Replay_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ecrireme/SPR)

* [Rehearsal Revealed: The Limits and Merits of Revisiting Samples in Continual Learning](https://arxiv.org/abs/2104.07446)
:star:[code](https://github.com/Mattdl/RehearsalRevealed)

* [Co2L: Contrastive Continual Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Cha_Co2L_Contrastive_Continual_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/chaht01/Co2L)

  



## 49.Human-Object Interaction

* [Exploiting Scene Graphs for Human-Object Interaction Detection](https://arxiv.org/abs/2108.08584)
:star:[code](https://github.com/ht014/SG2HOI)

* [Spatially Conditioned Graphs for Detecting Human-Object Interactions](https://arxiv.org/abs/2012.06060)
:star:[code](https://github.com/fredzzhang/spatially-conditioned-graphs):tv:[video](https://www.youtube.com/watch?v=gkBWi_rWedU)

* [Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction](https://arxiv.org/abs/2110.03278)

* [Detecting Human-Object Relationships in Videos](https://openaccess.thecvf.com/content/ICCV2021/papers/Ji_Detecting_Human-Object_Relationships_in_Videos_ICCV_2021_paper.pdf)

* [Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Weakly_Supervised_Human-Object_Interaction_Detection_in_Video_via_Contrastive_Spatiotemporal_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ShuangLI59/weakly-supervised-human-object-detection-video):house:[project](https://shuangli-project.github.io/weakly-supervised-human-object-detection-video/):sunflower:[dataset](https://shuangli-project.github.io/VHICO-Dataset/)

* [Discovering Human Interactions With Large-Vocabulary Objects via Query and Multi-Scale Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Discovering_Human_Interactions_With_Large-Vocabulary_Objects_via_Query_and_Multi-Scale_ICCV_2021_paper.pdf)
:star:[code](https://github.com/scwangdyd/large_vocabulary_hoi_detection)

* [Visual Relationship Detection Using Part-and-Sum Transformers With Composite Queries](https://arxiv.org/abs/2105.02170) (VRD and HOI)

* [Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations](https://openaccess.thecvf.com/content/ICCV2021/papers/Huynh_Interaction_Compass_Multi-Label_Zero-Shot_Learning_of_Human-Object_Interactions_via_Spatial_ICCV_2021_paper.pdf)
:star:[code](https://github.com/hbdat/iccv21_relational_direction)

* H2O

  * [H2O: A Benchmark for Visual Human-Human Object Handover Analysis](https://arxiv.org/abs/2104.11466)

* Human Interaction Understanding

  * [Consistency-Aware Graph Network for Human Interaction Understanding](https://arxiv.org/abs/2011.10250)
:star:[code](https://github.com/deepgogogo/CAGNet?v=1)

  * [H2O: Two Hands Manipulating Objects for First Person Interaction Recognition](https://arxiv.org/abs/2104.11181)
:house:[project](https://www.taeinkwon.com/projects/h2o)

* Hand-object interaction

  * [Toward Human-Like Grasp: Dexterous Grasping via Semantic Representation of Object-Hand](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Toward_Human-Like_Grasp_Dexterous_Grasping_via_Semantic_Representation_of_Object-Hand_ICCV_2021_paper.pdf)

  * [Reconstructing Hand-Object Interactions in the Wild](https://arxiv.org/abs/2012.09856)
:house:[project](https://people.eecs.berkeley.edu/~zhecao/rhoi/)

  * [CPF: Learning a Contact Potential Field To Model the Hand-Object Interaction](https://arxiv.org/abs/2012.00924)
:star:[code](https://github.com/lixiny/CPF) (hand-object interaction)

* HOI (behavior understanding)

  * [GeomNet: A Neural Network Based on Riemannian Geometries of SPD Matrix Space and Cholesky Space for 3D Skeleton-Based Interaction Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Nguyen_GeomNet_A_Neural_Network_Based_on_Riemannian_Geometries_of_SPD_ICCV_2021_paper.pdf)



## 48.6DoF

* [SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation](https://arxiv.org/abs/2108.08367)
:star:[code](https://github.com/shangbuhuan13/SO-Pose)

* [StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation](https://arxiv.org/abs/2109.10115)

* [SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_SGPA_Structure-Guided_Prior_Adaptation_for_Category-Level_6D_Object_Pose_Estimation_ICCV_2021_paper.pdf)

* [RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering](https://arxiv.org/abs/2104.00633)
:star:[code](https://github.com/sh8/repose)

* [DualPoseNet: Category-Level 6D Object Pose and Size Estimation Using Dual Pose Network With Refined Learning of Pose Consistency](https://arxiv.org/abs/2103.06526)
:star:[code](https://github.com/Gorilla-Lab-SCUT/DualPoseNet)

* [PR-GCN: A Deep Graph Convolutional Network With Point Refinement for 6D Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_PR-GCN_A_Deep_Graph_Convolutional_Network_With_Point_Refinement_for_ICCV_2021_paper.pdf)

* Object pose estimation

  * [CAPTRA: CAtegory-Level Pose Tracking for Rigid and Articulated Objects From Point Clouds](https://arxiv.org/abs/2104.03437)
:open_mouth:oral:star:[code](https://github.com/halfsummer11/CAPTRA):house:[project](https://yijiaweng.github.io/CAPTRA/):tv:[video](https://youtu.be/EkcCEj7gZGg)



## 47.NAS

* [BN-NAS: Neural Architecture Search with Batch Normalization](https://arxiv.org/abs/2108.07375)
:star:[code](https://github.com/bychen515/BNNAS)

* [RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving](https://arxiv.org/abs/2108.08019)

* [Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift](https://arxiv.org/abs/2108.09671)
:star:[code](https://github.com/Ernie1/Pi-NAS)

* [Evolving Search Space for Neural Architecture Search](https://arxiv.org/abs/2011.10904)
:star:[code](https://github.com/orashi/NSE_NAS):tv:[video](https://www.youtube.com/watch?v=fq21WBaumRc)

* [FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search](https://arxiv.org/abs/1907.01845)
:star:[code](https://github.com/xiaomi-automl/FairNAS)

* [GLiT: Neural Architecture Search for Global and Local Image Transformer](https://arxiv.org/abs/2107.02960)
:star:[code](https://github.com/bychen515/GLiT)

* [Neural Architecture Search for Joint Human Parsing and Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zeng_Neural_Architecture_Search_for_Joint_Human_Parsing_and_Pose_Estimation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/GuHuangAI/NPP)

* [Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces](https://arxiv.org/abs/2012.08859)

* [Learning Latent Architectural Distribution in Differentiable Neural Architecture Search via Variational Information Maximization](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Learning_Latent_Architectural_Distribution_in_Differentiable_Neural_Architecture_Search_via_ICCV_2021_paper.pdf)

* [Not All Operations Contribute Equally: Hierarchical Operation-Adaptive Predictor for Neural Architecture Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Not_All_Operations_Contribute_Equally_Hierarchical_Operation-Adaptive_Predictor_for_Neural_ICCV_2021_paper.pdf)

* [Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Lin_Zen-NAS_A_Zero-Shot_NAS_for_High-Performance_Image_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/idstcv/ZenNAS)

* [BossNAS: Exploring Hybrid CNN-Transformers With Block-Wisely Self-Supervised Neural Architecture Search](https://arxiv.org/abs/2103.12424)
:star:[code](https://github.com/changlin31/BossNAS)

* [NAS-OoD: Neural Architecture Search for Out-of-Distribution Generalization](https://openaccess.thecvf.com/content/ICCV2021/papers/Bai_NAS-OoD_Neural_Architecture_Search_for_Out-of-Distribution_Generalization_ICCV_2021_paper.pdf)

* [AutoSpace: Neural Architecture Search With Less Human Interference](https://arxiv.org/abs/2103.11833)
:star:[code](https://github.com/zhoudaquan/AutoSpace)

* [IDARTS: Interactive Differentiable Architecture Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Xue_IDARTS_Interactive_Differentiable_Architecture_Search_ICCV_2021_paper.pdf)



## 46.Defect Detection

* [DRÆM -- A discriminatively trained reconstruction embedding for surface anomaly detection](https://arxiv.org/abs/2108.07610)



## 45.Image Caption

* [Who's Waldo? Linking People Across Text and Images](https://arxiv.org/abs/2108.07253)
:open_mouth:oral:house:[project](https://whoswaldo.github.io/)
:newspaper:Commentary: [ICCV 2021 Oral: New task! New dataset! Cornell University proposes PVG, a task that resembles VG but is not VG](https://mp.weixin.qq.com/s/QC1UQRmZKgS0dctTXQ77Bg)

* [Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning](https://openaccess.thecvf.com/content/ICCV2021/papers/Shi_Partial_Off-Policy_Learning_Balance_Accuracy_and_Diversity_for_Human-Oriented_Image_ICCV_2021_paper.pdf)

* [Topic Scene Graph Generation by Attention Distillation From Caption](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Topic_Scene_Graph_Generation_by_Attention_Distillation_From_Caption_ICCV_2021_paper.pdf)
:star:[code](https://vipl.ict.ac.cn/view_database.php?id=6)

* [Understanding and Evaluating Racial Biases in Image Captioning](https://arxiv.org/abs/2106.08503)
:star:[code](https://github.com/princetonvisualai/imagecaptioning-bias):house:[project](https://princetonvisualai.github.io/imagecaptioning-bias/)

* [In Defense of Scene Graphs for Image Captioning](https://arxiv.org/abs/2102.04990)
:star:[code](https://github.com/Kien085/SG2Caps)

* Art Description Generation

  * [Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation](https://arxiv.org/abs/2109.05743)
:star:[code](https://github.com/noagarcia/explain-paintings)

* Change Captioning

  * [Viewpoint-Agnostic Change Captioning With Cycle Consistency](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Viewpoint-Agnostic_Change_Captioning_With_Cycle_Consistency_ICCV_2021_paper.pdf)



## 44.Human Motion Prediction

* [MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction](https://arxiv.org/abs/2108.07152)
:star:[code](https://github.com/Droliven/MSRGCN)

* [Stochastic Scene-Aware Motion Prediction](https://arxiv.org/abs/2108.08284)
:star:[code](https://github.com/mohamedhassanmus/SAMP):house:[project](https://samp.is.tue.mpg.de/)  

* [Generating Smooth Pose Sequences for Diverse Human Motion Prediction](https://arxiv.org/abs/2108.08422)
:open_mouth:oral:star:[code](https://github.com/wei-mao-2019/gsps)

* [TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild](https://arxiv.org/abs/2104.04029)
:house:[project](https://somof.stanford.edu/)

* [Motion Prediction using Trajectory Cues](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Motion_Prediction_Using_Trajectory_Cues_ICCV_2021_paper.pdf)

* 3D Human Motion Prediction

  * [Contextually Plausible and Diverse 3D Human Motion Prediction](https://arxiv.org/abs/1912.08521)



## 43.Dense Prediction

* [FaPN: Feature-aligned Pyramid Network for Dense Image Prediction](https://arxiv.org/abs/2108.07058)
:star:[code](https://github.com/EMI-Group/FaPN)

* Multi-Task Dense Prediction

  * [Exploring Relational Context for Multi-Task Dense Prediction](https://arxiv.org/abs/2104.13874)



## 42.Representation Learning

* [Learning From Noisy Data With Robust Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Learning_From_Noisy_Data_With_Robust_Representation_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/salesforce/RRL/)

* [Self-Supervised Representation Learning From Flow Equivariance](https://arxiv.org/abs/2101.06553)

* [Exploring Visual Engagement Signals for Representation Learning](https://arxiv.org/abs/2104.07767)
:star:[code](https://github.com/KMnP/vise)

* [Switchable K-class Hyperplanes for Noise-Robust Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Switchable_K-Class_Hyperplanes_for_Noise-Robust_Representation_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/liubx07/SKH)

* [Region Similarity Representation Learning](https://arxiv.org/abs/2103.12902)
:star:[code](https://github.com/Tete-Xiao/ReSim)

* [Curious Representation Learning for Embodied Intelligence](https://arxiv.org/abs/2105.01060)
:star:[code](https://github.com/yilundu/crl):house:[project](https://yilundu.github.io/crl/)

* Visual Representation Learning

  * [Self-Supervised Visual Representations Learning by Contrastive Mask Prediction](https://arxiv.org/abs/2108.07954)
:newspaper:explainer:[ICCV 2021: A contrastive learning paradigm more general than MoCo; USTC & MSRA propose MaskCo, a new contrastive learning method](https://mp.weixin.qq.com/s/t53ASvoSTTlXgxKTfEoZ7g)

  * [Temporal Knowledge Consistency for Unsupervised Visual Representation Learning](https://arxiv.org/abs/2108.10668)

  * [Contrasting Contrastive Self-Supervised Representation Learning Pipelines](https://arxiv.org/abs/2103.14005)
:star:[code](https://github.com/allenai/virb)

  * [Concept Generalization in Visual Representation Learning](https://arxiv.org/abs/2012.05649)
:house:[project](https://europe.naverlabs.com/cog-benchmark)

  * [Collaborative Unsupervised Visual Representation Learning from Decentralized Data](https://arxiv.org/abs/2108.06492)

  * [Episodic Transformer for Vision-and-Language Navigation](https://arxiv.org/abs/2105.06453)
:star:[code](https://github.com/alexpashevich/E.T.)

  * [Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering](https://openaccess.thecvf.com/content/ICCV2021/papers/Xu_Multi-VAE_Learning_Disentangled_View-Common_and_View-Peculiar_Visual_Representations_for_Multi-View_ICCV_2021_paper.pdf)

* Video Representation Learning

  * [Composable Augmentation Encoding for Video Representation Learning](https://arxiv.org/abs/2104.00616)

  * [Motion-Focused Contrastive Learning of Video Representations](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Motion-Focused_Contrastive_Learning_of_Video_Representations_ICCV_2021_paper.pdf)

  * [ASCNet: Self-Supervised Video Representation Learning With Appearance-Speed Consistency](https://arxiv.org/abs/2106.02342)

  * [ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning](https://arxiv.org/abs/2101.10803)
:house:[project](https://acav100m.github.io/)

  * [Time-Equivariant Contrastive Video Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Jenni_Time-Equivariant_Contrastive_Video_Representation_Learning_ICCV_2021_paper.pdf)

  * [Space-Time Crop & Attend: Improving Cross-Modal Video Representation Learning](https://arxiv.org/abs/2103.10211)
:star:[code](https://github.com/facebookresearch/GDT)



## 41.Out-of-Distribution Detection(OOD)

* [CODEs: Chamfer Out-of-Distribution Examples against Overconfidence Issue](https://arxiv.org/abs/2108.06024)

* [Semantically Coherent Out-of-Distribution Detection](https://arxiv.org/abs/2108.11941)
:star:[code](https://github.com/jingkang50/ICCV21_SCOOD):house:[project](https://jingkang50.github.io/projects/scood)

* [The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization](https://arxiv.org/abs/2006.16241)
:star:[code](https://github.com/hendrycks/imagenet-r)



## 40.Metric Learning

* [Towards Interpretable Deep Metric Learning with Structural Matching](https://arxiv.org/abs/2108.05889)
:star:[code](https://github.com/wl-zhao/DIML)

* [Deep Relational Metric Learning](https://arxiv.org/abs/2108.10026)
:star:[code](https://github.com/zbr17/DRML)

* [LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning](https://arxiv.org/abs/2108.09335)
:star:[code](https://github.com/puneesh00/LoOp)

* [Manifold Matching via Deep Metric Learning for Generative Modeling](https://arxiv.org/abs/2106.10777)
:star:[code](https://github.com/dzld00/pytorch-manifold-matching)



## 39.Incremental Learning

* Class-Incremental Learning

  * [Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning](https://arxiv.org/abs/2106.09701)
:newspaper:explainer:[Enabling "lifelong learning" in models: Georgia Tech proposes data-free incremental learning](https://mp.weixin.qq.com/s/Fm9ufPD6rzL2VzaqpdFpjg)

  * [Striking a Balance Between Stability and Plasticity for Class-Incremental Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Striking_a_Balance_Between_Stability_and_Plasticity_for_Class-Incremental_Learning_ICCV_2021_paper.pdf)

  * [Synthesized Feature Based Few-Shot Class-Incremental Learning on a Mixture of Subspaces](https://openaccess.thecvf.com/content/ICCV2021/papers/Cheraghian_Synthesized_Feature_Based_Few-Shot_Class-Incremental_Learning_on_a_Mixture_of_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ali-chr/Synthesized-Feature-based-Few-Shot-Class-Incremental-Learningon-a-Mixture-of-Subspaces)



## 38.Weakly/Semi-Supervised/Self-Supervised/Unsupervised Learning

* Semi-Supervised

  * [Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning](https://arxiv.org/abs/2108.05617)

  * [Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments With Support Samples](https://arxiv.org/abs/2104.13963)
:star:[code](https://github.com/facebookresearch/suncet)

  * [Semi-Supervised Active Learning for Semi-Supervised Models: Exploit Adversarial Examples With Graph-Based Virtual Labels](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Semi-Supervised_Active_Learning_for_Semi-Supervised_Models_Exploit_Adversarial_Examples_With_ICCV_2021_paper.pdf)

  * [CoMatch: Semi-Supervised Learning With Contrastive Graph Regularization](https://arxiv.org/abs/2011.11183)
:star:[code](https://github.com/salesforce/CoMatch)

  * [Multiview Pseudo-Labeling for Semi-supervised Learning from Video](https://arxiv.org/abs/2104.00682)

* Self-Supervised

  * [Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring](https://arxiv.org/abs/2108.06435)
:star:[code](https://github.com/omipan/camera_traps_self_supervised)

  * [Self-supervised Neural Networks for Spectral Snapshot Compressive Imaging](https://arxiv.org/abs/2108.12654)
:star:[code](https://github.com/mengziyi64/CASSI-Self-Supervised)

  * [ISD: Self-Supervised Learning by Iterative Similarity Distillation](https://arxiv.org/abs/2012.09259)
:star:[code](https://github.com/UMBCvision/ISD)

  * [Contrast and Order Representations for Video Self-Supervised Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Hu_Contrast_and_Order_Representations_for_Video_Self-Supervised_Learning_ICCV_2021_paper.pdf)

  * [On Feature Decorrelation in Self-Supervised Learning](https://arxiv.org/abs/2105.00470)
:open_mouth:oral

  * [Geography-Aware Self-Supervised Learning](https://arxiv.org/abs/2011.09980)

  * [Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos](https://arxiv.org/abs/2104.12671)

  * [Efficient Visual Pretraining with Contrastive Detection](https://arxiv.org/abs/2103.10957)

  * [Broaden Your Views for Self-Supervised Video Learning](https://arxiv.org/abs/2103.16559)

  * [CDS: Cross-Domain Self-supervised Pre-training](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_CDS_Cross-Domain_Self-Supervised_Pre-Training_ICCV_2021_paper.pdf)

  * [On Compositions of Transformations in Contrastive Self-Supervised Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Patrick_On_Compositions_of_Transformations_in_Contrastive_Self-Supervised_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/facebookresearch/GDT)

  * [Solving Inefficiency of Self-Supervised Representation Learning](https://arxiv.org/abs/2104.08760)
:star:[code](https://github.com/wanggrun/triplet)

  * [Divide and Contrast: Self-supervised Learning from Uncurated Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Tian_Divide_and_Contrast_Self-Supervised_Learning_From_Uncurated_Data_ICCV_2021_paper.pdf)

  * [Emerging Properties in Self-Supervised Vision Transformers](https://arxiv.org/abs/2104.14294)
:star:[code](https://github.com/facebookresearch/dino)

  * [Mean Shift for Self-Supervised Learning](https://arxiv.org/abs/2105.07269)
:star:[code](https://github.com/UMBCvision/MSF)

* Weakly Supervised

  * [Weakly Supervised Representation Learning With Coarse Labels](https://arxiv.org/abs/2005.09681)
:star:[code](https://github.com/idstcv/CoIns)



## 37.Multitask Learning

* [MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach](https://arxiv.org/abs/2108.05060)
:newspaper:explainer:[ICCV 2021 "MultiTask CenterNet": new progress in multi-task CV! One network does the work of three](https://mp.weixin.qq.com/s/toAZS0OHdW4MG30P1wAAUA)

* [Multi-Task Self-Training for Learning General Representations](https://arxiv.org/abs/2108.11353)
:newspaper:explainer:[ICCV 2021 MuST: Still struggling to squeeze out points on a single task? Google's researchers have already moved on to multi-task training](https://mp.weixin.qq.com/s/nhv1l9xBSaZceibIn5fhqw)

* [UniT: Multimodal Multitask Learning With a Unified Transformer](https://arxiv.org/abs/2102.10772)
:star:[code](https://mmf.sh/)

* [Learning Multiple Pixelwise Tasks Based on Loss Scale Balancing](https://openaccess.thecvf.com/content/ICCV2021/papers/Lee_Learning_Multiple_Pixelwise_Tasks_Based_on_Loss_Scale_Balancing_ICCV_2021_paper.pdf)
:star:[code](https://github.com/jaehanlee-mcl/LSB-MTL)

* [Learning With Privileged Tasks](https://openaccess.thecvf.com/content/ICCV2021/papers/Song_Learning_With_Privileged_Tasks_ICCV_2021_paper.pdf)

* [Task Switching Network for Multi-Task Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Sun_Task_Switching_Network_for_Multi-Task_Learning_ICCV_2021_paper.pdf)



## 36.SLAM/AR/VR/Robotics

* Robotics

  * Indoor Navigation

    * [The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation](https://arxiv.org/abs/2108.11550)
:star:[code](https://github.com/Xiaoming-Zhao/PointNav-VO):house:[project](https://xiaoming-zhao.github.io/projects/pointnav-vo/)

    * [Pathdreamer: A World Model for Indoor Navigation](https://arxiv.org/abs/2105.08756)
:tv:[video](https://www.youtube.com/watch?v=StklIENGqs0)

  * Robotic Hand Grasping

    * [Hand-Object Contact Consistency Reasoning for Human Grasps Generation](https://arxiv.org/abs/2104.03304)
:open_mouth:oral:star:[code](https://github.com/hwjiang1510/GraspTTA):house:[project](https://hwjiang1510.github.io/GraspTTA/):tv:[video](https://youtu.be/zGVLVXZoVZs)

* VR/AR

  * [The Power of Points for Modeling Humans in Clothing](https://arxiv.org/abs/2109.01137)
:star:[code](https://github.com/qianlim/POP):house:[project](https://qianlim.github.io/POP):tv:[video](https://youtu.be/5M4F9zSWIEE)

  * Virtual Try-On

    * [M3D-VTON: A Monocular-to-3D Virtual Try-On Network](https://arxiv.org/abs/2108.05126)
:star:[code](https://github.com/fyviezhao/M3D-VTON)

    * [ZFlow: Gated Appearance Flow-based Virtual Try-on with 3D Priors](https://arxiv.org/abs/2109.07001)

    * [Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-On and Outfit Editing](https://arxiv.org/abs/2104.07021)

    * [FashionMirror: Co-Attention Feature-Remapping Virtual Try-On With Sequential Template Poses](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_FashionMirror_Co-Attention_Feature-Remapping_Virtual_Try-On_With_Sequential_Template_Poses_ICCV_2021_paper.pdf)

    * [Structure-transformed Texture-enhanced Network for Person Image Synthesis](https://openaccess.thecvf.com/content/ICCV2021/papers/Xu_Structure-Transformed_Texture-Enhanced_Network_for_Person_Image_Synthesis_ICCV_2021_paper.pdf)

* SLAM

  * [On the Limits of Pseudo Ground Truth in Visual Camera Re-localisation](https://arxiv.org/abs/2109.00524)
:star:[code](https://github.com/tsattler/visloc_pseudo_gt_limitations/)

  * [Transfusion: A Novel SLAM Method Focused on Transparent Objects](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Transfusion_A_Novel_SLAM_Method_Focused_on_Transparent_Objects_ICCV_2021_paper.pdf)

  * [iMAP: Implicit Mapping and Positioning in Real-Time](https://arxiv.org/abs/2103.12352)

  * [Learning To Bundle-Adjust: A Graph Network Approach to Faster Optimization of Bundle Adjustment for Vehicular SLAM](https://openaccess.thecvf.com/content/ICCV2021/papers/Tanaka_Learning_To_Bundle-Adjust_A_Graph_Network_Approach_to_Faster_Optimization_ICCV_2021_paper.pdf)

  * [R-SLAM: Optimizing Eye Tracking From Rolling Shutter Video of the Retina](https://openaccess.thecvf.com/content/ICCV2021/papers/Shenoy_R-SLAM_Optimizing_Eye_Tracking_From_Rolling_Shutter_Video_of_the_ICCV_2021_paper.pdf)

  * Place Recognition

    * [Attentional Pyramid Pooling of Salient Visual Residuals for Place Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Peng_Attentional_Pyramid_Pooling_of_Salient_Visual_Residuals_for_Place_Recognition_ICCV_2021_paper.pdf)

    * [Pyramid Point Cloud Transformer for Large-Scale Place Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Hui_Pyramid_Point_Cloud_Transformer_for_Large-Scale_Place_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/fpthink/PPT-Net)



## 35.Quantization/Pruning/Knowledge Distillation/Model Compression

* Knowledge Distillation

  * [Distilling Holistic Knowledge with Graph Neural Networks](https://arxiv.org/abs/2108.05507)
:star:[code](https://github.com/wyc-ruiker/HKD)

  * [Lipschitz Continuity Guided Knowledge Distillation](https://arxiv.org/abs/2108.12905)
:star:[code](https://github.com/42Shawn/LONDON/tree/master)

  * [Densely Guided Knowledge Distillation Using Multiple Teacher Assistants](https://arxiv.org/abs/2009.08825)
:star:[code](https://github.com/wonchulSon/DGKD)

  * [Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better](https://arxiv.org/abs/2108.07969)
:star:[code](https://github.com/zibojia/RSLAD)

  * [Compressing Visual-linguistic Model via Knowledge Distillation](https://arxiv.org/abs/2104.02096)

  * [Self-Knowledge Distillation With Progressive Refinement of Targets](https://arxiv.org/abs/2006.12000)
:star:[code](https://github.com/lgcnsai/PS-KD-Pytorch):tv:[video](https://drive.google.com/file/d/1QxqSbzn-egdYI13IYn3W4dmIvm_Iw4ku/view)

  * [Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Student_Customized_Knowledge_Distillation_Bridging_the_Gap_Between_Student_and_ICCV_2021_paper.pdf)

  * [Channel-Wise Knowledge Distillation for Dense Prediction](https://arxiv.org/abs/2011.13256)
:star:[code](https://github.com/irfanICMLL/TorchDistiller/tree/main/SemSeg-distill)

  * [Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Exploring_Inter-Channel_Correlation_for_Diversity-Preserved_Knowledge_Distillation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ADLab-AutoDrive/ICKD)

* Quantization

  * [Distance-aware Quantization](https://arxiv.org/abs/2108.06983)
:star:[code](https://github.com/cvlab-yonsei/DAQ):house:[project](https://cvlab.yonsei.ac.kr/projects/DAQ/) 

  * [Dynamic Network Quantization for Efficient Video Inference](https://arxiv.org/abs/2108.10394)
:star:[code](https://github.com/sunxm2357/VideoIQ):house:[project](https://cs-people.bu.edu/sunxm/VideoIQ/project.html)

  * [Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss](https://arxiv.org/abs/2109.02100)

  * [Improving Low-Precision Network Quantization via Bin Regularization](https://openaccess.thecvf.com/content/ICCV2021/papers/Han_Improving_Low-Precision_Network_Quantization_via_Bin_Regularization_ICCV_2021_paper.pdf)

  * [Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Towards_Mixed-Precision_Quantization_of_Neural_Networks_via_Constrained_Optimization_ICCV_2021_paper.pdf)

  * [Integer-arithmetic-only Certified Robustness for Quantized Neural Networks](https://arxiv.org/abs/2108.09413)

  * [RMSMP: A Novel Deep Neural Network Quantization Framework With Row-Wise Mixed Schemes and Multiple Precisions](https://openaccess.thecvf.com/content/ICCV2021/papers/Chang_RMSMP_A_Novel_Deep_Neural_Network_Quantization_Framework_With_Row-Wise_ICCV_2021_paper.pdf)

  * [Improving Neural Network Efficiency via Post-Training Quantization With Adaptive Floating-Point](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Improving_Neural_Network_Efficiency_via_Post-Training_Quantization_With_Adaptive_Floating-Point_ICCV_2021_paper.pdf)
:star:[code](https://github.com/MXHX7199/ICCV_2021_AFP)

  * [Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search](https://arxiv.org/abs/2010.04354)
:star:[code](https://github.com/LaVieEnRoseSMZ/OQA)

* Model Compression

  * [GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization](https://arxiv.org/abs/2109.02220)

  * [Exploration and Estimation for Model Compression](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Exploration_and_Estimation_for_Model_Compression_ICCV_2021_paper.pdf)

* Pruning

  * [ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting](https://arxiv.org/abs/2007.03260)
:star:[code](https://github.com/DingXiaoH/ResRep)

  * [Auto Graph Encoder-Decoder for Neural Network Pruning](https://openaccess.thecvf.com/content/ICCV2021/papers/Yu_Auto_Graph_Encoder-Decoder_for_Neural_Network_Pruning_ICCV_2021_paper.pdf)



## 34.Super-Resolution

* ISR

  * [Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution](https://arxiv.org/abs/2108.05302)
:star:[code](https://github.com/JingyunLiang/MANet)

  * [Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling](https://arxiv.org/abs/2108.05301)
:star:[code](https://github.com/JingyunLiang/HCFlow)

  * [Deep Reparametrization of Multi-Frame Super-Resolution and Denoising](https://arxiv.org/abs/2108.08286)
:open_mouth:oral

  * [Dual-Camera Super-Resolution with Aligned Attention Modules](https://arxiv.org/abs/2109.01349)
:star:[code](https://github.com/Tengfei-Wang/DualCameraSR):house:[project](https://tengfei-wang.github.io/Dual-Camera-SR/index.html):tv:[video](https://www.youtube.com/watch?v=5TiUfAcNvuw)

  * [Attention-Based Multi-Reference Learning for Image Super-Resolution](https://arxiv.org/abs/2108.13697)
:star:[code](https://github.com/marcopesavento/AMRSR):house:[project](https://marcopesavento.github.io/AMRSR/)

  * [Learning a Single Network for Scale-Arbitrary Super-Resolution](https://arxiv.org/abs/2004.03791)

  * [Fourier Space Losses for Efficient Perceptual Image Super-Resolution](https://arxiv.org/abs/2106.00783)
:star:[code](https://github.com/dariofuoli)

  * [Achieving On-Mobile Real-Time Super-Resolution With Neural Architecture and Pruning Search](https://arxiv.org/abs/2108.08910)

  * [Designing a Practical Degradation Model for Deep Blind Image Super-Resolution](https://arxiv.org/abs/2103.14006)
:star:[code](https://github.com/cszn/BSRGAN)

  * [Event Stream Super-Resolution via Spatiotemporal Constraint Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Event_Stream_Super-Resolution_via_Spatiotemporal_Constraint_Learning_ICCV_2021_paper.pdf)

  * [Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Magid_Dynamic_High-Pass_Filtering_and_Multi-Spectral_Attention_for_Image_Super-Resolution_ICCV_2021_paper.pdf)

  * [Super-Resolving Cross-Domain Face Miniatures by Peeking at One-Shot Exemplar](https://arxiv.org/abs/2103.08863)

  * [Context Reasoning Attention Network for Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Context_Reasoning_Attention_Network_for_Image_Super-Resolution_ICCV_2021_paper.pdf)

  * [EvIntSR-Net: Event Guided Multiple Latent Frames Reconstruction and Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Han_EvIntSR-Net_Event_Guided_Multiple_Latent_Frames_Reconstruction_and_Super-Resolution_ICCV_2021_paper.pdf)

  * [Super Resolve Dynamic Scene from Continuous Spike Streams](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhao_Super_Resolve_Dynamic_Scene_From_Continuous_Spike_Streams_ICCV_2021_paper.pdf)

  * [Deep Blind Video Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Pan_Deep_Blind_Video_Super-Resolution_ICCV_2021_paper.pdf)

  * [Benchmarking Ultra-High-Definition Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Benchmarking_Ultra-High-Definition_Image_Super-Resolution_ICCV_2021_paper.pdf)

  * [Lucas-Kanade Reloaded: End-to-End Super-Resolution From Raw Image Bursts](https://arxiv.org/abs/2104.06191)

  * [Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Unsupervised_Real-World_Super-Resolution_A_Domain_Adaptation_Perspective_ICCV_2021_paper.pdf)

  * [Real-World Video Super-Resolution: A Benchmark Dataset and a Decomposition Based Learning Scheme](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Real-World_Video_Super-Resolution_A_Benchmark_Dataset_and_a_Decomposition_Based_ICCV_2021_paper.pdf)
:star:[code](https://github.com/IanYeung/RealVSR)
:newspaper:explainer:[ICCV 2021: Hong Kong PolyU and Alibaba DAMO Academy propose RealVSR, a new dataset and loss scheme for video super-resolution](https://mp.weixin.qq.com/s/pQWQgJJgCzDFX6lQcXt3wg)

* VSR

  * [Omniscient Video Super-Resolution](https://arxiv.org/abs/2103.15683)
:star:[code](https://github.com/psychopa4/OVSR)

  * [COMISR: Compression-Informed Video Super-Resolution](https://arxiv.org/abs/2105.01237)
:star:[code](https://github.com/google-research/google-research/tree/master/comisr)
:newspaper:explainer:[Google proposes COMISR: compression-informed super-resolution for compressed video](https://mp.weixin.qq.com/s/DhE49Ek0v0PelDewNNjP3w)

  * [Learning Frequency-Aware Dynamic Network for Efficient Super-Resolution](https://arxiv.org/abs/2103.08357)

  * [Efficient Video Compression via Content-Adaptive Super-Resolution](https://arxiv.org/abs/2104.02322)
:star:[code](https://github.com/AdaptiveVC/SRVC)



## 33.Remote Sensing Images

* [SUNet: Symmetric Undistortion Network for Rolling Shutter Correction](https://arxiv.org/abs/2108.04775)

* [Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery](https://arxiv.org/abs/2108.07002)
:star:[code](https://github.com/Z-Zheng/ChangeStar)
:newspaper:explainer:[ICCV 2021 | Wuhan University's RSIDEA group proposes STAR, a novel weakly supervised change detection algorithm for remote sensing](https://mp.weixin.qq.com/s/hATPy1T2zh9JgwBeRMWh0A)

* Panoramic Video Synthesis from Satellite Images

  * [Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image](https://arxiv.org/abs/2012.06628)

* Traffic Accident Detection from Satellite Imagery

  * [Inferring High-Resolution Traffic Accident Risk Maps Based on Satellite Imagery and GPS Trajectories](https://openaccess.thecvf.com/content/ICCV2021/papers/He_Inferring_High-Resolution_Traffic_Accident_Risk_Maps_Based_on_Satellite_Imagery_ICCV_2021_paper.pdf)

* Remote Sensing Data

  * [Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Manas_Seasonal_Contrast_Unsupervised_Pre-Training_From_Uncurated_Remote_Sensing_Data_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ElementAI/seasonal-contrast)

  * [Dynamic Cross Feature Fusion for Remote Sensing Pansharpening](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Dynamic_Cross_Feature_Fusion_for_Remote_Sensing_Pansharpening_ICCV_2021_paper.pdf)

* Segmentation

  * [Self-Mutating Network for Domain Adaptive Segmentation in Aerial Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Lee_Self-Mutating_Network_for_Domain_Adaptive_Segmentation_in_Aerial_Images_ICCV_2021_paper.pdf)

  * Panoptic Segmentation of Satellite Images

    * [Panoptic Segmentation of Satellite Image Time Series With Convolutional Temporal Attention Networks](https://arxiv.org/abs/2107.07933)
:star:[code](https://github.com/VSainteuf/utae-paps):sunflower:[PASTIS dataset](https://github.com/VSainteuf/pastis-benchmark)

* 3D Reconstruction

  * [3D Building Reconstruction from Monocular Remote Sensing Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_3D_Building_Reconstruction_From_Monocular_Remote_Sensing_Images_ICCV_2021_paper.pdf)
:house:[project](https://liweijia.github.io/projects/building_3d/)



## 32.Speech

* [The Right to Talk: An Audio-Visual Transformer Approach](https://arxiv.org/abs/2108.03256)
:star:[code](https://github.com/uark-cviu/Right2Talk)

* [Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis](https://arxiv.org/abs/2103.14201)
:star:[code](https://github.com/nikhilsinghmus/image2reverb):house:[project](https://web.media.mit.edu/~nsingh1/image2reverb/)

* Audio Separation

  * [Visual Scene Graphs for Audio Source Separation](https://arxiv.org/abs/2109.11955)

* Audio-to-Gesture

  * [Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders](https://arxiv.org/abs/2108.06720)

* Active Speaker Detection (ASD)

  * [How To Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild](https://arxiv.org/abs/2106.03932)
:star:[code](https://github.com/okankop/ASDNet)

  * [MAAS: Multi-Modal Assignation for Active Speaker Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Alcazar_MAAS_Multi-Modal_Assignation_for_Active_Speaker_Detection_ICCV_2021_paper.pdf)

* Recollecting Speech Audio from Face Video

  * [Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Multi-Modality_Associative_Bridging_Through_Memory_Speech_Sound_Recollected_From_Face_ICCV_2021_paper.pdf)

* Audio-Visual Source Localization

  * [Localize to Binauralize: Audio Spatialization From Visual Sound Source Localization](https://openaccess.thecvf.com/content/ICCV2021/papers/Rachavarapu_Localize_to_Binauralize_Audio_Spatialization_From_Visual_Sound_Source_Localization_ICCV_2021_paper.pdf)
:star:[code](https://github.com/KranthiKumarR/Localize-to-Binauralize):tv:[video](https://drive.google.com/drive/folders/1a5BV0U3RaQJS5wXyR7pzIAPMKOsGQz_q)

* Audio-Visual Source Separation

  * [Move2Hear: Active Audio-Visual Source Separation](https://openaccess.thecvf.com/content/ICCV2021/papers/Majumder_Move2Hear_Active_Audio-Visual_Source_Separation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/SAGNIKMJR/move2hear-active-AV-separation):house:[project](http://vision.cs.utexas.edu/projects/move2hear/)

* Audio-Visual Floorplan Reconstruction

  * [Audio-Visual Floorplan Reconstruction](https://openaccess.thecvf.com/content/ICCV2021/papers/Purushwalkam_Audio-Visual_Floorplan_Reconstruction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/senthilps8/avmap):house:[project](http://www.cs.cmu.edu/~spurushw/publication/avmap/):tv:[video](https://youtu.be/wRslVfd1hOI)



## 31.Style Transfer

* [AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer](https://arxiv.org/abs/2108.03647)
:star:[code](https://github.com/Huage001/AdaAttN)

* [Domain-Aware Universal Style Transfer](https://arxiv.org/abs/2108.04441)
:star:[code](https://github.com/Kibeom-Hong/Domain-Aware-Style-Transfer)

* [Diverse Image Style Transfer via Invertible Cross-Space Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Diverse_Image_Style_Transfer_via_Invertible_Cross-Space_Mapping_ICCV_2021_paper.pdf)

* [StyleFormer: Real-Time Arbitrary Style Transfer via Parametric Style Composition](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_StyleFormer_Real-Time_Arbitrary_Style_Transfer_via_Parametric_Style_Composition_ICCV_2021_paper.pdf)

* [Manifold Alignment for Semantically Aligned Style Transfer](https://arxiv.org/abs/2005.10777)
:star:[code](https://github.com/NJUHuoJing/MAST)



## 30.Image Generation/Synthesis

* [ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2108.02938)
:open_mouth:oral

* [Image Synthesis via Semantic Composition](https://arxiv.org/abs/2109.07053)
:star:[code](https://github.com/dvlab-research/SCGAN):house:[project](https://shepnerd.github.io/scg/)

* [Image Synthesis From Layout With Locality-Aware Mask Adaption](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Image_Synthesis_From_Layout_With_Locality-Aware_Mask_Adaption_ICCV_2021_paper.pdf)

* Image Fusion

  * [DTMNet: A Discrete Tchebichef Moments-Based Deep Neural Network for Multi-Focus Image Fusion](https://openaccess.thecvf.com/content/ICCV2021/papers/Xiao_DTMNet_A_Discrete_Tchebichef_Moments-Based_Deep_Neural_Network_for_Multi-Focus_ICCV_2021_paper.pdf)



## 29.Image Retrieval

* [DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features](https://arxiv.org/abs/2108.02927)
:star:[code](https://github.com/feymanpriv/DOLG)

* [Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models](https://arxiv.org/abs/2108.04024)
:star:[code](https://github.com/Cuberick-Orion/CIRR):house:[project](https://cuberick-orion.github.io/CIRR/)

* [Self-supervised Product Quantization for Deep Unsupervised Image Retrieval](https://arxiv.org/abs/2109.02244)
:star:[code](https://github.com/youngkyunJang/SPQ)

* [Instance-Level Image Retrieval Using Reranking Transformers](https://arxiv.org/abs/2103.12236)
:star:[code](https://github.com/uvavision/RerankingTransformer)

* [Learning Attribute-Driven Disentangled Representations for Interactive Fashion Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Hou_Learning_Attribute-Driven_Disentangled_Representations_for_Interactive_Fashion_Retrieval_ICCV_2021_paper.pdf)
:star:[code](https://github.com/amzn/fashion-attribute-disentanglement)

* [Telling the What While Pointing to the Where: Multimodal Queries for Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Changpinyo_Telling_the_What_While_Pointing_to_the_Where_Multimodal_Queries_ICCV_2021_paper.pdf)

* [Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval](https://arxiv.org/abs/2104.00650)

* [Learning Deep Local Features With Multiple Dynamic Attentions for Large-Scale Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Learning_Deep_Local_Features_With_Multiple_Dynamic_Attentions_for_Large-Scale_ICCV_2021_paper.pdf)
:star:[code](https://github.com/CHANWH/MDA)

* [Bayesian Triplet Loss: Uncertainty Quantification in Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Warburg_Bayesian_Triplet_Loss_Uncertainty_Quantification_in_Image_Retrieval_ICCV_2021_paper.pdf)

* Cross-Domain Retrieval

  * [Universal Cross-Domain Retrieval: Generalizing Across Classes and Domains](https://arxiv.org/abs/2108.08356)

* Visual Geolocalization

  * [Viewpoint Invariant Dense Matching for Visual Geolocalization](https://arxiv.org/abs/2109.09827)
:star:[code](https://github.com/gmberton/geo_warp)

* Cross-Modal Retrieval

  * [Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval With Partial Query](https://openaccess.thecvf.com/content/ICCV2021/papers/Cai_AskConfirm_Active_Detail_Enriching_for_Cross-Modal_Retrieval_With_Partial_Query_ICCV_2021_paper.pdf)
:star:[code](https://github.com/CuthbertCai/Ask-Confirm)

  * [Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining](https://arxiv.org/abs/2107.14572)
:star:[code](https://github.com/zhanxlin/Product1M)

  * [Wasserstein Coupled Graph Learning for Cross-Modal Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Wasserstein_Coupled_Graph_Learning_for_Cross-Modal_Retrieval_ICCV_2021_paper.pdf)

  * [Adversarial Attack on Deep Cross-Modal Hamming Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Adversarial_Attack_on_Deep_Cross-Modal_Hamming_Retrieval_ICCV_2021_paper.pdf)

* Text-Video Retrieval

  * [TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval](https://arxiv.org/abs/2104.08271)
:house:[project](https://www.robots.ox.ac.uk/~vgg/research/teachtext/)

* Video-Text Retrieval

  * [HiT: Hierarchical Transformer With Momentum Contrast for Video-Text Retrieval](https://arxiv.org/abs/2103.15049)

* Image-Based 3D Shape Retrieval

  * [Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Lin_Single_Image_3D_Shape_Retrieval_via_Cross-Modal_Instance_and_Category_ICCV_2021_paper.pdf)

* Nearest Neighbor Search

  * [Product Quantizer Aware Inverted Index for Scalable Nearest Neighbor Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Noh_Product_Quantizer_Aware_Inverted_Index_for_Scalable_Nearest_Neighbor_Search_ICCV_2021_paper.pdf)



## 28.Contrastive Learning

* [Improving Contrastive Learning by Visualizing Feature Transformation](https://arxiv.org/abs/2108.02982)
:open_mouth:oral:star:[code](https://github.com/DTennant/CL-Visualizing-Feature-Transformation)

* [TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment](https://arxiv.org/abs/2108.09980)
:newspaper:explainer:[ICCV 2021: TACo, a token-aware cascade contrastive learning method from Microsoft and CMU that outperforms other SOTA methods on video-text alignment](https://mp.weixin.qq.com/s/sNwvYL1qsgyVrRe3-QmzhA)

* [A Broad Study on the Transferability of Visual Representations With Contrastive Learning](https://arxiv.org/abs/2103.13517)
:star:[code](https://github.com/asrafulashiq/transfer_broad)

* [Vi2CLR: Video and Image for Visual Contrastive Learning of Representation](https://openaccess.thecvf.com/content/ICCV2021/papers/Diba_Vi2CLR_Video_and_Image_for_Visual_Contrastive_Learning_of_Representation_ICCV_2021_paper.pdf)

* [LatentCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions](https://arxiv.org/abs/2104.00820)
:star:[code](https://github.com/catlab-team/latentclr)

* [CrossCLR: Cross-Modal Contrastive Learning for Multi-Modal Video Representations](https://arxiv.org/abs/2109.14910)

* [Social NCE: Contrastive Learning of Socially-Aware Motion Representations](https://arxiv.org/abs/2012.11717)
:star:[code](https://github.com/vita-epfl/social-nce):tv:[video](https://www.youtube.com/watch?v=s1khZWWiQfA)

* [With a Little Help From My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations](https://arxiv.org/abs/2104.14548)

* [Contrastive Learning of Image Representations With Cross-Video Cycle-Consistency](https://arxiv.org/abs/2105.06463)
:house:[project](https://happywu.github.io/cycle_contrast_video/)

* [Weakly Supervised Contrastive Learning](https://arxiv.org/abs/2110.04770)



## 27.Multi-Label Image Recognition

* [Residual Attention: A Simple but Effective Method for Multi-Label Recognition](https://arxiv.org/abs/2108.02456)
:star:[code](https://github.com/Kevinz-code/CSRA)

* [Transformer-based Dual Relation Graph for Multi-label Image Recognition](https://arxiv.org/abs/2110.04722)



## 26.Image Processing

* [Aligning Latent and Image Spaces to Connect the Unconnectable](https://arxiv.org/abs/2104.06954)
:star:[code](https://github.com/universome/alis):house:[project](https://universome.github.io/alis)

* Image Shape Manipulation

  * [Image Shape Manipulation from a Single Augmented Training Sample](https://arxiv.org/abs/2109.06151)
:open_mouth:oral:star:[code](https://github.com/eliahuhorwitz/DeepSIM):house:[project](http://www.vision.huji.ac.il/deepsim/)

* Edge Detection

  * [RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth](https://arxiv.org/abs/2108.00616)
:open_mouth:oral:star:[code](https://github.com/MengyangPu/RINDNet)

  * [Pixel Difference Networks for Efficient Edge Detection](https://arxiv.org/abs/2108.07009)
:star:[code](https://github.com/zhuoinoulu/pidinet)

* Image Recognition

  * [MicroNet: Improving Image Recognition with Extremely Low FLOPs](https://arxiv.org/abs/2108.05894)
:star:[code](https://github.com/liyunsheng13/micronet)

* Image Deblurring

  * [Rethinking Coarse-to-Fine Approach in Single Image Deblurring](https://arxiv.org/abs/2108.05054)
:star:[code](https://github.com/chosj95/MIMO-UNet)

  * [Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions](https://arxiv.org/abs/2108.09108)

  * [Defocus Map Estimation and Deblurring From a Single Dual-Pixel Image](https://openaccess.thecvf.com/content/ICCV2021/papers/Xin_Defocus_Map_Estimation_and_Deblurring_From_a_Single_Dual-Pixel_Image_ICCV_2021_paper.pdf)

  * [Motion Deblurring with Real Events](https://arxiv.org/abs/2109.13695)

  * [Pyramid Architecture Search for Real-Time Image Deblurring](https://openaccess.thecvf.com/content/ICCV2021/papers/Hu_Pyramid_Architecture_Search_for_Real-Time_Image_Deblurring_ICCV_2021_paper.pdf)

  * Motion Deblurring

    * [Perceptual Variousness Motion Deblurring With Light Global Context Refinement](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Perceptual_Variousness_Motion_Deblurring_With_Light_Global_Context_Refinement_ICCV_2021_paper.pdf)

* Video Deblurring

  * [Bringing Events Into Video Deblurring With Non-Consecutively Blurry Frames](https://openaccess.thecvf.com/content/ICCV2021/papers/Shang_Bringing_Events_Into_Video_Deblurring_With_Non-Consecutively_Blurry_Frames_ICCV_2021_paper.pdf)
:star:[code](https://github.com/shangwei5/D2Net)

* Image Quality Assessment (IQA)

  * [MUSIQ: Multi-scale Image Quality Transformer](https://arxiv.org/abs/2108.05997)
:star:[code](https://github.com/google-research/google-research/tree/master/musiq)

  * [Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment](https://arxiv.org/abs/2108.07948)
:star:[code](https://github.com/researchmm/CKDN)

* Image Harmonization

  * [SSH: A Self-Supervised Framework for Image Harmonization](https://arxiv.org/abs/2108.06805)
:star:[code](https://github.com/VITA-Group/SSHarmonization)

* Shadow Removal

  * [CANet: A Context-Aware Network for Shadow Removal](https://arxiv.org/abs/2108.09894)
:star:[code](https://github.com/Zipei-Chen/CANet)

  * [DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised Domain-Classifier Guided Network](https://openaccess.thecvf.com/content/ICCV2021/papers/Jin_DC-ShadowNet_Single-Image_Hard_and_Soft_Shadow_Removal_Using_Unsupervised_Domain-Classifier_ICCV_2021_paper.pdf)

* Denoising

  * [Rethinking Deep Image Prior for Denoising](https://arxiv.org/abs/2108.12841)
:star:[code](https://github.com/gistvision/DIP-denosing)

  * [Rethinking Noise Synthesis and Modeling in Raw Denoising](https://arxiv.org/abs/2110.04756)
:star:[code](https://github.com/zhangyi-3/noise-synthesis)

  * [C2N: Practical Generative Noise Modeling for Real-World Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Jang_C2N_Practical_Generative_Noise_Modeling_for_Real-World_Denoising_ICCV_2021_paper.pdf)

  * [The Benefit of Distraction: Denoising Camera-Based Physiological Measurements Using Inverse Attention](https://openaccess.thecvf.com/content/ICCV2021/papers/Nowara_The_Benefit_of_Distraction_Denoising_Camera-Based_Physiological_Measurements_Using_Inverse_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ewanowara/benefitofdistraction)

  * [Hyperspectral Image Denoising with Realistic Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Hyperspectral_Image_Denoising_With_Realistic_Data_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ColinTaoZhang/HSIDwRD)

  * [End-to-End Unsupervised Document Image Blind Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Gangeh_End-to-End_Unsupervised_Document_Image_Blind_Denoising_ICCV_2021_paper.pdf)

  * [Cross-Patch Graph Convolutional Network for Image Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Cross-Patch_Graph_Convolutional_Network_for_Image_Denoising_ICCV_2021_paper.pdf)

  * Video Denoising

    * [Patch Craft: Video Denoising by Deep Modeling and Patch Matching](https://arxiv.org/abs/2103.13767)

* Image Colorization

  * [Towards Vivid and Diverse Image Colorization with Generative Color Prior](https://arxiv.org/abs/2108.08826)

  * [Deep Edge-Aware Interactive Colorization Against Color-Bleeding Effects](https://arxiv.org/abs/2107.01619)
:open_mouth:oral:house:[project](https://eungyeupkim.github.io/edge-enhancing-colorization/)

* Image Enhancement

  * [Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables](https://arxiv.org/abs/2108.08697)

  * [Adaptive Unfolding Total Variation Network for Low-Light Image Enhancement](https://arxiv.org/abs/2110.00984)
:star:[code](https://github.com/CharlieZCJ/UTVNet)

  * [Representative Color Transform for Image Enhancement](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Representative_Color_Transform_for_Image_Enhancement_ICCV_2021_paper.pdf)

  * [STAR: A Structure-Aware Lightweight Transformer for Real-Time Image Enhancement](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_STAR_A_Structure-Aware_Lightweight_Transformer_for_Real-Time_Image_Enhancement_ICCV_2021_paper.pdf)

  * [Deep Symmetric Network for Underexposed Image Enhancement With Recurrent Attentional Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhao_Deep_Symmetric_Network_for_Underexposed_Image_Enhancement_With_Recurrent_Attentional_ICCV_2021_paper.pdf)
:star:[code](https://github.com/lin-zhao-resoLve/Deep-Symmetric-Network-Enhancement):house:[project](https://www.shaopinglu.net/proj-iccv21/ImageEnhancement.html)

  * [StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement](https://arxiv.org/abs/2107.12898)

* Image Restoration

  * [Spatially-Adaptive Image Restoration using Distortion-Guided Networks](https://arxiv.org/abs/2108.08617)
:star:[code](https://github.com/human-analysis/spatially-adaptive-image-restoration)

  * [Dynamic Attentive Graph Learning for Image Restoration](https://arxiv.org/abs/2109.06620)
:star:[code](https://github.com/jianzhangcs/DAGL)

  * [Self-Supervised Cryo-Electron Tomography Volumetric Image Restoration From Single Noisy Volume With Sparsity Constraint](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Self-Supervised_Cryo-Electron_Tomography_Volumetric_Image_Restoration_From_Single_Noisy_Volume_ICCV_2021_paper.pdf)
:star:[code](https://github.com/icthrm/SC-Net)

  * [Searching for Controllable Image Restoration Networks](https://arxiv.org/abs/2012.11225)
:star:[code](https://github.com/ghimhw)

* Image Compression

  * [Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform](https://arxiv.org/abs/2108.09551)
:star:[code](https://github.com/micmic123/QmapCompression)

  * [Neural Image Compression via Attentional Multi-Scale Back Projection and Frequency Decomposition](https://openaccess.thecvf.com/content/ICCV2021/papers/Gao_Neural_Image_Compression_via_Attentional_Multi-Scale_Back_Projection_and_Frequency_ICCV_2021_paper.pdf)

* Image Inpainting

  * [Image Inpainting via Conditional Texture and Structure Dual Generation](https://arxiv.org/abs/2108.09760)
:star:[code](https://github.com/Xiefan-Guo/CTSDG)

  * [CR-Fill: Generative Image Inpainting With Auxiliary Contextual Reconstruction](https://openaccess.thecvf.com/content/ICCV2021/papers/Zeng_CR-Fill_Generative_Image_Inpainting_With_Auxiliary_Contextual_Reconstruction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zengxianyu/crfill)

  * [Parallel Multi-Resolution Fusion Network for Image Inpainting](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Parallel_Multi-Resolution_Fusion_Network_for_Image_Inpainting_ICCV_2021_paper.pdf)

  * [Painting from Part](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Painting_From_Part_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zhenglab/partpainting)

  * [WaveFill: A Wavelet-Based Generation Network for Image Inpainting](https://arxiv.org/abs/2107.11027)

  * [Distillation-Guided Image Inpainting](https://openaccess.thecvf.com/content/ICCV2021/papers/Suin_Distillation-Guided_Image_Inpainting_ICCV_2021_paper.pdf)

  * [Learning a Sketch Tensor Space for Image Inpainting of Man-made Scenes](https://arxiv.org/abs/2103.15087)
:star:[code](https://github.com/ewrfcas/MST_inpainting):house:[project](https://ewrfcas.github.io/MST_inpainting/)

* Image extrapolation

  * [SemIE: Semantically-aware Image Extrapolation](https://arxiv.org/abs/2108.13702)
:house:[project](https://semie-iccv.github.io/)

* Reversible Image Conversion

  * [IICNet: A Generic Framework for Reversible Image Conversion](https://arxiv.org/abs/2109.04242)
:star:[code](https://github.com/felixcheng97/IICNet)

* Artifact Removal

  * [Towards Flexible Blind JPEG Artifacts Removal](https://arxiv.org/abs/2109.14573)
:star:[code](https://github.com/jiaxi-jiang/FBCNN)

  * [Learning Dual Priors for JPEG Compression Artifacts Removal](https://openaccess.thecvf.com/content/ICCV2021/papers/Fu_Learning_Dual_Priors_for_JPEG_Compression_Artifacts_Removal_ICCV_2021_paper.pdf)

  * [Let's See Clearly: Contaminant Artifact Removal for Moving Cameras](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Lets_See_Clearly_Contaminant_Artifact_Removal_for_Moving_Cameras_ICCV_2021_paper.pdf)

* De-rendering

  * [De-rendering Stylized Texts](https://arxiv.org/abs/2110.01890)
:star:[code](https://github.com/CyberAgentAILab/derendering-text):house:[project](https://cyberagentailab.github.io/derendering-text/)

* Flare Removal

  * [Light Source Guided Single-Image Flare Removal From Unpaired Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Qiao_Light_Source_Guided_Single-Image_Flare_Removal_From_Unpaired_Data_ICCV_2021_paper.pdf)

  * [How to Train Neural Networks for Flare Removal](https://arxiv.org/abs/2011.12485)
:house:[project](https://yichengwu.github.io/flare-removal/):tv:[video](https://www.youtube.com/watch?v=eAXhcDjWoZ0)

* Panoramic Stitching

  * [Minimal Solutions for Panoramic Stitching Given Gravity Prior](https://arxiv.org/abs/2012.00465)

* Image Cropping

  * [TransView: Inside, Outside, and Across the Cropping View Boundaries](https://openaccess.thecvf.com/content/ICCV2021/papers/Pan_TransView_Inside_Outside_and_Across_the_Cropping_View_Boundaries_ICCV_2021_paper.pdf)

  * [Dissecting Image Crops](https://arxiv.org/abs/2011.11831)
:star:[code](https://github.com/basilevh/dissecting-image-crops)

* Reflection Removal

  * [Location-Aware Single Image Reflection Removal](https://arxiv.org/abs/2012.07131)
:star:[code](https://github.com/zdlarr/Location-aware-SIRR)

  * [V-DESIRR: Very Fast Deep Embedded Single Image Reflection Removal](https://openaccess.thecvf.com/content/ICCV2021/papers/Prasad_V-DESIRR_Very_Fast_Deep_Embedded_Single_Image_Reflection_Removal_ICCV_2021_paper.pdf)
:star:[code](https://www.github.com/ee19d005/vdesirr)

* Deraining

  * [Improving De-Raining Generalization via Neural Reorganization](https://openaccess.thecvf.com/content/ICCV2021/papers/Xiao_Improving_De-Raining_Generalization_via_Neural_Reorganization_ICCV_2021_paper.pdf)

  * [Unpaired Learning for Deep Image Deraining With Rain Direction Regularizer](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Unpaired_Learning_for_Deep_Image_Deraining_With_Rain_Direction_Regularizer_ICCV_2021_paper.pdf)
:house:[project](https://lewisyangliu.github.io/projects/UDRDR/)

  * [Structure-Preserving Deraining With Residue Channel Prior Guidance](https://arxiv.org/abs/2108.09079)
:star:[code](https://github.com/Joyies/SPDNet)

* Image Distortion Removal

  * [Unsupervised Non-Rigid Image Distortion Removal via Grid Deformation](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Unsupervised_Non-Rigid_Image_Distortion_Removal_via_Grid_Deformation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/Nianyi-Li/unsupervised-NDIR):tv:[video](https://www.youtube.com/watch?v=aeJkb5u0Cb8)

* Refractive Distortion Removal for Underwater Images

  * [Learning To Remove Refractive Distortions From Underwater Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Thapa_Learning_To_Remove_Refractive_Distortions_From_Underwater_Images_ICCV_2021_paper.pdf)

* Image Completion

  * [High-Fidelity Pluralistic Image Completion With Transformers](https://arxiv.org/abs/2103.14031)
:star:[code](https://github.com/raywzy/ICT):house:[project](http://raywzy.com/ICT/)

* Image Decomposition

  * [Unsupervised Layered Image Decomposition into Object Prototypes](https://arxiv.org/abs/2104.14575)

* Distortion Rectification

  * [Towards Complete Scene and Regular Shape for Distortion Rectification by Curve-Aware Extrapolation](https://openaccess.thecvf.com/content/ICCV2021/papers/Liao_Towards_Complete_Scene_and_Regular_Shape_for_Distortion_Rectification_by_ICCV_2021_paper.pdf)

* HDR

  * [Unpaired Learning for High Dynamic Range Image Tone Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Vinker_Unpaired_Learning_for_High_Dynamic_Range_Image_Tone_Mapping_ICCV_2021_paper.pdf)

  * Ultra-High-Definition Image HDR Reconstruction

    * [Ultra-High-Definition Image HDR Reconstruction via Collaborative Bilateral Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Zheng_Ultra-High-Definition_Image_HDR_Reconstruction_via_Collaborative_Bilateral_Learning_ICCV_2021_paper.pdf)

* Image Desnowing

  * [ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-Tree Complex Wavelet Representation and Contradict Channel Loss](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_ALL_Snow_Removed_Single_Image_Desnowing_Algorithm_Using_Hierarchical_Dual-Tree_ICCV_2021_paper.pdf)
:star:[code](https://github.com/weitingchen83/ICCV2021-Single-Image-Desnowing-HDCWNet)

* Image Harmonization

  * [Image Harmonization With Transformer](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Image_Harmonization_With_Transformer_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zhenglab/HarmonyTransformer)

* Image Editing

  * [Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism](https://openaccess.thecvf.com/content/ICCV2021/papers/Jiang_Language-Guided_Global_Image_Editing_via_Cross-Modal_Cyclic_Mechanism_ICCV_2021_paper.pdf)

* Image Hiding

  * [HiNet: Deep Image Hiding by Invertible Network](https://openaccess.thecvf.com/content/ICCV2021/papers/Jing_HiNet_Deep_Image_Hiding_by_Invertible_Network_ICCV_2021_paper.pdf)
:star:[code](https://github.com/TomTomTommi/HiNet)



## 25.Medical Image

* [Equivariant Imaging: Learning Beyond the Range Space](https://arxiv.org/abs/2103.14756)
:open_mouth:oral:star:[code](https://github.com/edongdongchen/EI)

* [Deep Survival Analysis With Longitudinal X-Rays for COVID-19](https://openaccess.thecvf.com/content/ICCV2021/papers/Shu_Deep_Survival_Analysis_With_Longitudinal_X-Rays_for_COVID-19_ICCV_2021_paper.pdf)

* Medical Image Segmentation

  * [Recurrent Mask Refinement for Few-Shot Medical Image Segmentation](https://arxiv.org/abs/2108.00622)
:star:[code](https://github.com/uci-cbcl/RP-Net)

  * [Graph-BAS3Net: Boundary-Aware Semi-Supervised Segmentation Network With Bilateral Graph Convolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_Graph-BAS3Net_Boundary-Aware_Semi-Supervised_Segmentation_Network_With_Bilateral_Graph_Convolution_ICCV_2021_paper.pdf)

  * Lesion Segmentation

    * [T-AutoML: Automated Machine Learning for Lesion Segmentation Using Transformers in 3D Medical Imaging](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_T-AutoML_Automated_Machine_Learning_for_Lesion_Segmentation_Using_Transformers_in_ICCV_2021_paper.pdf)

  * Polyp Segmentation

    * [Collaborative and Adversarial Learning of Focused and Dispersive Representations for Semi-Supervised Polyp Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Collaborative_and_Adversarial_Learning_of_Focused_and_Dispersive_Representations_for_ICCV_2021_paper.pdf)

  * Vessel Segmentation

    * [Self-Supervised Vessel Segmentation via Adversarial Learning](https://github.com/AISIGSJTU/SSVS)

  * Brain Tumor Segmentation

    * [RFNet: Region-Aware Fusion Network for Incomplete Multi-Modal Brain Tumor Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Ding_RFNet_Region-Aware_Fusion_Network_for_Incomplete_Multi-Modal_Brain_Tumor_Segmentation_ICCV_2021_paper.pdf)

* Pathology Image Representation

  * [A QuadTree Image Representation for Computational Pathology](https://arxiv.org/abs/2108.10873)

* Medical Image Analysis

  * [Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts](https://arxiv.org/abs/2109.04379)
:star:[code](https://github.com/Luchixiang/PCRL)
:newspaper:explainer:[ICCV 2021: works for both 2D and 3D! A new self-supervised SOTA for medical imaging (code released)](https://mp.weixin.qq.com/s/mM0ddlImo87a8tDkRbsfHg)

* Medical Image Denoising

  * [Eformer: Edge Enhancement based Transformer for Medical Image Denoising](https://arxiv.org/abs/2109.08044)

* Video Translation

  * [Long-Term Temporally Consistent Unpaired Video Translation From Simulated Surgical 3D Data](https://arxiv.org/abs/2103.17204)
:star:[code](https://gitlab.com/nct_tso_public/surgical-video-sim2real):house:[project](http://opencas.dkfz.de/video-sim2real/)

* Nuclei Detection and Segmentation in Pathology Images

  * [Mutual-Complementing Framework for Nuclei Detection and Segmentation in Pathology Image](https://openaccess.thecvf.com/content/ICCV2021/papers/Feng_Mutual-Complementing_Framework_for_Nuclei_Detection_and_Segmentation_in_Pathology_Image_ICCV_2021_paper.pdf)

* Medical Report Generation

  * [Visual-Textual Attentive Semantic Consistency for Medical Report Generation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_Visual-Textual_Attentive_Semantic_Consistency_for_Medical_Report_Generation_ICCV_2021_paper.pdf)

* CT

  * [3DeepCT: Learning Volumetric Scattering Tomography of Clouds](https://openaccess.thecvf.com/content/ICCV2021/papers/Sde-Chen_3DeepCT_Learning_Volumetric_Scattering_Tomography_of_Clouds_ICCV_2021_paper.pdf)

  * [IntraTomo: Self-Supervised Learning-Based Tomography via Sinogram Synthesis and Prediction](https://openaccess.thecvf.com/content/ICCV2021/papers/Zang_IntraTomo_Self-Supervised_Learning-Based_Tomography_via_Sinogram_Synthesis_and_Prediction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/vccimaging/IntraTomo)

  * CT Reconstruction

    * [Dynamic CT Reconstruction From Limited Views With Implicit Neural Representations and Parametric Motion Fields](https://arxiv.org/abs/2104.11745)
:star:[code](https://github.com/awreed/DynamicCTReconstruction)

* Medical Image Recognition

  * [GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_GLoRIA_A_Multimodal_Global-Local_Representation_Learning_Framework_for_Label-Efficient_Medical_ICCV_2021_paper.pdf)
:star:[code](https://github.com/marshuang80/gloria)

* Medical Image Classification

  * [Big Self-Supervised Models Advance Medical Image Classification](https://arxiv.org/abs/2101.05224)

  * [Large-Scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification](https://arxiv.org/abs/2012.03173)
:star:[code](https://github.com/Optimization-AI/LibAUC)



## 24.Face

* [VariTex: Variational Neural Face Textures](https://arxiv.org/abs/2104.05988)
:star:[code](https://github.com/mcbuehler/VariTex):house:[project](https://mcbuehler.github.io/VariTex/):tv:[video](https://www.youtube.com/watch?v=6-GFHcLkbik)

* Face Forgery Detection

  * [OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild](https://arxiv.org/abs/2107.14480)
:house:[project](https://sites.google.com/view/ltnghia/research/openforensics)

  * [Exploring Temporal Coherence for More General Video Face Forgery Detection](https://arxiv.org/abs/2108.06693)

* Face Synthesis

  * [Disentangled Lifespan Face Synthesis](https://arxiv.org/abs/2108.02874)
:star:[code](https://github.com/SenHe/DLFS):house:[project](https://senhe.github.io/projects/iccv_2021_lifespan_face/):tv:[video](https://youtu.be/uklX03ns0m0)

* Face Recognition

  * [PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition](https://arxiv.org/abs/2108.03764)

  * [SynFace: Face Recognition with Synthetic Data](https://arxiv.org/abs/2108.07960)

  * [Adaptive Label Noise Cleaning With Meta-Supervision for Deep Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Adaptive_Label_Noise_Cleaning_With_Meta-Supervision_for_Deep_Face_Recognition_ICCV_2021_paper.pdf)

  * [Disentangled Representation for Age-Invariant Face Recognition: A Mutual Information Minimization Perspective](https://openaccess.thecvf.com/content/ICCV2021/papers/Hou_Disentangled_Representation_for_Age-Invariant_Face_Recognition_A_Mutual_Information_Minimization_ICCV_2021_paper.pdf)

  * [Teacher-Student Adversarial Depth Hallucination To Improve Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Uppal_Teacher-Student_Adversarial_Depth_Hallucination_To_Improve_Face_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/hardik-uppal/teacher-student-gan)

  * [DAM: Discrepancy Alignment Metric for Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_DAM_Discrepancy_Alignment_Metric_for_Face_Recognition_ICCV_2021_paper.pdf)

  * De-Identification

    * [Personalized and Invertible Face De-Identification by Disentangled Identity Information Manipulation](https://openaccess.thecvf.com/content/ICCV2021/papers/Cao_Personalized_and_Invertible_Face_De-Identification_by_Disentangled_Identity_Information_Manipulation_ICCV_2021_paper.pdf)

* Face Perception

  * [Learning Facial Representations from the Cycle-consistency of Face](https://arxiv.org/abs/2108.03427)
:star:[code](https://github.com/JiaRenChang/FaceCycle)

* Talking Face Generation

  * [FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning](https://arxiv.org/abs/2108.07938)

* Talking Head Synthesis

  * [AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_AD-NeRF_Audio_Driven_Neural_Radiance_Fields_for_Talking_Head_Synthesis_ICCV_2021_paper.pdf)
:star:[code](https://github.com/YudongGuo/AD-NeRF)

  * [Learned Spatial Representations for Few-Shot Talking-Head Synthesis](https://arxiv.org/abs/2104.14557)
:star:[code](https://github.com/MoustafaMeshry/lsr):house:[project](http://www.cs.umd.edu/~mmeshry/projects/lsr/)

* Facial Expression Recognition

  * [Understanding and Mitigating Annotation Bias in Facial Expression Recognition](https://arxiv.org/abs/2108.08504)

  * [TransFER: Learning Relation-aware Facial Expression Representations with Transformers](https://arxiv.org/abs/2108.11116)

* Face Presentation Attack Detection

  * [Detection and Continual Learning of Novel Face Presentation Attacks](https://arxiv.org/abs/2108.12081)
:star:[code](https://github.com/mrostami1366)

* Face Editing

  * [Talk-to-Edit: Fine-Grained Facial Editing via Dialog](https://arxiv.org/abs/2109.04425)
:star:[code](https://github.com/yumingj/Talk-to-Edit):house:[project](https://www.mmlab-ntu.com/project/talkedit/)
:newspaper:explainer:[ICCV 2021 | NTU and CUHK propose Talk-to-Edit, fine-grained face editing through dialog](https://mp.weixin.qq.com/s/48FsUqsppXaXUu-QMUIhCQ)

  * [A Latent Transformer for Disentangled Face Editing in Images and Videos](https://arxiv.org/abs/2106.11895)
:star:[code](https://github.com/InterDigitalInc/latent-transformer)

* Face Alignment

  * [ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment](https://arxiv.org/abs/2109.05721) 

* Face Image Reconstruction

  * [Focal Frequency Loss for Image Reconstruction and Synthesis](https://arxiv.org/abs/2012.12821)
:star:[code](https://github.com/EndlessSora/focal-frequency-loss):house:[project](https://www.mmlab-ntu.com/project/ffl/index.html):tv:[video](https://www.youtube.com/watch?v=RNTnDtKvcpc)

  * [Towards High Fidelity Monocular Face Reconstruction With Rich Reflectance Using Self-Supervised Learning and Ray Tracing](https://arxiv.org/abs/2103.15432)

  * [Neural Photofit: Gaze-Based Mental Image Reconstruction](https://arxiv.org/abs/2108.07524)
:house:[project](https://perceptualui.org/publications/strohm21_iccv/)

* 3D Face Reconstruction

  * [Topologically Consistent Multi-View Face Inference Using Volumetric Sampling](https://arxiv.org/abs/2110.02948)
:house:[project](https://tianyeli.github.io/tofu)

  * [Self-Supervised 3D Face Reconstruction via Conditional Estimation](https://arxiv.org/abs/2110.04800)

* 3D Face Animation

  * [MeshTalk: 3D Face Animation From Speech Using Cross-Modality Disentanglement](https://openaccess.thecvf.com/content/ICCV2021/papers/Richard_MeshTalk_3D_Face_Animation_From_Speech_Using_Cross-Modality_Disentanglement_ICCV_2021_paper.pdf)
:star:[code](https://github.com/facebookresearch/meshtalk):tv:[video](https://research.fb.com/wp-content/uploads/2021/04/mesh_talk.mp4)

* Remote Photoplethysmography (rPPG)

  * [The Way to My Heart Is Through Contrastive Learning: Remote Photoplethysmography From Unlabelled Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Gideon_The_Way_to_My_Heart_Is_Through_Contrastive_Learning_Remote_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ToyotaResearchInstitute/RemotePPG)

* Face Encryption

  * [Towards Face Encryption by Generating Adversarial Identity Masks](https://arxiv.org/abs/2003.06814)
:star:[code](https://github.com/ShawnXYang/TIP-IM)

* Deepfake Detection

  * [Learning Self-Consistency for Deepfake Detection](https://arxiv.org/abs/2012.09311)
:open_mouth:oral

  * [Joint Audio-Visual Deepfake Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_Joint_Audio-Visual_Deepfake_Detection_ICCV_2021_paper.pdf)

  * [Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data](https://arxiv.org/abs/2007.08457)
:open_mouth:oral

* Face Texture Completion

  * [Learning High-Fidelity Face Texture Completion Without Complete Face Texture](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Learning_High-Fidelity_Face_Texture_Completion_Without_Complete_Face_Texture_ICCV_2021_paper.pdf)

* Facial Action Unit Detection

  * [PIAP-DF: Pixel-Interested and Anti Person-Specific Facial Action Unit Detection Net With Discrete Feedback Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Tang_PIAP-DF_Pixel-Interested_and_Anti_Person-Specific_Facial_Action_Unit_Detection_Net_ICCV_2021_paper.pdf)

* Face Analysis

  * [Fake I