
# ICCV 2021: Latest Information and Accepted Papers/Code



Official website: http://iccv2021.thecvf.com/home

Conference dates: October 11-17, 2021

# :exclamation::exclamation::exclamation::star2::star2::star2:📗📗📗All papers accepted to ICCV 2021 have been released, 1612 in total. To download them, reply "paper" to the 我爱计算机视觉 (I Love Computer Vision) official account.

# :exclamation::exclamation::exclamation::star2::star2::star2:All papers have been roughly categorized; please browse the lists below

## Categorized survey papers from previous years: ↘️[CV-Surveys](https://github.com/52CV/CV-Surveys) (under construction)

## Categorized 2022 papers
↘️[CVPR-2022-Papers](https://github.com/52CV/CVPR-2022-Papers)
↘️[WACV-2022-Papers](https://github.com/52CV/WACV-2022-Papers)

## Categorized 2021 papers
↘️[ICCV-2021-Papers](https://github.com/52CV/ICCV-2021-Papers)
↘️[CVPR-2021-Papers](https://github.com/52CV/CVPR-2021-Papers)

## Categorized 2020 papers
↘️[CVPR-2020-Papers](https://github.com/52CV/CVPR-2020-Papers)
↘️[ECCV-2020-Papers](https://github.com/52CV/ECCV-2020-Papers)

# Contents

|:dog:|:mouse:|:hamster:|:tiger:|
|------|------|------|------|
|[65.Optical Flow Estimation](#65)| | | |
|[61.Metric Learning](#61)|[62.Open-Set Recognition](#62)|[63.Data Augmentation](#63)|[64.Anomaly Detection](#64)|
|[57.Image Matching](#57)|[58.Computational Photography (Optics, Geometry, Light Field Imaging)](#58)|[59.Graph Neural Networks](#59)|[60.Federated Learning](#60)|
|[53.Vision Localization](#53)|[54.Sketch recognition](#54)|[55.Activity Recognition](#55)|[56.Dataset](#56)|
|[49.Human-Object Interaction](#49)|[50.Continual Learning](#50)|[51.View Synthesis](#51)|[52.Vision-and-Language](#52)|
|[45.Image Caption](#45)|[46.Defect Detection](#46)|[47.NAS](#47)|[48.6DoF](#48)|
|[41.Out-of-Distribution Detection (OOD)](#41)|[42.Visual Representation Learning](#42)|[43.Dense Prediction](#43)|[44.Human motion prediction](#44)|
|[37.Multitask Learning](#37)|[38.Weakly/Semi-Supervised/Self-supervised/Unsupervised Learning](#38)|[39.Incremental Learning](#39)|[40.Metric Learning](#40)|
|[33.Remote Sensing Images](#33)|[34.Image Super-Resolution](#34)|[35.Quantization/Pruning/Knowledge Distillation/Model Compression (incl. Expansion and Optimization)](#35)|[36.SLAM/AR/VR/Robotics](#36)|
|[29.Image Retrieval](#29)|[30.Image Generation/synthesis](#30)|[31.Style Transfer](#31)|[32.Speech](#32)|
|[25.Medical Image](#25)|[26.Image Processing](#26)|[27.Multi-label image recognition](#27)|[28.Contrastive Learning](#28)|
|[21.Active Learning](#21)|[22.GAN](#22)|[23.Gaze Estimation](#23)|[24.Face](#24)|
|[17.3D Vision](#17)|[18.Transformers](#18)|[19.Self-Driving Vehicles](#19)|[20.Adversarial Learning](#20)|
|[13.Image Segmentation](#13)|[14.Object Detection](#14)|[15.Object Tracking](#15)|[16.Re-Identification](#16)|
|[9.Video](#9)|[10.OCR](#10)|[11.Visual Question Answering](#11)|[12.Image/Fine-Grained Classification](#12)|
|[5.Few-Shot/Zero-Shot Learning; Domain Generalization/Adaptation](#5)|[6.Point Cloud](#6)|[7.Scene Graph Generation](#7)|[8.Human Pose Estimation](#8)|
|[1.Other](#1)|[2.Sign Language](#2)|[3.Image Clustering](#3)|[4.Neural rendering](#4)|

## 65.Optical Flow Estimation
* [Separable Flow: Learning Motion Cost Volumes for Optical Flow Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Separable_Flow_Learning_Motion_Cost_Volumes_for_Optical_Flow_Estimation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/feihuzhang/SeparableFlow)
* [High-Resolution Optical Flow from 1D Attention and Correlation](https://arxiv.org/abs/2104.13918)
:open_mouth:oral:star:[code](https://github.com/haofeixu/flow1d)
* [GyroFlow: Gyroscope-Guided Unsupervised Optical Flow Learning](https://arxiv.org/abs/2103.13725)
:star:[code](https://github.com/megvii-research/GyroFlow)
* [Sensor-Guided Optical Flow](https://arxiv.org/abs/2109.15321)
:star:[code](https://github.com/mattpoggi/sensor-guided-flow)

## 64.Anomaly Detection
* Surface anomaly detection
* [DRÆM – A discriminatively trained reconstruction embedding for surface anomaly detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zavrtanik_DRAEM_-_A_Discriminatively_Trained_Reconstruction_Embedding_for_Surface_Anomaly_ICCV_2021_paper.pdf)
:star:[code](https://github.com/VitjanZ/DRAEM)
* Anomaly detection
* [Weakly Supervised Temporal Anomaly Segmentation with Dynamic Time Warping](https://arxiv.org/abs/2108.06816)
* [Learning Unsupervised Metaformer for Anomaly Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Learning_Unsupervised_Metaformer_for_Anomaly_Detection_ICCV_2021_paper.pdf)
Addresses the classification or localization of image anomalies

## 63.Data Augmentation
* [DivAug: Plug-In Automated Data Augmentation With Explicit Diversity Maximization](https://arxiv.org/abs/2103.14545)
:star:[code](https://github.com/warai-0toko/DivAug)
* [TrivialAugment: Tuning-Free Yet State-of-the-Art Data Augmentation](https://arxiv.org/abs/2103.10158)
:open_mouth:oral:star:[code](https://github.com/automl/trivialaugment)
* [Semantic Aware Data Augmentation for Cell Nuclei Microscopical Images With Artificial Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Naghizadeh_Semantic_Aware_Data_Augmentation_for_Cell_Nuclei_Microscopical_Images_With_ICCV_2021_paper.pdf)
* [A Simple Baseline for Semi-Supervised Semantic Segmentation With Strong Data Augmentation](https://arxiv.org/abs/2104.07256)

## 62.Open-Set Recognition
* [OpenGAN: Open-Set Recognition via Open Data Generation](https://arxiv.org/abs/2104.02939)
:trophy:Best Paper Honorable Mention
* [Conditional Variational Capsule Network for Open Set Recognition](https://arxiv.org/abs/2104.09159)
:star:[code](https://github.com/guglielmocamporese/cvaecaposr)

## 61.Metric Learning
* [Do Different Deep Metric Learning Losses Lead to Similar Learned Features?](https://openaccess.thecvf.com/content/ICCV2021/papers/Kobs_Do_Different_Deep_Metric_Learning_Losses_Lead_to_Similar_Learned_ICCV_2021_paper.pdf)
:star:[code](https://github.com/konstantinkobs/DML-analysis)
* [Learning With Memory-Based Virtual Classes for Deep Metric Learning](https://arxiv.org/abs/2103.16940)
:star:[code](https://github.com/navervision/MemVir)

## 60.Federated Learning
* [Federated Learning for Non-IID Data via Unified Feature Learning and Optimization Objective Alignment](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Federated_Learning_for_Non-IID_Data_via_Unified_Feature_Learning_and_ICCV_2021_paper.pdf)
* [Ensemble Attention Distillation for Privacy-Preserving Federated Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Gong_Ensemble_Attention_Distillation_for_Privacy-Preserving_Federated_Learning_ICCV_2021_paper.pdf)

## 59.Graph Neural Networks
* [Meta-Aggregator: Learning to Aggregate for 1-bit Graph Neural Networks](https://arxiv.org/abs/2109.12872)
* [PoGO-Net: Pose Graph Optimization With Graph Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_PoGO-Net_Pose_Graph_Optimization_With_Graph_Neural_Networks_ICCV_2021_paper.pdf)
:star:[code](https://github.com/xxylii/PoGO-Net)
* [Dynamic Dual Gating Neural Networks](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Dynamic_Dual_Gating_Neural_Networks_ICCV_2021_paper.pdf)
:star:[code](https://github.com/lfr-0531/DGNet)

## 58.Computational Photography (Optics, Geometry, Light Field Imaging)
* [An Asynchronous Kalman Filter for Hybrid Event Cameras](https://arxiv.org/abs/2012.05590)
:star:[code](https://github.com/ziweiWWANG/AKF)
* [4D Cloud Scattering Tomography](https://openaccess.thecvf.com/content/ICCV2021/papers/Ronen_4D_Cloud_Scattering_Tomography_ICCV_2021_paper.pdf)
* Snapshot compressive imaging
* [Dense Deep Unfolding Network with 3D-CNN Prior for Snapshot Compressive Imaging](https://arxiv.org/abs/2109.06548)
:star:[code](https://github.com/jianzhangcs/SCI3D)
* Light field
* [Light Field Saliency Detection with Dual Local Graph Learning and Reciprocative Guidance](https://arxiv.org/abs/2110.00698)
* [Fast Light-Field Disparity Estimation With Multi-Disparity-Scale Cost Aggregation](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_Fast_Light-Field_Disparity_Estimation_With_Multi-Disparity-Scale_Cost_Aggregation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zcong17huang/FastLFnet)
* [SeLFVi: Self-supervised Light-Field Video Reconstruction from Stereo Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Shedligeri_SeLFVi_Self-Supervised_Light-Field_Video_Reconstruction_From_Stereo_Video_ICCV_2021_paper.pdf)
* [SIGNET: Efficient Neural Representation for Light Fields](https://openaccess.thecvf.com/content/ICCV2021/papers/Feng_SIGNET_Efficient_Neural_Representation_for_Light_Fields_ICCV_2021_paper.pdf)
* Light field reconstruction
* [Learning Dynamic Interpolation for Extremely Sparse Light Fields With Wide Baselines](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Learning_Dynamic_Interpolation_for_Extremely_Sparse_Light_Fields_With_Wide_ICCV_2021_paper.pdf)
:star:[code](https://github.com/MantangGuo/DI4SLF)
* Compressive imaging
* [Time-Multiplexed Coded Aperture Imaging: Learned Coded Aperture and Pixel Exposures for Compressive Imaging Systems](https://arxiv.org/abs/2104.02820)
* Homography Estimation
* [LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation](http://arxiv.org/abs/2106.04067)
* Computational imaging
* [Extreme-Quality Computational Imaging via Degradation Framework](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Extreme-Quality_Computational_Imaging_via_Degradation_Framework_ICCV_2021_paper.pdf)
* Optical aberration correction
* [Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution](https://arxiv.org/abs/2104.03078)
:star:[code](https://github.com/leehsiu/UABC)

## 57.Image Matching
* [Matching in the Dark: A Dataset for Matching Image Pairs of Low-light Scenes](https://arxiv.org/abs/2109.03585)
* Feature point matching
* [P2-Net: Joint Description and Detection of Local Features for Pixel and Point Matching](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_P2-Net_Joint_Description_and_Detection_of_Local_Features_for_Pixel_ICCV_2021_paper.pdf)
:star:[code](https://github.com/BingCS/P2-Net)

## 56.Dataset
* [Large Scale Multi-Illuminant (LSMI) Dataset for Developing White Balance Algorithm Under Mixed Illumination](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Large_Scale_Multi-Illuminant_LSMI_Dataset_for_Developing_White_Balance_Algorithm_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/DY112/LSMI-dataset)
* [FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters](https://openaccess.thecvf.com/content/ICCV2021/papers/Cheng_FloW_A_Dataset_and_Benchmark_for_Floating_Waste_Detection_in_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/ORCA-Uboat/FloW-Dataset)
A dataset and benchmark for detecting floating waste in inland waters
* [FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting](https://openaccess.thecvf.com/content/ICCV2021/papers/Fan_FloorPlanCAD_A_Large-Scale_CAD_Drawing_Dataset_for_Panoptic_Symbol_Spotting_ICCV_2021_paper.pdf)
:house:[project](https://floorplancad.github.io/)
* Biomedical images
* [BioFors: A Large Biomedical Image Forensics Dataset](https://arxiv.org/abs/2108.12961)
* 3D reconstruction
* [Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction](https://arxiv.org/abs/2109.00512)
:sunflower:[dataset](https://github.com/facebookresearch/co3d)
* Aerial imagery dataset
* [Beyond Road Extraction: A Dataset for Map Update using Aerial Images](https://arxiv.org/abs/2110.04690)
:star:[code](https://github.com/favyen/muno21):house:[project](https://favyen.com/muno21/)
A dataset for updating maps using aerial images
* Action recognition
* [HAA500: Human-Centric Atomic Action Dataset with Curated Videos](https://arxiv.org/abs/2009.05224)
* Object recognition
* [ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition](https://arxiv.org/abs/2104.03841)
:star:[code](https://github.com/microsoft/ORBIT-Dataset):sunflower:[dataset](https://city.figshare.com/articles/dataset/ORBIT_A_real-world_few-shot_dataset_for_teachable_object_recognition_collected_from_people_who_are_blind_or_low_vision/14294597)
* Lane detection
* [VIL-100: A New Dataset and a Baseline Model for Video Instance Lane Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_VIL-100_A_New_Dataset_and_a_Baseline_Model_for_Video_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/yujun0-0/MMA-Net)
* Autonomous driving
* [Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset](https://openaccess.thecvf.com/content/ICCV2021/papers/Ettinger_Large_Scale_Interactive_Motion_Forecasting_for_Autonomous_Driving_The_Waymo_ICCV_2021_paper.pdf)
* Vision-language dataset
* [E-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks](https://openaccess.thecvf.com/content/ICCV2021/papers/Kayser_E-ViL_A_Dataset_and_Benchmark_for_Natural_Language_Explanations_in_ICCV_2021_paper.pdf)
:star:[code](https://github.com/maximek3/e-ViL)
* DeepFake detection
* [KoDF: A Large-Scale Korean DeepFake Detection Dataset](https://arxiv.org/abs/2103.10094)
:sunflower:[dataset](https://deepbrainai-research.github.io/kodf/)
* High-quality video
* [Seeing Dynamic Scene in the Dark: A High-Quality Video Dataset With Mechatronic Alignment](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Seeing_Dynamic_Scene_in_the_Dark_A_High-Quality_Video_Dataset_ICCV_2021_paper.pdf)
:sunflower:[dataset](https://github.com/dvlab-research/SDSD)

## 55.Activity Recognition
* [Selective Feature Compression for Efficient Activity Recognition Inference](https://arxiv.org/abs/2104.00179)
* Group activity recognition
* [Spatio-Temporal Dynamic Inference Network for Group Activity Recognition](https://arxiv.org/abs/2108.11743)
:star:[code](https://github.com/JacobYuan7/DIN_GAR)
* [GroupFormer: Group Activity Recognition with Clustered Spatial-Temporal Transformer](https://arxiv.org/abs/2108.12630)
:star:[code](https://github.com/xueyee/GroupFormer)

## 54.Sketch recognition
* [SketchLattice: Latticed Representation for Sketch Manipulation](https://arxiv.org/abs/2108.11636)
* [SketchAA: Abstract Representation for Abstract Sketches](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_SketchAA_Abstract_Representation_for_Abstract_Sketches_ICCV_2021_paper.pdf)

## 53.Vision Localization
* [Continual Learning for Image-Based Camera Localization](https://arxiv.org/abs/2108.09112)
:star:[code](https://github.com/AaltoVision/CL_HSCNet)
* [CrowdDriven: A New Challenging Dataset for Outdoor Visual Localization](https://arxiv.org/abs/2109.04527)
:sunflower:[dataset](http://mapillary.com/)
* [Pose Correction for Highly Accurate Visual Localization in Large-Scale Indoor Spaces](https://openaccess.thecvf.com/content/ICCV2021/papers/Hyeon_Pose_Correction_for_Highly_Accurate_Visual_Localization_in_Large-Scale_Indoor_ICCV_2021_paper.pdf)
:star:[code](https://github.com/JanghunHyeon/PCLoc)
* [Cross-Descriptor Visual Localization and Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Dusmanu_Cross-Descriptor_Visual_Localization_and_Mapping_ICCV_2021_paper.pdf)

## 52.Vision-and-Language
* [YouRefIt: Embodied Reference Understanding with Language and Gesture](https://arxiv.org/abs/2109.03413)
:open_mouth:oral:house:[project](https://yixchen.github.io/YouRefIt/)
* [VLGrammar: Grounded Grammar Induction of Vision and Language](https://arxiv.org/abs/2103.12975)
:star:[code](https://github.com/evelinehong/VLGrammar)
* [COOKIE: Contrastive Cross-Modal Knowledge Sharing Pre-Training for Vision-Language Representation](https://openaccess.thecvf.com/content/ICCV2021/papers/Wen_COOKIE_Contrastive_Cross-Modal_Knowledge_Sharing_Pre-Training_for_Vision-Language_Representation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/kywen1119/COOKIE)
* [Panoptic Narrative Grounding](https://openaccess.thecvf.com/content/ICCV2021/papers/Gonzalez_Panoptic_Narrative_Grounding_ICCV_2021_paper.pdf)
:open_mouth:oral:star:[code](https://github.com/BCV-Uniandes/PNG)
* [AESOP: Abstract Encoding of Stories, Objects, and Pictures](https://openaccess.thecvf.com/content/ICCV2021/papers/Ravi_AESOP_Abstract_Encoding_of_Stories_Objects_and_Pictures_ICCV_2021_paper.pdf)
:star:[code](https://github.com/Hareesh-Ravi/AESOP):tv:[video](https://www.youtube.com/watch?v=ygGzY1DSSMk)
* [Adaptive Hierarchical Graph Reasoning With Semantic Coherence for Video-and-Language Inference](https://arxiv.org/abs/2107.12270)
* Visual reasoning
* [Interpretable Visual Reasoning via Induced Symbolic Space](https://arxiv.org/abs/2011.11603)
* Semantic navigation
* [THDA: Treasure Hunt Data Augmentation for Semantic Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Maksymets_THDA_Treasure_Hunt_Data_Augmentation_for_Semantic_Navigation_ICCV_2021_paper.pdf)
* Vision-language navigation
* [Airbert: In-domain Pretraining for Vision-and-Language Navigation](https://arxiv.org/abs/2108.09105)
:house:[project](https://airbert-vln.github.io/)
* [Waypoint Models for Instruction-guided Navigation in Continuous Environments](https://arxiv.org/abs/2110.02207)
:open_mouth:oral:star:[code](https://github.com/jacobkrantz/VLN-CE):house:[project](https://jacobkrantz.github.io/waypoint-vlnce/):tv:[video](https://youtu.be/hrHj9-1xoio)
* [The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Qi_The_Road_To_Know-Where_An_Object-and-Room_Informed_Sequential_BERT_for_ICCV_2021_paper.pdf)
:star:[code](https://github.com/YuankaiQi/ORIST)
* [Vision-Language Navigation With Random Environmental Mixup](https://arxiv.org/abs/2106.07876)
* Vision-dialog navigation
* [Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Self-Motivated_Communication_Agent_for_Real-World_Vision-Dialog_Navigation_ICCV_2021_paper.pdf)
* Visual navigation
* [Pose Invariant Topological Memory for Visual Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Taniguchi_Pose_Invariant_Topological_Memory_for_Visual_Navigation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/jonkhler/s2cnn)
* [Visual Graph Memory With Unsupervised Representation for Visual Navigation](https://openaccess.thecvf.com/content/ICCV2021/papers/Kwon_Visual_Graph_Memory_With_Unsupervised_Representation_for_Visual_Navigation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/rllab-snu/Visual-Graph-Memory):house:[project](https://rllab-snu.github.io/projects/vgm/doc.html):tv:[video](https://www.youtube.com/watch?v=Uksb_kR80Hk)
* Visual grounding
* [InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring](https://arxiv.org/abs/2103.01128)
:star:[code](https://github.com/CurryYuan/InstanceRefer)
* [TransVG: End-to-End Visual Grounding With Transformers](https://arxiv.org/abs/2104.08541)
:star:[code](https://github.com/djiajunustc/TransVG)
* Visual dialog
* [Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue](https://arxiv.org/abs/2106.15550)

## 51.View Synthesis
* [Out-of-boundary View Synthesis Towards Full-Frame Video Stabilization](https://arxiv.org/abs/2108.09041)
:star:[code](https://github.com/Annbless/OVS_Stabilization)
* [Deep 3D Mask Volume for View Synthesis of Dynamic Scenes](https://arxiv.org/abs/2108.13408)
:house:[project](https://cseweb.ucsd.edu//~viscomp/projects/ICCV21Deep/)
* [Embedding Novel Views in a Single JPEG Image](https://arxiv.org/abs/2108.13003)
* [Video Autoencoder: self-supervised disentanglement of static 3D structure and motion](https://arxiv.org/abs/2110.02951)
:open_mouth:oral:star:[code](https://github.com/zlai0/VideoAutoencoder/):house:[project](https://zlai0.github.io/VideoAutoencoder/#method_video):tv:[video](https://www.youtube.com/watch?v=UaJZd4FrM8E)
* [Geometry-Free View Synthesis: Transformers and No 3D Priors](https://arxiv.org/abs/2104.07652)
:star:[code](https://github.com/CompVis/geometry-free-view-synthesis)
* [Dynamic View Synthesis From Dynamic Monocular Video](https://arxiv.org/abs/2105.06468)
:house:[project](https://free-view-video.github.io/):tv:[video](https://youtu.be/j8CUzIR0f8M)
* [Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis](https://arxiv.org/abs/2104.00677)
:house:[project](https://www.ajayj.com/dietnerf):tv:[video](https://youtu.be/RF_3hsNizqw)
* [Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image](https://arxiv.org/abs/2012.09855)
:open_mouth:oral:star:[code](https://github.com/google-research/google-research/tree/master/infinite_nature):house:[project](https://infinite-nature.github.io/):tv:[video](https://www.youtube.com/watch?v=oXUf6anNAtc)
* [Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image](https://arxiv.org/abs/2012.09854)
:open_mouth:oral:star:[code](https://github.com/facebookresearch/worldsheet):house:[project](https://worldsheet.github.io/):tv:[video](https://youtu.be/j5aT3zRxFlk)

## 50.Continual Learning
* [Online Continual Learning with Natural Distribution Shifts: An Empirical Study with Visual Data](https://arxiv.org/abs/2108.09020)
:star:[code](https://github.com/IntelLabs/continuallearning)
* [Continual Learning on Noisy Data Streams via Self-Purified Replay](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Continual_Learning_on_Noisy_Data_Streams_via_Self-Purified_Replay_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ecrireme/SPR)
* [Rehearsal Revealed: The Limits and Merits of Revisiting Samples in Continual Learning](https://arxiv.org/abs/2104.07446)
:star:[code](https://github.com/Mattdl/RehearsalRevealed)
* [Co2L: Contrastive Continual Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Cha_Co2L_Contrastive_Continual_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/chaht01/Co2L)


## 49.Human-Object Interaction
* [Exploiting Scene Graphs for Human-Object Interaction Detection](https://arxiv.org/abs/2108.08584)
:star:[code](https://github.com/ht014/SG2HOI)
* [Spatially Conditioned Graphs for Detecting Human-Object Interactions](https://arxiv.org/abs/2012.06060)
:star:[code](https://github.com/fredzzhang/spatially-conditioned-graphs):tv:[video](https://www.youtube.com/watch?v=gkBWi_rWedU)
* [Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction](https://arxiv.org/abs/2110.03278)
* [Detecting Human-Object Relationships in Videos](https://openaccess.thecvf.com/content/ICCV2021/papers/Ji_Detecting_Human-Object_Relationships_in_Videos_ICCV_2021_paper.pdf)
* [Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Weakly_Supervised_Human-Object_Interaction_Detection_in_Video_via_Contrastive_Spatiotemporal_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ShuangLI59/weakly-supervised-human-object-detection-video):house:[project](https://shuangli-project.github.io/weakly-supervised-human-object-detection-video/):sunflower:[dataset](https://shuangli-project.github.io/VHICO-Dataset/)
* [Discovering Human Interactions With Large-Vocabulary Objects via Query and Multi-Scale Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Discovering_Human_Interactions_With_Large-Vocabulary_Objects_via_Query_and_Multi-Scale_ICCV_2021_paper.pdf)
:star:[code](https://github.com/scwangdyd/large_vocabulary_hoi_detection)
* [Visual Relationship Detection Using Part-and-Sum Transformers With Composite Queries](https://arxiv.org/abs/2105.02170) (VRD and HOI)
* [Interaction Compass: Multi-Label Zero-Shot Learning of Human-Object Interactions via Spatial Relations](https://openaccess.thecvf.com/content/ICCV2021/papers/Huynh_Interaction_Compass_Multi-Label_Zero-Shot_Learning_of_Human-Object_Interactions_via_Spatial_ICCV_2021_paper.pdf)
:star:[code](https://github.com/hbdat/iccv21_relational_direction)
* H2O
* [H2O: A Benchmark for Visual Human-Human Object Handover Analysis](https://arxiv.org/abs/2104.11466)
* Human Interaction Understanding
* [Consistency-Aware Graph Network for Human Interaction Understanding](https://arxiv.org/abs/2011.10250)
:star:[code](https://github.com/deepgogogo/CAGNet?v=1)
* [H2O: Two Hands Manipulating Objects for First Person Interaction Recognition](https://arxiv.org/abs/2104.11181)
:house:[project](https://www.taeinkwon.com/projects/h2o)
* Hand-object interaction
* [Toward Human-Like Grasp: Dexterous Grasping via Semantic Representation of Object-Hand](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Toward_Human-Like_Grasp_Dexterous_Grasping_via_Semantic_Representation_of_Object-Hand_ICCV_2021_paper.pdf)
* [Reconstructing Hand-Object Interactions in the Wild](https://arxiv.org/abs/2012.09856)
:house:[project](https://people.eecs.berkeley.edu/~zhecao/rhoi/)
* [CPF: Learning a Contact Potential Field To Model the Hand-Object Interaction](https://arxiv.org/abs/2012.00924)
:star:[code](https://github.com/lixiny/CPF)
* HOI (behavior understanding)
* [GeomNet: A Neural Network Based on Riemannian Geometries of SPD Matrix Space and Cholesky Space for 3D Skeleton-Based Interaction Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Nguyen_GeomNet_A_Neural_Network_Based_on_Riemannian_Geometries_of_SPD_ICCV_2021_paper.pdf)

## 48.6DoF
* [SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation](https://arxiv.org/abs/2108.08367)
:star:[code](https://github.com/shangbuhuan13/SO-Pose)
* [StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation](https://arxiv.org/abs/2109.10115)
* [SGPA: Structure-Guided Prior Adaptation for Category-Level 6D Object Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_SGPA_Structure-Guided_Prior_Adaptation_for_Category-Level_6D_Object_Pose_Estimation_ICCV_2021_paper.pdf)
* [RePOSE: Fast 6D Object Pose Refinement via Deep Texture Rendering](https://arxiv.org/abs/2104.00633)
:star:[code](https://github.com/sh8/repose)
* [DualPoseNet: Category-Level 6D Object Pose and Size Estimation Using Dual Pose Network With Refined Learning of Pose Consistency](https://arxiv.org/abs/2103.06526)
:star:[code](https://github.com/Gorilla-Lab-SCUT/DualPoseNet)
* [PR-GCN: A Deep Graph Convolutional Network With Point Refinement for 6D Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_PR-GCN_A_Deep_Graph_Convolutional_Network_With_Point_Refinement_for_ICCV_2021_paper.pdf)
* Object pose estimation
* [CAPTRA: CAtegory-Level Pose Tracking for Rigid and Articulated Objects From Point Clouds](https://arxiv.org/abs/2104.03437)
:open_mouth:oral:star:[code](https://github.com/halfsummer11/CAPTRA):house:[project](https://yijiaweng.github.io/CAPTRA/):tv:[video](https://youtu.be/EkcCEj7gZGg)

## 47.NAS
* [BN-NAS: Neural Architecture Search with Batch Normalization](https://arxiv.org/abs/2108.07375)
:star:[code](https://github.com/bychen515/BNNAS)
* [RANK-NOSH: Efficient Predictor-Based Architecture Search via Non-Uniform Successive Halving](https://arxiv.org/abs/2108.08019)
* [Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift](https://arxiv.org/abs/2108.09671)
:star:[code](https://github.com/Ernie1/Pi-NAS)
* [Evolving Search Space for Neural Architecture Search](https://arxiv.org/abs/2011.10904)
:star:[code](https://github.com/orashi/NSE_NAS):tv:[video](https://www.youtube.com/watch?v=fq21WBaumRc)
* [FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search](https://arxiv.org/abs/1907.01845)
:star:[code](https://github.com/xiaomi-automl/FairNAS)
* [GLiT: Neural Architecture Search for Global and Local Image Transformer](https://arxiv.org/abs/2107.02960)
:star:[code](https://github.com/bychen515/GLiT)
* [Neural Architecture Search for Joint Human Parsing and Pose Estimation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zeng_Neural_Architecture_Search_for_Joint_Human_Parsing_and_Pose_Estimation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/GuHuangAI/NPP)
* [Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces](https://arxiv.org/abs/2012.08859)
* [Learning Latent Architectural Distribution in Differentiable Neural Architecture Search via Variational Information Maximization](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Learning_Latent_Architectural_Distribution_in_Differentiable_Neural_Architecture_Search_via_ICCV_2021_paper.pdf)
* [Not All Operations Contribute Equally: Hierarchical Operation-Adaptive Predictor for Neural Architecture Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Not_All_Operations_Contribute_Equally_Hierarchical_Operation-Adaptive_Predictor_for_Neural_ICCV_2021_paper.pdf)
* [Zen-NAS: A Zero-Shot NAS for High-Performance Image Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Lin_Zen-NAS_A_Zero-Shot_NAS_for_High-Performance_Image_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/idstcv/ZenNAS)
* [BossNAS: Exploring Hybrid CNN-Transformers With Block-Wisely Self-Supervised Neural Architecture Search](https://arxiv.org/abs/2103.12424)
:star:[code](https://github.com/changlin31/BossNAS)
* [NAS-OoD: Neural Architecture Search for Out-of-Distribution Generalization](https://openaccess.thecvf.com/content/ICCV2021/papers/Bai_NAS-OoD_Neural_Architecture_Search_for_Out-of-Distribution_Generalization_ICCV_2021_paper.pdf)
* [AutoSpace: Neural Architecture Search With Less Human Interference](https://arxiv.org/abs/2103.11833)
:star:[code](https://github.com/zhoudaquan/AutoSpace)
* [IDARTS: Interactive Differentiable Architecture Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Xue_IDARTS_Interactive_Differentiable_Architecture_Search_ICCV_2021_paper.pdf)

## 46.Defect Detection
* [DRÆM -- A discriminatively trained reconstruction embedding for surface anomaly detection](https://arxiv.org/abs/2108.07610)

## 45.Image Caption
* [Who's Waldo? Linking People Across Text and Images](https://arxiv.org/abs/2108.07253)
:open_mouth:oral:house:[project](https://whoswaldo.github.io/)
:newspaper:Explainer: [ICCV 2021 Oral: New task! New dataset! Cornell University proposes PVG, a task similar to VG but not quite VG](https://mp.weixin.qq.com/s/QC1UQRmZKgS0dctTXQ77Bg)
* [Partial Off-Policy Learning: Balance Accuracy and Diversity for Human-Oriented Image Captioning](https://openaccess.thecvf.com/content/ICCV2021/papers/Shi_Partial_Off-Policy_Learning_Balance_Accuracy_and_Diversity_for_Human-Oriented_Image_ICCV_2021_paper.pdf)
* [Topic Scene Graph Generation by Attention Distillation From Caption](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Topic_Scene_Graph_Generation_by_Attention_Distillation_From_Caption_ICCV_2021_paper.pdf)
:star:[code](https://vipl.ict.ac.cn/view_database.php?id=6)
* [Understanding and Evaluating Racial Biases in Image Captioning](https://arxiv.org/abs/2106.08503)
:star:[code](https://github.com/princetonvisualai/imagecaptioning-bias):house:[project](https://princetonvisualai.github.io/imagecaptioning-bias/)
* [In Defense of Scene Graphs for Image Captioning](https://arxiv.org/abs/2102.04990)
:star:[code](https://github.com/Kien085/SG2Caps)
* Art description generation
* [Explain Me the Painting: Multi-Topic Knowledgeable Art Description Generation](https://arxiv.org/abs/2109.05743)
:star:[code](https://github.com/noagarcia/explain-paintings)
* Change Captioning
* [Viewpoint-Agnostic Change Captioning With Cycle Consistency](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Viewpoint-Agnostic_Change_Captioning_With_Cycle_Consistency_ICCV_2021_paper.pdf)

## 44.Human motion prediction
* [MSR-GCN: Multi-Scale Residual Graph Convolution Networks for Human Motion Prediction](https://arxiv.org/abs/2108.07152)
:star:[code](https://github.com/Droliven/MSRGCN)
* [Stochastic Scene-Aware Motion Prediction](https://arxiv.org/abs/2108.08284)
:star:[code](https://github.com/mohamedhassanmus/SAMP):house:[project](https://samp.is.tue.mpg.de/)
* [Generating Smooth Pose Sequences for Diverse Human Motion Prediction](https://arxiv.org/abs/2108.08422)
:open_mouth:oral:star:[code](https://github.com/wei-mao-2019/gsps)
* [TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild](https://arxiv.org/abs/2104.04029)
:house:[project](https://somof.stanford.edu/)
* [Motion Prediction using Trajectory Cues](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Motion_Prediction_Using_Trajectory_Cues_ICCV_2021_paper.pdf)
* 3D human motion prediction
* [Contextually Plausible and Diverse 3D Human Motion Prediction](https://arxiv.org/abs/1912.08521)

## 43.Dense Prediction
* [FaPN: Feature-aligned Pyramid Network for Dense Image Prediction](https://arxiv.org/abs/2108.07058)
:star:[code](https://github.com/EMI-Group/FaPN)
* Multi-task dense prediction
* [Exploring Relational Context for Multi-Task Dense Prediction](https://arxiv.org/abs/2104.13874)

## 42.Representation Learning
* [Learning From Noisy Data With Robust Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Learning_From_Noisy_Data_With_Robust_Representation_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/salesforce/RRL/)
* [Self-Supervised Representation Learning From Flow Equivariance](https://arxiv.org/abs/2101.06553)
* [Exploring Visual Engagement Signals for Representation Learning](https://arxiv.org/abs/2104.07767)
:star:[code](https://github.com/KMnP/vise)
* [Switchable K-class Hyperplanes for Noise-Robust Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Switchable_K-Class_Hyperplanes_for_Noise-Robust_Representation_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/liubx07/SKH)
* [Region Similarity Representation Learning](https://arxiv.org/abs/2103.12902)
:star:[code](https://github.com/Tete-Xiao/ReSim)
* [Curious Representation Learning for Embodied Intelligence](https://arxiv.org/abs/2105.01060)
:star:[code](https://github.com/yilundu/crl):house:[project](https://yilundu.github.io/crl/)
* Visual representation learning
* [Self-Supervised Visual Representations Learning by Contrastive Mask Prediction](https://arxiv.org/abs/2108.07954)
:newspaper:Explainer: [ICCV 2021: A contrastive learning paradigm more general than MoCo; USTC and MSRA propose MaskCo, a new contrastive learning method](https://mp.weixin.qq.com/s/t53ASvoSTTlXgxKTfEoZ7g)
* [Temporal Knowledge Consistency for Unsupervised Visual Representation Learning](https://arxiv.org/abs/2108.10668)
* [Contrasting Contrastive Self-Supervised Representation Learning Pipelines](https://arxiv.org/abs/2103.14005)
:star:[code](https://github.com/allenai/virb)
* [Concept Generalization in Visual Representation Learning](https://arxiv.org/abs/2012.05649)
:house:[project](https://europe.naverlabs.com/cog-benchmark)
* [Collaborative Unsupervised Visual Representation Learning from Decentralized Data](https://arxiv.org/abs/2108.06492)
* [Episodic Transformer for Vision-and-Language Navigation](https://arxiv.org/abs/2105.06453)
:star:[code](https://github.com/alexpashevich/E.T.)
* [Multi-VAE: Learning Disentangled View-Common and View-Peculiar Visual Representations for Multi-View Clustering](https://openaccess.thecvf.com/content/ICCV2021/papers/Xu_Multi-VAE_Learning_Disentangled_View-Common_and_View-Peculiar_Visual_Representations_for_Multi-View_ICCV_2021_paper.pdf)
* Video representation learning
* [Composable Augmentation Encoding for Video Representation Learning](https://arxiv.org/abs/2104.00616)
* [Motion-Focused Contrastive Learning of Video Representations](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Motion-Focused_Contrastive_Learning_of_Video_Representations_ICCV_2021_paper.pdf)
* [ASCNet: Self-Supervised Video Representation Learning With Appearance-Speed Consistency](https://arxiv.org/abs/2106.02342)
* [ACAV100M: Automatic Curation of Large-Scale Datasets for Audio-Visual Video Representation Learning](https://arxiv.org/abs/2101.10803)
:house:[project](https://acav100m.github.io/)
* [Time-Equivariant Contrastive Video Representation Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Jenni_Time-Equivariant_Contrastive_Video_Representation_Learning_ICCV_2021_paper.pdf)
* [Space-Time Crop & Attend: Improving Cross-Modal Video Representation Learning](https://arxiv.org/abs/2103.10211)
:star:[code](https://github.com/facebookresearch/GDT)

## 41.Out-of-Distribution Detection(OOD)
* [CODEs: Chamfer Out-of-Distribution Examples against Overconfidence Issue](https://arxiv.org/abs/2108.06024)
* [Semantically Coherent Out-of-Distribution Detection](https://arxiv.org/abs/2108.11941)
:star:[code](https://github.com/jingkang50/ICCV21_SCOOD):house:[project](https://jingkang50.github.io/projects/scood)
* [The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization](https://arxiv.org/abs/2006.16241)
:star:[code](https://github.com/hendrycks/imagenet-r)

## 40.Metric Learning
* [Towards Interpretable Deep Metric Learning with Structural Matching](https://arxiv.org/abs/2108.05889)
:star:[code](https://github.com/wl-zhao/DIML)
* [Deep Relational Metric Learning](https://arxiv.org/abs/2108.10026)
:star:[code](https://github.com/zbr17/DRML)
* [LoOp: Looking for Optimal Hard Negative Embeddings for Deep Metric Learning](https://arxiv.org/abs/2108.09335)
:star:[code](https://github.com/puneesh00/LoOp)
* [Manifold Matching via Deep Metric Learning for Generative Modeling](https://arxiv.org/abs/2106.10777)
:star:[code](https://github.com/dzld00/pytorch-manifold-matching)

## 39.Incremental Learning
* Class-incremental learning
* [Always Be Dreaming: A New Approach for Data-Free Class-Incremental Learning](https://arxiv.org/abs/2106.09701)
:newspaper:Explainer: [Enabling "lifelong learning" for models: Georgia Tech proposes data-free incremental learning](https://mp.weixin.qq.com/s/Fm9ufPD6rzL2VzaqpdFpjg)
* [Striking a Balance Between Stability and Plasticity for Class-Incremental Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Striking_a_Balance_Between_Stability_and_Plasticity_for_Class-Incremental_Learning_ICCV_2021_paper.pdf)
* [Synthesized Feature Based Few-Shot Class-Incremental Learning on a Mixture of Subspaces](https://openaccess.thecvf.com/content/ICCV2021/papers/Cheraghian_Synthesized_Feature_Based_Few-Shot_Class-Incremental_Learning_on_a_Mixture_of_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ali-chr/Synthesized-Feature-based-Few-Shot-Class-Incremental-Learningon-a-Mixture-of-Subspaces)

## 38.Weakly/Semi-Supervised/Self-supervised/Unsupervised Learning
* Semi-supervised
* [Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning](https://arxiv.org/abs/2108.05617)
* [Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View Assignments With Support Samples](https://arxiv.org/abs/2104.13963)
:star:[code](https://github.com/facebookresearch/suncet)
* [Semi-Supervised Active Learning for Semi-Supervised Models: Exploit Adversarial Examples With Graph-Based Virtual Labels](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Semi-Supervised_Active_Learning_for_Semi-Supervised_Models_Exploit_Adversarial_Examples_With_ICCV_2021_paper.pdf)
* [CoMatch: Semi-Supervised Learning With Contrastive Graph Regularization](https://arxiv.org/abs/2011.11183)
:star:[code](https://github.com/salesforce/CoMatch)
* [Multiview Pseudo-Labeling for Semi-supervised Learning from Video](https://arxiv.org/abs/2104.00682)
* Self-supervised
* [Focus on the Positives: Self-Supervised Learning for Biodiversity Monitoring](https://arxiv.org/abs/2108.06435)
:star:[code](https://github.com/omipan/camera_traps_self_supervised)
* [Self-supervised Neural Networks for Spectral Snapshot Compressive Imaging](https://arxiv.org/abs/2108.12654)
:star:[code](https://github.com/mengziyi64/CASSI-Self-Supervised)
* [ISD: Self-Supervised Learning by Iterative Similarity Distillation](https://arxiv.org/abs/2012.09259)
:star:[code](https://github.com/UMBCvision/ISD)
* [Contrast and Order Representations for Video Self-Supervised Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Hu_Contrast_and_Order_Representations_for_Video_Self-Supervised_Learning_ICCV_2021_paper.pdf)
* [On Feature Decorrelation in Self-Supervised Learning](https://arxiv.org/abs/2105.00470)
:open_mouth:oral
* [Geography-Aware Self-Supervised Learning](https://arxiv.org/abs/2011.09980)
* [Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos](https://arxiv.org/abs/2104.12671)
* [Efficient Visual Pretraining with Contrastive Detection](https://arxiv.org/abs/2103.10957)
* [Broaden Your Views for Self-Supervised Video Learning](https://arxiv.org/abs/2103.16559)
* [CDS: Cross-Domain Self-supervised Pre-training](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_CDS_Cross-Domain_Self-Supervised_Pre-Training_ICCV_2021_paper.pdf)
* [On Compositions of Transformations in Contrastive Self-Supervised Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Patrick_On_Compositions_of_Transformations_in_Contrastive_Self-Supervised_Learning_ICCV_2021_paper.pdf)
:star:[code](https://github.com/facebookresearch/GDT)
* [Solving Inefficiency of Self-Supervised Representation Learning](https://arxiv.org/abs/2104.08760)
:star:[code](https://github.com/wanggrun/triplet)
* [Divide and Contrast: Self-supervised Learning from Uncurated Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Tian_Divide_and_Contrast_Self-Supervised_Learning_From_Uncurated_Data_ICCV_2021_paper.pdf)
* [Emerging Properties in Self-Supervised Vision Transformers](https://arxiv.org/abs/2104.14294)
:star:[code](https://github.com/facebookresearch/dino)
* [Mean Shift for Self-Supervised Learning](https://arxiv.org/abs/2105.07269)
:star:[code](https://github.com/UMBCvision/MSF)
* Weakly supervised
* [Weakly Supervised Representation Learning With Coarse Labels](https://arxiv.org/abs/2005.09681)
:star:[code](https://github.com/idstcv/CoIns)

## 37.Multitask Learning
* [MultiTask-CenterNet (MCN): Efficient and Diverse Multitask Learning using an Anchor Free Approach](https://arxiv.org/abs/2108.05060)
:newspaper:Explainer: [ICCV 2021 "MultiTask CenterNet": new progress in multi-task computer vision! One network outlasts three](https://mp.weixin.qq.com/s/toAZS0OHdW4MG30P1wAAUA)
* [Multi-Task Self-Training for Learning General Representations](https://arxiv.org/abs/2108.11353)
:newspaper:Explainer: [ICCV 2021 MuST: Still struggling to squeeze out points on a single task? Google's researchers have already moved on to multi-task training](https://mp.weixin.qq.com/s/nhv1l9xBSaZceibIn5fhqw)
* [UniT: Multimodal Multitask Learning With a Unified Transformer](https://arxiv.org/abs/2102.10772)
:star:[code](https://mmf.sh/)
* [Learning Multiple Pixelwise Tasks Based on Loss Scale Balancing](https://openaccess.thecvf.com/content/ICCV2021/papers/Lee_Learning_Multiple_Pixelwise_Tasks_Based_on_Loss_Scale_Balancing_ICCV_2021_paper.pdf)
:star:[code](https://github.com/jaehanlee-mcl/LSB-MTL)
* [Learning With Privileged Tasks](https://openaccess.thecvf.com/content/ICCV2021/papers/Song_Learning_With_Privileged_Tasks_ICCV_2021_paper.pdf)
* [Task Switching Network for Multi-Task Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Sun_Task_Switching_Network_for_Multi-Task_Learning_ICCV_2021_paper.pdf)

## 36.SLAM/AR/VR/Robotics
* Robotics
* Indoor navigation
* [The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation](https://arxiv.org/abs/2108.11550)
:star:[code](https://github.com/Xiaoming-Zhao/PointNav-VO):house:[project](https://xiaoming-zhao.github.io/projects/pointnav-vo/)
* [Pathdreamer: A World Model for Indoor Navigation](https://arxiv.org/abs/2105.08756)
:tv:[video](https://www.youtube.com/watch?v=StklIENGqs0)
* Robotic grasping
* [Hand-Object Contact Consistency Reasoning for Human Grasps Generation](https://arxiv.org/abs/2104.03304)
:open_mouth:oral:star:[code](https://github.com/hwjiang1510/GraspTTA):house:[project](https://hwjiang1510.github.io/GraspTTA/):tv:[video](https://youtu.be/zGVLVXZoVZs)
* VR/AR
* [The Power of Points for Modeling Humans in Clothing](https://arxiv.org/abs/2109.01137)
:star:[code](https://github.com/qianlim/POP):house:[project](https://qianlim.github.io/POP):tv:[video](https://youtu.be/5M4F9zSWIEE)
* Virtual try-on
* [M3D-VTON: A Monocular-to-3D Virtual Try-On Network](https://arxiv.org/abs/2108.05126)
:star:[code](https://github.com/fyviezhao/M3D-VTON)
* [ZFlow: Gated Appearance Flow-based Virtual Try-on with 3D Priors](https://arxiv.org/abs/2109.07001)
* [Dressing in Order: Recurrent Person Image Generation for Pose Transfer, Virtual Try-On and Outfit Editing](https://arxiv.org/abs/2104.07021)
* [FashionMirror: Co-Attention Feature-Remapping Virtual Try-On With Sequential Template Poses](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_FashionMirror_Co-Attention_Feature-Remapping_Virtual_Try-On_With_Sequential_Template_Poses_ICCV_2021_paper.pdf)
* [Structure-transformed Texture-enhanced Network for Person Image Synthesis](https://openaccess.thecvf.com/content/ICCV2021/papers/Xu_Structure-Transformed_Texture-Enhanced_Network_for_Person_Image_Synthesis_ICCV_2021_paper.pdf)
* SLAM
* [On the Limits of Pseudo Ground Truth in Visual Camera Re-localisation](https://arxiv.org/abs/2109.00524)
:star:[code](https://github.com/tsattler/visloc_pseudo_gt_limitations/)
* [Transfusion: A Novel SLAM Method Focused on Transparent Objects](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Transfusion_A_Novel_SLAM_Method_Focused_on_Transparent_Objects_ICCV_2021_paper.pdf)
* [iMAP: Implicit Mapping and Positioning in Real-Time](https://arxiv.org/abs/2103.12352)
* [Learning To Bundle-Adjust: A Graph Network Approach to Faster Optimization of Bundle Adjustment for Vehicular SLAM](https://openaccess.thecvf.com/content/ICCV2021/papers/Tanaka_Learning_To_Bundle-Adjust_A_Graph_Network_Approach_to_Faster_Optimization_ICCV_2021_paper.pdf)
* [R-SLAM: Optimizing Eye Tracking From Rolling Shutter Video of the Retina](https://openaccess.thecvf.com/content/ICCV2021/papers/Shenoy_R-SLAM_Optimizing_Eye_Tracking_From_Rolling_Shutter_Video_of_the_ICCV_2021_paper.pdf)
* Place Recognition
* [Attentional Pyramid Pooling of Salient Visual Residuals for Place Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Peng_Attentional_Pyramid_Pooling_of_Salient_Visual_Residuals_for_Place_Recognition_ICCV_2021_paper.pdf)
* [Pyramid Point Cloud Transformer for Large-Scale Place Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Hui_Pyramid_Point_Cloud_Transformer_for_Large-Scale_Place_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/fpthink/PPT-Net)

## 35.Quantization/Pruning/Knowledge Distillation/Model Compression
* Knowledge distillation
* [Distilling Holistic Knowledge with Graph Neural Networks](https://arxiv.org/abs/2108.05507)
:star:[code](https://github.com/wyc-ruiker/HKD)
* [Lipschitz Continuity Guided Knowledge Distillation](https://arxiv.org/abs/2108.12905)
:star:[code](https://github.com/42Shawn/LONDON/tree/master)
* [Densely Guided Knowledge Distillation Using Multiple Teacher Assistants](https://arxiv.org/abs/2009.08825)
:star:[code](https://github.com/wonchulSon/DGKD)
* [Revisiting Adversarial Robustness Distillation: Robust Soft Labels Make Student Better](https://arxiv.org/abs/2108.07969)
:star:[code](https://github.com/zibojia/RSLAD)
* [Compressing Visual-linguistic Model via Knowledge Distillation](https://arxiv.org/abs/2104.02096)
* [Self-Knowledge Distillation With Progressive Refinement of Targets](https://arxiv.org/abs/2006.12000)
:star:[code](https://github.com/lgcnsai/PS-KD-Pytorch):tv:[video](https://drive.google.com/file/d/1QxqSbzn-egdYI13IYn3W4dmIvm_Iw4ku/view)
* [Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhu_Student_Customized_Knowledge_Distillation_Bridging_the_Gap_Between_Student_and_ICCV_2021_paper.pdf)
* [Channel-Wise Knowledge Distillation for Dense Prediction](https://arxiv.org/abs/2011.13256)
:star:[code](https://github.com/irfanICMLL/TorchDistiller/tree/main/SemSeg-distill)
* [Exploring Inter-Channel Correlation for Diversity-Preserved Knowledge Distillation](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Exploring_Inter-Channel_Correlation_for_Diversity-Preserved_Knowledge_Distillation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ADLab-AutoDrive/ICKD)
* Quantization
* [Distance-aware Quantization](https://arxiv.org/abs/2108.06983)
:star:[code](https://github.com/cvlab-yonsei/DAQ):house:[project](https://cvlab.yonsei.ac.kr/projects/DAQ/)
* [Dynamic Network Quantization for Efficient Video Inference](https://arxiv.org/abs/2108.10394)
:star:[code](https://github.com/sunxm2357/VideoIQ):house:[project](https://cs-people.bu.edu/sunxm/VideoIQ/project.html)
* [Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss](https://arxiv.org/abs/2109.02100)
* [Improving Low-Precision Network Quantization via Bin Regularization](https://openaccess.thecvf.com/content/ICCV2021/papers/Han_Improving_Low-Precision_Network_Quantization_via_Bin_Regularization_ICCV_2021_paper.pdf)
* [Towards Mixed-Precision Quantization of Neural Networks via Constrained Optimization](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Towards_Mixed-Precision_Quantization_of_Neural_Networks_via_Constrained_Optimization_ICCV_2021_paper.pdf)
* [Integer-arithmetic-only Certified Robustness for Quantized Neural Networks](https://arxiv.org/abs/2108.09413)
* [RMSMP: A Novel Deep Neural Network Quantization Framework With Row-Wise Mixed Schemes and Multiple Precisions](https://openaccess.thecvf.com/content/ICCV2021/papers/Chang_RMSMP_A_Novel_Deep_Neural_Network_Quantization_Framework_With_Row-Wise_ICCV_2021_paper.pdf)
* [Improving Neural Network Efficiency via Post-Training Quantization With Adaptive Floating-Point](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Improving_Neural_Network_Efficiency_via_Post-Training_Quantization_With_Adaptive_Floating-Point_ICCV_2021_paper.pdf)
:star:[code](https://github.com/MXHX7199/ICCV_2021_AFP)
* [Once Quantization-Aware Training: High Performance Extremely Low-bit Architecture Search](https://arxiv.org/abs/2010.04354)
:star:[code](https://github.com/LaVieEnRoseSMZ/OQA)
* Model compression
* [GDP: Stabilized Neural Network Pruning via Gates with Differentiable Polarization](https://arxiv.org/abs/2109.02220)
* [Exploration and Estimation for Model Compression](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Exploration_and_Estimation_for_Model_Compression_ICCV_2021_paper.pdf)
* Pruning
* [ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting](https://arxiv.org/abs/2007.03260)
:star:[code](https://github.com/DingXiaoH/ResRep)
* [Auto Graph Encoder-Decoder for Neural Network Pruning](https://openaccess.thecvf.com/content/ICCV2021/papers/Yu_Auto_Graph_Encoder-Decoder_for_Neural_Network_Pruning_ICCV_2021_paper.pdf)

## 34.Super-Resolution
* ISR
* [Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution](https://arxiv.org/abs/2108.05302)
:star:[code](https://github.com/JingyunLiang/MANet)
* [Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling](https://arxiv.org/abs/2108.05301)
:star:[code](https://github.com/JingyunLiang/HCFlow)
* [Deep Reparametrization of Multi-Frame Super-Resolution and Denoising](https://arxiv.org/abs/2108.08286)
:open_mouth:oral
* [Dual-Camera Super-Resolution with Aligned Attention Modules](https://arxiv.org/abs/2109.01349)
:star:[code](https://github.com/Tengfei-Wang/DualCameraSR):house:[project](https://tengfei-wang.github.io/Dual-Camera-SR/index.html):tv:[video](https://www.youtube.com/watch?v=5TiUfAcNvuw)
* [Attention-Based Multi-Reference Learning for Image Super-Resolution](https://arxiv.org/abs/2108.13697)
:star:[code](https://github.com/marcopesavento/AMRSR):house:[project](https://marcopesavento.github.io/AMRSR/)
* [Learning a Single Network for Scale-Arbitrary Super-Resolution](https://arxiv.org/abs/2004.03791)
* [Fourier Space Losses for Efficient Perceptual Image Super-Resolution](https://arxiv.org/abs/2106.00783)
:star:[code](https://github.com/dariofuoli)
* [Achieving On-Mobile Real-Time Super-Resolution With Neural Architecture and Pruning Search](https://arxiv.org/abs/2108.08910)
* [Designing a Practical Degradation Model for Deep Blind Image Super-Resolution](https://arxiv.org/abs/2103.14006)
:star:[code](https://github.com/cszn/BSRGAN)
* [Event Stream Super-Resolution via Spatiotemporal Constraint Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Event_Stream_Super-Resolution_via_Spatiotemporal_Constraint_Learning_ICCV_2021_paper.pdf)
* [Dynamic High-Pass Filtering and Multi-Spectral Attention for Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Magid_Dynamic_High-Pass_Filtering_and_Multi-Spectral_Attention_for_Image_Super-Resolution_ICCV_2021_paper.pdf)
* [Super-Resolving Cross-Domain Face Miniatures by Peeking at One-Shot Exemplar](https://arxiv.org/abs/2103.08863)
* [Context Reasoning Attention Network for Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Context_Reasoning_Attention_Network_for_Image_Super-Resolution_ICCV_2021_paper.pdf)
* [EvIntSR-Net: Event Guided Multiple Latent Frames Reconstruction and Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Han_EvIntSR-Net_Event_Guided_Multiple_Latent_Frames_Reconstruction_and_Super-Resolution_ICCV_2021_paper.pdf)
* [Super Resolve Dynamic Scene from Continuous Spike Streams](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhao_Super_Resolve_Dynamic_Scene_From_Continuous_Spike_Streams_ICCV_2021_paper.pdf)
* [Deep Blind Video Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Pan_Deep_Blind_Video_Super-Resolution_ICCV_2021_paper.pdf)
* [Benchmarking Ultra-High-Definition Image Super-Resolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Benchmarking_Ultra-High-Definition_Image_Super-Resolution_ICCV_2021_paper.pdf)
* [Lucas-Kanade Reloaded: End-to-End Super-Resolution From Raw Image Bursts](https://arxiv.org/abs/2104.06191)
* [Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Unsupervised_Real-World_Super-Resolution_A_Domain_Adaptation_Perspective_ICCV_2021_paper.pdf)
* [Real-World Video Super-Resolution: A Benchmark Dataset and a Decomposition Based Learning Scheme](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Real-World_Video_Super-Resolution_A_Benchmark_Dataset_and_a_Decomposition_Based_ICCV_2021_paper.pdf)
:star:[code](https://github.com/IanYeung/RealVSR)
:newspaper:Explainer: [ICCV 2021: Hong Kong PolyU and Alibaba DAMO Academy propose RealVSR, a new dataset and loss scheme for video super-resolution](https://mp.weixin.qq.com/s/pQWQgJJgCzDFX6lQcXt3wg)
* VSR
* [Omniscient Video Super-Resolution](https://arxiv.org/abs/2103.15683)
:star:[code](https://github.com/psychopa4/OVSR)
* [COMISR: Compression-Informed Video Super-Resolution](https://arxiv.org/abs/2105.01237)
:star:[code](https://github.com/google-research/google-research/tree/master/comisr)
:newspaper:Explainer: [Google proposes COMISR: compression-informed super-resolution for compressed video](https://mp.weixin.qq.com/s/DhE49Ek0v0PelDewNNjP3w)
* [Learning Frequency-Aware Dynamic Network for Efficient Super-Resolution](https://arxiv.org/abs/2103.08357)
* [Efficient Video Compression via Content-Adaptive Super-Resolution](https://arxiv.org/abs/2104.02322)
:star:[code](https://github.com/AdaptiveVC/SRVC)

## 33.Remote Sensing Images
* [SUNet: Symmetric Undistortion Network for Rolling Shutter Correction](https://arxiv.org/abs/2108.04775)
* [Change is Everywhere: Single-Temporal Supervised Object Change Detection in Remote Sensing Imagery](https://arxiv.org/abs/2108.07002)
:star:[code](https://github.com/Z-Zheng/ChangeStar)
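:newspaper:Explainer: [ICCV 2021 | Wuhan University's RSIDEA group proposes STAR, a novel weakly supervised change detection algorithm for remote sensing](https://mp.weixin.qq.com/s/hATPy1T2zh9JgwBeRMWh0A)
* Panoramic video synthesis from satellite images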
* [Sat2Vid: Street-view Panoramic Video Synthesis from a Single Satellite Image](https://arxiv.org/abs/2012.06628)
* Traffic accident risk mapping from satellite imagery
* [Inferring High-Resolution Traffic Accident Risk Maps Based on Satellite Imagery and GPS Trajectories](https://openaccess.thecvf.com/content/ICCV2021/papers/He_Inferring_High-Resolution_Traffic_Accident_Risk_Maps_Based_on_Satellite_Imagery_ICCV_2021_paper.pdf)
* Remote sensing data
* [Seasonal Contrast: Unsupervised Pre-Training From Uncurated Remote Sensing Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Manas_Seasonal_Contrast_Unsupervised_Pre-Training_From_Uncurated_Remote_Sensing_Data_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ElementAI/seasonal-contrast)
* [Dynamic Cross Feature Fusion for Remote Sensing Pansharpening](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Dynamic_Cross_Feature_Fusion_for_Remote_Sensing_Pansharpening_ICCV_2021_paper.pdf)
* Segmentation
* [Self-Mutating Network for Domain Adaptive Segmentation in Aerial Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Lee_Self-Mutating_Network_for_Domain_Adaptive_Segmentation_in_Aerial_Images_ICCV_2021_paper.pdf)
* Panoptic segmentation of satellite images
* [Panoptic Segmentation of Satellite Image Time Series With Convolutional Temporal Attention Networks](https://arxiv.org/abs/2107.07933)
:star:[code](https://github.com/VSainteuf/utae-paps):sunflower:[PASTIS dataset](https://github.com/VSainteuf/pastis-benchmark)
* 3D reconstruction
* [3D Building Reconstruction from Monocular Remote Sensing Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_3D_Building_Reconstruction_From_Monocular_Remote_Sensing_Images_ICCV_2021_paper.pdf)
:house:[project](https://liweijia.github.io/projects/building_3d/)

## 32.Speech/Audio
* [The Right to Talk: An Audio-Visual Transformer Approach](https://arxiv.org/abs/2108.03256)
:star:[code](https://github.com/uark-cviu/Right2Talk)
* [Image2Reverb: Cross-Modal Reverb Impulse Response Synthesis](https://arxiv.org/abs/2103.14201)
:star:[code](https://github.com/nikhilsinghmus/image2reverb):house:[project](https://web.media.mit.edu/~nsingh1/image2reverb/)
* Audio source separation
* [Visual Scene Graphs for Audio Source Separation](https://arxiv.org/abs/2109.11955)
* Audio-to-gesture
* [Audio2Gestures: Generating Diverse Gestures from Speech Audio with Conditional Variational Autoencoders](https://arxiv.org/abs/2108.06720)
* Active Speaker Detection (ASD)
* [How To Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild](https://arxiv.org/abs/2106.03932)
:star:[code](https://github.com/okankop/ASDNet)
* [MAAS: Multi-Modal Assignation for Active Speaker Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Alcazar_MAAS_Multi-Modal_Assignation_for_Active_Speaker_Detection_ICCV_2021_paper.pdf)
* Speech sound recollection from face video
* [Multi-Modality Associative Bridging Through Memory: Speech Sound Recollected From Face Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Multi-Modality_Associative_Bridging_Through_Memory_Speech_Sound_Recollected_From_Face_ICCV_2021_paper.pdf)
* Audio-visual source localization
* [Localize to Binauralize: Audio Spatialization From Visual Sound Source Localization](https://openaccess.thecvf.com/content/ICCV2021/papers/Rachavarapu_Localize_to_Binauralize_Audio_Spatialization_From_Visual_Sound_Source_Localization_ICCV_2021_paper.pdf)
:star:[code](https://github.com/KranthiKumarR/Localize-to-Binauralize):tv:[video](https://drive.google.com/drive/folders/1a5BV0U3RaQJS5wXyR7pzIAPMKOsGQz_q)
* Audio-visual source separation
* [Move2Hear: Active Audio-Visual Source Separation](https://openaccess.thecvf.com/content/ICCV2021/papers/Majumder_Move2Hear_Active_Audio-Visual_Source_Separation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/SAGNIKMJR/move2hear-active-AV-separation):house:[project](http://vision.cs.utexas.edu/projects/move2hear/)
* Audio-visual floorplan reconstruction
* [Audio-Visual Floorplan Reconstruction](https://openaccess.thecvf.com/content/ICCV2021/papers/Purushwalkam_Audio-Visual_Floorplan_Reconstruction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/senthilps8/avmap):house:[project](http://www.cs.cmu.edu/~spurushw/publication/avmap/):tv:[video](https://youtu.be/wRslVfd1hOI)

## 31.Style Transfer
* [AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer](https://arxiv.org/abs/2108.03647)
:star:[code](https://github.com/Huage001/AdaAttN)
* [Domain-Aware Universal Style Transfer](https://arxiv.org/abs/2108.04441)
:star:[code](https://github.com/Kibeom-Hong/Domain-Aware-Style-Transfer)
* [Diverse Image Style Transfer via Invertible Cross-Space Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_Diverse_Image_Style_Transfer_via_Invertible_Cross-Space_Mapping_ICCV_2021_paper.pdf)
* [StyleFormer: Real-Time Arbitrary Style Transfer via Parametric Style Composition](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_StyleFormer_Real-Time_Arbitrary_Style_Transfer_via_Parametric_Style_Composition_ICCV_2021_paper.pdf)
* [Manifold Alignment for Semantically Aligned Style Transfer](https://arxiv.org/abs/2005.10777)
:star:[code](https://github.com/NJUHuoJing/MAST)

## 30.Image Generation/Synthesis
* [ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models](https://arxiv.org/abs/2108.02938)
:open_mouth:oral
* [Image Synthesis via Semantic Composition](https://arxiv.org/abs/2109.07053)
:star:[code](https://github.com/dvlab-research/SCGAN):house:[project](https://shepnerd.github.io/scg/)
* [Image Synthesis From Layout With Locality-Aware Mask Adaption](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Image_Synthesis_From_Layout_With_Locality-Aware_Mask_Adaption_ICCV_2021_paper.pdf)
* Image fusion
* [DTMNet: A Discrete Tchebichef Moments-Based Deep Neural Network for Multi-Focus Image Fusion](https://openaccess.thecvf.com/content/ICCV2021/papers/Xiao_DTMNet_A_Discrete_Tchebichef_Moments-Based_Deep_Neural_Network_for_Multi-Focus_ICCV_2021_paper.pdf)

## 29.Image Retrieval
* [DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features](https://arxiv.org/abs/2108.02927)
:star:[code](https://github.com/feymanpriv/DOLG)
* [Image Retrieval on Real-life Images with Pre-trained Vision-and-Language Models](https://arxiv.org/abs/2108.04024)
:star:[code](https://github.com/Cuberick-Orion/CIRR):house:[project](https://cuberick-orion.github.io/CIRR/)
* [Self-supervised Product Quantization for Deep Unsupervised Image Retrieval](https://arxiv.org/abs/2109.02244)
:star:[code](https://github.com/youngkyunJang/SPQ)
* [Instance-Level Image Retrieval Using Reranking Transformers](https://arxiv.org/abs/2103.12236)
:star:[code](https://github.com/uvavision/RerankingTransformer)
* [Learning Attribute-Driven Disentangled Representations for Interactive Fashion Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Hou_Learning_Attribute-Driven_Disentangled_Representations_for_Interactive_Fashion_Retrieval_ICCV_2021_paper.pdf)
:star:[code](https://github.com/amzn/fashion-attribute-disentanglement)
* [Telling the What While Pointing to the Where: Multimodal Queries for Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Changpinyo_Telling_the_What_While_Pointing_to_the_Where_Multimodal_Queries_ICCV_2021_paper.pdf)
* [Frozen in Time: A Joint Video and Image Encoder for End-to-End Retrieval](https://arxiv.org/abs/2104.00650)
* [Learning Deep Local Features With Multiple Dynamic Attentions for Large-Scale Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Learning_Deep_Local_Features_With_Multiple_Dynamic_Attentions_for_Large-Scale_ICCV_2021_paper.pdf)
:star:[code](https://github.com/CHANWH/MDA)
* [Bayesian Triplet Loss: Uncertainty Quantification in Image Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Warburg_Bayesian_Triplet_Loss_Uncertainty_Quantification_in_Image_Retrieval_ICCV_2021_paper.pdf)
* Cross-Domain Retrieval
* [Universal Cross-Domain Retrieval: Generalizing Across Classes and Domains](https://arxiv.org/abs/2108.08356)
* Visual Geolocalization
* [Viewpoint Invariant Dense Matching for Visual Geolocalization](https://arxiv.org/abs/2109.09827)
:star:[code](https://github.com/gmberton/geo_warp)
* Cross-Modal Retrieval
* [Ask&Confirm: Active Detail Enriching for Cross-Modal Retrieval With Partial Query](https://openaccess.thecvf.com/content/ICCV2021/papers/Cai_AskConfirm_Active_Detail_Enriching_for_Cross-Modal_Retrieval_With_Partial_Query_ICCV_2021_paper.pdf)
:star:[code](https://github.com/CuthbertCai/Ask-Confirm)
* [Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-modal Pretraining](https://arxiv.org/abs/2107.14572)
:star:[code](https://github.com/zhanxlin/Product1M)
* [Wasserstein Coupled Graph Learning for Cross-Modal Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Wasserstein_Coupled_Graph_Learning_for_Cross-Modal_Retrieval_ICCV_2021_paper.pdf)
* [Adversarial Attack on Deep Cross-Modal Hamming Retrieval](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Adversarial_Attack_on_Deep_Cross-Modal_Hamming_Retrieval_ICCV_2021_paper.pdf)
* Text-Video Retrieval
* [TEACHTEXT: CrossModal Generalized Distillation for Text-Video Retrieval](https://arxiv.org/abs/2104.08271)
:house:[project](https://www.robots.ox.ac.uk/~vgg/research/teachtext/)
* Video-Text Retrieval
* [HiT: Hierarchical Transformer With Momentum Contrast for Video-Text Retrieval](https://arxiv.org/abs/2103.15049)
* Image-Based 3D Shape Retrieval
* [Single Image 3D Shape Retrieval via Cross-Modal Instance and Category Contrastive Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Lin_Single_Image_3D_Shape_Retrieval_via_Cross-Modal_Instance_and_Category_ICCV_2021_paper.pdf)
* Nearest Neighbor Search
* [Product Quantizer Aware Inverted Index for Scalable Nearest Neighbor Search](https://openaccess.thecvf.com/content/ICCV2021/papers/Noh_Product_Quantizer_Aware_Inverted_Index_for_Scalable_Nearest_Neighbor_Search_ICCV_2021_paper.pdf)

## 28.Contrastive Learning
* [Improving Contrastive Learning by Visualizing Feature Transformation](https://arxiv.org/abs/2108.02982)
:open_mouth:oral:star:[code](https://github.com/DTennant/CL-Visualizing-Feature-Transformation)
* [TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment](https://arxiv.org/abs/2108.09980)
:newspaper:Explainer: [ICCV2021 TACo: Microsoft and CMU propose a token-aware cascade contrastive learning method that "crushes" other SOTA methods on video-text alignment](https://mp.weixin.qq.com/s/sNwvYL1qsgyVrRe3-QmzhA)
* [A Broad Study on the Transferability of Visual Representations With Contrastive Learning](https://arxiv.org/abs/2103.13517)
:star:[code](https://github.com/asrafulashiq/transfer_broad)
* [Vi2CLR: Video and Image for Visual Contrastive Learning of Representation](https://openaccess.thecvf.com/content/ICCV2021/papers/Diba_Vi2CLR_Video_and_Image_for_Visual_Contrastive_Learning_of_Representation_ICCV_2021_paper.pdf)
* [LatentCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions](https://arxiv.org/abs/2104.00820)
:star:[code](https://github.com/catlab-team/latentclr)
* [CrossCLR: Cross-Modal Contrastive Learning for Multi-Modal Video Representations](https://arxiv.org/abs/2109.14910)
* [Social NCE: Contrastive Learning of Socially-Aware Motion Representations](https://arxiv.org/abs/2012.11717)
:star:[code](https://github.com/vita-epfl/social-nce):tv:[video](https://www.youtube.com/watch?v=s1khZWWiQfA)
* [With a Little Help From My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations](https://arxiv.org/abs/2104.14548)
* [Contrastive Learning of Image Representations With Cross-Video Cycle-Consistency](https://arxiv.org/abs/2105.06463)
:house:[project](https://happywu.github.io/cycle_contrast_video/)
* [Weakly Supervised Contrastive Learning](https://arxiv.org/abs/2110.04770)

## 27.Multi-label Image Recognition
* [Residual Attention: A Simple but Effective Method for Multi-Label Recognition](https://arxiv.org/abs/2108.02456)
:star:[code](https://github.com/Kevinz-code/CSRA)
* [Transformer-based Dual Relation Graph for Multi-label Image Recognition](https://arxiv.org/abs/2110.04722)

## 26.Image Processing
* [Aligning Latent and Image Spaces to Connect the Unconnectable](https://arxiv.org/abs/2104.06954)
:star:[code](https://github.com/universome/alis):house:[project](https://universome.github.io/alis)
* Image Shape Manipulation
* [Image Shape Manipulation from a Single Augmented Training Sample](https://arxiv.org/abs/2109.06151)
:open_mouth:oral:star:[code](https://github.com/eliahuhorwitz/DeepSIM):house:[project](http://www.vision.huji.ac.il/deepsim/)
* Edge Detection
* [RINDNet: Edge Detection for Discontinuity in Reflectance, Illumination, Normal and Depth](https://arxiv.org/abs/2108.00616)
:open_mouth:oral:star:[code](https://github.com/MengyangPu/RINDNet)
* [Pixel Difference Networks for Efficient Edge Detection](https://arxiv.org/abs/2108.07009)
:star:[code](https://github.com/zhuoinoulu/pidinet)
* Image Recognition
* [MicroNet: Improving Image Recognition with Extremely Low FLOPs](https://arxiv.org/abs/2108.05894)
:star:[code](https://github.com/liyunsheng13/micronet)
* Image Deblurring
* [Rethinking Coarse-to-Fine Approach in Single Image Deblurring](https://arxiv.org/abs/2108.05054)
:star:[code](https://github.com/chosj95/MIMO-UNet)
* [Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions](https://arxiv.org/abs/2108.09108)
* [Defocus Map Estimation and Deblurring From a Single Dual-Pixel Image](https://openaccess.thecvf.com/content/ICCV2021/papers/Xin_Defocus_Map_Estimation_and_Deblurring_From_a_Single_Dual-Pixel_Image_ICCV_2021_paper.pdf)
* [Motion Deblurring with Real Events](https://arxiv.org/abs/2109.13695)
* [Pyramid Architecture Search for Real-Time Image Deblurring](https://openaccess.thecvf.com/content/ICCV2021/papers/Hu_Pyramid_Architecture_Search_for_Real-Time_Image_Deblurring_ICCV_2021_paper.pdf)
* Motion Deblurring
* [Perceptual Variousness Motion Deblurring With Light Global Context Refinement](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Perceptual_Variousness_Motion_Deblurring_With_Light_Global_Context_Refinement_ICCV_2021_paper.pdf)
* Video Deblurring
* [Bringing Events Into Video Deblurring With Non-Consecutively Blurry Frames](https://openaccess.thecvf.com/content/ICCV2021/papers/Shang_Bringing_Events_Into_Video_Deblurring_With_Non-Consecutively_Blurry_Frames_ICCV_2021_paper.pdf)
:star:[code](https://github.com/shangwei5/D2Net)
* Image Quality Assessment (IQA)
* [MUSIQ: Multi-scale Image Quality Transformer](https://arxiv.org/abs/2108.05997)
:star:[code](https://github.com/google-research/google-research/tree/master/musiq)
* [Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment](https://arxiv.org/abs/2108.07948)
:star:[code](https://github.com/researchmm/CKDN)
* Image Harmonization
* [SSH: A Self-Supervised Framework for Image Harmonization](https://arxiv.org/abs/2108.06805)
:star:[code](https://github.com/VITA-Group/SSHarmonization)
* Shadow Removal
* [CANet: A Context-Aware Network for Shadow Removal](https://arxiv.org/abs/2108.09894)
:star:[code](https://github.com/Zipei-Chen/CANet)
* [DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised Domain-Classifier Guided Network](https://openaccess.thecvf.com/content/ICCV2021/papers/Jin_DC-ShadowNet_Single-Image_Hard_and_Soft_Shadow_Removal_Using_Unsupervised_Domain-Classifier_ICCV_2021_paper.pdf)
* Denoising
* [Rethinking Deep Image Prior for Denoising](https://arxiv.org/abs/2108.12841)
:star:[code](https://github.com/gistvision/DIP-denosing)
* [Rethinking Noise Synthesis and Modeling in Raw Denoising](https://arxiv.org/abs/2110.04756)
:star:[code](https://github.com/zhangyi-3/noise-synthesis)
* [C2N: Practical Generative Noise Modeling for Real-World Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Jang_C2N_Practical_Generative_Noise_Modeling_for_Real-World_Denoising_ICCV_2021_paper.pdf)
* [The Benefit of Distraction: Denoising Camera-Based Physiological Measurements Using Inverse Attention](https://openaccess.thecvf.com/content/ICCV2021/papers/Nowara_The_Benefit_of_Distraction_Denoising_Camera-Based_Physiological_Measurements_Using_Inverse_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ewanowara/benefitofdistraction)
* [Hyperspectral Image Denoising with Realistic Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Hyperspectral_Image_Denoising_With_Realistic_Data_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ColinTaoZhang/HSIDwRD)
* [End-to-End Unsupervised Document Image Blind Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Gangeh_End-to-End_Unsupervised_Document_Image_Blind_Denoising_ICCV_2021_paper.pdf)
* [Cross-Patch Graph Convolutional Network for Image Denoising](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Cross-Patch_Graph_Convolutional_Network_for_Image_Denoising_ICCV_2021_paper.pdf)
* Video Denoising
* [Patch Craft: Video Denoising by Deep Modeling and Patch Matching](http://arxiv.org/abs/2103.13767)
* Image Colorization
* [Towards Vivid and Diverse Image Colorization with Generative Color Prior](https://arxiv.org/abs/2108.08826)
* [Deep Edge-Aware Interactive Colorization Against Color-Bleeding Effects](https://arxiv.org/abs/2107.01619)
:open_mouth:oral:house:[project](https://eungyeupkim.github.io/edge-enhancing-colorization/)
* Image Enhancement
* [Real-time Image Enhancer via Learnable Spatial-aware 3D Lookup Tables](https://arxiv.org/abs/2108.08697)
* [Adaptive Unfolding Total Variation Network for Low-Light Image Enhancement](https://arxiv.org/abs/2110.00984)
:star:[code](https://github.com/CharlieZCJ/UTVNet)
* [Representative Color Transform for Image Enhancement](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Representative_Color_Transform_for_Image_Enhancement_ICCV_2021_paper.pdf)
* [STAR: A Structure-Aware Lightweight Transformer for Real-Time Image Enhancement](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_STAR_A_Structure-Aware_Lightweight_Transformer_for_Real-Time_Image_Enhancement_ICCV_2021_paper.pdf)
* [Deep Symmetric Network for Underexposed Image Enhancement With Recurrent Attentional Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhao_Deep_Symmetric_Network_for_Underexposed_Image_Enhancement_With_Recurrent_Attentional_ICCV_2021_paper.pdf)
:star:[code](https://github.com/lin-zhao-resoLve/Deep-Symmetric-Network-Enhancement):house:[project](https://www.shaopinglu.net/proj-iccv21/ImageEnhancement.html)
* [StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement](https://arxiv.org/abs/2107.12898)
* Image Restoration
* [Spatially-Adaptive Image Restoration using Distortion-Guided Networks](https://arxiv.org/abs/2108.08617)
:star:[code](https://github.com/human-analysis/spatially-adaptive-image-restoration)
* [Dynamic Attentive Graph Learning for Image Restoration](https://arxiv.org/abs/2109.06620)
:star:[code](https://github.com/jianzhangcs/DAGL)
* [Self-Supervised Cryo-Electron Tomography Volumetric Image Restoration From Single Noisy Volume With Sparsity Constraint](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_Self-Supervised_Cryo-Electron_Tomography_Volumetric_Image_Restoration_From_Single_Noisy_Volume_ICCV_2021_paper.pdf)
:star:[code](https://github.com/icthrm/SC-Net)
* [Searching for Controllable Image Restoration Networks](https://arxiv.org/abs/2012.11225)
:star:[code](https://github.com/ghimhw)
* Image Compression
* [Variable-Rate Deep Image Compression through Spatially-Adaptive Feature Transform](https://arxiv.org/abs/2108.09551)
:star:[code](https://github.com/micmic123/QmapCompression)
* [Neural Image Compression via Attentional Multi-Scale Back Projection and Frequency Decomposition](https://openaccess.thecvf.com/content/ICCV2021/papers/Gao_Neural_Image_Compression_via_Attentional_Multi-Scale_Back_Projection_and_Frequency_ICCV_2021_paper.pdf)
* Image Inpainting
* [Image Inpainting via Conditional Texture and Structure Dual Generation](https://arxiv.org/abs/2108.09760)
:star:[code](https://github.com/Xiefan-Guo/CTSDG)
* [CR-Fill: Generative Image Inpainting With Auxiliary Contextual Reconstruction](https://openaccess.thecvf.com/content/ICCV2021/papers/Zeng_CR-Fill_Generative_Image_Inpainting_With_Auxiliary_Contextual_Reconstruction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zengxianyu/crfill)
* [Parallel Multi-Resolution Fusion Network for Image Inpainting](https://openaccess.thecvf.com/content/ICCV2021/papers/Wang_Parallel_Multi-Resolution_Fusion_Network_for_Image_Inpainting_ICCV_2021_paper.pdf)
* [Painting from Part](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Painting_From_Part_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zhenglab/partpainting)
* [WaveFill: A Wavelet-Based Generation Network for Image Inpainting](https://arxiv.org/abs/2107.11027)
* [Distillation-Guided Image Inpainting](https://openaccess.thecvf.com/content/ICCV2021/papers/Suin_Distillation-Guided_Image_Inpainting_ICCV_2021_paper.pdf)
* [Learning a Sketch Tensor Space for Image Inpainting of Man-made Scenes](https://arxiv.org/abs/2103.15087)
:star:[code](https://github.com/ewrfcas/MST_inpainting):house:[project](https://ewrfcas.github.io/MST_inpainting/)
* Image Extrapolation
* [SemIE: Semantically-aware Image Extrapolation](https://arxiv.org/abs/2108.13702)
:house:[project](https://semie-iccv.github.io/)
* Reversible Image Conversion
* [IICNet: A Generic Framework for Reversible Image Conversion](https://arxiv.org/abs/2109.04242)
:star:[code](https://github.com/felixcheng97/IICNet)
* Artifact Removal
* [Towards Flexible Blind JPEG Artifacts Removal](https://arxiv.org/abs/2109.14573)
:star:[code](https://github.com/jiaxi-jiang/FBCNN)
* [Learning Dual Priors for JPEG Compression Artifacts Removal](https://openaccess.thecvf.com/content/ICCV2021/papers/Fu_Learning_Dual_Priors_for_JPEG_Compression_Artifacts_Removal_ICCV_2021_paper.pdf)
* [Let's See Clearly: Contaminant Artifact Removal for Moving Cameras](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Lets_See_Clearly_Contaminant_Artifact_Removal_for_Moving_Cameras_ICCV_2021_paper.pdf)
* De-rendering
* [De-rendering Stylized Texts](https://arxiv.org/abs/2110.01890)
:star:[code](https://github.com/CyberAgentAILab/derendering-text):house:[project](https://cyberagentailab.github.io/derendering-text/)
* Flare Removal
* [Light Source Guided Single-Image Flare Removal From Unpaired Data](https://openaccess.thecvf.com/content/ICCV2021/papers/Qiao_Light_Source_Guided_Single-Image_Flare_Removal_From_Unpaired_Data_ICCV_2021_paper.pdf)
* [How to Train Neural Networks for Flare Removal](https://arxiv.org/abs/2011.12485)
:house:[project](https://yichengwu.github.io/flare-removal/):tv:[video](https://www.youtube.com/watch?v=eAXhcDjWoZ0)
* Panoramic Stitching
* [Minimal Solutions for Panoramic Stitching Given Gravity Prior](https://arxiv.org/abs/2012.00465)
* Image Cropping
* [TransView: Inside, Outside, and Across the Cropping View Boundaries](https://openaccess.thecvf.com/content/ICCV2021/papers/Pan_TransView_Inside_Outside_and_Across_the_Cropping_View_Boundaries_ICCV_2021_paper.pdf)
* [Dissecting Image Crops](https://arxiv.org/abs/2011.11831)
:star:[code](https://github.com/basilevh/dissecting-image-crops)
* Reflection Removal
* [Location-Aware Single Image Reflection Removal](https://arxiv.org/abs/2012.07131)
:star:[code](https://github.com/zdlarr/Location-aware-SIRR)
* [V-DESIRR: Very Fast Deep Embedded Single Image Reflection Removal](https://openaccess.thecvf.com/content/ICCV2021/papers/Prasad_V-DESIRR_Very_Fast_Deep_Embedded_Single_Image_Reflection_Removal_ICCV_2021_paper.pdf)
:star:[code](https://www.github.com/ee19d005/vdesirr)
* Deraining
* [Improving De-Raining Generalization via Neural Reorganization](https://openaccess.thecvf.com/content/ICCV2021/papers/Xiao_Improving_De-Raining_Generalization_via_Neural_Reorganization_ICCV_2021_paper.pdf)
* [Unpaired Learning for Deep Image Deraining with Rain Direction Regularizer](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_Unpaired_Learning_for_Deep_Image_Deraining_With_Rain_Direction_Regularizer_ICCV_2021_paper.pdf)
:house:[project](https://lewisyangliu.github.io/projects/UDRDR/)
* [Structure-Preserving Deraining With Residue Channel Prior Guidance](https://arxiv.org/abs/2108.09079)
:star:[code](https://github.com/Joyies/SPDNet)
* Image Distortion Removal
* [Unsupervised Non-Rigid Image Distortion Removal via Grid Deformation](https://openaccess.thecvf.com/content/ICCV2021/papers/Li_Unsupervised_Non-Rigid_Image_Distortion_Removal_via_Grid_Deformation_ICCV_2021_paper.pdf)
:star:[code](https://github.com/Nianyi-Li/unsupervised-NDIR):tv:[video](https://www.youtube.com/watch?v=aeJkb5u0Cb8)
* Refractive Distortion Removal for Underwater Images
* [Learning To Remove Refractive Distortions From Underwater Images](https://openaccess.thecvf.com/content/ICCV2021/papers/Thapa_Learning_To_Remove_Refractive_Distortions_From_Underwater_Images_ICCV_2021_paper.pdf)
* Image Completion
* [High-Fidelity Pluralistic Image Completion With Transformers](https://arxiv.org/abs/2103.14031)
:star:[code](https://github.com/raywzy/ICT):house:[project](http://raywzy.com/ICT/)
* Image Decomposition
* [Unsupervised Layered Image Decomposition into Object Prototypes](https://arxiv.org/abs/2104.14575)
* Distortion Rectification
* [Towards Complete Scene and Regular Shape for Distortion Rectification by Curve-Aware Extrapolation](https://openaccess.thecvf.com/content/ICCV2021/papers/Liao_Towards_Complete_Scene_and_Regular_Shape_for_Distortion_Rectification_by_ICCV_2021_paper.pdf)
* HDR
* [Unpaired Learning for High Dynamic Range Image Tone Mapping](https://openaccess.thecvf.com/content/ICCV2021/papers/Vinker_Unpaired_Learning_for_High_Dynamic_Range_Image_Tone_Mapping_ICCV_2021_paper.pdf)
* Ultra-High-Definition Image HDR Reconstruction
* [Ultra-High-Definition Image HDR Reconstruction via Collaborative Bilateral Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Zheng_Ultra-High-Definition_Image_HDR_Reconstruction_via_Collaborative_Bilateral_Learning_ICCV_2021_paper.pdf)
* Image Desnowing
* [ALL Snow Removed: Single Image Desnowing Algorithm Using Hierarchical Dual-Tree Complex Wavelet Representation and Contradict Channel Loss](https://openaccess.thecvf.com/content/ICCV2021/papers/Chen_ALL_Snow_Removed_Single_Image_Desnowing_Algorithm_Using_Hierarchical_Dual-Tree_ICCV_2021_paper.pdf)
:star:[code](https://github.com/weitingchen83/ICCV2021-Single-Image-Desnowing-HDCWNet)
* Image Harmonization
* [Image Harmonization With Transformer](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_Image_Harmonization_With_Transformer_ICCV_2021_paper.pdf)
:star:[code](https://github.com/zhenglab/HarmonyTransformer)
* Image Editing
* [Language-Guided Global Image Editing via Cross-Modal Cyclic Mechanism](https://openaccess.thecvf.com/content/ICCV2021/papers/Jiang_Language-Guided_Global_Image_Editing_via_Cross-Modal_Cyclic_Mechanism_ICCV_2021_paper.pdf)
* Image Hiding
* [HiNet: Deep Image Hiding by Invertible Network](https://openaccess.thecvf.com/content/ICCV2021/papers/Jing_HiNet_Deep_Image_Hiding_by_Invertible_Network_ICCV_2021_paper.pdf)
:star:[code](https://github.com/TomTomTommi/HiNet)

## 25.Medical Image
* [Equivariant Imaging: Learning Beyond the Range Space](https://arxiv.org/abs/2103.14756)
:open_mouth:oral:star:[code](https://github.com/edongdongchen/EI)
* [Deep Survival Analysis With Longitudinal X-Rays for COVID-19](https://openaccess.thecvf.com/content/ICCV2021/papers/Shu_Deep_Survival_Analysis_With_Longitudinal_X-Rays_for_COVID-19_ICCV_2021_paper.pdf)
* Medical Image Segmentation
* [Recurrent Mask Refinement for Few-Shot Medical Image Segmentation](https://arxiv.org/abs/2108.00622)
:star:[code](https://github.com/uci-cbcl/RP-Net)
* [Graph-BAS3Net: Boundary-Aware Semi-Supervised Segmentation Network With Bilateral Graph Convolution](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_Graph-BAS3Net_Boundary-Aware_Semi-Supervised_Segmentation_Network_With_Bilateral_Graph_Convolution_ICCV_2021_paper.pdf)
* Lesion Segmentation
* [T-AutoML: Automated Machine Learning for Lesion Segmentation Using Transformers in 3D Medical Imaging](https://openaccess.thecvf.com/content/ICCV2021/papers/Yang_T-AutoML_Automated_Machine_Learning_for_Lesion_Segmentation_Using_Transformers_in_ICCV_2021_paper.pdf)
* Polyp Segmentation
* [Collaborative and Adversarial Learning of Focused and Dispersive Representations for Semi-Supervised Polyp Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Wu_Collaborative_and_Adversarial_Learning_of_Focused_and_Dispersive_Representations_for_ICCV_2021_paper.pdf)
* Vessel Segmentation
* [Self-Supervised Vessel Segmentation via Adversarial Learning](https://github.com/AISIGSJTU/SSVS)
* Brain Tumor Segmentation
* [RFNet: Region-Aware Fusion Network for Incomplete Multi-Modal Brain Tumor Segmentation](https://openaccess.thecvf.com/content/ICCV2021/papers/Ding_RFNet_Region-Aware_Fusion_Network_for_Incomplete_Multi-Modal_Brain_Tumor_Segmentation_ICCV_2021_paper.pdf)
* Pathology Image Representation
* [A QuadTree Image Representation for Computational Pathology](https://arxiv.org/abs/2108.10873)
* Medical Image Analysis
* [Preservational Learning Improves Self-supervised Medical Image Models by Reconstructing Diverse Contexts](https://arxiv.org/abs/2109.04379)
:star:[code](https://github.com/Luchixiang/PCRL)
:newspaper:Explainer: [ICCV2021: Works for both 2D and 3D! A new self-supervised SOTA for medical imaging (code open-sourced)](https://mp.weixin.qq.com/s/mM0ddlImo87a8tDkRbsfHg)
* Medical Image Denoising
* [Eformer: Edge Enhancement based Transformer for Medical Image Denoising](https://arxiv.org/abs/2109.08044)
* Video Translation
* [Long-Term Temporally Consistent Unpaired Video Translation From Simulated Surgical 3D Data](https://arxiv.org/abs/2103.17204)
:star:[code](https://gitlab.com/nct_tso_public/surgical-video-sim2real):house:[project](http://opencas.dkfz.de/video-sim2real/)
* Nuclei Detection and Segmentation in Pathology Images
* [Mutual-Complementing Framework for Nuclei Detection and Segmentation in Pathology Image](https://openaccess.thecvf.com/content/ICCV2021/papers/Feng_Mutual-Complementing_Framework_for_Nuclei_Detection_and_Segmentation_in_Pathology_Image_ICCV_2021_paper.pdf)
* Medical Report Generation
* [Visual-Textual Attentive Semantic Consistency for Medical Report Generation](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_Visual-Textual_Attentive_Semantic_Consistency_for_Medical_Report_Generation_ICCV_2021_paper.pdf)
* CT
* [3DeepCT: Learning Volumetric Scattering Tomography of Clouds](https://openaccess.thecvf.com/content/ICCV2021/papers/Sde-Chen_3DeepCT_Learning_Volumetric_Scattering_Tomography_of_Clouds_ICCV_2021_paper.pdf)
* [IntraTomo: Self-Supervised Learning-Based Tomography via Sinogram Synthesis and Prediction](https://openaccess.thecvf.com/content/ICCV2021/papers/Zang_IntraTomo_Self-Supervised_Learning-Based_Tomography_via_Sinogram_Synthesis_and_Prediction_ICCV_2021_paper.pdf)
:star:[code](https://github.com/vccimaging/IntraTomo)
* CT Reconstruction
* [Dynamic CT Reconstruction From Limited Views With Implicit Neural Representations and Parametric Motion Fields](https://arxiv.org/abs/2104.11745)
:star:[code](https://github.com/awreed/DynamicCTReconstruction)
* Medical Image Recognition
* [GLoRIA: A Multimodal Global-Local Representation Learning Framework for Label-Efficient Medical Image Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Huang_GLoRIA_A_Multimodal_Global-Local_Representation_Learning_Framework_for_Label-Efficient_Medical_ICCV_2021_paper.pdf)
:star:[code](https://github.com/marshuang80/gloria)
* Medical Image Classification
* [Big Self-Supervised Models Advance Medical Image Classification](https://arxiv.org/abs/2101.05224)
* [Large-Scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification](https://arxiv.org/abs/2012.03173)
:star:[code](https://github.com/Optimization-AI/LibAUC)

## 24.Face
* [VariTex: Variational Neural Face Textures](https://arxiv.org/abs/2104.05988)
:star:[code](https://github.com/mcbuehler/VariTex):house:[project](https://mcbuehler.github.io/VariTex/):tv:[video](https://www.youtube.com/watch?v=6-GFHcLkbik)
* Face Forgery Detection
* [OpenForensics: Large-Scale Challenging Dataset For Multi-Face Forgery Detection And Segmentation In-The-Wild](https://arxiv.org/abs/2107.14480)
:house:[project](https://sites.google.com/view/ltnghia/research/openforensics)
* [Exploring Temporal Coherence for More General Video Face Forgery Detection](https://arxiv.org/abs/2108.06693)
* Face Synthesis
* [Disentangled Lifespan Face Synthesis](https://arxiv.org/abs/2108.02874)
:star:[code](https://github.com/SenHe/DLFS):house:[project](https://senhe.github.io/projects/iccv_2021_lifespan_face/):tv:[video](https://youtu.be/uklX03ns0m0)
* Face Recognition
* [PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition](https://arxiv.org/abs/2108.03764)
* [SynFace: Face Recognition with Synthetic Data](https://arxiv.org/abs/2108.07960)
* [Adaptive Label Noise Cleaning With Meta-Supervision for Deep Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhang_Adaptive_Label_Noise_Cleaning_With_Meta-Supervision_for_Deep_Face_Recognition_ICCV_2021_paper.pdf)
* [Disentangled Representation for Age-Invariant Face Recognition: A Mutual Information Minimization Perspective](https://openaccess.thecvf.com/content/ICCV2021/papers/Hou_Disentangled_Representation_for_Age-Invariant_Face_Recognition_A_Mutual_Information_Minimization_ICCV_2021_paper.pdf)
* [Teacher-Student Adversarial Depth Hallucination To Improve Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Uppal_Teacher-Student_Adversarial_Depth_Hallucination_To_Improve_Face_Recognition_ICCV_2021_paper.pdf)
:star:[code](https://github.com/hardik-uppal/teacher-student-gan)
* [DAM: Discrepancy Alignment Metric for Face Recognition](https://openaccess.thecvf.com/content/ICCV2021/papers/Liu_DAM_Discrepancy_Alignment_Metric_for_Face_Recognition_ICCV_2021_paper.pdf)
* Face De-Identification
* [Personalized and Invertible Face De-Identification by Disentangled Identity Information Manipulation](https://openaccess.thecvf.com/content/ICCV2021/papers/Cao_Personalized_and_Invertible_Face_De-Identification_by_Disentangled_Identity_Information_Manipulation_ICCV_2021_paper.pdf)
* Face Perception
* [Learning Facial Representations from the Cycle-consistency of Face](https://arxiv.org/abs/2108.03427)
:star:[code](https://github.com/JiaRenChang/FaceCycle)
* Talking Face Generation
* [FACIAL: Synthesizing Dynamic Talking Face with Implicit Attribute Learning](https://arxiv.org/abs/2108.07938)
* Talking Head Synthesis
* [AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis](https://openaccess.thecvf.com/content/ICCV2021/papers/Guo_AD-NeRF_Audio_Driven_Neural_Radiance_Fields_for_Talking_Head_Synthesis_ICCV_2021_paper.pdf)
:star:[code](https://github.com/YudongGuo/AD-NeRF)
* [Learned Spatial Representations for Few-Shot Talking-Head Synthesis](https://arxiv.org/abs/2104.14557)
:star:[code](https://github.com/MoustafaMeshry/lsr):house:[project](http://www.cs.umd.edu/~mmeshry/projects/lsr/)
* Facial Expression Recognition
* [Understanding and Mitigating Annotation Bias in Facial Expression Recognition](https://arxiv.org/abs/2108.08504)
* [TransFER: Learning Relation-aware Facial Expression Representations with Transformers](https://arxiv.org/abs/2108.11116)
* Face Presentation Attack Detection
* [Detection and Continual Learning of Novel Face Presentation Attacks](https://arxiv.org/abs/2108.12081)
:star:[code](https://github.com/mrostami1366)
* Face Editing
* [Talk-to-Edit: Fine-Grained Facial Editing via Dialog](https://arxiv.org/abs/2109.04425)
:star:[code](https://github.com/yumingj/Talk-to-Edit):house:[project](https://www.mmlab-ntu.com/project/talkedit/)
:newspaper:Explainer: [ICCV2021 | Nanyang Technological University and CUHK propose Talk-to-Edit: highly fine-grained face editing through dialog](https://mp.weixin.qq.com/s/48FsUqsppXaXUu-QMUIhCQ)
* [A Latent Transformer for Disentangled Face Editing in Images and Videos](https://arxiv.org/abs/2106.11895)
:star:[code](https://github.com/InterDigitalInc/latent-transformer)
* Face Alignment
* [ADNet: Leveraging Error-Bias Towards Normal Direction in Face Alignment](https://arxiv.org/abs/2109.05721)
* Face Image Reconstruction
* [Focal Frequency Loss for Image Reconstruction and Synthesis](https://arxiv.org/abs/2012.12821)
:star:[code](https://github.com/EndlessSora/focal-frequency-loss):house:[project](https://www.mmlab-ntu.com/project/ffl/index.html):tv:[video](https://www.youtube.com/watch?v=RNTnDtKvcpc)
* [Towards High Fidelity Monocular Face Reconstruction With Rich Reflectance Using Self-Supervised Learning and Ray Tracing](https://arxiv.org/abs/2103.15432)
* [Neural Photofit: Gaze-Based Mental Image Reconstruction](https://arxiv.org/abs/2108.07524)
:house:[project](https://perceptualui.org/publications/strohm21_iccv/)
* 3D Face Reconstruction
* [Topologically Consistent Multi-View Face Inference Using Volumetric Sampling](https://arxiv.org/abs/2110.02948)
:house:[project](https://tianyeli.github.io/tofu)
* [Self-Supervised 3D Face Reconstruction via Conditional Estimation](https://arxiv.org/abs/2110.04800)
* 3D Face Animation
* [MeshTalk: 3D Face Animation From Speech Using Cross-Modality Disentanglement](https://openaccess.thecvf.com/content/ICCV2021/papers/Richard_MeshTalk_3D_Face_Animation_From_Speech_Using_Cross-Modality_Disentanglement_ICCV_2021_paper.pdf)
:star:[code](https://github.com/facebookresearch/meshtalk):tv:[video](https://research.fb.com/wp-content/uploads/2021/04/mesh_talk.mp4)
* Remote Photoplethysmography (rPPG)
* [The Way to My Heart Is Through Contrastive Learning: Remote Photoplethysmography From Unlabelled Video](https://openaccess.thecvf.com/content/ICCV2021/papers/Gideon_The_Way_to_My_Heart_Is_Through_Contrastive_Learning_Remote_ICCV_2021_paper.pdf)
:star:[code](https://github.com/ToyotaResearchInstitute/RemotePPG)
* Face Encryption
* [Towards Face Encryption by Generating Adversarial Identity Masks](https://arxiv.org/abs/2003.06814)
:star:[code](https://github.com/ShawnXYang/TIP-IM)
* Deepfake Detection
* [Learning Self-Consistency for Deepfake Detection](https://arxiv.org/abs/2012.09311)
:open_mouth:oral
* [Joint Audio-Visual Deepfake Detection](https://openaccess.thecvf.com/content/ICCV2021/papers/Zhou_Joint_Audio-Visual_Deepfake_Detection_ICCV_2021_paper.pdf)
* [Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data](https://arxiv.org/abs/2007.08457)
:open_mouth:oral
* Face Texture Completion
* [Learning High-Fidelity Face Texture Completion Without Complete Face Texture](https://openaccess.thecvf.com/content/ICCV2021/papers/Kim_Learning_High-Fidelity_Face_Texture_Completion_Without_Complete_Face_Texture_ICCV_2021_paper.pdf)
* Facial Action Unit Detection
* [PIAP-DF: Pixel-Interested and Anti Person-Specific Facial Action Unit Detection Net With Discrete Feedback Learning](https://openaccess.thecvf.com/content/ICCV2021/papers/Tang_PIAP-DF_Pixel-Interested_and_Anti_Person-Specific_Facial_Action_Unit_Detection_Net_ICCV_2021_paper.pdf)
* Face Analysis
* [Fake I