https://github.com/52cv/cvpr-2023-papers

Host: GitHub
URL: https://github.com/52cv/cvpr-2023-papers
Owner: 52CV
Created: 2022-11-30T02:57:26.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2023-11-01T08:00:36.000Z (over 2 years ago)
Last Synced: 2025-02-24T05:14:37.758Z (over 1 year ago)
Size: 248 KB
Stars: 936
Watchers: 15
Forks: 76
Open Issues: 1
Metadata Files:
- Readme: README.md
README

# CVPR-2023-Papers

![1ad4f8f92d9208b0f4b579e426b2dcd](https://user-images.githubusercontent.com/62801906/225788627-781870be-cc92-4054-b865-e2556b88cefc.jpg)

# ❣❣❣ CVPR 2023 paper categorization is complete

# :loudspeaker::loudspeaker::loudspeaker:Award-Winning Papers

### :trophy:Best Paper

* [Planning-oriented Autonomous Driving](https://arxiv.org/abs/2212.10156)
:house:[project](https://opendrivelab.github.io/UniAD/)

* [Visual Programming: Compositional visual reasoning without training](https://arxiv.org/abs/2211.11559)

### :trophy:Best Student Paper

* [3D Registration with Maximal Cliques](http://arxiv.org/abs/2305.10854v1)

### :trophy:Honorable Mention

* [DynIBaR: Neural Dynamic Image-Based Rendering](https://arxiv.org/abs/2211.11082)
:house:[project](http://dynibar.github.io/)

### :trophy:Honorable Mention (Student)

* [DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation](https://arxiv.org/abs/2208.12242)
:house:[project](https://dreambooth.github.io/)

## Survey papers from past years by category: ↘️[CV-Surveys](https://github.com/52CV/CV-Surveys) (under construction)

## 2024 papers by category

↘️[WACV-2024-Papers](https://github.com/52CV/WACV-2024-Papers)

## 2023 papers by category

↘️[CVPR-2023-Papers](https://github.com/52CV/CVPR-2023-Papers)

↘️[WACV-2023-Papers](https://github.com/52CV/WACV-2023-Papers)

↘️[ICCV-2023-Papers](https://github.com/52CV/ICCV-2023-Papers)

## [2022 papers by category](#000)

## [2021 papers by category](#00)

## [2020 papers by category](#0)

## Table of Contents

|:cat:|:dog:|:tiger:|:wolf:|
|------|------|------|------|
|[1.Others](#1)|[2.Image Segmentation](#2)|[3.Image Processing](#4)|[4.Image Captioning](#)|
|[5.Object Detection](#5)|[6.Object Tracking](#6)|[7.Point Cloud](#7)|[8.Action Detection](#8)|
|[9.Human Pose Estimation](#9)|[10.3D Vision](#10)|[11.Face](#11)|[12.Image-to-Image Translation](#12)|
|[13.GAN](#13)|[14.Video](#14)|[15.Transformer](#15)|[16.Semi/self-supervised learning](#16)|
|[17.Medical Image](#17)|[18.Person Re-Identification](#18)|[19.Neural Architecture Search](#19)|[20.Autonomous vehicles](#20)|
|[21.UAV/Remote Sensing/Satellite Image](#21)|[22.Image Synthesis/Generation](#22)|[23.Image Retrieval](#23)|[24.Super-Resolution](#24)|
|[25.Fine-Grained/Image Classification](#25)|[26.GCN/GNN](#26)|[27.Object Pose Estimation](#27)|[28.Style Transfer](#28)|
|[29.Augmented Reality/Virtual Reality/Robotics](#29)|[30.Visual Question Answering](#30)|[31.Vision-Language](#31)|[32.Data Augmentation](#32)|
|[33.Human-Object Interaction](#33)|[34.Model Compression/Knowledge Distillation/Pruning](#34)|[35.OCR](#35)|[36.Optical Flow](#36)|
|[37.Contrastive Learning](#37)|[38.Meta-Learning](#38)|[39.Continual Learning](#39)|[40.Adversarial Learning](#40)|
|[41.Incremental Learning](#41)|[42.Metric Learning](#42)|[43.Multi-Task Learning](#43)|[44.Federated Learning](#44)|
|[45.Dense Prediction](#45)|[46.Scene Graph Generation](#46)|[47.Few/Zero-Shot Learning/DG/Adaptation](#47)|[48.NLP](#48)|
|[49.Image Geo-localization](#49)|[50.Anomaly Detection](#50)|[51.Optics, Geometry, and Light Field Imaging](#51)|[52.Human Motion Forecasting](#52)|
|[53.Sign Language Translation](#53)|[54.Benchmark/Dataset](#54)|[55.Novel View Synthesis](#55)|[56.Sound](#56)|
|[57.Gaze Estimation](#57)|[58.Neural rendering](#58)|[59.Image/Video Compression](#59)|[60.Industrial Anomaly Detection](#60)|
|[61.Object Re-identification](#61)|[62.Object Counting](#62)|[63.edge detection](#63)|[64.Motion Retargeting](#64)|
|[65.Scene flow estimation](#65)|[66.Clustering](#66)|[67.Active Learning](#67)|[68.Lifelong Learning](#68)|
|[69.Reinforcement learning](#69)|[70.Image Forgery Detection](#70)|[71.visual reasoning](#71)|[72.open-set recognition](#72)|
|[73.Neural Radiance Fields](#73)|[74.Machine Learning](#74)|[75.Semantic Scene Completion](#75)|[76.IP protection](#76)|
|[77.sketch](#77)|[78.Image/Video Editing](#78)|[79.thermal imaging technology](#79)|[80.Computer Graphics](#80)|



## 80.Computer Graphics

* [Learning Anchor Transformations for 3D Garment Animation](http://arxiv.org/abs/2304.00761v1)
:star:[code](https://semanticdh.github.io/AnchorDEF)

* [Trace and Pace: Controllable Pedestrian Animation via Guided Trajectory Diffusion](http://arxiv.org/abs/2304.01893v1)
:star:[code](https://nv-tlabs.github.io/trace-pace)

* [CloSET: Modeling Clothed Humans on Continuous Surface with Explicit Template Decomposition](http://arxiv.org/abs/2304.03167v1)
:house:[project](https://www.liuyebin.com/closet)

* [FLEX: Full-Body Grasping Without Full-Body Grasps](https://openaccess.thecvf.com/content/CVPR2023/papers/Tendulkar_FLEX_Full-Body_Grasping_Without_Full-Body_Grasps_CVPR_2023_paper.pdf)
:house:[project](https://flex.cs.columbia.edu)



## 79.thermal imaging technology

* [What Happened 3 Seconds Ago? Inferring the Past with Thermal Imaging](https://arxiv.org/abs/2304.13651)
:star:[code](https://github.com/ZitianTang/Thermal-IM)



## 78.Image/Video Editing

* [PREIM3D: 3D Consistent Precise Image Attribute Editing from a Single Image](https://arxiv.org/abs/2304.10263)
:house:[project](https://mybabyyh.github.io/Preim3D/)

* Text-driven video editing

  * [Shape-aware Text-driven Layered Video Editing](https://arxiv.org/abs/2301.13173)
:house:[project](https://text-video-edit.github.io/)

* Image Editing

  * [CoralStyleCLIP: Co-optimized Region and Layer Selection for Image Editing](https://arxiv.org/abs/2303.05031)

  * [SIEDOB: Semantic Image Editing by Disentangling Object and Background](http://arxiv.org/abs/2303.13062v1)

  * [NULL-Text Inversion for Editing Real Images Using Guided Diffusion Models](https://arxiv.org/abs/2211.09794)

  * [InstructPix2Pix: Learning To Follow Image Editing Instructions](https://arxiv.org/abs/2211.09800)
:house:[project](https://www.timothybrooks.com/instruct-pix2pix)

  * [Local 3D Editing via 3D Distillation of CLIP Knowledge](https://openaccess.thecvf.com/content/CVPR2023/papers/Hyung_Local_3D_Editing_via_3D_Distillation_of_CLIP_Knowledge_CVPR_2023_paper.pdf)

  * [Deep Curvilinear Editing: Commutative and Nonlinear Image Manipulation for Pretrained Deep Generative Model](http://arxiv.org/abs/2211.14573)

  * [Imagic: Text-Based Real Image Editing With Diffusion Models](http://arxiv.org/abs/2210.09276)

  * Exemplar-based image editing

    * [Paint by Example: Exemplar-based Image Editing with Diffusion Models](https://arxiv.org/abs/2211.13227)
:star:[code](https://github.com/Fantasy-Studio/Paint-by-Example)



## 77.sketch

* [Photo Pre-Training, but for Sketch](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Photo_Pre-Training_but_for_Sketch_CVPR_2023_paper.pdf)
:star:[code](https://github.com/KeLi-SketchX/Photo-Pre-Training-But-for-Sketch)

* [Restoration of Hand-Drawn Architectural Drawings Using Latent Space Mapping With Degradation Generator](https://openaccess.thecvf.com/content/CVPR2023/papers/Choi_Restoration_of_Hand-Drawn_Architectural_Drawings_Using_Latent_Space_Mapping_With_CVPR_2023_paper.pdf)

* [SECAD-Net: Self-Supervised CAD Reconstruction by Learning Sketch-Extrude Operations](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_SECAD-Net_Self-Supervised_CAD_Reconstruction_by_Learning_Sketch-Extrude_Operations_CVPR_2023_paper.pdf)
:star:[code](https://github.com/BunnySoCrazy/SECAD-Net)



## 76.IP protection

* [Model Barrier: A Compact Un-Transferable Isolation Domain for Model Intellectual Property Protection](https://arxiv.org/abs/2303.11078)

* [Effective Ambiguity Attack Against Passport-Based DNN Intellectual Property Protection Schemes Through Fully Connected Layer Substitution](https://arxiv.org/abs/2303.11595)



## 75.Semantic Scene Completion

* [Semantic Scene Completion With Cleaner Self](https://arxiv.org/abs/2303.09977)

* [VoxFormer: Sparse Voxel Transformer for Camera-Based 3D Semantic Scene Completion](https://arxiv.org/abs/2302.12251)
:star:[code](https://github.com/NVlabs/VoxFormer)



## 74.Machine Learning

* [Cooperation or Competition: Avoiding Player Domination for Multi-Target Robustness via Adaptive Budgets](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Cooperation_or_Competition_Avoiding_Player_Domination_for_Multi-Target_Robustness_via_CVPR_2023_paper.pdf)

* [Multi-Agent Automated Machine Learning](https://arxiv.org/abs/2210.09084)

* [Towards Better Decision Forests: Forest Alternating Optimization](https://openaccess.thecvf.com/content/CVPR2023/papers/Carreira-Perpinan_Towards_Better_Decision_Forests_Forest_Alternating_Optimization_CVPR_2023_paper.pdf)

* [ERM-KTP: Knowledge-Level Machine Unlearning via Knowledge Transfer](https://openaccess.thecvf.com/content/CVPR2023/papers/Lin_ERM-KTP_Knowledge-Level_Machine_Unlearning_via_Knowledge_Transfer_CVPR_2023_paper.pdf)
:star:[code](https://github.com/RUIYUN-ML/ERM-KTP)

* [A Whac-a-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_A_Whac-a-Mole_Dilemma_Shortcuts_Come_in_Multiples_Where_Mitigating_One_CVPR_2023_paper.pdf)
:star:[code](https://github.com/facebookresearch/Whac-A-Mole)

* Novel class discovery

  * [Bootstrap Your Own Prior: Towards Distribution-Agnostic Novel Class Discovery](https://openaccess.thecvf.com/content/CVPR2023/papers/Yang_Bootstrap_Your_Own_Prior_Towards_Distribution-Agnostic_Novel_Class_Discovery_CVPR_2023_paper.pdf)
:star:[code](https://github.com/muliyangm/BYOP)

  * [Modeling Inter-Class and Intra-Class Constraints in Novel Class Discovery](http://arxiv.org/abs/2210.03591)

* Transfer learning

  * [Visual Prompt Tuning for Generative Transfer Learning](https://arxiv.org/abs/2210.00990)

  * [A Data-Based Perspective on Transfer Learning](https://arxiv.org/abs/2207.05739)
:star:[code](https://github.com/MadryLab/data-transfer)

  * [Manipulating Transfer Learning for Property Inference](https://arxiv.org/abs/2303.11643)
:star:[code](https://github.com/yulongt23/Transfer-Inference)



## 73.Neural Radiance Fields

* [Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields for Controllable Scene Stylization](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Ref-NPR_Reference-Based_Non-Photorealistic_Radiance_Fields_for_Controllable_Scene_Stylization_CVPR_2023_paper.pdf)

* [Discriminating Known From Unknown Objects via Structure-Enhanced Recurrent Variational AutoEncoder](https://openaccess.thecvf.com/content/CVPR2023/papers/Wu_Discriminating_Known_From_Unknown_Objects_via_Structure-Enhanced_Recurrent_Variational_AutoEncoder_CVPR_2023_paper.pdf)

* [Occlusion-Free Scene Recovery via Neural Radiance Fields](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhu_Occlusion-Free_Scene_Recovery_via_Neural_Radiance_Fields_CVPR_2023_paper.pdf)

* [Grid-guided Neural Radiance Fields for Large Urban Scenes](https://arxiv.org/abs/2303.14001)
:house:[project](https://city-super.github.io/gridnerf/)

* [NeRFLight: Fast and Light Neural Radiance Fields using a Shared Feature Grid](https://openaccess.thecvf.com/content/CVPR2023/papers/Rivas-Manzaneque_NeRFLight_Fast_and_Light_Neural_Radiance_Fields_Using_a_Shared_CVPR_2023_paper.pdf)

* [GazeNeRF: 3D-Aware Gaze Redirection With Neural Radiance Fields](https://arxiv.org/abs/2212.04823)
:star:[code](https://github.com/AlessandroRuzzi/GazeNeRF)

* [SPARF: Neural Radiance Fields from Sparse and Noisy Poses](https://arxiv.org/abs/2211.11738)
:star:[code](https://github.com/google-research/sparf)

* [Masked Wavelet Representation for Compact Neural Radiance Fields](https://arxiv.org/abs/2212.09069)
:star:[code](https://github.com/daniel03c1/masked_wavelet_nerf)

* [MobileNeRF: Exploiting the Polygon Rasterization Pipeline for Efficient Neural Field Rendering on Mobile Architectures](https://arxiv.org/abs/2208.00277)
:star:[code](https://github.com/google-research/jax3d/tree/main/jax3d/projects/mobilenerf)

* [AligNeRF: High-Fidelity Neural Radiance Fields via Alignment-Aware Training](https://arxiv.org/abs/2211.09682)
:house:[project](https://yifanjiang.net/alignerf)

* [JacobiNeRF: NeRF Shaping With Mutual Information Gradients](https://arxiv.org/abs/2304.00341)

* [Robust Dynamic Radiance Fields](https://arxiv.org/abs/2301.02239)
:house:[project](https://robust-dynrf.github.io/)

* [Exact-NeRF: An Exploration of a Precise Volumetric Parameterization for Neural Radiance Fields](https://openaccess.thecvf.com/content/CVPR2023/papers/Isaac-Medina_Exact-NeRF_An_Exploration_of_a_Precise_Volumetric_Parameterization_for_Neural_CVPR_2023_paper.pdf)

* [PaletteNeRF: Palette-Based Appearance Editing of Neural Radiance Fields](https://arxiv.org/abs/2212.10699)

* [EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points](https://arxiv.org/abs/2212.04247)
:house:[project](https://chengwei-zheng.github.io/EditableNeRF/)

* [SinGRAF: Learning a 3D Generative Radiance Field for a Single Scene](https://arxiv.org/abs/2211.17260)
:house:[project](https://www.computationalimaging.org/publications/singraf/)

* [ShadowNeuS: Neural SDF Reconstruction by Shadow Ray Supervision](https://arxiv.org/abs/2211.14086)
:star:[code](https://github.com/gerwang/ShadowNeuS)

* [Flow supervision for Deformable NeRF](https://arxiv.org/abs/2303.16333)

* [Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields](https://arxiv.org/abs/2211.11505)
:house:[project](https://rover-xingyu.github.io/L2G-NeRF/)

* [EventNeRF: Neural Radiance Fields From a Single Colour Event Camera](https://arxiv.org/abs/2206.11896)
:house:[project](https://4dqv.mpi-inf.mpg.de/EventNeRF)

* [SeaThru-NeRF: Neural Radiance Fields in Scattering Media](https://openaccess.thecvf.com/content/CVPR2023/papers/Levy_SeaThru-NeRF_Neural_Radiance_Fields_in_Scattering_Media_CVPR_2023_paper.pdf)

* [SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory](https://arxiv.org/abs/2212.08476)

* [Complementary Intrinsics From Neural Radiance Fields and CNNs for Outdoor Scene Relighting](https://openaccess.thecvf.com/content/CVPR2023/papers/Yang_Complementary_Intrinsics_From_Neural_Radiance_Fields_and_CNNs_for_Outdoor_CVPR_2023_paper.pdf)

* [Point2Pix: Photo-Realistic Point Cloud Rendering via Neural Radiance Fields](https://arxiv.org/abs/2303.16482)

* [Removing Objects From Neural Radiance Fields](https://arxiv.org/abs/2212.11966)

* [GM-NeRF: Learning Generalizable Model-based Neural Radiance Fields from Multi-view Images](http://arxiv.org/abs/2303.13777v1)

* [HandNeRF: Neural Radiance Fields for Animatable Interacting Hands](http://arxiv.org/abs/2303.13825v1)

* [NeRF-DS: Neural Radiance Fields for Dynamic Specular Objects](http://arxiv.org/abs/2303.14435v1)
:star:[code](https://github.com/JokerYan/NeRF-DS)

* [JAWS: Just A Wild Shot for Cinematic Transfer in Neural Radiance Fields](http://arxiv.org/abs/2303.15427v1)
:house:[project](http://www.lix.polytechnique.fr/vista/projects/2023_cvpr_wang)

* [Multi-Space Neural Radiance Fields](http://arxiv.org/abs/2305.04268v1)
:star:[code](https://zx-yin.github.io/msnerf)

* [DBARF: Deep Bundle-Adjusting Generalizable Neural Radiance Fields](https://arxiv.org/abs/2303.14478)
:star:[code](https://aibluefisher.github.io/dbarf)

* [StyleRF: Zero-shot 3D Style Transfer of Neural Radiance Fields](https://arxiv.org/abs/2303.10598)
:house:[project](https://kunhao-liu.github.io/StyleRF/)

* [Temporal Interpolation Is All You Need for Dynamic Neural Radiance Fields](https://arxiv.org/abs/2302.09311)
:house:[project](https://sungheonpark.github.io/tempinterpnerf)

* [SPIn-NeRF: Multiview Segmentation and Perceptual Inpainting With Neural Radiance Fields](https://openaccess.thecvf.com/content/CVPR2023/papers/Mirzaei_SPIn-NeRF_Multiview_Segmentation_and_Perceptual_Inpainting_With_Neural_Radiance_Fields_CVPR_2023_paper.pdf)

* [F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_F2-NeRF_Fast_Neural_Radiance_Field_Training_With_Free_Camera_Trajectories_CVPR_2023_paper.pdf)
:house:[project](https://totoro97.github.io/projects/f2-nerf)

* [Clothed Human Performance Capture with a Double-layer Neural Radiance Fields](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Clothed_Human_Performance_Capture_With_a_Double-Layer_Neural_Radiance_Fields_CVPR_2023_paper.pdf)

* [DiffusioNeRF: Regularizing Neural Radiance Fields with Denoising Diffusion Models](https://arxiv.org/abs/2302.12231)

* Deblurring

  * [BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_BAD-NeRF_Bundle_Adjusted_Deblur_Neural_Radiance_Fields_CVPR_2023_paper.pdf)
:star:[code](https://github.com/WU-CVGL/BAD-NeRF)

  * [DP-NeRF: Deblurred Neural Radiance Field With Physical Scene Priors](https://openaccess.thecvf.com/content/CVPR2023/papers/Lee_DP-NeRF_Deblurred_Neural_Radiance_Field_With_Physical_Scene_Priors_CVPR_2023_paper.pdf)
:star:[code](https://github.com/dogyoonlee/DP-NeRF)
:house:[project](https://dogyoonlee.github.io/dpnerf/)



## 72.open-set recognition

* [Glocal Energy-based Learning for Few-Shot Open-Set Recognition](http://arxiv.org/abs/2304.11855v1)



## 71.visual reasoning

* [Visual Programming: Compositional visual reasoning without training](https://arxiv.org/abs/2211.11559)
:trophy:Best Paper

* [Abstract Visual Reasoning: An Algebraic Approach for Solving Raven's Progressive Matrices](https://arxiv.org/abs/2303.11730)
:star:[code](https://github.com/Xu-Jingyi/AlgebraicMR)

* [Super-CLEVR: A Virtual Benchmark To Diagnose Domain Robustness in Visual Reasoning](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Super-CLEVR_A_Virtual_Benchmark_To_Diagnose_Domain_Robustness_in_Visual_CVPR_2023_paper.pdf)
:star:[code](https://github.com/Lizw14/Super-CLEVR)

* [Unicode Analogies: An Anti-Objectivist Visual Reasoning Challenge](https://openaccess.thecvf.com/content/CVPR2023/papers/Spratley_Unicode_Analogies_An_Anti-Objectivist_Visual_Reasoning_Challenge_CVPR_2023_paper.pdf)



## 70.Image Forgery Detection

* [Hierarchical Fine-Grained Image Forgery Detection and Localization](http://arxiv.org/abs/2303.17111v1)
:star:[code](https://github.com/CHELSEA234/HiFi_IFDL)

* [Detecting and Grounding Multi-Modal Media Manipulation](http://arxiv.org/abs/2304.02556v1)
:star:[code](https://rshaojimmy.github.io/Projects/MultiModal-DeepFake)
:star:[code](https://github.com/rshaojimmy/MultiModal-DeepFake) (misinformation detection)

* [Evading DeepFake Detectors via Adversarial Statistical Consistency](http://arxiv.org/abs/2304.11670v1)

* [Edge-Aware Regional Message Passing Controller for Image Forgery Localization](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Edge-Aware_Regional_Message_Passing_Controller_for_Image_Forgery_Localization_CVPR_2023_paper.pdf)

* [TruFor: Leveraging all-round clues for trustworthy image forgery detection and localization](https://arxiv.org/abs/2212.10957)
:house:[project](https://grip-unina.github.io/TruFor/)

* [Towards Universal Fake Image Detectors That Generalize Across Generative Models](http://arxiv.org/abs/2302.10174)

* Deepfake Detection

  * [Implicit Identity Leakage: The Stumbling Block to Improving Deepfake Detection Generalization](https://arxiv.org/abs/2210.14457)
:star:[code](https://github.com/megvii-research/CADDM)

  * [Dynamic Graph Learning With Content-Guided Spatial-Frequency Relation Reasoning for Deepfake Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Dynamic_Graph_Learning_With_Content-Guided_Spatial-Frequency_Relation_Reasoning_for_Deepfake_CVPR_2023_paper.pdf)



## 69.Reinforcement learning

* [PIRLNav: Pretraining with Imitation and RL Finetuning for ObjectNav](https://arxiv.org/abs/2301.07302)

* [Local-Guided Global: Paired Similarity Representation for Visual Reinforcement Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Choi_Local-Guided_Global_Paired_Similarity_Representation_for_Visual_Reinforcement_Learning_CVPR_2023_paper.pdf)

* [Fusing Pre-Trained Language Models With Multimodal Prompts Through Reinforcement Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Yu_Fusing_Pre-Trained_Language_Models_With_Multimodal_Prompts_Through_Reinforcement_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/JiwanChung/esper)

* [Galactic: Scaling End-to-End Reinforcement Learning for Rearrangement at 100k Steps-per-Second](https://openaccess.thecvf.com/content/CVPR2023/papers/Berges_Galactic_Scaling_End-to-End_Reinforcement_Learning_for_Rearrangement_at_100k_Steps-per-Second_CVPR_2023_paper.pdf)
:star:[code](https://github.com/facebookresearch/galactic)

* [Frustratingly Easy Regularization on Representation Can Boost Deep Reinforcement Learning](https://arxiv.org/abs/2205.14557)
:house:[project](https://sites.google.com/view/peer-cvpr2023/)



## 68.Lifelong Learning

* [Task Difficulty Aware Parameter Allocation & Regularization for Lifelong Learning](http://arxiv.org/abs/2304.05288v1)
:star:[code](https://github.com/WenjinW/PAR)



## 67.Active Learning

* [Re-thinking Federated Active Learning based on Inter-class Diversity](http://arxiv.org/abs/2303.12317v1)

* [Box-Level Active Detection](http://arxiv.org/abs/2303.13089v1)
:star:[code](https://github.com/lyumengyao/blad)

* [Are Binary Annotations Sufficient? Video Moment Retrieval via Hierarchical Uncertainty-Based Active Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Ji_Are_Binary_Annotations_Sufficient_Video_Moment_Retrieval_via_Hierarchical_Uncertainty-Based_CVPR_2023_paper.pdf)
:star:[code](https://github.com/renjie-liang/HUAL)



## 66.Clustering

* [DivClust: Controlling Diversity in Deep Clustering](http://arxiv.org/abs/2304.01042v1)

* MVC

  * [On the Effects of Self-supervision and Contrastive Alignment in Deep Multi-view Clustering](https://arxiv.org/abs/2303.09877)
:star:[code](https://github.com/DanielTrosten/DeepMVC)

  * [GCFAgg: Global and Cross-View Feature Aggregation for Multi-View Clustering](https://arxiv.org/abs/2305.06799)

  * [Sample-Level Multi-View Graph Clustering](https://openaccess.thecvf.com/content/CVPR2023/papers/Tan_Sample-Level_Multi-View_Graph_Clustering_CVPR_2023_paper.pdf)

  * [On the Effects of Self-Supervision and Contrastive Alignment in Deep Multi-View Clustering](https://openaccess.thecvf.com/content/CVPR2023/papers/Trosten_On_the_Effects_of_Self-Supervision_and_Contrastive_Alignment_in_Deep_CVPR_2023_paper.pdf)
:star:[code](https://github.com/DanielTrosten/DeepMVC)

  * [Deep Incomplete Multi-View Clustering With Cross-View Partial Sample and Prototype Alignment](http://arxiv.org/abs/2303.15689)

  * [Highly Confident Local Structure Based Consensus Graph Learning for Incomplete Multi-View Clustering](https://openaccess.thecvf.com/content/CVPR2023/papers/Wen_Highly_Confident_Local_Structure_Based_Consensus_Graph_Learning_for_Incomplete_CVPR_2023_paper.pdf)

  



## 65.Scene flow estimation

* [Hidden Gems: 4D Radar Scene Flow Learning Using Cross-Modal Supervision](https://arxiv.org/pdf/2303.00462.pdf)
:star:[code](https://github.com/Toytiny/CMFlow)

* [Self-Supervised 3D Scene Flow Estimation Guided by Superpoints](http://arxiv.org/abs/2305.02528v1)

* [Unsupervised Cumulative Domain Adaptation for Foggy Scene Optical Flow](https://arxiv.org/abs/2303.07564)



## 64.Motion Retargeting

* [Skinned Motion Retargeting with Residual Perception of Motion Semantics & Geometry](https://arxiv.org/abs/2303.08658)
:star:[code](https://github.com/Kebii/R2ET)



## 63.edge detection

* edge detection

  * [The Treasure Beneath Multiple Annotations: An Uncertainty-aware Edge Detector](https://arxiv.org/abs/2303.11828)
:star:[code](https://github.com/ZhouCX117/UAED)



## 62.Object Counting

* [Zero-shot Object Counting](https://arxiv.org/abs/2303.02001)
:star:[code](https://github.com/cvlab-stonybrook/zero-shot-counting)

* [Indiscernible Object Counting in Underwater Scenes](http://arxiv.org/abs/2304.11677v1)
:star:[code](https://github.com/GuoleiSun/Indiscernible-Object-Counting)



## 61.Object Re-identification

* [MSINet: Twins Contrastive Search of Multi-Scale Interaction for Object ReID](https://arxiv.org/abs/2303.07065)
:star:[code](https://github.com/vimar-gu/MSINet)

* [Large-scale Training Data Search for Object Re-identification](http://arxiv.org/abs/2303.16186v1)
:star:[code](https://github.com/yorkeyao/SnP)

* [Adaptive Sparse Pairwise Loss for Object Re-Identification](http://arxiv.org/abs/2303.18247v1)
:star:[code](https://github.com/Astaxanthin/AdaSP)



## 60.Industrial Anomaly Detection

* Defect localization

  * [PyramidFlow: High-Resolution Defect Contrastive Localization using Pyramid Normalizing Flow](https://arxiv.org/abs/2303.02595)

* Industrial anomaly detection

  * [Multimodal Industrial Anomaly Detection via Hybrid Fusion](https://arxiv.org/pdf/2303.00601.pdf)
:star:[code](https://github.com/nomewang/M3DM)

  * [OmniAL: A Unified CNN Framework for Unsupervised Anomaly Localization](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhao_OmniAL_A_Unified_CNN_Framework_for_Unsupervised_Anomaly_Localization_CVPR_2023_paper.pdf)

* Anomaly segmentation

  * [Winning Solution for the CVPR2023 Visual Anomaly and Novelty Detection Challenge: Multimodal Prompting for Data-centric Anomaly Detection](https://arxiv.org/pdf/2306.09067.pdf)
:star:[code](https://github.com/caoyunkang/Segment-Any-Anomaly)
:thumbsup:[CVPR 2023 winning solution: a new breakthrough in zero-shot anomaly segmentation!](https://mp.weixin.qq.com/s/_mJKn4o_U_VjEqlz7DXUFQ)



## 59.Image/Video Compression

* [Backdoor Attacks Against Deep Image Compression via Adaptive Frequency Trigger](https://arxiv.org/pdf/2302.14677.pdf)

* [Context-Based Trit-Plane Coding for Progressive Image Compression](https://arxiv.org/abs/2303.05715)
:star:[code](https://github.com/seungminjeon-github/CTC)

* [Learned Image Compression with Mixed Transformer-CNN Architectures](http://arxiv.org/abs/2303.14978v1)
:star:[code](https://github.com/jmliu206/LIC_TCM)

* [LVQAC: Lattice Vector Quantization Coupled with Spatially Adaptive Companding for Efficient Learned Image Compression](http://arxiv.org/abs/2304.12319v1)

* [Optimization-Inspired Cross-Attention Transformer for Compressive Sensing](http://arxiv.org/abs/2304.13986v1)
:star:[code](https://github.com/songjiechong/OCTUF)

* [Multi-Realism Image Compression With a Conditional Generator](https://arxiv.org/abs/2212.13824)

* [AccelIR: Task-aware Image Compression for Accelerating Neural Restoration](https://openaccess.thecvf.com/content/CVPR2023/papers/Ye_AccelIR_Task-Aware_Image_Compression_for_Accelerating_Neural_Restoration_CVPR_2023_paper.pdf)

* Video compression

  * [Towards Scalable Neural Representation for Diverse Videos](http://arxiv.org/abs/2303.14124v1)

  * [HNeRV: A Hybrid Neural Representation for Videos](http://arxiv.org/abs/2304.02633v1)
:star:[code](https://haochen-rye.github.io/HNeRV)
:star:[code](https://github.com/haochen-rye/HNeRV) 

  * [Video Compression With Entropy-Constrained Neural Representations](https://openaccess.thecvf.com/content/CVPR2023/papers/Gomes_Video_Compression_With_Entropy-Constrained_Neural_Representations_CVPR_2023_paper.pdf)

  * [Complexity-Guided Slimmable Decoder for Efficient Deep Video Compression](https://openaccess.thecvf.com/content/CVPR2023/papers/Hu_Complexity-Guided_Slimmable_Decoder_for_Efficient_Deep_Video_Compression_CVPR_2023_paper.pdf)

  * [EfficientSCI: Densely Connected Network with Space-time Factorization for Large-scale Video Snapshot Compressive Imaging](https://arxiv.org/abs/2305.10006)
:star:[code](https://github.com/ucaswangls/EfficientSCI.git)

  * [MMVC: Learned Multi-Mode Video Compression with Block-based Prediction Mode Selection and Density-Adaptive Entropy Coding](https://arxiv.org/abs/2304.02273)

  * [Neural Video Compression With Diverse Contexts](https://arxiv.org/abs/2302.14402)
:star:[code](https://github.com/microsoft/DCVC)

  * [Motion Information Propagation for Neural Video Compression](https://openaccess.thecvf.com/content/CVPR2023/papers/Qi_Motion_Information_Propagation_for_Neural_Video_Compression_CVPR_2023_paper.pdf)

  * [Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding](https://openaccess.thecvf.com/content/CVPR2023/papers/Alexandre_Hierarchical_B-Frame_Video_Coding_Using_Two-Layer_CANF_Without_Motion_Coding_CVPR_2023_paper.pdf)

* Vector quantization

  * [NVTC: Nonlinear Vector Transform Coding](https://openaccess.thecvf.com/content/CVPR2023/papers/Feng_NVTC_Nonlinear_Vector_Transform_Coding_CVPR_2023_paper.pdf)
:star:[code](https://github.com/USTC-IMCL/NVTC)   



## 58.Neural rendering

* [TMO: Textured Mesh Acquisition of Objects With a Mobile Device by Using Differentiable Rendering](http://arxiv.org/abs/2303.15060)

* [Tensor4D: Efficient Neural 4D Decomposition for High-Fidelity Dynamic Reconstruction and Rendering](http://arxiv.org/abs/2211.11610)

* [Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur](https://arxiv.org/abs/2304.12652)
:house:[project](https://daipengwa.github.io/Hybrid-Rendering-ProjectPage)

* [NeUDF: Leaning Neural Unsigned Distance Fields With Volume Rendering](http://arxiv.org/abs/2304.10080)

* [DiffRF: Rendering-Guided 3D Radiance Field Diffusion](https://openaccess.thecvf.com/content/CVPR2023/papers/Muller_DiffRF_Rendering-Guided_3D_Radiance_Field_Diffusion_CVPR_2023_paper.pdf)
:house:[project](https://sirwyver.github.io/DiffRF/)

* [Unsupervised Continual Semantic Adaptation Through Neural Rendering](https://arxiv.org/abs/2211.13969)

* [Neural Fields Meet Explicit Geometric Representations for Inverse Rendering of Urban Scenes](https://arxiv.org/abs/2304.03266)
:house:[project](https://nv-tlabs.github.io/fegr/)

* [UV Volumes for Real-Time Rendering of Editable Free-View Human Performance](https://arxiv.org/abs/2203.14402)
:house:[project](https://fanegg.github.io/UV-Volumes)

* [Inverse Rendering of Translucent Objects Using Physical and Neural Renderers](https://arxiv.org/abs/2305.08336)

* [ORCa: Glossy Objects As Radiance-Field Cameras](https://arxiv.org/abs/2212.04531)
:house:[project](https://ktiwary2.github.io/objectsascam/)

* [MAIR: Multi-View Attention Inverse Rendering With 3D Spatially-Varying Lighting Estimation](https://arxiv.org/abs/2303.12368)
:house:[project](https://bring728.github.io/mair.project/)

* [FlexNeRF: Photorealistic Free-viewpoint Rendering of Moving Humans from Sparse Views](https://arxiv.org/abs/2303.14368)
:house:[project](https://flex-nerf.github.io/)

* [Learning To Render Novel Views From Wide-Baseline Stereo Pairs](https://arxiv.org/abs/2304.08463)
:house:[project](https://yilundu.github.io/wide_baseline/)

* [NeRFLiX: High-Quality Neural View Synthesis by Learning a Degradation-Driven Inter-viewpoint MiXer](https://arxiv.org/abs/2303.06919)
:house:[project](https://redrock303.github.io/nerflix/)

* [FreeNeRF: Improving Few-shot Neural Rendering with Free Frequency Regularization](https://arxiv.org/abs/2303.07418)
:house:[project](https://jiawei-yang.github.io/FreeNeRF/)

* [Local Implicit Ray Function for Generalizable Radiance Field Representation](http://arxiv.org/abs/2304.12746v1)
:star:[code](https://xhuangcv.github.io/lirf/)

* [FitMe: Deep Photorealistic 3D Morphable Model Avatars](http://arxiv.org/abs/2305.09641v1)
:star:[code](https://lattas.github.io/fitme)

* [Pointersect: Neural Rendering with Cloud-Ray Intersection](http://arxiv.org/abs/2304.12390v1)

* [Semantic Ray: Learning a Generalizable Semantic Field with Cross-Reprojection Attention](http://arxiv.org/abs/2303.13014v1)
:star:[code](https://liuff19.github.io/S-Ray/)

* [ABLE-NeRF: Attention-Based Rendering with Learnable Embeddings for Neural Radiance Field](http://arxiv.org/abs/2303.13817v1)

* [WildLight: In-the-wild Inverse Rendering with a Flashlight](http://arxiv.org/abs/2303.14190v1)
:star:[code](https://junxuan-li.github.io/wildlight-website/)

* [NeFII: Inverse Rendering for Reflectance Decomposition with Near-Field Indirect Illumination](http://arxiv.org/abs/2303.16617v1)

* [MonoHuman: Animatable Human Neural Field from Monocular Video](http://arxiv.org/abs/2304.02001v1)
:star:[code](https://yzmblog.github.io/projects/MonoHuman/)

* [Neural Residual Radiance Fields for Streamably Free-Viewpoint Videos](http://arxiv.org/abs/2304.04452v1)
:star:[code](https://aoliao12138.github.io/ReRF/)

* [PlenVDB: Memory Efficient VDB-Based Radiance Fields for Fast Training and Rendering](https://openaccess.thecvf.com/content/CVPR2023/papers/Yan_PlenVDB_Memory_Efficient_VDB-Based_Radiance_Fields_for_Fast_Training_and_CVPR_2023_paper.pdf)
Reaches 30 frames per second at 1280x720 output resolution on an iPhone 12.



## 57.Gaze Estimation

* [NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation](https://arxiv.org/abs/2212.14710)

* [Source-free Adaptive Gaze Estimation by Uncertainty Reduction](https://openaccess.thecvf.com/content/CVPR2023/papers/Cai_Source-Free_Adaptive_Gaze_Estimation_by_Uncertainty_Reduction_CVPR_2023_paper.pdf)
:star:[code](https://github.com/caixin1998/UnReGA)

* [ReDirTrans: Latent-to-Latent Translation for Gaze and Head Redirection](https://openaccess.thecvf.com/content/CVPR2023/papers/Jin_ReDirTrans_Latent-to-Latent_Translation_for_Gaze_and_Head_Redirection_CVPR_2023_paper.pdf)

  



## 56.Sound + Vision

* [Conditional Generation of Audio from Video via Foley Analogies](http://arxiv.org/abs/2304.08490v1)
:star:[code](https://xypb.github.io/CondFoleyGen/)

* [Vision Transformers Are Parameter-Efficient Audio-Visual Learners](http://arxiv.org/abs/2212.07983)

* Active Speaker Detection

  * [A Light Weight Model for Active Speaker Detection](https://arxiv.org/abs/2303.04439)
:star:[code](https://github.com/Junhua-Liao/Light-ASD)

* Audio-Visual Speech Recognition

  * [Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring](https://arxiv.org/abs/2303.08536)
:star:[code](https://github.com/joannahong/AV-RelScore)

  * [Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception](https://openaccess.thecvf.com/content/CVPR2023/papers/Gao_Collecting_Cross-Modal_Presence-Absence_Evidence_for_Weakly-Supervised_Audio-Visual_Event_Perception_CVPR_2023_paper.pdf)
:star:[code](https://github.com/MengyuanChen21/CVPR2023-CMPAE)

  * [AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR](http://arxiv.org/abs/2303.16501v1)

  * [SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision](http://arxiv.org/abs/2303.17200v1)

* Audio-Visual Localization

  * [Learning Audio-Visual Source Localization via False Negative Aware Contrastive Learning](https://arxiv.org/abs/2303.11302)
:star:[code](https://github.com/weixuansun/FNAC-AVL)

  * [Audio-Visual Grouping Network for Sound Localization from Mixtures](http://arxiv.org/abs/2303.17056v1)
:star:[code](https://github.com/stoneMo/AVGN)

* Audio Source Separation

  * [Language-Guided Audio-Visual Source Separation via Trimodal Consistency](http://arxiv.org/abs/2303.16342v1)

  * [iQuery: Instruments As Queries for Audio-Visual Sound Separation](https://arxiv.org/abs/2212.03814)

* Sound Synthesis

  * [Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos](http://arxiv.org/abs/2303.16897v1)
:star:[code](https://sukun1045.github.io/video-physics-sound-diffusion/)

  * [ReVISE: Self-Supervised Speech Resynthesis With Visual Input for Universal and Generalized Speech Regeneration](https://openaccess.thecvf.com/content/CVPR2023/papers/Hsu_ReVISE_Self-Supervised_Speech_Resynthesis_With_Visual_Input_for_Universal_and_CVPR_2023_paper.pdf)

* Movie Audio Description

  * [AutoAD: Movie Description in Context](http://arxiv.org/abs/2303.16899v1)
:house:[project](https://www.robots.ox.ac.uk/~vgg/research/autoad/)

* Scene Image Generation from Sound

  * [Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment](http://arxiv.org/abs/2303.17490v1)

* Audio-Visual Anomaly Detection

  * [Self-Supervised Video Forensics by Audio-Visual Anomaly Detection](https://arxiv.org/abs/2301.01767)
:star:[code](https://cfeng16.github.io/audio-visual-forensics)

* Movie Dubbing

  * [Learning To Dub Movies via Hierarchical Prosody Models](https://arxiv.org/abs/2212.04054)

* Dance Generation

  * [EDGE: Editable Dance Generation From Music](https://arxiv.org/abs/2211.10658)
:house:[project](https://edge-dance.github.io/)

  * [Music-Driven Group Choreography](http://arxiv.org/abs/2303.12337)

* Video Saliency Prediction

  * [CASP-Net: Rethinking Video Saliency Prediction From an Audio-Visual Consistency Perceptual Perspective](https://openaccess.thecvf.com/content/CVPR2023/papers/Xiong_CASP-Net_Rethinking_Video_Saliency_Prediction_From_an_Audio-Visual_Consistency_Perceptual_CVPR_2023_paper.pdf)

* Audio-Driven Portrait Animation

  * [DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation](http://arxiv.org/abs/2301.03786)

* Auditory Localization

  * [Egocentric Auditory Attention Localization in Conversations](http://arxiv.org/abs/2303.16024)



## 55.Novel View Synthesis

* [Neural Pixel Composition for 3D-4D View Synthesis From Multi-Views](https://openaccess.thecvf.com/content/CVPR2023/papers/Bansal_Neural_Pixel_Composition_for_3D-4D_View_Synthesis_From_Multi-Views_CVPR_2023_paper.pdf)

* [MixNeRF: Modeling a Ray with Mixture Density for Novel View Synthesis from Sparse Inputs](https://arxiv.org/abs/2302.08788)
:house:[project](https://shawn615.github.io/mixnerf/)

* [NeRF in the Palm of Your Hand: Corrective Augmentation for Robotics via Novel-View Synthesis](https://arxiv.org/abs/2301.08556)
:house:[project](https://bland.website/spartn)

* [NeRDi: Single-View NeRF Synthesis With Language-Guided Diffusion As General Image Priors](https://arxiv.org/abs/2212.03267)

* [Novel-View Acoustic Synthesis](https://arxiv.org/abs/2301.08730)
:house:[project](https://vision.cs.utexas.edu/projects/nvas)

* [Cross-Guided Optimization of Radiance Fields With Multi-View Image Super-Resolution for High-Resolution Novel View Synthesis](https://openaccess.thecvf.com/content/CVPR2023/papers/Yoon_Cross-Guided_Optimization_of_Radiance_Fields_With_Multi-View_Image_Super-Resolution_for_CVPR_2023_paper.pdf)

* [Frequency-Modulated Point Cloud Rendering with Easy Editing](https://arxiv.org/abs/2303.07596)
:star:[code](https://github.com/yizhangphd/FreqPCR)

* [Learning Neural Duplex Radiance Fields for Real-Time View Synthesis](http://arxiv.org/abs/2304.10537v1)
:house:[project](http://raywzy.com/NDRF)

* [ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects](http://arxiv.org/abs/2304.10448v1)
:star:[code](https://eyecan-ai.github.io/rene)

* [Balanced Spherical Grid for Egocentric View Synthesis](http://arxiv.org/abs/2303.12408v1)

* [Progressively Optimized Local Radiance Fields for Robust View Synthesis](http://arxiv.org/abs/2303.13791v1)
:star:[code](https://localrf.github.io/)

* [F$^{2}$-NeRF: Fast Neural Radiance Field Training with Free Camera Trajectories](http://arxiv.org/abs/2303.15951v1)
:star:[code](https://totoro97.github.io/projects/f2-nerf)

* [Enhanced Stable View Synthesis](http://arxiv.org/abs/2303.17094v1)

* [Consistent View Synthesis with Pose-Guided Diffusion Models](http://arxiv.org/abs/2303.17598v1)
:star:[code](https://poseguided-diffusion.github.io/)

* [Learning to Render Novel Views from Wide-Baseline Stereo Pairs](http://arxiv.org/abs/2304.08463v1)
:star:[code](https://yilundu.github.io/wide_baseline/)

* [Painting 3D Nature in 2D: View Synthesis of Natural Scenes From a Single Semantic Mask](https://arxiv.org/abs/2302.07224)
:house:[project](https://zju3dv.github.io/paintingnature/)

* [NoPe-NeRF: Optimising Neural Radiance Field With No Pose Prior](https://openaccess.thecvf.com/content/CVPR2023/papers/Bian_NoPe-NeRF_Optimising_Neural_Radiance_Field_With_No_Pose_Prior_CVPR_2023_paper.pdf)
:house:[project](https://nope-nerf.active.vision)

* [Multiscale Tensor Decomposition and Rendering Equation Encoding for View Synthesis](https://arxiv.org/abs/2303.03808)
:star:[code](https://github.com/imkanghan/nrff)

* [Efficient View Synthesis and 3D-Based Multi-Frame Denoising With Multiplane Feature Representations](https://arxiv.org/abs/2303.18139)

* [NeRFVS: Neural Radiance Fields for Free View Synthesis via Geometry Scaffolds](https://arxiv.org/abs/2304.06287)

* [DINER: Depth-aware Image-based NEural Radiance fields](https://arxiv.org/abs/2211.16630)
:house:[project](https://malteprinzler.github.io/projects/diner/diner.html)

* [RefSR-NeRF: Towards High Fidelity and Super Resolution View Synthesis](https://openaccess.thecvf.com/content/CVPR2023/papers/Huang_RefSR-NeRF_Towards_High_Fidelity_and_Super_Resolution_View_Synthesis_CVPR_2023_paper.pdf)
:star:[code](https://gitee.com/mindspore/models/tree/master/research/cv/RefSR-NeRF)

* [VDN-NeRF: Resolving Shape-Radiance Ambiguity via View-Dependence Normalization](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhu_VDN-NeRF_Resolving_Shape-Radiance_Ambiguity_via_View-Dependence_Normalization_CVPR_2023_paper.pdf)
:star:[code](https://github.com/BoifZ/VDN-NeRF)

* [DynIBaR: Neural Dynamic Image-Based Rendering](https://arxiv.org/abs/2211.11082)
:house:[project](http://dynibar.github.io/)
:trophy:Honorable Mention

* [Common Pets in 3D: Dynamic New-View Synthesis of Real-Life Deformable Categories](https://arxiv.org/abs/2211.03889)



## 54.Benchmark/Dataset

* [Joint HDR Denoising and Fusion: A Real-World Mobile HDR Image Dataset](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Joint_HDR_Denoising_and_Fusion_A_Real-World_Mobile_HDR_Image_CVPR_2023_paper.pdf)

* [A New Dataset Based on Images Taken by Blind People for Testing the Robustness of Image Classification Models Trained for ImageNet Categories](https://openaccess.thecvf.com/content/CVPR2023/papers/Bafghi_A_New_Dataset_Based_on_Images_Taken_by_Blind_People_CVPR_2023_paper.pdf)

* [Benchmarking Self-Supervised Learning on Diverse Pathology Datasets](http://arxiv.org/abs/2212.04690)

* [Multispectral Video Semantic Segmentation: A Benchmark Dataset and Baseline](https://openaccess.thecvf.com/content/CVPR2023/papers/Ji_Multispectral_Video_Semantic_Segmentation_A_Benchmark_Dataset_and_Baseline_CVPR_2023_paper.pdf)

* [ScaleDet: A Scalable Multi-Dataset Object Detector](https://openaccess.thecvf.com/content/CVPR2023/papers/Chen_ScaleDet_A_Scalable_Multi-Dataset_Object_Detector_CVPR_2023_paper.pdf)

* [JRDB-Pose: A Large-Scale Dataset for Multi-Person Pose Estimation and Tracking](https://openaccess.thecvf.com/content/CVPR2023/papers/Vendrow_JRDB-Pose_A_Large-Scale_Dataset_for_Multi-Person_Pose_Estimation_and_Tracking_CVPR_2023_paper.pdf)
:sunflower:[dataset](https://jrdb.erc.monash.edu/)

* [Architecture, Dataset and Model-Scale Agnostic Data-Free Meta-Learning](https://arxiv.org/abs/2303.11183)

* [DF-Platter: Multi-Face Heterogeneous Deepfake Dataset](https://openaccess.thecvf.com/content/CVPR2023/papers/Narayan_DF-Platter_Multi-Face_Heterogeneous_Deepfake_Dataset_CVPR_2023_paper.pdf)
:sunflower:[dataset](http://iab-rubric.org/df-platter-database)

* [HandsOff: Labeled Dataset Generation With No Additional Human Annotations](https://arxiv.org/abs/2212.12645)
:sunflower:[dataset](http://austinxu87.github.io/handsoff)

* [M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis](https://openaccess.thecvf.com/content/CVPR2023/papers/Cheng_M6Doc_A_Large-Scale_Multi-Format_Multi-Type_Multi-Layout_Multi-Language_Multi-Annotation_Category_Dataset_CVPR_2023_paper.pdf)
:star:[code](https://github.com/HCIILAB/M6Doc)

* [ShapeTalk: A Language Dataset and Framework for 3D Shape Edits and Deformations](https://openaccess.thecvf.com/content/CVPR2023/papers/Achlioptas_ShapeTalk_A_Language_Dataset_and_Framework_for_3D_Shape_Edits_CVPR_2023_paper.pdf)
:sunflower:[dataset](https://changeit3d.github.io/)

* [NewsNet: A Novel Dataset for Hierarchical Temporal Segmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Wu_NewsNet_A_Novel_Dataset_for_Hierarchical_Temporal_Segmentation_CVPR_2023_paper.pdf)
:star:[code](https://github.com/NewsNet-Benchmark/NewsNet)

* [MISC210K: A Large-Scale Dataset for Multi-Instance Semantic Correspondence](https://openaccess.thecvf.com/content/CVPR2023/papers/Sun_MISC210K_A_Large-Scale_Dataset_for_Multi-Instance_Semantic_Correspondence_CVPR_2023_paper.pdf)
:star:[code](https://github.com/YXSUNMADMAX/MISC210K)

* [StarCraftImage: A Dataset for Prototyping Spatial Reasoning Methods for Multi-Agent Environments](https://openaccess.thecvf.com/content/CVPR2023/papers/Kulinski_StarCraftImage_A_Dataset_for_Prototyping_Spatial_Reasoning_Methods_for_Multi-Agent_CVPR_2023_paper.pdf)
:house:[project](https://starcraftdata.davidinouye.com/)

* [Habitat-Matterport 3D Semantics Dataset](https://arxiv.org/abs/2210.05633)

* [CNVid-3.5M: Build, Filter, and Pre-Train the Large-Scale Public Chinese Video-Text Dataset](https://openaccess.thecvf.com/content/CVPR2023/papers/Gan_CNVid-3.5M_Build_Filter_and_Pre-Train_the_Large-Scale_Public_Chinese_Video-Text_CVPR_2023_paper.pdf)
:star:[code](https://github.com/CNVid/CNVid-3.5M)
A large-scale public Chinese video-text dataset

* [FLAG3D: A 3D Fitness Activity Dataset With Language Instruction](https://arxiv.org/abs/2212.04638)
:house:[project](https://andytang15.github.io/FLAG3D)

* [Multi-Label Compound Expression Recognition: C-EXPR Database & Network](https://openaccess.thecvf.com/content/CVPR2023/papers/Kollias_Multi-Label_Compound_Expression_Recognition_C-EXPR_Database__Network_CVPR_2023_paper.pdf)

* [ARCTIC: A Dataset for Dexterous Bimanual Hand-Object Manipulation](https://arxiv.org/abs/2204.13662)
:house:[project](https://arctic.is.tue.mpg.de/)
A dataset for hand-object manipulation

* [xFBD: Focused Building Damage Dataset and Analysis](https://arxiv.org/abs/2212.13876)
A building damage dataset

* [Spring: A High-Resolution High-Detail Dataset and Benchmark for Scene Flow, Optical Flow and Stereo](https://arxiv.org/abs/2303.01943)
:sunflower:[dataset](https://spring-benchmark.org/)

* [Human-Art: A Versatile Human-Centric Dataset Bridging Natural and Artificial Scenes](https://arxiv.org/abs/2303.02760)

* [HairStep: Transfer Synthetic to Real Using Strand and Depth Maps for Single-View 3D Hair Modeling](https://arxiv.org/abs/2303.02700)
:sunflower:[dataset](https://paulyzheng.github.io/research/hairstep/)

* [CUDA: Convolution-based Unlearnable Datasets](https://arxiv.org/abs/2303.04278)
:sunflower:[dataset](https://github.com/vinusankars/Convolution-based-Unlearnability)

* [MVImgNet: A Large-scale Dataset of Multi-view Images](https://arxiv.org/abs/2303.06042)
:sunflower:[dataset](https://gaplab.cuhk.edu.cn/projects/MVImgNet/)

* [V2V4Real: A Real-world Large-scale Dataset for Vehicle-to-Vehicle Cooperative Perception](https://arxiv.org/abs/2303.07601)
:sunflower:[dataset](https://github.com/ucla-mobility/V2V4Real)
Vehicle-to-Vehicle (V2V) perception

* [Polynomial Implicit Neural Representations For Large Diverse Datasets](https://arxiv.org/abs/2303.11424)
:sunflower:[dataset](https://github.com/Rajhans0/Poly_INR)

* [MaskCon: Masked Contrastive Learning for Coarse-Labelled Dataset](http://arxiv.org/abs/2303.12756v1)
:sunflower:[dataset](https://github.com/MrChenFeng/MaskCon_CVPR2023)

* [RaBit: Parametric Modeling of 3D Biped Cartoon Characters with a Topological-consistent Dataset](http://arxiv.org/abs/2303.12564v1)
:sunflower:[dataset](https://gaplab.cuhk.edu.cn/projects/RaBit/)

* [Fantastic Breaks: A Dataset of Paired 3D Scans of Real-World Broken Objects and Their Complete Counterparts](http://arxiv.org/abs/2303.14152v1)
:star:[code](https://terascale-all-sensing-research-studio.github.io/FantasticBreaks)

* [ARKitTrack: A New Diverse Dataset for Tracking Using Mobile RGB-D Data](http://arxiv.org/abs/2303.13885v1)
:star:[code](https://arkittrack.github.io)

* [CelebV-Text: A Large-Scale Facial Text-Video Dataset](http://arxiv.org/abs/2303.14717v1)
:star:[code](https://celebv-text.github.io/)
Facial text-to-video generation

* [Towards Artistic Image Aesthetics Assessment: a Large-scale Dataset and a New Method](http://arxiv.org/abs/2303.15166v1)
:star:[code](https://github.com/Dreemurr-T/BAID.git)
Artistic image aesthetics assessment

* [CIMI4D: A Large Multimodal Climbing Motion Dataset under Human-scene Interactions](http://arxiv.org/abs/2303.17948v1)
:house:[project](http://www.lidarhumanmotion.net/cimi4d/)
A climbing motion dataset

* [Uncurated Image-Text Datasets: Shedding Light on Demographic Bias](http://arxiv.org/abs/2304.02828v1)

* [AutoShot: A Short Video Dataset and State-of-the-Art Shot Boundary Detection](http://arxiv.org/abs/2304.06116v1)
:star:[code](https://github.com/wentaozhu/AutoShot.git)
:house:[project](https://paperswithcode.com/paper/autoshot-a-short-video-dataset-and-state-of)
A public short-video dataset for shot boundary detection

* [V2X-Seq: A Large-Scale Sequential Dataset for Vehicle-Infrastructure Cooperative Perception and Forecasting](http://arxiv.org/abs/2305.05938v1)
:star:[code](https://github.com/AIR-THU/DAIR-V2X-Seq)

* [WEDGE: A multi-weather autonomous driving dataset built from generative vision-language models](http://arxiv.org/abs/2305.07528v1)
:star:[code](https://infernolia.github.io/WEDGE)
A synthetic dataset for object detection and weather classification under extreme weather conditions

* [CLOTH4D: A Dataset for Clothed Human Reconstruction](https://openaccess.thecvf.com/content/CVPR2023/papers/Zou_CLOTH4D_A_Dataset_for_Clothed_Human_Reconstruction_CVPR_2023_paper.pdf)
:sunflower:[dataset](http://www.github.com/AemikaChow/AiDLab-fAshIon-Data)
A dataset for clothed human reconstruction

* [OmniCity: Omnipotent City Understanding With Multi-Level and Multi-View Images](https://arxiv.org/abs/2208.00928)
:sunflower:[dataset](https://city-super.github.io/omnicity)
A new dataset for omnipotent city understanding from multi-level and multi-view images.

* [RealImpact: A Dataset of Impact Sound Fields for Real Objects](https://openaccess.thecvf.com/content/CVPR2023/papers/Clarke_RealImpact_A_Dataset_of_Impact_Sound_Fields_for_Real_Objects_CVPR_2023_paper.pdf)
:star:[code](https://samuelpclarke.com/realimpact/)

* [BEDLAM: A Synthetic Dataset of Bodies Exhibiting Detailed Lifelike Animated Motion](https://openaccess.thecvf.com/content/CVPR2023/papers/Black_BEDLAM_A_Synthetic_Dataset_of_Bodies_Exhibiting_Detailed_Lifelike_Animated_CVPR_2023_paper.pdf)
:house:[project](https://bedlam.is.tue.mpg.de/)

* [GFIE: A Dataset and Baseline for Gaze-Following From 2D to 3D in Indoor Environments](https://openaccess.thecvf.com/content/CVPR2023/papers/Hu_GFIE_A_Dataset_and_Baseline_for_Gaze-Following_From_2D_to_CVPR_2023_paper.pdf)
:house:[project](https://sites.google.com/view/gfie)

* Benchmark

  * [A Soma Segmentation Benchmark in Full Adult Fly Brain](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_A_Soma_Segmentation_Benchmark_in_Full_Adult_Fly_Brain_CVPR_2023_paper.pdf)
:star:[code](https://github.com/liuxy1103/EMADS)

  * [A New Comprehensive Benchmark for Semi-Supervised Video Anomaly Detection and Anticipation](https://openaccess.thecvf.com/content/CVPR2023/papers/Cao_A_New_Comprehensive_Benchmark_for_Semi-Supervised_Video_Anomaly_Detection_and_CVPR_2023_paper.pdf)

  * [A Large-Scale Homography Benchmark](http://arxiv.org/abs/2302.09997)

  * [Toward RAW Object Detection: A New Benchmark and a New Model](https://openaccess.thecvf.com/content/CVPR2023/papers/Xu_Toward_RAW_Object_Detection_A_New_Benchmark_and_a_New_CVPR_2023_paper.pdf)

  * [MammalNet: A Large-Scale Video Benchmark for Mammal Recognition and Behavior Understanding](https://openaccess.thecvf.com/content/CVPR2023/papers/Chen_MammalNet_A_Large-Scale_Video_Benchmark_for_Mammal_Recognition_and_Behavior_CVPR_2023_paper.pdf)

  * [Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild](https://arxiv.org/abs/2207.10660)
:house:[project](https://omni3d.garrickbrazil.com/)

  * [Advancing Visual Grounding With Scene Knowledge: Benchmark and Method](https://openaccess.thecvf.com/content/CVPR2023/papers/Song_Advancing_Visual_Grounding_With_Scene_Knowledge_Benchmark_and_Method_CVPR_2023_paper.pdf)
:star:[code](https://github.com/zhjohnchan/SK-VG)

  * [The ObjectFolder Benchmark: Multisensory Learning With Neural and Real Objects](https://openaccess.thecvf.com/content/CVPR2023/papers/Gao_The_ObjectFolder_Benchmark_Multisensory_Learning_With_Neural_and_Real_Objects_CVPR_2023_paper.pdf)
:star:[code](https://objectfolder.stanford.edu/)

  * [Meta Omnium: A Benchmark for General-Purpose Learning-to-Learn](http://arxiv.org/abs/2305.07625v1)
:star:[code](https://edi-meta-learning.github.io/meta-omnium)

  * [A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation](https://arxiv.org/abs/2303.09165)
:star:[code](https://github.com/huitangtang/On_the_Utility_of_Synthetic_Data)

  * [GeoNet: Benchmarking Unsupervised Adaptation across Geographies](http://arxiv.org/abs/2303.15443v1)
:star:[code](https://tarun005.github.io/GeoNet)

  * [PosterLayout: A New Benchmark and Approach for Content-aware Visual-Textual Presentation Layout](http://arxiv.org/abs/2303.15937v1)
:star:[code](https://github.com/PKU-ICST-MIPL/PosterLayout-CVPR2023)

  * [Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline](http://arxiv.org/abs/2303.12930v1)

  * [ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos](https://arxiv.org/abs/2305.02519)
:house:[project](https://milvlg.github.io/anetqa/)

  * Image Similarity

    * [GeneCIS: A Benchmark for General Conditional Image Similarity](https://openaccess.thecvf.com/content/CVPR2023/papers/Vaze_GeneCIS_A_Benchmark_for_General_Conditional_Image_Similarity_CVPR_2023_paper.pdf)
:house:[project](https://sgvaze.github.io/genecis)

  * [Ultra-High Resolution Segmentation with Ultra-Rich Context: A Novel Benchmark](http://arxiv.org/abs/2305.10899v1)
:star:[code](https://github.com/jankyee/URUR)

  * [RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension](https://openaccess.thecvf.com/content/CVPR2023/papers/Sun_RefTeacher_A_Strong_Baseline_for_Semi-Supervised_Referring_Expression_Comprehension_CVPR_2023_paper.pdf)
:house:[project](https://refteacher.github.io/)



## 53.Sign Language

* [Ham2Pose: Animating Sign Language Notation Into Pose Sequences](https://openaccess.thecvf.com/content/CVPR2023/papers/Arkushin_Ham2Pose_Animating_Sign_Language_Notation_Into_Pose_Sequences_CVPR_2023_paper.pdf)
:house:[project](https://rotem-shalev.github.io/ham-to-pose)

* Sign Language Translation

  * [Gloss Attention for Gloss-Free Sign Language Translation](https://openaccess.thecvf.com/content/CVPR2023/papers/Yin_Gloss_Attention_for_Gloss-Free_Sign_Language_Translation_CVPR_2023_paper.pdf)
:star:[code](https://github.com/YinAoXiong/GASLT)

* Sign Language Recognition

  * [Continuous Sign Language Recognition with Correlation Network](https://arxiv.org/abs/2303.03202)
:star:[code](https://github.com/hulianyuyy/CorrNet)

  * [Reconstructing Signing Avatars From Video Using Linguistic Priors](https://arxiv.org/abs/2304.10482)
:house:[project](http://sgnify.is.tue.mpg.de/)

  * [Distilling Cross-Temporal Contexts for Continuous Sign Language Recognition](https://openaccess.thecvf.com/content/CVPR2023/papers/Guo_Distilling_Cross-Temporal_Contexts_for_Continuous_Sign_Language_Recognition_CVPR_2023_paper.pdf)

  * [CVT-SLR: Contrastive Visual-Textual Transformation for Sign Language Recognition With Variational Alignment](https://openaccess.thecvf.com/content/CVPR2023/papers/Zheng_CVT-SLR_Contrastive_Visual-Textual_Transformation_for_Sign_Language_Recognition_With_Variational_CVPR_2023_paper.pdf)
:star:[code](https://github.com/binbinjiang/CVT-SLR)

  * [Natural Language-Assisted Sign Language Recognition](https://arxiv.org/abs/2303.12080)
:star:[code](https://github.com/FangyunWei/SLRT)

* Sign Language Retrieval

  * [CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning](http://arxiv.org/abs/2303.12793v1)
:star:[code](https://github.com/FangyunWei/SLRT)



## 52.Human Motion

* [Semi-Weakly Supervised Object Kinematic Motion Prediction](https://arxiv.org/abs/2303.17774)

* [The Wisdom of Crowds: Temporal Progressive Attention for Early Action Prediction](https://arxiv.org/abs/2204.13340)

* [MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion](https://openaccess.thecvf.com/content/CVPR2023/papers/Jiang_MotionDiffuser_Controllable_Multi-Agent_Motion_Prediction_Using_Diffusion_CVPR_2023_paper.pdf)

* Human Motion Prediction

  * [EqMotion: Equivariant Multi-agent Motion Prediction with Invariant Interaction Reasoning](https://arxiv.org/abs/2303.10876)
:star:[code](https://github.com/MediaBrain-SJTU/EqMotion)

  * [DeFeeNet: Consecutive 3D Human Motion Prediction with Deviation Feedback](http://arxiv.org/abs/2304.04496v1)

  * [Decompose More and Aggregate Better: Two Closer Looks at Frequency Representation Learning for Human Motion Prediction](https://openaccess.thecvf.com/content/CVPR2023/papers/Gao_Decompose_More_and_Aggregate_Better_Two_Closer_Looks_at_Frequency_CVPR_2023_paper.pdf)

* Human Motion Synthesis

  * [Generating Human Motion From Textual Descriptions With Discrete Representations](https://arxiv.org/abs/2301.06052)
:house:[project](https://mael-zys.github.io/T2M-GPT/)

  * [UDE: A Unified Driving Engine for Human Motion Generation](https://arxiv.org/abs/2211.16016)
:star:[code](https://github.com/zixiangzhou916/UDE/)

  * [MoFusion: A Framework for Denoising-Diffusion-Based Motion Synthesis](https://arxiv.org/abs/2212.04495)
:house:[project](https://vcai.mpi-inf.mpg.de/projects/MoFusion)

  * [MoDi: Unconditional Motion Synthesis From Diverse Data](http://arxiv.org/abs/2206.08010)

* 3D HM

  * [Generating Holistic 3D Human Motion from Speech](https://arxiv.org/abs/2212.04420)
:house:[project](https://talkshow.is.tue.mpg.de/)



## 51.Computed Imaging (e.g., optical, geometric, and light-field imaging)

* [Physics-Guided ISO-Dependent Sensor Noise Modeling for Extreme Low-Light Photography](https://openaccess.thecvf.com/content/CVPR2023/papers/Cao_Physics-Guided_ISO-Dependent_Sensor_Noise_Modeling_for_Extreme_Low-Light_Photography_CVPR_2023_paper.pdf)
:star:[code](https://github.com/happycaoyue/LLD)

* [TRACE: 5D Temporal Regression of Avatars With Dynamic Cameras in 3D Environments](https://openaccess.thecvf.com/content/CVPR2023/papers/Sun_TRACE_5D_Temporal_Regression_of_Avatars_With_Dynamic_Cameras_in_CVPR_2023_paper.pdf)
:star:[code](https://github.com/Arthur151/DynaCam)

* [High-Fidelity Event-Radiance Recovery via Transient Event Frequency](https://openaccess.thecvf.com/content/CVPR2023/papers/Han_High-Fidelity_Event-Radiance_Recovery_via_Transient_Event_Frequency_CVPR_2023_paper.pdf)
:star:[code](https://github.com/hjynwa/TEF)

* [Real-Time Neural Light Field on Mobile Devices](https://arxiv.org/abs/2212.08057)
:house:[project](https://snap-research.github.io/MobileR2L/)

* [Accidental Light Probes](https://arxiv.org/abs/2301.05211)
:house:[project](https://kovenyu.com/ALP/)

* [DyLiN: Making Light Field Networks Dynamic](http://arxiv.org/abs/2303.14243v1)
:star:[code](https://dylin2023.github.io)

* [Learning Rotation-Equivariant Features for Visual Correspondence](http://arxiv.org/abs/2303.15472v1)
:house:[project](http://cvlab.postech.ac.kr/research/RELF)

* [Role of Transients in Two-Bounce Non-Line-of-Sight Imaging](https://arxiv.org/abs/2304.01308)

* [Revisiting Rolling Shutter Bundle Adjustment: Toward Accurate and Fast Solution](https://arxiv.org/abs/2209.08503)

* Camera Pose Estimation

  * [SliceMatch: Geometry-Guided Aggregation for Cross-View Pose Estimation](https://arxiv.org/abs/2211.14651)

  * [SparsePose: Sparse-View Camera Pose Regression and Refinement](https://arxiv.org/abs/2211.16991)

  * [Pose Synchronization Under Multiple Pair-Wise Relative Poses](https://openaccess.thecvf.com/content/CVPR2023/papers/Sun_Pose_Synchronization_Under_Multiple_Pair-Wise_Relative_Poses_CVPR_2023_paper.pdf)

  * [Privacy-Preserving Representations Are Not Enough: Recovering Scene Content From Camera Poses](https://openaccess.thecvf.com/content/CVPR2023/papers/Chelani_Privacy-Preserving_Representations_Are_Not_Enough_Recovering_Scene_Content_From_Camera_CVPR_2023_paper.pdf)

* Shutter Correction

  * [EvShutter: Transforming Events for Unconstrained Rolling Shutter Correction](https://openaccess.thecvf.com/content/CVPR2023/papers/Erbach_EvShutter_Transforming_Events_for_Unconstrained_Rolling_Shutter_Correction_CVPR_2023_paper.pdf)
:star:[code](https://github.com/juliuserbach/EvShutter)

* Camera Calibration

  * [Perspective Fields for Single Image Camera Calibration](https://arxiv.org/abs/2212.03239)
:house:[project](https://jinlinyi.github.io/PerspectiveFields/)

* Geometric Estimation

  * [Adaptive Annealing for Robust Geometric Estimation](https://openaccess.thecvf.com/content/CVPR2023/papers/Sidhartha_Adaptive_Annealing_for_Robust_Geometric_Estimation_CVPR_2023_paper.pdf)

* Camera Localization

  * [NeuMap: Neural Coordinate Mapping by Auto-Transdecoder for Camera Localization](https://arxiv.org/abs/2211.11177)
:star:[code](https://github.com/Tangshitao/NeuMap)



## 50.Anomaly Detection

* [Revisiting Reverse Distillation for Anomaly Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Tien_Revisiting_Reverse_Distillation_for_Anomaly_Detection_CVPR_2023_paper.pdf)

* [SQUID: Deep Feature In-Painting for Unsupervised Anomaly Detection](http://arxiv.org/abs/2111.13495)

* [Prototypical Residual Networks for Anomaly Detection and Localization](https://arxiv.org/abs/2212.02031)

* [OpenMix: Exploring Outlier Samples for Misclassification Detection](https://arxiv.org/abs/2303.17093)
:star:[code](https://github.com/Impression2805/OpenMix)

* [Explicit Boundary Guided Semi-Push-Pull Contrastive Learning for Supervised Anomaly Detection](https://arxiv.org/abs/2207.01463)
:star:[code](https://github.com/xcyao00/BGAD)

* [Diversity-Measurable Anomaly Detection](https://arxiv.org/abs/2303.05047)

* [SimpleNet: A Simple Network for Image Anomaly Detection and Localization](http://arxiv.org/abs/2303.15140v1)
:star:[code](https://github.com/DonaldRR/SimpleNet)

* [DeSTSeg: Segmentation Guided Denoising Student-Teacher for Anomaly Detection](https://arxiv.org/abs/2211.11317)

* [WinCLIP: Zero-/Few-Shot Anomaly Classification and Segmentation](http://arxiv.org/abs/2303.14814v1)

* OOD

  * [Uncertainty-Aware Optimal Transport for Semantically Coherent Out-of-Distribution Detection](https://arxiv.org/abs/2303.10449)
:star:[code](https://github.com/LuFan31/ET-OOD)

  * [Mind the Label Shift of Augmentation-Based Graph OOD Generalization](http://arxiv.org/abs/2303.14859)

  * [Block Selection Method for Using Feature Norm in Out-of-Distribution Detection](https://arxiv.org/abs/2212.02295)
:star:[code](https://github.com/gist-ailab/block-selection-for-OOD-detection)

  * [Distribution Shift Inversion for Out-of-Distribution Prediction](https://openaccess.thecvf.com/content/CVPR2023/papers/Yu_Distribution_Shift_Inversion_for_Out-of-Distribution_Prediction_CVPR_2023_paper.pdf)
:star:[code](https://github.com/yu-rp/Distribution-Shift-Iverson)

  * [Are Data-Driven Explanations Robust Against Out-of-Distribution Data?](https://arxiv.org/abs/2303.16390)

  * [LINe: Out-of-Distribution Detection by Leveraging Important Neurons](http://arxiv.org/abs/2303.13995v1)

  * [Rethinking Out-of-Distribution (OOD) Detection: Masked Image Modeling Is All You Need](https://arxiv.org/abs/2302.02615)
:star:[code](https://github.com/JulietLJY/MOOD)

  * [Balanced Energy Regularization Loss for Out-of-Distribution Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Choi_Balanced_Energy_Regularization_Loss_for_Out-of-Distribution_Detection_CVPR_2023_paper.pdf)

  * [Decoupling MaxLogit for Out-of-Distribution Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Decoupling_MaxLogit_for_Out-of-Distribution_Detection_CVPR_2023_paper.pdf)

  * [Detection of Out-of-Distribution Samples Using Binary Neuron Activation Patterns](https://arxiv.org/abs/2212.14268)

  * [GEN: Pushing the Limits of Softmax-Based Out-of-Distribution Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_GEN_Pushing_the_Limits_of_Softmax-Based_Out-of-Distribution_Detection_CVPR_2023_paper.pdf)
:star:[code](https://github.com/XixiLiu95/GEN)



## 49.Image Geo-localization

* [Where We Are and What We're Looking At: Query Based Worldwide Image Geo-localization Using Hierarchies and Scenes](https://arxiv.org/abs/2303.04249)



## 48.NLP(Natural Language Processing)

* [Images Speak in Images: A Generalist Painter for In-Context Visual Learning](https://arxiv.org/abs/2212.02499)
:star:[code](https://github.com/baaivision/Painter)

* [CLIP-Sculptor: Zero-Shot Generation of High-Fidelity and Diverse Shapes From Natural Language](https://openaccess.thecvf.com/content/CVPR2023/papers/Sanghi_CLIP-Sculptor_Zero-Shot_Generation_of_High-Fidelity_and_Diverse_Shapes_From_Natural_CVPR_2023_paper.pdf)

* Sarcasm Detection (detecting whether sarcasm is present in text, or in other modalities such as images and comics)

  * [DIP: Dual Incongruity Perceiving Network for Sarcasm Detection](https://openaccess.thecvf.com/content/CVPR2023/papers/Wen_DIP_Dual_Incongruity_Perceiving_Network_for_Sarcasm_Detection_CVPR_2023_paper.pdf)
:star:[code](https://github.com/downdric/MSD)

* NLQ

  * [NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory](https://arxiv.org/abs/2301.00746)
:star:[code](http://vision.cs.utexas.edu/projects/naq)

* Visual Grounding

  * [Language Adaptive Weight Generation for Multi-Task Visual Grounding](https://openaccess.thecvf.com/content/CVPR2023/papers/Su_Language_Adaptive_Weight_Generation_for_Multi-Task_Visual_Grounding_CVPR_2023_paper.pdf)

* Referring Expression Comprehension

  * [RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension](https://openaccess.thecvf.com/content/CVPR2023/papers/Jin_RefCLIP_A_Universal_Teacher_for_Weakly_Supervised_Referring_Expression_Comprehension_CVPR_2023_paper.pdf)
:house:[project](https://refclip.github.io/)



## 47.Few/Zero-Shot Learning/Domain Generalization/Adaptation

* DG

  * [Towards Domain Generalization for Multi-view 3D Object Detection in Bird-Eye-View](https://arxiv.org/abs/2303.01686)

  * [Meta-Causal Learning for Single Domain Generalization](http://arxiv.org/abs/2304.03709)

  * [Bi-Level Meta-Learning for Few-Shot Domain Generalization](https://openaccess.thecvf.com/content/CVPR2023/papers/Qin_Bi-Level_Meta-Learning_for_Few-Shot_Domain_Generalization_CVPR_2023_paper.pdf)

  * [Promoting Semantic Connectivity: Dual Nearest Neighbors Contrastive Learning for Unsupervised Domain Generalization](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Promoting_Semantic_Connectivity_Dual_Nearest_Neighbors_Contrastive_Learning_for_Unsupervised_CVPR_2023_paper.pdf)

  * [Federated Domain Generalization With Generalization Adjustment](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Federated_Domain_Generalization_With_Generalization_Adjustment_CVPR_2023_paper.pdf)
:star:[code](https://github.com/MediaBrain-SJTU/FedDG-GA)

  * [Decompose, Adjust, Compose: Effective Normalization by Playing With Frequency for Domain Generalization](https://arxiv.org/abs/2303.02328)

  * [NICO++: Towards Better Benchmarking for Domain Generalization](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_NICO_Towards_Better_Benchmarking_for_Domain_Generalization_CVPR_2023_paper.pdf)
:star:[code](https://github.com/xxgege/NICO-plus)

  * [Improved Test-Time Adaptation for Domain Generalization](http://arxiv.org/abs/2304.04494v1)
:star:[code](https://github.com/liangchen527/ITTA)

  * [Modality-Agnostic Debiasing for Single Domain Generalization](https://arxiv.org/abs/2303.07123)

  * [Neuron Structure Modeling for Generalizable Remote Physiological Measurement](https://arxiv.org/abs/2303.05955)
:star:[code](https://github.com/LuPaoPao/NEST)

  * [Sharpness-Aware Gradient Matching for Domain Generalization](https://arxiv.org/abs/2303.10353)
:star:[code](https://github.com/Wang-pengfei/SAGM)

  * [Improving Generalization with Domain Convex Game](http://arxiv.org/abs/2303.13297v1)

  * [Generalist: Decoupling Natural and Robust Generalization](http://arxiv.org/abs/2303.13813v1)
:star:[code](https://github.com/PKU-ML/Generalist)

  * [ALOFT: A Lightweight MLP-like Architecture with Dynamic Low-frequency Transform for Domain Generalization](https://arxiv.org/abs/2303.11674)
:star:[code](https://github.com/lingeringlight/ALOFT/)

  * [Deep Frequency Filtering for Domain Generalization](https://arxiv.org/abs/2203.12198)

  * [Progressive Random Convolutions for Single Domain Generalization](http://arxiv.org/abs/2304.00424v1)

* DA

  * [Guiding Pseudo-labels with Uncertainty Estimation for Test-Time Adaptation](https://arxiv.org/abs/2303.03770)
:star:[code](https://github.com/MattiaLitrico/Guiding-Pseudo-labels-with-Uncertainty-Estimation-for-Test-Time-Adaptation)

  * [Class Relationship Embedded Learning for Source-Free Unsupervised Domain Adaptation](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Class_Relationship_Embedded_Learning_for_Source-Free_Unsupervised_Domain_Adaptation_CVPR_2023_paper.pdf)

  * [Semi-Supervised Domain Adaptation With Source Label Adaptation](http://arxiv.org/abs/2302.02335)

  * [SCoDA: Domain Adaptive Shape Completion for Real Scans](http://arxiv.org/abs/2304.10179)

  * [Divide and Adapt: Active Domain Adaptation via Customized Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Huang_Divide_and_Adapt_Active_Domain_Adaptation_via_Customized_Learning_CVPR_2023_paper.pdf)

  * [Source-Free Video Domain Adaptation With Spatial-Temporal-Historical Consistency Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Source-Free_Video_Domain_Adaptation_With_Spatial-Temporal-Historical_Consistency_Learning_CVPR_2023_paper.pdf)

  * [DARE-GRAM: Unsupervised Domain Adaptation Regression by Aligning Inverse Gram Matrices](https://arxiv.org/abs/2303.13325)
:star:[code](https://github.com/ismailnejjar/DARE-GRAM)

  * [Dual-Bridging With Adversarial Noise Generation for Domain Adaptive rPPG Estimation](https://openaccess.thecvf.com/content/CVPR2023/papers/Du_Dual-Bridging_With_Adversarial_Noise_Generation_for_Domain_Adaptive_rPPG_Estimation_CVPR_2023_paper.pdf)

  * [MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation](https://arxiv.org/abs/2212.01322)
:star:[code](https://github.com/lhoyer/MIC)

  * [DATE: Domain Adaptive Product Seeker for E-commerce](http://arxiv.org/abs/2304.03669v1)
:star:[code](https://github.com/Taobao-live/Product-Seeking)

  * [Adjustment and Alignment for Unbiased Open Set Domain Adaptation](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Adjustment_and_Alignment_for_Unbiased_Open_Set_Domain_Adaptation_CVPR_2023_paper.pdf)
:star:[code](https://github.com/CityU-AIM-Group/Anna)

  * [Patch-Mix Transformer for Unsupervised Domain Adaptation: A Game Perspective](http://arxiv.org/abs/2303.13434v1)

  * [MHPL: Minimum Happy Points Learning for Active Source Free Domain Adaptation](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_MHPL_Minimum_Happy_Points_Learning_for_Active_Source_Free_Domain_CVPR_2023_paper.pdf)

  * [COT: Unsupervised Domain Adaptation with Clustering and Optimal Transport](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_COT_Unsupervised_Domain_Adaptation_With_Clustering_and_Optimal_Transport_CVPR_2023_paper.pdf)

  * [Upcycling Models under Domain and Category Shift](https://arxiv.org/abs/2303.07110)
:star:[code](https://github.com/ispc-lab/GLC)

  * [C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation](http://arxiv.org/abs/2303.17132v1)

  * [TeSLA: Test-Time Self-Learning With Automatic Adversarial Augmentation](https://arxiv.org/abs/2303.09870)
:star:[code](https://github.com/devavratTomar/TeSLA)

  * [OSAN: A One-Stage Alignment Network to Unify Multimodal Alignment and Unsupervised Domain Adaptation](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_OSAN_A_One-Stage_Alignment_Network_To_Unify_Multimodal_Alignment_and_CVPR_2023_paper.pdf)

  * [MOT: Masked Optimal Transport for Partial Domain Adaptation](https://openaccess.thecvf.com/content/CVPR2023/papers/Luo_MOT_Masked_Optimal_Transport_for_Partial_Domain_Adaptation_CVPR_2023_paper.pdf)

  * [Feature Alignment and Uniformity for Test Time Adaptation](https://arxiv.org/abs/2303.10902)

* ZSL

  * [Bi-Directional Distribution Alignment for Transductive Zero-Shot Learning](https://arxiv.org/abs/2303.08698)
:star:[code](https://github.com/Zhicaiwww/Bi-VAEGAN)

  * [Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning](http://arxiv.org/abs/2303.15322v1)
:star:[code](https://github.com/ManLiuCoder/PSVMA)

  * [Learning Attention as Disentangler for Compositional Zero-shot Learning](http://arxiv.org/abs/2303.15111v1)
:star:[code](https://haoosz.github.io/ade-czsl/)

  * [Zero-shot Model Diagnosis](http://arxiv.org/abs/2303.15441v1)

  * [Learning Conditional Attributes for Compositional Zero-Shot Learning](https://arxiv.org/abs/2305.17940)
:star:[code](https://github.com/wqshmzh/CANet-CZSL)

  * [(ML)$^2$P-Encoder: On Exploration of Channel-Class Correlation for Multi-Label Zero-Shot Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_ML2P-Encoder_On_Exploration_of_Channel-Class_Correlation_for_Multi-Label_Zero-Shot_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/simonzmliu/cvpr23_mlzsl)

  * [Decomposed Soft Prompt Guided Fusion Enhancing for Compositional Zero-Shot Learning](http://arxiv.org/abs/2211.10681)

* FSL

  * [Transductive Few-shot Learning with Prototype-based Label Propagation by Iterative Graph Refinement](http://arxiv.org/abs/2304.11598v1)

  * [Prompt, Generate, Then Cache: Cascade of Foundation Models Makes Strong Few-Shot Learners](https://arxiv.org/abs/2303.02151)
:star:[code](https://github.com/ZrrSkywalker/CaFo)

  * [Revisiting Prototypical Network for Cross Domain Few-Shot Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhou_Revisiting_Prototypical_Network_for_Cross_Domain_Few-Shot_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/NWPUZhoufei/LDP-Net)

  * [Multimodality Helps Unimodality: Cross-Modal Few-Shot Learning With Multimodal Models](https://arxiv.org/abs/2301.06267)
:house:[project](https://linzhiqiu.github.io/papers/cross_modal/)

  * [Open-Set Likelihood Maximization for Few-Shot Learning](https://arxiv.org/abs/2301.08390)

  * [StyleAdv: Meta Style Adversarial Training for Cross-Domain Few-Shot Learning](https://arxiv.org/abs/2302.09309)
:star:[code](https://github.com/lovelyqian/StyleAdv-CDFSL)



## 46.Scene Graph Generation

* [Unbiased Scene Graph Generation in Videos](http://arxiv.org/abs/2304.00733)

* [IS-GGT: Iterative Scene Graph Generation With Generative Transformers](https://openaccess.thecvf.com/content/CVPR2023/papers/Kundu_IS-GGT_Iterative_Scene_Graph_Generation_With_Generative_Transformers_CVPR_2023_paper.pdf)

* [Prototype-based Embedding Network for Scene Graph Generation](https://arxiv.org/abs/2303.07096)
:star:[code](https://github.com/VL-Group/PENET)

* [Devil's on the Edges: Selective Quad Attention for Scene Graph Generation](http://arxiv.org/abs/2304.03495v1)
:house:[project](https://cvlab.postech.ac.kr/research/SQUAT/)

* [Learning To Generate Language-Supervised and Open-Vocabulary Scene Graph Using Pre-Trained Visual-Semantic Space](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhang_Learning_To_Generate_Language-Supervised_and_Open-Vocabulary_Scene_Graph_Using_Pre-Trained_CVPR_2023_paper.pdf)

* [Panoptic Video Scene Graph Generation](https://openaccess.thecvf.com/content/CVPR2023/papers/Yang_Panoptic_Video_Scene_Graph_Generation_CVPR_2023_paper.pdf)

* [Fast Contextual Scene Graph Generation With Unbiased Context Augmentation](https://openaccess.thecvf.com/content/CVPR2023/papers/Jin_Fast_Contextual_Scene_Graph_Generation_With_Unbiased_Context_Augmentation_CVPR_2023_paper.pdf)



## 45.Dense Prediction

* [Ensemble-Based Blackbox Attacks on Dense Prediction](https://arxiv.org/abs/2303.14304)
:star:[code](https://github.com/CSIPlab/EBAD)

* [DejaVu: Conditional Regenerative Learning to Enhance Dense Prediction](https://arxiv.org/abs/2303.01573)

* [Probabilistic Prompt Learning for Dense Prediction](http://arxiv.org/abs/2304.00779v1)

* [1% VS 100%: Parameter-Efficient Low Rank Adapter for Dense Predictions](https://openaccess.thecvf.com/content/CVPR2023/papers/Yin_1_VS_100_Parameter-Efficient_Low_Rank_Adapter_for_Dense_Predictions_CVPR_2023_paper.pdf)

* [DPF: Learning Dense Prediction Fields With Weak Supervision](https://arxiv.org/abs/2303.16890)
:star:[code](https://github.com/cxx226/DPF)

* Dense Detection

  * [One-to-Few Label Assignment for End-to-End Dense Detection](https://arxiv.org/abs/2303.11567)
:star:[code](https://github.com/strongwolf/o2f)

* Dense Object Localization

  * [Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization](https://openaccess.thecvf.com/content/CVPR2023/papers/Xu_Learning_Multi-Modal_Class-Specific_Tokens_for_Weakly_Supervised_Dense_Object_Localization_CVPR_2023_paper.pdf)
:star:[code](https://github.com/xulianuwa/MMCST)



## 44.Federated Learning

* [Confidence-Aware Personalized Federated Learning via Variational Expectation Maximization](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhu_Confidence-Aware_Personalized_Federated_Learning_via_Variational_Expectation_Maximization_CVPR_2023_paper.pdf)

* [Federated Learning With Data-Agnostic Distribution Fusion](https://openaccess.thecvf.com/content/CVPR2023/papers/Duan_Federated_Learning_With_Data-Agnostic_Distribution_Fusion_CVPR_2023_paper.pdf)

* [How To Prevent the Poor Performance Clients for Personalized Federated Learning?](https://openaccess.thecvf.com/content/CVPR2023/papers/Qu_How_To_Prevent_the_Poor_Performance_Clients_for_Personalized_Federated_CVPR_2023_paper.pdf)

* [GradMA: A Gradient-Memory-Based Accelerated Federated Learning With Alleviated Catastrophic Forgetting](https://arxiv.org/abs/2302.14307)

* [Bias-Eliminating Augmentation Learning for Debiased Federated Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Xu_Bias-Eliminating_Augmentation_Learning_for_Debiased_Federated_Learning_CVPR_2023_paper.pdf)

* [Make Landscape Flatter in Differentially Private Federated Learning](https://arxiv.org/abs/2303.11242)

* [The Resource Problem of Using Linear Layer Leakage Attack in Federated Learning](http://arxiv.org/abs/2303.14868v1)

* [Rethinking Federated Learning With Domain Shift: A Prototype View](https://openaccess.thecvf.com/content/CVPR2023/papers/Huang_Rethinking_Federated_Learning_With_Domain_Shift_A_Prototype_View_CVPR_2023_paper.pdf)
:star:[code](https://github.com/WenkeHuang/RethinkFL)

* [On the Effectiveness of Partial Variance Reduction in Federated Learning With Heterogeneous Data](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_On_the_Effectiveness_of_Partial_Variance_Reduction_in_Federated_Learning_CVPR_2023_paper.pdf)

* [Elastic Aggregation for Federated Optimization](https://openaccess.thecvf.com/content/CVPR2023/papers/Chen_Elastic_Aggregation_for_Federated_Optimization_CVPR_2023_paper.pdf)

* [FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning](https://arxiv.org/abs/2207.09653)

* [Adaptive Channel Sparsity for Federated Learning Under System Heterogeneity](https://openaccess.thecvf.com/content/CVPR2023/papers/Liao_Adaptive_Channel_Sparsity_for_Federated_Learning_Under_System_Heterogeneity_CVPR_2023_paper.pdf)

* [ScaleFL: Resource-Adaptive Federated Learning With Heterogeneous Clients](https://openaccess.thecvf.com/content/CVPR2023/papers/Ilhan_ScaleFL_Resource-Adaptive_Federated_Learning_With_Heterogeneous_Clients_CVPR_2023_paper.pdf)

* [Reliable and Interpretable Personalized Federated Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Qin_Reliable_and_Interpretable_Personalized_Federated_Learning_CVPR_2023_paper.pdf)



## 43.Multi-Task Learning

* [Independent Component Alignment for Multi-Task Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Senushkin_Independent_Component_Alignment_for_Multi-Task_Learning_CVPR_2023_paper.pdf)

* [Dynamic Neural Network for Multi-Task Learning Searching Across Diverse Network Topologies](https://arxiv.org/abs/2303.06856)

* [AdaMTL: Adaptive Input-dependent Inference for Efficient Multi-Task Learning](http://arxiv.org/abs/2304.08594v1)
:star:[code](https://github.com/scale-lab/AdaMTL.git)

* [Mod-Squad: Designing Mixtures of Experts As Modular Multi-Task Learners](https://openaccess.thecvf.com/content/CVPR2023/papers/Chen_Mod-Squad_Designing_Mixtures_of_Experts_As_Modular_Multi-Task_Learners_CVPR_2023_paper.pdf)
:house:[project](https://vis-www.cs.umass.edu/mod-squad/)

* [Mitigating Task Interference in Multi-Task Learning via Explicit Task Routing With Non-Learnable Primitives](https://openaccess.thecvf.com/content/CVPR2023/papers/Ding_Mitigating_Task_Interference_in_Multi-Task_Learning_via_Explicit_Task_Routing_CVPR_2023_paper.pdf)
:star:[code](https://github.com/zhichao-lu/etr-nlp-mtl)

* [Hierarchical Prompt Learning for Multi-Task Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Hierarchical_Prompt_Learning_for_Multi-Task_Learning_CVPR_2023_paper.pdf)



## 42.Metric Learning

* [Advancing Deep Metric Learning Through Multiple Batch Norms And Multi-Targeted Adversarial Examples](https://arxiv.org/abs/2211.16253)

* [Deep Factorized Metric Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Deep_Factorized_Metric_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/wangck20/DFML)

* [Deep Semi-Supervised Metric Learning With Mixed Label Propagation](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhuang_Deep_Semi-Supervised_Metric_Learning_With_Mixed_Label_Propagation_CVPR_2023_paper.pdf)

* [Cross-Image-Attention for Conditional Embeddings in Deep Metric Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Kotovenko_Cross-Image-Attention_for_Conditional_Embeddings_in_Deep_Metric_Learning_CVPR_2023_paper.pdf)



## 41.Incremental Learning

* [Decoupling Learning and Remembering: A Bilevel Memory Framework With Knowledge Projection for Task-Incremental Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Sun_Decoupling_Learning_and_Remembering_A_Bilevel_Memory_Framework_With_Knowledge_CVPR_2023_paper.pdf)
:star:[code](https://github.com/SunWenJu123/BMKP)

* [AttriCLIP: A Non-Incremental Learner for Incremental Knowledge Learning](https://arxiv.org/abs/2305.11488)
:star:[code](https://github.com/bhrqw/AttriCLIP)

* [GKEAL: Gaussian Kernel Embedded Analytic Learning for Few-Shot Class Incremental Task](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhuang_GKEAL_Gaussian_Kernel_Embedded_Analytic_Learning_for_Few-Shot_Class_Incremental_CVPR_2023_paper.pdf)

* Class-Incremental Learning

  * [Dense Network Expansion for Class Incremental Learning](http://arxiv.org/abs/2303.12696v1)

  * [Class-Incremental Exemplar Compression for Class-Incremental Learning](http://arxiv.org/abs/2303.14042v1)
:star:[code](https://github.com/xfflzl/CIM-CIL)

  * [Rebalancing Batch Normalization for Exemplar-based Class-Incremental Learning](https://arxiv.org/abs/2201.12559)

  * [Learning with Fantasy: Semantic-Aware Virtual Contrastive Constraint for Few-Shot Class-Incremental Learning](http://arxiv.org/abs/2304.00426v1)
:star:[code](https://github.com/zysong0113/SAVC)

  * [On the Stability-Plasticity Dilemma of Class-Incremental Learning](http://arxiv.org/abs/2304.01663v1)

  * [Few-Shot Class-Incremental Learning via Class-Aware Bilateral Distillation](https://openaccess.thecvf.com/content/CVPR2023/papers/Zhao_Few-Shot_Class-Incremental_Learning_via_Class-Aware_Bilateral_Distillation_CVPR_2023_paper.pdf)
:star:[code](https://github.com/LinglanZhao/BiDistFSCIL)

  * [Multi-Centroid Task Descriptor for Dynamic Class Incremental Inference](https://openaccess.thecvf.com/content/CVPR2023/papers/Cai_Multi-Centroid_Task_Descriptor_for_Dynamic_Class_Incremental_Inference_CVPR_2023_paper.pdf)

  * [DKT: Diverse Knowledge Transfer Transformer for Class Incremental Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Gao_DKT_Diverse_Knowledge_Transfer_Transformer_for_Class_Incremental_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/MIV-XJTU/DKT)

  * [CafeBoost: Causal Feature Boost To Eliminate Task-Induced Bias for Class Incremental Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Qiu_CafeBoost_Causal_Feature_Boost_To_Eliminate_Task-Induced_Bias_for_Class_CVPR_2023_paper.pdf)

  



## 40.Adversarial Learning

* [Adversarial Robustness via Random Projection Filters](https://openaccess.thecvf.com/content/CVPR2023/papers/Dong_Adversarial_Robustness_via_Random_Projection_Filters_CVPR_2023_paper.pdf)

* [Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts](https://arxiv.org/abs/2302.10164)

* [Dynamic Generative Targeted Attacks With Pattern Injection](https://openaccess.thecvf.com/content/CVPR2023/papers/Feng_Dynamic_Generative_Targeted_Attacks_With_Pattern_Injection_CVPR_2023_paper.pdf)

* [FIANCEE: Faster Inference of Adversarial Networks via Conditional Early Exits](https://arxiv.org/abs/2304.10306)

* [Enhancing the Self-Universality for Transferable Targeted Attacks](https://arxiv.org/abs/2209.03716)
:star:[code](https://github.com/zhipeng-wei/Self-Universality)

* [Exploring the Relationship Between Architectural Design and Adversarially Robust Generalization](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_Exploring_the_Relationship_Between_Architectural_Design_and_Adversarially_Robust_Generalization_CVPR_2023_paper.pdf)
:house:[project](http://robust.art/)

* [Revisiting Residual Networks for Adversarial Robustness](https://arxiv.org/abs/2212.11005)
:star:[code](https://github.com/zhichao-lu/robust-residual-network)

* [Feature Separation and Recalibration for Adversarial Robustness](http://arxiv.org/abs/2303.13846v1)
:star:[code](https://github.com/wkim97/FSR)

* [CFA: Class-wise Calibrated Fair Adversarial Training](http://arxiv.org/abs/2303.14460v1)
:star:[code](https://github.com/PKU-ML/CFA)

* [Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations](https://arxiv.org/abs/2202.04235)
:house:[project](https://hsiung.cc/CARBEN/)

* [Efficient Loss Function by Minimizing the Detrimental Effect of Floating-Point Errors on Gradient-Based Attacks](https://openaccess.thecvf.com/content/CVPR2023/papers/Yu_Efficient_Loss_Function_by_Minimizing_the_Detrimental_Effect_of_Floating-Point_CVPR_2023_paper.pdf)

* Black-Box

  * [BlackVIP: Black-Box Visual Prompting for Robust Transfer Learning](http://arxiv.org/abs/2303.14773v1)
:star:[code](https://github.com/changdaeoh/BlackVIP)

  * [Black-Box Sparse Adversarial Attack via Multi-Objective Optimisation](https://openaccess.thecvf.com/content/CVPR2023/papers/Williams_Black-Box_Sparse_Adversarial_Attack_via_Multi-Objective_Optimisation_CVPR_2023_paper.pdf)

  * [Reinforcement Learning-Based Black-Box Model Inversion Attacks](http://arxiv.org/abs/2304.04625v1)

  * [Minimizing Maximum Model Discrepancy for Transferable Black-Box Targeted Attacks](https://arxiv.org/abs/2212.09035)

* Adversarial Examples

  * [Demystifying Causal Features on Adversarial Examples and Causal Inoculation for Robust Network by Adversarial Instrumental Variable Regression](https://arxiv.org/abs/2303.01052)

  * [Introducing Competition To Boost the Transferability of Targeted Adversarial Examples Through Clean Feature Mixup](https://openaccess.thecvf.com/content/CVPR2023/papers/Byun_Introducing_Competition_To_Boost_the_Transferability_of_Targeted_Adversarial_Examples_CVPR_2023_paper.pdf)
:star:[code](https://github.com/dreamflake/CFM)

  * [Towards Transferable Targeted Adversarial Examples](https://openaccess.thecvf.com/content/CVPR2023/papers/Wang_Towards_Transferable_Targeted_Adversarial_Examples_CVPR_2023_paper.pdf)

  * [Improving the Transferability of Adversarial Samples by Path-Augmented Method](http://arxiv.org/abs/2303.15735v1)

  * [SlowLiDAR: Increasing the Latency of LiDAR-Based Detection Using Adversarial Examples](https://openaccess.thecvf.com/content/CVPR2023/papers/Liu_SlowLiDAR_Increasing_the_Latency_of_LiDAR-Based_Detection_Using_Adversarial_Examples_CVPR_2023_paper.pdf)
:star:[code](https://github.com/WUSTL-CSPL/SlowLiDAR)

* Backdoor Attacks

  * [Single Image Backdoor Inversion via Robust Smoothed Classifiers](https://arxiv.org/pdf/2303.00215.pdf)

  * [Detecting Backdoors During the Inference Stage Based on Corruption Robustness Consistency](http://arxiv.org/abs/2303.18191)

  * [You Are Catching My Attention: Are Vision Transformers Bad Learners Under Backdoor Attacks?](https://openaccess.thecvf.com/content/CVPR2023/papers/Yuan_You_Are_Catching_My_Attention_Are_Vision_Transformers_Bad_Learners_CVPR_2023_paper.pdf)

  * [MEDIC: Remove Model Backdoors via Importance Driven Cloning](https://openaccess.thecvf.com/content/CVPR2023/papers/Xu_MEDIC_Remove_Model_Backdoors_via_Importance_Driven_Cloning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/qiulingxu/MEDIC)

  * [Backdoor Defense via Adaptively Splitting Poisoned Dataset](http://arxiv.org/abs/2303.12993v1)
:star:[code](https://github.com/KuofengGao/ASD)

  * [Detecting Backdoors in Pre-trained Encoders](http://arxiv.org/abs/2303.15180v1)
:star:[code](https://github.com/GiantSeaweed/DECREE)

  * [Color Backdoor: A Robust Poisoning Attack in Color Space](https://openaccess.thecvf.com/content/CVPR2023/papers/Jiang_Color_Backdoor_A_Robust_Poisoning_Attack_in_Color_Space_CVPR_2023_paper.pdf)

* Adversarial Attacks

  * [Adversarial Attack with Raindrops](https://arxiv.org/pdf/2302.14267.pdf)

  * [Progressive Backdoor Erasing via Connecting Backdoor and Adversarial Attacks](http://arxiv.org/abs/2202.06312)

  * [Towards Benchmarking and Assessing Visual Naturalness of Physical World Adversarial Attacks](https://openaccess.thecvf.com/content/CVPR2023/papers/Li_Towards_Benchmarking_and_Assessing_Visual_Naturalness_of_Physical_World_Adversarial_CVPR_2023_paper.pdf)

  * [The Best Defense Is a Good Offense: Adversarial Augmentation Against Adversarial Attacks](https://openaccess.thecvf.com/content/CVPR2023/papers/Frosio_The_Best_Defense_Is_a_Good_Offense_Adversarial_Augmentation_Against_CVPR_2023_paper.pdf)

  * [Robust Single Image Reflection Removal Against Adversarial Attacks](https://openaccess.thecvf.com/content/CVPR2023/papers/Song_Robust_Single_Image_Reflection_Removal_Against_Adversarial_Attacks_CVPR_2023_paper.pdf)

  * [Transferable Adversarial Attacks on Vision Transformers With Token Gradient Regularization](https://arxiv.org/abs/2303.15754)
:star:[code](https://github.com/jpzhang1810/TGR)

  * [StyLess: Boosting the Transferability of Adversarial Examples](http://arxiv.org/abs/2304.11579v1)

  * [Re-thinking Model Inversion Attacks Against Deep Neural Networks](http://arxiv.org/abs/2304.01669v1)
:star:[code](https://ngoc-nguyen-0.github.io/re-thinking_model_inversion_attacks/)

  * [Defending Against Patch-based Backdoor Attacks on Self-Supervised Learning](http://arxiv.org/abs/2304.01482v1)
:star:[code](https://github.com/UCDvision/PatchSearch)

  * [Jedi: Entropy-based Localization and Removal of Adversarial Patches](http://arxiv.org/abs/2304.10029v1)

* Backdoor Defense

  * [Backdoor Defense via Deconfounded Representation Learning](https://arxiv.org/abs/2303.06818)
:star:[code](https://github.com/zaixizhang/CBD)

* Adversarial Training

  * [The Enemy of My Enemy Is My Friend: Exploring Inverse Adversaries for Improving Adversarial Training](https://arxiv.org/abs/2211.00525)

  * [Randomized Adversarial Training via Taylor Expansion](https://arxiv.org/abs/2303.10653)
:star:[code](https://github.com/Alexkael/Randomized-Adversarial-Training)

  * [AGAIN: Adversarial Training With Attribution Span Enlargement and Hybrid Feature Fusion](https://openaccess.thecvf.com/content/CVPR2023/papers/Yin_AGAIN_Adversarial_Training_With_Attribution_Span_Enlargement_and_Hybrid_Feature_CVPR_2023_paper.pdf)



## 39.Continual Learning

* [Dealing With Cross-Task Class Discrimination in Online Continual Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Guo_Dealing_With_Cross-Task_Class_Discrimination_in_Online_Continual_Learning_CVPR_2023_paper.pdf)

* [Heterogeneous Continual Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Madaan_Heterogeneous_Continual_Learning_CVPR_2023_paper.pdf)

* [Batch Model Consolidation: A Multi-Task Model Consolidation Framework](https://openaccess.thecvf.com/content/CVPR2023/papers/Fostiropoulos_Batch_Model_Consolidation_A_Multi-Task_Model_Consolidation_Framework_CVPR_2023_paper.pdf)

* [CODA-Prompt: COntinual Decomposed Attention-Based Prompting for Rehearsal-Free Continual Learning](https://openaccess.thecvf.com/content/CVPR2023/papers/Smith_CODA-Prompt_COntinual_Decomposed_Attention-Based_Prompting_for_Rehearsal-Free_Continual_Learning_CVPR_2023_paper.pdf)
:star:[code](https://github.com/GT-RIPL/CODA-Prompt)

* [Achieving a Better Stability-Plasticity Trade-off via Auxiliary Networks in Continual Learning](https://arxiv.org/abs/2303.09483)
:star:[code](https://github.com/kim-sanghwan/ANCL)

* [Computationally Budgeted Continual Learning: What Does Matter?](https://arxiv.org/abs/2303.11165)
:star:[code](https://github.com/drimpossible/BudgetCL)

* [Preserving Linear Separability in Continual Learning by Backward Feature Projection](http://arxiv.org/abs/2303.14595v1)

* [Regularizing Second-Order Influences for Continual Lea