{"id":19320149,"url":"https://github.com/52cv/wacv-2024-papers","last_synced_at":"2026-02-28T07:31:45.031Z","repository":{"id":206128297,"uuid":"682464792","full_name":"52CV/WACV-2024-Papers","owner":"52CV","description":null,"archived":false,"fork":false,"pushed_at":"2024-01-16T08:14:54.000Z","size":441,"stargazers_count":101,"open_issues_count":0,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-02-24T05:14:35.497Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/52CV.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-24T08:20:51.000Z","updated_at":"2025-02-03T16:10:21.000Z","dependencies_parsed_at":"2023-11-13T04:25:04.877Z","dependency_job_id":"9f9dc5cb-9ac8-4e7b-ac7d-0419914da821","html_url":"https://github.com/52CV/WACV-2024-Papers","commit_stats":null,"previous_names":["52cv/wacv-2024-papers"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/52CV/WACV-2024-Papers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/52CV%2FWACV-2024-Papers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/52CV%2FWACV-2024-Papers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/52CV%2FWACV-2024-Papers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/52CV%2FWACV-2024-Papers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/52CV","download_url":"https://codeload.gi
thub.com/52CV/WACV-2024-Papers/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/52CV%2FWACV-2024-Papers/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29927568,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-27T19:37:42.220Z","status":"online","status_checked_at":"2026-02-28T02:00:07.010Z","response_time":90,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T01:27:18.301Z","updated_at":"2026-02-28T07:31:45.013Z","avatar_url":"https://github.com/52CV.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# WACV-2024-Papers\n![Alt text](96748913c73db498eb8249e43c245b8.jpg)\n## Conference dates: January 3-7, 2024\n## Conference website: https://wacv2024.thecvf.com/\n## ❣❣❣ WACV 2024 paper categorization is complete\n## 📢📢📢 Award-Winning Papers\n\n#### 🏆 Best Paper Award (Algorithms)\n[Conditional Velocity Score Estimation for Image Restoration](https://openaccess.thecvf.com/content/WACV2024/papers/Shi_Conditional_Velocity_Score_Estimation_for_Image_Restoration_WACV_2024_paper.pdf)\n\n#### 🏆 Best Paper Award (Applications)\n[WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Cermak_WildlifeDatasets_An_Open-Source_Toolkit_for_Animal_Re-Identification_WACV_2024_paper.pdf)\n\n#### 🏆 Best Student Paper\n[Wino Vidi Vici: Conquering Numerical Instability of 8-Bit Winograd Convolution for Accurate Inference Acceleration 
on Edge](https://openaccess.thecvf.com/content/WACV2024/papers/Mori_Wino_Vidi_Vici_Conquering_Numerical_Instability_of_8-Bit_Winograd_Convolution_WACV_2024_paper.pdf)\n\n#### 🏆 Best Paper Honorable Mention\n[ParticleNeRF: A Particle-Based Encoding for Online Neural Radiance Fields](https://openaccess.thecvf.com/content/WACV2024/papers/Abou-Chakra_ParticleNeRF_A_Particle-Based_Encoding_for_Online_Neural_Radiance_Fields_WACV_2024_paper.pdf)\n\n## For 2024 survey papers, click here ↘️[2024-CV-Surveys](https://github.com/52CV/CV-Surveys)\n\n## 2024 papers by category: click here\n↘️[WACV-2024-Papers](https://github.com/52CV/WACV-2024-Papers)\n\n## 2023 papers by category: click here\n↘️[CVPR-2023-Papers](https://github.com/52CV/CVPR-2023-Papers)\n↘️[WACV-2023-Papers](https://github.com/52CV/WACV-2023-Papers)\n↘️[ICCV-2023-Papers](https://github.com/52CV/ICCV-2023-Papers)\n↘️[2023-CV-Surveys](https://github.com/52CV/CV-Surveys/blob/main/2023-CV-Surveys.md)\n\n## [2022 papers by category: click here](#000)\n## [2021 papers by category: click here](#00)\n## [2020 papers by category: click here](#0)\n\n\n## Table of Contents\n\n|:cat:|:dog:|:tiger:|:wolf:|\n|------|------|------|------|\n|[1.Other](#1)|[2.SR (Super-Resolution)](#2)|[3.Image/Video Retrieval](#3)|[4.Image/Video Caption](#4)|\n|[5.Image/Video Compression](#5)|[6.Medical Image](#6)|[7.3D (3D Reconstruction/3D Vision)](#7)|[8.Face](#8)|\n|[9.Image Segmentation](#9)|[10.Object Detection](#10)|[11.Object Tracking](#11)|[12.UAV/RS/Satellite Image](#12)|\n|[13.Reid (Person Re-Identification/Gait Recognition/Pedestrian Detection)](#13)|[14.OCR (Text Detection and Recognition)](#14)|[15.Video](#15)|[16.Action Detection](#16)|\n|[17.HPE (Human Pose Estimation)](#17)|[18.Animal](#18)|[19.Object Pose Estimation](#19)|[20.GAN/Generation](#20)|\n|[21.SLAM/AR/VR/Robotics](#21)|[22.VQA (Visual Question Answering)](#22)|[23.VL (Vision-Language)](#23)|[24.LLM (Large Language Models)](#24)|\n|[25.Multimodal](#25)|[26.Human Motion Prediction](#26)|[27.HOI (Human-Object Interaction)](#27)|[28.Point-Cloud](#28)|\n|[29.SGG (Scene Graph Generation)](#29)|[30.GNN/GCN](#30)|[31.Automated Driving](#31)|[32.Scene Flow Estimation](#32)|\n|[33.Optical Flow 
Estimation](#33)|[34.NAS](#34)|[35.MC/KD/Pruning (Model Compression/Knowledge Distillation/Pruning)](#35)|[36.NLP](#36)|\n|[37.ML (Machine Learning)](#37)|[38.Visual Representation Learning](#38)|[39.Few/Zero-Shot Learning/DG/DA (Domain Generalization/Domain Adaptation)](#39)|[40.Self/Semi-supervised learning](#40)|\n|[41.Image Processing (Low-Level Image Processing, Quality Assessment)](#41)|[42.Image Classification](#42)|[43.Image Fusion](#43)|[44.Visual Industrial Inspection](#44)|\n|[45.Visual Tampering Detection](#45)|[46.Dense Prediction](#46)|[47.Edge Detection](#47)|[48.Image/Video Editing](#48)|\n|[49.Vision Transformers](#49)|[50.Dataset](#50)|[51.Sound (Speech)](#51)|[52.Gaze Estimation](#52)|\n|[53.Crack Segmentation](#53)|[54.Style Transfer](#54)|[55.Biometrics](#55)|[56.Event Cameras](#56)|\n|[57.Neural Radiance Fields(NeRF)](#57)|[58.Novel View Synthesis](#58)|[59.Rendering](#59)|[60.Graphic Layout](#60)|\n|[61.Computational Imaging (e.g., optical, geometric, and light-field imaging)](#61)|\n\n\u003ca name=\"61\"/\u003e\n\n## 61.Computational Imaging (e.g., optical, geometric, and light-field imaging)\n* [Motion Matters: Neural Motion Transfer for Better Camera Physiological Measurement](http://arxiv.org/abs/2303.12059)\n* [On the Quantification of Image Reconstruction Uncertainty without Training Data](http://arxiv.org/abs/2311.09639v1)\n* [Deep Optics for Optomechanical Control Policy Design](https://openaccess.thecvf.com/content/WACV2024/papers/Fletcher_Deep_Optics_for_Optomechanical_Control_Policy_Design_WACV_2024_paper.pdf)\n* [From Chaos to Calibration: A Geometric Mutual Information Approach To Target-Free Camera LiDAR Extrinsic Calibration](https://openaccess.thecvf.com/content/WACV2024/papers/Borer_From_Chaos_to_Calibration_A_Geometric_Mutual_Information_Approach_To_WACV_2024_paper.pdf)\n* [Joint 3D Shape and Motion Estimation From Rolling Shutter Light-Field Images](http://arxiv.org/abs/2311.01292)\n* [CGAPoseNet+GCAN: A Geometric Clifford Algebra Network for Geometry-Aware Camera Pose 
Regression](https://openaccess.thecvf.com/content/WACV2024/papers/Pepe_CGAPoseNetGCAN_A_Geometric_Clifford_Algebra_Network_for_Geometry-Aware_Camera_Pose_WACV_2024_paper.pdf)\n* 相机校准\n  * [MSCC: Multi-Scale Transformers for Camera Calibration](https://openaccess.thecvf.com/content/WACV2024/papers/Song_MSCC_Multi-Scale_Transformers_for_Camera_Calibration_WACV_2024_paper.pdf)\n\n\u003ca name=\"60\"/\u003e\n\n## 60.Graphic Layout(图形布局)\n* [Unsupervised Graphic Layout Grouping with Transformers](https://openaccess.thecvf.com/content/WACV2024/papers/Zhu_Unsupervised_Graphic_Layout_Grouping_With_Transformers_WACV_2024_paper.pdf)\n\n\u003ca name=\"59\"/\u003e\n\n## 59.Rendering\n* [LensNeRF: Rethinking Volume Rendering Based on Thin-Lens Camera Model](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_LensNeRF_Rethinking_Volume_Rendering_Based_on_Thin-Lens_Camera_Model_WACV_2024_paper.pdf)\n* [Specular Object Reconstruction Behind Frosted Glass by Differentiable Rendering](https://openaccess.thecvf.com/content/WACV2024/papers/Iwaguchi_Specular_Object_Reconstruction_Behind_Frosted_Glass_by_Differentiable_Rendering_WACV_2024_paper.pdf)\n\n\u003ca name=\"58\"/\u003e\n\n## 58.Novel View Synthesis(新视角合成)\n* [Ray Deformation Networks for Novel View Synthesis of Refractive Objects](https://openaccess.thecvf.com/content/WACV2024/papers/Deng_Ray_Deformation_Networks_for_Novel_View_Synthesis_of_Refractive_Objects_WACV_2024_paper.pdf)\n* [Stereo Conversion With Disparity-Aware Warping, Compositing and Inpainting](https://openaccess.thecvf.com/content/WACV2024/papers/Mehl_Stereo_Conversion_With_Disparity-Aware_Warping_Compositing_and_Inpainting_WACV_2024_paper.pdf)\n\n\u003ca name=\"57\"/\u003e\n\n## 57.Neural Radiance Fields(NeRF)\n* [EvDNeRF: Reconstructing Event Data With Dynamic Neural Radiance Fields](http://arxiv.org/abs/2310.02437)\n* [Hyb-NeRF: A Multiresolution Hybrid Encoding for Neural Radiance Fields](https://arxiv.org/abs/2311.12490)\n* [Fast Sun-aligned Outdoor 
Scene Relighting based on TensoRF](http://arxiv.org/abs/2311.03965v1)\n* [ParticleNeRF: A Particle-Based Encoding for Online Neural Radiance Fields](https://openaccess.thecvf.com/content/WACV2024/papers/Abou-Chakra_ParticleNeRF_A_Particle-Based_Encoding_for_Online_Neural_Radiance_Fields_WACV_2024_paper.pdf)\n* [MoRF: Mobile Realistic Fullbody Avatars From a Monocular Video](https://arxiv.org/abs/2303.10275)\n* [ZIGNeRF: Zero-Shot 3D Scene Representation With Invertible Generative Neural Radiance Fields](http://arxiv.org/abs/2306.02741)\n* [Point-DynRF: Point-Based Dynamic Radiance Fields From a Monocular Video](https://openaccess.thecvf.com/content/WACV2024/papers/Park_Point-DynRF_Point-Based_Dynamic_Radiance_Fields_From_a_Monocular_Video_WACV_2024_paper.pdf)\n* [A Generic and Flexible Regularization Framework for NeRFs](https://openaccess.thecvf.com/content/WACV2024/papers/Ehret_A_Generic_and_Flexible_Regularization_Framework_for_NeRFs_WACV_2024_paper.pdf)\n\n\u003ca name=\"56\"/\u003e\n\n## 56.Event Cameras(事件相机)\n* [Masked Event Modeling: Self-Supervised Pretraining for Event Cameras](http://arxiv.org/abs/2212.10368)\n\n\u003ca name=\"55\"/\u003e\n\n## 55.Biometrics(生物特征识别)\n* [Deep Visual-Genetic Biometrics for Taxonomic Classification of Rare Species](http://arxiv.org/abs/2305.06695)\n* [Fingervein Verification using Convolutional Multi-Head Attention Network](http://arxiv.org/abs/2310.16808v1)\n* [FarSight: A Physics-Driven Whole-Body Biometric System at Large Distance and Altitude](http://arxiv.org/abs/2306.17206)\n* [Vikriti-ID: A Novel Approach for Real Looking Fingerprint Data-Set Generation](https://openaccess.thecvf.com/content/WACV2024/papers/Shukla_Vikriti-ID_A_Novel_Approach_for_Real_Looking_Fingerprint_Data-Set_Generation_WACV_2024_paper.pdf)\n* 指纹生成\n  * [FPGAN-Control: A Controllable Fingerprint Generator for Training With Synthetic 
Data](https://openaccess.thecvf.com/content/WACV2024/papers/Shoshan_FPGAN-Control_A_Controllable_Fingerprint_Generator_for_Training_With_Synthetic_Data_WACV_2024_paper.pdf)\n\n\u003ca name=\"54\"/\u003e\n\n## 54.Style Transfer(风格迁移)\n* [Optical Flow Domain Adaptation via Target Style Transfer](https://openaccess.thecvf.com/content/WACV2024/papers/Yoon_Optical_Flow_Domain_Adaptation_via_Target_Style_Transfer_WACV_2024_paper.pdf)\n* [Multimodality-guided Image Style Transfer using Cross-modal GAN Inversion](http://arxiv.org/abs/2312.01671v1)\u003cbr\u003e:star:[code](https://hywang66.github.io/mmist/)\n* [FastCLIPstyler: Optimisation-Free Text-Based Image Style Transfer Using Style Representations](http://arxiv.org/abs/2210.03461)\n* [SpectralCLIP: Preventing Artifacts in Text-Guided Style Transfer From a Spectral Perspective](http://arxiv.org/abs/2303.09270)\n* [Neural Style Protection: Counteracting Unauthorized Neural Style Transfer](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Neural_Style_Protection_Counteracting_Unauthorized_Neural_Style_Transfer_WACV_2024_paper.pdf)\n* [LipAT: Beyond Style Transfer for Controllable Neural Simulation of Lipstick Using Cosmetic Attributes](https://openaccess.thecvf.com/content/WACV2024/papers/Silva_LipAT_Beyond_Style_Transfer_for_Controllable_Neural_Simulation_of_Lipstick_WACV_2024_paper.pdf)\n\n\u003ca name=\"53\"/\u003e\n\n## 53.Crack Segmentation\n* [Designing a Hybrid Neural System To Learn Real-World Crack Segmentation From Fractal-Based Simulation](http://arxiv.org/abs/2309.09637)\n\n\u003ca name=\"52\"/\u003e\n\n## 52.Gaze Estimation(凝视估计)\n* [Rotation-Constrained Cross-View Feature Fusion for Multi-View Appearance-Based Gaze Estimation](http://arxiv.org/abs/2305.12704)\n* 目光跟踪\n  * [Multi-Modal Gaze Following in Conversational Scenarios](http://arxiv.org/abs/2311.05669)\n\n\u003ca name=\"51\"/\u003e\n\n## 51.sound(语音)\n* 唇语同步\n  * [Diff2Lip: Audio Conditioned Diffusion Models for 
Lip-Synchronization](http://arxiv.org/abs/2308.09716)\n* 声源定位\n  * [Can CLIP Help Sound Source Localization?](http://arxiv.org/abs/2311.04066v1)\n* 音频分离  \n  * [LAVSS: Location-Guided Audio-Visual Spatial Audio Separation](https://arxiv.org/abs/2310.20446)\n  * [Visually Guided Audio Source Separation With Meta Consistency Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Islam_Visually_Guided_Audio_Source_Separation_With_Meta_Consistency_Learning_WACV_2024_paper.pdf)\n* 3D 声源检测\n  * [Sound3DVDet: 3D Sound Source Detection Using Multiview Microphone Array and RGB Images](https://openaccess.thecvf.com/content/WACV2024/papers/He_Sound3DVDet_3D_Sound_Source_Detection_Using_Multiview_Microphone_Array_and_WACV_2024_paper.pdf)\n* 音视频分割\n  * [Annotation-Free Audio-Visual Segmentation](http://arxiv.org/abs/2305.11019)\n* 语音视频合成\n  * [DR2: Disentangled Recurrent Representation Learning for Data-Efficient Speech Video Synthesis](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_DR2_Disentangled_Recurrent_Representation_Learning_for_Data-Efficient_Speech_Video_Synthesis_WACV_2024_paper.pdf)\n* 身体节拍制作互动鼓声\n  * [Let the Beat Follow You - Creating Interactive Drum Sounds From Body Rhythm](https://openaccess.thecvf.com/content/WACV2024/papers/Liu_Let_the_Beat_Follow_You_-_Creating_Interactive_Drum_Sounds_WACV_2024_paper.pdf)\n\n\u003ca name=\"50\"/\u003e\n\n## 50.Dataset(数据集)\n* [HaGRID -- HAnd Gesture Recognition Image Dataset](https://openaccess.thecvf.com/content/WACV2024/papers/Kapitanov_HaGRID_--_HAnd_Gesture_Recognition_Image_Dataset_WACV_2024_paper.pdf)\n* [Beyond RGB: A Real World Dataset for Multispectral Imaging in Mobile Devices](https://openaccess.thecvf.com/content/WACV2024/papers/Glatt_Beyond_RGB_A_Real_World_Dataset_for_Multispectral_Imaging_in_WACV_2024_paper.pdf)\n* [IKEA Ego 3D Dataset: Understanding Furniture Assembly Actions From Ego-View 3D Point 
Clouds](https://openaccess.thecvf.com/content/WACV2024/papers/Ben-Shabat_IKEA_Ego_3D_Dataset_Understanding_Furniture_Assembly_Actions_From_Ego-View_WACV_2024_paper.pdf)\n* [PsyMo: A Dataset for Estimating Self-Reported Psychological Traits From Gait](http://arxiv.org/abs/2308.10631)\n* [The Growing Strawberries Dataset: Tracking Multiple Objects With Biological Development Over an Extended Period](https://openaccess.thecvf.com/content/WACV2024/papers/Wen_The_Growing_Strawberries_Dataset_Tracking_Multiple_Objects_With_Biological_Development_WACV_2024_paper.pdf)\n* [UOW-Vessel: A Benchmark Dataset of High-Resolution Optical Satellite Images for Vessel Detection and Segmentation](https://openaccess.thecvf.com/content/WACV2024/papers/Bui_UOW-Vessel_A_Benchmark_Dataset_of_High-Resolution_Optical_Satellite_Images_for_WACV_2024_paper.pdf)\n* [NITEC: Versatile Hand-Annotated Eye Contact Dataset for Ego-Vision Interaction](http://arxiv.org/abs/2311.04505v1)\u003cbr\u003e:star:[code](https://github.com/thohemp/nitec)\n* [FishTrack23: An Ensemble Underwater Dataset for Multi-Object Tracking](https://openaccess.thecvf.com/content/WACV2024/papers/Dawkins_FishTrack23_An_Ensemble_Underwater_Dataset_for_Multi-Object_Tracking_WACV_2024_paper.pdf)\n* [Rank2Tell: A Multimodal Driving Dataset for Joint Importance Ranking and Reasoning](http://arxiv.org/abs/2309.06597)\n* [NOMAD: A Natural, Occluded, Multi-Scale Aerial Dataset, for Emergency Response Scenarios](http://arxiv.org/abs/2309.09518)\n* [Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering](https://openaccess.thecvf.com/content/WACV2024/papers/Liu_Tackling_Data_Bias_in_MUSIC-AVQA_Crafting_a_Balanced_Dataset_for_WACV_2024_paper.pdf)\n* [SphereCraft: A Dataset for Spherical Keypoint Detection, Matching and Camera Pose Estimation](https://openaccess.thecvf.com/content/WACV2024/papers/Gava_SphereCraft_A_Dataset_for_Spherical_Keypoint_Detection_Matching_and_Camera_WACV_2024_paper.pdf)\n* 
[Ego2HandsPose: A Dataset for Egocentric Two-Hand 3D Global Pose Estimation](http://arxiv.org/abs/2206.04927)\n* [MarsLS-Net: Martian Landslides Segmentation Network and Benchmark Dataset](https://openaccess.thecvf.com/content/WACV2024/papers/Paheding_MarsLS-Net_Martian_Landslides_Segmentation_Network_and_Benchmark_Dataset_WACV_2024_paper.pdf)\n* [Beyond Document Page Classification: Design, Datasets, and Challenges](http://arxiv.org/abs/2308.12896)\n* [MuSHRoom: Multi-Sensor Hybrid Room Dataset for Joint 3D Reconstruction and Novel View Synthesis](http://arxiv.org/abs/2311.02778)\n* [SyntheWorld: A Large-Scale Synthetic Dataset for Land Cover Mapping and Building Change Detection](https://arxiv.org/abs/2309.01907)\u003cbr\u003e:sunflower:[dataset](https://github.com/JTRNEO/SyntheWorld)\n* [IndustReal: A Dataset for Procedure Step Recognition Handling Execution Errors in Egocentric Videos in an Industrial-Like Setting](http://arxiv.org/abs/2310.17323v1)\u003cbr\u003e:star:[code](https://github.com/TimSchoonbeek/IndustReal)\n* [SICKLE: A Multi-Sensor Satellite Imagery Dataset Annotated with Multiple Key Cropping Parameters](https://arxiv.org/abs/2312.00069)\n* [SeaTurtleID2022: A Long-Span Dataset for Reliable Sea Turtle Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Adam_SeaTurtleID2022_A_Long-Span_Dataset_for_Reliable_Sea_Turtle_Re-Identification_WACV_2024_paper.pdf)\n* [Amodal Intra-Class Instance Segmentation: Synthetic Datasets and Benchmark](http://arxiv.org/abs/2303.06596)\n* [Towards Accurate Disease Segmentation in Plant Images: A Comprehensive Dataset Creation and Network Evaluation](https://openaccess.thecvf.com/content/WACV2024/papers/Prashanth_Towards_Accurate_Disease_Segmentation_in_Plant_Images_A_Comprehensive_Dataset_WACV_2024_paper.pdf)\n* [AssemblyNet: A Point Cloud Dataset and Benchmark for Predicting Part Directions in an Exploded 
Layout](https://openaccess.thecvf.com/content/WACV2024/papers/Gaarsdal_AssemblyNet_A_Point_Cloud_Dataset_and_Benchmark_for_Predicting_Part_WACV_2024_paper.pdf)\n* [MAdVerse: A Hierarchical Dataset of Multi-Lingual Ads From Diverse Sources and Categories](https://openaccess.thecvf.com/content/WACV2024/papers/Sagar_MAdVerse_A_Hierarchical_Dataset_of_Multi-Lingual_Ads_From_Diverse_Sources_WACV_2024_paper.pdf)\n* [InfraParis: A Multi-Modal and Multi-Task Autonomous Driving Dataset](http://arxiv.org/abs/2309.15751)\n* [ZRG: A Dataset for Multimodal 3D Residential Rooftop Understanding](http://arxiv.org/abs/2304.13219)\n* 基准\n  * [ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification](http://arxiv.org/abs/2311.02734v1)\n  * [ConeQuest: A Benchmark for Cone Segmentation on Mars](http://arxiv.org/abs/2311.08657v1)\u003cbr\u003e:star:[code](https://github.com/kerner-lab/ConeQuest)\n  * [dacl10k: Benchmark for Semantic Bridge Damage Segmentation](https://openaccess.thecvf.com/content/WACV2024/papers/Flotzinger_dacl10k_Benchmark_for_Semantic_Bridge_Damage_Segmentation_WACV_2024_paper.pdf)\n  * [IDD-AW: A Benchmark for Safe and Robust Segmentation of Drive Scenes in Unstructured Traffic and Adverse Weather](https://arxiv.org/abs/2311.14459)\n  * [A Multimodal Benchmark and Improved Architecture for Zero Shot Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Doshi_A_Multimodal_Benchmark_and_Improved_Architecture_for_Zero_Shot_Learning_WACV_2024_paper.pdf)\n  * [RobustCLEVR: A Benchmark and Framework for Evaluating Robustness in Object-Centric Learning](http://arxiv.org/abs/2308.14899)\n\n\u003ca name=\"49\"/\u003e\n\n## 49.Vision Transformers\n* [Grafting Vision Transformers](http://arxiv.org/abs/2210.15943)\n* [Efficient MAE Towards Large-Scale Vision Transformers](https://openaccess.thecvf.com/content/WACV2024/papers/Han_Efficient_MAE_Towards_Large-Scale_Vision_Transformers_WACV_2024_paper.pdf)\n* [SimA: Simple 
Softmax-Free Attention for Vision Transformers](http://arxiv.org/abs/2206.08898)\n* [Open-NeRF: Towards Open Vocabulary NeRF Decomposition](http://arxiv.org/abs/2310.16383v1)\n* [Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders](https://arxiv.org/abs/2310.20704)\n* [Triplet Attention Transformer for Spatiotemporal Predictive Learning](http://arxiv.org/abs/2310.18698)\n* [Query-Guided Attention in Vision Transformers for Localizing Objects Using a Single Sketch](http://arxiv.org/abs/2303.08784)\n* [GTP-ViT: Efficient Vision Transformers via Graph-based Token Propagation](http://arxiv.org/abs/2311.03035v1)\u003cbr\u003e:star:[code](https://github.com/Ackesnal/GTP-ViT)\n* [Exploring Adversarial Robustness of Vision Transformers in the Spectral Perspective](http://arxiv.org/abs/2208.09602)\n* [SBCFormer: Lightweight Network Capable of Full-size ImageNet Classification at 1 FPS on Single Board Computers](http://arxiv.org/abs/2311.03747v1)\u003cbr\u003e:star:[code](https://github.com/xyongLu/SBCFormer)\n* [Robust Eye Blink Detection Using Dual Embedding Video Vision Transformer](https://openaccess.thecvf.com/content/WACV2024/papers/Hong_Robust_Eye_Blink_Detection_Using_Dual_Embedding_Video_Vision_Transformer_WACV_2024_paper.pdf)\n* [Semantic Labels-Aware Transformer Model for Searching Over a Large Collection of Lecture-Slides](https://openaccess.thecvf.com/content/WACV2024/papers/Jobin_Semantic_Labels-Aware_Transformer_Model_for_Searching_Over_a_Large_Collection_WACV_2024_paper.pdf)\n\n\u003ca name=\"48\"/\u003e\n\n## 48.Image/Video Editing\n* [Unified Concept Editing in Diffusion Models](http://arxiv.org/abs/2308.14761)\n* [Iterative Multi-Granular Image Editing Using Diffusion Models](http://arxiv.org/abs/2309.00613)\n* [Discovering and Mitigating Biases in CLIP-Based Image Editing](https://openaccess.thecvf.com/content/WACV2024/papers/Tanjim_Discovering_and_Mitigating_Biases_in_CLIP-Based_Image_Editing_WACV_2024_paper.pdf)\n* 
[Revisiting Latent Space of GAN Inversion for Robust Real Image Editing](https://openaccess.thecvf.com/content/WACV2024/papers/Katsumata_Revisiting_Latent_Space_of_GAN_Inversion_for_Robust_Real_Image_WACV_2024_paper.pdf)\n* [ProxEdit: Improving Tuning-Free Real Image Editing With Proximal Guidance](https://openaccess.thecvf.com/content/WACV2024/papers/Han_ProxEdit_Improving_Tuning-Free_Real_Image_Editing_With_Proximal_Guidance_WACV_2024_paper.pdf)\n* 图像拼接\n  * [Learning Residual Elastic Warps for Image Stitching Under Dirichlet Boundary Condition](http://arxiv.org/abs/2309.01406)\n  * [Implicit Neural Image Stitching With Enhanced and Blended Feature Reconstruction](http://arxiv.org/abs/2309.01409)\n* 视频编辑\n  * [Real Time GAZED: Online Shot Selection and Editing of Virtual Cameras From Wide-Angle Monocular Video Recordings](http://arxiv.org/abs/2311.15581)\n* 文本-图像编辑\n  * [Text-to-Image Editing by Image Information Removal](http://arxiv.org/abs/2305.17489)\n* 3D 场景编辑\n  * [NeRFEditor: Differentiable Style Decomposition for 3D Scene Editing](https://openaccess.thecvf.com/content/WACV2024/papers/Sun_NeRFEditor_Differentiable_Style_Decomposition_for_3D_Scene_Editing_WACV_2024_paper.pdf)\n\n\u003ca name=\"47\"/\u003e\n\n## 47.Edge Detection(边缘检测)\n* [Self-Supervised Edge Detection Reconstruction for Topology-Informed 3D Axon Segmentation and Centerline Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Xu_Self-Supervised_Edge_Detection_Reconstruction_for_Topology-Informed_3D_Axon_Segmentation_and_WACV_2024_paper.pdf)\n\n\u003ca name=\"46\"/\u003e\n\n## 46.Dense Prediction(密集预测)\n* [PolyMaX: General Dense Prediction with Mask Transformer](http://arxiv.org/abs/2311.05770v1)\n* [Convolutional Masked Image Modeling for Dense Prediction Tasks on Pathology Images](https://openaccess.thecvf.com/content/WACV2024/papers/Yang_Convolutional_Masked_Image_Modeling_for_Dense_Prediction_Tasks_on_Pathology_WACV_2024_paper.pdf)\n\n\u003ca name=\"45\"/\u003e\n\n## 
45.Visual Tampering Detection(视觉篡改检测)\n* 包裹防伪检测\n  * [TAMPAR: Visual Tampering Detection for Parcel Logistics in Postal Supply Chains](http://arxiv.org/abs/2311.03124v1)\u003cbr\u003e:star:[code](https://a-nau.github.io/tampar)\n* 视频伪造检测\n  * [VideoFACT: Detecting Video Forgeries Using Attention, Scene Context, and Forensic Traces](https://arxiv.org/abs/2211.15775)\n* Deepfakes \n  * [D4: Detection of Adversarial Diffusion Deepfakes Using Disjoint Ensembles](http://arxiv.org/abs/2202.05687)\n  * [Weakly-supervised deepfake localization in diffusion-generated images](http://arxiv.org/abs/2311.04584v1)\n  * [How Do Deepfakes Move? Motion Magnification for Deepfake Source Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Demir_How_Do_Deepfakes_Move_Motion_Magnification_for_Deepfake_Source_Detection_WACV_2024_paper.pdf)\n  * [Improving Fairness in Deepfake Detection](http://arxiv.org/abs/2306.16635)\n\n\u003ca name=\"44\"/\u003e\n\n## 44.visual industrial inspection(工业检测)\n* [ReConPatch: Contrastive Patch Representation Learning for Industrial Anomaly Detection](http://arxiv.org/abs/2305.16713)\n* [High-Fidelity Zero-Shot Texture Anomaly Localization Using Feature Correspondence Analysis](http://arxiv.org/abs/2304.06433)\n* 图像异常检测\n  * [Attention Modules Improve Image-Level Anomaly Detection for Industrial Inspection: A DifferNet Case Study](https://openaccess.thecvf.com/content/WACV2024/papers/Vieira_e_Silva_Attention_Modules_Improve_Image-Level_Anomaly_Detection_for_Industrial_Inspection_A_WACV_2024_paper.pdf)\n  * [Contextual Affinity Distillation for Image Anomaly Detection](http://arxiv.org/abs/2307.03101)\n* 表面异常检测\n  * [Cheating Depth: Enhancing 3D Surface Anomaly Detection via Depth Simulation](http://arxiv.org/abs/2311.01117v1)\n* 图像异常定位\n  * [Learning Transferable Representations for Image Anomaly Localization Using Dense 
Pretraining](https://openaccess.thecvf.com/content/WACV2024/papers/He_Learning_Transferable_Representations_for_Image_Anomaly_Localization_Using_Dense_Pretraining_WACV_2024_paper.pdf)\n* 视觉异常检测\n  * [EfficientAD: Accurate Visual Anomaly Detection at Millisecond-Level Latencies](https://openaccess.thecvf.com/content/WACV2024/papers/Batzner_EfficientAD_Accurate_Visual_Anomaly_Detection_at_Millisecond-Level_Latencies_WACV_2024_paper.pdf)\n* 零样本异常检测\n  * [PromptAD: Zero-Shot Anomaly Detection Using Text Prompts](https://openaccess.thecvf.com/content/WACV2024/papers/Li_PromptAD_Zero-Shot_Anomaly_Detection_Using_Text_Prompts_WACV_2024_paper.pdf)\n* 轨迹异常检测\n  * [Holistic Representation Learning for Multitask Trajectory Anomaly Detection](http://arxiv.org/abs/2311.01851)\n* 人类行为理解\n  * [ENIGMA-51: Towards a Fine-Grained Understanding of Human Behavior in Industrial Scenarios](https://openaccess.thecvf.com/content/WACV2024/papers/Ragusa_ENIGMA-51_Towards_a_Fine-Grained_Understanding_of_Human_Behavior_in_Industrial_WACV_2024_paper.pdf)\n* OOD\n  * [HyperMix: Out-of-Distribution Detection and Classification in Few-Shot Settings](https://openaccess.thecvf.com/content/WACV2024/papers/Mehta_HyperMix_Out-of-Distribution_Detection_and_Classification_in_Few-Shot_Settings_WACV_2024_paper.pdf)\n  * [Out-of-Distribution Detection With Logical Reasoning](https://openaccess.thecvf.com/content/WACV2024/papers/Kirchheim_Out-of-Distribution_Detection_With_Logical_Reasoning_WACV_2024_paper.pdf)\n  * [ATS: Adaptive Temperature Scaling for Enhancing Out-of-Distribution Detection Methods](https://openaccess.thecvf.com/content/WACV2024/papers/Krumpl_ATS_Adaptive_Temperature_Scaling_for_Enhancing_Out-of-Distribution_Detection_Methods_WACV_2024_paper.pdf)\n\n\u003ca name=\"43\"/\u003e\n\n## 43.Image Fusion(图像融合)\n* [Bridging the Gap between Multi-focus and Multi-modal: A Focused Integration Framework for Multi-modal Image 
Fusion](http://arxiv.org/abs/2311.01886v1)\u003cbr\u003e:star:[code](https://github.com/ixilai/MFIF-MMIF)\n\n\u003ca name=\"42\"/\u003e\n\n## 42.Image Classification(图像分类)\n* [Semantic Generative Augmentations for Few-Shot Counting](https://arxiv.org/abs/2311.16122)\n* [Learning Quality Labels for Robust Image Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Wang_Learning_Quality_Labels_for_Robust_Image_Classification_WACV_2024_paper.pdf)\n* [Visual Narratives: Large-Scale Hierarchical Classification of Art-Historical Images](https://openaccess.thecvf.com/content/WACV2024/papers/Springstein_Visual_Narratives_Large-Scale_Hierarchical_Classification_of_Art-Historical_Images_WACV_2024_paper.pdf)\n* [Benchmark Generation Framework With Customizable Distortions for Image Classifier Robustness](http://arxiv.org/abs/2310.18626)\n* [Deep Subdomain Alignment for Cross-Domain Image Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Zhao_Deep_Subdomain_Alignment_for_Cross-Domain_Image_Classification_WACV_2024_paper.pdf)\n* [Online Class-Incremental Learning for Real-World Food Image Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Raghavan_Online_Class-Incremental_Learning_for_Real-World_Food_Image_Classification_WACV_2024_paper.pdf)\n* [An Empirical Investigation Into Benchmarking Model Multiplicity for Trustworthy Machine Learning: A Case Study on Image Classification](http://arxiv.org/abs/2311.14859)\n* [Letting 3D Guide the Way: 3D Guided 2D Few-Shot Image Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Chen_Letting_3D_Guide_the_Way_3D_Guided_2D_Few-Shot_Image_WACV_2024_paper.pdf)\n* 长尾视觉识别\n  * [Semantic Transfer From Head to Tail: Enlarging Tail Margin for Long-Tailed Visual Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Semantic_Transfer_From_Head_to_Tail_Enlarging_Tail_Margin_for_WACV_2024_paper.pdf)\n* 多标签图像分类\n  * [Discriminator-Free Unsupervised Domain 
Adaptation for Multi-Label Image Classification](http://arxiv.org/abs/2301.10611)\n  * [Active Batch Sampling for Multi-Label Classification With Binary User Feedback](https://openaccess.thecvf.com/content/WACV2024/papers/Goswami_Active_Batch_Sampling_for_Multi-Label_Classification_With_Binary_User_Feedback_WACV_2024_paper.pdf)\n* Few-Shot Classification\n  * [Domain Aligned CLIP for Few-shot Classification](http://arxiv.org/abs/2311.09191v1)\n  * [HELA-VFA: A Hellinger Distance-Attention-Based Feature Aggregation Network for Few-Shot Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_HELA-VFA_A_Hellinger_Distance-Attention-Based_Feature_Aggregation_Network_for_Few-Shot_Classification_WACV_2024_paper.pdf)\n* Multi-View Classification\n  * [Multi-View Classification Using Hybrid Fusion and Mutual Distillation](https://openaccess.thecvf.com/content/WACV2024/papers/Black_Multi-View_Classification_Using_Hybrid_Fusion_and_Mutual_Distillation_WACV_2024_paper.pdf)\n* Seagrass Classification\n  * [Image Labels Are All You Need for Coarse Seagrass Segmentation](http://arxiv.org/abs/2303.00973)\n* Fine-Grained Recognition\n  * [Elusive Images: Beyond Coarse Analysis for Fine-Grained Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Anderson_Elusive_Images_Beyond_Coarse_Analysis_for_Fine-Grained_Recognition_WACV_2024_paper.pdf)\n* Bird Species Classification\n  * [BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping](http://arxiv.org/abs/2310.19168)\n\n\u003ca name=\"41\"/\u003e\n\n## 41.Image Processing (Low-Level Image Processing, Quality Assessment)\n* Image Restoration\n  * [Conditional Velocity Score Estimation for Image Restoration](https://openaccess.thecvf.com/content/WACV2024/papers/Shi_Conditional_Velocity_Score_Estimation_for_Image_Restoration_WACV_2024_paper.pdf)\n  * [UGPNet: Universal Generative Prior for Image Restoration](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_UGPNet_Universal_Generative_Prior_for_Image_Restoration_WACV_2024_paper.pdf)\n  * [PAIR: Perception Aided Image Restoration for Natural Driving 
Conditions](https://openaccess.thecvf.com/content/WACV2024/papers/Shyam_PAIR_Perception_Aided_Image_Restoration_for_Natural_Driving_Conditions_WACV_2024_paper.pdf)\n  * [LatentDR: Improving Model Generalization Through Sample-Aware Latent Degradation and Restoration](http://arxiv.org/abs/2308.14596)\n  * [Efficient Layout-Guided Image Inpainting for Mobile Use](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Efficient_Layout-Guided_Image_Inpainting_for_Mobile_Use_WACV_2024_paper.pdf)\n* 图像修复\n  * [GraphFill: Deep Image Inpainting Using Graphs](https://openaccess.thecvf.com/content/WACV2024/papers/Verma_GraphFill_Deep_Image_Inpainting_Using_Graphs_WACV_2024_paper.pdf)\n  * [LatentPaint: Image Inpainting in Latent Space With Diffusion Models](https://openaccess.thecvf.com/content/WACV2024/papers/Corneanu_LatentPaint_Image_Inpainting_in_Latent_Space_With_Diffusion_Models_WACV_2024_paper.pdf)\n* 图像矫正\n  * [4K-Resolution Photo Exposure Correction at 125 FPS with ~8K Parameters](http://arxiv.org/abs/2311.08759v1)\u003cbr\u003e:star:[code](https://github.com/Zhou-Yijie/MSLTNet)\n* 图像增强\n  * 水下图像增强\n    * [PhISH-Net: Physics Inspired System for High Resolution Underwater Image Enhancement](https://openaccess.thecvf.com/content/WACV2024/papers/Chandrasekar_PhISH-Net_Physics_Inspired_System_for_High_Resolution_Underwater_Image_Enhancement_WACV_2024_paper.pdf)\n    * [Spectroformer: Multi-Domain Query Cascaded Transformer Network for Underwater Image Enhancement](https://openaccess.thecvf.com/content/WACV2024/papers/Khan_Spectroformer_Multi-Domain_Query_Cascaded_Transformer_Network_for_Underwater_Image_Enhancement_WACV_2024_paper.pdf)\n* 图像去噪\n  * [Self-Supervised Denoising Transformer With Gaussian Process](https://openaccess.thecvf.com/content/WACV2024/papers/Yasarla_Self-Supervised_Denoising_Transformer_With_Gaussian_Process_WACV_2024_paper.pdf)\n  * [Spiking Denoising Diffusion Probabilistic Models](http://arxiv.org/abs/2306.17046)\n  * [Image Denoising and the 
Generative Accumulation of Photons](http://arxiv.org/abs/2307.06607)\n  * [Fixed Pattern Noise Removal for Multi-View Single-Sensor Infrared Camera](https://openaccess.thecvf.com/content/WACV2024/papers/Barral_Fixed_Pattern_Noise_Removal_for_Multi-View_Single-Sensor_Infrared_Camera_WACV_2024_paper.pdf)\n  * [LIVENet: A Novel Network for Real-World Low-Light Image Denoising and Enhancement](https://openaccess.thecvf.com/content/WACV2024/papers/Makwana_LIVENet_A_Novel_Network_for_Real-World_Low-Light_Image_Denoising_and_WACV_2024_paper.pdf)\n* 图像去雾\n  * [C2AIR: Consolidated Compact Aerial Image Haze Removal](https://openaccess.thecvf.com/content/WACV2024/papers/Kulkarni_C2AIR_Consolidated_Compact_Aerial_Image_Haze_Removal_WACV_2024_paper.pdf)\n* 图像去闪光\n  * [Revolutionize the Oceanic Drone RGB Imagery With Pioneering Sun Glint Detection and Removal Techniques](https://openaccess.thecvf.com/content/WACV2024/papers/Qin_Revolutionize_the_Oceanic_Drone_RGB_Imagery_With_Pioneering_Sun_Glint_WACV_2024_paper.pdf)\n* 图像去反射\n  * [Fully-Automatic Reflection Removal for 360-Degree Images](https://openaccess.thecvf.com/content/WACV2024/papers/Park_Fully-Automatic_Reflection_Removal_for_360-Degree_Images_WACV_2024_paper.pdf)\n* 图像去模糊\n  * [Sharp-NeRF: Grid-Based Fast Deblurring Neural Radiance Fields Using Sharpness Prior](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_Sharp-NeRF_Grid-Based_Fast_Deblurring_Neural_Radiance_Fields_Using_Sharpness_Prior_WACV_2024_paper.pdf)\n  * [Deep Plug-and-Play Nighttime Non-Blind Deblurring With Saturated Pixel Handling Schemes](https://openaccess.thecvf.com/content/WACV2024/papers/Shu_Deep_Plug-and-Play_Nighttime_Non-Blind_Deblurring_With_Saturated_Pixel_Handling_Schemes_WACV_2024_paper.pdf)\n  * [Deblur-NSFF: Neural Scene Flow Fields for Blurry Dynamic Scenes](https://openaccess.thecvf.com/content/WACV2024/papers/Luthra_Deblur-NSFF_Neural_Scene_Flow_Fields_for_Blurry_Dynamic_Scenes_WACV_2024_paper.pdf)\n  * [Single-Image Deblurring, 
Trajectory and Shape Recovery of Fast Moving Objects With Denoising Diffusion Probabilistic Models](https://openaccess.thecvf.com/content/WACV2024/papers/Spetlik_Single-Image_Deblurring_Trajectory_and_Shape_Recovery_of_Fast_Moving_Objects_WACV_2024_paper.pdf)\n* 图像去阴影\n  * [Latent Feature-Guided Diffusion Models for Shadow Removal](http://arxiv.org/abs/2312.02156)\n* 图像质量评估\n  * [ARNIQA: Learning Distortion Manifold for Image Quality Assessment](http://arxiv.org/abs/2310.14918)\n  * [Learning Generalizable Perceptual Representations for Data-Efficient No-Reference Image Quality Assessment](https://arxiv.org/abs/2312.04838)\n  * [Opinion Unaware Image Quality Assessment via Adversarial Convolutional Variational Autoencoder](https://openaccess.thecvf.com/content/WACV2024/papers/Shukla_Opinion_Unaware_Image_Quality_Assessment_via_Adversarial_Convolutional_Variational_Autoencoder_WACV_2024_paper.pdf)\n* 图像颜色编辑\n  * [Content-Aware Image Color Editing With Auxiliary Color Restoration Tasks](https://openaccess.thecvf.com/content/WACV2024/papers/Ren_Content-Aware_Image_Color_Editing_With_Auxiliary_Color_Restoration_Tasks_WACV_2024_paper.pdf)\n  * [Real-Time User-Guided Adaptive Colorization With Vision Transformer](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_Real-Time_User-Guided_Adaptive_Colorization_With_Vision_Transformer_WACV_2024_paper.pdf)\n  * 再着色\n    * [Latent-Guided Exemplar-Based Image Re-Colorization](https://openaccess.thecvf.com/content/WACV2024/papers/Yang_Latent-Guided_Exemplar-Based_Image_Re-Colorization_WACV_2024_paper.pdf)\n\n\u003ca name=\"40\"/\u003e\n\n## 40.Self/Semi-supervised learning\n* 无监督学习\n  * [United We Stand, Divided We Fall: UnityGraph for Unsupervised Procedure Learning from Videos](http://arxiv.org/abs/2311.03550v1)\n  * [FELGA: Unsupervised Fragment Embedding for Fine-Grained Cross-Modal 
Association](https://openaccess.thecvf.com/content/WACV2024/papers/Zhuo_FELGA_Unsupervised_Fragment_Embedding_for_Fine-Grained_Cross-Modal_Association_WACV_2024_paper.pdf)\n* 半监督学习\n  * [SequenceMatch: Revisiting the design of weak-strong augmentations for Semi-supervised learning](https://arxiv.org/abs/2310.15787)\u003cbr\u003e:star:[code](https://github.com/beandkay/SequenceMatch)\n  * [Debiasing, calibrating, and improving Semi-supervised Learning performance via simple Ensemble Projector](https://arxiv.org/abs/2310.15764)\u003cbr\u003e:star:[code](https://github.com/beandkay/EPASS)\n  * [Universal Semi-Supervised Model Adaptation via Collaborative Consistency Training](http://arxiv.org/abs/2307.03449)\n  * [Improving Open-Set Semi-Supervised Learning With Self-Supervision](http://arxiv.org/abs/2301.10127)\n  * [Appearance-Based Curriculum for Semi-Supervised Learning With Multi-Angle Unlabeled Data](https://openaccess.thecvf.com/content/WACV2024/papers/Tanaka_Appearance-Based_Curriculum_for_Semi-Supervised_Learning_With_Multi-Angle_Unlabeled_Data_WACV_2024_paper.pdf)\n* 自监督学习\n  * [Self-Supervised Learning of Semantic Correspondence Using Web Videos](https://openaccess.thecvf.com/content/WACV2024/papers/Kwon_Self-Supervised_Learning_of_Semantic_Correspondence_Using_Web_Videos_WACV_2024_paper.pdf)\n  * [CycleCL: Self-supervised Learning for Periodic Videos](http://arxiv.org/abs/2311.03402v1)\n  * [Self-Supervised Representation Learning With Cross-Context Learning Between Global and Hypercolumn Features](http://arxiv.org/abs/2308.13392)\n  * [Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction](http://arxiv.org/abs/2311.04834v1)\u003cbr\u003e:star:[code](https://github.com/deeplab-ai/SelfSupervisedVRD)\n  * [Self-Supervised Learning for Place Representation Generalization Across Appearance 
Changes](https://openaccess.thecvf.com/content/WACV2024/papers/Musallam_Self-Supervised_Learning_for_Place_Representation_Generalization_Across_Appearance_Changes_WACV_2024_paper.pdf)\n  * [Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where](http://arxiv.org/abs/2309.12757)\n  * [MGM-AE: Self-Supervised Learning on 3D Shape Using Mesh Graph Masked Autoencoders](https://openaccess.thecvf.com/content/WACV2024/papers/Yang_MGM-AE_Self-Supervised_Learning_on_3D_Shape_Using_Mesh_Graph_Masked_WACV_2024_paper.pdf)\n\n\u003ca name=\"39\"/\u003e\n\n## 39.Few/Zero-Shot Learning/Domain Generalization/Adaptation(小/零样本/域泛化/域适应)\n* 零样本学习\n  * [GIPCOL: Graph-Injected Soft Prompting for Compositional Zero-Shot Learning](http://arxiv.org/abs/2311.05729v1)\n  * [Meta-Learned Attribute Self-Interaction Network for Continual and Generalized Zero-Shot Learning](http://arxiv.org/abs/2312.01167v1)\n  * [CAILA: Concept-Aware Intra-Layer Adapters for Compositional Zero-Shot Learning](http://arxiv.org/abs/2305.16681)\n* 小样本学习\n  * [Adaptive Manifold for Imbalanced Transductive Few-Shot Learning](http://arxiv.org/abs/2304.14281)\n  * [Hyperbolic vs Euclidean Embeddings in Few-Shot Learning: Two Sides of the Same Coin](https://arxiv.org/abs/2309.10013)\n* DG\n  * [Learning Class and Domain Augmentations for Single-Source Open-Domain Generalization](http://arxiv.org/abs/2311.02599v1)\n  * [On the Fly Neural Style Smoothing for Risk-Averse Domain Generalization](http://arxiv.org/abs/2307.08551)\n  * [Domain Generalization With Correlated Style Uncertainty](http://arxiv.org/abs/2212.09950)\n  * [Randomized Adversarial Style Perturbations for Domain Generalization](http://arxiv.org/abs/2304.01959)\n  * [Domain Generalisation via Risk Distribution Matching](http://arxiv.org/abs/2310.18598)\n  * [Domain Generalization by Rejecting Extreme 
Augmentations](https://openaccess.thecvf.com/content/WACV2024/papers/Aminbeidokhti_Domain_Generalization_by_Rejecting_Extreme_Augmentations_WACV_2024_paper.pdf)\n  * [Single Domain Generalization via Normalised Cross-Correlation Based Convolutions](http://arxiv.org/abs/2307.05901)\n  * [STYLIP: Multi-Scale Style-Conditioned Prompt Learning for CLIP-Based Domain Generalization](http://arxiv.org/abs/2302.09251)\n* DA\n  * [Gradual Source Domain Expansion for Unsupervised Domain Adaptation](http://arxiv.org/abs/2311.09599v1)\n  * [Continual Test-Time Domain Adaptation via Dynamic Sample Selection](http://arxiv.org/abs/2301.10611)\n  * [Bridging Generalization Gaps in High Content Imaging Through Online Self-Supervised Domain Adaptation](https://arxiv.org/abs/2311.12623)\u003cbr\u003e:star:[code](https://github.com/cfredinh/coda)\n  * [GLAD: Global-Local View Alignment and Background Debiasing for Unsupervised Video Domain Adaptation with Large Domain Gap](https://arxiv.org/abs/2311.12467)\u003cbr\u003e:star:[code](https://github.com/KHU-VLL/GLAD)\n  * [Aligning Non-Causal Factors for Transformer-Based Source-Free Domain Adaptation](https://arxiv.org/abs/2311.16294)\u003cbr\u003e:house:[project](https://val.cds.iisc.ac.in/C-SFTrans/)\n  * [Robust Unsupervised Domain Adaptation Through Negative-View Regularization](https://openaccess.thecvf.com/content/WACV2024/papers/Jang_Robust_Unsupervised_Domain_Adaptation_Through_Negative-View_Regularization_WACV_2024_paper.pdf)\n  * [ReCLIP: Refine Contrastive Language Image Pre-Training With Source Free Domain Adaptation](http://arxiv.org/abs/2308.03793)\n  * [Stochastic Binary Network for Universal Domain Adaptation](https://openaccess.thecvf.com/content/WACV2024/papers/Jain_Stochastic_Binary_Network_for_Universal_Domain_Adaptation_WACV_2024_paper.pdf)\n  * [D3GU: Multi-Target Active Domain Adaptation via Enhancing Domain 
Alignment](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_D3GU_Multi-Target_Active_Domain_Adaptation_via_Enhancing_Domain_Alignment_WACV_2024_paper.pdf)\n  * [Feed-Forward Latent Domain Adaptation](https://openaccess.thecvf.com/content/WACV2024/papers/Bohdal_Feed-Forward_Latent_Domain_Adaptation_WACV_2024_paper.pdf)\n\n\u003ca name=\"38\"/\u003e\n\n## 38.Visual Representation Learning\n* [Group-Wise Contrastive Bottleneck for Weakly-Supervised Visual Representation Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Yap_Group-Wise_Contrastive_Bottleneck_for_Weakly-Supervised_Visual_Representation_Learning_WACV_2024_paper.pdf)\n\n\u003ca name=\"37\"/\u003e\n\n## 37.Machine Learning(机器学习)\n* 元学习\n  * [SigmML: Metric Meta-Learning for Writer Independent Offline Signature Verification in the Space of SPD Matrices](https://openaccess.thecvf.com/content/WACV2024/papers/Giazitzis_SigmML_Metric_Meta-Learning_for_Writer_Independent_Offline_Signature_Verification_in_WACV_2024_paper.pdf)\n* 持续学习/增量学习\n  * [MoP-CLIP: A Mixture of Prompt-Tuned CLIP Models for Domain Incremental Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Nicolas_MoP-CLIP_A_Mixture_of_Prompt-Tuned_CLIP_Models_for_Domain_Incremental_WACV_2024_paper.pdf)\n  * [Efficient Expansion and Gradient Based Task Inference for Replay Free Incremental Learning](http://arxiv.org/abs/2312.01188v1)\n  * 类增量\n    * [Expanding Hyperspherical Space for Few-Shot Class-Incremental Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Deng_Expanding_Hyperspherical_Space_for_Few-Shot_Class-Incremental_Learning_WACV_2024_paper.pdf)\n    * [Overcoming Catastrophic Forgetting for Multi-Label Class-Incremental Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Song_Overcoming_Catastrophic_Forgetting_for_Multi-Label_Class-Incremental_Learning_WACV_2024_paper.pdf)\n    * [An Analysis of Initial Training Strategies for Exemplar-Free Class-Incremental 
Learning](http://arxiv.org/abs/2308.11677)\n    * [Wakening Past Concepts without Past Data: Class-Incremental Learning from Online Placebos](http://arxiv.org/abs/2310.16115v1)\u003cbr\u003e:star:[code](https://github.com/yaoyao-liu/online-placebos)\n    * [Robust Feature Learning and Global Variance-Driven Classifier Alignment for Long-Tail Class Incremental Learning](http://arxiv.org/abs/2311.01227v1)\n    * [TCP: Triplet Contrastive-Relationship Preserving for Class-Incremental Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Li_TCP_Triplet_Contrastive-Relationship_Preserving_for_Class-Incremental_Learning_WACV_2024_paper.pdf)\n    * [MICS: Midpoint Interpolation To Learn Compact and Separated Representations for Few-Shot Class-Incremental Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_MICS_Midpoint_Interpolation_To_Learn_Compact_and_Separated_Representations_for_WACV_2024_paper.pdf)\n  * CL\n    * [Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning](https://arxiv.org/abs/2309.06086)\n    * [Kaizen: Practical Self-Supervised Continual Learning With Continual Fine-Tuning](https://openaccess.thecvf.com/content/WACV2024/papers/Tang_Kaizen_Practical_Self-Supervised_Continual_Learning_With_Continual_Fine-Tuning_WACV_2024_paper.pdf)\n    * [Evolve: Enhancing Unsupervised Continual Learning With Multiple Experts](https://openaccess.thecvf.com/content/WACV2024/papers/Yu_Evolve_Enhancing_Unsupervised_Continual_Learning_With_Multiple_Experts_WACV_2024_paper.pdf)\n    * [Steering Prototypes With Prompt-Tuning for Rehearsal-Free Continual Learning](http://arxiv.org/abs/2303.09447)\n* 度量学习/Metric Learning\n  * [ProcSim: Proxy-based Confidence for Robust Similarity Learning](http://arxiv.org/abs/2311.00668v1)\n  * [Deep Metric Learning With Chance Constraints](https://openaccess.thecvf.com/content/WACV2024/papers/Gurbuz_Deep_Metric_Learning_With_Chance_Constraints_WACV_2024_paper.pdf)\n  * [Understanding 
Hyperbolic Metric Learning Through Hard Negative Sampling](https://openaccess.thecvf.com/content/WACV2024/papers/Yue_Understanding_Hyperbolic_Metric_Learning_Through_Hard_Negative_Sampling_WACV_2024_paper.pdf)\n* 对抗学习\n  * [Army of Thieves: Enhancing Black-Box Model Extraction via Ensemble based sample selection](http://arxiv.org/abs/2311.04588v1)\n  * 对抗攻击\n    * [Hard-Label Based Small Query Black-Box Adversarial Attack](https://openaccess.thecvf.com/content/WACV2024/papers/Park_Hard-Label_Based_Small_Query_Black-Box_Adversarial_Attack_WACV_2024_paper.pdf) \n  * 后门\n    * [A Closer Look at Robustness of Vision Transformers to Backdoor Attacks](https://openaccess.thecvf.com/content/WACV2024/papers/Subramanya_A_Closer_Look_at_Robustness_of_Vision_Transformers_to_Backdoor_WACV_2024_paper.pdf)\n* 主动学习\n  * [Training Ensembles With Inliers and Outliers for Semi-Supervised Active Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Stojnic_Training_Ensembles_With_Inliers_and_Outliers_for_Semi-Supervised_Active_Learning_WACV_2024_paper.pdf)\n  * [Active Learning With Task Consistency and Diversity in Multi-Task Networks](https://openaccess.thecvf.com/content/WACV2024/papers/Hekimoglu_Active_Learning_With_Task_Consistency_and_Diversity_in_Multi-Task_Networks_WACV_2024_paper.pdf)\n  * [Critical Gap Between Generalization Error and Empirical Error in Active Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Kanebako_Critical_Gap_Between_Generalization_Error_and_Empirical_Error_in_Active_WACV_2024_paper.pdf)\n* 联邦学习\n  * [Gradient Coreset for Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Sivasubramanian_Gradient_Coreset_for_Federated_Learning_WACV_2024_paper.pdf)\n  * [Late to the Party? 
On-Demand Unlabeled Personalized Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Amosy_Late_to_the_Party_On-Demand_Unlabeled_Personalized_Federated_Learning_WACV_2024_paper.pdf)\n  * [MetaVers: Meta-Learned Versatile Representations for Personalized Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Lim_MetaVers_Meta-Learned_Versatile_Representations_for_Personalized_Federated_Learning_WACV_2024_paper.pdf)\n  * [Maximum Knowledge Orthogonality Reconstruction With Gradients in Federated Learning](http://arxiv.org/abs/2310.19222)\n  * [Minimizing Layerwise Activation Norm Improves Generalization in Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Yashwanth_Minimizing_Layerwise_Activation_Norm_Improves_Generalization_in_Federated_Learning_WACV_2024_paper.pdf)\n  * [TransFed: A Way To Epitomize Focal Modulation Using Transformer-Based Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Ashraf_TransFed_A_Way_To_Epitomize_Focal_Modulation_Using_Transformer-Based_Federated_WACV_2024_paper.pdf)\n  * [Mixing Gradients in Neural Networks as a Strategy To Enhance Privacy in Federated Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Eloul_Mixing_Gradients_in_Neural_Networks_as_a_Strategy_To_Enhance_WACV_2024_paper.pdf)\n* 对比学习\n  * [Activity-Based Early Autism Diagnosis Using a Multi-Dataset Supervised Contrastive Learning Approach](https://openaccess.thecvf.com/content/WACV2024/papers/Rani_Activity-Based_Early_Autism_Diagnosis_Using_a_Multi-Dataset_Supervised_Contrastive_Learning_WACV_2024_paper.pdf)\n  * [Distortion-Disentangled Contrastive Learning](http://arxiv.org/abs/2303.05066)\n  * [OOD Aware Supervised Contrastive Learning](http://arxiv.org/abs/2310.01942)\n* 强化学习\n  * [CryoRL: Reinforcement Learning Enables Efficient Cryo-EM Data Collection](http://arxiv.org/abs/2204.07543)\n* 迁移学习\n  * [DR10K: Transfer Learning Using Weak Labels for Grading Diabetic 
Retinopathy on DR10K Dataset](https://openaccess.thecvf.com/content/WACV2024/papers/ElHabebe_DR10K_Transfer_Learning_Using_Weak_Labels_for_Grading_Diabetic_Retinopathy_WACV_2024_paper.pdf)\n* 多任务学习\n  * [BigSmall: Efficient Multi-Task Learning for Disparate Spatial and Temporal Physiological Measurements](http://arxiv.org/abs/2303.11573)\n\n\u003ca name=\"36\"/\u003e\n\n## 36.NLP\n* [Few-Shot Event Classification in Images Using Knowledge Graphs for Prompting](https://openaccess.thecvf.com/content/WACV2024/papers/Tahmasebzadeh_Few-Shot_Event_Classification_in_Images_Using_Knowledge_Graphs_for_Prompting_WACV_2024_paper.pdf)\n\n\u003ca name=\"35\"/\u003e\n\n## 35.Model Compression/Knowledge Distillation/Pruning(模型压缩/知识蒸馏/剪枝)\n* [Wino Vidi Vici: Conquering Numerical Instability of 8-Bit Winograd Convolution for Accurate Inference Acceleration on Edge](https://openaccess.thecvf.com/content/WACV2024/papers/Mori_Wino_Vidi_Vici_Conquering_Numerical_Instability_of_8-Bit_Winograd_Convolution_WACV_2024_paper.pdf)\n* 量化\n  * [Reducing the Side-Effects of Oscillations in Training of Quantized YOLO Networks](http://arxiv.org/abs/2311.05109v1)\n  * [Improved Techniques for Quantizing Deep Networks With Adaptive Bit-Widths](http://arxiv.org/abs/2103.01435)\n  * [Evidential Uncertainty Quantification: A Variance-Based Perspective](https://arxiv.org/abs/2311.11367)\n  * [Edge Inference With Fully Differentiable Quantized Mixed Precision Neural Networks](http://arxiv.org/abs/2206.07741)\n* 剪枝\n  * [Token Fusion: Bridging the Gap between Token Pruning and Token Merging](http://arxiv.org/abs/2312.01026v1)\n  * [Torque Based Structured Pruning for Deep Neural Network](https://openaccess.thecvf.com/content/WACV2024/papers/Gupta_Torque_Based_Structured_Pruning_for_Deep_Neural_Network_WACV_2024_paper.pdf)\n  * [Pruning From Scratch via Shared Pruning Module and Nuclear Norm-Based 
Regularization](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_Pruning_From_Scratch_via_Shared_Pruning_Module_and_Nuclear_Norm-Based_WACV_2024_paper.pdf)\n  * [Towards Better Structured Pruning Saliency by Reorganizing Convolution](https://openaccess.thecvf.com/content/WACV2024/papers/Sun_Towards_Better_Structured_Pruning_Saliency_by_Reorganizing_Convolution_WACV_2024_paper.pdf)\n  * [PATROL: Privacy-Oriented Pruning for Collaborative Inference Against Model Inversion Attacks](http://arxiv.org/abs/2307.10981)\n* KD\n  * [Frequency Attention for Knowledge Distillation](https://openaccess.thecvf.com/content/WACV2024/papers/Pham_Frequency_Attention_for_Knowledge_Distillation_WACV_2024_paper.pdf)\n  * [Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-Free Continual Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Szatkowski_Adapt_Your_Teacher_Improving_Knowledge_Distillation_for_Exemplar-Free_Continual_Learning_WACV_2024_paper.pdf)\n  * [Towards Domain-Aware Knowledge Distillation for Continual Model Generalization](https://openaccess.thecvf.com/content/WACV2024/papers/Reddy_Towards_Domain-Aware_Knowledge_Distillation_for_Continual_Model_Generalization_WACV_2024_paper.pdf)\n  * [Reverse Knowledge Distillation: Training a Large Model Using a Small One for Retinal Image Matching on Limited Data](http://arxiv.org/abs/2307.10698)\n\n\u003ca name=\"34\"/\u003e\n\n## 34.NAS\n* [FLORA: Fine-grained Low-Rank Architecture Search for Vision Transformer](http://arxiv.org/abs/2311.03912v1)\u003cbr\u003e:star:[code](https://github.com/shadowpa0327/FLORA)\n* [Hardware Aware Evolutionary Neural Architecture Search Using Representation Similarity Metric](http://arxiv.org/abs/2311.03923)\n\n\u003ca name=\"33\"/\u003e\n\n## 33.Optical Flow Estimation(光流估计)\n* [Detection Defenses: An Empty Promise against Adversarial Patch Attacks on Optical 
Flow](http://arxiv.org/abs/2310.17403v1)\u003cbr\u003e:star:[code](https://github.com/cv-stuttgart/DetectionDefenses)\n* [CCMR: High Resolution Optical Flow Estimation via Coarse-to-Fine Context-Guided Motion Reasoning](http://arxiv.org/abs/2311.02661v1)\u003cbr\u003e:star:[code](https://github.com/cv-stuttgart)\n\n\u003ca name=\"32\"/\u003e\n\n## 32.Scene Flow Estimation(场景流估计)\n* [OptFlow: Fast Optimization-Based Scene Flow Estimation Without Supervision](https://openaccess.thecvf.com/content/WACV2024/papers/Ahuja_OptFlow_Fast_Optimization-Based_Scene_Flow_Estimation_Without_Supervision_WACV_2024_paper.pdf)\n\n\u003ca name=\"31\"/\u003e\n\n## 31.Automated Driving(自动驾驶)\n* 车道线检测\n  * [CLRerNet: Improving Confidence of Lane Detection With LaneIoU](http://arxiv.org/abs/2305.08366)\n* 自动驾驶\n  * [Re-Evaluating LiDAR Scene Flow for Autonomous Driving](https://arxiv.org/abs/2304.02150)\n  * [NVAutoNet: Fast and Accurate 360deg 3D Visual Perception for Self Driving](https://openaccess.thecvf.com/content/WACV2024/papers/Pham_NVAutoNet_Fast_and_Accurate_360deg_3D_Visual_Perception_for_Self_WACV_2024_paper.pdf)\n  * [Driving Through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving](http://arxiv.org/abs/2310.16639)\n  * [StreamMapNet: Streaming Mapping Network for Vectorized Online HD Map Construction](https://arxiv.org/abs/2308.12570)\u003cbr\u003e:star:[code](https://github.com/yuantianyuan01/StreamMapNet)\n* 驾驶员损伤评估\n  * [Estimating Blood Alcohol Level Through Facial Features for Driver Impairment Assessment](https://openaccess.thecvf.com/content/WACV2024/papers/Keshtkaran_Estimating_Blood_Alcohol_Level_Through_Facial_Features_for_Driver_Impairment_WACV_2024_paper.pdf)\n* 交通标志检测\n  * [Natural Light Can Also Be Dangerous: Traffic Sign Misinterpretation Under Adversarial Natural Light 
Attacks](https://openaccess.thecvf.com/content/WACV2024/papers/Hsiao_Natural_Light_Can_Also_Be_Dangerous_Traffic_Sign_Misinterpretation_Under_WACV_2024_paper.pdf)\n* 障碍物检测\n  * [Have We Ever Encountered This Before? Retrieving Out-of-Distribution Road Obstacles From Driving Scenes](http://arxiv.org/abs/2309.04302)\n* 驾驶员动作意图识别\n  * [Evaluation of Video Masked Autoencoders' Performance and Uncertainty Estimations for Driver Action and Intention Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Vellenga_Evaluation_of_Video_Masked_Autoencoders_Performance_and_Uncertainty_Estimations_for_WACV_2024_paper.pdf)\n\n\u003ca name=\"30\"/\u003e\n\n## 30.GNN/GCN\n* GNN \n  * [Automated Camera Calibration via Homography Estimation With GNNs](http://arxiv.org/abs/2311.02598)\n  * [RIMeshGNN: A Rotation-Invariant Graph Neural Network for Mesh Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Shakibajahromi_RIMeshGNN_A_Rotation-Invariant_Graph_Neural_Network_for_Mesh_Classification_WACV_2024_paper.pdf)\n* 图网络\n  * [Improving Graph Networks Through Selection-Based Convolution](https://openaccess.thecvf.com/content/WACV2024/papers/Hart_Improving_Graph_Networks_Through_Selection-Based_Convolution_WACV_2024_paper.pdf)\n\n\u003ca name=\"29\"/\u003e\n\n## 29.Scene Graph Generation(场景图生成)\n* [Self-Supervised Relation Alignment for Scene Graph Generation](http://arxiv.org/abs/2302.01403)\n* [Refine and Redistribute: Multi-Domain Fusion and Dynamic Label Assignment for Unbiased Scene Graph Generation](https://openaccess.thecvf.com/content/WACV2024/papers/Zang_Refine_and_Redistribute_Multi-Domain_Fusion_and_Dynamic_Label_Assignment_for_WACV_2024_paper.pdf)\n\n\u003ca name=\"28\"/\u003e\n\n## 28.Point-Cloud(点云)\n* [MAELi: Masked Autoencoder for Large-Scale LiDAR Point Clouds](http://arxiv.org/abs/2212.07207)\n* [Cross-Domain Few-Shot Incremental Learning for Point-Cloud 
Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Tan_Cross-Domain_Few-Shot_Incremental_Learning_for_Point-Cloud_Recognition_WACV_2024_paper.pdf)\n* [Sparse Convolutional Networks for Surface Reconstruction From Noisy Point Clouds](https://openaccess.thecvf.com/content/WACV2024/papers/Wang_Sparse_Convolutional_Networks_for_Surface_Reconstruction_From_Noisy_Point_Clouds_WACV_2024_paper.pdf)\n* [LidarCLIP or: How I Learned To Talk to Point Clouds](http://arxiv.org/abs/2212.06858)\n* [FinderNet: A Data Augmentation Free Canonicalization Aided Loop Detection and Closure Technique for Point Clouds in 6-DOF Separation](https://openaccess.thecvf.com/content/WACV2024/papers/Harithas_FinderNet_A_Data_Augmentation_Free_Canonicalization_Aided_Loop_Detection_and_WACV_2024_paper.pdf)\n* [Indoor Visual Localization Using Point and Line Correspondences in Dense Colored Point Cloud](https://openaccess.thecvf.com/content/WACV2024/papers/Matsumoto_Indoor_Visual_Localization_Using_Point_and_Line_Correspondences_in_Dense_WACV_2024_paper.pdf)\n* [SSP: Semi-Signed Prioritized Neural Fitting for Surface Reconstruction From Unoriented Point Clouds](https://openaccess.thecvf.com/content/WACV2024/papers/Zhu_SSP_Semi-Signed_Prioritized_Neural_Fitting_for_Surface_Reconstruction_From_Unoriented_WACV_2024_paper.pdf)\n* 3D 点云\n  * [Synergizing Contrastive Learning and Optimal Transport for 3D Point Cloud Domain Adaptation](http://arxiv.org/abs/2308.14126)\n* 点云配准\n  * [MagneticPillars: Efficient Point Cloud Registration Through Hierarchized Birds-Eye-View Cell Correspondence Refinement](https://openaccess.thecvf.com/content/WACV2024/papers/Fischer_MagneticPillars_Efficient_Point_Cloud_Registration_Through_Hierarchized_Birds-Eye-View_Cell_Correspondence_WACV_2024_paper.pdf)\n  * [HDMNet: A Hierarchical Matching Network With Double Attention for Large-Scale Outdoor LiDAR Point Cloud Registration](http://arxiv.org/abs/2310.18874)\n* 点云补全\n  * [WalkFormer: Point Cloud Completion via 
Guided Walks](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_WalkFormer_Point_Cloud_Completion_via_Guided_Walks_WACV_2024_paper.pdf)\n* 点云分割\n  * [When 3D Bounding-Box Meets SAM: Point Cloud Instance Segmentation With Weak-and-Noisy Supervision](http://arxiv.org/abs/2309.00828)\n  * [PointCT: Point Central Transformer Network for Weakly-Supervised Point Cloud Semantic Segmentation](https://openaccess.thecvf.com/content/WACV2024/papers/Tran_PointCT_Point_Central_Transformer_Network_for_Weakly-Supervised_Point_Cloud_Semantic_WACV_2024_paper.pdf)\n* 点云分类\n  * [SimpliMix: A Simplified Manifold Mixup for Few-Shot Point Cloud Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Yang_SimpliMix_A_Simplified_Manifold_Mixup_for_Few-Shot_Point_Cloud_Classification_WACV_2024_paper.pdf)\n\n\u003ca name=\"27\"/\u003e\n\n## 27.Human-Object Interactions(人物交互)\n* [Exploiting CLIP for Zero-Shot HOI Detection Requires Knowledge Distillation at Multiple Levels](http://arxiv.org/abs/2309.05069)\n* [Task-Oriented Human-Object Interactions Generation With Implicit Neural Representations](http://arxiv.org/abs/2303.13129)\n* [Beyond Active Learning: Leveraging the Full Potential of Human Interaction via Auto-Labeling, Human Correction, and Human Verification](http://arxiv.org/abs/2306.01277)\n* [Bipartite Graph Diffusion Model for Human Interaction Generation](http://arxiv.org/abs/2301.10134)\n\n\u003ca name=\"26\"/\u003e\n\n## 26.Human Motion Prediction(人体运动预测)\n* [Incorporating Physics Principles for Precise Human Motion Prediction](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Incorporating_Physics_Principles_for_Precise_Human_Motion_Prediction_WACV_2024_paper.pdf)\n* [Context-Based Interpretable Spatio-Temporal Graph Convolutional Network for Human Motion 
Forecasting](https://openaccess.thecvf.com/content/WACV2024/papers/Medina_Context-Based_Interpretable_Spatio-Temporal_Graph_Convolutional_Network_for_Human_Motion_Forecasting_WACV_2024_paper.pdf)\n* 人体运动合成\n  * [MotionGPT: Human Motion Synthesis With Improved Diversity and Realism via GPT-3 Prompting](https://openaccess.thecvf.com/content/WACV2024/papers/Ribeiro-Gomes_MotionGPT_Human_Motion_Synthesis_With_Improved_Diversity_and_Realism_via_WACV_2024_paper.pdf)\n\n\u003ca name=\"25\"/\u003e\n\n## 25.Multimodal(多模态)\n* [Dynamic Multimodal Information Bottleneck for Multimodality Classification](http://arxiv.org/abs/2311.01066v1)\u003cbr\u003e:star:[code](https://github.com/BII-wushuang/DMIB)\n* [CoD: Coherent Detection of Entities From Images With Multiple Modalities](https://openaccess.thecvf.com/content/WACV2024/papers/Verma_CoD_Coherent_Detection_of_Entities_From_Images_With_Multiple_Modalities_WACV_2024_paper.pdf)\n* [Multimodal Deep Learning for Remote Stress Estimation Using CCT-LSTM](https://openaccess.thecvf.com/content/WACV2024/papers/Ziaratnia_Multimodal_Deep_Learning_for_Remote_Stress_Estimation_Using_CCT-LSTM_WACV_2024_paper.pdf)\n* [Enhancing Multimodal Compositional Reasoning of Visual Language Models with Generative Negative Mining](https://arxiv.org/abs/2311.03964)\u003cbr\u003e:star:[code](https://github.com/ugorsahin/Generative-Negative-Mining)\n* [OmniVec: Learning robust representations with cross modal sharing](http://arxiv.org/abs/2311.05709v1)\n* [Complementary-Contradictory Feature Regularization Against Multimodal Overfitting](https://openaccess.thecvf.com/content/WACV2024/papers/Tejero-de-Pablos_Complementary-Contradictory_Feature_Regularization_Against_Multimodal_Overfitting_WACV_2024_paper.pdf)\u003cbr\u003e:star:[code](https://github.com/CyberAgentAILab/CM-VQVAE)\n* [Learning Intra-Class Multimodal Distributions With Orthonormal 
Matrices](https://openaccess.thecvf.com/content/WACV2024/papers/Goto_Learning_Intra-Class_Multimodal_Distributions_With_Orthonormal_Matrices_WACV_2024_paper.pdf)\n* [EASUM: Enhancing Affective State Understanding Through Joint Sentiment and Emotion Modeling for Multimodal Tasks](https://openaccess.thecvf.com/content/WACV2024/papers/Hwang_EASUM_Enhancing_Affective_State_Understanding_Through_Joint_Sentiment_and_Emotion_WACV_2024_paper.pdf)\n* CLIP\n  * [C-CLIP: Contrastive Image-Text Encoders To Close the Descriptive-Commentative Gap](https://openaccess.thecvf.com/content/WACV2024/papers/Theisen_C-CLIP_Contrastive_Image-Text_Encoders_To_Close_the_Descriptive-Commentative_Gap_WACV_2024_paper.pdf)\n  * [DiffCLIP: Leveraging Stable Diffusion for Language Grounded 3D Classification](http://arxiv.org/abs/2305.15957)\n  * [ClipSitu: Effectively Leveraging CLIP for Conditional Predictions in Situation Recognition](http://arxiv.org/abs/2307.00586)\n\n\n\u003ca name=\"24\"/\u003e\n\n## 24.Large Language Models(大语言模型)\n* [Zero-Shot Building Attribute Extraction From Large-Scale Vision and Language Models](https://openaccess.thecvf.com/content/WACV2024/papers/Pan_Zero-Shot_Building_Attribute_Extraction_From_Large-Scale_Vision_and_Language_Models_WACV_2024_paper.pdf)\n\n\u003ca name=\"23\"/\u003e\n\n## 23.Vision-Language(视觉语言)\n* [Multitask Vision-Language Prompt Tuning](http://arxiv.org/abs/2211.11720)\n* [Improving Fairness Using Vision-Language Driven Image Augmentation](https://openaccess.thecvf.com/content/WACV2024/papers/DInca_Improving_Fairness_Using_Vision-Language_Driven_Image_Augmentation_WACV_2024_paper.pdf)\n* [Empowering Unsupervised Domain Adaptation With Large-Scale Pre-Trained Vision-Language Models](https://openaccess.thecvf.com/content/WACV2024/papers/Lai_Empowering_Unsupervised_Domain_Adaptation_With_Large-Scale_Pre-Trained_Vision-Language_Models_WACV_2024_paper.pdf)\n* [Can Vision-Language Models Be a Good Guesser? 
Exploring VLMs for Times and Location Reasoning](http://arxiv.org/abs/2307.06166)\n* [Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding](https://arxiv.org/abs/2309.00215)\n* [Improving Vision-and-Language Reasoning via Spatial Relations Modeling](http://arxiv.org/abs/2311.05298)\n* [MIVC: Multiple Instance Visual Component for Visual-Language Models](https://openaccess.thecvf.com/content/WACV2024/papers/Wu_MIVC_Multiple_Instance_Visual_Component_for_Visual-Language_Models_WACV_2024_paper.pdf)\n\n\n\u003ca name=\"22\"/\u003e\n\n## 22.Visual Question Answering(视觉问答)\n* [RankDVQA: Deep VQA Based on Ranking-Inspired Hybrid Training](http://arxiv.org/abs/2202.08595)\n* [POP-VQA - Privacy Preserving, On-Device, Personalized Visual Question Answering](https://openaccess.thecvf.com/content/WACV2024/papers/Sahu_POP-VQA_-_Privacy_Preserving_On-Device_Personalized_Visual_Question_Answering_WACV_2024_paper.pdf)\n* [Benchmarking Out-of-Distribution Detection in Visual Question Answering](https://openaccess.thecvf.com/content/WACV2024/papers/Shi_Benchmarking_Out-of-Distribution_Detection_in_Visual_Question_Answering_WACV_2024_paper.pdf)\n* [Can You Even Tell Left From Right? 
Presenting a New Challenge for VQA](https://openaccess.thecvf.com/content/WACV2024/papers/Venkataraman_Can_You_Even_Tell_Left_From_Right_Presenting_a_New_WACV_2024_paper.pdf)\n* Visual Dialog\n  * [VD-GR: Boosting Visual Dialog with Cascaded Spatial-Temporal Multi-Modal GRaphs](http://arxiv.org/abs/2310.16590v1)\n* AVQA\n  * [CAD - Contextual Multi-Modal Alignment for Dynamic AVQA](https://openaccess.thecvf.com/content/WACV2024/papers/Nadeem_CAD_-_Contextual_Multi-Modal_Alignment_for_Dynamic_AVQA_WACV_2024_paper.pdf)\n  * [Tackling Data Bias in MUSIC-AVQA: Crafting a Balanced Dataset for Unbiased Question-Answering](https://openaccess.thecvf.com/content/WACV2024/papers/Liu_Tackling_Data_Bias_in_MUSIC-AVQA_Crafting_a_Balanced_Dataset_for_WACV_2024_paper.pdf)\n* ArtVQA\n  * [ArtQuest: Countering Hidden Language Biases in ArtVQA](https://openaccess.thecvf.com/content/WACV2024/papers/Bleidt_ArtQuest_Countering_Hidden_Language_Biases_in_ArtVQA_WACV_2024_paper.pdf)\n\n\u003ca name=\"21\"/\u003e\n\n## 21.SLAM/Augmented Reality/Virtual Reality/Robotics(增强/虚拟现实/机器人)\n* Virtual Try-On\n  * [A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping](http://arxiv.org/abs/2311.02700v1)\n  * [Controlling Virtual Try-On Pipeline Through Rendering Policies](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Controlling_Virtual_Try-On_Pipeline_Through_Rendering_Policies_WACV_2024_paper.pdf)\n  * [GC-VTON: Predicting Globally Consistent and Occlusion Aware Local Flows with Neighborhood Integrity Preservation for Virtual Try-on](http://arxiv.org/abs/2311.04932v1)\n* Virtual Avatars\n  * [CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer](http://arxiv.org/abs/2311.06443v1)\n* Robotics\n  * [Shape From Shading for Robotic Manipulation](http://arxiv.org/abs/2304.11824)\n  * [Optimizing Long-Term Robot Tracking With Multi-Platform Sensor 
Fusion](https://openaccess.thecvf.com/content/WACV2024/papers/Albanese_Optimizing_Long-Term_Robot_Tracking_With_Multi-Platform_Sensor_Fusion_WACV_2024_paper.pdf)\n  * Robot Localization\n    * [Cross-Attention Between Satellite and Ground Views for Enhanced Fine-Grained Robot Geo-Localization](https://openaccess.thecvf.com/content/WACV2024/papers/Yuan_Cross-Attention_Between_Satellite_and_Ground_Views_for_Enhanced_Fine-Grained_Robot_WACV_2024_paper.pdf)\n* Navigation\n  * [MOPA: Modular Object Navigation With PointGoal Agents](http://arxiv.org/abs/2304.03696)\n* Visual Localization\n  * [FocusTune: Tuning Visual Localization Through Focus-Guided Sampling](http://arxiv.org/abs/2311.02872)\n* Trajectory Prediction\n  * [Second-Order Graph ODEs for Multi-Agent Trajectory Forecasting](https://openaccess.thecvf.com/content/WACV2024/papers/Wen_Second-Order_Graph_ODEs_for_Multi-Agent_Trajectory_Forecasting_WACV_2024_paper.pdf)\n\n\u003ca name=\"20\"/\u003e\n\n## 20.GAN/Generation\n* [FacadeNet: Conditional Facade Synthesis via Selective Editing](http://arxiv.org/abs/2311.01240)\n* [Synthesizing Anyone, Anywhere, in Any Pose](https://openaccess.thecvf.com/content/WACV2024/papers/Hukkelas_Synthesizing_Anyone_Anywhere_in_Any_Pose_WACV_2024_paper.pdf)\n* GAN\n  * [Consistent Multimodal Generation via a Unified GAN Framework](http://arxiv.org/abs/2307.01425)\n  * [StyleGenes: Discrete and Efficient Latent Distributions for GANs](http://arxiv.org/abs/2305.00599)\n  * [Improving the Fairness of the Min-Max Game in GANs Training](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Improving_the_Fairness_of_the_Min-Max_Game_in_GANs_Training_WACV_2024_paper.pdf)\n  * [StyleGAN-Fusion: Diffusion Guided Domain Adaptation of Image Generators](https://openaccess.thecvf.com/content/WACV2024/papers/Song_StyleGAN-Fusion_Diffusion_Guided_Domain_Adaptation_of_Image_Generators_WACV_2024_paper.pdf)\n  * [PlantPlotGAN: A Physics-Informed Generative Adversarial Network for Plant Disease Prediction](https://arxiv.org/abs/2310.18268)\n  * [P2D: 
Plug and Play Discriminator for Accelerating GAN Frameworks](https://openaccess.thecvf.com/content/WACV2024/papers/Chong_P2D_Plug_and_Play_Discriminator_for_Accelerating_GAN_Frameworks_WACV_2024_paper.pdf)\n  * [Soft Curriculum for Learning Conditional GANs With Noisy-Labeled and Uncurated Unlabeled Data](http://arxiv.org/abs/2307.08319)\n  * [What Decreases Editing Capability? Domain-Specific Hybrid Refinement for Improved GAN Inversion](http://arxiv.org/abs/2301.12141)\n  * [Improving the Leaking of Augmentations in Data-Efficient GANs via Adaptive Negative Data Augmentation](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Improving_the_Leaking_of_Augmentations_in_Data-Efficient_GANs_via_Adaptive_WACV_2024_paper.pdf)\n  * [PETIT-GAN: Physically Enhanced Thermal Image-Translating Generative Adversarial Network](https://openaccess.thecvf.com/content/WACV2024/papers/Berman_PETIT-GAN_Physically_Enhanced_Thermal_Image-Translating_Generative_Adversarial_Network_WACV_2024_paper.pdf)\n* 图像生成\n  * [Improving the Effectiveness of Deep Generative Data](http://arxiv.org/abs/2311.03959v1)\n  * [Generated Distributions Are All You Need for Membership Inference Attacks Against Generative Models](http://arxiv.org/abs/2310.19410)\n  * [Nested Diffusion Processes for Anytime Image Generation](http://arxiv.org/abs/2305.19066)\n* 图像合成\n  * [Painterly Image Harmonization via Adversarial Residual Learning](http://arxiv.org/abs/2311.08646v1)\n  * [Controllable Image Synthesis of Industrial Data Using Stable Diffusion](https://openaccess.thecvf.com/content/WACV2024/papers/Valvano_Controllable_Image_Synthesis_of_Industrial_Data_Using_Stable_Diffusion_WACV_2024_paper.pdf)\n  * [Label Augmentation As Inter-Class Data Augmentation for Conditional Image Synthesis With Imbalanced Data](https://openaccess.thecvf.com/content/WACV2024/papers/Katsumata_Label_Augmentation_As_Inter-Class_Data_Augmentation_for_Conditional_Image_Synthesis_WACV_2024_paper.pdf)\n* 文本-图像\n  * [CLIPAG: 
Towards Generator-Free Text-to-Image Generation](http://arxiv.org/abs/2306.16805)\n  * [Customizing 360-Degree Panoramas Through Text-to-Image Diffusion Models](http://arxiv.org/abs/2310.18840)\n  * [Text-to-Image Models for Counterfactual Explanations: A Black-Box Approach](http://arxiv.org/abs/2309.07944)\n  * [TIAM - A Metric for Evaluating Alignment in Text-to-Image Generation](https://openaccess.thecvf.com/content/WACV2024/papers/Grimal_TIAM_-_A_Metric_for_Evaluating_Alignment_in_Text-to-Image_Generation_WACV_2024_paper.pdf)\n  * [Localization and Manipulation of Immoral Visual Cues for Safe Text-to-Image Generation](https://openaccess.thecvf.com/content/WACV2024/papers/Park_Localization_and_Manipulation_of_Immoral_Visual_Cues_for_Safe_Text-to-Image_WACV_2024_paper.pdf)\n  * [Unsupervised Co-Generation of Foreground-Background Segmentation From Text-to-Image Synthesis](https://openaccess.thecvf.com/content/WACV2024/papers/Ahmed_Unsupervised_Co-Generation_of_Foreground-Background_Segmentation_From_Text-to-Image_Synthesis_WACV_2024_paper.pdf)\n* 图像-文本\n  * [SciOL and MuLMS-Img: Introducing a Large-Scale Multimodal Scientific Dataset and Models for Image-Text Tasks in the Scientific Domain](https://openaccess.thecvf.com/content/WACV2024/papers/Tarsi_SciOL_and_MuLMS-Img_Introducing_a_Large-Scale_Multimodal_Scientific_Dataset_and_WACV_2024_paper.pdf)\n* 视频合成\n  * [RADIO: Reference-Agnostic Dubbing Video Synthesis](http://arxiv.org/abs/2309.01950)\n  * [One Style Is All You Need To Generate a Video](http://arxiv.org/abs/2310.17835)\n* 扩散模型\n  * [Fast Diffusion EM: A Diffusion Model for Blind Inverse Problems With Application to Deconvolution](https://openaccess.thecvf.com/content/WACV2024/papers/Laroche_Fast_Diffusion_EM_A_Diffusion_Model_for_Blind_Inverse_Problems_WACV_2024_paper.pdf)\n  * [Expanding Expressiveness of Diffusion Models with Limited Data via Self-Distillation based Fine-Tuning](http://arxiv.org/abs/2311.01018v1)\n  * [Preserving Image Properties 
Through Initializations in Diffusion Models](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Preserving_Image_Properties_Through_Initializations_in_Diffusion_Models_WACV_2024_paper.pdf)\n  * [Exploiting the Signal-Leak Bias in Diffusion Models](http://arxiv.org/abs/2309.15842)\n  * [Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation](http://arxiv.org/abs/2304.11829)\n  * [Common Diffusion Noise Schedules and Sample Steps Are Flawed](http://arxiv.org/abs/2305.08891)\n  * [Training-Free Content Injection Using H-Space in Diffusion Models](https://openaccess.thecvf.com/content/WACV2024/papers/Jeong_Training-Free_Content_Injection_Using_H-Space_in_Diffusion_Models_WACV_2024_paper.pdf)\n  * [PoseDiff: Pose-Conditioned Multimodal Diffusion Model for Unbounded Scene Synthesis From Sparse Inputs](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_PoseDiff_Pose-Conditioned_Multimodal_Diffusion_Model_for_Unbounded_Scene_Synthesis_From_WACV_2024_paper.pdf)\n  * [Diffusion Models Meet Image Counter-Forensics](https://openaccess.thecvf.com/content/WACV2024/papers/Tailanian_Diffusion_Models_Meet_Image_Counter-Forensics_WACV_2024_paper.pdf)\n  * [PathLDM: Text Conditioned Latent Diffusion Model for Histopathology](http://arxiv.org/abs/2309.00748)\n  * [Synthesizing Coherent Story With Auto-Regressive Latent Diffusion Models](http://arxiv.org/abs/2211.10950)\n  * [Towards More Realistic Membership Inference Attacks on Large Diffusion Models](https://openaccess.thecvf.com/content/WACV2024/papers/Dubinski_Towards_More_Realistic_Membership_Inference_Attacks_on_Large_Diffusion_Models_WACV_2024_paper.pdf)\n  * [Dual Domain Diffusion Guidance for 3D CBCT Metal Artifact Reduction](https://openaccess.thecvf.com/content/WACV2024/papers/Choi_Dual_Domain_Diffusion_Guidance_for_3D_CBCT_Metal_Artifact_Reduction_WACV_2024_paper.pdf)\n* 图像翻译\n  * [SemST: Semantically Consistent Multi-Scale Image Translation via Structure-Texture 
Alignment](http://arxiv.org/abs/2310.04995)\n* Image-to-Image Translation\n  * [GRIT: GAN Residuals for Paired Image-to-Image Translation](https://openaccess.thecvf.com/content/WACV2024/papers/Suri_GRIT_GAN_Residuals_for_Paired_Image-to-Image_Translation_WACV_2024_paper.pdf)\n* Text-to-3D\n  * [HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation](https://openaccess.thecvf.com/content/WACV2024/papers/Wu_HD-Fusion_Detailed_Text-to-3D_Generation_Leveraging_Multiple_Noise_Estimation_WACV_2024_paper.pdf)\n* Text-to-Video\n  * [Human Motion Aware Text-to-Video Generation With Explicit Camera Control](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_Human_Motion_Aware_Text-to-Video_Generation_With_Explicit_Camera_Control_WACV_2024_paper.pdf)\n* Synthetic Image Detection\n  * [Deep Image Fingerprint: Towards Low Budget Synthetic Image Detection and Model Lineage Analysis](http://arxiv.org/abs/2303.10762)\n\n\u003ca name=\"19\"/\u003e\n\n## 19.Object Pose Estimation(物体姿态估计)\n* 6D\n  * [Real-time 6-DoF Pose Estimation by an Event-based Camera using Active LED Markers](http://arxiv.org/abs/2310.16618v1)\n  * [Effects of Markers in Training Datasets on the Accuracy of 6D Pose Estimation](https://openaccess.thecvf.com/content/WACV2024/papers/Rosskamp_Effects_of_Markers_in_Training_Datasets_on_the_Accuracy_of_WACV_2024_paper.pdf)\n  * [Learning Better Keypoints for Multi-Object 6DoF Pose Estimation](http://arxiv.org/abs/2308.07827)\n* Object Counting\n  * [Training-Free Object Counting With Prompts](http://arxiv.org/abs/2307.00038)\n* Object Re-Identification\n  * [Object Re-Identification From Point Clouds](https://openaccess.thecvf.com/content/WACV2024/papers/Therien_Object_Re-Identification_From_Point_Clouds_WACV_2024_paper.pdf)\n\n\u003ca name=\"18\"/\u003e\n\n## 18.Animal\n* Canine Pose Analysis\n  * [RGBT-Dog: A Parametric Model and Pose Prior for Canine Body Analysis Data Creation](https://openaccess.thecvf.com/content/WACV2024/papers/Deane_RGBT-Dog_A_Parametric_Model_and_Pose_Prior_for_Canine_Body_WACV_2024_paper.pdf)\n* Animal Re-Identification\n 
 * [WildlifeDatasets: An Open-Source Toolkit for Animal Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Cermak_WildlifeDatasets_An_Open-Source_Toolkit_for_Animal_Re-Identification_WACV_2024_paper.pdf)\n\n\n\n\u003ca name=\"17\"/\u003e\n\n## 17.Human Pose Estimation(人体姿态估计)\n* [Re-VoxelDet: Rethinking Neck and Head Architectures for High-Performance Voxel-Based 3D Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_Re-VoxelDet_Rethinking_Neck_and_Head_Architectures_for_High-Performance_Voxel-Based_3D_WACV_2024_paper.pdf)\n* [DiffBody: Diffusion-Based Pose and Shape Editing of Human Images](https://openaccess.thecvf.com/content/WACV2024/papers/Okuyama_DiffBody_Diffusion-Based_Pose_and_Shape_Editing_of_Human_Images_WACV_2024_paper.pdf)\n* [Denoising and Selecting Pseudo-Heatmaps for Semi-Supervised Human Pose Estimation](http://arxiv.org/abs/2310.00099)\n* [Rethinking Visibility in Human Pose Estimation: Occluded Pose Reasoning via Transformers](https://openaccess.thecvf.com/content/WACV2024/papers/Sun_Rethinking_Visibility_in_Human_Pose_Estimation_Occluded_Pose_Reasoning_via_WACV_2024_paper.pdf)\n* [Active Transfer Learning for Efficient Video-Specific Human Pose Estimation](http://arxiv.org/abs/2311.05041v1)\u003cbr\u003e:star:[code](https://github.com/ImIntheMiddle/VATL4Pose-WACV2024)\n* [LInKs \"Lifting Independent Keypoints\" - Partial Pose Lifting for Occlusion Handling With Improved Accuracy in 2D-3D Human Pose Estimation](https://openaccess.thecvf.com/content/WACV2024/papers/Hardy_LInKs_Lifting_Independent_Keypoints_-_Partial_Pose_Lifting_for_Occlusion_WACV_2024_paper.pdf)\n* 3D HPE\n  * [3D Human Pose Estimation With Two-Step Mixed-Training Strategy](https://openaccess.thecvf.com/content/WACV2024/papers/Wang_3D_Human_Pose_Estimation_With_Two-Step_Mixed-Training_Strategy_WACV_2024_paper.pdf)\n  * [Unsupervised 3D Pose Estimation With Non-Rigid Structure-From-Motion Modeling](http://arxiv.org/abs/2308.10705)\n  * [Back 
to Optimization: Diffusion-Based Zero-Shot 3D Human Pose Estimation](http://arxiv.org/abs/2307.03833)\n  * [MotionAGFormer: Enhancing 3D Human Pose Estimation With a Transformer-GCNFormer Network](http://arxiv.org/abs/2310.16288)\n  * [UNSPAT: Uncertainty-Guided SpatioTemporal Transformer for 3D Human Pose and Shape Estimation on Videos](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_UNSPAT_Uncertainty-Guided_SpatioTemporal_Transformer_for_3D_Human_Pose_and_Shape_WACV_2024_paper.pdf)\n  * [A Geometry Loss Combination for 3D Human Pose Estimation](https://openaccess.thecvf.com/content/WACV2024/papers/Matsune_A_Geometry_Loss_Combination_for_3D_Human_Pose_Estimation_WACV_2024_paper.pdf)\n  * [Robust Category-Level 3D Pose Estimation From Diffusion-Enhanced Synthetic Data](https://openaccess.thecvf.com/content/WACV2024/papers/Yang_Robust_Category-Level_3D_Pose_Estimation_From_Diffusion-Enhanced_Synthetic_Data_WACV_2024_paper.pdf)\n* 多身体网格检测\n  * [Physical-Space Multi-Body Mesh Detection Achieved by Local Alignment and Global Dense Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Dong_Physical-Space_Multi-Body_Mesh_Detection_Achieved_by_Local_Alignment_and_Global_WACV_2024_paper.pdf)\n* 人定位与姿态分类\n  * [Learning-Based Spotlight Position Optimization for Non-Line-of-Sight Human Localization and Posture Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Chandran_Learning-Based_Spotlight_Position_Optimization_for_Non-Line-of-Sight_Human_Localization_and_Posture_WACV_2024_paper.pdf)\n* 三维人体网格恢复\n  * [Progressive Hypothesis Transformer for 3D Human Mesh Recovery](https://openaccess.thecvf.com/content/WACV2024/papers/Liao_Progressive_Hypothesis_Transformer_for_3D_Human_Mesh_Recovery_WACV_2024_paper.pdf)\n* 人体姿态与网格重建\n  * [MPT: Mesh Pre-Training With Transformers for Human Pose and Mesh Reconstruction](http://arxiv.org/abs/2211.13357)\n* 着装人体重建\n  * [PIDiffu: Pixel-Aligned Diffusion Model for High-Fidelity Clothed Human 
Reconstruction](https://openaccess.thecvf.com/content/WACV2024/papers/Lee_PIDiffu_Pixel-Aligned_Diffusion_Model_for_High-Fidelity_Clothed_Human_Reconstruction_WACV_2024_paper.pdf)\n* 手部\n  * 手部重建\n    * [Intrinsic Hand Avatar: Illumination-Aware Hand Appearance and Shape Reconstruction From Monocular RGB Video](https://openaccess.thecvf.com/content/WACV2024/papers/Kalshetti_Intrinsic_Hand_Avatar_Illumination-Aware_Hand_Appearance_and_Shape_Reconstruction_From_WACV_2024_paper.pdf)\n  * 手语翻译\n    * [Fingerspelling PoseNet: Enhancing Fingerspelling Translation with Pose-Based Transformer Models](https://arxiv.org/abs/2311.12128)\u003cbr\u003e:star:[code](https://github.com/pooyafayyaz/Fingerspelling-PoseNet)\n  * 手语制作\n    * [Sign Language Production With Latent Motion Transformer](https://openaccess.thecvf.com/content/WACV2024/papers/Xie_Sign_Language_Production_With_Latent_Motion_Transformer_WACV_2024_paper.pdf)\n  * 手部姿态估计\n    * [HMP: Hand Motion Priors for Pose and Shape Estimation From Video](https://openaccess.thecvf.com/content/WACV2024/papers/Duran_HMP_Hand_Motion_Priors_for_Pose_and_Shape_Estimation_From_WACV_2024_paper.pdf)\n    * [Handformer2T: A Lightweight Regression-Based Model for Interacting Hands Pose Estimation From a Single RGB Image](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Handformer2T_A_Lightweight_Regression-Based_Model_for_Interacting_Hands_Pose_Estimation_WACV_2024_paper.pdf)\n  * 手势检测\n    * [Co-Speech Gesture Detection Through Multi-Phase Sequence Labeling](http://arxiv.org/abs/2308.10680)\n  * 抄写员手识别\n    * [The Paleographer's Eye ex machina: Using Computer Vision To Assist Humanists in Scribal Hand Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Grieggs_The_Paleographers_Eye_ex_machina_Using_Computer_Vision_To_Assist_WACV_2024_paper.pdf)\n  * 交互式分割\n    * [Interactive Segmentation for Diverse Gesture Types Without Context](http://arxiv.org/abs/2307.10518)  \n    * [Continuous Adaptation for 
Interactive Segmentation Using Teacher-Student Architecture](https://openaccess.thecvf.com/content/WACV2024/papers/Atanyan_Continuous_Adaptation_for_Interactive_Segmentation_Using_Teacher-Student_Architecture_WACV_2024_paper.pdf)\n* 人体轮廓提取\n  * [POISE: Pose Guided Human Silhouette Extraction Under Occlusions](http://arxiv.org/abs/2311.05077)\n* 动作捕捉\n  * [A Sequential Learning-Based Approach for Monocular Human Performance Capture](https://openaccess.thecvf.com/content/WACV2024/papers/Chen_A_Sequential_Learning-Based_Approach_for_Monocular_Human_Performance_Capture_WACV_2024_paper.pdf)\n* 人体动画\n  * [AvatarOne: Monocular 3D Human Animation](https://openaccess.thecvf.com/content/WACV2024/papers/Karthikeyan_AvatarOne_Monocular_3D_Human_Animation_WACV_2024_paper.pdf)\n  * [StyleAvatar: Stylizing Animatable Head Avatars](https://openaccess.thecvf.com/content/WACV2024/papers/Perez_StyleAvatar_Stylizing_Animatable_Head_Avatars_WACV_2024_paper.pdf)\n\n\u003ca name=\"16\"/\u003e\n\n## 16.Action Detection(动作检测)\n* [Context in Human Action Through Motion Complementarity](https://openaccess.thecvf.com/content/WACV2024/papers/Dessalene_Context_in_Human_Action_Through_Motion_Complementarity_WACV_2024_paper.pdf)\n* 小样本动作检测\n  * [Semantic-aware Video Representation for Few-shot Action Recognition](http://arxiv.org/abs/2311.06218v1)\n* 细粒度动作识别\n  * [PGVT: Pose-Guided Video Transformer for Fine-Grained Action Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_PGVT_Pose-Guided_Video_Transformer_for_Fine-Grained_Action_Recognition_WACV_2024_paper.pdf)\n* 时序动作分割\n  * [OTAS: Unsupervised Boundary Detection for Object-Centric Temporal Action Segmentation](https://arxiv.org/abs/2309.06276)\n* 时序动作检测\n  * [A*: Atrous Spatial Temporal Action Recognition for Real Time Applications](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_A_Atrous_Spatial_Temporal_Action_Recognition_for_Real_Time_Applications_WACV_2024_paper.pdf)\n  * [ZEETAD: Adapting Pretrained 
Vision-Language Model for Zero-Shot End-to-End Temporal Action Detection](http://arxiv.org/abs/2311.00729)\n* 动作检测\n  * [Embodied Human Activity Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Hu_Embodied_Human_Activity_Recognition_WACV_2024_paper.pdf)\n  * [JOADAA: Joint Online Action Detection and Action Anticipation](http://arxiv.org/abs/2309.06130)\n  * [A Hybrid Graph Network for Complex Activity Detection in Video](http://arxiv.org/abs/2310.17493v1)\n  * [Differentially Private Video Activity Recognition](http://arxiv.org/abs/2306.15742)\n  * [Embedding Task Structure for Action Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Peven_Embedding_Task_Structure_for_Action_Detection_WACV_2024_paper.pdf)\n  * [Egocentric Action Recognition by Capturing Hand-Object Contact and Object State](https://openaccess.thecvf.com/content/WACV2024/papers/Shiota_Egocentric_Action_Recognition_by_Capturing_Hand-Object_Contact_and_Object_State_WACV_2024_paper.pdf)\n  * [Exploring the Impact of Rendering Method and Motion Quality on Model Performance When Using Multi-View Synthetic Data for Action Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Panev_Exploring_the_Impact_of_Rendering_Method_and_Motion_Quality_on_WACV_2024_paper.pdf)\n  * [Learnable Cube-Based Video Encryption for Privacy-Preserving Action Recognition](https://openaccess.thecvf.com/content/WACV2024/papers/Ishikawa_Learnable_Cube-Based_Video_Encryption_for_Privacy-Preserving_Action_Recognition_WACV_2024_paper.pdf)\n* 动作预测\n  * [Object-centric Video Representation for Long-term Action Anticipation](http://arxiv.org/abs/2311.00180v1)\u003cbr\u003e:star:[code](https://github.com/brown-palm/ObjectPrompt)\n  * [Interaction Region Visual Transformer for Egocentric Action Anticipation](https://openaccess.thecvf.com/content/WACV2024/papers/Roy_Interaction_Region_Visual_Transformer_for_Egocentric_Action_Anticipation_WACV_2024_paper.pdf)\n* 动作分割\n  * [Permutation-Aware 
Activity Segmentation via Unsupervised Frame-To-Segment Alignment](https://openaccess.thecvf.com/content/WACV2024/papers/Tran_Permutation-Aware_Activity_Segmentation_via_Unsupervised_Frame-To-Segment_Alignment_WACV_2024_paper.pdf)\n  * [Mining and Unifying Heterogeneous Contrastive Relations for Weakly-Supervised Actor-Action Segmentation](https://openaccess.thecvf.com/content/WACV2024/papers/Duan_Mining_and_Unifying_Heterogeneous_Contrastive_Relations_for_Weakly-Supervised_Actor-Action_Segmentation_WACV_2024_paper.pdf)\n  * 时序动作分割\n    * [Random Walks for Temporal Action Segmentation With Timestamp Supervision](https://openaccess.thecvf.com/content/WACV2024/papers/Hirsch_Random_Walks_for_Temporal_Action_Segmentation_With_Timestamp_Supervision_WACV_2024_paper.pdf)\n* 动作分类\n  * [Spatio-Temporal Filter Analysis Improves 3D-CNN for Action Classification](https://openaccess.thecvf.com/content/WACV2024/papers/Kobayashi_Spatio-Temporal_Filter_Analysis_Improves_3D-CNN_for_Action_Classification_WACV_2024_paper.pdf)\n* 动作合成\n  * [Few-Shot Generative Model for Skeleton-Based Human Action Synthesis Using Cross-Domain Adversarial Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Fukushi_Few-Shot_Generative_Model_for_Skeleton-Based_Human_Action_Synthesis_Using_Cross-Domain_WACV_2024_paper.pdf)\n* 动作质量评估\n  * [PECoP: Parameter Efficient Continual Pretraining for Action Quality Assessment](http://arxiv.org/abs/2311.07603v1)\u003cbr\u003e:star:[code](https://github.com/Plrbear/PECoP)\n* 重复动作计数\n  * [Repetitive Action Counting with Motion Feature Learning](https://openaccess.thecvf.com/content/WACV2024/papers/Li_Repetitive_Action_Counting_With_Motion_Feature_Learning_WACV_2024_paper.pdf)\n\n\u003ca name=\"15\"/\u003e\n\n## 15.Video\n* [Detecting Content Segments From Online Sports Streaming Events: Challenges and 
Solutions](https://openaccess.thecvf.com/content/WACV2024/papers/Liu_Detecting_Content_Segments_From_Online_Sports_Streaming_Events_Challenges_and_WACV_2024_paper.pdf)\n* 视频理解\n  * [M33D: Learning 3D Priors Using Multi-Modal Masked Autoencoders for 2D Image and Video Understanding](https://openaccess.thecvf.com/content/WACV2024/papers/Jamal_M33D_Learning_3D_Priors_Using_Multi-Modal_Masked_Autoencoders_for_2D_WACV_2024_paper.pdf)\n  * [PromptonomyViT: Multi-Task Prompt Learning Improves Video Transformers Using Synthetic Scene Data](https://openaccess.thecvf.com/content/WACV2024/papers/Herzig_PromptonomyViT_Multi-Task_Prompt_Learning_Improves_Video_Transformers_Using_Synthetic_Scene_WACV_2024_paper.pdf)\u003cbr\u003e:house:[project](https://ofir1080.github.io/PromptonomyViT)\n* 视频分割\n  * [Correlation-aware active learning for surgery video segmentation](http://arxiv.org/abs/2311.08811v1)\n* 视频识别\n  * [Automated Sperm Assessment Framework and Neural Network Specialized for Sperm Video Recognition](http://arxiv.org/abs/2311.05927v1)\n* 视频稳定\n  * [Leveraging Synthetic Data To Learn Video Stabilization Under Adverse Conditions](http://arxiv.org/abs/2208.12763)\n* 视频重建\n  * [Unsupervised Event-Based Video Reconstruction](https://openaccess.thecvf.com/content/WACV2024/papers/Fox_Unsupervised_Event-Based_Video_Reconstruction_WACV_2024_paper.pdf)\n* 视频监控\n  * [Lightweight Delivery Detection on Doorbell Cameras](http://arxiv.org/abs/2305.07812)\n* 视频分析\n  * [Weakly-Supervised Representation Learning for Video Alignment and Analysis](http://arxiv.org/abs/2302.04064)\n* 视频和谐化\n  * [TSA2: Temporal Segment Adaptation and Aggregation for Video Harmonization](https://openaccess.thecvf.com/content/WACV2024/papers/Xiao_TSA2_Temporal_Segment_Adaptation_and_Aggregation_for_Video_Harmonization_WACV_2024_paper.pdf)\n* 录像带修复\n  * [Reference-Based Restoration of Digitized Analog Videotapes](http://arxiv.org/abs/2310.14926)\n  * [Restoring Degraded Old Films With Recursive Recurrent 
Transformer Networks](https://openaccess.thecvf.com/content/WACV2024/papers/Lin_Restoring_Degraded_Old_Films_With_Recursive_Recurrent_Transformer_Networks_WACV_2024_paper.pdf)\n* 视频时刻检索\n  * [Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models](https://arxiv.org/abs/2309.00661)\n  * [Semantic Fusion Augmentation and Semantic Boundary Detection: A Novel Approach to Multi-Target Video Moment Retrieval](https://openaccess.thecvf.com/content/WACV2024/papers/Huang_Semantic_Fusion_Augmentation_and_Semantic_Boundary_Detection_A_Novel_Approach_WACV_2024_paper.pdf)\n* 视频目标定位\n  * [Sketch-Based Video Object Localization](http://arxiv.org/abs/2304.00450)\n* 电影类型分类\n  * [Movie Genre Classification by Language Augmentation and Shot Sampling](http://arxiv.org/abs/2203.13281)\n* 视频质量增强\n  * [Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement](http://arxiv.org/abs/2202.00011)\n* VAD\n  * [A Coarse-to-Fine Pseudo-Labeling (C2FPL) Framework for Unsupervised Video Anomaly Detection](http://arxiv.org/abs/2310.17650v1)\n  * [Real-Time Weakly Supervised Video Anomaly Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Karim_Real-Time_Weakly_Supervised_Video_Anomaly_Detection_WACV_2024_paper.pdf)\n  * [OE-CTST: Outlier-Embedded Cross Temporal Scale Transformer for Weakly-Supervised Video Anomaly Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Majhi_OE-CTST_Outlier-Embedded_Cross_Temporal_Scale_Transformer_for_Weakly-Supervised_Video_Anomaly_WACV_2024_paper.pdf)\n\n\u003ca name=\"14\"/\u003e\n\n## 14.OCR(文本检测识别)\n* [DTrOCR: Decoder-only Transformer for Optical Character Recognition](https://arxiv.org/abs/2308.15996)\n* [On Manipulating Scene Text in the Wild with Diffusion Models](http://arxiv.org/abs/2311.00734v1)\n* [DECDM: Document Enhancement using Cycle-Consistent Diffusion Models](http://arxiv.org/abs/2311.09625v1)\n* 文本检测\n  * [Sequential Transformer for End-to-End Video Text 
Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Zhang_Sequential_Transformer_for_End-to-End_Video_Text_Detection_WACV_2024_paper.pdf)\n  * [Textron: Weakly Supervised Multilingual Text Detection Through Data Programming](https://openaccess.thecvf.com/content/WACV2024/papers/Kudale_Textron_Weakly_Supervised_Multilingual_Text_Detection_Through_Data_Programming_WACV_2024_paper.pdf)\n  * [Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition](http://arxiv.org/abs/2303.04291)\n* Text Spotting\n  * [Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance](https://arxiv.org/abs/2310.00917)\n  * [Hierarchical Text Spotter for Joint Text Spotting and Layout Analysis](https://arxiv.org/abs/2310.17674)\n* Scene-Text Spotting\n  * [STEP - Towards Structured Scene-Text Spotting](https://openaccess.thecvf.com/content/WACV2024/papers/Garcia-Bordils_STEP_-_Towards_Structured_Scene-Text_Spotting_WACV_2024_paper.pdf)\n* Document Dewarping(文档矫正)\n  * [DocReal: Robust Document Dewarping of Real-Life Images via Attention-Enhanced Control Point Prediction](https://openaccess.thecvf.com/content/WACV2024/papers/Yu_DocReal_Robust_Document_Dewarping_of_Real-Life_Images_via_Attention-Enhanced_Control_WACV_2024_paper.pdf)\n* 场景文本理解\n  * [Textual Alchemy: CoFormer for Scene Text Understanding](https://openaccess.thecvf.com/content/WACV2024/papers/Deshmukh_Textual_Alchemy_CoFormer_for_Scene_Text_Understanding_WACV_2024_paper.pdf)\n* 文档布局分割\n  * [A One-Shot Learning Approach To Document Layout Segmentation of Ancient Arabic Manuscripts](https://openaccess.thecvf.com/content/WACV2024/papers/De_Nardin_A_One-Shot_Learning_Approach_To_Document_Layout_Segmentation_of_Ancient_WACV_2024_paper.pdf)\n* 字体生成\n  * [Towards Diverse and Consistent Typography Generation](http://arxiv.org/abs/2309.02099)\n* 信息提取\n  * [Graph Neural Networks for End-to-End Information Extraction From Handwritten 
Documents](https://openaccess.thecvf.com/content/WACV2024/papers/Khanfir_Graph_Neural_Networks_for_End-to-End_Information_Extraction_From_Handwritten_Documents_WACV_2024_paper.pdf)\n\n\u003ca name=\"13\"/\u003e\n\n## 13.Reid(Person Re-Identification/Gait Recognition/Pedestrian Detection)\n* Reid\n  * [Privacy-Enhancing Person Re-Identification Framework - A Dual-Stage Approach](https://openaccess.thecvf.com/content/WACV2024/papers/Kansal_Privacy-Enhancing_Person_Re-Identification_Framework_-_A_Dual-Stage_Approach_WACV_2024_paper.pdf)\n  * [HashReID: Dynamic Network with Binary Codes for Efficient Person Re-identification](https://arxiv.org/abs/2308.11900)\n  * [Mitigate Domain Shift by Primary-Auxiliary Objectives Association for Generalizing Person ReID](https://arxiv.org/abs/2310.15913)\n  * [Source-Guided Similarity Preservation for Online Person Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Rami_Source-Guided_Similarity_Preservation_for_Online_Person_Re-Identification_WACV_2024_paper.pdf)\n  * [Contrastive Viewpoint-Aware Shape Learning for Long-Term Person Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Nguyen_Contrastive_Viewpoint-Aware_Shape_Learning_for_Long-Term_Person_Re-Identification_WACV_2024_paper.pdf)\n  * Visible-Infrared Reid\n    * [Enhancing Diverse Intra-Identity Representation for Visible-Infrared Person Re-Identification](https://openaccess.thecvf.com/content/WACV2024/papers/Kim_Enhancing_Diverse_Intra-Identity_Representation_for_Visible-Infrared_Person_Re-Identification_WACV_2024_paper.pdf)\n* Person Identification\n  * [ShARc: Shape and Appearance Recognition for Person Identification In-the-wild](https://arxiv.org/abs/2310.15946)\n* Person Search\n  * [DDAM-PS: Diligent Domain Adaptive Mixer for Person Search](https://arxiv.org/abs/2310.20706)\u003cbr\u003e:star:[code](https://github.com/mustansarfiaz/DDAM-PS)\n* Pedestrian Detection\n  * [HalluciDet: Hallucinating RGB Modality for Person Detection Through Privileged 
Information](https://openaccess.thecvf.com/content/WACV2024/papers/Medeiros_HalluciDet_Hallucinating_RGB_Modality_for_Person_Detection_Through_Privileged_Information_WACV_2024_paper.pdf)\n  * [Beyond Fusion: Modality Hallucination-Based Multispectral Fusion for Pedestrian Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Xie_Beyond_Fusion_Modality_Hallucination-Based_Multispectral_Fusion_for_Pedestrian_Detection_WACV_2024_paper.pdf)\n  * [Booster-SHOT: Boosting Stacked Homography Transformations for Multiview Pedestrian Detection With Attention](https://openaccess.thecvf.com/content/WACV2024/papers/Hwang_Booster-SHOT_Boosting_Stacked_Homography_Transformations_for_Multiview_Pedestrian_Detection_With_WACV_2024_paper.pdf)\n  * [Enhancing Multi-View Pedestrian Detection Through Generalized 3D Feature Pulling](https://openaccess.thecvf.com/content/WACV2024/papers/Aung_Enhancing_Multi-View_Pedestrian_Detection_Through_Generalized_3D_Feature_Pulling_WACV_2024_paper.pdf)\n  * [Favoring One Among Equals - Not a Good Idea: Many-to-One Matching for Robust Transformer Based Pedestrian Detection](https://openaccess.thecvf.com/content/WACV2024/papers/Shastry_Favoring_One_Among_Equals_-_Not_a_Good_Idea_Many-to-One_WACV_2024_paper.pdf)\n* Crowd Counting\n  * Weakly-Supervised Crowd Counting\n    * [Glance To Count: Learning To Rank With Anchors for Weakly-Supervised Crowd Counting](http://arxiv.org/abs/2205.14659)\n  * Infrared-Based Crowd Counting\n    * [Evaluating Supervision Levels Trade-Offs for Infrared-Based People Counting](https://arxiv.org/abs/2311.11974)\u003cbr\u003e:star:[code](https://github.com/tortueTortue/IRPeopleCounting)\n* Gait Recognition\n  * [You Can Run but not Hide: Improving Gait Recognition with Intrinsic Occlusion Type Awareness](http://arxiv.org/abs/2312.02290v1)\n  * [Watch Where You Head: A View-Biased 
Domai","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F52cv%2Fwacv-2024-papers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F52cv%2Fwacv-2024-papers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F52cv%2Fwacv-2024-papers/lists"}