https://github.com/ashishpatel26/cvpr2024

CVPR 2024 Research Paper with Code
Host: GitHub
URL: https://github.com/ashishpatel26/cvpr2024
Owner: ashishpatel26
Created: 2024-06-26T06:06:29.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-06-28T08:10:00.000Z (about 2 years ago)
Last Synced: 2025-07-04T09:48:38.212Z (12 months ago)
Topics: computervision, cvpr, cvpr2024
Homepage:
Size: 545 KB
Stars: 48
Watchers: 3
Forks: 9
Open Issues: 1
Metadata Files:
- Readme: README.md

# CVPR 2024

![](https://camo.githubusercontent.com/e98004b4a9a1fdbad3c3fe1be700c0f0546286942108c54fa7f009eb786df0d0/68747470733a2f2f6869726f6b617473756b6174616f6b6131362e6769746875622e696f2f435650522d323032342d4c494d49542f696d672f435650525f4c6f676f53656174746c655f323032345f5072696d6172792e6a7067)

### Research Paper with Code

![](mindmap/fd94ed3530b7015a458d81055f24fc026f33f7d1c45cde6cdc34f1a689509916.png)

---

## Table of Contents

- [3DGS (Gaussian Splatting)](#3dgs-gaussian-splatting)

- [Avatars](#avatars)

- [Backbone](#backbone)

- [CLIP](#clip)

- [Embodied AI](#embodied-ai)

- [OCR](#ocr)

- [NeRF](#nerf)

- [DETR](#detr)

- [ReID](#reid)

- [Long-Tail](#long-tail)

- [Vision Transformer](#vision-transformer)

- [Vision-Language](#vision-language)

- [Self-supervised Learning](#self-supervised-learning)

- [Data Augmentation](#data-augmentation)

- [Object Detection](#object-detection)

- [Anomaly Detection](#anomaly-detection)

- [Visual Tracking](#visual-tracking)

- [Semantic Segmentation](#semantic-segmentation)

- [Instance Segmentation](#instance-segmentation)

- [Panoptic Segmentation](#panoptic-segmentation)

- [Medical Image](#medical-image)

- [Medical Image Segmentation](#medical-image-segmentation)

- [Video Object Segmentation](#video-object-segmentation)

- [Video Instance Segmentation](#video-instance-segmentation)

- [Referring Image Segmentation](#referring-image-segmentation)

- [Image Matting](#image-matting)

- [Image Editing](#image-editing)

- [Low-level Vision](#low-level-vision)

- [Super-Resolution](#super-resolution)

- [Denoising](#denoising)

- [Deblur](#deblur)

- [Autonomous Driving](#autonomous-driving)

- [3D Point Cloud](#3d-point-cloud)

- [3D Object Detection](#3d-object-detection)

- [3D Semantic Segmentation](#3d-semantic-segmentation)

- [3D Object Tracking](#3d-object-tracking)

- [3D Semantic Scene Completion](#3d-semantic-scene-completion)

- [3D Registration](#3d-registration)

- [3D Human Pose Estimation](#3d-human-pose-estimation)

- [3D Human Mesh Estimation](#3d-human-mesh-estimation)

- [Image Generation](#image-generation)

- [Video Generation](#video-generation)

- [Video Understanding](#video-understanding)

- [Knowledge Distillation](#knowledge-distillation)

- [Stereo Matching](#stereo-matching)

- [Scene Graph Generation](#scene-graph-generation)

- [Video Quality Assessment](#video-quality-assessment)

- [Datasets](#datasets)

- [Others](#others)

### Domain-wise Table

#### 3DGS (Gaussian Splatting)

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 1 | Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering | [Paper](https://arxiv.org/abs/2312.00109) | [Code](https://github.com/city-super/Scaffold-GS) | [Homepage](https://city-super.github.io/scaffold-gs/) |
| 2 | GPS-Gaussian: Generalizable Pixel-wise 3D Gaussian Splatting for Real-time Human Novel View Synthesis | [Paper](https://arxiv.org/abs/2312.02155) | [Code](https://github.com/ShunyuanZheng/GPS-Gaussian) | [Homepage](https://shunyuanzheng.github.io/GPS-Gaussian) |
| 3 | GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians | [Paper](https://arxiv.org/abs/2312.02134) | [Code](https://github.com/huliangxiao/GaussianAvatar) | N/A |
| 4 | GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting | [Paper](https://arxiv.org/abs/2311.14521) | [Code](https://github.com/buaacyw/GaussianEditor) | N/A |
| 5 | Deformable 3D Gaussians for High-Fidelity Monocular Dynamic Scene Reconstruction | [Paper](https://arxiv.org/abs/2309.13101) | [Code](https://github.com/ingra14m/Deformable-3D-Gaussians) | [Homepage](https://ingra14m.github.io/Deformable-Gaussians/) |

#### Avatars

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 6 | GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians | [Paper](https://arxiv.org/abs/2312.02134) | [Code](https://github.com/huliangxiao/GaussianAvatar) | N/A |
| 7 | Real-Time Simulated Avatar from Head-Mounted Sensors | [Paper](https://arxiv.org/abs/2403.06862) | N/A | [Homepage](https://www.zhengyiluo.com/SimXR/) |

#### Backbone

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 8 | RepViT: Revisiting Mobile CNN From ViT Perspective | [Paper](https://arxiv.org/abs/2307.09283) | [Code](https://github.com/THU-MIG/RepViT) | N/A |
| 9 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers | [Paper](https://arxiv.org/abs/2311.17132) | [Code](https://github.com/DaiShiResearch/TransNeXt) | N/A |

#### CLIP

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 10 | Alpha-CLIP: A CLIP Model Focusing on Wherever You Want | [Paper](https://arxiv.org/abs/2312.03818) | [Code](https://github.com/SunzeY/AlphaCLIP) | N/A |
| 11 | FairCLIP: Harnessing Fairness in Vision-Language Learning | [Paper](https://arxiv.org/abs/2403.19949) | [Code](https://github.com/Harvard-Ophthalmology-AI-Lab/FairCLIP) | N/A |

#### Embodied AI

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 12 | EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI | [Paper](https://arxiv.org/abs/2312.16170) | [Code](https://github.com/OpenRobotLab/EmbodiedScan) | [Homepage](https://tai-wang.github.io/embodiedscan/) |
| 13 | MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception | [Paper](https://arxiv.org/abs/2312.07472) | [Code](https://github.com/IranQin/MP5) | [Homepage](https://iranqin.github.io/MP5.github.io/) |

#### OCR

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 14 | An Empirical Study of Scaling Law for OCR | [Paper](https://arxiv.org/abs/2401.00028) | [Code](https://github.com/large-ocr-model/large-ocr-model.github.io) | N/A |
| 15 | ODM: A Text-Image Further Alignment Pre-training Approach for Scene Text Detection and Spotting | [Paper](https://arxiv.org/abs/2403.00303) | [Code](https://github.com/PriNing/ODM) | N/A |

#### NeRF

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 16 | PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF | [Paper](https://arxiv.org/abs/2311.13099) | [Code](https://github.com/FYTalon/pienerf/) | N/A |

#### DETR

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 17 | DETRs Beat YOLOs on Real-time Object Detection | [Paper](https://arxiv.org/abs/2304.08069) | [Code](https://github.com/lyuwenyu/RT-DETR) | N/A |
| 18 | Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | [Paper](https://arxiv.org/abs/2403.16131) | [Code](https://github.com/xiuqhou/Salience-DETR) | N/A |

#### ReID

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 19 | Magic Tokens: Select Diverse Tokens for Multi-modal Object Re-Identification | [Paper](https://arxiv.org/abs/2403.10254) | [Code](https://github.com/924973292/EDITOR) | N/A |
| 20 | Noisy-Correspondence Learning for Text-to-Image Person Re-identification | [Paper](https://arxiv.org/abs/2308.09911) | [Code](https://github.com/QinYang79/RDE) | N/A |

#### Long-Tail

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 1 | Delving into the Trajectory Long-tail Distribution for Multi-object Tracking | [Paper](https://arxiv.org/abs/2403.04700) | [Code](https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT) | N/A |

#### Vision Transformer

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 2 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers | [Paper](https://arxiv.org/abs/2311.17132) | [Code](https://github.com/DaiShiResearch/TransNeXt) | N/A |
| 3 | RepViT: Revisiting Mobile CNN From ViT Perspective | [Paper](https://arxiv.org/abs/2307.09283) | [Code](https://github.com/THU-MIG/RepViT) | N/A |

#### Vision-Language

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 4 | PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | [Paper](https://arxiv.org/abs/2403.02781) | [Code](https://github.com/zhengli97/PromptKD) | N/A |
| 5 | FairCLIP: Harnessing Fairness in Vision-Language Learning | [Paper](https://arxiv.org/abs/2403.19949) | [Code](https://github.com/Harvard-Ophthalmology-AI-Lab/FairCLIP) | N/A |

#### Self-supervised Learning

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 6 | N/A | N/A | N/A | N/A |

#### Data Augmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 7 | N/A | N/A | N/A | N/A |

#### Object Detection

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 8 | DETRs Beat YOLOs on Real-time Object Detection | [Paper](https://arxiv.org/abs/2304.08069) | [Code](https://github.com/lyuwenyu/RT-DETR) | N/A |
| 9 | Boosting Object Detection with Zero-Shot Day-Night Domain Adaptation | [Paper](https://arxiv.org/abs/2312.01220) | [Code](https://github.com/ZPDu/Boosting-Object-Detection-with-Zero-Shot-Day-Night-Domain-Adaptation) | N/A |
| 10 | YOLO-World: Real-Time Open-Vocabulary Object Detection | [Paper](https://arxiv.org/abs/2401.17270) | [Code](https://github.com/AILab-CVC/YOLO-World) | N/A |
| 11 | Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | [Paper](https://arxiv.org/abs/2403.16131) | [Code](https://github.com/xiuqhou/Salience-DETR) | N/A |

#### Anomaly Detection

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 12 | Anomaly Heterogeneity Learning for Open-set Supervised Anomaly Detection | [Paper](https://arxiv.org/abs/2310.12790) | [Code](https://github.com/mala-lab/AHL) | N/A |

#### Visual Tracking

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 13 | N/A | N/A | N/A | N/A |

#### Semantic Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 14 | Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | [Paper](https://arxiv.org/abs/2312.04265) | [Code](https://github.com/w1oves/Rein) | N/A |
| 15 | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | [Paper](https://arxiv.org/abs/2311.15537) | [Code](https://github.com/xb534/SED) | N/A |

#### Instance Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 16 | N/A | N/A | N/A | N/A |

#### Panoptic Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 17 | N/A | N/A | N/A | N/A |

#### Medical Image

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 18 | Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology | [Paper](https://arxiv.org/abs/2402.17228) | [Code](https://github.com/DearCaat/RRT-MIL) | N/A |
| 19 | VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis | [Paper](https://arxiv.org/abs/2402.17300) | [Code](https://github.com/Luffy03/VoCo) | N/A |
| 20 | ChAda-ViT : Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images | [Paper](https://arxiv.org/abs/2311.15264) | [Code](https://github.com/nicoboou/chada_vit) | N/A |

#### Medical Image Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 21 | N/A | N/A | N/A | N/A |

#### Video Object Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 22 | N/A | N/A | N/A | N/A |

#### Video Instance Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 23 | N/A | N/A | N/A | N/A |

#### Referring Image Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 24 | N/A | N/A | N/A | N/A |

#### Image Matting

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 25 | N/A | N/A | N/A | N/A |

#### Image Editing

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 26 | Edit One for All: Interactive Batch Image Editing | [Paper](https://arxiv.org/abs/2401.10219) | [Code](https://github.com/thaoshibe/edit-one-for-all) | [Homepage](https://thaoshibe.github.io/edit-one-for-all) |

#### Low-level Vision

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 27 | Residual Denoising Diffusion Models | [Paper](https://arxiv.org/abs/2308.13712) | [Code](https://github.com/nachifur/RDDM) | N/A |
| 28 | Boosting Image Restoration via Priors from Pre-trained Models | [Paper](https://arxiv.org/abs/2403.06793) | N/A | N/A |

#### Super-Resolution

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 29 | SeD: Semantic-Aware Discriminator for Image Super-Resolution | [Paper](https://arxiv.org/abs/2402.19387) | [Code](https://github.com/lbc12345/SeD) | N/A |
| 30 | APISR: Anime Production Inspired Real-World Anime Super-Resolution | [Paper](https://arxiv.org/abs/2403.01598) | `https://github.com/Kiter…` | |

#### Denoising

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 31 | Residual Denoising Diffusion Models | [Paper](https://arxiv.org/abs/2308.13712) | [Code](https://github.com/nachifur/RDDM) | N/A |

#### Deblur

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 32 | N/A | N/A | N/A | N/A |

#### Autonomous Driving

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 33 | UniPAD: A Universal Pre-training Paradigm for Autonomous Driving | [Paper](https://arxiv.org/abs/2310.08370) | [Code](https://github.com/Nightmare-n/UniPAD) | N/A |
| 34 | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | [Paper](https://arxiv.org/abs/2311.17663) | [Code](https://github.com/haomo-ai/Cam4DOcc) | N/A |
| 35 | Memory-based Adapters for Online 3D Scene Perception | [Paper](https://arxiv.org/abs/2403.06974) | [Code](https://github.com/xuxw98/Online3D) | N/A |
| 36 | Symphonize 3D Semantic Scene Completion with Contextual Instance Queries | [Paper](https://arxiv.org/abs/2306.15670) | [Code](https://github.com/hustvl/Symphonies) | N/A |
| 37 | A Real-world Large-scale Dataset for Roadside Cooperative Perception | [Paper](https://arxiv.org/abs/2403.10145) | [Code](https://github.com/AIR-THU/DAIR-RCooper) | N/A |
| 38 | Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | [Paper](https://arxiv.org/abs/2403.07535) | [Code](https://github.com/Junda24/AFNet) | N/A |

#### 3D Point Cloud

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 40 | N/A | N/A | N/A | N/A |

#### 3D Object Detection

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 41 | PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection | [Paper](https://arxiv.org/abs/2312.08371) | [Code](https://github.com/kuanchihhuang/PTT) | N/A |
| 42 | UniMODE: Unified Monocular 3D Object Detection | [Paper](https://arxiv.org/abs/2402.18573) | N/A | N/A |

#### 3D Semantic Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 43 | N/A | N/A | N/A | N/A |

#### 3D Object Tracking

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 44 | N/A | N/A | N/A | N/A |

#### 3D Semantic Scene Completion

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 45 | Symphonize 3D Semantic Scene Completion with Contextual Instance Queries | [Paper](https://arxiv.org/abs/2306.15670) | [Code](https://github.com/hustvl/Symphonies) | N/A |

#### 3D Registration

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 46 | N/A | N/A | N/A | N/A |

#### 3D Human Pose Estimation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 47 | Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation | [Paper](https://arxiv.org/abs/2311.12028) | [Code](https://github.com/NationalGAILab/HoT) | N/A |

#### 3D Human Mesh Estimation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 48 | N/A | N/A | N/A | N/A |

#### Image Generation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 52 | InstanceDiffusion: Instance-level Control for Image Generation | [Paper](https://arxiv.org/abs/2402.03290) | [Code](https://github.com/frank-xwang/InstanceDiffusion) | [Homepage](https://people.eecs.berkeley.edu/~xdwang/projects/InstDiff/) |
| 53 | ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations | [Paper](https://arxiv.org/abs/2312.04655) | [Code](https://github.com/eclipse-t2i/eclipse-inference) | [Homepage](https://eclipse-t2i.vercel.app/) |
| 54 | Instruct-Imagen: Image Generation with Multi-modal Instruction | [Paper](https://arxiv.org/abs/2401.01952) | N/A | N/A |
| 55 | UniGS: Unified Representation for Image Generation and Segmentation | [Paper](https://arxiv.org/abs/2312.01985) | N/A | N/A |
| 56 | Multi-Instance Generation Controller for Text-to-Image Synthesis | [Paper](https://arxiv.org/abs/2402.05408) | [Code](https://github.com/limuloo/migc) | N/A |
| 57 | SVGDreamer: Text Guided SVG Generation with Diffusion Model | [Paper](https://arxiv.org/abs/2312.16476) | [Code](https://ximinng.github.io/SVGDreamer-project/) | N/A |
| 58 | InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model | [Paper](https://arxiv.org/abs/2312.05849) | [Code](https://github.com/jiuntian/interactdiffusion) | N/A |
| 59 | Ranni: Taming Text-to-Image Diffusion for Accurate Prompt Following | [Paper](https://arxiv.org/abs/2311.17002) | [Code](https://github.com/ali-vilab/Ranni) | N/A |

#### Video Generation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 60 | Vlogger: Make Your Dream A Vlog | [Paper](https://arxiv.org/abs/2401.09414) | [Code](https://github.com/Vchitect/Vlogger) | N/A |
| 61 | VBench: Comprehensive Benchmark Suite for Video Generative Models | [Paper](https://arxiv.org/abs/2311.17982) | [Code](https://github.com/Vchitect/VBench) | [Homepage](https://vchitect.github.io/VBench-project/) |
| 62 | VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models | [Paper](https://arxiv.org/abs/2312.00845) | [Code](https://github.com/HyeonHo99/Video-Motion-Customization) | [Homepage](https://github.com/HyeonHo99/Video-Motion-Customization) |

#### Vision Transformer

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 63 | TransNeXt: Robust Foveal Visual Perception for Vision Transformers | [Paper](https://arxiv.org/abs/2311.17132) | [Code](https://github.com/DaiShiResearch/TransNeXt) | N/A |
| 64 | RepViT: Revisiting Mobile CNN From ViT Perspective | [Paper](https://arxiv.org/abs/2307.09283) | [Code](https://github.com/THU-MIG/RepViT) | N/A |
| 65 | A General and Efficient Training for Transformer via Token Expansion | [Paper](https://arxiv.org/abs/2404.00672) | [Code](https://github.com/Osilly/TokenExpansion) | N/A |

#### Vision-Language

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ------------- |
| 66    | PromptKD: Unsupervised Prompt Distillation for Vision-Language Models | [Paper](https://arxiv.org/abs/2403.02781) | [Code](https://github.com/zhengli97/PromptKD)                | N/A           |
| 67    | FairCLIP: Harnessing Fairness in Vision-Language Learning    | [Paper](https://arxiv.org/abs/2403.19949) | [Code](https://github.com/Harvard-Ophthalmology-AI-Lab/FairCLIP) | N/A           |

#### Object Detection

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ------------- |
| 68    | DETRs Beat YOLOs on Real-time Object Detection               | [Paper](https://arxiv.org/abs/2304.08069) | [Code](https://github.com/lyuwenyu/RT-DETR)                  | N/A           |
| 69    | Boosting Object Detection with Zero-Shot Day-Night Domain Adaptation | [Paper](https://arxiv.org/abs/2312.01220) | [Code](https://github.com/ZPDu/Boosting-Object-Detection-with-Zero-Shot-Day-Night-Domain-Adaptation) | N/A           |
| 70    | YOLO-World: Real-Time Open-Vocabulary Object Detection       | [Paper](https://arxiv.org/abs/2401.17270) | [Code](https://github.com/AILab-CVC/YOLO-World)              | N/A           |
| 71    | Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement | [Paper](https://arxiv.org/abs/2403.16131) | [Code](https://github.com/xiuqhou/Salience-DETR)             | N/A           |

#### Anomaly Detection

| Index | Paper Title                                                  | Paper Link                                | Code                                    | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------------- | ------------- |
| 72    | Anomaly Heterogeneity Learning for Open-set Supervised Anomaly Detection | [Paper](https://arxiv.org/abs/2310.12790) | [Code](https://github.com/mala-lab/AHL) | N/A           |

#### Object Tracking

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ------------- |
| 73    | Delving into the Trajectory Long-tail Distribution for Multi-object Tracking | [Paper](https://arxiv.org/abs/2403.04700) | [Code](https://github.com/chen-si-jia/Trajectory-Long-tail-Distribution-for-MOT) | N/A           |

#### Semantic Segmentation

| Index | Paper Title                                                  | Paper Link                                | Code                                   | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | -------------------------------------- | ------------- |
| 74    | Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation | [Paper](https://arxiv.org/abs/2312.04265) | [Code](https://github.com/w1oves/Rein) | N/A           |
| 75    | SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation | [Paper](https://arxiv.org/abs/2311.15537) | [Code](https://github.com/xb534/SED)   | N/A           |

#### Medical Image

| Index | Paper Title                                                  | Paper Link                                | Code                                          | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------------------- | ------------- |
| 76    | Feature Re-Embedding: Towards Foundation Model-Level Performance in Computational Pathology | [Paper](https://arxiv.org/abs/2402.17228) | [Code](https://github.com/DearCaat/RRT-MIL)   | N/A           |
| 77    | VoCo: A Simple-yet-Effective Volume Contrastive Learning Framework for 3D Medical Image Analysis | [Paper](https://arxiv.org/abs/2402.17300) | [Code](https://github.com/Luffy03/VoCo)       | N/A           |
| 78    | ChAda-ViT: Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images | [Paper](https://arxiv.org/abs/2311.15264) | [Code](https://github.com/nicoboou/chada_vit) | N/A           |

#### Medical Image Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 76    | N/A         | N/A        | N/A  | N/A           |

#### Autonomous Driving

| Index | Paper Title                                                  | Paper Link                                    | Code                                            | Official Repo |
| ----- | ------------------------------------------------------------ | --------------------------------------------- | ----------------------------------------------- | ------------- |
| 77    | UniPAD: A Universal Pre-training Paradigm for Autonomous Driving | [Paper](https://arxiv.org/abs/2310.08370)     | [Code](https://github.com/Nightmare-n/UniPAD)   | N/A           |
| 78    | Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications | [Paper](https://arxiv.org/abs/2311.17663)     | [Code](https://github.com/haomo-ai/Cam4DOcc)    | N/A           |
| 79    | Memory-based Adapters for Online 3D Scene Perception         | [Paper](https://arxiv.org/abs/2403.06974)     | [Code](https://github.com/xuxw98/Online3D)      | N/A           |
| 80    | Symphonize 3D Semantic Scene Completion with Contextual Instance Queries | [Paper](https://arxiv.org/abs/2306.15670)     | [Code](https://github.com/hustvl/Symphonies)    | N/A           |
| 81    | A Real-world Large-scale Dataset for Roadside Cooperative Perception | [Paper](https://arxiv.org/abs/2403.10145)     | [Code](https://github.com/AIR-THU/DAIR-RCooper) | N/A           |
| 82    | Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving | [Paper](https://arxiv.org/abs/2403.07535)     | [Code](https://github.com/Junda24/AFNet)        | N/A           |
| 83    | Traffic Scene Parsing through the TSP6K Dataset              | [Paper](https://arxiv.org/pdf/2303.02835.pdf) | [Code](https://github.com/PengtaoJiang/TSP6K)   | N/A           |

#### 3D Point Cloud

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 84    | N/A         | N/A        | N/A  | N/A           |

#### 3D Object Detection

| Index | Paper Title                                                  | Paper Link                                | Code                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | -------------------------------------------- | ------------- |
| 85    | PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection | [Paper](https://arxiv.org/abs/2312.08371) | [Code](https://github.com/kuanchihhuang/PTT) | N/A           |
| 86    | UniMODE: Unified Monocular 3D Object Detection               | [Paper](https://arxiv.org/abs/2402.18573) | N/A                                          | N/A           |

#### 3D Semantic Segmentation

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 87    | N/A         | N/A        | N/A  | N/A           |

#### Image Editing

| Index | Paper Title                                       | Paper Link                                | Code                                                  | Official Repo                                            |
| ----- | ------------------------------------------------- | ----------------------------------------- | ----------------------------------------------------- | -------------------------------------------------------- |
| 88    | Edit One for All: Interactive Batch Image Editing | [Paper](https://arxiv.org/abs/2401.10219) | [Code](https://github.com/thaoshibe/edit-one-for-all) | [Homepage](https://thaoshibe.github.io/edit-one-for-all) |

#### Video Editing

| Index | Paper Title                                                  | Paper Link                                | Code | Official Repo                         |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ---- | ------------------------------------- |
| 89    | MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers | [Paper](https://arxiv.org/abs/2312.12468) | N/A  | [Homepage](https://maskint.github.io) |

#### Low-level Vision

| Index | Paper Title                                                  | Paper Link                                | Code                                     | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ---------------------------------------- | ------------- |
| 90    | Residual Denoising Diffusion Models                          | [Paper](https://arxiv.org/abs/2308.13712) | [Code](https://github.com/nachifur/RDDM) | N/A           |
| 91    | Boosting Image Restoration via Priors from Pre-trained Models | [Paper](https://arxiv.org/abs/2403.06793) | N/A                                      | N/A           |

#### Super-Resolution

| Index | Paper Title                                                  | Paper Link                                | Code                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | -------------------------------------------- | ------------- |
| 92    | SeD: Semantic-Aware Discriminator for Image Super-Resolution | [Paper](https://arxiv.org/abs/2402.19387) | [Code](https://github.com/lbc12345/SeD)      | N/A           |
| 93    | APISR: Anime Production Inspired Real-World Anime Super-Resolution | [Paper](https://arxiv.org/abs/2403.01598) | [Code](https://github.com/Kiteretsu77/APISR) | N/A           |

#### Denoising

| Index | Paper Title | Paper Link | Code | Official Repo |
| ----- | ----------- | ---------- | ---- | ------------- |
| 94    | N/A         | N/A        | N/A  | N/A           |

#### 3D Human Pose Estimation

| Index | Paper Title                                                  | Paper Link                                | Code                                          | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------------------- | ------------- |
| 95    | Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation | [Paper](https://arxiv.org/abs/2311.12028) | [Code](https://github.com/NationalGAILab/HoT) | N/A           |

#### Image Generation

| Index | Paper Title                                                  | Paper Link                                | Code                                                     | Official Repo                                                |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | -------------------------------------------------------- | ------------------------------------------------------------ |
| 96    | InstanceDiffusion: Instance-level Control for Image Generation | [Paper](https://arxiv.org/abs/2402.03290) | [Code](https://github.com/frank-xwang/InstanceDiffusion) | [Homepage](https://people.eecs.berkeley.edu/~xdwang/projects/InstDiff/) |
| 97    | ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations | [Paper](https://arxiv.org/abs/2312.04655) | [Code](https://github.com/eclipse-t2i/eclipse-inference) | [Homepage](https://eclipse-t2i.vercel.app/)                  |
| 98    | Instruct-Imagen: Image Generation with Multi-modal Instruction | [Paper](https://arxiv.org/abs/2401.01952) | N/A                                                      | N/A                                                          |
| 99    | Residual Denoising Diffusion Models                          | [Paper](https://arxiv.org/abs/2308.13712) | [Code](https://github.com/nachifur/RDDM)                 | N/A                                                          |
| 100   | UniGS: Unified Representation for Image Generation and Segmentation | [Paper](https://arxiv.org/abs/2312.01985) | N/A                                                      | N/A                                                          |
| 101   | Multi-Instance Generation Controller for Text-to-Image Synthesis | [Paper](https://arxiv.org/abs/2402.05408) | [Code](https://github.com/limuloo/migc)                  | N/A                                                          |
| 102   | SVGDreamer: Text Guided SVG Generation with Diffusion Model  | [Paper](https://arxiv.org/abs/2312.16476) | [Code](https://ximinng.github.io/SVGDreamer-project/)    | N/A                                                          |
| 103   | InteractDiffusion: Interaction-Control for Text-to-Image Diffusion Model | [Paper](https://arxiv.org/abs/2312.05849) | [Code](https://github.com/jiuntian/interactdiffusion)    | N/A                                                          |
| 104   | Ranni: Taming Text-to-Image Diffusion for Accurate Prompt Following | [Paper](https://arxiv.org/abs/2311.17002) | [Code](https://github.com/ali-vilab/Ranni)               | N/A                                                          |

#### Video Generation

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo                                             |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | --------------------------------------------------------- |
| 105   | Vlogger: Make Your Dream A Vlog                              | [Paper](https://arxiv.org/abs/2401.09414) | [Code](https://github.com/Vchitect/Vlogger)                  | N/A                                                       |
| 106   | VBench: Comprehensive Benchmark Suite for Video Generative Models | [Paper](https://arxiv.org/abs/2311.17982) | [Code](https://github.com/Vchitect/VBench)                   | [Homepage](https://vchitect.github.io/VBench-project/)    |
| 107   | VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models | [Paper](https://arxiv.org/abs/2312.00845) | [Code](https://github.com/HyeonHo99/Video-Motion-Customization) | [Homepage](https://video-motion-customization.github.io/) |

#### 3D Generation

| Index | Paper Title                                                  | Paper Link                                | Code                                                      | Official Repo                                           |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | --------------------------------------------------------- | ------------------------------------------------------- |
| 108   | CityDreamer: Compositional Generative Model of Unbounded 3D Cities | [Paper](https://arxiv.org/abs/2309.00610) | [Code](https://github.com/hzxie/city-dreamer)             | [Homepage](https://haozhexie.com/project/city-dreamer/) |
| 109   | LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching | [Paper](https://arxiv.org/abs/2311.11284) | [Code](https://github.com/EnVision-Research/LucidDreamer) | N/A                                                     |

#### Video Understanding

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ------------- |
| 110   | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | [Paper](https://arxiv.org/abs/2311.17005) | [Code](https://github.com/OpenGVLab/Ask-Anything/tree/main/video_chat2) | N/A           |

#### Knowledge Distillation

| Index | Paper Title                                          | Paper Link                                | Code                                                         | Official Repo |
| ----- | ---------------------------------------------------- | ----------------------------------------- | ------------------------------------------------------------ | ------------- |
| 111   | Logit Standardization in Knowledge Distillation      | [Paper](https://arxiv.org/abs/2403.01427) | [Code](https://github.com/sunshangquan/logit-standardization-KD) | N/A           |
| 112   | Efficient Dataset Distillation via Minimax Diffusion | [Paper](https://arxiv.org/abs/2311.15529) | [Code](https://github.com/vimar-gu/MinimaxDiffusion)         | N/A           |

#### Stereo Matching

| Index | Paper Title                                    | Paper Link                                | Code                                       | Official Repo |
| ----- | ---------------------------------------------- | ----------------------------------------- | ------------------------------------------ | ------------- |
| 113   | Neural Markov Random Field for Stereo Matching | [Paper](https://arxiv.org/abs/2403.11193) | [Code](https://github.com/aeolusguan/NMRF) | N/A           |

#### Scene Graph Generation

| Index | Paper Title                                                  | Paper Link                                | Code                                           | Official Repo                                      |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ---------------------------------------------- | -------------------------------------------------- |
| 114   | HiKER-SGG: Hierarchical Knowledge Enhanced Robust Scene Graph Generation | [Paper](https://arxiv.org/abs/2403.12033) | [Code](https://github.com/zhangce01/HiKER-SGG) | [Homepage](https://zhangce01.github.io/HiKER-SGG/) |

#### Video Quality Assessment

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo                                         |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ----------------------------------------------------- |
| 115   | KVQ: Kaleidoscope Video Quality Assessment for Short-form Videos | [Paper](https://arxiv.org/abs/2402.07220) | [Code](https://github.com/lixinustc/KVQ-Challenge-CVPR-NTIRE2024) | [Homepage](https://lixinustc.github.io/projects/KVQ/) |

#### Datasets

| Index | Paper Title                                                  | Paper Link                                    | Code                                            | Official Repo |
| ----- | ------------------------------------------------------------ | --------------------------------------------- | ----------------------------------------------- | ------------- |
| 116   | A Real-world Large-scale Dataset for Roadside Cooperative Perception | [Paper](https://arxiv.org/abs/2403.10145)     | [Code](https://github.com/AIR-THU/DAIR-RCooper) | N/A           |
| 117   | Traffic Scene Parsing through the TSP6K Dataset              | [Paper](https://arxiv.org/pdf/2303.02835.pdf) | [Code](https://github.com/PengtaoJiang/TSP6K)   | N/A           |

#### Others

| Index | Paper Title                                                  | Paper Link                                | Code                                                         | Official Repo                                                |
| ----- | ------------------------------------------------------------ | ----------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
| 118   | Object Recognition as Next Token Prediction                  | [Paper](https://arxiv.org/abs/2312.02142) | [Code](https://github.com/kaiyuyue/nxtp)                     | N/A                                                          |
| 119   | ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks | [Paper](https://arxiv.org/abs/2306.14525) | [Code](https://parameternet.github.io/)                      | N/A                                                          |
| 120   | Seamless Human Motion Composition with Blended Positional Encodings | [Paper](https://arxiv.org/abs/2402.15509) | [Code](https://github.com/BarqueroGerman/FlowMDM)            | N/A                                                          |
| 121   | LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding, Reasoning, and Planning | [Paper](https://arxiv.org/abs/2311.18651) | [Code](https://github.com/Open3DA/LL3DA)                     | [Homepage](https://ll3da.github.io/)                         |
| 122   | CLOVA: A Closed-LOop Visual Assistant with Tool Usage and Update | [Paper](https://arxiv.org/abs/2312.10908) | N/A                                                          | [Homepage](https://clova-tool.github.io/)                    |
| 123   | MoMask: Generative Masked Modeling of 3D Human Motions       | [Paper](https://arxiv.org/abs/2312.00063) | [Code](https://github.com/EricGuo5513/momask-codes)          | N/A                                                          |
| 124   | Amodal Ground Truth and Completion in the Wild               | [Paper](https://arxiv.org/abs/2312.17247) | [Code](https://github.com/Championchess/Amodal-Completion-in-the-Wild) | [Homepage](https://www.robots.ox.ac.uk/~vgg/research/amodal/) |
| 125   | Improved Visual Grounding through Self-Consistent Explanations | [Paper](https://arxiv.org/abs/2312.04554) | [Code](https://github.com/uvavision/SelfEQ)                  | N/A                                                          |
| 126   | ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object | [Paper](https://arxiv.org/abs/2403.18775) | [Code](https://github.com/chenshuang-zhang/imagenet_d)       | [Homepage](https://chenshuang-zhang.github.io/imagenet_d/)   |
| 127   | Learning from Synthetic Human Group Activities               | [Paper](https://arxiv.org/abs/2306.16772) | [Code](https://github.com/cjerry1243/M3Act)                  | [Homepage](https://cjerry1243.github.io/M3Act/)              |
| 128   | A Cross-Subject Brain Decoding Framework                     | [Paper](https://arxiv.org/abs/2404.07850) | [Code](https://github.com/littlepure2333/MindBridge)         | [Homepage](https://littlepure2333.github.io/MindBridge/)     |
| 129   | Multi-Task Dense Prediction via Mixture of Low-Rank Experts  | [Paper](https://arxiv.org/abs/2403.17749) | [Code](https://github.com/YuqiYang213/MLoRE)                 | N/A                                                          |
| 130   | Contrastive Mean-Shift Learning for Generalized Category Discovery | [Paper](https://arxiv.org/abs/2404.09451) | [Code](https://github.com/sua-choi/CMS)                      | [Homepage](https://postech-cvlab.github.io/cms/)             |

#### Thank you for Reading