An open API service indexing awesome lists of open source software.

Computer vision

Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding of digital images and videos.

https://github.com/andyzeng/3dmatch-toolbox

3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.

3d 3d-deep-learning 3dmatch artificial-intelligence computer-vision deep-learning geometry-processing point-cloud rgbd vision

Last synced: 16 May 2025

https://github.com/gigwegbe/tinyml-papers-and-projects

This is a list of interesting papers and projects about TinyML.

computer-vision embedded-systems machine-learning neural-architecture-search tinyml wake-word

Last synced: 11 May 2025

https://github.com/hustvl/YOLOS

[NeurIPS 2021] You Only Look at One Sequence

computer-vision object-detection transformer vision-transformer

Last synced: 20 Apr 2025

https://github.com/hustvl/yolos

[NeurIPS 2021] You Only Look at One Sequence

computer-vision object-detection transformer vision-transformer

Last synced: 13 Apr 2025

https://github.com/airctic/icevision

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

ai annotation-parsers coco-dataset coco-parser computer-vision deep-learning effecientdet fastai faster-rcnn mask-rcnn object-detection pycocotools python pytorch pytorch-lightning tutorials voc-dataset voc-parser

Last synced: 14 May 2025

https://github.com/FORTH-ModelBasedTracker/MocapNET

We present MocapNET, a real-time method that estimates the 3D human pose directly in the popular Bio Vision Hierarchy (BVH) format, given estimations of the 2D body joints originating from monocular color images. Our contributions include: (a) A novel and compact 2D pose NSRM representation. (b) A human body orientation classifier and an ensemble of orientation-tuned neural networks that regress the 3D human pose by also allowing for the decomposition of the body to an upper and lower kinematic hierarchy. This permits the recovery of the human pose even in the case of significant occlusions. (c) An efficient Inverse Kinematics solver that refines the neural-network-based solution, providing 3D human pose estimations that are consistent with the limb sizes of a target person (if known). All of the above yield a 33% accuracy improvement on the Human 3.6 Million (H3.6M) dataset compared to the baseline method (MocapNET) while maintaining real-time performance.

2d-to-3d 3d-animation 3d-pose-estimation bvh bvh-format computer-vision demo ensemble gesture-recognition mocap neural-network pose-estimation real-time rgb-images tensorflow webcam

Last synced: 05 Apr 2025

https://github.com/zhmiao/OpenLongTailRecognition-OLTR

PyTorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)

computer-vision cvpr2019 deep-learning long-tail oltr open-long-tail-recognition open-set pytorch-implementation

Last synced: 26 Mar 2025

https://github.com/zhmiao/openlongtailrecognition-oltr

PyTorch implementation for "Large-Scale Long-Tailed Recognition in an Open World" (CVPR 2019 ORAL)

computer-vision cvpr2019 deep-learning long-tail oltr open-long-tail-recognition open-set pytorch-implementation

Last synced: 13 Apr 2025

https://github.com/wasidennis/AdaptSegNet

Learning to Adapt Structured Output Space for Semantic Segmentation, CVPR 2018 (spotlight)

adversarial-learning computer-vision deep-learning domain-adaptation generative-adversarial-network pytorch semantic-segmentation

Last synced: 13 Jul 2025

https://github.com/rust-cv/cv

Rust CV mono-repo. Contains pure-Rust dependencies which attempt to encapsulate the capability of OpenCV, OpenMVG, and vSLAM frameworks in a cohesive set of APIs.

algorithms computer-vision crates rust-cv

Last synced: 19 Apr 2025

https://github.com/hkchengrex/cutie

[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation

computer-vision cvpr2024 deep-learning pytorch segmentation video-editing video-object-segmentation video-segmentation

Last synced: 15 May 2025

https://github.com/Dovyski/cvui

A (very) simple UI lib built on top of OpenCV drawing primitives

computer-vision cpp gui imgui opencv opencv-drawing-primitives python ui

Last synced: 15 Mar 2025

https://github.com/dovyski/cvui

A (very) simple UI lib built on top of OpenCV drawing primitives

computer-vision cpp gui imgui opencv opencv-drawing-primitives python ui

Last synced: 08 Apr 2025

https://github.com/VladimirYugay/Gaussian-SLAM

Gaussian-SLAM: Photo-realistic Dense SLAM with Gaussian Splatting

3d-reconstruction computer-vision gaussian-splatting robotics slam

Last synced: 20 Mar 2025

https://github.com/haosulab/ManiSkill

SAPIEN Manipulation Skill Framework, a GPU parallelized robotics simulator and benchmark

3d-computer-vision computer-vision embodied-ai reinforcement-learning robot-learning robot-manipulation robotics robotics-simulation simulation-environment

Last synced: 05 May 2025

https://github.com/nettlep/magic

Scanner for decks of cards with bar codes printed on card edges

computer-vision magic swift

Last synced: 11 Mar 2026

https://github.com/Machine-Learning-Tokyo/papers-with-annotations

Research papers with annotations, illustrations and explanations

computer-vision deep-learning machine-learning

Last synced: 23 Apr 2025

https://github.com/machine-learning-tokyo/papers-with-annotations

Research papers with annotations, illustrations and explanations

computer-vision deep-learning machine-learning

Last synced: 22 Feb 2026

https://github.com/torchgeo/terratorch

A Python toolkit for fine-tuning Geospatial Foundation Models (GFMs).

ai4good ai4science computer-vision deep-learning earth-observation foundation-models geospatial solar-physics weather-models

Last synced: 11 Jun 2026

https://github.com/SimonVandenhende/Multi-Task-Learning-PyTorch

PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).

computer-vision eccv2020 multi-task-learning nyud pascal pytorch scene-understanding segmentation

Last synced: 06 May 2025

https://github.com/simonvandenhende/multi-task-learning-pytorch

PyTorch implementation of multi-task learning architectures, incl. MTI-Net (ECCV2020).

computer-vision eccv2020 multi-task-learning nyud pascal pytorch scene-understanding segmentation

Last synced: 04 Apr 2025

https://github.com/patrikhuber/4dface

Real-time 3D face tracking and reconstruction from 2D video

computer-vision face-models modern-cpp video-processing

Last synced: 04 Apr 2025

https://github.com/megvii-research/IJCAI2023-CoNR

IJCAI2023 - Collaborative Neural Rendering using Anime Character Sheets

anime computer-graphics computer-vision deep-learning pytorch

Last synced: 14 Apr 2025

https://github.com/baegwangbin/dsine

[CVPR 2024 Oral] Rethinking Inductive Biases for Surface Normal Estimation

3d-from-images 3d-reconstruction computer-vision cvpr2024 deep-learning surface-normal surface-normals surface-normals-estimation

Last synced: 13 Apr 2025

https://github.com/ziqi-jin/finetune-anything

Fine-tune SAM (Segment Anything Model) for computer vision tasks such as semantic segmentation, matting, detection ... in specific scenarios

computer-vision deep-learning fine-tune segment-anything

Last synced: 06 May 2025

https://github.com/vandit15/Class-balanced-loss-pytorch

PyTorch implementation of the paper "Class-Balanced Loss Based on Effective Number of Samples"

computer-vision cvpr2019 deep-learning loss-functions pytorch

Last synced: 08 May 2025

https://github.com/noahcao/OC_SORT

[CVPR2023] The official repo for OC-SORT: Observation-Centric SORT on video Multi-Object Tracking. OC-SORT is simple, online and robust to occlusion/non-linear motion.

computer-vision deep-learning object-detection object-tracking tracking

Last synced: 20 Mar 2025

https://github.com/vacancy/PreciseRoIPooling

Precise RoI Pooling with coordinate gradient support, proposed in the paper "Acquisition of Localization Confidence for Accurate Object Detection" (https://arxiv.org/abs/1807.11590).

computer-vision object-detection

Last synced: 04 May 2025

https://github.com/fregu856/deeplabv3

PyTorch implementation of DeepLabV3, trained on the Cityscapes dataset.

autonomous-driving computer-vision deep-learning machine-learning pytorch semantic-segmentation

Last synced: 19 Jul 2025

https://github.com/microsoft/CameraTraps

PyTorch Wildlife: a Collaborative Deep Learning Framework for Conservation.

camera-traps computer-vision conservation machine-learning megadetector pytorch pytorch-wildlife wildlife

Last synced: 20 Jul 2025

https://github.com/Jumpat/SegmentAnythingin3D

Segment Anything in 3D with NeRFs (NeurIPS 2023)

3d 3d-segmentation computer-vision deep-learning nerf segment-anything segmentation

Last synced: 20 Mar 2025

https://github.com/lagadic/visp

Open Source Visual Servoing Platform

c-plus-plus computer-vision visp visual-servoing

Last synced: 16 May 2025

https://github.com/onetaken/awesome_deep_learning_interpretability

Highly cited and top-conference papers from recent years on the interpretability of neural network models in deep learning (with code)

awesome awesome-list chainer computer-vision cvpr deep-learning eccv iccv iclr icml interpretability keras matlab neural-network neurips nlp papers pytorch tensorflow torch

Last synced: 30 Dec 2025

https://github.com/dog-qiuqiu/FastestDet

:zap: A newly designed ultra-lightweight, anchor-free target detection algorithm with only 250K parameters; it reduces time consumption by 10% compared with yolo-fastest, and the post-processing is simpler

computer-vision deep-learning object-detection

Last synced: 14 Mar 2025

https://github.com/dog-qiuqiu/fastestdet

:zap: A newly designed ultra-lightweight, anchor-free target detection algorithm with only 250K parameters; it reduces time consumption by 10% compared with yolo-fastest, and the post-processing is simpler

computer-vision deep-learning object-detection

Last synced: 04 Apr 2025

https://github.com/hustvl/gaussiandreamer

[CVPR 2024] GaussianDreamer: Fast Generation from Text to 3D Gaussians by Bridging 2D and 3D Diffusion Models

aigc computer-vision cvpr2024 diffusion-models dreamfusion gaussian-splatting nerf radiance-field smpl text-to-3d

Last synced: 15 May 2025

https://github.com/johnolafenwa/DeepStack

The World's Leading Cross Platform AI Engine for Edge Devices

ai-engine computer-vision deepstack face-detection face-recognition object-detection scene-recognition

Last synced: 06 Apr 2025

https://github.com/adamdad/kat

[ICLR2025] Kolmogorov-Arnold Transformer

computer-vision kan kolmogorov-arnold-networks transformer

Last synced: 15 May 2025

https://github.com/nianticlabs/acezero

[ECCV 2024 - Oral] ACE0 is a learning-based structure-from-motion approach that estimates camera parameters of sets of images by learning a multi-view consistent, implicit scene representation.

3d-reconstruction camera-relocalization computer-vision eccv eccv2024 machine-learning pose-estimation sfm structure-from-motion visual-relocalization

Last synced: 15 May 2025

https://github.com/johnolafenwa/deepstack

The World's Leading Cross Platform AI Engine for Edge Devices

ai-engine computer-vision deepstack face-detection face-recognition object-detection scene-recognition

Last synced: 12 Apr 2025

https://github.com/amusi/AI-Job-Recommend

Job postings for artificial intelligence positions (including machine learning, deep learning, computer vision, and natural language processing) at companies in China (including full-time, internship, and campus recruitment roles)

artificial-intelligence computer-vision deep-learning job-search jobs machine-learning natural-language-processing

Last synced: 03 Aug 2025

https://github.com/AntonioTepsich/Convolutional-KANs

This project extends the architecture of Kolmogorov-Arnold Networks (KAN) to convolutional layers, replacing the classic linear transformation of the convolution with learnable nonlinear activations at each pixel.

cnn computer-vision deep-learning

Last synced: 01 Aug 2025

https://github.com/xtreme1-io/xtreme1

Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.

3d-annotation annotation annotation-tool computer-vision image-annotation image-classification image-labelling-tool labeling-tool multimodal point-cloud rlhf

Last synced: 20 Mar 2025

https://github.com/skalskip/top-cvpr-2025-papers

This repository is a curated collection of the most exciting and influential CVPR 2025 papers. 🔥 [Paper + Code + Demo]

computer-vision cvpr cvpr2025 image-segmentation multimodal object-detection paper transformers vision-and-language vision-language-model

Last synced: 08 Aug 2025

https://github.com/quic/sense

Enhance your application with the ability to see and interact with humans using any RGB camera.

activity-recognition calorie-estimation computer-vision deep-learning fitness-app gesture-recognition neural-networks pytorch video

Last synced: 13 Jul 2025

https://github.com/bgshih/aster

Recognizing cropped text in natural images.

computer-vision ocr recognition scene-text

Last synced: 13 Apr 2025

https://github.com/onepanelio/onepanel

The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

ai aiops annotation computer-vision deeplearning etl hyperparameter-tuning inference jupyterlab labeling machinelearning mlops pipelines pytorch tensorboard tensorflow training workflows

Last synced: 05 May 2026

https://github.com/ahangchen/torch_base

Quickly bring up your PyTorch project (a skeleton)

computer-vision deep-learning machine-learning pytorch

Last synced: 04 Apr 2025

https://github.com/Guanghan/lighttrack

LightTrack: A Generic Framework for Online Top-Down Human Pose Tracking

artificial-intelligence computer-vision pose-estimation pose-tracking visual-object-tracking

Last synced: 19 Jul 2025

https://github.com/rnchg/APT

AI Productivity Tool - Free and open source, improve user productivity, and protect privacy and data security. Including but not limited to: built-in local exclusive ChatGPT, DeepSeek, Phi, Qwen and other models, one-click batch intelligent processing of pictures, videos, audio, etc.

ai ai-framework aigc audio-processing chatgpt computer-vision deep-learning deepseek generative-ai image-processing inference llm machine-learning machinelearning neural-network onnx onnxruntime video-processing

Last synced: 14 Aug 2025

https://github.com/adamspannbauer/python_video_stab

A Python package to stabilize videos using OpenCV

computer-vision opencv python video video-stabilization

Last synced: 08 Apr 2025

https://github.com/hkchengrex/Cutie

[CVPR 2024 Highlight] Putting the Object Back Into Video Object Segmentation

computer-vision cvpr2024 deep-learning pytorch segmentation video-editing video-object-segmentation video-segmentation

Last synced: 17 Apr 2025

https://github.com/mhamilton723/stego

Unsupervised Semantic Segmentation by Distilling Feature Correspondences

computer-vision deep-learning iclr2022 pytorch semantic-segmentation unsupervised-learning

Last synced: 04 Apr 2025

https://github.com/mhamilton723/STEGO

Unsupervised Semantic Segmentation by Distilling Feature Correspondences

computer-vision deep-learning iclr2022 pytorch semantic-segmentation unsupervised-learning

Last synced: 08 May 2025

https://github.com/MarekKowalski/FaceSwap

3D face swapping implemented in Python

3d-models computer-vision face-alignment face-swap optimization

Last synced: 12 May 2025

https://github.com/cvondrick/videogan

Generating Videos with Scene Dynamics. NIPS 2016.

computer-vision deep-learning generative-adversarial-network video

Last synced: 27 Jan 2026

https://github.com/huggingface/computer-vision-course

This repo is the homebase of a community driven course on Computer Vision with Neural Networks. Feel free to join us on the Hugging Face discord: hf.co/join/discord

computer-vision convolutional-neural-networks generative-ai neural-networks transformers

Last synced: 14 Oct 2025

https://github.com/stevenygd/PointFlow

PointFlow: 3D Point Cloud Generation with Continuous Normalizing Flows

3d-point-clouds computer-vision continuous-normalizing-flows machine-learning pytorch shapes

Last synced: 15 Jul 2025

https://github.com/gjy3035/c-3-framework

Open-source PyTorch code for crowd counting

computer-vision crowd-analysis crowd-counting deep-learning

Last synced: 12 Apr 2025

https://github.com/PeterWang512/GANSketching

Sketch Your Own GAN: Customizing a GAN model with hand-drawn sketches.

computer-graphics computer-vision deep-learning gans hci

Last synced: 04 Apr 2025

https://github.com/skalskip/top-cvpr-2024-papers

This repository is a curated collection of the most exciting and influential CVPR 2024 papers. 🔥 [Paper + Code + Demo]

computer-vision cvpr cvpr2024 image-segmentation object-detection paper transformers vision-and-language

Last synced: 04 Apr 2025

https://github.com/gjy3035/C-3-Framework

Open-source PyTorch code for crowd counting

computer-vision crowd-analysis crowd-counting deep-learning

Last synced: 11 May 2025

https://github.com/Rubikplayer/flame-fitting

Example code for the FLAME 3D head model. The code demonstrates how to sample 3D heads from the model, fit the model to 3D keypoints and 3D scans.

3d-face-alignment 3d-model 3d-reconstruction chumpy computer-graphics computer-vision face face-alignment face-model flame flame-fitting flame-model morphable-model smpl-x

Last synced: 19 Apr 2025

https://github.com/mit-spark/vggt-slam

VGGT-SLAM: Dense RGB SLAM Optimized on the SL(4) Manifold

computer-vision slam vggt vggt-slam

Last synced: 20 Feb 2026

https://github.com/niessner/VoxelHashing

[Siggraph Asia 2013] Large-Scale, Real-Time 3D Reconstruction

3d-reconstruction computer-vision kinect

Last synced: 21 Nov 2025

https://github.com/semperai/amica

Amica is an open source interface for interactive communication with 3D characters with voice synthesis and speech recognition.

ai assistant-chat-bots computer-vision llm speech-recognition tts

Last synced: 24 Mar 2025

https://github.com/ika-rwth-aachen/Cam2BEV

TensorFlow Implementation for Computing a Semantically Segmented Bird's Eye View (BEV) Image Given the Images of Multiple Vehicle-Mounted Cameras.

autonomous-vehicles birds-eye-view computer-vision deep-learning ipm machine-learning segmentation sim2real simulation

Last synced: 20 Mar 2025

https://github.com/wenhaochai/MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

computer-vision dataset large-language-models llama long-video-understanding multimodal-large-language-models

Last synced: 03 May 2026

https://github.com/swz30/mirnet

[ECCV 2020] Learning Enriched Features for Real Image Restoration and Enhancement. SOTA results for image denoising, super-resolution, and image enhancement.

attention-mechanism computer-vision eccv2020 image-denoising image-enhancement image-restoration low-level-vision multi-resolution-streams pytorch super-resolution

Last synced: 04 Apr 2025

https://github.com/swz30/MIRNet

[ECCV 2020] Learning Enriched Features for Real Image Restoration and Enhancement. SOTA results for image denoising, super-resolution, and image enhancement.

attention-mechanism computer-vision eccv2020 image-denoising image-enhancement image-restoration low-level-vision multi-resolution-streams pytorch super-resolution

Last synced: 02 Apr 2025

https://github.com/HKUST-Aerial-Robotics/DenseSurfelMapping

This is the open-source version of ICRA 2019 submission "Real-time Scalable Dense Surfel Mapping"

computer-vision drones quadrotor robotics

Last synced: 20 Apr 2025

https://github.com/visipedia/inat_comp

iNaturalist competition details

competition computer-vision dataset inaturalist

Last synced: 05 Apr 2025

https://github.com/zhangxiaosong18/FreeAnchor

FreeAnchor: Learning to Match Anchors for Visual Object Detection (NeurIPS 2019)

computer-vision freeanchor neurips-2019 object-detection one-stage pytorch

Last synced: 14 Mar 2025

Computer vision Awesome Lists

Awesome-pytorch-list 707
awesome-multimodal-ml 480
awesome-self-supervised-learning 463
Awesome-Transformer-Attention 2,160
awesome_3DReconstruction_list 169
awesome-industrial-anomaly-detection 1,102
awesome-hand-pose-estimation 497
awesome-image-classification 220
Awesome-Crowd-Counting 468
awesome-human-pose-estimation 89
awesome-autonomous-vehicles 296
Awesome-World-Model 725
Awesome-Federated-Learning 558
Awesome-FL 4,069
awesome-low-light-image-enhancement 219
Awesome-pytorch-list-CNVersion 692
Awesome-Interaction-aware-Trajectory-Prediction 564
Awesome-Implicit-NeRF-Robotics 191
iOS_ML 39
awesome-tensorflow-lite 110
CV-pretrained-model 103
awesome-attention-mechanism-in-cv 195
Awesome-Image-Colorization 150
awesome-grounding 157
openstl 43
Awesome-Open-Vocabulary 162
awesome-ai-awesomeness 236
awesome-capsule-networks 67
awesome-autonomous-vehicle 181
awesome-6d-object 600
awesome-multi-task-learning 233
awesome-robotics-3d 111
awesome-photogrammetry 90
awesome-open-data-centric-ai 56
awesome-ai-data-guided-projects 56
Awesome-3D-Object-Detection 169
Awesome-Skeleton-based-Action-Recognition 104
awesome-optical-flow 110
awesome-holistic-3d 129
awesome-data-annotation 93
Awesome-Parameter-Efficient-Transfer-Learning 124
awesome-panoptic-segmentation 45
awesome-computer-vision-models 189
awesome-robotics-datasets 79
awesome-state-of-depth-completion 71
awesome-nerf-editing 537
arctic 34
awesome-image-alignment-and-stitching 108
Awesome-Distributed-Deep-Learning 44
Awesome-Monocular-3D-detection 97