An open API service indexing awesome lists of open source software.

Computer vision

Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding of digital images and videos.

https://github.com/hkchengrex/STCN

[NeurIPS 2021] Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

computer-vision deep-learning neurips-2021 pytorch segmentation video-object-segmentation video-segmentation

Last synced: 13 Feb 2026

https://github.com/andreibarsan/dynslam

Master's Thesis on Simultaneous Localization and Mapping in dynamic environments. Separately reconstructs both the static environment and the dynamic objects from it, such as cars.

autonomous-vehicles computer-vision deep-learning dense eth-zurich master-thesis slam

Last synced: 04 Apr 2025

https://github.com/deepfates/memery

Search over large image datasets with natural language and computer vision!

computer-vision image-search local-search natural-language python

Last synced: 07 Oct 2025

https://github.com/iitzco/faced

🚀 😏 Near Real Time CPU Face detection using deep learning

computer-vision convolutional-neural-networks deep-learning face-detection fully-convolutional-networks python python-library tensorflow

Last synced: 20 Oct 2025

https://github.com/hkchengrex/stcn

[NeurIPS 2021] Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation

computer-vision deep-learning neurips-2021 pytorch segmentation video-object-segmentation video-segmentation

Last synced: 05 Apr 2025

https://github.com/aimagelab/dress-code

Dress Code: High-Resolution Multi-Category Virtual Try-On. ECCV 2022

artificial-intelligence computer-vision deep-learning dress-code eccv2022 virtual-try-on

Last synced: 15 May 2025

https://github.com/fatescript/centernet-better

An easy-to-understand, better-performing version of CenterNet

computer-vision deep-learning object-detection

Last synced: 05 Apr 2025

https://github.com/FateScript/CenterNet-better

An easy-to-understand, better-performing version of CenterNet

computer-vision deep-learning object-detection

Last synced: 19 Jul 2025

https://github.com/ternaus/ternausnetv2

TernausNetV2: Fully Convolutional Network for Instance Segmentation

computer-vision deep-learning image-segmentation python pytorch satellite-imagery

Last synced: 05 Apr 2025

https://github.com/ternaus/TernausNetV2

TernausNetV2: Fully Convolutional Network for Instance Segmentation

computer-vision deep-learning image-segmentation python pytorch satellite-imagery

Last synced: 13 May 2025

https://github.com/nianticlabs/mickey

[CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences

computer-vision cvpr2024 dsac metric-correspondences mickey pose-estimation ransac ransac-algorithm

Last synced: 15 May 2025

https://github.com/raoyongming/DynamicViT

[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

computer-vision deep-learning image-classification vision-transformers

Last synced: 08 May 2025

https://github.com/hassony2/useful-computer-vision-phd-resources

Lists of resources useful for my PhD in computer vision

computer-vision list phd resources

Last synced: 05 May 2025

https://github.com/vlm-run/vlmrun-hub

A hub for various industry-specific schemas to be used with VLMs.

ai computer-vision etl genai json multimodal pydantic pydantic-models vlm vlm-ocr

Last synced: 02 Apr 2026

https://github.com/LingDong-/skeleton-tracing

A new algorithm for retrieving topological skeleton as a set of polylines from binary images

algorithm computational-geometry computer-vision polylines skeletonization

Last synced: 10 Apr 2025

https://github.com/lingdong-/skeleton-tracing

A new algorithm for retrieving topological skeleton as a set of polylines from binary images

algorithm computational-geometry computer-vision polylines skeletonization

Last synced: 26 Oct 2025

https://github.com/swz30/CycleISP

[CVPR 2020--Oral] CycleISP: Real Image Restoration via Improved Data Synthesis

camera-imaging-pipeline computer-vision cvpr2020 cycleisp data-synthesis image-denoising image-restoration low-level-vision pytorch raw2rgb rgb2raw

Last synced: 02 Apr 2025

https://github.com/swz30/cycleisp

[CVPR 2020--Oral] CycleISP: Real Image Restoration via Improved Data Synthesis

camera-imaging-pipeline computer-vision cvpr2020 cycleisp data-synthesis image-denoising image-restoration low-level-vision pytorch raw2rgb rgb2raw

Last synced: 09 Apr 2025

https://github.com/philferriere/tfoptflow

Optical Flow Prediction with TensorFlow. Implements "PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume," by Deqing Sun et al. (CVPR 2018)

computer-vision cvpr2018 deep-learning flying-chairs kitti-dataset motion-estimation mpi-sintel optical-flow pwc-net tensorflow

Last synced: 05 Apr 2025

https://github.com/dmitryryumin/aaai-2024-papers

AAAI 2024 Papers: Explore a comprehensive collection of innovative research papers presented at one of the premier artificial intelligence conferences. Seamlessly integrate code implementations for better understanding. ⭐ Experience the forefront of progress in artificial intelligence with this repository!

aaai aaai2024 application-domains artificial-intelligence cognitive-systems computer-vision computer-vison deep-learning expert-systems human-computer-interaction knowledge-representation machine-learning multi-agent-systems neural-networks reinforcement-learning sentiment-analysis

Last synced: 15 May 2025

https://github.com/mv-lab/InstructIR

[ECCV 2024] InstructIR: High-Quality Image Restoration Following Human Instructions https://huggingface.co/spaces/marcosv/InstructIR

computer-vision deblurring deep-learning dehazing denoising image-enhancement image-restoration inverse-problems language-model low-light-image-enhancement multi-task multimodal neural-network photography prompt pytorch super-resolution

Last synced: 02 Apr 2025

https://github.com/microsoft/CvT

This is an official implementation of CvT: Introducing Convolutions to Vision Transformers.

classification computer-vision cvt deep-learning imagenet

Last synced: 07 May 2025

https://github.com/NVlabs/pacnet

Pixel-Adaptive Convolutional Neural Networks (CVPR '19)

computer-vision deep-learning machine-learning

Last synced: 19 Jul 2025

https://github.com/rese1f/MovieChat

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

computer-vision dataset large-language-models llama long-video-understanding multimodal-large-language-models

Last synced: 20 Apr 2025

https://wenhaochai.com/MovieChat/

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

computer-vision dataset large-language-models llama long-video-understanding multimodal-large-language-models

Last synced: 05 May 2025

https://zju3dv.github.io/manhattan_sdf/

Code for "Neural 3D Scene Reconstruction with the Manhattan-world Assumption" CVPR 2022 Oral

3d-reconstruction 3d-vision computer-vision cvpr2022

Last synced: 11 May 2025

https://github.com/Justin-Tan/generative-compression

TensorFlow Implementation of Generative Adversarial Networks for Extreme Learned Image Compression

computer-vision gan generative-adversarial-network image-compression tensorflow

Last synced: 19 Jul 2025

https://github.com/DagnyT/hardnet

Hardnet descriptor model - "Working hard to know your neighbor's margins: Local descriptor learning loss"

computer-vision convolutional-neural-networks deep-learning image-matching image-retrieval metric-learning nips-2017 pytorch

Last synced: 03 Apr 2025

https://github.com/jsn5/dancenet

DanceNet - 💃💃 Dance generator using Autoencoder, LSTM and Mixture Density Network. (Keras)

autoencoder computer-vision generative-model keras lstm mixture-density-networks

Last synced: 19 Jul 2025

https://github.com/rgeirhos/stylized-imagenet

Code to create Stylized-ImageNet, a stylized version of standard ImageNet (ICLR 2019 Oral)

computer-vision deep-learning human-vision shape-bias style-transfer texture-bias

Last synced: 05 Apr 2025

https://github.com/DIYer22/boxx

Toolbox for efficient building and debugging in Python, especially for scientific computing and computer vision.

awesome-python computer-vision debug debugging deep-learning hack pytorch pytorch-debug scientific-computing toolbox

Last synced: 19 Jul 2025

https://github.com/diyer22/boxx

Toolbox for efficient building and debugging in Python, especially for scientific computing and computer vision.

awesome-python computer-vision debug debugging deep-learning hack pytorch pytorch-debug scientific-computing toolbox

Last synced: 04 Sep 2025

https://github.com/xiaoyufenfei/LEDNet

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

cityscape-dataset computer-vision lednet pytorch real-time semantic-segmentation

Last synced: 07 May 2025

https://github.com/pauldanielml/mujoco_rl_ur5

A MuJoCo/Gym environment for robot control using Reinforcement Learning. The task of agents in this environment is pixel-wise prediction of grasp success chances.

computer-vision gym-environment mujoco pick-and-place reinforcement-learning robotics

Last synced: 05 Apr 2025

https://github.com/willard-yuan/pcv-book-code

:book: Example code for the Chinese translation of the Python computer vision book

computer-vision python

Last synced: 05 Apr 2025

https://github.com/hfslyc/AdvSemiSeg

Adversarial Learning for Semi-supervised Semantic Segmentation, BMVC 2018

adversarial-learning computer-vision deep-learning pytorch semantic-segmentation semi-supervised-learning

Last synced: 20 Mar 2025

https://github.com/wellflat/imageprocessing-labs

Computer vision, image processing and machine learning in the web browser or Node.

computer-vision image-processing javascript machine-learning

Last synced: 05 Apr 2025

https://github.com/simpler-env/SimplerEnv

Evaluating and reproducing real-world robot manipulation policies (e.g., RT-1, RT-1-X, Octo) in simulation under common setups (e.g., Google Robot, WidowX+Bridge) (CoRL 2024)

computer-vision embodied-ai real2sim reinforcement-learning robot-learning robot-manipulation robotics robotics-benchmark robotics-simulation

Last synced: 05 Mar 2025

https://github.com/khanhnamle1994/computer-vision

Programming Assignments and Lectures for Stanford's CS 231: Convolutional Neural Networks for Visual Recognition

computer-vision convolutional-neural-networks deep-learning image-classification imagenet visual-recognition

Last synced: 05 Apr 2025

https://github.com/skalskip/sports

Cool experiments at the intersection of Computer Vision and Sports ⚽🏃

computer-vision deep-learning deep-neural-networks gpt-4 gpt-4-vision object-detection prompt-engineering pytorch sports-analytics tutorial yolov5 yolov7

Last synced: 05 Apr 2025

https://github.com/limuloo/MIGC

[CVPR 2024 Highlight] "MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis" (Official Implementation)

aigc computer-vision cvpr cvpr2024 stable-diffusion text-to-image

Last synced: 28 Mar 2025

https://github.com/megvii-basedetection/defcn

End-to-End Object Detection with Fully Convolutional Network

computer-vision object-detection

Last synced: 06 Apr 2025

https://github.com/jo-m/trainbot

Watches a piece of train track, detects trains, and stitches together images of them.

bot computer-vision go golang stitching trains

Last synced: 16 Jan 2026

https://github.com/wvangansbeke/sparse-depth-completion

Predict dense depth maps from sparse and noisy LiDAR frames guided by RGB images. (Ranked 1st place on KITTI) [MVA 2019]

computer-vision deep-learning depth-completion depth-prediction kitti lidar noisy-data pytorch sensor-fusion

Last synced: 05 Apr 2025

https://github.com/kuanghuei/scan

PyTorch source code for "Stacked Cross Attention for Image-Text Matching" (ECCV 2018)

computer-vision cross-modal deep-learning image-captioning neural-network pytorch visual-semantic

Last synced: 24 Jun 2025

https://github.com/Megvii-BaseDetection/DeFCN

End-to-End Object Detection with Fully Convolutional Network

computer-vision object-detection

Last synced: 14 Mar 2025

https://github.com/diyer22/bpycv

Computer vision utils for Blender (generate instance annotation, depth and 6D pose with one line of code)

6dof-pose blender blender-cv blender-python computer-vision data-synthesis dataset-generation deep-learning depth instance-segmentation synthetic-datasets ycb

Last synced: 04 Sep 2025

https://github.com/lxe/llavavision

A simple "Be My Eyes" web app with a llama.cpp/llava backend

ai artificial-intelligence computer-vision llama llamacpp llm local-llm machine-learning multimodal webapp

Last synced: 05 Apr 2025

https://github.com/abhshkdz/neural-vqa

:grey_question: Visual Question Answering in Torch

computer-vision deep-learning natural-language-processing torch

Last synced: 06 Apr 2025

https://github.com/liuziwei7/fashion-detection

Fashion Detection in the Wild (Deep Clothes Detector)

clothes-detection clothes-detector computer-vision deep-learning vision-for-fashion

Last synced: 13 May 2025

https://github.com/maplelost/lazyeat

Lazyeat is a touch-free controller for use while eating! Don't want greasy hands while watching shows or browsing the web during meals? You can pause videos, go full screen, or switch videos just by gesturing to the camera!

accessibility application computer-vision gesture-detection gesture-recognition hands-free mediapipe mediapipe-hands multitasking productivity-tool python tauri tauri-app vue3 webcam-hacks windows

Last synced: 08 Apr 2025

https://github.com/zju3dv/manhattan_sdf

Code for "Neural 3D Scene Reconstruction with the Manhattan-world Assumption" CVPR 2022 Oral

3d-reconstruction 3d-vision computer-vision cvpr2022

Last synced: 05 Apr 2025

https://github.com/Alpha-VL/ConvMAE

ConvMAE: Masked Convolution Meets Masked Autoencoders

backbone computer-vision mae masked-image-modeling object-detection semantic-segmentation

Last synced: 08 May 2025

https://github.com/SkalskiP/sports

Cool experiments at the intersection of Computer Vision and Sports ⚽🏃

computer-vision deep-learning deep-neural-networks gpt-4 gpt-4-vision object-detection prompt-engineering pytorch sports-analytics tutorial yolov5 yolov7

Last synced: 06 Apr 2025

https://github.com/open-edge-platform/datumaro

Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage Computer Vision datasets.

coco computer-vision dataset datasets deep-learning format-converter imagenet neural-networks openvino-toolkit pascal-voc yolo

Last synced: 06 May 2025

https://github.com/mtli/photosketch

Code for Photo-Sketching: Inferring Contour Drawings from Images :dog:

ai computer-vision graphics sketch

Last synced: 05 Apr 2025

https://github.com/thp/psmoveapi

Cross-platform library for 6DoF tracking of the PS Move Motion Controller. Sensor fusion, computer vision, ambient display (LED orb).

6dof computer-vision controller hid sensors tracking

Last synced: 07 Apr 2025

https://github.com/kohjingyu/fromage

🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".

computer-vision large-language-models machine-learning natural-language-processing

Last synced: 02 Feb 2026

https://github.com/jeffersonqin/yuzumarker.fontdetection

✨ First-ever CJK (Chinese, Japanese, Korean) font recognition and style extraction model, a side project of YuzuMarker

chinese cjk-characters cjk-font cnn computer-vision cv font font-recognition fonts japanese korean pytorch pytorch-cnn pytorch-lightning recognition

Last synced: 16 May 2025

https://github.com/mtli/PhotoSketch

Code for Photo-Sketching: Inferring Contour Drawings from Images :dog:

ai computer-vision graphics sketch

Last synced: 13 Apr 2026

https://github.com/amusi/ai-job-resume

Resume templates for AI algorithm engineer positions

artificial-intelligence computer-vision deep-learning natural-language-processing resume

Last synced: 26 Dec 2025

https://github.com/TRI-ML/dd3d

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

computer-vision deep-learning pytorch

Last synced: 20 Mar 2025

https://github.com/tri-ml/dd3d

Official PyTorch implementation of DD3D: Is Pseudo-Lidar needed for Monocular 3D Object detection? (ICCV 2021), Dennis Park*, Rares Ambrus*, Vitor Guizilini, Jie Li, and Adrien Gaidon.

computer-vision deep-learning pytorch

Last synced: 05 Apr 2025

https://github.com/hkchengrex/mivos

[CVPR 2021] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion. Semi-supervised VOS as well!

computer-vision cvpr2021 deep-learning interactive-segmentation pytorch segmentation video-object-segmentation video-segmentation

Last synced: 08 Apr 2025

https://github.com/amazon-science/siam-mot

SiamMOT: Siamese Multi-Object Tracking

computer-vision multi-object-tracking video-analysis

Last synced: 05 Apr 2025

https://github.com/synthesiaresearch/humanrf

Official code for "HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion"

3d-reconstruction computer-graphics computer-vision machine-learning nerf siggraph2023

Last synced: 07 Apr 2025

https://github.com/luyanger1799/Amazing-Semantic-Segmentation

Amazing Semantic Segmentation on TensorFlow and Keras (includes FCN, UNet, SegNet, PSPNet, PAN, RefineNet, DeepLabV3, DeepLabV3+, DenseASPP, BiSegNet)

computer-vision deep-learning keras-tensorflow semantic-segmentation tensorflow

Last synced: 01 Apr 2025

https://github.com/google-research/maxvit

[ECCV 2022] Official repository for "MaxViT: Multi-Axis Vision Transformer". SOTA foundation models for classification, detection, segmentation, image quality, and generative modeling...

architecture classification cnn computer-vision image image-processing mlp object-detection resnet segmentation transformer transformer-architecture vision-transformer

Last synced: 11 May 2025

https://github.com/cvondrick/soundnet

SoundNet: Learning Sound Representations from Unlabeled Video. NIPS 2016

computer-vision deep-learning sound

Last synced: 27 Jan 2026

https://github.com/jiawangbian/sc_depth_pl

SC-Depth (V1, V2, and V3) for Unsupervised Monocular Depth Estimation. Webpage: https://jiawangbian.github.io/sc_depth_pl/

computer-vision deep-learning depth-estimation pose-estimation self-supervised-learning

Last synced: 05 Apr 2025

https://github.com/pykale/pykale

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥 PyTorch ecosystem. ⭐ Star to support our work!

computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning

Last synced: 10 May 2026

https://github.com/anmspro/traffic-signal-violation-detection-system

A Computer Vision based Traffic Signal Violation Detection System from video footage using YOLOv3 & Tkinter. (GUI Included)

computer-vision hacktoberfest object-detection opencv python tensorflow tkinter traffic-signal traffic-violation-detection yolov3

Last synced: 04 Apr 2025

https://github.com/opendrivelab/persformer_3dlane

[ECCV 2022 Oral] Perspective Transformer on 3D Lane Detection

3d-lane-detection autonomous-driving computer-vision deep-learning lane-detection

Last synced: 05 Apr 2025

Computer vision Awesome Lists
Awesome-pytorch-list 707
awesome-multimodal-ml 480
awesome-self-supervised-learning 463
Awesome-Transformer-Attention 2,160
awesome_3DReconstruction_list 169
awesome-industrial-anomaly-detection 1,102
awesome-hand-pose-estimation 497
awesome-image-classification 220
Awesome-Crowd-Counting 468
awesome-human-pose-estimation 89
awesome-autonomous-vehicles 296
Awesome-World-Model 725
Awesome-Federated-Learning 558
Awesome-FL 4,069
awesome-low-light-image-enhancement 219
Awesome-pytorch-list-CNVersion 692
Awesome-Interaction-aware-Trajectory-Prediction 564
Awesome-Implicit-NeRF-Robotics 191
iOS_ML 39
awesome-tensorflow-lite 110
CV-pretrained-model 103
awesome-attention-mechanism-in-cv 195
Awesome-Image-Colorization 150
awesome-grounding 157
openstl 43
Awesome-Open-Vocabulary 162
awesome-ai-awesomeness 236
awesome-capsule-networks 67
awesome-autonomous-vehicle 181
awesome-6d-object 600
awesome-multi-task-learning 233
awesome-robotics-3d 111
awesome-photogrammetry 90
awesome-open-data-centric-ai 56
awesome-ai-data-guided-projects 56
Awesome-3D-Object-Detection 169
Awesome-Skeleton-based-Action-Recognition 104
awesome-optical-flow 110
awesome-holistic-3d 129
awesome-data-annotation 93
Awesome-Parameter-Efficient-Transfer-Learning 124
awesome-panoptic-segmentation 45
awesome-computer-vision-models 189
awesome-robotics-datasets 79
awesome-state-of-depth-completion 71
awesome-nerf-editing 537
arctic 34
awesome-image-alignment-and-stitching 108
Awesome-Distributed-Deep-Learning 44
Awesome-Monocular-3D-detection 97