An open API service indexing awesome lists of open source software.

Computer vision

Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding of digital images and videos.

https://github.com/roboflow/roboflow-100-benchmark

Code for replicating Roboflow 100 benchmark results and programmatically downloading benchmark datasets

computer-vision dataset deep-learning machine-learning object-detection pytorch

Last synced: 16 May 2025

https://github.com/dailenson/One-DM

Official Code for ECCV 2024 paper: One-Shot Diffusion Mimicker for Handwritten Text Generation

computer-vision deep-learning diffusion-models handwriting-imitator handwritten-text-generation image-generation latent-diffusion pytorch-implementation

Last synced: 08 Sep 2025

https://github.com/erilyth/DeepLearning-Challenges

Code for the weekly Deep Learning challenges by Siraj

computer-vision deep-learning machine-learning neural-networks

Last synced: 19 Jul 2025

https://github.com/roboflow/webcamGPT

webcamGPT - chat with video stream 💬 + 📸

chatgpt computer-vision gpt-4

Last synced: 07 Apr 2025

https://ali-design.github.io/gan_steerability/

On the "steerability" of generative adversarial networks

computer-vision deep-learning gan generative-adversarial-network

Last synced: 28 Apr 2025

https://github.com/roboflow/webcamgpt

webcamGPT - chat with video stream 💬 + 📸

chatgpt computer-vision gpt-4

Last synced: 05 Apr 2025

https://github.com/nikolazubic/2dimageto3dmodel

We evaluate our method on different datasets (including ShapeNet, CUB-200-2011, and Pascal3D+) and achieve state-of-the-art results, outperforming all the other supervised and unsupervised methods and 3D representations, all in terms of performance, accuracy, and training time.

3d-computer-graphics 3d-reconstruction computer-graphics computer-vision cub-dataset deep-learning gan kaolin loss-functions mesh neural-networks pascal3d point-cloud pose-prediction pytorch rendering shapenet shapenet-dataset single-view-reconstruction voxel

Last synced: 26 Jan 2026

https://github.com/staghado/vit.cpp

Inference Vision Transformer (ViT) in plain C/C++ with ggml

ai c computer-vision cpp cpu edge-computing ggml image-classification llamacpp vision-transformer whisper-cpp

Last synced: 03 Oct 2025

https://github.com/qdata/c-tran

General Multi-label Image Classification with Transformers

computer-vision multi-label-classification transformers

Last synced: 12 Apr 2025

https://github.com/QData/C-Tran

General Multi-label Image Classification with Transformers

computer-vision multi-label-classification transformers

Last synced: 05 Apr 2025

https://github.com/davidstutz/superpixels-revisited

Library containing 7 state-of-the-art superpixel algorithms with a total of 9 implementations used for evaluation purposes in [1] utilizing an extended version of the Berkeley Segmentation Benchmark.

computer-vision image-processing opencv superpixel-algorithms superpixels

Last synced: 13 Apr 2025

https://github.com/zae-bayern/elpv-dataset

A dataset of functional and defective solar cells extracted from EL images of solar modules

computer-vision machine-learning photovoltaic solar-cells solar-energy

Last synced: 07 May 2025

https://github.com/angeladai/ScanComplete

[CVPR'18] ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

3d-reconstruction autoregressive-neural-networks computer-graphics computer-vision deep-learning

Last synced: 13 Apr 2025

https://github.com/junweiliang/multiverse

Dataset, code, and model for the CVPR'20 paper "The Garden of Forking Paths: Towards Multi-Future Trajectory Prediction" and for the ECCV'20 SimAug paper.

3d-simulation computer-vision trajectory-prediction trajectory-prediction-benchmark video-understanding

Last synced: 09 Apr 2025

https://github.com/tomas789/tonav

Implementation of Multi-State Constraint Kalman Filter (MSCKF) for Vision-aided Inertial Navigation. This is my master's thesis.

computer-vision ekf localization msckf ros tonav

Last synced: 15 Jun 2025

https://github.com/lil-lab/nlvr

Cornell NLVR and NLVR2 are natural language grounding datasets. Each example shows a visual input and a sentence describing it, and is annotated with the truth-value of the sentence.

computer-vision corpus machine-learning natural-language-processing

Last synced: 02 May 2025

https://github.com/zc-alexfan/hold

[CVPR 2024✨Highlight] Official repository for HOLD, the first method that jointly reconstructs articulated hands and objects from monocular videos without assuming a pre-scanned object template and 3D hand-object training data.

3d-reconstruction ai artificial-intelligence augmented-reality computer-vision hand-object-interaction hand-object-reconstruction hand-tracking mano mixed-reality neural-networks pose-estimation pytorch virtual-reality

Last synced: 06 May 2025

https://github.com/keytoyze/visionts

Code for our paper "VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters".

computer-vision deep-learning time-series

Last synced: 20 Apr 2026

https://github.com/brentyi/jaxlie

Rigid transforms + Lie groups for JAX

computer-vision geometry jax lie-groups robotics

Last synced: 15 May 2025

https://github.com/guiggh/hand_pose_action

Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.

action-recognition benchmark computer-vision dataset hand-pose-estimation

Last synced: 24 Dec 2025

https://github.com/fcakyon/craft-text-detector

Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector

actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow

Last synced: 02 Apr 2025

https://github.com/twitter-research/image-crop-analysis

Code for reproducing our analysis in the paper titled: Image Cropping on Twitter: Fairness Metrics, their Limitations, and the Importance of Representation, Design, and Agency

bias computer-vision fairness fairness-ml image-processing machine-learning research

Last synced: 27 Mar 2025

https://github.com/chunbolang/BAM

Official PyTorch Implementation of Learning What Not to Segment: A New Perspective on Few-Shot Segmentation (CVPR'22 Oral & TPAMI'23).

computer-vision few-shot-segmentation

Last synced: 08 May 2025

https://github.com/megvii-research/RevCol

Official code for the papers "Reversible Column Networks" and "RevColv2"

cnn computer-vision iclr2023 mae pytorch transformer vit

Last synced: 20 Mar 2025

https://github.com/voxel51/voxelgpt

AI assistant that can query visual datasets, search the FiftyOne docs, and answer general computer vision questions

artificial-intelligence chatgpt computer-vision data-science deep-learning fiftyone langchain llm machine-learning openai python

Last synced: 26 Jun 2025

https://github.com/louisfb01/best_ai_papers_2023

A curated list of the latest breakthroughs in AI (in 2023) by release date with a clear video explanation, link to a more in-depth article, and code.

ai artificial-intelligence computer-vision machine-learning ml nlp paper papers python research state-of-the-art

Last synced: 10 Apr 2025

https://github.com/mtli/HTML4Vision

A simple HTML visualization tool for computer vision research 🛠️

computer-vision html visualization

Last synced: 08 May 2025

https://github.com/mtli/html4vision

A simple HTML visualization tool for computer vision research 🛠️

computer-vision html visualization

Last synced: 15 May 2025

https://github.com/LZBUAV/K210

This repository is a collection of applications for the Kendryte K210 AI chip, including face detection, color detection, object detection and classification, QR code and AprilTag detection, and communication with the ArduPilot flight software. The applications can be deployed to UAV terminals to make drones more intelligent.

computer-vision face-detection k210 machine-vision yolov2

Last synced: 20 Mar 2025

https://github.com/AaltoVision/ADVIO

An Authentic Dataset for Visual-Inertial Odometry

benchmarking computer-vision navigation visual-inertial-odometry

Last synced: 01 Apr 2025

https://github.com/milaan9/python_computer_vision_from_scratch

This repository explores the variety of techniques commonly used to analyze and interpret images. It also describes challenging real-world applications where vision is being successfully used, both for specialized applications such as medical imaging, and for fun, consumer-level tasks such as image editing and stitching, which students can apply to their own personal photos and videos.

canny-edge-detection computer-vision dtw-algorithm eigenfaces feature-extraction hough-lines image-analysis image-manipulation image-processing image-recognition ipython-notebook machine-learning python4datascience python4everybody sobel-filter sobel-operator tutor-milaan9

Last synced: 06 Apr 2025

https://github.com/bukalapak/pybrisque

A python implementation of BRISQUE Image Quality Assessment

brisque computer-vision image-processing image-quality-assessment python

Last synced: 05 Jul 2025

https://github.com/idea-research/deepdataspace

The Go-To Choice for CV Data Visualization, Annotation, and Model Analysis.

collaborative-annotation computer-vision dataset-visualization intelligent-annotation labeling-tool model-analysis

Last synced: 05 Apr 2025

https://github.com/aangelopoulos/conformal_classification

Wrapper for a PyTorch classifier which allows it to output prediction sets. The sets are theoretically guaranteed to contain the true class with high probability (via conformal prediction).

artificial-intelligence classification classifier computer-vision conformal conformal-prediction deep-neural-networks distribution-free imagenet machine-learning neural-networks nonparametric nonparametric-statistics prediction-sets pytorch statistics uncertainty uncertainty-quantification vision

Last synced: 04 Apr 2025

https://github.com/ternaus/cloths_segmentation

Code for binary segmentation of clothes

computer-vision deep-learning image-segmentation

Last synced: 06 Apr 2025

https://github.com/mindspore-lab/mindcv

A toolbox of vision models and algorithms based on MindSpore

computer-vision cv deep-learning image-classification imagenet mindspore models resnet transformer

Last synced: 06 Jan 2026

https://github.com/giakoumoglou/pyfeats

[GitHub 2021] Open source software for image feature extraction.

computer-vision fdta feature-extraction fos fps fpsglcm glds glrlm glszm hos lbp lte morphological-analysis ngtdm pyfeats python sfm

Last synced: 07 Apr 2025

https://github.com/vitae-transformer/vitae-transformer-matting

A comprehensive list [AIM@IJCAI'21, P3M@MM'21, GFM@IJCV'22, RIM@CVPR'23, P3MNet@IJCV'23] of our research works related to image matting, including papers, codes, datasets, demos, and citations. Note: The repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving" has been moved to: https://github.com/ViTAE-Transformer/P3M-Net

computer-vision deep-learning image-matting privacy-preserving survey vision-transformer

Last synced: 05 Mar 2026

https://github.com/nianticlabs/wavelet-monodepth

[CVPR 2021] Monocular depth estimation using wavelets for efficiency

computer-vision cvpr2021 depth-estimation kitti-dataset nyu-depth-v2 wavelets

Last synced: 04 Sep 2025

https://github.com/kwotsin/transfer_learning_tutorial

A guide to transfer learning with inception-resnet-v2.

computer-vision tensorflow tensorflow-tutorials transfer-learning

Last synced: 08 May 2025

https://github.com/astra-vision/MaterialPalette

[CVPR 2024] Official repository of "Material Palette: Extraction of Materials from a Single Real-world Image"

albedo computer-vision cvpr cvpr2024 generative-ai material normal roughness stable-diffusion

Last synced: 27 Mar 2025

https://astra-vision.github.io/MaterialPalette/

[CVPR 2024] Official repository of "Material Palette: Extraction of Materials from a Single Real-world Image"

albedo computer-vision cvpr cvpr2024 generative-ai material normal roughness stable-diffusion

Last synced: 27 Mar 2025

https://github.com/wkentaro/morefusion

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion, CVPR 2020

artificial-intelligence computer-vision deep-learning machine-learning pose-estimation robotics ros

Last synced: 07 May 2025

https://github.com/compphoto/intrinsic

Repo for the papers "Intrinsic Image Decomposition via Ordinal Shading" (TOG 2023) and "Colorful Diffuse Intrinsic Image Decomposition in the Wild" (TOG 2024)

computer-graphics computer-vision image-processing intrinsic-decomposition inverse-rendering machine-learning multi-illumination

Last synced: 04 Apr 2025

https://github.com/daleroberts/bv

Quickly view satellite imagery, hyperspectral imagery, and machine learning image outputs directly in your iTerm2 terminal.

command-line computer-vision image iterm2 machine-learning python satellite satellite-imagery

Last synced: 22 Aug 2025

https://github.com/jkulhanek/tetra-nerf

Official implementation for Tetra-NeRF paper - NeRF represented as triangulation of input point cloud.

3d 3d-reconstruction computer-vision nerf nerfstudio neural-networks optix pytorch raytracing

Last synced: 26 Dec 2025

https://github.com/merveenoyan/siglip

Projects based on SigLIP (Zhai et al., 2023) and Hugging Face transformers integration 🤗

computer-vision machine-learning multimodal-learning siglip

Last synced: 04 Apr 2025

https://github.com/ClipsAI/clipsai

Clips AI is an open-source Python library that automatically converts long videos into clips.

computer-vision nlp video-processing

Last synced: 07 Apr 2025

https://github.com/realsenseai/hand_tracking_samples

👋 👌 Research codebase for depth-based hand pose estimation using dynamics-based tracking and CNNs

cnn computer-vision hand-pose-estimation hand-tracking machine-learning physics-engine

Last synced: 10 Mar 2026

https://github.com/Villavu/Simba

Simba is a program used to repeat certain (complicated) tasks. Typically these tasks involve using the mouse and keyboard. Simba is programmable, which means you can design your own logic and steps that Simba will follow, based upon certain input such as colors on the screen.

automation computer-vision fpc lape lazarus macro pascal simba villavu

Last synced: 04 Apr 2025

https://github.com/nghorbani/moshpp

Motion and Shape Capture from Sparse Markers

computer-vision marker-based mocap solving vicon

Last synced: 22 Apr 2025

https://github.com/baegwangbin/surface_normal_uncertainty

[ICCV 2021 Oral] Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation

3d-reconstruction computer-vision deep-learning iccv2021 surface-normal surface-normals surface-normals-estimation uncertainty uncertainty-estimation

Last synced: 07 Apr 2025

https://github.com/nianticlabs/footprints

[CVPR 2020] Estimation of the visible and hidden traversable space from a single color image

computer-vision deep-learning depth-estimation monodepth pytorch

Last synced: 29 Jun 2025

https://github.com/haofanwang/natural-language-joint-query-search

Search photos on Unsplash using OpenAI's CLIP model, with support for joint image+text queries and attention visualization.

attention clip computer-vision image-retrieval image-search multi-modal-search unsplash visualizations

Last synced: 20 Aug 2025

https://github.com/baegwangbin/magnet

[CVPR 2022 Oral] Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

3d-reconstruction computer-vision cvpr2022 deep-learning depth-estimation multi-view-stereo multiview-geometry multiview-stereo uncertainty uncertainty-estimation

Last synced: 12 May 2025

https://github.com/arefmalek/airdraw

A vision-based drawing application

computer-vision mediapipe opencv python python3

Last synced: 29 Mar 2025

https://github.com/wkentaro/fcn

Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

chainer computer-vision convolutional-networks deep-learning fcn fcn8s segmentation semantic-segmentation

Last synced: 05 Apr 2025

https://github.com/baegwangbin/MaGNet

[CVPR 2022 Oral] Multi-View Depth Estimation by Fusing Single-View Depth Probability with Multi-View Geometry

3d-reconstruction computer-vision cvpr2022 deep-learning depth-estimation multi-view-stereo multiview-geometry multiview-stereo uncertainty uncertainty-estimation

Last synced: 03 Apr 2025

https://github.com/li-plus/dsnet

DSNet: A Flexible Detect-to-Summarize Network for Video Summarization

computer-vision detection machine-learning pytorch video-summarization

Last synced: 08 May 2025

Computer vision Awesome Lists
Awesome-pytorch-list 707
awesome-multimodal-ml 480
awesome-self-supervised-learning 463
Awesome-Transformer-Attention 2,160
awesome_3DReconstruction_list 169
awesome-industrial-anomaly-detection 1,102
awesome-hand-pose-estimation 497
awesome-image-classification 220
Awesome-Crowd-Counting 468
awesome-human-pose-estimation 89
awesome-autonomous-vehicles 296
Awesome-World-Model 725
Awesome-Federated-Learning 558
Awesome-FL 4,069
awesome-low-light-image-enhancement 219
Awesome-pytorch-list-CNVersion 692
Awesome-Interaction-aware-Trajectory-Prediction 564
Awesome-Implicit-NeRF-Robotics 191
iOS_ML 39
awesome-tensorflow-lite 110
CV-pretrained-model 103
awesome-attention-mechanism-in-cv 195
Awesome-Image-Colorization 150
awesome-grounding 157
openstl 43
Awesome-Open-Vocabulary 162
awesome-ai-awesomeness 236
awesome-capsule-networks 67
awesome-autonomous-vehicle 181
awesome-6d-object 600
awesome-multi-task-learning 233
awesome-robotics-3d 111
awesome-photogrammetry 90
awesome-open-data-centric-ai 56
awesome-ai-data-guided-projects 56
Awesome-3D-Object-Detection 169
Awesome-Skeleton-based-Action-Recognition 104
awesome-optical-flow 110
awesome-holistic-3d 129
awesome-data-annotation 93
Awesome-Parameter-Efficient-Transfer-Learning 124
awesome-panoptic-segmentation 45
awesome-computer-vision-models 189
awesome-robotics-datasets 79
awesome-state-of-depth-completion 71
awesome-nerf-editing 537
arctic 34
awesome-image-alignment-and-stitching 108
Awesome-Distributed-Deep-Learning 44
Awesome-Monocular-3D-detection 97