An open API service indexing awesome lists of open source software.

Computer vision

Computer vision is an interdisciplinary field that deals with how computers can be made to gain high-level understanding of digital images and videos.

https://github.com/lufficc/ssd

High quality, fast, modular reference implementation of SSD in PyTorch

computer-vision deep-learning object-detection pytorch ssd

Last synced: 15 May 2025

https://github.com/lufficc/SSD

High quality, fast, modular reference implementation of SSD in PyTorch

computer-vision deep-learning object-detection pytorch ssd

Last synced: 20 Mar 2025

https://github.com/MaaXYZ/MaaFramework

An automated black-box testing framework based on image recognition

black-box-testing computer-vision

Last synced: 08 Nov 2025

https://github.com/yatenglg/isat_with_segment_anything

Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive semi-automatic image annotation tool.

annotation-tool computer-vision labeling labeling-tool sam sam2 segment-anything segment-anything-2 video-segmentation

Last synced: 24 Dec 2025

https://github.com/lucidrains/lambda-networks

Implementation of LambdaNetworks, a new approach to image recognition that reaches SOTA with less compute

artificial-intelligence attention attention-mechanism computer-vision deep-learning

Last synced: 15 May 2025

https://github.com/symforce-org/symforce

Fast symbolic computation, code generation, and nonlinear optimization for robotics

autonomous-vehicles code-generation computer-vision cpp motion-planning optimization python robotics slam structure-from-motion symbolic-computation

Last synced: 06 Sep 2025

https://github.com/spla-tam/SplaTAM

SplaTAM: Splat, Track & Map 3D Gaussians for Dense RGB-D SLAM (CVPR 2024)

computer-vision cvpr2024 gaussian-splatting robotics slam

Last synced: 20 Mar 2025

https://github.com/robertknight/ocrs

Rust library and CLI tool for OCR (extracting text from images)

computer-vision machine-learning ocr

Last synced: 13 May 2025

https://github.com/pixeltable/pixeltable

Pixeltable — Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.

ai artificial-intelligence chatbot computer-vision data-science database feature-engineering feature-store genai llm machine-learning ml mlops multimodal vector-database

Last synced: 01 May 2026

https://github.com/mathworks/matlab-simulink-challenge-project-hub

This MATLAB and Simulink Challenge Project Hub contains a list of research and design project ideas. These projects will help you gain practical experience and insight into technology trends and industry directions.

ai autonomous capstone capstone-project computer-vision deep-learning drones energy final-project final-year-project master-thesis matlab project-ideas robotics senior-design senior-project simulink student-project students thesis

Last synced: 11 Apr 2025

https://github.com/chainer/chainercv

ChainerCV: a Library for Deep Learning in Computer Vision

chainer chainercv computer-vision cupy deep-learning neural-network python

Last synced: 28 Sep 2025

https://github.com/hkchengrex/mmaudio

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

audio audio-synthesis computer-vision deep-learning text-to-audio video-to-audio

Last synced: 08 May 2025

https://github.com/toandaominh1997/efficientdet.pytorch

Implementation of EfficientDet: Scalable and Efficient Object Detection in PyTorch

coco computer-vision demo detection efficientdet-d0 efficientnet focalloss multibox nms object-detection pascal-voc pytorch

Last synced: 14 Aug 2025

https://github.com/toandaominh1997/EfficientDet.Pytorch

Implementation of EfficientDet: Scalable and Efficient Object Detection in PyTorch

coco computer-vision demo detection efficientdet-d0 efficientnet focalloss multibox nms object-detection pascal-voc pytorch

Last synced: 19 Jul 2025

https://github.com/una-dinosauria/3d-pose-baseline

A simple baseline for 3D human pose estimation in TensorFlow. Presented at ICCV 2017.

3d-vision baseline computer-vision iccv-17 iccv-2017 tensorflow

Last synced: 16 May 2025

https://github.com/minivision-ai/silent-face-anti-spoofing

Silent liveness detection (Silent-Face-Anti-Spoofing)

android-app computer-vision deep-learning face-anti-spoofing sdk

Last synced: 08 Apr 2025

https://github.com/rese1f/stablevideo

[ICCV 2023] StableVideo: Text-driven Consistency-aware Diffusion Video Editing

aigc computer-vision controlnet diffusion-model video-editing

Last synced: 16 May 2025

https://github.com/RQLuo/MixTeX-Latex-OCR

MixTeX: multimodal OCR for LaTeX, Chinese-English text, and tables. It performs efficient CPU-based inference locally and offline on Windows.

computer-vision deep-learning latex machine-learning ocr onnx python

Last synced: 30 Aug 2025

https://github.com/CSAILVision/LabelMeAnnotationTool

Source code for the LabelMe annotation tool.

annotation computer-vision

Last synced: 27 Apr 2025

https://github.com/csailvision/labelmeannotationtool

Source code for the LabelMe annotation tool.

annotation computer-vision

Last synced: 15 May 2025

https://github.com/ai-forever/ghost

A new one shot face swap approach for image and video domains

computer-vision deep-face-swap deep-learning deepfake face-swap faceswap ghost ghost-faceswap ghost-swap pytorch

Last synced: 14 May 2025

https://github.com/torchgan/torchgan

Research framework for easy and efficient training of GANs, based on PyTorch

computer-vision deep-learning gans generative-adversarial-networks generative-model machine-learning neural-networks python python3 pytorch

Last synced: 15 May 2025

https://github.com/qingyonghu/randla-net

🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

3d-vision computer-vision s3dis semantic-segmentation semantic3d semantickitti

Last synced: 16 May 2025

https://github.com/tannerhelland/PhotoDemon

A free portable photo editor focused on pro-grade features, high performance, and maximum usability.

computer-vision image-editor image-filters image-processing paint photo-editor vb6 win32

Last synced: 03 Apr 2025

https://github.com/skalskip/ilearndeeplearning.py

This repository contains small projects related to neural networks and deep learning in general. The subjects are closely linked with articles I publish on Medium. I encourage you both to read them and to check how the code works in action.

computer-vision deep-learning deep-learning-tutorial neural-network numpy visualizations

Last synced: 08 Apr 2025

https://github.com/SkalskiP/ILearnDeepLearning.py

This repository contains small projects related to neural networks and deep learning in general. The subjects are closely linked with articles I publish on Medium. I encourage you both to read them and to check how the code works in action.

computer-vision deep-learning deep-learning-tutorial neural-network numpy visualizations

Last synced: 01 May 2025

https://github.com/QingyongHu/RandLA-Net

🔥RandLA-Net in Tensorflow (CVPR 2020, Oral & IEEE TPAMI 2021)

3d-vision computer-vision s3dis semantic-segmentation semantic3d semantickitti

Last synced: 20 Mar 2025

https://github.com/nianticlabs/simplerecon

[ECCV 2022] SimpleRecon: 3D Reconstruction Without 3D Convolutions

computer-vision cost-volume depth depth-estimation eccv2022 multi-view-stereo mvs pytorch scannet visualization

Last synced: 16 May 2025

https://github.com/dwctod/cvpr2024-papers-with-code-demo

Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome!

computer-vision cvpr cvpr2021 cvpr2022 cvpr2023 cvpr2024 llm multimodal-deep-learning object-detection segment-anything segmentation

Last synced: 26 Jan 2026

https://github.com/DWCTOD/CVPR2024-Papers-with-Code-Demo

Collects the latest CVPR (Conference on Computer Vision and Pattern Recognition) results, including papers, code, and demo videos. Recommendations are welcome!

computer-vision cvpr cvpr2021 cvpr2022 cvpr2023 cvpr2024 llm multimodal-deep-learning object-detection segment-anything segmentation

Last synced: 29 Mar 2025

https://github.com/cdpierse/transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.

captum computer-vision deep-learning explainable-ai interpretability machine-learning model-explainability natural-language-processing neural-network nlp transformers transformers-model

Last synced: 15 May 2025

https://github.com/muskie82/MonoGS

[CVPR'24 Highlight & Best Demo Award] Gaussian Splatting SLAM

computer-vision cvpr2024 gaussian-splatting robotics slam

Last synced: 20 Mar 2025

https://github.com/om-ai-lab/omdet

Real-time and accurate open-vocabulary end-to-end object detection

coco computer-vision lvis object-detection open-vocabulary real-time vision-and-language zero-shot zero-shot-object-detection

Last synced: 13 Apr 2025

https://github.com/digantamisra98/mish

Official Repository for "Mish: A Self Regularized Non-Monotonic Neural Activation Function" [BMVC 2020]

activation-functions bmvc bmvc20 computer-vision deep-learning image-classification mathematics neural-networks object-detection

Last synced: 17 Oct 2025

https://github.com/piddnad/DDColor

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

computer-vision image-colorization pytorch

Last synced: 24 Aug 2025

https://github.com/piddnad/ddcolor

[ICCV 2023] DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders

computer-vision image-colorization pytorch

Last synced: 14 May 2025

https://github.com/mathworks/MATLAB-Simulink-Challenge-Project-Hub

This MATLAB and Simulink Challenge Project Hub contains a list of research and design project ideas. These projects will help you gain practical experience and insight into technology trends and industry directions.

ai autonomous capstone capstone-project computer-vision deep-learning drones energy final-project final-year-project master-thesis matlab project-ideas robotics senior-design senior-project simulink student-project students thesis

Last synced: 09 May 2025

https://github.com/digantamisra98/Mish

Official Repository for "Mish: A Self Regularized Non-Monotonic Neural Activation Function" [BMVC 2020]

activation-functions bmvc bmvc20 computer-vision deep-learning image-classification mathematics neural-networks object-detection

Last synced: 20 Mar 2025

https://github.com/swz30/mprnet

[CVPR 2021] Multi-Stage Progressive Image Restoration. SOTA results for Image deblurring, deraining, and denoising.

computer-vision cvpr-2021 cvpr2021 cvpr21 image-deblurring image-denoising image-deraining image-restoration low-level-vision multistage-network progressive-restoration pytorch

Last synced: 16 May 2025

https://github.com/poloclub/diffusiondb

A large-scale text-to-image prompt gallery dataset based on Stable Diffusion

ai-art computer-vision image-generation prompt-engineering stable-diffusion

Last synced: 16 May 2025

https://github.com/yatengLG/ISAT_with_segment_anything

Labeling tool with SAM (Segment Anything Model); supports SAM, SAM2, sam-hq, MobileSAM, EdgeSAM, etc. An interactive semi-automatic image annotation tool.

annotation-tool computer-vision labeling labeling-tool sam sam2 segment-anything segment-anything-2 video-segmentation

Last synced: 09 Mar 2025

https://poloclub.github.io/diffusiondb/

A large-scale text-to-image prompt gallery dataset based on Stable Diffusion

ai-art computer-vision image-generation prompt-engineering stable-diffusion

Last synced: 07 May 2025

https://github.com/open-mmlab/mmengine

OpenMMLab Foundational Library for Training Deep Learning Models

ai computer-vision deep-learning machine-learning python pytorch

Last synced: 13 May 2025

https://github.com/amaiya/ktrain

ktrain is a Python library that makes deep learning and AI more accessible and easier to apply

computer-vision deep-learning graph-neural-networks keras machine-learning nlp python tabular-data tensorflow

Last synced: 29 Apr 2025

https://github.com/lingdong-/qiji-font

齊伋體 (Qiji typeface) - a typeface from Ming Dynasty woodblock-printed books

chinese classical-chinese computer-vision font typeface typography woodcutting

Last synced: 12 Apr 2025

https://github.com/swz30/MPRNet

[CVPR 2021] Multi-Stage Progressive Image Restoration. SOTA results for Image deblurring, deraining, and denoising.

computer-vision cvpr-2021 cvpr2021 cvpr21 image-deblurring image-denoising image-deraining image-restoration low-level-vision multistage-network progressive-restoration pytorch

Last synced: 07 Apr 2025

https://github.com/LingDong-/qiji-font

齊伋體 (Qiji typeface) - a typeface from Ming Dynasty woodblock-printed books

chinese classical-chinese computer-vision font typeface typography woodcutting

Last synced: 07 May 2025

https://github.com/open-compass/VLMEvalKit

Open-source evaluation toolkit for large vision-language models (LVLMs); supports ~100 VLMs and 40+ benchmarks

chatgpt claude clip computer-vision evaluation gemini gpt gpt-4v gpt4 large-language-models llava llm multi-modal openai openai-api pytorch qwen vit vqa

Last synced: 20 Jul 2025

https://github.com/imagej/imagej2

Open scientific N-dimensional image processing

computer-vision image-processing

Last synced: 14 May 2025

https://github.com/captain1986/captainblackboard

The Captain's summaries and notes on machine learning, computer vision, and engineering technology

computer-vision convolutional-neural-networks deep-learning optimization-algorithms summary

Last synced: 16 May 2025

https://github.com/pprp/simplecvreproduction

Replication of simple CV Projects including attention, classification, detection, keypoint detection, etc.

attention classification computer-vision cv demo face-detection landmark object-detection paper-reproduction pytorch

Last synced: 16 May 2025

https://github.com/br-idl/paddlevit

PaddleViT: State-of-the-art Visual Transformer and MLP Models for PaddlePaddle 2.0+

classification computer-vision cv deep-learning detection encoder-decoder gan mlp object-detection paddlepaddle segmentation semantic-segmentation transformer vit

Last synced: 14 Apr 2025

https://github.com/huoyijie/AdvancedEAST

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST, with significant improvements that make long-text predictions more accurate. https://github.com/huoyijie/raspberrypi-car

advancedeast advancedeast-network-arch algorithm bellow computer-vision deep-learning east icpr keras machine-learning python scene tensorflow text-detect text-predictions tian-chi tianchi

Last synced: 02 Apr 2025

https://github.com/huoyijie/advancedeast

AdvancedEAST is an algorithm for scene image text detection. It is primarily based on EAST, with significant improvements that make long-text predictions more accurate. https://github.com/huoyijie/raspberrypi-car

advancedeast advancedeast-network-arch algorithm bellow computer-vision deep-learning east icpr keras machine-learning python scene tensorflow text-detect text-predictions tian-chi tianchi

Last synced: 16 May 2025

https://github.com/nachifur/mulimgviewer

MulimgViewer is a multi-image viewer that can open multiple images in one interface, which is convenient for image comparison and image stitching.

computer-vision deep-learning image-comparison image-stitching image-viewer multiple-image-comparison multiple-images multiple-imageview opencas parallel picture-viewer python3 ubuntu viewer windows10

Last synced: 14 May 2025

https://github.com/rqluo/mixtex-latex-ocr

MixTeX: multimodal OCR for LaTeX, Chinese-English text, and tables. It performs efficient CPU-based inference locally and offline on Windows.

computer-vision deep-learning latex machine-learning ocr onnx python

Last synced: 14 May 2025

https://github.com/xvjiarui/gcnet

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

computer-vision deep-learning instance-segmentation object-detection

Last synced: 16 May 2025

https://github.com/xvjiarui/GCNet

GCNet: Non-local Networks Meet Squeeze-Excitation Networks and Beyond

computer-vision deep-learning instance-segmentation object-detection

Last synced: 19 Jul 2025

https://github.com/utiasstars/pykitti

Python tools for working with KITTI data.

computer-vision kitti-dataset python robotics

Last synced: 15 May 2025

https://github.com/openpifpaf/openpifpaf

Official implementation of "OpenPifPaf: Composite Fields for Semantic Keypoint Detection and Spatio-Temporal Association" in PyTorch.

composite-fields computer-vision deep-learning human-pose-estimation keypoint-estimation pose-estimation

Last synced: 14 May 2025

https://github.com/utiasSTARS/pykitti

Python tools for working with KITTI data.

computer-vision kitti-dataset python robotics

Last synced: 20 Mar 2025

Computer vision Awesome Lists
Awesome-pytorch-list 707
awesome-multimodal-ml 480
awesome-self-supervised-learning 463
Awesome-Transformer-Attention 2,160
awesome_3DReconstruction_list 169
awesome-industrial-anomaly-detection 1,102
awesome-hand-pose-estimation 497
awesome-image-classification 220
Awesome-Crowd-Counting 468
awesome-human-pose-estimation 89
awesome-autonomous-vehicles 296
Awesome-World-Model 725
Awesome-Federated-Learning 558
Awesome-FL 4,069
awesome-low-light-image-enhancement 219
Awesome-pytorch-list-CNVersion 692
Awesome-Interaction-aware-Trajectory-Prediction 564
Awesome-Implicit-NeRF-Robotics 191
iOS_ML 39
awesome-tensorflow-lite 110
CV-pretrained-model 103
awesome-attention-mechanism-in-cv 195
Awesome-Image-Colorization 150
awesome-grounding 157
openstl 43
Awesome-Open-Vocabulary 162
awesome-ai-awesomeness 236
awesome-capsule-networks 67
awesome-autonomous-vehicle 181
awesome-6d-object 600
awesome-multi-task-learning 233
awesome-robotics-3d 111
awesome-photogrammetry 90
awesome-open-data-centric-ai 56
awesome-ai-data-guided-projects 56
Awesome-3D-Object-Detection 169
Awesome-Skeleton-based-Action-Recognition 104
awesome-optical-flow 110
awesome-holistic-3d 129
awesome-data-annotation 93
Awesome-Parameter-Efficient-Transfer-Learning 124
awesome-panoptic-segmentation 45
awesome-computer-vision-models 189
awesome-robotics-datasets 79
awesome-state-of-depth-completion 71
awesome-nerf-editing 537
arctic 34
awesome-image-alignment-and-stitching 108
Awesome-Distributed-Deep-Learning 44
Awesome-Monocular-3D-detection 97