Projects in Awesome Lists tagged with vision
A curated list of projects in awesome lists tagged with vision.
https://github.com/bvlc/caffe
Caffe: a fast open framework for deep learning.
deep-learning machine-learning vision
Last synced: 13 May 2025
https://github.com/BVLC/caffe
Caffe: a fast open framework for deep learning.
deep-learning machine-learning vision
Last synced: 14 Mar 2025
https://github.com/xtls/xray-core
Xray, Penetrates Everything. Also the best v2ray-core. Where the magic happens. An open platform for various uses.
anticensorship dns network proxy reality shadowsocks socks5 tls trojan tunnel utls vision vless vmess vpn wireguard xhttp xray xtls xudp
Last synced: 09 Sep 2025
https://github.com/danny-avila/librechat
Enhanced ChatGPT Clone: Features Agents, DeepSeek, Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure Multi-User Auth, Presets, open-source for self-hosting. Active project.
ai anthropic artifacts assistant-api aws azure chatgpt chatgpt-clone claude clone dall-e-3 deepseek gemini google librechat o1 openai plugins vision webui
Last synced: 09 Sep 2025
https://github.com/bytedance/UI-TARS-desktop
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
agent agent-tars browser-use computer-use gui-agent gui-operator mcp mcp-server multimodal tars ui-tars vision vlm
Last synced: 06 Oct 2025
https://github.com/danny-avila/LibreChat
Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
ai anthropic artifacts assistant-api aws azure chatgpt chatgpt-clone claude clone dall-e-3 gemini google librechat o1 openai plugins search vision webui
Last synced: 20 Mar 2025
https://github.com/mediar-ai/screenpipe
AI app store powered by 24/7 desktop history. open source | 100% local | dev friendly | 24/7 screen, mic recording
agents agi ai computer-vision llm machine-learning ml multimodal vision
Last synced: 13 May 2025
https://github.com/skyvern-ai/skyvern
Automate browser-based workflows with LLMs and Computer Vision
api automation browser browser-automation computer gpt llm playwright python rpa vision workflow
Last synced: 12 May 2025
https://github.com/bytedance/ui-tars-desktop
A GUI Agent application based on UI-TARS (Vision-Language Model) that allows you to control your computer using natural language.
agent browser-use computer-use electron gui-agents mcp mcp-server vision vite vlm
Last synced: 09 Sep 2025
https://github.com/Skyvern-AI/skyvern
Automate browser-based workflows with LLMs and Computer Vision
api automation browser browser-automation computer gpt llm playwright python rpa vision workflow
Last synced: 07 Apr 2025
https://github.com/Skyvern-AI/Skyvern
Automate browser-based workflows with LLMs and Computer Vision
api automation browser browser-automation computer gpt llm playwright python rpa vision workflow
Last synced: 09 Mar 2025
https://github.com/mrousavy/react-native-vision-camera
📸 A powerful, high-performance React Native Camera library.
ai android barcode camera instagram ios javascript jsi library native qr qrcode react react-native react-native-camera scanner snapchat typescript vision worklet
Last synced: 09 Sep 2025
https://github.com/dooy/chatgpt-web-midjourney-proxy
One UI for ChatGPT web, Midjourney, GPTs, Suno, Luma, Runway, Viggle, Flux, Ideogram, Realtime, Pika, and Udio; supports Web, PWA, Linux, Windows, and macOS simultaneously.
chatgpt-ui claude-3 flux gpts gpts-ui gptstore ideogram kling luma midjourney midjourney-ui pika realtime runway suno udio viggle vision whisper-ui
Last synced: 23 Apr 2025
https://github.com/TEN-framework/ten-framework
The world’s first real-time, distributed, cloud-edge collaborative multimodal AI Agent Framework that simultaneously supports C/C++/Go/Python/JS/TS
agents ai audio-video cloud-edge-computing cpp cross-platform go golang javascript llm low-latency multimodal package-management python realtime rust typescript vision voice-assistant
Last synced: 03 May 2025
https://github.com/ten-framework/ten-agent
TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.
agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant
Last synced: 31 Mar 2025
https://github.com/TEN-framework/TEN-Agent
TEN Agent is a conversational voice AI agent powered by TEN, integrating Deepseek, Gemini, OpenAI, RTC, and hardware like ESP32. It enables realtime AI capabilities like seeing, hearing, and speaking, and is fully compatible with platforms like Dify and Coze.
agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant
Last synced: 08 Mar 2025
https://github.com/autorope/donkeycar
Open source hardware and software platform to build a small scale self driving car.
cv2 donkeycar jetson-nano keras python raspberry-pi self-driving-car tensorflow vision
Last synced: 29 Apr 2025
https://github.com/sightmachine/SimpleCV
The Open Source Framework for Machine Vision
computer-vision cv image-processing python vision visionprocessing
Last synced: 18 Mar 2025
https://github.com/nextlevel/nextlevel
⬆️ Media Capture in Swift
ar arkit augmented-reality avfoundation camera capture coreimage custom instagram ios media mixed-reality nextlevel photography snapchat swift tiktok video vision
Last synced: 14 May 2025
https://github.com/NextLevel/NextLevel
⬆️ Media Capture in Swift
ar arkit augmented-reality avfoundation camera capture coreimage custom instagram ios media mixed-reality nextlevel photography snapchat swift tiktok video vision
Last synced: 06 Aug 2025
https://github.com/andyzeng/tsdf-fusion-python
Python code to fuse multiple RGB-D images into a TSDF voxel volume.
3d 3d-deep-learning 3d-reconstruction artificial-intelligence cuda depth-camera kinect-fusion rgbd tsdf vision volumetric-data
Last synced: 16 May 2025
https://github.com/roatienza/deep-learning-experiments
Videos, notes and experiments to understand deep learning
artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision
Last synced: 14 May 2025
https://github.com/roatienza/Deep-Learning-Experiments
Videos, notes and experiments to understand deep learning
artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision
Last synced: 27 Mar 2025
https://github.com/jenly1314/mlkit
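🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode scanning, image labeling, face detection, object detection, and more.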
android barcode barcode-scanning camerax face-detection image-labeling machine-learning machine-learning-library mlkit object-detection object-recognition ocr pose-detection qrcode recognition segmentation-selfie text-recognition vision
Last synced: 14 May 2025
https://github.com/kevingong2013/chineseidcardocr
[Deprecated] 🇨🇳 Optical recognition of second-generation Chinese ID cards
cnn coreml deep-learning ios11 machine-learning swift vision xcode
Last synced: 13 Apr 2025
https://github.com/KevinGong2013/ChineseIDCardOCR
[Deprecated] 🇨🇳 Optical recognition of second-generation Chinese ID cards
cnn coreml deep-learning ios11 machine-learning swift vision xcode
Last synced: 23 Apr 2025
https://github.com/lucidrains/mlp-mixer-pytorch
An All-MLP solution for Vision, from Google AI
Last synced: 15 May 2025
https://github.com/andyzeng/visual-pushing-grasping
Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.
3d artificial-intelligence computer-vision deep-learning deep-reinforcement-learning grasping manipulation pushing robotics vision
Last synced: 16 May 2025
https://github.com/deepdrive/deepdrive
Deepdrive is a simulator that allows anyone with a PC to push the state-of-the-art in self-driving
competition control deep-learning deep-reinforcement-learning gym python reinforcement-learning self-driving-car sensorimotor simulation tensorflow transfer-learning unreal-engine vision
Last synced: 01 Apr 2025
https://github.com/anupamchugh/iowncode
A curated collection of iOS, ML, AR resources sprinkled with some UI additions
alamofire arkit computer-vision coreml coremltools ios keras ml-kit natural-language-processing nlp realitykit swift swiftui vision vision-framework
Last synced: 22 Jul 2025
https://github.com/andyzeng/3dmatch-toolbox
3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.
3d 3d-deep-learning 3dmatch artificial-intelligence computer-vision deep-learning geometry-processing point-cloud rgbd vision
Last synced: 16 May 2025
https://github.com/jasmcaus/caer
High-performance Vision library in Python. Scale your research, not boilerplate.
ai artificial-intelligence augmentation caer computer-vision cuda data-science deep-learning gpu image-classification image-processing image-segmentation machine-learning neural-network opencv python segmentation type-checking video-processing vision
Last synced: 15 May 2025
https://github.com/valentinfrlch/ha-llmvision
Let Home Assistant see!
ai hacs-integration home-assistant image-analysis llm vision
Last synced: 15 May 2025
https://github.com/andyzeng/tsdf-fusion
Fuse multiple depth frames into a TSDF voxel volume.
3d 3d-deep-learning 3d-reconstruction artificial-intelligence cuda depth-camera kinect-fusion rgbd tsdf vision volumetric-data
Last synced: 05 Apr 2025
https://github.com/evilgix/evil
Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers.
cnn-model keras machine-learning ocr swift4 vision
Last synced: 05 Apr 2025
https://github.com/evilgix/Evil
Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers.
cnn-model keras machine-learning ocr swift4 vision
Last synced: 15 May 2025
https://github.com/lucidrains/bottleneck-transformer-pytorch
Implementation of Bottleneck Transformer in Pytorch
artificial-intelligence attention-mechanism deep-learning image-classification transformers vision
Last synced: 16 May 2025
https://github.com/mostafasadeghi97/design2code
Convert any web design screenshot to clean HTML/CSS code
ai code-generation coding-assistant design-to-code gpt4 openai vision
Last synced: 03 Apr 2025
https://github.com/ovidijusparsiunas/myvision
Computer vision based ML training data generation tool :rocket:
ai annotation annotation-tool coco computer-vision image image-annotation label labeling-tool labelling machine-learning ml model object-detection tagging tensorflow training-data vgg vision yolo
Last synced: 15 May 2025
https://github.com/google-research/ravens
Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.
artificial-intelligence computer-vision deep-learning imitation-learning manipulation openai-gym pick-and-place pybullet rearrangement reinforcement-learning robotics tensorflow transporter-nets vision
Last synced: 16 May 2025
https://github.com/cary-sas/v2ray_bin
A heavily modified censorship-circumvention plugin for Merlin 380 firmware
armv5 asuswrt-merlin grpc hysteria2 koolshare naiveproxy reality shadowsocks shadowsocks-2022 ss ssr trojan trojan-go v2ray vision vison vless vmess xray xtls
Last synced: 27 Sep 2025
https://github.com/OvidijusParsiunas/myvision
Computer vision based ML training data generation tool :rocket:
ai annotation annotation-tool coco computer-vision image image-annotation label labeling-tool labelling machine-learning ml model object-detection tagging tensorflow training-data vgg vision yolo
Last synced: 20 Mar 2025
https://github.com/robotlocomotion/pytorch-dense-correspondence
Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"
3d artificial-intelligence computer-vision deep-learning manipulation pytorch robotics self-supervised-learning vision
Last synced: 04 Apr 2025
https://github.com/RobotLocomotion/pytorch-dense-correspondence
Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"
3d artificial-intelligence computer-vision deep-learning manipulation pytorch robotics self-supervised-learning vision
Last synced: 07 May 2025
https://github.com/davidbau/rewriting
Rewriting a Deep Generative Model, ECCV 2020 (oral). Interactive tool to directly edit the rules of a GAN to synthesize scenes with objects added, removed, or altered. Change StyleGANv2 to make extravagant eyebrows, or horses wearing hats.
deep-learning gans graphics hci machine-learning research vision
Last synced: 04 Apr 2025
https://github.com/rowanz/neural-motifs
Code for Neural Motifs: Scene Graph Parsing with Global Context (CVPR 2018)
pytorch scene-graph vision visual-genome
Last synced: 02 Apr 2025
https://github.com/hrnet/hrformer
[NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".
classification hrnet pose-estimation segmentation transformer vision
Last synced: 05 Apr 2025
https://github.com/HRNet/HRFormer
[NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".
classification hrnet pose-estimation segmentation transformer vision
Last synced: 12 May 2025
https://github.com/KimDarren/FaceCropper
:scissors: Crop faces, inside of your image, with iOS 11 Vision api.
face face-detection face-recognition ios ios11 swift vision vision-api
Last synced: 02 Aug 2025
https://github.com/ictnlp/llava-mini
LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.
efficient gpt4o gpt4v large-language-models large-multimodal-models llama llava multimodal multimodal-large-language-models video vision vision-language-model visual-instruction-tuning
Last synced: 16 May 2025
https://github.com/myndex/sapc-apca
APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from the SAPC (S-LUV Advanced Predictive Color), which is an accessibility-oriented color appearance model designed for self-illuminated displays.
accessibility apca cieluv color color-contrast color-contrast-checker color-models color-theory colorimetry contrast contrast-calculator css luminance readability srgb vision wcag wcag-contrast web
Last synced: 16 May 2025
https://github.com/rowanz/r2c
Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)
commonsense reasoning vcr vision visual visual-commonsense-reasoning
Last synced: 13 Apr 2025
https://github.com/kscottz/pythonfromspace
Python Examples for Remote Sensing
computer-vision gis image-processing imaging jupyter-notebook python satellite vision
Last synced: 05 Apr 2025
https://github.com/okalachev/arucogen
Online ArUco markers generator
apriltags aruco aruco-markers fiducial-markers generator vision website
Last synced: 05 Apr 2025
https://github.com/kscottz/PythonFromSpace
Python Examples for Remote Sensing
computer-vision gis image-processing imaging jupyter-notebook python satellite vision
Last synced: 08 May 2025
https://github.com/tomrunia/opticalflow_visualization
Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge
iccv motion opencv optical-flow python vision visualization
Last synced: 09 Apr 2025
https://github.com/zihangjiang/tokenlabeling
Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
imagenet lv-vit pytorch segmentation transformer vision
Last synced: 04 Apr 2025
https://github.com/zihangJiang/TokenLabeling
Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"
imagenet lv-vit pytorch segmentation transformer vision
Last synced: 05 May 2025
https://github.com/Myndex/SAPC-APCA
APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from the SAPC (S-LUV Advanced Predictive Color), which is an accessibility-oriented color appearance model designed for self-illuminated displays.
accessibility apca cieluv color color-contrast color-contrast-checker color-models color-theory colorimetry contrast contrast-calculator css luminance readability srgb vision wcag wcag-contrast web
Last synced: 07 May 2025
https://github.com/rentainhe/visualization
A collection of visualization functions
attention attention-map attention-mechanism data-visualization deep-learning transformer vision vision-mlp vision-transformer visualization
Last synced: 04 Apr 2025
https://github.com/AprilRobotics/apriltag_ros
A ROS wrapper of the AprilTag 3 visual fiducial detector
apriltags fiducial-markers ros vision wrapper
Last synced: 05 May 2025
https://github.com/aprilrobotics/apriltag_ros
A ROS wrapper of the AprilTag 3 visual fiducial detector
apriltags fiducial-markers ros vision wrapper
Last synced: 12 Apr 2025
https://github.com/WPIRoboticsProjects/GRIP
Program for rapidly developing computer vision applications
camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi
Last synced: 11 May 2025
https://github.com/wpiroboticsprojects/grip
Program for rapidly developing computer vision applications
camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi
Last synced: 05 Apr 2025
https://github.com/photonvision/photonvision
PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.
computer-vision frc java opencv vision vision-processing wpilib
Last synced: 16 Dec 2025
https://github.com/aras62/vision-based-prediction
Deep Learning for Vision-based Prediction
actions dataset deep-learning metrics motion papers prediction trajectory video vision
Last synced: 16 May 2025
https://github.com/bbc-esq/vectordb-plugin
Plugin that lets you ask questions about your documents including audio and video files.
bark database-management embedding-models embedding-vectors embeddings gtts koboldai koboldcpp python rag retrieval-augmented-generation retrieval-chatbot tiledb vector-data-management vector-database vector-search vision whisper whispers2t whisperspeech
Last synced: 16 May 2025
https://github.com/cocoa-ai/FacesVisionDemo
👀 iOS11 demo application for age and gender classification of facial images.
coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision
Last synced: 11 May 2025
https://github.com/cocoa-ai/facesvisiondemo
👀 iOS11 demo application for age and gender classification of facial images.
coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision
Last synced: 04 Aug 2025
https://github.com/andyzeng/arc-robot-vision
MIT-Princeton Vision Toolbox for Robotic Pick-and-Place at the Amazon Robotics Challenge 2017 - Robotic Grasping and One-shot Recognition of Novel Objects with Deep Learning.
3d amazon-robotics-challenge artificial-intelligence computer-vision deep-learning grasping manipulation mit-princeton rgbd vision
Last synced: 09 Apr 2025
https://github.com/andyzeng/apc-vision-toolbox
MIT-Princeton Vision Toolbox for the Amazon Picking Challenge 2016 - RGB-D ConvNet-based object segmentation and 6D object pose estimation.
3d amazon-picking-challenge artificial-intelligence computer-vision deep-learning marvin mit-princeton rgbd ros segmentation vision
Last synced: 09 Apr 2025
https://github.com/Feghal/ImageDetect
✂️ Detect and crop faces, barcodes, and text in an image with the iOS 11 Vision API.
barcode detector face face-detection face-recognition ios ios11 recognition swift vision vision-api
Last synced: 06 Aug 2025
https://github.com/ZhangGongjie/SAM-DETR
[CVPR'2022] SAM-DETR & SAM-DETR++: Official PyTorch Implementation
computer-vision cvpr cvpr2022 deep-learning detection detr machine-learning object-detection pytorch transformer vision vision-transformer
Last synced: 20 Mar 2025
https://github.com/cheind/dest
:panda_face: One Millisecond Deformable Shape Tracking Library (DEST)
face-alignment face-detector machine-learning vision
Last synced: 19 Jun 2025
https://github.com/BBC-Esq/VectorDB-Plugin
Plugin that lets you ask questions about your documents including audio and video files.
bark database-management embedding-models embedding-vectors embeddings gtts koboldai koboldcpp python rag retrieval-augmented-generation retrieval-chatbot tiledb vector-data-management vector-database vector-search vision whisper whispers2t whisperspeech
Last synced: 25 Oct 2025
https://github.com/harishdeivanayagam/rowfill
Open-source unstructured data (PDFs, Images, Audiofiles) processing platform built for knowledge workers
document document-extraction document-parsing image-ocr langgraph llama llm nextjs ocr ocr-javascript ollama openai pdf pdfs unstructured unstructured-data vision vision-api
Last synced: 13 Apr 2025
https://github.com/Olney1/ChatGPT-OpenAI-Smart-Speaker
This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice and vision-driven conversations, with additional web search capabilities via OpenAI and Langchain agents.
agents ai artificial-intelligence chatgpt gpt-4 langchain langsmith openai smarthome smartspeaker speech-recognition speech-to-text tavily text-to-speech vision vision-and-language webscraping
Last synced: 07 Apr 2025
https://github.com/FuxiaoLiu/LRV-Instruction?tab=readme-ov-file
[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
chatgpt evaluation evaluation-metrics foundation-models gpt gpt-4 hallucination iclr iclr2024 llama llava multimodal object-detection prompt-engineering vicuna vision vision-and-language vqa
Last synced: 29 Mar 2025
https://github.com/brakmic/OpenCV
:camera: Computer-Vision Demos
computer-vision ocr ocr-recognition opencv scanimage scanned-documents scanning vision
Last synced: 04 Apr 2025
https://github.com/brakmic/opencv
:camera: Computer-Vision Demos
computer-vision ocr ocr-recognition opencv scanimage scanned-documents scanning vision
Last synced: 12 Sep 2025
https://github.com/olney1/chatgpt-openai-smart-speaker
This AI Smart Speaker uses speech recognition, TTS (text-to-speech), and STT (speech-to-text) to enable voice and vision-driven conversations, with additional web search capabilities via OpenAI and Langchain agents.
agents ai artificial-intelligence chatgpt gpt-4 langchain langsmith openai smarthome smartspeaker speech-recognition speech-to-text tavily text-to-speech vision vision-and-language webscraping
Last synced: 03 Oct 2025
https://github.com/DroidsOnRoids/VisionFaceDetection
An example of using the Vision framework for face landmark detection in iOS 11
ios11 landmark-detection landmarks vision vision-framework xcode9
Last synced: 22 Feb 2025
https://github.com/gabeur/mmt
Multi-Modal Transformer for Video Retrieval
fusion language multimodal nlp video vision
Last synced: 12 May 2025
https://github.com/fcakyon/craft-text-detector
Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow
Last synced: 02 Apr 2025
https://github.com/rishikksh20/FNet-pytorch
Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms
feedforward-neural-network fnet fourier-transform image-classification language-model text text-classification transformer vision
Last synced: 08 May 2025
https://github.com/georgegach/flowiz
Converts Optical Flow files to images and optionally compiles them to a video. Flow viewer GUI is also available. Check out mockup right from Github Pages:
converter flo flow image middlebury optical python video vision visualisation visualization
Last synced: 04 Apr 2025
https://github.com/aangelopoulos/conformal_classification
Wrapper for a PyTorch classifier which allows it to output prediction sets. The sets are theoretically guaranteed to contain the true class with high probability (via conformal prediction).
artificial-intelligence classification classifier computer-vision conformal conformal-prediction deep-neural-networks distribution-free imagenet machine-learning neural-networks nonparametric nonparametric-statistics prediction-sets pytorch statistics uncertainty uncertainty-quantification vision
Last synced: 04 Apr 2025
https://github.com/alireza-akhavan/class.vision
Computer vision and Deep learning
computer-vision course-materials image-processing opencv opencv-python pix2pix python tensorflow vision
Last synced: 24 Aug 2025
https://github.com/zsajjad/react-native-text-detector
Text detector for images in React Native, using Firebase MLKit on Android and Tesseract on iOS
core-ml firebase-mlkit react-native tesseract-ios tesseract-ocr text-detection vision
Last synced: 13 May 2025