Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with vision

A curated list of projects in awesome lists tagged with vision.

https://github.com/bvlc/caffe

Caffe: a fast open framework for deep learning.

deep-learning machine-learning vision

Last synced: 16 Dec 2024

https://github.com/BVLC/caffe

Caffe: a fast open framework for deep learning.

deep-learning machine-learning vision

Last synced: 25 Oct 2024

https://github.com/xtls/xray-core

Xray, Penetrates Everything. Also the best v2ray-core, with XTLS support. Fully compatible configuration.

anticensorship dns network proxy reality shadowsocks socks5 tls trojan tunnel utls vision vless vmess vpn wireguard xhttp xray xtls xudp

Last synced: 16 Dec 2024

https://github.com/danny-avila/librechat

Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.

ai anthropic artifacts assistant-api aws azure chatgpt chatgpt-clone claude clone dall-e-3 gemini google librechat o1 openai plugins search vision webui

Last synced: 16 Dec 2024

https://github.com/PaddlePaddle/PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please wait patiently]

awesome deep-learning model nlp text2image vision

Last synced: 29 Oct 2024

https://github.com/paddlepaddle/paddlehub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please wait patiently]

awesome deep-learning model nlp text2image vision

Last synced: 29 Sep 2024

https://github.com/danny-avila/LibreChat

Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development

ai anthropic assistant-api azure bing chatgpt chatgpt-clone claude clone dall-e-3 gemini google gpt-4-vision langchain librechat openai plugins search vision webui

Last synced: 27 Oct 2024

https://github.com/mediar-ai/screenpipe

Rust API to get all user desktop data (local, cross-platform, 24/7, screen, voice, keyboard, mouse, camera recording). Sandboxed JS plugin system. Keyboard and mouse control.

agents agi ai computer-vision llm machine-learning ml multimodal vision

Last synced: 17 Dec 2024

https://github.com/Skyvern-AI/skyvern

Automate browser-based workflows with LLMs and Computer Vision

api automation browser browser-automation computer gpt llm playwright python rpa vision workflow

Last synced: 06 Nov 2024

https://github.com/skyvern-ai/skyvern

Automate browser-based workflows with LLMs and Computer Vision

api automation browser browser-automation computer gpt llm playwright python rpa vision workflow

Last synced: 17 Dec 2024

https://github.com/Skyvern-AI/Skyvern

Automate browser-based workflows with LLMs and Computer Vision

api automation browser browser-automation computer gpt llm playwright python rpa vision workflow

Last synced: 22 Oct 2024

https://github.com/dooy/chatgpt-web-midjourney-proxy

One UI for ChatGPT web, Midjourney, GPTs, Suno, Luma, Runway, Viggle, Flux, Ideogram, Realtime, Pika, and Udio; supports the Web, PWA, Linux, Windows, and macOS platforms simultaneously

chatgpt-ui claude-3 flux gpts gpts-ui gptstore ideogram kling luma midjourney midjourney-ui pika realtime runway suno udio viggle vision whisper-ui

Last synced: 17 Dec 2024

https://github.com/artemnovichkov/iOS-11-by-Examples

👨🏻‍💻 Examples of new iOS 11 APIs

arkit core-nfc coreml ios11 swift vision xcode9

Last synced: 18 Nov 2024

https://github.com/artemnovichkov/ios-11-by-examples

👨🏻‍💻 Examples of new iOS 11 APIs

arkit core-nfc coreml ios11 swift vision xcode9

Last synced: 20 Dec 2024

https://github.com/autorope/donkeycar

Open source hardware and software platform to build a small scale self driving car.

cv2 donkeycar jetson-nano keras python raspberry-pi self-driving-car tensorflow vision

Last synced: 16 Dec 2024

https://github.com/sightmachine/SimpleCV

The Open Source Framework for Machine Vision

computer-vision cv image-processing python vision visionprocessing

Last synced: 27 Oct 2024

https://github.com/googlecloudplatform/java-docs-samples

Java and Kotlin Code samples used on cloud.google.com

appengine auth automl cdn java kotlin samples translate video vision

Last synced: 17 Dec 2024

https://github.com/ten-framework/ten-agent

TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.

agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant

Last synced: 01 Nov 2024

https://github.com/roatienza/deep-learning-experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 19 Dec 2024

https://github.com/roatienza/Deep-Learning-Experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 30 Oct 2024

https://github.com/kevingong2013/chineseidcardocr

[Deprecated] 🇨🇳 Optical character recognition for second-generation Chinese ID cards

cnn coreml deep-learning ios11 machine-learning swift vision xcode

Last synced: 18 Dec 2024

https://github.com/KevinGong2013/ChineseIDCardOCR

[Deprecated] 🇨🇳 Optical character recognition for second-generation Chinese ID cards

cnn coreml deep-learning ios11 machine-learning swift vision xcode

Last synced: 10 Nov 2024

https://github.com/TEN-framework/TEN-Agent

TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.

agent ai asr cpp gemini golang gpt-4 gpt-4o llm low-latency multimodal nextjs14 openai python rag real-time realtime tts vision voice-assistant

Last synced: 22 Oct 2024

https://github.com/lucidrains/mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

deep-learning vision

Last synced: 19 Dec 2024

https://github.com/aheze/openfind

An app to find text in real life.

app camera find hacktoberfest ios ocr photos realm swift swiftui uikit vision

Last synced: 18 Dec 2024

https://github.com/aheze/OpenFind

An app to find text in real life.

app camera find hacktoberfest ios ocr photos realm swift swiftui uikit vision

Last synced: 30 Oct 2024

https://github.com/jenly1314/mlkit

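🌝 MLKit is a powerful, easy-to-use toolkit. With ML Kit you can easily implement text recognition, barcode recognition, image labeling, face detection, object detection, and more.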

android barcode barcode-scanning camerax face-detection image-labeling machine-learning machine-learning-library mlkit object-detection object-recognition ocr pose-detection qrcode recognition segmentation-selfie text-recognition vision

Last synced: 20 Dec 2024

https://github.com/andyzeng/visual-pushing-grasping

Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.

3d artificial-intelligence computer-vision deep-learning deep-reinforcement-learning grasping manipulation pushing robotics vision

Last synced: 18 Dec 2024

https://github.com/aravisproject/aravis

A vision library for genicam based cameras

c camera genicam gige glib gobject gobject-introspection gstreamer gtk3 meson usb3 video vision

Last synced: 19 Dec 2024

https://github.com/AravisProject/aravis

A vision library for genicam based cameras

c camera genicam gige glib gobject gobject-introspection gstreamer gtk3 meson usb3 video vision

Last synced: 03 Nov 2024

https://github.com/anupamchugh/iowncode

A curated collection of iOS, ML, AR resources sprinkled with some UI additions

alamofire arkit computer-vision coreml coremltools ios keras ml-kit natural-language-processing nlp realitykit swift swiftui vision vision-framework

Last synced: 29 Nov 2024

https://github.com/andyzeng/3dmatch-toolbox

3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.

3d 3d-deep-learning 3dmatch artificial-intelligence computer-vision deep-learning geometry-processing point-cloud rgbd vision

Last synced: 17 Dec 2024

https://github.com/Celebrandil/CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)

cuda gpu nvidia sift vision

Last synced: 13 Nov 2024

https://github.com/evilgix/Evil

Optical Character Recognition in Swift for iOS & macOS. Recognizes bank cards, ID cards, and house numbers.

cnn-model keras machine-learning ocr swift4 vision

Last synced: 19 Nov 2024

https://github.com/evilgix/evil

Optical Character Recognition in Swift for iOS & macOS. Recognizes bank cards, ID cards, and house numbers.

cnn-model keras machine-learning ocr swift4 vision

Last synced: 22 Dec 2024

https://github.com/google-research/ravens

Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.

artificial-intelligence computer-vision deep-learning imitation-learning manipulation openai-gym pick-and-place pybullet rearrangement reinforcement-learning robotics tensorflow transporter-nets vision

Last synced: 15 Dec 2024

https://github.com/anki/vector-python-sdk

Anki Vector Python SDK

ai anki robot robotics vector vision

Last synced: 17 Dec 2024

https://github.com/robotlocomotion/pytorch-dense-correspondence

Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"

3d artificial-intelligence computer-vision deep-learning manipulation pytorch robotics self-supervised-learning vision

Last synced: 15 Dec 2024

https://github.com/RobotLocomotion/pytorch-dense-correspondence

Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"

3d artificial-intelligence computer-vision deep-learning manipulation pytorch robotics self-supervised-learning vision

Last synced: 14 Nov 2024

https://github.com/mostafasadeghi97/design2code

Convert any web design screenshot to clean HTML/CSS code

ai code-generation coding-assistant design-to-code gpt4 openai vision

Last synced: 04 Nov 2024

https://github.com/davidbau/rewriting

Rewriting a Deep Generative Model, ECCV 2020 (oral). Interactive tool to directly edit the rules of a GAN to synthesize scenes with objects added, removed, or altered. Change StyleGANv2 to make extravagant eyebrows, or horses wearing hats.

deep-learning gans graphics hci machine-learning research vision

Last synced: 21 Dec 2024

https://github.com/rowanz/neural-motifs

Code for Neural Motifs: Scene Graph Parsing with Global Context (CVPR 2018)

pytorch scene-graph vision visual-genome

Last synced: 02 Nov 2024

https://github.com/hrnet/hrformer

[NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".

classification hrnet pose-estimation segmentation transformer vision

Last synced: 16 Dec 2024

https://github.com/KimDarren/FaceCropper

✂️ Crop faces inside your image with the iOS 11 Vision API.

face face-detection face-recognition ios ios11 swift vision vision-api

Last synced: 06 Dec 2024

https://github.com/HRNet/HRFormer

[NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".

classification hrnet pose-estimation segmentation transformer vision

Last synced: 18 Nov 2024

https://github.com/rowanz/r2c

Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)

commonsense reasoning vcr vision visual visual-commonsense-reasoning

Last synced: 07 Nov 2024

https://github.com/myndex/sapc-apca

APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from SAPC (S-LUV Advanced Predictive Color), an accessibility-oriented color appearance model designed for self-illuminated displays.

accessibility apca cieluv color color-contrast color-contrast-checker color-models color-theory colorimetry contrast contrast-calculator css luminance readability srgb vision wcag wcag-contrast web

Last synced: 13 Dec 2024

https://github.com/zihangJiang/TokenLabeling

Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"

imagenet lv-vit pytorch segmentation transformer vision

Last synced: 13 Nov 2024

https://github.com/tomrunia/opticalflow_visualization

Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge

iccv motion opencv optical-flow python vision visualization

Last synced: 16 Dec 2024

https://github.com/Myndex/SAPC-APCA

APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from SAPC (S-LUV Advanced Predictive Color), an accessibility-oriented color appearance model designed for self-illuminated displays.

accessibility apca cieluv color color-contrast color-contrast-checker color-models color-theory colorimetry contrast contrast-calculator css luminance readability srgb vision wcag wcag-contrast web

Last synced: 14 Nov 2024

https://github.com/wpiroboticsprojects/grip

Program for rapidly developing computer vision applications

camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi

Last synced: 15 Dec 2024

https://github.com/WPIRoboticsProjects/GRIP

Program for rapidly developing computer vision applications

camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi

Last synced: 17 Nov 2024

https://github.com/aprilrobotics/apriltag_ros

A ROS wrapper of the AprilTag 3 visual fiducial detector

apriltags fiducial-markers ros vision wrapper

Last synced: 21 Dec 2024

https://github.com/AprilRobotics/apriltag_ros

A ROS wrapper of the AprilTag 3 visual fiducial detector

apriltags fiducial-markers ros vision wrapper

Last synced: 13 Nov 2024

https://github.com/cocoa-ai/FacesVisionDemo

👀 iOS11 demo application for age and gender classification of facial images.

coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision

Last synced: 17 Nov 2024

https://github.com/cocoa-ai/facesvisiondemo

👀 iOS11 demo application for age and gender classification of facial images.

coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision

Last synced: 18 Dec 2024

https://github.com/andyzeng/apc-vision-toolbox

MIT-Princeton Vision Toolbox for the Amazon Picking Challenge 2016 - RGB-D ConvNet-based object segmentation and 6D object pose estimation.

3d amazon-picking-challenge artificial-intelligence computer-vision deep-learning marvin mit-princeton rgbd ros segmentation vision

Last synced: 20 Dec 2024

https://github.com/Feghal/ImageDetect

✂️ Detect and crop faces, barcodes, and text in an image with the iOS 11 Vision API.

barcode detector face face-detection face-recognition ios ios11 recognition swift vision vision-api

Last synced: 09 Dec 2024

https://github.com/andyzeng/arc-robot-vision

MIT-Princeton Vision Toolbox for Robotic Pick-and-Place at the Amazon Robotics Challenge 2017 - Robotic Grasping and One-shot Recognition of Novel Objects with Deep Learning.

3d amazon-robotics-challenge artificial-intelligence computer-vision deep-learning grasping manipulation mit-princeton rgbd vision

Last synced: 20 Dec 2024

https://github.com/cheind/dest

🐼 One Millisecond Deformable Shape Tracking Library (DEST)

face-alignment face-detector machine-learning vision

Last synced: 17 Dec 2024

https://github.com/photonvision/photonvision

PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.

computer-vision frc java opencv vision vision-processing wpilib

Last synced: 21 Dec 2024

https://github.com/ori-mrg/robotcar-dataset-sdk

Software Development Kit for the Oxford Robotcar Dataset

datasets learning matlab python robotics vision website

Last synced: 16 Dec 2024

https://github.com/gabeur/mmt

Multi-Modal Transformer for Video Retrieval

fusion language multimodal nlp video vision

Last synced: 18 Nov 2024

https://github.com/fcakyon/craft-text-detector

Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow

Last synced: 03 Nov 2024

https://github.com/rishikksh20/FNet-pytorch

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

feedforward-neural-network fnet fourier-transform image-classification language-model text text-classification transformer vision

Last synced: 15 Nov 2024

https://github.com/georgegach/flowiz

Converts Optical Flow files to images and optionally compiles them to a video. Flow viewer GUI is also available. Check out mockup right from Github Pages:

converter flo flow image middlebury optical python video vision visualisation visualization

Last synced: 05 Nov 2024

https://github.com/Olney1/ChatGPT-OpenAI-Smart-Speaker

This AI Smart Speaker uses speech recognition and text-to-speech to enable voice-driven conversations and vision capabilities with OpenAI and Agents. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.

agents ai artificial-intelligence chatgpt gpt-4 langchain langsmith openai smarthome smartspeaker speech-recognition speech-to-text text-to-speech vision

Last synced: 06 Nov 2024

https://github.com/zsajjad/react-native-text-detector

Text Detector from image for react native using firebase MLKit on android and Tesseract on iOS

core-ml firebase-mlkit react-native tesseract-ios tesseract-ocr text-detection vision

Last synced: 25 Nov 2024

https://github.com/aangelopoulos/conformal_classification

Wrapper for a PyTorch classifier which allows it to output prediction sets. The sets are theoretically guaranteed to contain the true class with high probability (via conformal prediction).

artificial-intelligence classification classifier computer-vision conformal conformal-prediction deep-neural-networks distribution-free imagenet machine-learning neural-networks nonparametric nonparametric-statistics prediction-sets pytorch statistics uncertainty uncertainty-quantification vision

Last synced: 05 Nov 2024

https://github.com/lucidrains/halonet-pytorch

Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

artificial-intelligence attention-mechanism deep-learning vision

Last synced: 17 Dec 2024

https://github.com/lucidrains/res-mlp-pytorch

Implementation of ResMLP, an all MLP solution to image classification, in Pytorch

artificial-intelligence deep-learning vision

Last synced: 16 Dec 2024

https://github.com/google-research/nested-transformer

Nested Hierarchical Transformer https://arxiv.org/pdf/2105.12723.pdf

imagenet transformer vision

Last synced: 21 Dec 2024

https://tiger-ai-lab.github.io/Mantis/

Official code for Paper "Mantis: Multi-Image Instruction Tuning"

fuyu language llava-llama3 lmm mantis mllm multi-image-understanding multimodal video vision vlm

Last synced: 07 Nov 2024

https://github.com/daddydrac/-deprecated-NVIDIA-GPU-Tensor-Core-Accelerator-PyTorch-OpenCV

Computer vision container that includes Jupyter notebooks with built-in code hinting, Anaconda, CUDA 11.8, TensorRT inference accelerator for Tensor cores, CuPy (GPU drop in replacement for Numpy), PyTorch, PyTorch geometric for Graph Neural Networks, TF2, Tensorboard, and OpenCV for accelerated workloads on NVIDIA Tensor cores and GPUs.

computer-vision cupy deep-learning image-processing machine-learning opencv pytorch tensorboard tensorflow2 tensorrt-inference-accelerator vision

Last synced: 06 Nov 2024

https://github.com/Crowsinc/LiveVisionKit

LiveVisionKit brings the powers of computer vision and image processing to OBS Studio; implementing state of the art filters such as image enhancement and real-time video stabilization.

livestream obs obs-studio opencv stream vision

Last synced: 05 Nov 2024

https://github.com/mattlawer/FaceLandmarksDetection

Finds facial features such as face contour, eyes, mouth and nose in an image.

face face-detection face-landmarking face-landmarks face-tracking framework ios11 landmakring landmark-detection snapchat vision

Last synced: 12 Nov 2024