Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with vision

A curated list of projects in awesome lists tagged with vision.

https://github.com/bvlc/caffe

Caffe: a fast open framework for deep learning.

deep-learning machine-learning vision

Last synced: 29 Sep 2024

https://github.com/BVLC/caffe

Caffe: a fast open framework for deep learning.

deep-learning machine-learning vision

Last synced: 30 Jul 2024

https://github.com/paddlepaddle/paddlehub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving) [Security hardening in progress; interaction is paused, please be patient]

awesome deep-learning model nlp text2image vision

Last synced: 29 Sep 2024

https://github.com/PaddlePaddle/PaddleHub

Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)

awesome deep-learning model nlp text2image vision

Last synced: 31 Jul 2024

https://github.com/danny-avila/librechat

Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development

ai anthropic assistant-api azure bing chatgpt chatgpt-clone claude clone dall-e-3 gemini google gpt-4-vision langchain librechat openai plugins search vision webui

Last synced: 02 Oct 2024

https://github.com/danny-avila/LibreChat

Enhanced ChatGPT Clone: Features OpenAI, Assistants API, Azure, Groq, GPT-4 Vision, Mistral, Bing, Anthropic, OpenRouter, Vertex AI, Gemini, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. More features in development

ai anthropic assistant-api azure bing chatgpt chatgpt-clone claude clone dall-e-3 gemini google gpt-4-vision langchain librechat openai plugins search vision webui

Last synced: 31 Jul 2024

https://github.com/Skyvern-AI/skyvern

Automate browser-based workflows with LLMs and Computer Vision

api automation browser browser-automation computer gpt llm playwright python rpa vision workflow

Last synced: 01 Aug 2024

https://github.com/artemnovichkov/iOS-11-by-Examples

👨🏻‍💻 Examples of new iOS 11 APIs

arkit core-nfc coreml ios11 swift vision xcode9

Last synced: 03 Aug 2024

https://github.com/autorope/donkeycar

Open source hardware and software platform to build a small scale self driving car.

cv2 donkeycar jetson-nano keras python raspberry-pi self-driving-car tensorflow vision

Last synced: 29 Sep 2024

https://github.com/sightmachine/SimpleCV

The Open Source Framework for Machine Vision

computer-vision cv image-processing python vision visionprocessing

Last synced: 31 Jul 2024

https://github.com/googlecloudplatform/java-docs-samples

Java and Kotlin Code samples used on cloud.google.com

appengine auth automl cdn java kotlin samples translate video vision

Last synced: 28 Sep 2024

https://github.com/mediar-ai/screenpipe

Library to build personalized AI powered by what you've seen, said, or heard. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.

ai computer-vision llm machine-learning ml multimodal vision

Last synced: 01 Oct 2024

https://github.com/roatienza/deep-learning-experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 01 Oct 2024

https://github.com/roatienza/Deep-Learning-Experiments

Videos, notes and experiments to understand deep learning

artificial-intelligence deep-learning deep-learning-tutorial nlp pytorch speech vision

Last synced: 31 Jul 2024

https://github.com/kevingong2013/chineseidcardocr

[Deprecated] 🇨🇳 Optical recognition of Chinese second-generation ID cards

cnn coreml deep-learning ios11 machine-learning swift vision xcode

Last synced: 30 Sep 2024

https://github.com/KevinGong2013/ChineseIDCardOCR

[Deprecated] 🇨🇳 Optical recognition of Chinese second-generation ID cards

cnn coreml deep-learning ios11 machine-learning swift vision xcode

Last synced: 02 Aug 2024

https://github.com/lucidrains/mlp-mixer-pytorch

An All-MLP solution for Vision, from Google AI

deep-learning vision

Last synced: 03 Oct 2024

https://github.com/aheze/OpenFind

An app to find text in real life.

app camera find hacktoberfest ios ocr photos realm swift swiftui uikit vision

Last synced: 31 Jul 2024

https://github.com/anupamchugh/iowncode

A curated collection of iOS, ML, AR resources sprinkled with some UI additions

alamofire arkit computer-vision coreml coremltools ios keras ml-kit natural-language-processing nlp realitykit swift swiftui vision vision-framework

Last synced: 09 Aug 2024

https://github.com/andyzeng/visual-pushing-grasping

Train robotic agents to learn to plan pushing and grasping actions for manipulation with deep reinforcement learning.

3d artificial-intelligence computer-vision deep-learning deep-reinforcement-learning grasping manipulation pushing robotics vision

Last synced: 02 Aug 2024

https://github.com/Celebrandil/CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)

cuda gpu nvidia sift vision

Last synced: 02 Aug 2024

https://github.com/andyzeng/3dmatch-toolbox

3DMatch - a 3D ConvNet-based local geometric descriptor for aligning 3D meshes and point clouds.

3d 3d-deep-learning 3dmatch artificial-intelligence computer-vision deep-learning geometry-processing point-cloud rgbd vision

Last synced: 31 Jul 2024

https://github.com/AravisProject/aravis

A vision library for genicam based cameras

c camera genicam gige glib gobject gobject-introspection gstreamer gtk3 meson usb3 video vision

Last synced: 01 Aug 2024

https://github.com/louis030195/screen-pipe

Library to build personalized AI powered by what you've seen, said, or heard. Works with Ollama. Alternative to Rewind.ai. Open. Secure. You own your data. Rust.

ai computer-vision llm machine-learning ml multimodal vision

Last synced: 01 Aug 2024

https://github.com/evilgix/evil

Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers.

cnn-model keras machine-learning ocr swift4 vision

Last synced: 28 Sep 2024

https://github.com/evilgix/Evil

Optical Character Recognition in Swift for iOS & macOS. Optical recognition of bank cards, ID cards, and house numbers.

cnn-model keras machine-learning ocr swift4 vision

Last synced: 04 Aug 2024

https://github.com/anki/vector-python-sdk

Anki Vector Python SDK

ai anki robot robotics vector vision

Last synced: 30 Sep 2024

https://github.com/RobotLocomotion/pytorch-dense-correspondence

Code for "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation"

3d artificial-intelligence computer-vision deep-learning manipulation pytorch robotics self-supervised-learning vision

Last synced: 03 Aug 2024

https://github.com/google-research/ravens

Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet. Transporter Nets, CoRL 2020.

artificial-intelligence computer-vision deep-learning imitation-learning manipulation openai-gym pick-and-place pybullet rearrangement reinforcement-learning robotics tensorflow transporter-nets vision

Last synced: 04 Aug 2024

https://github.com/davidbau/rewriting

Rewriting a Deep Generative Model, ECCV 2020 (oral). Interactive tool to directly edit the rules of a GAN to synthesize scenes with objects added, removed, or altered. Change StyleGANv2 to make extravagant eyebrows, or horses wearing hats.

deep-learning gans graphics hci machine-learning research vision

Last synced: 01 Aug 2024

https://github.com/rowanz/neural-motifs

Code for Neural Motifs: Scene Graph Parsing with Global Context (CVPR 2018)

pytorch scene-graph vision visual-genome

Last synced: 01 Aug 2024

https://github.com/mostafasadeghi97/design2code

Convert any web design screenshot to clean HTML/CSS code

ai code-generation coding-assistant design-to-code gpt4 openai vision

Last synced: 01 Aug 2024

https://github.com/KimDarren/FaceCropper

✂️ Crop faces inside your image with the iOS 11 Vision API.

face face-detection face-recognition ios ios11 swift vision vision-api

Last synced: 14 Aug 2024

https://github.com/HRNet/HRFormer

[NeurIPS 2021] This is an official implementation of our paper "HRFormer: High-Resolution Transformer for Dense Prediction".

classification hrnet pose-estimation segmentation transformer vision

Last synced: 03 Aug 2024

https://github.com/rowanz/r2c

Recognition to Cognition Networks (code for the model in "From Recognition to Cognition: Visual Commonsense Reasoning", CVPR 2019)

commonsense reasoning vcr vision visual visual-commonsense-reasoning

Last synced: 01 Aug 2024

https://github.com/zihangJiang/TokenLabeling

Pytorch implementation of "All Tokens Matter: Token Labeling for Training Better Vision Transformers"

imagenet lv-vit pytorch segmentation transformer vision

Last synced: 02 Aug 2024

https://github.com/tomrunia/opticalflow_visualization

Python optical flow visualization following Baker et al. (ICCV 2007) as used by the MPI-Sintel challenge

iccv motion opencv optical-flow python vision visualization

Last synced: 26 Sep 2024

https://github.com/Myndex/SAPC-APCA

APCA (Accessible Perceptual Contrast Algorithm) is a new method for predicting contrast for use in emerging web standards (WCAG 3) for determining readability contrast. APCA is derived from the SAPC (S-LUV Advanced Predictive Color), which is an accessibility-oriented color appearance model designed for self-illuminated displays.

accessibility apca cieluv color color-contrast color-contrast-checker color-models color-theory colorimetry contrast contrast-calculator css luminance readability srgb vision wcag wcag-contrast web

Last synced: 03 Aug 2024

https://github.com/wpiroboticsprojects/grip

Program for rapidly developing computer vision applications

camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi

Last synced: 26 Sep 2024

https://github.com/WPIRoboticsProjects/GRIP

Program for rapidly developing computer vision applications

camera computer-vision first-frc first-robotics-competition firstrobotics opencv robotics vision wpi

Last synced: 03 Aug 2024

https://github.com/AprilRobotics/apriltag_ros

A ROS wrapper of the AprilTag 3 visual fiducial detector

apriltags fiducial-markers ros vision wrapper

Last synced: 02 Aug 2024

https://github.com/cocoa-ai/facesvisiondemo

👀 iOS11 demo application for age and gender classification of facial images.

coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision

Last synced: 28 Sep 2024

https://github.com/cocoa-ai/FacesVisionDemo

👀 iOS11 demo application for age and gender classification of facial images.

coreml coreml-models emotion-recognition facial-recognition gender-classification ios machine-learning swift swift4 vision

Last synced: 03 Aug 2024

https://github.com/Feghal/ImageDetect

✂️ Detect and crop faces, barcodes, and text in an image with the iOS 11 Vision API.

barcode detector face face-detection face-recognition ios ios11 recognition swift vision vision-api

Last synced: 17 Aug 2024

https://github.com/photonvision/photonvision

PhotonVision is the free, fast, and easy-to-use computer vision solution for the FIRST Robotics Competition.

computer-vision frc java opencv vision vision-processing wpilib

Last synced: 29 Sep 2024

https://github.com/gabeur/mmt

Multi-Modal Transformer for Video Retrieval

fusion language multimodal nlp video vision

Last synced: 03 Aug 2024

https://github.com/rishikksh20/FNet-pytorch

Unofficial implementation of Google's FNet: Mixing Tokens with Fourier Transforms

feedforward-neural-network fnet fourier-transform image-classification language-model text text-classification transformer vision

Last synced: 03 Aug 2024

https://github.com/fcakyon/craft-text-detector

Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow

Last synced: 01 Aug 2024

https://github.com/georgegach/flowiz

Converts Optical Flow files to images and optionally compiles them to a video. Flow viewer GUI is also available. Check out mockup right from GitHub Pages:

converter flo flow image middlebury optical python video vision visualisation visualization

Last synced: 01 Aug 2024

https://github.com/zsajjad/react-native-text-detector

Text detector from images for React Native, using Firebase ML Kit on Android and Tesseract on iOS

core-ml firebase-mlkit react-native tesseract-ios tesseract-ocr text-detection vision

Last synced: 06 Aug 2024

https://github.com/aangelopoulos/conformal_classification

Wrapper for a PyTorch classifier which allows it to output prediction sets. The sets are theoretically guaranteed to contain the true class with high probability (via conformal prediction).

artificial-intelligence classification classifier computer-vision conformal conformal-prediction deep-neural-networks distribution-free imagenet machine-learning neural-networks nonparametric nonparametric-statistics prediction-sets pytorch statistics uncertainty uncertainty-quantification vision

Last synced: 01 Aug 2024

https://github.com/lucidrains/halonet-pytorch

Implementation of the 😇 Attention layer from the paper, Scaling Local Self-Attention For Parameter Efficient Visual Backbones

artificial-intelligence attention-mechanism deep-learning vision

Last synced: 03 Oct 2024

https://github.com/lucidrains/res-mlp-pytorch

Implementation of ResMLP, an all MLP solution to image classification, in Pytorch

artificial-intelligence deep-learning vision

Last synced: 03 Oct 2024

https://github.com/Olney1/ChatGPT-OpenAI-Smart-Speaker

This AI Smart Speaker uses speech recognition and text-to-speech to enable voice-driven conversations and vision capabilities with OpenAI and Agents. The user speaks a prompt into the microphone, and the program sends the prompt to OpenAI to generate a response. The response is then converted to an audio file and played back to the user.

agents ai artificial-intelligence chatgpt gpt-4 langchain langsmith openai smarthome smartspeaker speech-recognition speech-to-text text-to-speech vision

Last synced: 01 Aug 2024

https://github.com/Crowsinc/LiveVisionKit

LiveVisionKit brings the powers of computer vision and image processing to OBS Studio; implementing state of the art filters such as image enhancement and real-time video stabilization.

livestream obs obs-studio opencv stream vision

Last synced: 01 Aug 2024

https://github.com/mattlawer/FaceLandmarksDetection

Finds facial features such as face contour, eyes, mouth and nose in an image.

face face-detection face-landmarking face-landmarks face-tracking framework ios11 landmakring landmark-detection snapchat vision

Last synced: 02 Aug 2024

https://github.com/mackysoft/Vision

UnityEngine.CullingGroup API for everyone.

csharp culling distance fast performance unity visibility vision

Last synced: 03 Aug 2024

https://tiger-ai-lab.github.io/Mantis/

Official code for Paper "Mantis: Multi-Image Instruction Tuning"

fuyu language llava-llama3 lmm mantis mllm multi-image-understanding multimodal video vision vlm

Last synced: 01 Aug 2024

https://github.com/BiomedSciAI/fuse-med-ml

A python framework accelerating ML based discovery in the medical field by encouraging code reuse. Batteries included :)

ai cmmd collaboration ct deep-learning fuse fuse-med-ml fusemedml hacktoberfest healthcare isic knight-challenge machine-learning medical medical-imaging multimodality python pytorch stoic vision

Last synced: 03 Aug 2024

https://github.com/IBM-Cloud/openwhisk-darkvisionapp

Discover dark data in videos with IBM Watson and IBM Cloud Functions

audio cloudant ibm-bluemix ibm-cloud-solutions openwhisk video vision watson-speech watson-visual-recognition

Last synced: 04 Aug 2024

https://github.com/eliranwong/toolmate

FreeGenius AI, an advanced AI assistant that can talk and take multi-step actions. Supports numerous open-source LLMs via Llama.cpp or Ollama or Groq Cloud API, with optional integration with AutoGen agents, OpenAI API, Google Gemini Pro and unlimited plugins.

agent ai autogen chatgpt fabric gemini google groq llama3 llamacpp ollama openai plugin stable-diffusion tool vision

Last synced: 01 Oct 2024

https://github.com/cocoa-ai/FlowersVisionDemo

🌸 iOS11 demo application for flower classification.

coreml flower-classification ios machine-learning swift swift4 vision

Last synced: 09 Aug 2024

https://github.com/paganpasta/eqxvision

A Python package of computer vision models for the Equinox ecosystem.

equinox python pytorch vision

Last synced: 01 Aug 2024

https://github.com/cameleon-rs/cameleon

A safe, fast, and flexible library for GenICam compatible cameras

camera genapi genicam gige rust usb3 uvc vision

Last synced: 24 Sep 2024

https://github.com/xiaohk/FaceData

A macOS app to parse face landmarks from a video for training GANs

annotator gans macos swift vision

Last synced: 01 Aug 2024

https://github.com/RituYadav92/Radar-RGB-Attentive-Multimodal-Object-Detection

Object Detection on Radar sensor and RGB camera images. https://ieeexplore.ieee.org/document/9191046

attention-model autonomous-vehicles multifusion-technique object-detection radar rgbd-image-processing sensor-fusion vision

Last synced: 31 Jul 2024

https://github.com/e-roy/gemini-pro-vision-playground

A simple playground Web UI for using the Gemini Pro Vision and Gemini Pro AI models with Next.js

gemini gemini-ai gemini-api gemini-pro gemini-pro-vision nextjs vision

Last synced: 07 Aug 2024

https://github.com/jessielw/HDR-Multi-Tool

A graphical user interface for parsing HDR10+ and Dolby Vision

dolby dolbyvision electron extract gui hdr10 hdr10plus json modern parser queue rpu tool vision windows

Last synced: 04 Aug 2024

https://github.com/IvLabs/autonomous-delivery-robot

Repository for Autonomous Delivery Robot project of IvLabs, VNIT

arduino autonomous-driving autonomous-vehicles controls hacktoberfest planning ros segmentation vision

Last synced: 31 Jul 2024

https://github.com/kabouzeid/point2vec

Self-Supervised Representation Learning on Point Clouds (GCPR 2023 | T4V Workshop @ CVPR 2023)

lightning machine-learning point-cloud pytorch self-supervised-learning transformer vision vision-transformer

Last synced: 31 Jul 2024

https://github.com/cocoa-ai/InceptionVisionDemo

🎥 iOS11 demo application for dominant objects detection.

coreml inceptionv3 ios machine-learning object-classification object-detection swift swift4 vision

Last synced: 09 Aug 2024

https://github.com/cocoa-ai/StylesVisionDemo

🖼 iOS11 demo application for image style classification.

classification coreml ios machine-learning swift swift4 vision

Last synced: 09 Aug 2024

https://github.com/alladinian/Visionaire

Streamlined, ergonomic APIs around Apple's Vision framework

computer-vision ios macos swift vision

Last synced: 09 Aug 2024

https://github.com/opensight-cv/opensight

Easy-to-use, powerful, and free vision suite.

first-robotics-competition frc frc-vision vision visionprocessing

Last synced: 26 Sep 2024

https://github.com/root-systems/handbook

We're a small high-trust livelihood pod doing tech consulting within Enspiral.

consulting cooperative ecosystem handbook life open-source vision workers

Last synced: 01 Aug 2024

https://github.com/cocoa-ai/SentimentVisionDemo

🌅 iOS11 demo application for visual sentiment prediction.

coreml coreml-models ios machine-learning sentiment-analysis swift swift4 vision

Last synced: 03 Aug 2024