https://github.com/buddylim/iuys
Intelligently Understanding Your Screenshots
genai lancedb mlx pyee pyhton vlm watchdog
- Host: GitHub
- URL: https://github.com/buddylim/iuys
- Owner: BuddyLim
- Created: 2024-07-18T09:11:23.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-22T07:09:01.000Z (about 1 year ago)
- Last Synced: 2025-03-01T04:41:35.292Z (8 months ago)
- Topics: genai, lancedb, mlx, pyee, pyhton, vlm, watchdog
- Language: Python
- Homepage:
- Size: 77.7 MB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# IUYS (Intelligently Understanding Your Screenshots)
## About
Inspired by a demonstration given by [Sam Witteveen](https://github.com/samwit) at a Machine Learning Singapore group meetup.
This is a more "software engineering" take on the idea (if you'll allow me), and also a way to improve my skills in application development and GenAI-related matters.
### Note: This project is developed on an Apple Silicon chip!
## Description
IUYS is a tool that understands your images or screenshots so that you can query them and find the relevant results, à la "Google Search" style.
## Tools Used
Note: lancedb is used here as an embedded database; once the tool shuts down, it loses all context. Context is retained by creating a dump file and loading it back when the tool initializes again.
- pyee (Event broker)
- Watchdog (File watcher)
- lancedb (Vector store)
- mlx-vlm (Visual language model framework)
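
The dump-and-reload idea from the note above can be sketched as follows. This is a minimal illustration using only the standard library, not the project's actual code and not the lancedb API: the JSON format, the record fields (`path`, `description`, `vector`), and the function names `dump_records` and `load_records` are all assumptions made for the example.

```python
import json
from pathlib import Path


def dump_records(records, dump_path):
    """Write all records to a JSON dump file (hypothetical format)."""
    Path(dump_path).write_text(json.dumps(records), encoding="utf-8")


def load_records(dump_path):
    """Load records from the dump file; return an empty list if it does not exist."""
    path = Path(dump_path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))
```

On startup the tool would call something like `load_records` and re-insert the records into the vector store; on shutdown it would call `dump_records` with the current contents.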
## Flows
### Creation Flow

## To Do List
### General
- Exception handling
- Convert to CLI based tool
- Allow use by other programs as an external sidecar
- Testing
- Change the key-value store used for saving
### File watcher
- Receive file creation events and emit to Queue worker
- Filter file events to images only
- Identify files by their checksums to decide whether to perform VLM ops
- Exception handling
- Testing
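
The image filter and checksum check listed above could look roughly like this. It is a sketch under stated assumptions, not the project's implementation: the set of suffixes, the choice of SHA-256, and the helper names (`is_image`, `file_checksum`, `should_process`) are hypothetical, and the Watchdog event handling itself is left out.

```python
import hashlib
from pathlib import Path

# Assumed set of image suffixes; the project may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp", ".webp"}


def is_image(path):
    """Return True if the file suffix looks like an image."""
    return Path(path).suffix.lower() in IMAGE_SUFFIXES


def file_checksum(path, chunk_size=65536):
    """Compute a SHA-256 hex digest of the file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_process(path, seen_checksums):
    """Return True only for image files whose content has not been seen before."""
    if not is_image(path):
        return False
    checksum = file_checksum(path)
    if checksum in seen_checksums:
        return False
    seen_checksums.add(checksum)
    return True
```

Keying on the checksum rather than the file name means a renamed or re-saved copy of the same screenshot would not trigger a second round of VLM inference.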
### Queue Worker
- Receive file creation events from File Watcher
- Filter any unrelated events
- Add events to a queue as tasks
- Optimization?
- Exception handling
- Testing
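
A queue worker along the lines described above can be sketched with the standard library. The event shape (a dict with `type` and `path` keys), the `"created"` event name, and the function names are assumptions for illustration; the project itself uses pyee for event delivery, which is not shown here.

```python
import queue
import threading

STOP = object()  # Sentinel that tells the worker to shut down.


def run_worker(task_queue, handle_task):
    """Pull events off the queue and hand file creation events to handle_task."""
    while True:
        event = task_queue.get()
        if event is STOP:
            task_queue.task_done()
            break
        try:
            # Drop anything that is not a file creation event.
            if event.get("type") == "created":
                handle_task(event["path"])
        finally:
            task_queue.task_done()


def process_events(events, handle_task):
    """Feed events to a single background worker and wait until it finishes."""
    task_queue = queue.Queue()
    worker = threading.Thread(target=run_worker, args=(task_queue, handle_task))
    worker.start()
    for event in events:
        task_queue.put(event)
    task_queue.put(STOP)
    worker.join()
```

A single worker keeps tasks in arrival order, which also avoids running several model inferences at once.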
### OCU
- Receive new tasks from Queue worker and perform inference
- Allow changing of "list-of-allowed" models via CLI arguments
- Testing
- Optimization
- Currently, the models loaded on my M3 Pro (36 GB RAM) consume 25 GB!! <- (YIKES)
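
Changing the list of allowed models from the command line could be done with `argparse`, as in the sketch below. The flag names (`--allowed-models`, `--model`) and the model identifiers are hypothetical placeholders, not options or models the project is known to use.

```python
import argparse

# Hypothetical placeholder names, not real model identifiers.
DEFAULT_ALLOWED_MODELS = ["example/vlm-small", "example/vlm-large"]


def parse_args(argv=None):
    """Parse the allow-list and the selected model, rejecting models not on the list."""
    parser = argparse.ArgumentParser(description="Sketch of an allow-list CLI")
    parser.add_argument(
        "--allowed-models",
        nargs="+",
        default=DEFAULT_ALLOWED_MODELS,
        help="models that may be loaded for inference",
    )
    parser.add_argument(
        "--model",
        default=None,
        help="model to load; defaults to the first allowed model",
    )
    args = parser.parse_args(argv)
    if args.model is None:
        args.model = args.allowed_models[0]
    if args.model not in args.allowed_models:
        parser.error(f"model {args.model!r} is not in the allowed list")
    return args
```

Restricting the list to a single smaller model is one possible way to approach the memory usage noted above, since only the selected model would need to be loaded.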
### Vector store
- Receive OCU inferences, convert them into embeddings, and store them in the vector store
- Retrieval pipeline
- Testing
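
The retrieval step can be illustrated with a brute-force cosine similarity search over stored vectors. This is a toy stand-in, not lancedb's API and not the project's pipeline: the record fields (`path`, `vector`) and the `search` function are assumptions, and in practice the query vector would come from the same embedding model used for the stored inferences.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, records, top_k=3):
    """Return the top_k (score, path) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_vector, record["vector"]), record["path"])
        for record in records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

A vector store replaces this linear scan with an index, but the ranking idea is the same: embed the query, compare it against the stored embeddings, and return the closest screenshots.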