https://github.com/buddylim/iuys

Intelligently Understanding Your Screenshots
genai lancedb mlx pyee pyhton vlm watchdog

Host: GitHub
URL: https://github.com/buddylim/iuys
Owner: BuddyLim
Created: 2024-07-18T09:11:23.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-08-22T07:09:01.000Z (about 1 year ago)
Last Synced: 2025-03-01T04:41:35.292Z (8 months ago)
Topics: genai, lancedb, mlx, pyee, pyhton, vlm, watchdog
Language: Python
Homepage:
Size: 77.7 MB
Stars: 3
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# IUYS (Intelligently Understanding Your Screenshots)

## About

Inspired by a demonstration that [Sam Witteveen](https://github.com/samwit) gave at a Machine Learning Singapore group meetup.
This is a more "software engineering" take on the idea (if you'll allow me), and also a way to improve my skills in application development and GenAI.

### Note: This project is developed on an Apple Silicon chip!

## Description

IUYS is a tool that understands your images or screenshots so that you can query them and find the relevant results, à la "Google Search".

## Tools Used

Note: lancedb is used here as an embedded database, so it loses all context once the tool shuts down. We retain context by creating a dump file and loading it back when the tool initializes again.

- pyee (Event broker)
- Watchdog (File watcher)
- lancedb (Vector store)
- mlx-vlm (Visual language model framework)
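
The dump-and-reload idea from the note above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: the JSON format, the file name `iuys_dump.json`, the record fields, and the helper names `dump_records` and `load_records` are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def dump_records(records: list[dict], dump_path: Path) -> None:
    """Write all records to a JSON dump file (e.g. before shutdown)."""
    dump_path.write_text(json.dumps(records), encoding="utf-8")


def load_records(dump_path: Path) -> list[dict]:
    """Load records from the dump file, or start empty if none exists."""
    if not dump_path.exists():
        return []
    return json.loads(dump_path.read_text(encoding="utf-8"))


# Round trip through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    dump_file = Path(tmp) / "iuys_dump.json"  # hypothetical file name
    records = [{"path": "shot1.png", "description": "a login form"}]
    dump_records(records, dump_file)
    restored = load_records(dump_file)
    missing = load_records(Path(tmp) / "does_not_exist.json")
```

On startup the restored records would then be written back into the vector store; that step is left out here because it depends on how the store is set up.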

## Flows

### Creation Flow

![Creation Flow](./imgs/creation_flow.png)

## To Do List

### General

- Exception handling
- Convert to CLI based tool
- Allow use by other programs as an external sidecar
- Testing
- Change the key-value store used for saving

### File watcher

- Receive file creation events and emit them to the Queue worker
- Filter file events to images only
- Identify files by their checksums to decide whether to perform VLM ops
- Exception handling
- Testing
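
As a rough illustration of the image filter and checksum check listed above, here is a minimal standard-library sketch. It is not the project's implementation: the suffix set, the choice of SHA-256, and the helper names `is_image`, `file_checksum`, and `should_process` are assumptions for the example.

```python
import hashlib
import tempfile
from pathlib import Path

# Assumed set of image suffixes; the project may accept a different set.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def is_image(path: Path) -> bool:
    """Return True if the file suffix looks like an image."""
    return path.suffix.lower() in IMAGE_SUFFIXES


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of the file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_process(path: Path, seen: set[str]) -> bool:
    """Accept only images whose content has not been seen before."""
    if not is_image(path):
        return False
    checksum = file_checksum(path)
    if checksum in seen:
        return False
    seen.add(checksum)
    return True


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "first.png"
    duplicate = Path(tmp) / "duplicate.PNG"
    note = Path(tmp) / "notes.txt"
    first.write_bytes(b"fake image bytes")
    duplicate.write_bytes(b"fake image bytes")  # same content as first
    note.write_text("not an image")

    seen_checksums: set[str] = set()
    decisions = [should_process(p, seen_checksums) for p in (first, duplicate, note)]
```

Keying on content rather than file name means a renamed or re-saved copy of the same screenshot would not trigger a second round of VLM inference.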

### Queue Worker

- Receive file creation events from File Watcher
- Filter any unrelated events
- Push events onto a queue as tasks
- Optimization?
- Exception handling
- Testing
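
The queue worker steps above could look something like the sketch below. It uses the standard library's `queue` and `threading` rather than pyee, and the event dictionaries, the `"created"` type string, and the names `handle_event` and `worker` are assumptions for illustration only.

```python
import queue
import threading

task_queue = queue.Queue()
processed = []  # stands in for "tasks handed to inference"


def handle_event(event: dict) -> None:
    """Enqueue only file-creation events; ignore everything else."""
    if event.get("type") == "created":
        task_queue.put(event)


def worker() -> None:
    """Pull tasks off the queue until the None sentinel arrives."""
    while True:
        task = task_queue.get()
        if task is None:
            task_queue.task_done()
            break
        processed.append(task["path"])  # placeholder for real work
        task_queue.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

events = [
    {"type": "created", "path": "a.png"},
    {"type": "modified", "path": "a.png"},  # unrelated, dropped
    {"type": "created", "path": "b.png"},
]
for event in events:
    handle_event(event)

task_queue.put(None)  # sentinel: tell the worker to stop
task_queue.join()
thread.join()
```

A single worker thread keeps tasks in arrival order, which is the simplest arrangement when each task ends in a heavy model call.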

### OCU

- Receive new tasks from Queue worker and perform inference

- Allow changing of "list-of-allowed" models via CLI arguments
- Testing
- Optimization
- Currently the models loaded on my M3 Pro (36 GB RAM) consume 25 GB!! <- (YIKES)

### Vector store

- Receive OCU inferences, convert them into embeddings, and store them in the vector store
- Retrieval pipeline
- Testing
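
To make the retrieval idea concrete, here is a toy sketch of nearest-neighbour search over stored embeddings. It does not use lancedb and is not the project's pipeline: the in-memory list, the hand-written three-dimensional vectors, and the names `cosine_similarity` and `search` are assumptions chosen so the example runs without any model.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(store: list[dict], query_vector: list[float], k: int = 1) -> list[str]:
    """Return the texts of the k rows most similar to the query vector."""
    ranked = sorted(
        store,
        key=lambda row: cosine_similarity(row["vector"], query_vector),
        reverse=True,
    )
    return [row["text"] for row in ranked[:k]]


# Hand-written vectors stand in for real embeddings of VLM descriptions.
store = [
    {"text": "terminal with a stack trace", "vector": [1.0, 0.0, 0.0]},
    {"text": "chat about lunch plans", "vector": [0.0, 1.0, 0.0]},
]
results = search(store, [0.9, 0.1, 0.0], k=1)
```

In a real setup the query text would be embedded with the same model as the stored descriptions, and the vector store would handle the ranking.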
