https://github.com/alexcg1/example-clip-as-service

Run CLIP as service frontend in your browser

Host: GitHub
URL: https://github.com/alexcg1/example-clip-as-service
Owner: alexcg1
License: apache-2.0
Created: 2022-03-24T16:30:41.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-04-01T13:52:01.000Z (over 3 years ago)
Last Synced: 2025-02-12T10:18:21.403Z (5 months ago)
Language: Python
Size: 20.5 KB
Stars: 1
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


## Initial setup

1. Set up a virtualenv
2. `pip install -r requirements.txt`

## Run CLIP-as-service client/frontend

1. `python run_once.py` to download the initial datasets and store them in SQLite so the frontend loads faster
2. In a new terminal, `streamlit run frontend.py`
3. Set server in the sidebar
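
The "download once, cache in SQLite" idea from step 1 can be sketched with the standard library alone. This is an illustration, not the contents of `run_once.py` (which may well use DocArray's own SQLite storage backend instead): the `docs` table, its columns, and the `save_docs`/`load_docs` helpers are hypothetical names, and the sample records stand in for whatever `DocumentArray.pull()` returns.

```python
import json
import sqlite3


def save_docs(conn, dataset, docs):
    """Store records for one dataset, replacing any previously cached copy."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs (dataset TEXT, content TEXT, embedding TEXT)"
    )
    conn.execute("DELETE FROM docs WHERE dataset = ?", (dataset,))
    conn.executemany(
        "INSERT INTO docs VALUES (?, ?, ?)",
        # Embeddings are serialized as JSON text (an assumption made for simplicity)
        [(dataset, d["content"], json.dumps(d["embedding"])) for d in docs],
    )
    conn.commit()


def load_docs(conn, dataset):
    """Read one dataset back from the cache in insertion order."""
    rows = conn.execute(
        "SELECT content, embedding FROM docs WHERE dataset = ? ORDER BY rowid",
        (dataset,),
    ).fetchall()
    return [{"content": c, "embedding": json.loads(e)} for c, e in rows]


# In-memory database for the demo; a real cache would use a file path instead
conn = sqlite3.connect(":memory:")
sample = [
    {"content": "first sentence", "embedding": [0.1, 0.2]},
    {"content": "second sentence", "embedding": [0.3, 0.4]},
]
save_docs(conn, "ttl-textual", sample)
cached = load_docs(conn, "ttl-textual")
```

With a cache like this, the frontend only has to read from the local database on startup rather than downloading the datasets again.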

## Use your own datasets

Currently we pull two datasets **with pre-existing embeddings** from Jina Cloud using `DocumentArray.pull()`:

- `ttl-embedding`: Images and embeddings from [Totally Looks Like Dataset](https://sites.google.com/view/totally-looks-like-dataset)
- `ttl-textual`: Sentences and embeddings from Pride and Prejudice

Why do we use pre-computed embeddings? Because our server doesn't have a GPU and can't handle all that heavy lifting (though a more powerful server could!). Our server only creates embeddings for the **user inputs** (either a line of text or an image).
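
To make that split concrete, here is a minimal sketch of matching one freshly computed query embedding against a dataset whose embeddings already exist. It rests on assumptions: the README does not say which similarity metric the frontend uses (cosine similarity is assumed here), the three-dimensional vectors are toy values rather than real CLIP output, and `cosine_similarity` and `top_matches` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_matches(query_embedding, dataset, limit=3):
    """Return the labels of the `limit` entries most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), label)
        for label, embedding in dataset
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:limit]]


# Pre-computed side: (label, embedding) pairs that ship with the dataset
dataset = [
    ("sentence A", [1.0, 0.0, 0.0]),
    ("sentence B", [0.0, 1.0, 0.0]),
    ("sentence C", [0.7, 0.7, 0.0]),
]

# User-input side: the only embedding the server would compute at request time
query_embedding = [0.9, 0.1, 0.0]
best = top_matches(query_embedding, dataset, limit=2)
```

The expensive part (embedding every item in the dataset) happens once, ahead of time; each request then costs one embedding plus a comparison pass.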

To use your own dataset:

### On your local machine

1. Create a file to load your content into a `DocumentArray`
2. Create CLIP embeddings for it via a simple Jina Flow (all it needs is `jinahub://CLIPEncoder` Executor)
3. Push your `DocumentArray` to Jina Cloud with [`DocumentArray.push('your_dataset_name', show_progress=True)`](https://docarray.jina.ai/fundamentals/documentarray/serialization/?highlight=push%20pull#from-to-cloud)
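
The three steps above might look roughly like the sketch below. Treat it as an outline under stated assumptions: it targets the DocArray and Jina releases from around the time this repo was written, so exact call behavior may differ in other versions; `images/*.jpg` and `your_dataset_name` are placeholders; and `load_images`, `embed_with_clip`, and `publish` are hypothetical helper names. The imports sit inside the functions so the file can be read and loaded without either library installed.

```python
def load_images(pattern):
    """Step 1: load files matching a glob pattern into a DocumentArray."""
    from docarray import DocumentArray

    return DocumentArray.from_files(pattern)


def embed_with_clip(docs):
    """Step 2: send the documents through a Flow holding only the CLIP Executor."""
    from jina import Flow

    # Executor URI taken from the step above; it is fetched from Jina Hub on first use
    flow = Flow().add(uses="jinahub://CLIPEncoder")
    with flow:
        # Assumes a Jina version where post() returns the processed documents
        return flow.post("/", inputs=docs)


def publish(docs, name):
    """Step 3: push the embedded documents to Jina Cloud under `name`."""
    docs.push(name, show_progress=True)


if __name__ == "__main__":
    docs = load_images("images/*.jpg")  # placeholder path
    docs = embed_with_clip(docs)
    publish(docs, "your_dataset_name")  # placeholder dataset name
```

Whatever name you pass to `publish` is the one you will later put into `frontend.py`.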

### In `frontend.py`

1. Go to line 13 or 16 (look for `ttl_textual` or `ttl_embedding`)
2. Change `ttl_whatever` to the dataset name you passed to `DocumentArray.push()` above
3. `rm -rf clip-as-service.db`
4. `python run_once.py`
