https://github.com/elbakerino/baistro
- Host: GitHub
- URL: https://github.com/elbakerino/baistro
- Owner: elbakerino
- License: mit
- Created: 2023-10-27T11:17:25.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-23T13:11:12.000Z (3 months ago)
- Last Synced: 2024-08-23T14:28:27.630Z (3 months ago)
- Language: Python
- Size: 376 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# baistro · AI APIs
[![Github actions Build](https://github.com/elbakerino/baistro/actions/workflows/blank.yml/badge.svg)](https://github.com/elbakerino/baistro/actions)
Some APIs around AI models.
- CPU-only Docker setup
- Expects preloaded models, no (annoying) auto-downloads
- Stats about token usage (partial, WIP)

## Tasks & Models

> `stable` = most stable
>
> `experimental` = very experimental / unstable

### Vector Space Representations

> by [Sentence-Transformers](https://www.sbert.net/)

- Text / Sentences `stable`
- Image
- Code `stable`

### Linguistic Analysis

> by [Stanza](https://stanfordnlp.github.io/stanza/pipeline.html) `stable`

- Locale Identification
- Sentence Segmentation
- Token Classification (NER, POS, MWT)
- Sequence Classification (Sentiment)
- Lemmatization

### Document Processing

- Image to Data (by `donut`) `experimental`
- Visual Document Question Answering (Image) (by `donut`)
- *WIP* Document Classification (Image) (by `dit`) `experimental`
    - (dataset) RVL-CDIP: `"letter", "form", "email", "handwritten", "advertisement", "scientific report", "scientific publication", "specification", "file folder", "news article", "budget", "invoice", "presentation", "questionnaire", "resume", "memo"`

### NLI / QA / QAG / QG

> general Natural Language Inference

- Question Answering
- Question Answer Generation `experimental`
- Question Generation `experimental`
- Question Natural Language Inference / QNLI `stable`

### Task Implementations

- Semantic Search `stable`
- *WIP* Sentence Clustering
- *todo* Topic Clustering (by [BERTopic](https://maartengr.github.io/BERTopic/))

## Usage
Start the server:
```shell
docker compose up
```

- Service Home: [localhost:8702](http://localhost:8702)
- OpenAPI Docs: [localhost:8702/docs](http://localhost:8702/docs) (WIP)

Run CLI in docker container:
```shell
# build container before using cli (if never `up`ed before)
docker compose build baistro

# open shell in container:
docker compose run --rm baistro bash

# run cli help:
poetry run cli

# download models:
poetry run cli download

# download model `stanza-multilingual` directly:
poetry run cli download stanza-multilingual

# list models:
poetry run cli models
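
# Sketch, not from the original README: a host-side wrapper that passes a CLI
# command straight to `docker compose run`, so no interactive shell is needed.
# `baistro_cli` is a hypothetical helper name. It assumes the image's default
# working directory and entrypoint accept `poetry run cli ...` as the command,
# the same way they accept `bash` above.
baistro_cli() {
    docker compose run --rm baistro poetry run cli "$@"
}
# usage on the host (not inside the container), e.g.:
#   baistro_cli models
#   baistro_cli download stanza-multilingual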
```

## DEV Notes
Manage dependencies with poetry:
```shell
poetry lock --no-update
poetry install --sync
# poetry lock --no-update && poetry install --sync
```

## License
This project is distributed as **free software** under the **MIT License**, see [License](https://github.com/elbakerino/baistro/blob/main/LICENSE).
© 2024 Michael Becker https://i-am-digital.eu