Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/michaelfeil/infinity
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
- Host: GitHub
- URL: https://github.com/michaelfeil/infinity
- Owner: michaelfeil
- License: mit
- Created: 2023-10-11T17:53:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-01T21:50:35.000Z (11 days ago)
- Last Synced: 2025-01-02T22:07:22.221Z (10 days ago)
- Topics: bert-embeddings, llm, text-embeddings
- Language: Python
- Homepage: https://michaelfeil.github.io/infinity/
- Size: 11.9 MB
- Stars: 1,651
- Watchers: 19
- Forks: 119
- Open Issues: 52
Metadata Files:
- Readme: README.md
- License: LICENSE
- Citation: CITATION.cff
Awesome Lists containing this project
- Awesome-LLM-Productization - infinity - a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks (Python based, MIT); (Models and Tools / Embeddings)
- awesome-ccamel - michaelfeil/infinity - Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali (Python)
- awesome-llmops - Infinity - embeddings (Serving / Large Model Serving)
- awesome-production-machine-learning - Infinity - Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip. (Deployment and Serving)
- Awesome-LLM - Infinity - Inference for text-embeddings in Python (LLM Deployment)
README
[![Contributors][contributors-shield]][contributors-url]
[![Forks][forks-shield]][forks-url]
[![Stargazers][stars-shield]][stars-url]
[![Issues][issues-shield]][issues-url]
[![MIT License][license-shield]][license-url]

# Infinity ♾️
[![codecov][codecov-shield]][codecov-url]
[![ci][ci-shield]][ci-url]
[![Downloads][pepa-shield]][pepa-url]
[![DOI](https://zenodo.org/badge/703686617.svg)](https://zenodo.org/doi/10.5281/zenodo.11406462)
![Docker pulls](https://img.shields.io/docker/pulls/michaelf34/infinity)

Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models, clip, clap and colpali. Infinity is developed under [MIT License](https://github.com/michaelfeil/infinity/blob/main/LICENSE).
## Why Infinity
* **Deploy any model from HuggingFace**: deploy any embedding, reranking, clip and sentence-transformer model from [HuggingFace](https://huggingface.co/models?other=text-embeddings-inference&sort=trending)
* **Fast inference backends**: The inference server is built on top of [PyTorch](https://github.com/pytorch/pytorch), [optimum (ONNX/TensorRT)](https://huggingface.co/docs/optimum/index) and [CTranslate2](https://github.com/OpenNMT/CTranslate2), using FlashAttention to get the most out of your **NVIDIA CUDA**, **AMD ROCM**, **CPU**, **AWS INF2** or **APPLE MPS** accelerator. Infinity uses dynamic batching and runs tokenization in dedicated worker threads.
* **Multi-modal and multi-model**: Mix-and-match multiple models. Infinity orchestrates them.
* **Tested implementation**: Unit and end-to-end tested, so embeddings served via Infinity are computed correctly. Lets API users create embeddings till infinity and beyond.
* **Easy to use**: Built on [FastAPI](https://fastapi.tiangolo.com/). Infinity CLI v2 lets you set every option via environment variable or command-line argument. OpenAPI aligned to [OpenAI's API specs](https://platform.openai.com/docs/guides/embeddings/what-are-embeddings). View the docs at [https://michaelfeil.github.io/infinity](https://michaelfeil.github.io/infinity/) on how to get started.

### Latest News 🔥
- [2024/11] AMD, CPU, ONNX docker images
- [2024/10] `pip install infinity_client`
- [2024/07] Inference deployment example via [Modal](./infra/modal/README.md) and a [free GPU deployment](https://infinity.modal.michaelfeil.eu/)
- [2024/06] Support for multi-modal: clip, text-classification & launch all arguments from env variables
- [2024/05] launch multiple models using the `v2` cli, including `--api-key`
- [2024/03] Experimental support for int8 (cpu/cuda) and fp8 (H100/MI300)
- [2024/03] Docs are online: https://michaelfeil.github.io/infinity/latest/
- [2024/02] Community meetup at the [Run:AI Infra Club](https://discord.gg/7D4fbEgWjv)
- [2024/01] TensorRT / ONNX inference
- [2023/10] Initial release

## Getting started
### Launch the cli via pip install
```bash
pip install infinity-emb[all]
```
After your pip install, with your venv active, you can run the CLI directly.

```bash
infinity_emb v2 --model-id BAAI/bge-small-en-v1.5
```
Check the `v2 --help` command to get a description for all parameters.
```bash
infinity_emb v2 --help
```
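With the server from `infinity_emb v2 --model-id BAAI/bge-small-en-v1.5` running, embeddings can be requested over HTTP. The following is a minimal sketch using only the Python standard library. It assumes the default port `7997`, an OpenAI-style `POST /embeddings` route with `model` and `input` fields, and Bearer authentication if the server was started with `--api-key`. The helper names `build_embedding_request` and `fetch_embeddings` are hypothetical and not part of Infinity.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, texts, api_key=None):
    """Build (but do not send) an OpenAI-style embeddings request."""
    payload = {"model": model, "input": list(texts)}
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Assumption: the server expects a Bearer token when --api-key is set.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def fetch_embeddings(request, timeout=30.0):
    """Send the request and return one embedding (list of floats) per input."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response shape {"data": [{"embedding": [...]}, ...]}.
    return [item["embedding"] for item in body["data"]]


if __name__ == "__main__":
    req = build_embedding_request(
        "http://localhost:7997", "BAAI/bge-small-en-v1.5", ["Paris is in France."]
    )
    # Only prints the target URL; call fetch_embeddings(req) once a server is running.
    print(req.full_url)
```

Building the request separately from sending it makes the payload easy to inspect before any network traffic happens.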
### Launch the CLI using a pre-built docker container (recommended)
Instead of installing the CLI via pip, you may also use docker to run `michaelf34/infinity`.
Make sure you mount your accelerator (i.e. install `nvidia-docker` and activate with `--gpus all`).

```bash
port=7997
model1=michaelfeil/bge-small-en-v1.5
model2=mixedbread-ai/mxbai-rerank-xsmall-v1
volume=$PWD/data

docker run -it --gpus all \
-v $volume:/app/.cache \
-p $port:$port \
michaelf34/infinity:latest \
v2 \
--model-id $model1 \
--model-id $model2 \
--port $port
```
The cache path inside the docker container is set by the environment variable `HF_HOME`.

#### Specialized docker images
Docker container for CPU
Use the `latest-cpu` image or `x.x.x-cpu` for a slimmer image.
Run it like any other CPU-only docker image.
Optimum/ONNX is often the preferred engine.

```
docker run -it \
-v $volume:/app/.cache \
-p $port:$port \
michaelf34/infinity:latest-cpu \
v2 \
--engine optimum \
--model-id $model1 \
--model-id $model2 \
--port $port
```

Docker Container for ROCm (MI200 Series and MI300 Series)
Use the `latest-rocm` image or `x.x.x-rocm` for ROCm-compatible inference.
**This image is currently not built via CI/CD (too large); consider pinning to an exact version.**
Make sure ROCm is correctly installed and ready to use with Docker.

Visit the [Docs](https://michaelfeil.github.io/infinity) for more info.
Docker Container for Onnx-GPU, Cuda Extensions, TensorRT
Use the `latest-trt-onnx` image or `x.x.x-trt-onnx` for NVIDIA-compatible inference.
**This image is currently not built via CI/CD (too large); consider pinning to an exact version.**

This image has support for:
- ONNX-CUDA "CUDAExecutionProvider"
- ONNX-TensorRT "TensorrtExecutionProvider" (may not always work due to version mismatch with ORT)
- CUDA extensions and packages, e.g. Tri Dao's `pip install flash-attn` package when using PyTorch.
- nvcc compiler support
```
docker run -it \
-v $volume:/app/.cache \
-p $port:$port \
michaelf34/infinity:latest-trt-onnx \
v2 \
--engine optimum \
--device cuda \
--model-id $model1 \
--port $port
```

#### Advanced CLI usage
Launching multiple models at once
Since `infinity_emb>=0.0.34`, you can use the CLI `v2` method to launch multiple models at the same time.
Check out `infinity_emb v2 --help` for all args and validation.

Multiple Model CLI Playbook:
- 1. CLI options can be repeated, e.g. `v2 --model-id model/id1 --model-id model/id2 --batch-size 8 --batch-size 4`. This creates two models, `model/id1` and `model/id2`.
- 2. Or adapt the defaults by setting environment variables separated by `;`: `INFINITY_MODEL_ID="model/id1;model/id2;" && INFINITY_BATCH_SIZE="8;4;"`
- 3. Single items are broadcast to the number of `--model-id` entries: `v2 --model-id model/id1 --model-id model/id2 --batch-size 8` gives both models a batch size of 8.
- 4. Everything is broadcast to the number of `--model-id` entries, and API requests are routed to the matching `--served-model-name`/`--model-id`.

Using environment variables instead of the cli
All CLI arguments are also launchable via environment variables.

Environment variables start with `INFINITY_{UPPER_CASE_SNAKE_CASE}` and often match the `--{lower-case-kebab-case}` cli arguments.
The following two are equivalent:
- CLI `infinity_emb v2 --model-id BAAI/bge-base-en-v1.5`
- ENV-CLI: `export INFINITY_MODEL_ID="BAAI/bge-base-en-v1.5" && infinity_emb v2`

Multiple arguments can be used via `;` syntax: `INFINITY_MODEL_ID="model/id1;model/id2;"`
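
The `;` syntax and the broadcasting rule from the playbook can be illustrated with a short sketch. It mirrors the behavior described above and is not Infinity's actual parser. The function names `split_env_list`, `broadcast` and `per_model_batch_sizes` are hypothetical.

```python
def split_env_list(value):
    """Split a `;`-separated value, ignoring the optional trailing `;`."""
    return [item for item in value.split(";") if item]


def broadcast(values, n, name):
    """Repeat a single value n times; otherwise require exactly n values."""
    if len(values) == 1:
        return values * n
    if len(values) != n:
        raise ValueError(f"{name}: expected 1 or {n} values, got {len(values)}")
    return values


def per_model_batch_sizes(model_id_value, batch_size_value):
    """Map each model id to its batch size, e.g. from INFINITY_MODEL_ID / INFINITY_BATCH_SIZE."""
    model_ids = split_env_list(model_id_value)
    batch_sizes = split_env_list(batch_size_value)
    batch_sizes = broadcast(batch_sizes, len(model_ids), "INFINITY_BATCH_SIZE")
    return {model: int(size) for model, size in zip(model_ids, batch_sizes)}


if __name__ == "__main__":
    # One batch size per model.
    print(per_model_batch_sizes("model/id1;model/id2;", "8;4;"))
    # A single batch size is broadcast to both models.
    print(per_model_batch_sizes("model/id1;model/id2;", "8"))
```

A mismatch such as two models with three batch sizes raises a `ValueError` in this sketch, which is one reasonable way to handle the validation mentioned for `v2 --help`.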
API Key
Supply an `--api-key secret123` via CLI or ENV `INFINITY_API_KEY="secret123"`.

Choosing the fastest engine
With `--engine torch`, the model must be compatible with https://github.com/UKPLab/sentence-transformers/ and AutoModel.

With `--engine optimum`, there must be an ONNX file. Models from https://huggingface.co/Xenova are recommended.

With `--engine ctranslate2`:
- only `BERT` models are supported.

Telemetry opt-out
See which telemetry is collected: https://michaelfeil.eu/infinity/main/telemetry/
```
# Disable
export INFINITY_ANONYMOUS_USAGE_STATS="0"
```

### Supported Tasks and Models by Infinity
Infinity aims to be the inference server supporting the most functionality for embeddings, reranking and related RAG tasks. Infinity tests 15+ architectures and all of the cases below in the GitHub CI.
See the sections below for tasks and **validated example models**.

Text Embeddings
Text embeddings measure the relatedness of text strings. Embeddings are used for search, clustering, and recommendations.
Think of it as a privately deployed version of OpenAI's text embeddings: https://platform.openai.com/docs/guides/embeddings

Tested embedding models:
- [mixedbread-ai/mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1)
- [WhereIsAI/UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1)
- [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- [Alibaba-NLP/gte-large-en-v1.5](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5)
- [jinaai/jina-embeddings-v2-base-code](https://huggingface.co/jinaai/jina-embeddings-v2-base-code)
- [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2)
- [intfloat/multilingual-e5-large-instruct](https://huggingface.co/intfloat/multilingual-e5-large-instruct)
- [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small)
- [jinaai/jina-embeddings-v3](https://huggingface.co/jinaai/jina-embeddings-v3)
- [BAAI/bge-m3, no sparse](https://huggingface.co/BAAI/bge-m3)
- decoder-based models. Keep in mind that they are ~20-100x larger (and slower) than bert-small models:
  - [Alibaba-NLP/gte-Qwen2-1.5B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct/discussions/20)
  - [Salesforce/SFR-Embedding-2_R](https://huggingface.co/Salesforce/SFR-Embedding-2_R/discussions/6)
  - [Alibaba-NLP/gte-Qwen2-7B-instruct](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct/discussions/39)

Other models:
- Most embedding models are likely supported: https://huggingface.co/models?pipeline_tag=feature-extraction&other=text-embeddings-inference&sort=trending
- Check the MTEB leaderboard for models: https://huggingface.co/spaces/mteb/leaderboard

Reranking
Given a query and a list of documents, reranking orders the documents from most to least semantically relevant to the query.
Think of it as a locally deployed version of https://docs.cohere.com/reference/rerank
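
As a rough sketch of what a client does with a reranker, the snippet below builds a request body and orders documents by score. It assumes a Cohere-style `POST /rerank` route that accepts `model`, `query` and `documents`, and returns `results` entries with `index` and `relevance_score`. These field names are assumptions taken from the Cohere reference linked above, and the helper names are hypothetical.

```python
def build_rerank_payload(model, query, documents):
    """Request body for a Cohere-style rerank call (assumed field names)."""
    return {"model": model, "query": query, "documents": list(documents)}


def order_documents(documents, results):
    """Return (score, document) pairs sorted from most to least relevant."""
    ranked = sorted(results, key=lambda r: r["relevance_score"], reverse=True)
    return [(r["relevance_score"], documents[r["index"]]) for r in ranked]


if __name__ == "__main__":
    docs = ["Paris is in France!", "infinity_emb is a Python package."]
    payload = build_rerank_payload(
        "mixedbread-ai/mxbai-rerank-xsmall-v1",
        "What is the python package infinity_emb?",
        docs,
    )
    # Made-up scores standing in for a real server response.
    fake_results = [
        {"index": 0, "relevance_score": 0.1},
        {"index": 1, "relevance_score": 0.9},
    ]
    for score, doc in order_documents(docs, fake_results):
        print(f"{score:.2f}  {doc}")
```

The scores in the usage example are invented for illustration; real values depend on the model.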
Tested reranking models:
- [mixedbread-ai/mxbai-rerank-xsmall-v1](https://huggingface.co/mixedbread-ai/mxbai-rerank-xsmall-v1)
- [Alibaba-NLP/gte-multilingual-reranker-base](https://huggingface.co/Alibaba-NLP/gte-multilingual-reranker-base)
- [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)
- [BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)
- [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
- [jinaai/jina-reranker-v1-turbo-en](https://huggingface.co/jinaai/jina-reranker-v1-turbo-en)

Other reranking models:
- Reranking models supported by Infinity are bert-style classification models with one category.
- Most reranking models are likely supported: https://huggingface.co/models?pipeline_tag=text-classification&other=text-embeddings-inference&sort=trending
- https://huggingface.co/models?pipeline_tag=text-classification&sort=trending&search=rerank

Multi-modal and cross-modal - image and audio embeddings
Specialized embedding models that allow for image<->text or image<->audio search.
Typically, these models allow for text<->text, text<->other and other<->other search, with accuracy tradeoffs when going cross-modal.
Image<->text models can be used for e.g. photo-gallery search, where users can type in keywords to find photos, or use a photo to find related images.
Audio<->text models are less popular, and can be used e.g. to find songs based on a text description, or to find related songs.
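
The gallery-search use case comes down to ranking stored embeddings by their similarity to a query embedding. The sketch below uses cosine similarity, a common choice that the text above does not prescribe. The tiny two-dimensional vectors are toy values; real embeddings would come from the same multi-modal model for both the query and the gallery items. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search_gallery(query_embedding, gallery, top_k=3):
    """Rank gallery items (name -> embedding) by similarity to the query."""
    scored = [
        (cosine_similarity(query_embedding, embedding), name)
        for name, embedding in gallery.items()
    ]
    scored.sort(reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy embeddings; in practice these are produced by the embedding model.
    gallery = {
        "cat.jpg": [1.0, 0.0],
        "dog.jpg": [0.0, 1.0],
        "kitten.jpg": [0.9, 0.1],
    }
    text_query_embedding = [1.0, 0.0]
    for score, name in search_gallery(text_query_embedding, gallery, top_k=2):
        print(f"{score:.3f}  {name}")
```

For large galleries a vector database would replace the linear scan, but the ranking principle stays the same.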
Tested image<->text models:
- [wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M](https://huggingface.co/wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M)
- [jinaai/jina-clip-v1](https://huggingface.co/jinaai/jina-clip-v1)
- [google/siglip-so400m-patch14-384](https://huggingface.co/google/siglip-so400m-patch14-384)
- Models of type: ClipModel / SiglipModel in `config.json`
Tested audio<->text models:
- [Clap Models from LAION](https://huggingface.co/collections/laion/clap-contrastive-language-audio-pretraining-65415c0b18373b607262a490)
- Only a limited number of open-source organizations train these models.
- *Note: The sampling rate of the audio data needs to match the model.*

Not supported:
- Plain vision models, e.g. nomic-ai/nomic-embed-vision-v1.5

ColBert-style late-interaction Embeddings
ColBert embeddings don't apply any special pooling method, but return the raw **token embeddings**.
The **token embeddings** are then scored with the MaxSim metric in a vector DB (Qdrant / Vespa).
For usage via the REST API, late-interaction embeddings may best be transported via `base64` encoding.
Example notebook: https://colab.research.google.com/drive/14FqLc0N_z92_VgL_zygWV5pJZkaskyk7?usp=sharing
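
To make the MaxSim metric concrete, here is a minimal pure-Python sketch. For every query token it takes the best dot product against any document token and sums these maxima. In practice the vector DB computes this; the toy two-dimensional token embeddings below are illustrative only.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def maxsim(query_tokens, doc_tokens):
    """Sum, over query tokens, of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]       # two query token embeddings
    doc_a = [[1.0, 0.0], [0.0, 1.0]]       # matches both query tokens
    doc_b = [[1.0, 0.0], [1.0, 0.0]]       # matches only the first query token
    print(maxsim(query, doc_a), maxsim(query, doc_b))
```

If the token embeddings are L2-normalized, each dot product equals the cosine similarity between the two tokens.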
Tested colbert models:
- [colbert-ir/colbertv2.0](https://huggingface.co/colbert-ir/colbertv2.0)
- [jinaai/jina-colbert-v2](https://huggingface.co/jinaai/jina-colbert-v2)
- [mixedbread-ai/mxbai-colbert-large-v1](https://huggingface.co/mixedbread-ai/mxbai-colbert-large-v1)
- [answerai-colbert-small-v1 - click link for instructions](https://huggingface.co/answerdotai/answerai-colbert-small-v1/discussions/14)

ColPali-style late-interaction Image<->Text Embeddings
Usage is similar to ColBert, but operating on image<->text pairs instead of text only.
For usage via the RestAPI, late-interaction embeddings may best be transported via `base64` encoding.
Example notebook: https://colab.research.google.com/drive/14FqLc0N_z92_VgL_zygWV5pJZkaskyk7?usp=sharing
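
The `base64` transport mentioned above can be sketched with the standard library. This example assumes the payload is the raw bytes of a flat float32 array in native byte order and that the embedding dimension is known to the client. The exact wire format used by Infinity is not specified here, so treat the layout as an assumption. The function names are hypothetical.

```python
import base64
from array import array


def encode_token_embeddings(rows):
    """Flatten equal-length float rows into base64-encoded float32 bytes."""
    flat = array("f", (x for row in rows for x in row))
    return base64.b64encode(flat.tobytes()).decode("ascii")


def decode_token_embeddings(payload, dim):
    """Inverse of encode_token_embeddings: rebuild rows of length `dim`."""
    flat = array("f")
    flat.frombytes(base64.b64decode(payload))
    if len(flat) % dim != 0:
        raise ValueError("byte length does not match the embedding dimension")
    return [list(flat[i:i + dim]) for i in range(0, len(flat), dim)]


if __name__ == "__main__":
    # Values chosen to be exactly representable in float32.
    tokens = [[0.5, -1.0, 2.0], [0.25, 0.0, 4.0]]
    payload = encode_token_embeddings(tokens)
    print(payload)
    print(decode_token_embeddings(payload, dim=3))
```

Late-interaction models return one vector per token (or image patch), so a compact binary encoding keeps responses much smaller than JSON lists of decimal numbers.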
Tested ColPali/ColQwen models:
- [vidore/colpali-v1.2-merged](https://huggingface.co/michaelfeil/colpali-v1.2-merged)
- [michaelfeil/colqwen2-v0.1](https://huggingface.co/michaelfeil/colqwen2-v0.1)
- No LoRA adapters are supported, only "merged" models.

Text classification
Bert-style multi-label text classification: assigns input text to distinct categories.
Tested models:
- [ProsusAI/finbert](https://huggingface.co/ProsusAI/finbert), financial news classification
- [SamLowe/roberta-base-go_emotions](https://huggingface.co/SamLowe/roberta-base-go_emotions), text to emotion categories.
- bert-style text-classification models with more than one label in `config.json`

### Infinity usage via the Python API
Instead of the CLI and REST API, you can use Infinity directly via the Python API.
This gives you the most flexibility. The Python API builds on `asyncio` with its `async`/`await` features to allow concurrent processing of requests. Arguments of the CLI are also available via Python.

#### Embeddings
```python
import asyncio
from infinity_emb import AsyncEngineArray, EngineArgs, AsyncEmbeddingEngine

sentences = ["Embed this sentence via Infinity.", "Paris is in France."]
array = AsyncEngineArray.from_args([
    EngineArgs(model_name_or_path="BAAI/bge-small-en-v1.5", engine="torch", embedding_dtype="float32", dtype="auto")
])

async def embed_text(engine: AsyncEmbeddingEngine):
    # The context manager starts and stops the engine for you.
    async with engine:
        embeddings, usage = await engine.embed(sentences=sentences)
    # or handle the async start / stop yourself.
    await engine.astart()
    embeddings, usage = await engine.embed(sentences=sentences)
    await engine.astop()

asyncio.run(embed_text(array[0]))
```

#### Reranking
Reranking gives you a score for similarity between a query and multiple documents.
Use it in conjunction with a vector DB and embeddings, or standalone for a small number of documents.
Please select a model from HuggingFace that is compatible with `AutoModelForSequenceClassification` and has a single output class.

```python
import asyncio
from infinity_emb import AsyncEngineArray, EngineArgs, AsyncEmbeddingEngine

query = "What is the python package infinity_emb?"
docs = ["This is a document not related to the python package infinity_emb, hence...",
        "Paris is in France!",
        "infinity_emb is a package for sentence embeddings and rerankings using transformer models in Python!"]
# Use AsyncEngineArray here: it accepts a list of EngineArgs and is indexed below.
array = AsyncEngineArray.from_args(
    [EngineArgs(model_name_or_path="mixedbread-ai/mxbai-rerank-xsmall-v1", engine="torch")]
)

async def rerank(engine: AsyncEmbeddingEngine):
    async with engine:
        ranking, usage = await engine.rerank(query=query, docs=docs)
        print(list(zip(ranking, docs)))
    # or handle the async start / stop yourself.
    await engine.astart()
    ranking, usage = await engine.rerank(query=query, docs=docs)
    await engine.astop()

asyncio.run(rerank(array[0]))
```

When using the CLI, use this command to launch rerankers:
```bash
infinity_emb v2 --model-id mixedbread-ai/mxbai-rerank-xsmall-v1
```

#### Image-Embeddings: CLIP models
CLIP models are able to encode images and text at the same time.
```python
import asyncio
from infinity_emb import AsyncEngineArray, EngineArgs, AsyncEmbeddingEngine

sentences = ["This is awesome.", "I am bored."]
images = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
engine_args = EngineArgs(
    model_name_or_path="wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M",
    engine="torch"
)
array = AsyncEngineArray.from_args([engine_args])

async def embed(engine: AsyncEmbeddingEngine):
    await engine.astart()
    embeddings, usage = await engine.embed(sentences=sentences)
    embeddings_image, _ = await engine.image_embed(images=images)
    await engine.astop()

asyncio.run(embed(array["wkcn/TinyCLIP-ViT-8M-16-Text-3M-YFCC15M"]))
```

#### Audio-Embeddings: CLAP models
CLAP models are able to encode audio and text at the same time.
```python
import asyncio
from infinity_emb import AsyncEngineArray, EngineArgs, AsyncEmbeddingEngine
import requests
import soundfile as sf
import io

sentences = ["This is awesome.", "I am bored."]
url = "https://bigsoundbank.com/UPLOAD/wav/2380.wav"
raw_bytes = requests.get(url, stream=True).content

audios = [raw_bytes]
engine_args = EngineArgs(
    model_name_or_path="laion/clap-htsat-unfused",
    dtype="float32",
    engine="torch")
array = AsyncEngineArray.from_args([engine_args])

async def embed(engine: AsyncEmbeddingEngine):
    await engine.astart()
    embeddings, usage = await engine.embed(sentences=sentences)
    embedding_audios = await engine.audio_embed(audios=audios)
    await engine.astop()

asyncio.run(embed(array["laion/clap-htsat-unfused"]))
```

#### Text Classification
Use text classification with Infinity's `classify` feature, which allows for sentiment analysis, emotion detection, and more classification tasks.
```python
import asyncio
from infinity_emb import AsyncEngineArray, EngineArgs, AsyncEmbeddingEngine

sentences = ["This is awesome.", "I am bored."]
engine_args = EngineArgs(
    model_name_or_path="SamLowe/roberta-base-go_emotions",
    engine="torch", model_warmup=True)
array = AsyncEngineArray.from_args([engine_args])

async def classifier(engine: AsyncEmbeddingEngine):
    async with engine:
        predictions, usage = await engine.classify(sentences=sentences)
    # or handle the async start / stop yourself.
    await engine.astart()
    predictions, usage = await engine.classify(sentences=sentences)
    await engine.astop()

asyncio.run(classifier(array["SamLowe/roberta-base-go_emotions"]))
```

### Infinity usage via the Python Client
Infinity provides a generated Python client for client-side use of the REST API.
If you want to call a remote Infinity instance via the REST API, install the following package locally:
```bash
pip install infinity_client
```

For more information, check out the Client Readme:
https://github.com/michaelfeil/infinity/tree/main/libs/client_infinity/infinity_client

## Integrations:
- [Serverless deployments at Runpod](https://github.com/runpod-workers/worker-infinity-embedding)
- [Truefoundry Cognita](https://github.com/truefoundry/cognita)
- [Langchain example](https://github.com/langchain-ai/langchain)
- [imitater - A unified language model server built upon vllm and infinity.](https://github.com/the-seeds/imitater)
- [Dwarves Foundation: Deployment examples using Modal.com](https://github.com/dwarvesf/llm-hosting)
- [infiniflow/Ragflow](https://github.com/infiniflow/ragflow)
- [SAP Core AI](https://github.com/SAP-samples/btp-generative-ai-hub-use-cases/tree/main/10-byom-oss-llm-ai-core)
- [gpt_server - gpt_server is an open-source framework designed for production-level deployment of LLMs (Large Language Models) or Embeddings.](https://github.com/shell-nlp/gpt_server)
- [KubeAI: Kubernetes AI Operator for inferencing](https://github.com/substratusai/kubeai)
- [LangChain](https://python.langchain.com/docs/integrations/text_embedding/infinity)
- [Batched, a modification of the batching algorithm in Infinity](https://github.com/mixedbread-ai/batched)

## Documentation
View the docs at [https://michaelfeil.github.io/infinity](https://michaelfeil.github.io/infinity) on how to get started.
After startup, the Swagger UI will be available under `{url}:{port}/docs`, in this case `http://localhost:7997/docs`. You can also find an interactive preview here: https://infinity.modal.michaelfeil.eu/docs (and https://michaelfeil-infinity.hf.space/docs)

## Contribute and Develop
Install via Poetry 1.8.1, Python3.11 on Ubuntu 22.04
```bash
cd libs/infinity_emb
poetry install --extras all --with lint,test
```

To pass the CI:
```bash
cd libs/infinity_emb
make precommit
```

All contributions must be compatible with the MIT License of this repo.
### Citation
```
@software{feil_2023_11630143,
author = {Feil, Michael},
title = {Infinity - To Embeddings and Beyond},
month = oct,
year = 2023,
publisher = {Zenodo},
doi = {10.5281/zenodo.11630143},
url = {https://doi.org/10.5281/zenodo.11630143}
}
```

[contributors-shield]: https://img.shields.io/github/contributors/michaelfeil/infinity.svg?style=for-the-badge
[contributors-url]: https://github.com/michaelfeil/infinity/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/michaelfeil/infinity.svg?style=for-the-badge
[forks-url]: https://github.com/michaelfeil/infinity/network/members
[stars-shield]: https://img.shields.io/github/stars/michaelfeil/infinity.svg?style=for-the-badge
[stars-url]: https://github.com/michaelfeil/infinity/stargazers
[issues-shield]: https://img.shields.io/github/issues/michaelfeil/infinity.svg?style=for-the-badge
[issues-url]: https://github.com/michaelfeil/infinity/issues
[license-shield]: https://img.shields.io/github/license/michaelfeil/infinity.svg?style=for-the-badge
[license-url]: https://github.com/michaelfeil/infinity/blob/main/LICENSE
[pepa-shield]: https://static.pepy.tech/badge/infinity-emb
[pepa-url]: https://www.pepy.tech/projects/infinity-emb
[codecov-shield]: https://codecov.io/gh/michaelfeil/infinity/branch/main/graph/badge.svg?token=NMVQY5QOFQ
[codecov-url]: https://codecov.io/gh/michaelfeil/infinity/branch/main
[ci-shield]: https://github.com/michaelfeil/infinity/actions/workflows/ci.yaml/badge.svg
[ci-url]: https://github.com/michaelfeil/infinity/actions