Projects in Awesome Lists tagged with model-serving
A curated list of projects in awesome lists tagged with model-serving.
https://github.com/vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
amd cuda deepseek gpt hpu inference inferentia llama llm llm-serving llmops mlops model-serving pytorch qwen rocm tpu trainium transformer xpu
Last synced: 29 Jan 2026
https://github.com/bentoml/bentoml
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!
ai-inference deep-learning generative-ai inference-platform llm llm-inference llm-serving llmops machine-learning ml-engineering mlops model-inference-service model-serving multimodal python
Last synced: 12 May 2025
https://github.com/bentoml/BentoML
The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and much more!
ai-inference deep-learning generative-ai inference-platform llm llm-inference llm-serving llmops machine-learning ml-engineering mlops model-inference-service model-serving multimodal python
Last synced: 12 Mar 2025
https://github.com/ahkarami/deep-learning-in-production
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
angularjs c-plus-plus caffe2 convert-pytorch-models deep-learning deep-neural-networks flask keras model-serving mxnet production python pytorch react rest-api serving serving-pytorch-models tensorflow-models tesnorflow tutorial
Last synced: 14 May 2025
https://github.com/ahkarami/Deep-Learning-in-Production
In this repository, I will share some useful notes and references about deploying deep learning-based models in production.
angularjs c-plus-plus caffe2 convert-pytorch-models deep-learning deep-neural-networks flask keras model-serving mxnet production python pytorch react rest-api serving serving-pytorch-models tensorflow-models tesnorflow tutorial
Last synced: 14 Mar 2025
https://github.com/FedML-AI/FedML
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
ai-agent deep-learning distributed-training edge-ai federated-learning inference-engine machine-learning mlops model-deployment model-serving on-device-training
Last synced: 04 Apr 2025
https://github.com/kserve/kserve
Standardized Serverless ML Inference Platform on Kubernetes
artificial-intelligence genai hacktoberfest istio k8s knative kserve kubeflow kubernetes llm-inference machine-learning mlops model-interpretability model-serving pytorch service-mesh sklearn tensorflow xgboost
Last synced: 13 May 2025
https://github.com/beclab/olares
Olares: An Open-Source Personal Cloud to Reclaim Your Data
ai-agents ai-privacy edge-ai home-automation home-cloud home-server homelab homeserver kubernetes local-ai mcp model-serving personal-cloud self-hosted
Last synced: 12 Feb 2026
https://github.com/fedml-ai/fedml
FEDML - The unified and scalable ML library for large-scale distributed training, model serving, and federated learning. FEDML Launch, a cross-cloud scheduler, further enables running any AI jobs on any GPU cloud or on-premise cluster. Built on this library, TensorOpera AI (https://TensorOpera.ai) is your generative AI platform at scale.
ai-agent deep-learning distributed-training edge-ai federated-learning inference-engine machine-learning mlops model-deployment model-serving on-device-training
Last synced: 08 May 2025
https://github.com/modeltc/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
deep-learning gpt llama llm model-serving nlp openai-triton
Last synced: 13 May 2025
https://github.com/predibase/lorax
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
fine-tuning gpt llama llm llm-inference llm-serving llmops lora model-serving pytorch transformers
Last synced: 12 May 2025
https://github.com/HuaizhengZhang/Awesome-System-for-Machine-Learning
Awesome System for Machine Learning. AI System Papers and Industry Practice. System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. Llama3, Mistral, etc. Video Tutorials.
ai-infra genai large-language-models llmsys mlsys model-serving model-training
Last synced: 09 Apr 2025
https://github.com/ModelTC/lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
deep-learning gpt llama llm model-serving nlp openai-triton
Last synced: 20 Mar 2025
https://github.com/tensorchord/envd
Reproducible development environment
buildkit developer-tools development-environment docker hacktoberfest llmops mlops mlops-workflow model-serving
Last synced: 03 Oct 2025
https://github.com/microsoft/aici
AICI: Prompts as (Wasm) Programs
ai inference language-model llm llm-framework llm-inference llm-serving llmops model-serving rust transformer wasm wasmtime
Last synced: 14 May 2025
https://github.com/beclab/Olares
Olares: An Open-Source Sovereign Cloud OS for Local AI
ai-agents ai-privacy edge-ai home-automation homelab homeserver kubernetes local-ai mcp model-serving nas self-hosted
Last synced: 02 May 2025
https://github.com/mlrun/mlrun
MLRun is an open source MLOps platform for quickly building and managing continuous ML applications across their lifecycle. MLRun integrates into your development and CI/CD environment and automates the delivery of production data, ML pipelines, and online applications.
data-engineering data-science experiment-tracking kubernetes machine-learning mlops mlops-workflow model-serving python workflow
Last synced: 18 Feb 2026
https://github.com/logicalclocks/hopsworks
Hopsworks - Data-Intensive AI platform with a Feature Store
aws azure data-science feature-engineering feature-management feature-store gcp governance hopsworks kserve machine-learning ml mlops model-serving pyspark python serverless
Last synced: 14 May 2025
https://github.com/basetenlabs/truss
The simplest way to serve AI/ML models in production
artificial-intelligence easy-to-use falcon inference-api inference-server machine-learning model-serving open-source packaging stable-diffusion whisper wizardlm
Last synced: 26 Feb 2026
https://github.com/zhihu/zhilight
A highly optimized LLM inference acceleration engine for Llama and its variants.
cuda deepseek-r1 gpt inference-engine llama llm llm-inference llm-serving model-serving pytorch
Last synced: 15 May 2025
https://github.com/alibaba/rtp-llm
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
gpt inference llama llm llm-serving llmops model-serving
Last synced: 14 Oct 2025
https://github.com/mosecorg/mosec
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine
cv deep-learning gpu hacktoberfest jax llm llm-serving machine-learning machine-learning-platform mlops model-serving mxnet nerual-network python pytorch rust tensorflow tts
Last synced: 14 May 2025
https://github.com/bentoml/yatai
Model Deployment at Scale on Kubernetes
bentoml k8s kubernetes machine-learning mlops model-deployment model-serving
Last synced: 16 May 2025
https://github.com/bentoml/Yatai
Model Deployment at Scale on Kubernetes
bentoml k8s kubernetes machine-learning mlops model-deployment model-serving
Last synced: 24 Mar 2025
https://github.com/kitops-ml/kitops
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
ai code datasets devops devops-tools gguf hacktoberfest kubernetes kubernetes-deployment ml mlops mlops-tools model-interpretability model-serving models opensource platform-engineering pytorch sklearn tensorflow
Last synced: 15 May 2025
https://github.com/vllm-project/vllm-ascend
Community maintained hardware plugin for vLLM on Ascend
ascend inference llm llm-serving llmops mlops model-serving transformer vllm
Last synced: 27 Feb 2026
https://github.com/efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
cuda inference llama2 llm llm-serving model-serving
Last synced: 21 Apr 2025
https://github.com/efeslab/nanoflow
A throughput-oriented high-performance serving framework for LLMs
cuda inference llama2 llm llm-serving model-serving
Last synced: 16 May 2025
https://github.com/kitops-ml/kitops?tab=readme-ov-file
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
ai code datasets devops devops-tools gguf hacktoberfest kubernetes kubernetes-deployment ml mlops mlops-tools model-interpretability model-serving models opensource platform-engineering pytorch sklearn tensorflow
Last synced: 28 Apr 2025
https://github.com/openvinotoolkit/model_server
A scalable inference server for models optimized with OpenVINO™
ai cloud dag deep-learning edge genai inference kubernetes machine-learning model-serving openvino serving
Last synced: 14 May 2025
https://github.com/jozu-ai/kitops
An open source DevOps tool for packaging and versioning AI/ML models, datasets, code, and configuration into an OCI artifact.
ai code datasets devops devops-tools gguf hacktoberfest kubernetes kubernetes-deployment ml mlops mlops-tools model-interpretability model-serving models opensource platform-engineering pytorch sklearn tensorflow
Last synced: 16 Mar 2025
https://github.com/underneathall/pinferencia
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
ai artificial-intelligence computer-vision data-science deep-learning huggingface inference inference-server machine-learning model-deployment model-serving modelserver nlp paddlepaddle predict python pytorch serving tensorflow transformers
Last synced: 08 Oct 2025
https://github.com/ServerlessLLM/ServerlessLLM
Serverless LLM Serving for Everyone.
cuda huggingface-transformers large-language-models model-as-a-service model-serving pytorch serverless-inference
Last synced: 07 May 2025
https://github.com/eightBEC/fastapi-ml-skeleton
FastAPI Skeleton App to serve machine learning models production-ready.
fastapi machine-learning model-serving python python3
Last synced: 15 Mar 2025
https://github.com/serverlessllm/serverlessllm
Serverless LLM Serving for Everyone.
cuda huggingface-transformers large-language-models model-as-a-service model-serving pytorch serverless-inference
Last synced: 15 May 2025
https://github.com/bentoml/BentoDiffusion
BentoDiffusion: A collection of diffusion models served with BentoML
ai diffusion-models fine-tuning kubernetes lora model-serving stable-diffusion
Last synced: 17 Aug 2025
https://github.com/bentoml/bentodiffusion
BentoDiffusion: A collection of diffusion models served with BentoML
ai diffusion-models fine-tuning kubernetes lora model-serving stable-diffusion
Last synced: 16 May 2025
https://github.com/ai-hypercomputer/jetstream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
gemma gpt gpu inference jax large-language-models llama llama2 llm llm-inference llmops mlops model-serving pytorch tpu transformer
Last synced: 23 Oct 2025
https://github.com/AI-Hypercomputer/JetStream
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
gemma gpt gpu inference jax large-language-models llama llama2 llm llm-inference llmops mlops model-serving pytorch tpu transformer
Last synced: 31 Mar 2025
https://github.com/aniketmaurya/chitra
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.
bounding-boxes deep-learning fastapi gradcam hacktoberfest image-classification image-dataset image-processing machine-learning mlops model-deployment model-interpretation model-serving model-visualization object-detection python pytorch tensorflow visualization
Last synced: 15 May 2025
https://github.com/lightbend/kafka-with-akka-streams-kafka-streams-tutorial
Code samples for the Lightbend tutorial on writing microservices with Akka Streams, Kafka Streams, and Kafka
akka kafka-streams model-serving
Last synced: 02 May 2025
https://github.com/clearml/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
ai clearml deep-learning devops kubernetes machine-learning mlops model-serving serving serving-ml serving-pytorch-models tensorflow-serving triton triton-inference-server
Last synced: 17 Jun 2025
https://github.com/FederatedAI/FATE-Serving
A scalable, high-performance serving system for federated learning models
federated-learning inference model-serving model-versioning monitor
Last synced: 16 Nov 2025
https://github.com/bentoml/gallery
BentoML Example Projects
aws-lambda aws-sagemaker azure-machine-learning bentoml data-science gallery gcp-cloud-functions machine-learning machine-learning-library machine-learning-workflow model-deployment model-management model-serving serverless
Last synced: 04 Feb 2026
https://github.com/project-monai/monai-deploy-app-sdk
MONAI Deploy App SDK offers a framework and associated tools to design, develop and verify AI-driven applications in the healthcare imaging domain.
ai deep-learning deploy dicom healthcare image-processing machine-learning medical-imaging ml ml-infrastructure ml-platform mlops model-deployment model-serving monai pipeline python pytorch workflow
Last synced: 16 May 2025
https://github.com/alvarobartt/serving-pytorch-models
Serving PyTorch models with TorchServe :fire:
image-classification machine-learning mlops model-deployment model-serving pytorch pytorch-cnn serve-pytorch torchserve
Last synced: 12 Apr 2025
https://github.com/notai-tech/fastdeploy
Deploy DL/ ML inference pipelines with minimal extra code.
deep-learning docker falcon gevent gunicorn http-server inference-server model-deployment model-serving python pytorch serving streaming-audio tensorflow-serving tf-serving torchserve triton triton-inference-server triton-server websocket
Last synced: 13 Apr 2025
https://github.com/nimbleboxai/nbox
The official Python package for NimbleBox. Exposes all APIs as CLIs and contains modules to make ML
data-science machine-learning ml-infrastructure ml-platform ml-service mlops mlops-automation mlops-pipeline mlops-tool mlops-workflow model-deployment model-management model-monitoring model-serving practical-mlops
Last synced: 14 Dec 2025
https://github.com/aporia-ai/inferencedb
Stream inferences of real-time ML models in production to any data lake (Experimental)
kafka machine-learning mlops model-monitoring model-serving s3
Last synced: 30 Apr 2025
https://github.com/balavenkatesh3322/model_deployment
A collection of model deployment libraries and techniques.
aws azure caffe data-science deep-learning keras machine-learning model model-deployment model-server model-serving mxnet neural-network pytorch serving serving-pytorch-models serving-recommendation serving-tensors tensorflow
Last synced: 22 Apr 2025
https://github.com/thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency, flexibility, and availability.
deepseek gpu llm llm-serving model-serving pytorch
Last synced: 17 Mar 2025
https://github.com/alibaba/servegen
A framework for generating realistic LLM serving workloads
deepseek llm llm-serving model-serving qwen
Last synced: 14 Oct 2025
https://github.com/kspviswa/PyOMlx
A wannabe Ollama equivalent for Apple MLX models
Last synced: 10 Apr 2025
https://github.com/ai-hypercomputer/jetstream-pytorch
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
attention batching gemma inference llama llama2 llm llm-inference model-serving pytorch tpu
Last synced: 27 Oct 2025
https://github.com/messense/fasttext-serving
fastText model serving service
fasttext model-server model-serving nlp
Last synced: 04 Apr 2025
https://github.com/bentoml/clip-api-service
CLIP as a service - Embed image and sentences, object recognition, visual reasoning, image classification and reverse image search
ai-applications clip cloud-native mlops model-inference model-inference-service model-serving openai-clip
Last synced: 04 May 2025
https://github.com/bentoml/BentoOCR
Turn any OCR model into an online inference API endpoint
ai-applications model-deployment model-serving ocr ocr-python
Last synced: 04 May 2025
https://github.com/alvarobartt/serving-tensorflow-models
Serving TensorFlow models with TensorFlow Serving :orange_book:
image-classification machine-learning mlops model-deployment model-serving serve-tensorflow-models tensorflow tensorflow-serving
Last synced: 12 Apr 2025
https://github.com/bentoml/transformers-nlp-service
Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more
llm llmops mlops model-deployment model-inference-service model-serving nlp nlp-machine-learning online-inference transformer
Last synced: 04 May 2025
https://github.com/kspviswa/pyomlx
A wannabe Ollama equivalent for Apple MLX models
Last synced: 18 Sep 2025
https://github.com/galileo-galilei/kedro-mlflow-tutorial
A tutorial on how to use kedro-mlflow plugin (https://github.com/Galileo-Galilei/kedro-mlflow) to synchronize training and inference and serve kedro pipeline
kedro kedro-mlflow kedro-tutorial mlflow mlops model-serving
Last synced: 12 May 2025
https://github.com/lightbend/kubeflow-recommender
Kubeflow example of machine learning/model serving
kubeflow machine-learning model-serving
Last synced: 02 May 2025
https://github.com/ml-libs/mlserve
mlserve turns your python models into RESTful API, serves web page with form generated to match your input data.
machine-learning mlserve model-deployment model-serving scikit-learn
Last synced: 28 Jan 2026
https://github.com/modzy/sdk-python
Python library for Modzy Machine Learning Operations (MLOps) Platform
ai-security api-client deployment docker drift-detection explainable-ai kuberenetes machine-learning machine-learning-operations microservices mlops model-deployment model-serving production-machine-learning python serving
Last synced: 29 Jun 2025
https://github.com/a2i2/surround
Surround is a framework for building AI driven microservices in Python, https://surround.readthedocs.io/en/latest/
data-science machine-learning model-serving pipeline-framework python
Last synced: 14 Jan 2026
https://github.com/animator/titus2
Titus 2 : Portable Format for Analytics (PFA) implementation for Python 3.4+
analytics inference inference-engine ml-engine model-deployment model-evaluation model-serving pfa pfa-standard pmml python scoring scoring-engine titus
Last synced: 22 Aug 2025
https://github.com/bentoml/fraud-detection-model-serving
Online model serving with Fraud Detection model trained with XGBoost on IEEE-CIS dataset
ai-applications fraud-detection model-deployment model-serving
Last synced: 07 Aug 2025
https://github.com/h2oai/mlops-dai-runtimes
Production ready templates for deploying Driverless AI (DAI) scorers. https://h2oai.github.io/dai-deployment-templates/
h2o h2oai machine-learning model-deployment model-server model-serving mojo
Last synced: 07 Apr 2025
https://github.com/ksm26/efficiently-serving-llms
Learn the ins and outs of efficiently serving Large Language Models (LLMs). Dive into optimization techniques, including KV caching and Low Rank Adapters (LoRA), and gain hands-on experience with Predibase's LoRAX framework inference server.
batch-processing deep-learning-techniques inference-optimization large-scale-deployment machine-learning-operations model-acceleration model-inference-service model-serving optimization-techniques performance-enhancement scalability-strategies server-optimization serving-infrastructure text-generation
Last synced: 02 Aug 2025
https://github.com/galileo-galilei/kedro-serving
A kedro-plugin to serve Kedro Pipelines as API
fastapi kedro kedro-plugin mlops model-serving pipeline-serving serving
Last synced: 12 May 2025
https://github.com/rishit-dagli/tfserving-demos
TF Serving demos
jupyter-notebook model-serving python3 tensorflow tensorflow-model-server tensorflow-serving tensorflow2
Last synced: 07 May 2025
https://github.com/bentoml/diffusers-examples
API serving for your diffusers models
bentoml diffusers model-deployment model-serving
Last synced: 22 Jul 2025
https://github.com/yas-sim/openvino-model-server-wrapper
Python wrapper class for OpenVINO Model Server. Users can submit inference requests to OVMS with just a few lines of code.
ai area-intrusion-detection cloud deep-learning edge grpc grpc-client inference intel line-crossing-detection model-serving object-tracking openvino openvino-docker openvino-model-server python serving tensorflow-serving triton-inference-server
Last synced: 01 Aug 2025
https://github.com/adrien-legros/rhods-mnist
Data science pipelines and model serving using Red Hat OpenShift Data Science
data-science model-serving openshift-ai pipelines redhat rhoai rhods
Last synced: 17 Jan 2026
https://github.com/kristofferv98/whisper_turboapi
An optimized FastAPI server for OpenAI's Whisper whisper-large-v3-turbo model using MLX optimization
ai api asynchronous audio audio-processing fastapi huggingface machine-learning macos mlx model-serving nlp openai optimization python speech-to-text synchronous transcription whisper whisper-turbo
Last synced: 12 May 2025
https://github.com/peva3/smarterrouter
SmarterRouter: An intelligent LLM gateway and VRAM-aware router for Ollama, llama.cpp, and OpenAI. Features semantic caching, model profiling, and automatic failover for local AI labs.
ai-cache ai-gateway docker fastapi gpu-monitoring llm llm-proxy llm-router local-llm model-serving ollama ollama-api openai-proxy self-hosted self-hosted-ai semantic-cache
Last synced: 27 Feb 2026
https://github.com/logicalclocks/machine-learning-api
Hopsworks Machine Learning API: model management with a model registry and model serving
Last synced: 09 Apr 2025
https://github.com/riccorl/ner-serve
Simple NER model using Docker, FastAPI, ONNX and Multilingual Mini-LM.
backend deep-learning fastapi huggingface huggingface-transformers model-serving named-entity-recognition natural-language-processing ner nlp onnx onnxruntime pytorch transformers
Last synced: 05 Aug 2025
https://github.com/alibaba/aiopsserving
Open source code for AIOpsServing
ai-ops alicloud-compatible machine-learning mlflow-compatible model-benchmarking model-serving
Last synced: 14 Oct 2025
https://github.com/unaidedelf8777/faster-outlines
A Lazy, high throughput and blazing fast structured text generation backend.
ai llama llm llm-serving llmops model-serving performance transformer
Last synced: 27 Jun 2025
https://github.com/rapidrabbit76/fastapi-deep-learning-model-micro-batching-serving
FastAPI pytorch model serving with micro batching
Last synced: 24 Jun 2025
https://github.com/zerohertz/yolo-serving-cookbook
YOLO Serving Cookbook based on Triton Inference Server
docker docker-compose fastapi gradio k8s kubernetes mlops model-serving onnx pytorch triton-inference-server yolo yolov5
Last synced: 18 Mar 2025
https://github.com/saivarunk/krypton
Model Server for ML and DL Models built using FastAPI
deep-learning fastapi machine-learning model-serving rest-api
Last synced: 07 Sep 2025
https://github.com/modzy/sdk-go
The Golang library for Modzy Machine Learning Operations (MLOps) Platform
ai-security api-client api-client-go docker drift-detection explainable-ai golang kubernetes machine-learning-operations microservices mlops model-serving production-machine-learning serving
Last synced: 25 Jan 2026
https://github.com/algorithmiaio/algorithmia-modeldeployment-action
Algorithmia Github Action capable of running Jupyter notebooks to create the ML model, uploading the model and updating the algorithm at Algorithmia
algorithmia ci-cd githubaction-workflow githubactions machine-learning model-deployment model-serving
Last synced: 06 Oct 2025
https://github.com/Aquila-Network/AquilaHub
Load and serve Neural Encoder Models
machine-learning model-serving neural-search personal-search vector-search-engine
Last synced: 12 May 2025
https://github.com/prassanna-ravishankar/modalkit
A powerful Python framework for deploying ML models on Modal with production-ready features
Last synced: 05 Sep 2025
https://github.com/trustyai-explainability/trustyai-kserve-explainer
KServe TrustyAI explainer
explainable-ai kserve kubernetes model-serving trustyai xai
Last synced: 13 Jul 2025
https://github.com/amine-akrout/smoker_detection
Smoker Detection deep learning model served via a Web App using TensorFlow, tensorflow-serving, flask and Docker compose
deep-learning docker docker-compose flask inceptionv3 keras model-deployment model-serving tensorflow tesnorflow-serving transfer-learning
Last synced: 24 Mar 2025
https://github.com/mpolinowski/ray-serve-model
Using Ray Serve for ML Model Serving
consensus model-serving python ray
Last synced: 02 Aug 2025
https://github.com/algorithmiaio/githubactions-modeldeployment-demo-algorithmiaalgo
Demo ML repository, using Algorithmia Model Deployment Github Action, to auto deploy on an algorithm hosted on Algorithmia
algorithmia ci-cd githubaction-workflow githubactions jupyter-notebook machine-learning model-deployment model-serving xgboost
Last synced: 05 Apr 2025
https://github.com/dudeperf3ct/11-cortex-deploy
aws-lambda cortex docker fastapi mlops model-serving transformers
Last synced: 20 Feb 2025
https://github.com/algorithmiaio/githubactions-modeldeployment-template
Template ML repository to get started with Algorithmia Model Deployment Github Action integration
algorithmia cicd github-actions inference machine-learning model-deployment model-serving
Last synced: 05 Apr 2025
https://github.com/dudeperf3ct/8-fastapi-tests-gcp-gke
docker fastapi gke mlops model-serving
Last synced: 20 Feb 2025
https://github.com/dudeperf3ct/6-ml-fastapi-aws-serverless
aws codepipeline docker elasticbeanstalk fastapi mlops model-serving
Last synced: 05 Jul 2025
https://github.com/md-emon-hasan/bentoml
BentoML is a high-performance model serving framework. It provides various scripts and configurations to help streamline the deployment process.
ai bentoml data-science ml-engineering mlops model-deployment model-serving
Last synced: 02 Mar 2025
https://github.com/jeongahyun/flask-server-main
Crop pest and disease detection web service
caffe flask model-serving object-detection pytorch tensorflow
Last synced: 05 Apr 2025
https://github.com/ronylpatil/mlflow-pipeline
Built an E2E MLFlow Pipeline & hosted on AWS.
mlflow-tracking model-registry model-serving
Last synced: 13 Apr 2025
https://github.com/dudeperf3ct/12-serverless-deploy
aws-lambda docker fastapi mlops model-serving serverless-framework transformers
Last synced: 24 Jun 2025