{"id":15175576,"url":"https://github.com/kenza-ai/sagify","last_synced_at":"2025-05-16T15:07:54.040Z","repository":{"id":32233742,"uuid":"123693096","full_name":"Kenza-AI/sagify","owner":"Kenza-AI","description":"LLMs and Machine Learning done easily","archived":false,"fork":false,"pushed_at":"2024-03-10T12:38:07.000Z","size":37897,"stargazers_count":440,"open_issues_count":13,"forks_count":69,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-05-12T23:38:54.059Z","etag":null,"topics":["ai-gateway","anthropic","cohere","generative-ai","langchain","langchain-python","large-language-model","large-language-models","llm","llm-inference","llmops","open-source-llm","openai","sagemaker"],"latest_commit_sha":null,"homepage":"https://kenza-ai.github.io/sagify/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Kenza-AI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-03T13:12:14.000Z","updated_at":"2025-04-19T21:36:11.000Z","dependencies_parsed_at":"2024-01-13T21:25:54.818Z","dependency_job_id":"f0f29dc0-2e69-4479-9a2b-f8db4b076555","html_url":"https://github.com/Kenza-AI/sagify","commit_stats":{"total_commits":403,"total_committers":12,"mean_commits":"33.583333333333336","dds":"0.35732009925558317","last_synced_commit":"054a6a2edc215db8a0b84b7e7564d076115fc43d"},"previous_names":[],"tags_count":59,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kenza-AI%2Fsagify","tags_url":"https://repos.ecosyste.ms/api/v1/hosts
/GitHub/repositories/Kenza-AI%2Fsagify/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kenza-AI%2Fsagify/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Kenza-AI%2Fsagify/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Kenza-AI","download_url":"https://codeload.github.com/Kenza-AI/sagify/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254553958,"owners_count":22090417,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-gateway","anthropic","cohere","generative-ai","langchain","langchain-python","large-language-model","large-language-models","llm","llm-inference","llmops","open-source-llm","openai","sagemaker"],"created_at":"2024-09-27T12:39:33.470Z","updated_at":"2025-05-16T15:07:49.025Z","avatar_url":"https://github.com/Kenza-AI.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Sagify](docs/sagify@2x.png)\n\n\u003cp align=\"center\"\u003e\n    \u003cem\u003eLLMs and Machine Learning done easily.\u003c/em\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://github.com/kenza-ai/sagify/actions?query=workflow%3ACI\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://github.com/kenza-ai/sagify/workflows/CI/badge.svg\" alt=\"Test\"\u003e\n\u003c/a\u003e\n\u003c/p\u003e\n\n# sagify\n\nSagify provides a simplified interface to manage machine learning workflows on [AWS SageMaker](https://aws.amazon.com/sagemaker/), helping you focus on building ML 
models rather than infrastructure. Its modular architecture includes an LLM Gateway module that provides a unified interface to both open-source and proprietary large language models. The LLM Gateway exposes these LLMs through a simple API, so you can incorporate them into your workflows.\n\nFor a detailed reference to Sagify, see [Read the Docs](https://Kenza-AI.github.io/sagify/).\n\n## Installation\n\n### Prerequisites\n\nsagify requires the following:\n\n1. Python (3.7, 3.8, 3.9, 3.10, 3.11)\n2. [Docker](https://www.docker.com/) installed and running\n3. Configured [awscli](https://pypi.python.org/pypi/awscli)\n\n### Install sagify\n\nAt the command line:\n\n    pip install sagify\n\n\n## Getting started -  LLM Deployment with no code\n\n1. Configure your AWS account by following the instructions in the [Configure AWS Account](#configure-aws-account) section.\n\n2. Run the following command:\n\n```sh\nsagify cloud foundation-model-deploy --model-id model-txt2img-stabilityai-stable-diffusion-v2-1-base --model-version 1.* -n 1 -e ml.p3.2xlarge --aws-region us-east-1 --aws-profile sagemaker-dev\n```\n\nYou can replace the values for the EC2 instance type (`-e`), AWS region, and AWS profile with your own.\n\nOnce the Stable Diffusion model is deployed, you can use the generated code snippet to query it. 
Enjoy!\n\n## Backend Platforms\n\n### OpenAI\n\nThe following models are offered for chat completions:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|gpt-4|https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo|\n|gpt-4-32k|https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo|\n|gpt-3.5-turbo|https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo|\n\nFor image creation you can rely on the following models:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|dall-e-3|https://platform.openai.com/docs/models/dall-e|\n|dall-e-2|https://platform.openai.com/docs/models/dall-e|\n\nAnd for embeddings:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|text-embedding-3-large|https://platform.openai.com/docs/models/embeddings|\n|text-embedding-3-small|https://platform.openai.com/docs/models/embeddings|\n|text-embedding-ada-002|https://platform.openai.com/docs/models/embeddings|\n\nTo list all supported OpenAI models, run `sagify llm models --all --provider openai`. To list only chat completions models, run `sagify llm models --chat-completions --provider openai`. 
For image creation and embeddings models, run `sagify llm models --image-creations --provider openai` and `sagify llm models --embeddings --provider openai`, respectively.\n\n### Open-Source\n\nThe following open-source models are offered for chat completions:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|llama-2-7b|https://huggingface.co/meta-llama/Llama-2-7b|\n|llama-2-13b|https://huggingface.co/meta-llama/Llama-2-13b|\n|llama-2-70b|https://huggingface.co/meta-llama/Llama-2-70b|\n\nFor image creation you can rely on the following open-source models:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|stabilityai-stable-diffusion-v2|https://huggingface.co/stabilityai/stable-diffusion-2|\n|stabilityai-stable-diffusion-v2-1-base|https://huggingface.co/stabilityai/stable-diffusion-2-1-base|\n|stabilityai-stable-diffusion-v2-fp16|https://huggingface.co/stabilityai/stable-diffusion-2/tree/fp16|\n\nAnd for embeddings:\n\n| Model Name | URL |\n|:------------:|:-----:|\n|bge-large-en|https://huggingface.co/BAAI/bge-large-en|\n|bge-base-en|https://huggingface.co/BAAI/bge-base-en|\n|gte-large|https://huggingface.co/thenlper/gte-large|\n|gte-base|https://huggingface.co/thenlper/gte-base|\n|e5-large-v2|https://huggingface.co/intfloat/e5-large-v2|\n|bge-small-en|https://huggingface.co/BAAI/bge-small-en|\n|e5-base-v2|https://huggingface.co/intfloat/e5-base-v2|\n|multilingual-e5-large|https://huggingface.co/intfloat/multilingual-e5-large|\n|e5-large|https://huggingface.co/intfloat/e5-large|\n|gte-small|https://huggingface.co/thenlper/gte-small|\n|e5-base|https://huggingface.co/intfloat/e5-base|\n|e5-small-v2|https://huggingface.co/intfloat/e5-small-v2|\n|multilingual-e5-base|https://huggingface.co/intfloat/multilingual-e5-base|\n|all-MiniLM-L6-v2|https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2|\n\nAll of these open-source models are supported on AWS Sagemaker. To list them, run `sagify llm models --all --provider sagemaker`. 
To list only chat completions models, run `sagify llm models --chat-completions --provider sagemaker`. For image creation and embeddings models, run `sagify llm models --image-creations --provider sagemaker` and `sagify llm models --embeddings --provider sagemaker`, respectively.\n\n## Set up OpenAI\n\nDefine the following environment variables before you start the LLM Gateway server:\n\n- `OPENAI_API_KEY`: Your OpenAI API key. Example: `export OPENAI_API_KEY=...`.\n- `OPENAI_CHAT_COMPLETIONS_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/gpt-3-5-turbo) or [here](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo).\n- `OPENAI_EMBEDDINGS_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/embeddings).\n- `OPENAI_IMAGE_CREATION_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/dall-e).\n\n## Set up open-source LLMs\n\nThe first step is to deploy the model(s). You can deploy all backend services (chat completions, image creations, embeddings) or only some of them.\n\nTo deploy all of them, run `sagify llm start --all`. 
This command deploys all backend services (chat completions, image creations, embeddings) with the following configuration:\n\n```json\n{\n    \"chat_completions\": {\n        \"model\": \"llama-2-7b\",\n        \"instance_type\": \"ml.g5.2xlarge\",\n        \"num_instances\": 1\n    },\n    \"image_creations\": {\n        \"model\": \"stabilityai-stable-diffusion-v2-1-base\",\n        \"instance_type\": \"ml.p3.2xlarge\",\n        \"num_instances\": 1\n    },\n    \"embeddings\": {\n        \"model\": \"gte-small\",\n        \"instance_type\": \"ml.g5.2xlarge\",\n        \"num_instances\": 1\n    }\n}\n```\n\nYou can change this configuration by supplying your own config file and running `sagify llm start --all --config YOUR_CONFIG_FILE.json`.\n\nDeploying all the backend services as Sagemaker endpoints takes 15 to 30 minutes.\n\nThe deployed model names, which are the Sagemaker endpoint names, are printed out and stored in the hidden file `.sagify_llm_infra.json`. You can also access them from the AWS Sagemaker web console.\n\n## Deploy FastAPI LLM Gateway - Docker\n\nOnce you have set up your backend platform, you can deploy the FastAPI LLM Gateway locally.\n\nIf you use the AWS Sagemaker platform, define the following environment variables before you start the LLM Gateway server:\n\n- `AWS_ACCESS_KEY_ID`: It can be the same one you use locally for Sagify. It should have access to Sagemaker and S3. Example: `export AWS_ACCESS_KEY_ID=...`.\n- `AWS_SECRET_ACCESS_KEY`: It can be the same one you use locally for Sagify. It should have access to Sagemaker and S3. Example: `export AWS_SECRET_ACCESS_KEY=...`.\n- `AWS_REGION_NAME`: AWS region where the LLM backend services (Sagemaker endpoints) are deployed.\n- `S3_BUCKET_NAME`: Name of the S3 bucket where images created by the image creation backend service are stored.\n- `IMAGE_URL_TTL_IN_SECONDS`: TTL in seconds of the temporary URL to the created images. 
Default value: 3600.\n- `SM_CHAT_COMPLETIONS_MODEL`: The Sagemaker endpoint name where the chat completions model is deployed.\n- `SM_EMBEDDINGS_MODEL`: The Sagemaker endpoint name where the embeddings model is deployed.\n- `SM_IMAGE_CREATION_MODEL`: The Sagemaker endpoint name where the image creation model is deployed.\n\nIf you use the OpenAI platform, define the following environment variables before you start the LLM Gateway server:\n\n- `OPENAI_API_KEY`: Your OpenAI API key. Example: `export OPENAI_API_KEY=...`.\n- `OPENAI_CHAT_COMPLETIONS_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/gpt-3-5-turbo) or [here](https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo).\n- `OPENAI_EMBEDDINGS_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/embeddings).\n- `OPENAI_IMAGE_CREATION_MODEL`: Must be one of the values listed [here](https://platform.openai.com/docs/models/dall-e).\n\nNow you can run the command `sagify llm gateway --image sagify-llm-gateway:v0.1.0 --start-local` to start the LLM Gateway locally. You can change the name of the image via the `--image` argument.\n\nThis command outputs the Docker container ID. You can stop the container by executing `docker stop \u003cCONTAINER_ID\u003e`.\n\n**Examples**\n\n(*Remember to first export all the environment variables you need*)\n\nTo build the Docker image and then run it:\n\n```sh\nsagify llm gateway --image sagify-llm-gateway:v0.1.0 --start-local\n```\n\nTo only build the image:\n\n```sh\nsagify llm gateway --image sagify-llm-gateway:v0.1.0\n```\n\nIf you want to support both platforms (OpenAI and AWS Sagemaker), pass the environment variables for both platforms.\n\n## Deploy FastAPI LLM Gateway - AWS Fargate\n\nTo deploy the LLM Gateway to AWS Fargate, follow these general steps:\n\n1. 
Containerize the FastAPI LLM Gateway: See previous section.\n2. Push the Docker image to Amazon ECR.\n3. Define Task Definition: Define a task definition that describes how to run your containerized FastAPI application on Fargate. Specify the Docker image, container port, CPU and memory requirements, and environment variables.\n4. Create ECS Service: Create a Fargate service using the task definition. Configure the desired number of tasks, networking options, load balancing, and auto-scaling settings.\n5. Set Environment Variables: Ensure that your FastAPI application retrieves the environment variables correctly at runtime.\n\nHere's an example CloudFormation template to deploy a FastAPI service to AWS Fargate with 11 environment variables:\n\n```yaml\nResources:\n  MyFargateTaskDefinition:\n    Type: AWS::ECS::TaskDefinition\n    Properties:\n      Family: my-fargate-task\n      ContainerDefinitions:\n        - Name: fastapi-container\n          Image: \u003cYOUR_ECR_REPOSITORY_URI\u003e\n          Memory: 512\n          PortMappings:\n            - ContainerPort: 80\n          Environment:\n            - Name: AWS_ACCESS_KEY_ID\n              Value: \"value1\"\n            - Name: AWS_SECRET_ACCESS_KEY\n              Value: \"value2\"\n            - Name: AWS_REGION_NAME\n              Value: \"value3\"\n            - Name: S3_BUCKET_NAME\n              Value: \"value4\"\n            - Name: IMAGE_URL_TTL_IN_SECONDS\n              Value: \"value5\"\n            - Name: SM_CHAT_COMPLETIONS_MODEL\n              Value: \"value6\"\n            - Name: SM_EMBEDDINGS_MODEL\n              Value: \"value7\"\n            - Name: SM_IMAGE_CREATION_MODEL\n              Value: \"value8\"\n            - Name: OPENAI_CHAT_COMPLETIONS_MODEL\n              Value: \"value9\"\n            - Name: OPENAI_EMBEDDINGS_MODEL\n              Value: \"value10\"\n            - Name: OPENAI_IMAGE_CREATION_MODEL\n              Value: \"value11\"\n\n  MyFargateService:\n    Type: 
AWS::ECS::Service\n    Properties:\n      Cluster: default\n      TaskDefinition: !Ref MyFargateTaskDefinition\n      DesiredCount: 2\n      LaunchType: FARGATE\n      NetworkConfiguration:\n        AwsvpcConfiguration:\n          Subnets:\n            - \u003cYOUR_SUBNET_ID\u003e\n          SecurityGroups:\n            - \u003cYOUR_SECURITY_GROUP_ID\u003e\n```\n\n## LLM Gateway API\n\nOnce the LLM Gateway is deployed, you can access it on `HOST_NAME/docs`.\n\n### Completions\n\n```shell\ncurl --location --request POST '/v1/chat/completions' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n    \"provider\": \"sagemaker\",\n     \"messages\": [\n      {\n        \"role\": \"system\",\n        \"content\": \"you are a cook\"\n      },\n      {\n        \"role\": \"user\",\n        \"content\": \"what is the recipe of mayonnaise\"\n      }\n    ],\n    \"temperature\": 0,\n    \"max_tokens\": 600,\n    \"top_p\": 0.9,\n    \"seed\": 32\n}'\n```\n\n\u003e Example responses\n\n\u003e 200 Response\n\n```json\n{\n    \"id\": \"chatcmpl-8167b99c-f22b-4e04-8e26-4ca06d58dc86\",\n    \"object\": \"chat.completion\",\n    \"created\": 1708765682,\n    \"provider\": \"sagemaker\",\n    \"model\": \"meta-textgeneration-llama-2-7b-f-2024-02-24-08-49-32-123\",\n    \"choices\": [\n        {\n            \"index\": 0,\n            \"message\": {\n                \"role\": \"assistant\",\n                \"content\": \" Ah, a fellow foodie! Mayonnaise is a classic condiment that's easy to make and can elevate any dish. Here's my trusty recipe for homemade mayonnaise:\\n\\nIngredients:\\n\\n* 2 egg yolks\\n* 1/2 cup (120 ml) neutral-tasting oil, such as canola or grapeseed\\n* 1 tablespoon lemon juice or vinegar\\n* Salt and pepper to taste\\n\\nInstructions:\\n\\n1. In a small bowl, whisk together the egg yolks and lemon juice or vinegar until well combined.\\n2. Slowly pour in the oil while continuously whisking the mixture. 
You can do this by hand with a whisk or use an electric mixer on low speed.\\n3. Continue whisking until the mixture thickens and emulsifies, which should take about 5-7 minutes. You'll know it's ready when it reaches a thick, creamy consistency.\\n4. Taste and adjust the seasoning as needed. You can add more salt, pepper, or lemon juice to taste.\\n5. Transfer the mayonnaise to a jar or airtight container and store it in the fridge for up to 1 week.\\n\\nThat's it! Homemade mayonnaise is a great way to control the ingredients and flavor, and it's also a fun kitchen experiment. Enjoy!\"\n            }\n        }\n    ]\n}\n```\n\n### Embeddings\n\n```shell\ncurl --location --request POST '/v1/embeddings' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n  \"provider\": \"sagemaker\",\n  \"input\": [\n    \"The mayonnaise was delicious\"\n  ]\n}'\n```\n\n\u003e Example responses\n\n\u003e 200 Response\n\n```json\n{\n    \"data\": [\n        {\n            \"object\": \"embedding\",\n            \"embedding\": [\n                -0.04274585098028183,\n                0.021814687177538872,\n                -0.004705613013356924,\n                ...\n                -0.07548460364341736,\n                0.036427777260541916,\n                0.016453085467219353,\n                0.004641987383365631,\n                -0.0072729517705738544,\n                0.02343473769724369,\n                -0.002924458822235465,\n                0.0339619480073452,\n                0.005262510851025581,\n                -0.06709178537130356,\n                -0.015170316211879253,\n                -0.04612169787287712,\n                -0.012380547821521759,\n                -0.006663458421826363,\n                -0.0573800653219223,\n                0.007938326336443424,\n                0.03486081212759018,\n                0.021514462307095528\n            ],\n            \"index\": 0\n        }\n    ],\n    \"provider\": \"sagemaker\",\n    \"model\": 
\"hf-sentencesimilarity-gte-small-2024-02-24-09-24-27-341\",\n    \"object\": \"list\"\n}\n```\n\n### Image Generations\n\n```shell\ncurl --location --request POST '/v1/images/generations' \\\n--header 'Content-Type: application/json' \\\n--data-raw '{\n  \"provider\": \"sagemaker\",\n  \"prompt\": \n    \"A baby sea otter\"\n  ,\n  \"n\": 1,\n  \"width\": 512,\n  \"height\": 512,\n  \"seed\": 32,\n  \"response_format\": \"url\"\n}'\n```\n\n\u003e Example responses\n\n\u003e 200 Response\n\n```json\n{\n    \"provider\": \"sagemaker\",\n    \"model\": \"stable-diffusion-v2-1-base-2024-02-24-11-43-32-177\",\n    \"created\": 1708775601,\n    \"data\": [\n        {\n            \"url\": \"https://your-bucket.s3.amazonaws.com/31cedd17-ccd7-4cba-8dea-cb7e8b915782.png?AWSAccessKeyId=AKIAUKEQBDHITP26MLXH\u0026Signature=%2Fd1J%2FUjOWbRnP5cwtkSzYUVoEoo%3D\u0026Expires=1708779204\"\n        }\n    ]\n}\n```\n\n## Talk with the team\n\nEmail: pavlos@sagify.ai\n\n## Why did we build this\n\nWe realized that there is not a single LLM to rule them all!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenza-ai%2Fsagify","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkenza-ai%2Fsagify","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkenza-ai%2Fsagify/lists"}