{"id":29403618,"url":"https://github.com/parry-97/fastapi-transformers","last_synced_at":"2026-04-14T03:32:48.499Z","repository":{"id":300418398,"uuid":"1006075184","full_name":"Parry-97/fastapi-transformers","owner":"Parry-97","description":"Simple `FastAPI` + `transformers` inference built on Azure, K8s, Terraform","archived":false,"fork":false,"pushed_at":"2025-07-01T21:24:31.000Z","size":82,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-01T22:26:49.154Z","etag":null,"topics":["azure","fastapi","kubernetes","terraform","transformers"],"latest_commit_sha":null,"homepage":"","language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Parry-97.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-21T12:42:15.000Z","updated_at":"2025-07-01T21:24:35.000Z","dependencies_parsed_at":"2025-06-21T15:46:55.509Z","dependency_job_id":null,"html_url":"https://github.com/Parry-97/fastapi-transformers","commit_stats":null,"previous_names":["parry-97/fastapi-transformers"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Parry-97/fastapi-transformers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Parry-97%2Ffastapi-transformers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Parry-97%2Ffastapi-transformers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Parry-97%2Ffastapi-transformers/releases","manifests_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/Parry-97%2Ffastapi-transformers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Parry-97","download_url":"https://codeload.github.com/Parry-97/fastapi-transformers/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Parry-97%2Ffastapi-transformers/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264637862,"owners_count":23642063,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["azure","fastapi","kubernetes","terraform","transformers"],"created_at":"2025-07-10T19:00:47.574Z","updated_at":"2026-04-14T03:32:48.494Z","avatar_url":"https://github.com/Parry-97.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"# fastapi-transformers\n\nA minimal FastAPI web API for text generation using Hugging Face Transformers models, served with [Ray Serve](https://www.ray.io/ray-serve) for scalable model serving. 
This project provides a `/text/simple-gen` endpoint that generates text completions using a default text generation pipeline.\n\n## Features\n\n- REST API for text generation (using Hugging Face `pipeline(\"text-generation\")`)\n- Scalable model serving with [Ray Serve](https://www.ray.io/ray-serve) on a [Ray Cluster](https://www.ray.io/ray-core)\n- Deployed on Kubernetes and managed by the [KubeRay](https://github.com/ray-project/kuberay) operator\n- FastAPI-based, easily extendable and documented (provides OpenAPI/Swagger docs out of the box)\n- Configured for easy Docker deployment using [uv](https://github.com/astral-sh/uv) for ultra-fast Python package management\n- Works with PyTorch (CPU) by default\n- Infrastructure-as-code for Azure provisioning using Terraform, with Kubernetes manifests for AKS-based app deployment\n\n## Requirements\n\n- Python 3.13+ (see `pyproject.toml`)\n- Or Docker\n\n**Note on Python and Ray versions:** There is a potential version conflict. `pyproject.toml` specifies Python 3.13+ and `ray\u003e=2.50.0`, while a comment in `infra/k8s/rayservice.yml` suggests `ray==2.46.0` which is not compatible with Python 3.13. This README assumes the versions in `pyproject.toml` are correct.\n\n## Installation\n\n### Native\n\n1. Install [uv](https://github.com/astral-sh/uv) (`pip install uv` or use pre-built binaries).\n2. Sync the dependencies:\n\n   ```bash\n   uv sync\n   ```\n\n3. Run the app with Ray Serve:\n\n   ```bash\n   serve run serve_app:deployment_graph\n   ```\n\n### With Docker\n\nYou can build and run the container as follows:\n\n```bash\ndocker build -t fastapi-transformers .\ndocker run -p 8000:8000 fastapi-transformers\n```\n\n## Usage\n\nThe API is served by Ray Serve and exposes the following endpoint:\n\n### `POST /text/simple-gen`\n\nGenerates text from provided input text. 
Uses the default text-generation pipeline from Hugging Face transformers (e.g., `gpt2` or equivalent, depending on environment/model cache).\n\n- **Request Body**:\n\n  ```json\n  {\n    \"input\": \"Once upon a time\"\n  }\n  ```\n\n- **Response**:\n\n  ```json\n  [\n    {\n      \"generated_text\": \"Once upon a time...\"\n    }\n  ]\n  ```\n\n  (output format depends on the underlying model)\n\n#### Example with curl\n\n```bash\ncurl -X POST http://localhost:8000/text/simple-gen -H 'Content-Type: application/json' -d '{\"input\":\"Hello, world!\"}'\n```\n\n## API Docs\n\n- Once running, see Swagger UI at [http://localhost:8000/docs](http://localhost:8000/docs)\n- The OpenAPI schema is available at [http://localhost:8000/openapi.json](http://localhost:8000/openapi.json)\n\n## Project Structure\n\n```\n.\n├── serve_app.py                  # Ray Serve application entrypoint\n├── Dockerfile                    # Docker container configuration\n├── pyproject.toml, uv.lock       # Project dependencies (managed by uv)\n├── infra/\n│   ├── azure/terraform/          # Terraform for Azure resources (AKS, ACR)\n│   └── k8s/\n│       └── rayservice.yml        # RayService manifest for deploying the app on K8s\n└── routers/\n    ├── models/\n    │   └── text_gen/\n    │       └── simple_input.py   # Data model for text generation input\n    └── text/\n        └── __init__.py\n```\n\n## Extending\n\n- To add new models or pipelines, create new Ray Serve deployments in `serve_app.py`.\n- To change the default model, override the `pipeline(\"text-generation\")` call in the `TextGenService` class with your desired model, e.g. 
`pipeline(\"text-generation\", model=\"gpt2\")`.\n\n## Infrastructure Deployment\n\n### Azure Infrastructure via Terraform\n\nThe Terraform configurations are located at `infra/azure/terraform` and provision the following Azure resources:\n- Resource Group\n- Azure Container Registry (ACR)\n- Azure Kubernetes Service (AKS) cluster\n\nTo deploy the infrastructure, ensure you have the Azure CLI installed and are logged in:\n```bash\naz login\n```\nThen, from the Terraform directory:\n```bash\ncd infra/azure/terraform\nterraform init\nterraform plan -out=tfplan\nterraform apply tfplan\n```\n\nAfter deployment, view the outputs (e.g., resource group and AKS cluster names):\n```bash\nterraform output\n```\n\n### Accessing the AKS Cluster\n\nConfigure `kubectl` to connect to the new AKS cluster:\n```bash\naz aks get-credentials --resource-group $(terraform output -raw rg_name) --name $(terraform output -raw aks_name)\nkubectl get nodes\n```\n\n### Deploying the Application on AKS with KubeRay\n\nThe application is deployed as a `RayService` on the AKS cluster. This requires the KubeRay operator to be installed on the cluster.\n\n1.  **Install the KubeRay operator:**\n\n    Follow the instructions in the [KubeRay documentation](https://ray-project.github.io/kuberay/deploy/helm-chart/) to install the operator using Helm.\n\n2.  **Deploy the RayService:**\n\n    The Kubernetes manifest is located at `infra/k8s/rayservice.yml`. Review and adjust the `image` field to match your ACR, then deploy:\n    ```bash\n    kubectl apply -f infra/k8s/rayservice.yml\n    ```\n\n3.  **Verify the deployment:**\n\n    Check the status of the RayService and the pods:\n    ```bash\n    kubectl get rayservice\n    kubectl get pods\n    ```\n\n    To access the application, you will need to port-forward the Ray Serve service:\n    ```bash\n    kubectl port-forward service/fastapi-transformer-service-head-svc 8000:8000\n    ```\n\n## License\n\nThis project is for educational/starter purposes. 
No license is explicitly specified.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparry-97%2Ffastapi-transformers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparry-97%2Ffastapi-transformers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparry-97%2Ffastapi-transformers/lists"}