Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/lilili-bot/mlops-infra-ta
fastapi service observability with Grafana Cloud and OpenTelemetry, gke for infra.
https://github.com/lilili-bot/mlops-infra-ta
fastapi gke-terraform grafana observability opentelemetry
Last synced: 28 days ago
JSON representation
fastapi service observability with Grafana Cloud and OpenTelemetry, gke for infra.
- Host: GitHub
- URL: https://github.com/lilili-bot/mlops-infra-ta
- Owner: lilili-bot
- Created: 2024-09-09T09:54:40.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2024-09-09T19:14:59.000Z (about 2 months ago)
- Last Synced: 2024-09-27T06:42:50.566Z (about 1 month ago)
- Topics: fastapi, gke-terraform, grafana, observability, opentelemetry
- Language: Smarty
- Homepage: https://lilili-bot.github.io/mlops-infra-ta/
- Size: 331 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# FastAPI OpenTelemetry Project
## Summary
This project demonstrates how to effectively integrate FastAPI with OpenTelemetry to achieve enhanced observability. By combining FastAPI with OpenTelemetry and Grafana, you gain several advantages:
- **Simplified Instrumentation:** The project includes minimal customized instrumentation with OpenTelemetry, showcasing how easy it is to get started.
- **Flexibility:** The setup allows for flexible and extensible monitoring infrastructure management, enabling easy adjustments and scalability.
- **Enhanced Observability:** Leverage distributed tracing and monitoring to gain deep insights into your application's performance and behavior.
- **Near Real-time Visualization:** Use Grafana dashboards to visualize metrics and traces in real-time, providing quick feedback and alerting capabilities.## Table of Contents
- [Prerequisites](#prerequisites)
- [Infrastructure Setup](#infrastructure-setup)
- [Seceret management](#secret-management)
- [oTelCol deployment](#deploy-otel-collector-daemonset)
- [Service Deployment](#service-deployment)
- [Monitoring and Tracing](#monitoring-and-tracing)
- [To do](#to-do)## Prerequisites
Before you begin, ensure you have the following installed:
- [Terraform](https://www.terraform.io/downloads.html)
- [Docker](https://www.docker.com/products/docker-desktop)
- [GCP SDK](https://cloud.google.com/sdk/install)
- [kubectl](https://kubernetes.io/docs/tasks/tools/)
- [Google Artifactory](https://cloud.google.com/artifact-registry/docs)
- [GCS](https://cloud.google.com/storage/docs)## Infrastructure Setup
1. **Authenticate with GCP:**
```sh
gcloud auth login
gcloud auth application-default login
```
2. **Set Environment Variables:**Export the necessary environment variables for Terraform:
```sh
export TF_VAR_gcp_credentials_json="path/to/your/credentials.json"
export TF_VAR_project_id="your-gcp-project-id"
export TF_VAR_region="your-gcp-region"
```
3. **Initialize Terraform:**
```sh
terraform init -backend-config="gke-infra/terraform/backend_configs/staging-backend-config.hcl"
```3. **Create Infrastructure:**
```sh
terraform apply -var-file="gke-infra/terraform/envs/staging.tfvars"
```This will create the necessary GKE cluster, deploy the OpenTelemetry Collector and the Grafana instance.
## Secret Management
Before deploying the FastAPI service, create the necessary secrets to pass them as variables to the service.
1. **Create Secrets** (example using `kubectl`):
In this repo, all the gke resources are build namespace otel-col. You could use your own namespace, such as monitoring-demo.
```sh
kubectl create secret generic otel-endpoint --from-literal=OTEL_EXPORTER_OTLP_ENDPOINT=
--namespace=otel-colkubectl create secret generic grafana-auth --from-literal=username="" --from-literal=password="" -n
```
note: the otel-col endpoint is ..svc.cluster.local## Deploy otel-collector daemonset
```
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector -f values.yaml
```
Verify if the pods are created.
```
./check_pod_all_logs.sh
```## Service Deployment
1. **Build and Push Docker Image:**
```sh
docker build --platform linux/amd64 -t fastapi-otel:v2.0 .
docker tag fastapi-otel:v2.0 /
docker push /
```2. **Deploy to GKE Cluster:**
Update the `deployment.yaml` file to include the correct image path from Google Artifactory.
```sh
cd gke-infra/deploy/fast_api
kubectl apply -f deployment.yaml
```
deploy debug pod with traffic simulation script
```
kubectl apply -f alpine-pod.yaml
kubectl apply -f traffic-simulation-configmap.yaml
```
login to the pod
```
kubectl exec -it apline-curl -n otel-col -- sh
cd script
traffic_simulation.sh
```
You should see some 200 and 422 responses from service. To check details you could check the log of the service pod. The metics which we custimised with open-telemetry export also to pod log, for debuging purpose.
```
kubectl logs -f -n otel-col
```## Monitoring and Tracing
The FastAPI service is instrumented with OpenTelemetry, allowing for distributed tracing and monitoring.
1. **Configure OpenTelemetry Collector:**
See gke-infra/deploy/otel-collector/values.yaml
2. **Access Metrics in Grafana cloud:**
Assume you have already Grafana Cloud set up. Stack created, credential set up, enabled OpenTelemetry Connection
Reference documentation: https://grafana.com/docs/grafana-cloud/send-data/otlp/send-data-otlp/4. **Configure Grafana Datasource:**
* There will be datasource automatically available based on the connection set in pervious step.
* For metrics, go to default datasource connection, details see below image.
![Image description](readme.png)5. **Create Dashboards:**
You can now create custom dashboards in Grafana to visualize the metrics collected by OpenTelemetry.## To do
* Set up alert via Grafana API.
* Customised more instrumentation for fastapi as well as for possible backends, such as postgress, redis, rocksdb.
* Migrate all GCP deployment to terraform.