{"id":13807525,"url":"https://github.com/Blueswen/fastapi-observability","last_synced_at":"2025-05-14T00:31:39.181Z","repository":{"id":40738153,"uuid":"481317492","full_name":"blueswen/fastapi-observability","owner":"blueswen","description":"Observe FastAPI app with three pillars of observability: Traces (Tempo), Metrics (Prometheus), Logs (Loki) on Grafana through OpenTelemetry and OpenMetrics.","archived":false,"fork":false,"pushed_at":"2025-05-11T13:46:29.000Z","size":17110,"stargazers_count":808,"open_issues_count":3,"forks_count":119,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-05-12T19:18:52.555Z","etag":null,"topics":["fastapi","grafana","loki","observability","openmetrics","opentelemetry","prometheus","tempo"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/blueswen.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"ko_fi":"blueswen"}},"created_at":"2022-04-13T17:47:12.000Z","updated_at":"2025-05-11T20:02:42.000Z","dependencies_parsed_at":"2023-01-25T13:46:09.520Z","dependency_job_id":"bd6084fe-3e86-441e-8945-3624311132ce","html_url":"https://github.com/blueswen/fastapi-observability","commit_stats":{"total_commits":36,"total_committers":2,"mean_commits":18.0,"dds":0.02777777777777779,"last_synced_commit":"b54a27480c1489d92a65d40a0236c3ea9a76e49d"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blueswen%2Ffastapi-observability","t
ags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blueswen%2Ffastapi-observability/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blueswen%2Ffastapi-observability/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/blueswen%2Ffastapi-observability/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/blueswen","download_url":"https://codeload.github.com/blueswen/fastapi-observability/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254046332,"owners_count":22005573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fastapi","grafana","loki","observability","openmetrics","opentelemetry","prometheus","tempo"],"created_at":"2024-08-04T01:01:26.402Z","updated_at":"2025-05-14T00:31:34.164Z","avatar_url":"https://github.com/blueswen.png","language":"Python","readme":"# FastAPI with Observability\n\nObserve the FastAPI application with three pillars of observability on [Grafana](https://github.com/grafana/grafana):\n\n1. Traces with [Tempo](https://github.com/grafana/tempo) and [OpenTelemetry Python SDK](https://github.com/open-telemetry/opentelemetry-python)\n2. Metrics with [Prometheus](https://prometheus.io/) and [Prometheus Python Client](https://github.com/prometheus/client_python)\n3. 
Logs with [Loki](https://github.com/grafana/loki)\n\n![Observability Architecture](./images/observability-arch.jpg)\n\n## Table of contents\n- [FastAPI with Observability](#fastapi-with-observability)\n  - [Table of contents](#table-of-contents)\n  - [Quick Start](#quick-start)\n  - [Explore with Grafana](#explore-with-grafana)\n    - [Metrics to Traces](#metrics-to-traces)\n    - [Traces to Logs](#traces-to-logs)\n    - [Logs to Traces](#logs-to-traces)\n  - [Detail](#detail)\n    - [FastAPI Application](#fastapi-application)\n      - [Traces and Logs](#traces-and-logs)\n      - [Span Inject](#span-inject)\n      - [Metrics](#metrics)\n      - [OpenTelemetry Instrumentation](#opentelemetry-instrumentation)\n    - [Prometheus - Metrics](#prometheus---metrics)\n      - [Prometheus Config](#prometheus-config)\n      - [Grafana Data Source](#grafana-data-source)\n    - [Tempo - Traces](#tempo---traces)\n      - [Grafana Data Source](#grafana-data-source-1)\n    - [Loki - Logs](#loki---logs)\n      - [Loki Docker Driver](#loki-docker-driver)\n      - [Grafana Data Source](#grafana-data-source-2)\n    - [Grafana](#grafana)\n  - [Reference](#reference)\n\n## Quick Start\n\n1. Install [Loki Docker Driver](https://grafana.com/docs/loki/latest/clients/docker-driver/)\n\n   ```bash\n   docker plugin install grafana/loki-docker-driver:2.9.2 --alias loki --grant-all-permissions\n   ```\n\n2. Start all services with docker-compose\n\n   ```bash\n   docker-compose up -d\n   ```\n\n   If you get the error message `Error response from daemon: error looking up logging plugin loki: plugin loki found but disabled`, run the following command to enable the plugin:\n\n   ```bash\n   docker plugin enable loki\n   ```\n\n3. 
Send requests with [siege](https://linux.die.net/man/1/siege) and curl to the FastAPI app\n\n   ```bash\n   bash request-script.sh\n   bash trace.sh\n   ```\n\n   Or you can use [Locust](https://locust.io/) to send requests:\n\n   ```bash\n   # install locust first with `pip install locust` if you don't have it\n   locust -f locustfile.py --headless --users 10 --spawn-rate 1 -H http://localhost:8000\n   ```\n\n   Or you can send requests with [k6](https://k6.io/):\n\n   ```bash\n   k6 run --vus 1 --duration 300s k6-script.js\n   ```\n\n4. Check the predefined dashboard `FastAPI Observability` on Grafana at [http://localhost:3000/](http://localhost:3000/) and log in with `admin:admin`\n\n   Dashboard screenshot:\n\n   ![FastAPI Monitoring Dashboard](./images/dashboard.png)\n\n   The dashboard is also available on [Grafana Dashboards](https://grafana.com/grafana/dashboards/16110).\n\n## Explore with Grafana\n\nGrafana can correlate a specific action in a service across traces, metrics, and logs through the trace ID and exemplars.\n\n![Observability Correlations](./images/observability-correlations.jpeg)\n\nImage Source: [Grafana](https://grafana.com/blog/2021/03/31/intro-to-exemplars-which-enable-grafana-tempos-distributed-tracing-at-massive-scale/)\n\n### Metrics to Traces\n\nGet the Trace ID from an exemplar in metrics, then query it in Tempo.\n\nQuery: `histogram_quantile(.99,sum(rate(fastapi_requests_duration_seconds_bucket{app_name=\"app-a\", path!=\"/metrics\"}[1m])) by(path, le))`\n\n![Metrics to Traces](./images/metrics-to-traces.png)\n\n### Traces to Logs\n\nGet the Trace ID and tags (here, `compose_service`) defined in the Tempo data source from a span, then query with Loki.\n\n![Traces to Logs](./images/traces-to-logs.png)\n\n### Logs to Traces\n\nGet the Trace ID from a log line (regex defined in the Loki data source), then query it in Tempo.\n\n![Logs to Traces](./images/logs-to-traces.png)\n\n## Detail\n\n### FastAPI Application\n\nFor a more complex scenario, we use 
three FastAPI applications with the same code in this demo. There is a cross-service action in the `/chain` endpoint, which provides a good example of how to use the OpenTelemetry SDK and how Grafana presents trace information.\n\n#### Traces and Logs\n\nWe use [OpenTelemetry Python SDK](https://github.com/open-telemetry/opentelemetry-python) to send trace info with gRPC to Tempo. Each request span contains other child spans when using OpenTelemetry instrumentation. The reason is that instrumentation will catch each internal ASGI interaction ([opentelemetry-python-contrib issue #831](https://github.com/open-telemetry/opentelemetry-python-contrib/issues/831#issuecomment-1005163018)). If you want to get rid of the internal spans, there is a [workaround](https://github.com/open-telemetry/opentelemetry-python-contrib/issues/831#issuecomment-1116225314) in the same issue #831 by using a new OpenTelemetry middleware with two overridden methods for span processing.\n\nWe use [OpenTelemetry Logging Instrumentation](https://opentelemetry-python-contrib.readthedocs.io/en/latest/instrumentation/logging/logging.html) to replace the default logger format with one that includes the trace ID and span ID.\n\n```py\n# fastapi_app/utils.py\n\ndef setting_otlp(app: ASGIApp, app_name: str, endpoint: str, log_correlation: bool = True) -\u003e None:\n    # Setting OpenTelemetry\n    # set the service name to show in traces\n    resource = Resource.create(attributes={\n        \"service.name\": app_name, # for Tempo to distinguish source\n        \"compose_service\": app_name # as a query criteria for Trace to logs\n    })\n\n    # set the tracer provider\n    tracer = TracerProvider(resource=resource)\n    trace.set_tracer_provider(tracer)\n\n    tracer.add_span_processor(BatchSpanProcessor(\n        OTLPSpanExporter(endpoint=endpoint)))\n\n    if log_correlation:\n        LoggingInstrumentor().instrument(set_logging_format=True)\n\n    FastAPIInstrumentor.instrument_app(app, 
tracer_provider=tracer)\n```\n\nThe following image shows the span info sent to Tempo and queried on Grafana. The span info provided by `FastAPIInstrumentor` includes the trace ID (17785b4c3d530b832fb28ede767c672c), span ID (d410eb45cc61f442), service name (app-a), custom attributes (service.name=app-a, compose_service=app-a), and so on.\n\n![Span Information](./images/span-info.png)\n\nThe log format with trace ID and span ID, as set by `LoggingInstrumentor`:\n\n```txt\n%(asctime)s %(levelname)s [%(name)s] [%(filename)s:%(lineno)d] [trace_id=%(otelTraceID)s span_id=%(otelSpanID)s resource.service.name=%(otelServiceName)s] - %(message)s\n```\n\nThe following image is what the logs look like.\n\n![Log With Trace ID And Span ID](./images/log-format.png)\n\n#### Span Inject\n\nIf you want other services to use the same Trace ID, you have to use the `inject` function to add the current span information to the request headers. Because OpenTelemetry FastAPI instrumentation only takes care of the ASGI app's request and response, it does not affect any other modules or actions like sending HTTP requests to other servers or function calls.\n\n```py\n# fastapi_app/main.py\n\nimport httpx\nfrom opentelemetry.propagate import inject\n\n@app.get(\"/chain\")\nasync def chain(response: Response):\n\n    headers = {}\n    inject(headers)  # inject trace info to header\n\n    async with httpx.AsyncClient() as client:\n        await client.get(\"http://localhost:8000/\", headers=headers)\n    async with httpx.AsyncClient() as client:\n        await client.get(f\"http://{TARGET_ONE_HOST}:8000/io_task\", headers=headers)\n    async with httpx.AsyncClient() as client:\n        await client.get(f\"http://{TARGET_TWO_HOST}:8000/cpu_task\", headers=headers)\n\n    return {\"path\": \"/chain\"}\n```\n\nAlternatively, we can use the [instrumentation library for HTTPX](https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation/opentelemetry-instrumentation-httpx) to instrument HTTPX. 
The following example uses OpenTelemetry HTTPX Instrumentation, which automatically injects trace info into the request headers.\n\n```py\nimport httpx\nfrom opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor\n\nHTTPXClientInstrumentor().instrument()\n\n@app.get(\"/chain\")\nasync def chain(response: Response):\n    async with httpx.AsyncClient() as client:\n        await client.get(\"http://localhost:8000/\")\n    async with httpx.AsyncClient() as client:\n        await client.get(f\"http://{TARGET_ONE_HOST}:8000/io_task\")\n    async with httpx.AsyncClient() as client:\n        await client.get(f\"http://{TARGET_TWO_HOST}:8000/cpu_task\")\n\n    return {\"path\": \"/chain\"}\n```\n\n#### Metrics\n\nUse [Prometheus Python Client](https://github.com/prometheus/client_python) to generate OpenMetrics format metrics with [exemplars](https://github.com/prometheus/client_python#exemplars) and expose them on `/metrics` for Prometheus.\n\nTo add an exemplar to metrics, we retrieve the trace ID from the current span and pass it as an exemplar dict to the Histogram or Counter metrics.\n\n```py\n# fastapi_app/utils.py\n\nfrom opentelemetry import trace\nfrom prometheus_client import Histogram\n\nREQUESTS_PROCESSING_TIME = Histogram(\n    \"fastapi_requests_duration_seconds\",\n    \"Histogram of requests processing time by path (in seconds)\",\n    [\"method\", \"path\", \"app_name\"],\n)\n\n# retrieve trace id for exemplar\nspan = trace.get_current_span()\ntrace_id = trace.format_trace_id(\n      span.get_span_context().trace_id)\n\nREQUESTS_PROCESSING_TIME.labels(method=method, path=path, app_name=self.app_name).observe(\n      after_time - before_time, exemplar={'TraceID': trace_id}\n)\n```\n\nBecause exemplars are a new data type proposed in [OpenMetrics](https://github.com/OpenObservability/OpenMetrics/blob/main/specification/OpenMetrics.md#exemplars), `/metrics` has to use `CONTENT_TYPE_LATEST` and `generate_latest` from 
`prometheus_client.openmetrics.exposition` module instead of the `prometheus_client` module. Otherwise, with the wrong `generate_latest`, the exemplars attached to Counter and Histogram metrics will never show up, and with the wrong `CONTENT_TYPE_LATEST`, Prometheus scraping will fail.\n\n```py\n# fastapi_app/utils.py\n\nfrom prometheus_client import REGISTRY\nfrom prometheus_client.openmetrics.exposition import CONTENT_TYPE_LATEST, generate_latest\n\ndef metrics(request: Request) -\u003e Response:\n    return Response(generate_latest(REGISTRY), headers={\"Content-Type\": CONTENT_TYPE_LATEST})\n```\n\nMetrics with exemplars:\n\n![Metrics With Exemplars](./images/metrics-with-exemplars.png)\n\n#### OpenTelemetry Instrumentation\n\nThere are two methods to add trace information to spans and logs using the OpenTelemetry Python SDK:\n\n1. [Code-based Instrumentation](https://opentelemetry.io/docs/languages/python/instrumentation/): This involves adding trace information to spans, logs, and metrics using the OpenTelemetry Python SDK. It requires more coding effort but allows for the addition of exemplars to metrics. We employ this approach in this project.\n2. [Zero-code Instrumentation](https://opentelemetry.io/docs/zero-code/python/): This method automatically instruments a Python application using instrumentation libraries, but only when the [frameworks and libraries](https://github.com/open-telemetry/opentelemetry-python-contrib/tree/main/instrumentation#readme) in use are supported. It simplifies the process by eliminating the need for manual code changes. However, it does not allow for the addition of exemplars to metrics. 
For more insights into zero-code instrumentation, refer to my other project, [OpenTelemetry APM](https://github.com/blueswen/opentelemetry-apm?tab=readme-ov-file#python---fastapi).\n\n### Prometheus - Metrics\n\nCollects metrics from applications.\n\n#### Prometheus Config\n\nDefine the metrics scrape jobs for all FastAPI applications in `etc/prometheus/prometheus.yml`.\n\n```yaml\n...\nscrape_configs:\n  - job_name: 'app-a'\n    scrape_interval: 5s\n    static_configs:\n      - targets: ['app-a:8000']\n  - job_name: 'app-b'\n    scrape_interval: 5s\n    static_configs:\n      - targets: ['app-b:8000']\n  - job_name: 'app-c'\n    scrape_interval: 5s\n    static_configs:\n      - targets: ['app-c:8000']\n```\n\n#### Grafana Data Source\n\nAdd an exemplar setting that uses the value of the `TraceID` label to create a Tempo link.\n\nGrafana data source setting example:\n\n![Data Source of Prometheus: Exemplars](./images/prometheus-exemplars.png)\n\nGrafana data sources config example:\n\n```yaml\nname: Prometheus\ntype: prometheus\ntypeName: Prometheus\naccess: proxy\nurl: http://prometheus:9090\npassword: ''\nuser: ''\ndatabase: ''\nbasicAuth: false\nisDefault: true\njsonData:\n  exemplarTraceIdDestinations:\n    - datasourceUid: tempo\n      name: TraceID\n  httpMethod: POST\nreadOnly: false\neditable: true\n```\n\n### Tempo - Traces\n\nReceives spans from applications.\n\n#### Grafana Data Source\n\n[Trace to logs](https://grafana.com/docs/grafana/latest/datasources/tempo/#trace-to-logs) setting:\n\n1. Data source: target log source\n2. Tags: keys of tags or process-level attributes from the trace, which are used as log query criteria if the key exists in the trace\n3. Map tag names: Converts an existing key of tags or process-level attributes from the trace to another key, which is then used as log query criteria. 
Use this feature when the values of the trace tag and log label are identical but the keys are different.\n\nGrafana data source setting example:\n\n![Data Source of Tempo: Trace to logs](./images/tempo-trace-to-logs.png)\n\nGrafana data sources config example:\n\n```yaml\nname: Tempo\ntype: tempo\ntypeName: Tempo\naccess: proxy\nurl: http://tempo\npassword: ''\nuser: ''\ndatabase: ''\nbasicAuth: false\nisDefault: false\njsonData:\n  nodeGraph:\n    enabled: true\n  tracesToLogs:\n    datasourceUid: loki\n    filterBySpanID: false\n    filterByTraceID: true\n    mapTagNamesEnabled: false\n    tags:\n      - compose_service\nreadOnly: false\neditable: true\n```\n\n### Loki - Logs\n\nCollects logs from all services with Loki Docker Driver.\n\n#### Loki Docker Driver\n\n1. Use the [YAML anchor and alias](https://support.atlassian.com/bitbucket-cloud/docs/yaml-anchors/) feature to set logging options for each service.\n2. Set [Loki Docker Driver options](https://grafana.com/docs/loki/latest/clients/docker-driver/configuration/)\n   1. loki-url: loki service endpoint\n   2. 
loki-pipeline-stages: processes multiline logs from the FastAPI application with multiline and regex stages ([reference](https://grafana.com/docs/loki/latest/clients/promtail/stages/multiline/))\n\n```yaml\nx-logging: \u0026default-logging # anchor(\u0026): 'default-logging' defines a chunk of configuration\n  driver: loki\n  options:\n    loki-url: 'http://localhost:3100/api/prom/push'\n    loki-pipeline-stages: |\n      - multiline:\n          firstline: '^\\d{4}-\\d{2}-\\d{2} \\d{1,2}:\\d{2}:\\d{2}'\n          max_wait_time: 3s\n      - regex:\n          expression: '^(?P\u003ctime\u003e\\d{4}-\\d{2}-\\d{2} \\d{1,2}:\\d{2}:\\d{2},\\d{3}) (?P\u003cmessage\u003e(?s:.*))$$'\n# Use $$ (double-dollar sign) when your configuration needs a literal dollar sign.\n\nversion: \"3.4\"\n\nservices:\n   foo:\n      image: foo\n      logging: *default-logging # alias(*): refer to 'default-logging' chunk\n```\n\n#### Grafana Data Source\n\nAdd a TraceID derived field to extract the trace ID from log lines and create a Tempo link from it.\n\nGrafana data source setting example:\n\n![Data Source of Loki: Derived fields](./images/loki-derive-filed.png)\n\nGrafana data source config example:\n\n```yaml\nname: Loki\ntype: loki\ntypeName: Loki\naccess: proxy\nurl: http://loki:3100\npassword: ''\nuser: ''\ndatabase: ''\nbasicAuth: false\nisDefault: false\njsonData:\n  derivedFields:\n    - datasourceUid: tempo\n      matcherRegex: (?:trace_id)=(\\w+)\n      name: TraceID\n      url: $${__value.raw}\n      # Use $$ (double-dollar sign) when your configuration needs a literal dollar sign.\nreadOnly: false\neditable: true\n```\n\n### Grafana\n\n1. Add Prometheus, Tempo, and Loki as data sources with the config file `etc/grafana/datasource.yml`.\n2. 
Load predefined dashboard with `etc/dashboards.yaml` and `etc/dashboards/fastapi-observability.json`.\n\n```yaml\n# grafana in docker-compose.yaml\ngrafana:\n   image: grafana/grafana:10.4.2\n   volumes:\n      - ./etc/grafana/:/etc/grafana/provisioning/datasources # data sources\n      - ./etc/dashboards.yaml:/etc/grafana/provisioning/dashboards/dashboards.yaml # dashboard setting\n      - ./etc/dashboards:/etc/grafana/dashboards # dashboard json files directory\n```\n\n## Reference\n\n1. [FastAPI Traces Demo](https://github.com/softwarebloat/python-tracing-demo)\n2. [Waber - A Uber-like (Car-Hailing APP) cloud-native application with OpenTelemetry](https://github.com/Johnny850807/Waber)\n3. [Intro to exemplars, which enable Grafana Tempo’s distributed tracing at massive scale](https://grafana.com/blog/2021/03/31/intro-to-exemplars-which-enable-grafana-tempos-distributed-tracing-at-massive-scale/)\n4. [Trace discovery in Grafana Tempo using Prometheus exemplars, Loki 2.0 queries, and more](https://grafana.com/blog/2020/11/09/trace-discovery-in-grafana-tempo-using-prometheus-exemplars-loki-2.0-queries-and-more/)\n5. [The New Stack (TNS) observability app](https://github.com/grafana/tns)\n6. [Don’t Repeat Yourself with Anchors, Aliases and Extensions in Docker Compose Files](https://medium.com/@kinghuang/docker-compose-anchors-aliases-extensions-a1e4105d70bd)\n7. [How can I escape a $ dollar sign in a docker compose file?](https://stackoverflow.com/a/40621373)\n8. [Tempo Trace to logs tags discussion](https://community.grafana.com/t/need-to-customize-tempo-option-for-trace-logs-with-loki/59612)\n9. [Starlette Prometheus](https://github.com/perdy/starlette-prometheus)\n10. [Tempo Example](https://github.com/grafana/tempo/tree/main/example/docker-compose/local)\n","funding_links":["https://ko-fi.com/blueswen"],"categories":["13. 
Examples and Sandbox's","Projects"],"sub_categories":["Anomalies Detection","Open Source Projects"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlueswen%2Ffastapi-observability","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FBlueswen%2Ffastapi-observability","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FBlueswen%2Ffastapi-observability/lists"}