https://github.com/wisemanIV/strands-costguard

This package provides a Strands-native “cost guard” layer for AI agents, focused on controlling and attributing LLM and tool spend within Strands-based workflows. It enforces budgets and policies at run time while emitting persistent, OpenTelemetry-style metrics so teams can monitor and optimize cost across tenants, strands, and workflows.
Host: GitHub
URL: https://github.com/wisemanIV/strands-costguard
Owner: wisemanIV
License: other
Created: 2025-12-09T13:16:53.000Z (6 months ago)
Default Branch: main
Last Pushed: 2025-12-10T17:46:17.000Z (6 months ago)
Last Synced: 2025-12-10T17:58:54.957Z (6 months ago)
Topics: aws-cost-management, aws-cost-saving, strands, strands-agents
Language: Python
Homepage:
Size: 246 KB
Stars: 1
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE

Awesome Lists containing this project

awesome-strands-agents - Strands CostGuard - tenant tracking, and policy-based spending controls | [wisemanIV/strands-costguard](https://github.com/wisemanIV/strands-costguard) | DevOps & Operations | (Community Projects / For PyPI Packages)

README

# Strands CostGuard

A cost management library for the [Strands Agents SDK](https://github.com/strands-agents/sdk-python) with budget enforcement, adaptive model routing, and OpenTelemetry-compatible metrics.

## Features

- **Budget Enforcement**: Define budgets at tenant, strand, workflow, and run levels with configurable limits and actions

- **Adaptive Model Routing**: Automatically route to fallback models based on budget utilization and other conditions

- **Cost Tracking**: Track and attribute costs by tenant, strand, workflow, run, model, and tool

- **OpenTelemetry Metrics**: Emit cost metrics compatible with OTel collectors for long-term storage and analysis

- **Flexible Policies**: Configure via YAML files or environment variables

- **Persistent Budget State**: Optional Valkey/Redis persistence for budget state across restarts

## Requirements

- Python 3.10+

- [Strands Agents SDK](https://github.com/strands-agents/sdk-python) 0.1.0+

## Installation

```bash
pip install strands-costguard
```

For persistence support:

```bash
# Quote the extra so shells such as zsh do not expand the brackets
pip install "strands-costguard[valkey]"
```

## Quick Start

### Running the Examples

```bash
# From a clone of the repository, install the package in development mode
pip install -e .

# Run the basic usage example
python examples/basic_usage.py
```

### Basic Usage

```python
from strands_costguard import (
    CostGuard,
    CostGuardConfig,
    FilePolicySource,
    ModelUsage,
)

# Initialize Cost Guard
config = CostGuardConfig(
    policy_source=FilePolicySource(path="./policies"),
    enable_budget_enforcement=True,
    enable_routing=True,
    enable_metrics=True,
)
guard = CostGuard(config=config)

# Start a run
decision = guard.on_run_start(
    tenant_id="prod-tenant",
    strand_id="analytics_assistant",
    workflow_id="data_analysis",
    run_id="run-123",
)

if not decision.allowed:
    print(f"Run rejected: {decision.reason}")
else:
    # Execute your agent loop...

    # Before model calls
    model_decision = guard.before_model_call(
        run_id="run-123",
        model_name="gpt-4o",
        stage="planning",
        prompt_tokens_estimate=500,
    )

    # Use the effective model (may be downgraded)
    effective_model = model_decision.effective_model

    # After model calls
    guard.after_model_call(
        run_id="run-123",
        usage=ModelUsage.from_response(
            model_name=effective_model,
            prompt_tokens=500,
            completion_tokens=200,
        ),
    )

    # End the run
    guard.on_run_end("run-123", "completed")

# Shutdown (flushes metrics)
guard.shutdown()
```

## Configuration

### Budget Policies (budgets.yaml)

```yaml
budgets:
  - id: "tenant-default"
    scope: "tenant"
    match:
      tenant_id: "*"
    period: "monthly"
    max_cost: 1000.0
    soft_thresholds: [0.7, 0.9, 1.0]
    hard_limit: true
    on_soft_threshold_exceeded: "DOWNGRADE_MODEL"
    on_hard_limit_exceeded: "REJECT_NEW_RUNS"

  - id: "analytics-strand"
    scope: "strand"
    match:
      strand_id: "analytics_assistant"
    period: "daily"
    max_cost: 50.0
    max_runs_per_period: 1000
    max_concurrent_runs: 100
    constraints:
      max_iterations_per_run: 8
      max_tool_calls_per_run: 20
      max_model_tokens_per_run: 30000
```

### Routing Policies (routing.yaml)

```yaml
routing_policies:
  - id: "default-routing"
    match:
      strand_id: "*"
    stages:
      - stage: "planning"
        default_model: "gpt-4o-mini"
        max_tokens: 2000
      - stage: "synthesis"
        default_model: "gpt-4o"
        fallback_model: "gpt-4o-mini"
        trigger_downgrade_on:
          soft_threshold_exceeded: true
          remaining_budget_below: 5.0
```
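To make the downgrade trigger concrete, here is a minimal sketch of how a stage entry like `synthesis` above could be evaluated. The `StagePolicy` dataclass and `choose_model` function are hypothetical and not part of the Strands CostGuard API. The sketch also assumes that the two `trigger_downgrade_on` conditions are combined with a logical OR, which the configuration itself does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StagePolicy:
    """Hypothetical in-memory form of one `stages` entry from routing.yaml."""

    stage: str
    default_model: str
    fallback_model: Optional[str] = None
    downgrade_on_soft_threshold: bool = False
    remaining_budget_below: Optional[float] = None


def choose_model(
    policy: StagePolicy, soft_threshold_exceeded: bool, remaining_budget: float
) -> str:
    """Return the fallback model if any trigger fires (assumed OR semantics)."""
    if policy.fallback_model is None:
        return policy.default_model
    triggered = policy.downgrade_on_soft_threshold and soft_threshold_exceeded
    if policy.remaining_budget_below is not None:
        triggered = triggered or remaining_budget < policy.remaining_budget_below
    return policy.fallback_model if triggered else policy.default_model


synthesis = StagePolicy("synthesis", "gpt-4o", "gpt-4o-mini", True, 5.0)
print(choose_model(synthesis, soft_threshold_exceeded=False, remaining_budget=20.0))
print(choose_model(synthesis, soft_threshold_exceeded=False, remaining_budget=3.0))
```

With 20.0 remaining the sketch keeps `gpt-4o`. With 3.0 remaining it falls back to `gpt-4o-mini`, because the remaining budget is below the configured 5.0.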

### Pricing Table (pricing.yaml)

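```yaml
# Illustrative values. These figures look like per-1M-token list prices,
# so confirm they match the per-1K unit implied by the key names.
pricing:
  currency: "USD"
  models:
    "gpt-4o":
      input_per_1k: 2.50
      output_per_1k: 10.00
    "gpt-4o-mini":
      input_per_1k: 0.15
      output_per_1k: 0.60
  tools:
    "web_search":
      cost_per_call: 0.01
```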
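As a rough illustration of how a per-1K-token pricing table translates into a cost figure, the sketch below prices a single model call. The `PRICING` dict and the `estimate_model_cost` helper are hypothetical and not part of the Strands CostGuard API. The formula (tokens / 1000 × rate) is an assumption inferred from the `input_per_1k` and `output_per_1k` key names, and the rates are copied from the example above.

```python
# Hypothetical in-memory copy of the model rates from pricing.yaml above.
PRICING = {
    "gpt-4o": {"input_per_1k": 2.50, "output_per_1k": 10.00},
    "gpt-4o-mini": {"input_per_1k": 0.15, "output_per_1k": 0.60},
}


def estimate_model_cost(model_name: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Assumed formula: (tokens / 1000) * rate, summed over input and output."""
    rates = PRICING[model_name]
    input_cost = prompt_tokens / 1000 * rates["input_per_1k"]
    output_cost = completion_tokens / 1000 * rates["output_per_1k"]
    return input_cost + output_cost


# Same token counts as the Basic Usage example: 500 prompt, 200 completion.
cost = estimate_model_cost("gpt-4o", prompt_tokens=500, completion_tokens=200)
print(f"{cost:.2f} USD")
```

Under these assumptions the call costs 1.25 for input plus 2.00 for output, or 3.25 USD in total. A quick calculation like this is a useful sanity check that the rates in the table use the unit you intend.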

## Lifecycle Hooks

Cost Guard integrates with your agent runtime via lifecycle hooks:

| Hook | When Called | Returns |
|------|-------------|---------|
| `on_run_start()` | Before starting a new run | `AdmissionDecision` |
| `on_run_end()` | After a run completes | None |
| `before_iteration()` | Before each agent loop iteration | `IterationDecision` |
| `after_iteration()` | After each iteration completes | None |
| `before_model_call()` | Before each model call | `ModelDecision` |
| `after_model_call()` | After each model call | None |
| `before_tool_call()` | Before each tool call | `ToolDecision` |
| `after_tool_call()` | After each tool call | None |
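The hooks nest in the order an agent loop runs. The sketch below records only hook names to show one plausible call sequence. `run_agent` and its `record` callback are hypothetical stand-ins, not the library's API, and the assumption of one model call and one tool call per iteration is made purely for illustration.

```python
from typing import Callable, List


def run_agent(record: Callable[[str], None], iterations: int = 2) -> None:
    """Record the name of each hook at the point a runtime would call it."""
    record("on_run_start")
    for _ in range(iterations):
        record("before_iteration")
        record("before_model_call")
        # ... call the model here ...
        record("after_model_call")
        record("before_tool_call")
        # ... call a tool here ...
        record("after_tool_call")
        record("after_iteration")
    record("on_run_end")


calls: List[str] = []
run_agent(calls.append, iterations=1)
print(calls)
```

In a real integration each `before_*` hook returns a decision object (see the table above) that the runtime should inspect before proceeding.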

## OpenTelemetry Metrics

### Enabling OTLP Export

To export metrics to an OpenTelemetry collector, configure StrandsTelemetry before initializing CostGuard:

```python
from strands.telemetry.config import StrandsTelemetry
from strands_costguard import CostGuard, CostGuardConfig, FilePolicySource

# Configure telemetry with OTLP export
telemetry = StrandsTelemetry()
telemetry.setup_otlp_exporter(endpoint="http://localhost:4317")
telemetry.setup_meter(enable_otlp_exporter=True)

# Initialize CostGuard (will use the global MeterProvider)
config = CostGuardConfig(
    policy_source=FilePolicySource(path="./policies"),
    enable_metrics=True,
)
guard = CostGuard(config=config)
```

**Requirements:**

- An OpenTelemetry collector running at the specified endpoint (default: `localhost:4317`)

- For local development, you can run a collector with Docker:

  ```bash
  docker run -p 4317:4317 otel/opentelemetry-collector:latest
  ```

**Disabling OTLP Export:**

If you don't have a collector running, disable OTLP export to avoid connection errors:

```python
telemetry.setup_meter(enable_otlp_exporter=False)
```

### Metrics Reference

Cost Guard emits the following metrics:

| Metric | Type | Description |
|--------|------|-------------|
| `genai.cost.total` | Counter | Total cost in currency units |
| `genai.cost.model` | Counter | Cost per model |
| `genai.cost.tool` | Counter | Cost per tool |
| `genai.tokens.input` | Counter | Total input tokens |
| `genai.tokens.output` | Counter | Total output tokens |
| `genai.agent.iterations` | Counter | Agent loop iterations |
| `genai.agent.tool_calls` | Counter | Tool calls |
| `genai.cost.downgrade_events` | Counter | Model downgrade events |
| `genai.cost.rejection_events` | Counter | Run rejection events |

Metrics include resource attributes:

- `service.name`, `service.namespace`, `deployment.environment`

- `strands.tenant_id`, `strands.strand_id`, `strands.workflow_id`
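Because each data point carries these attributes, a metrics backend can break cost down per tenant, strand, or workflow. The sketch below mimics that aggregation in plain Python. The `Point` tuple layout, the `total_by` helper, and the sample values are made up for illustration and are not produced by the library.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical data point: (metric name, value, attributes).
Point = Tuple[str, float, Dict[str, str]]


def total_by(points: Iterable[Point], metric: str, attribute: str) -> Dict[str, float]:
    """Sum one counter's values, grouped by a single attribute."""
    totals: Dict[str, float] = defaultdict(float)
    for name, value, attrs in points:
        if name == metric:
            totals[attrs.get(attribute, "unknown")] += value
    return dict(totals)


points = [
    ("genai.cost.total", 1.25, {"strands.tenant_id": "prod-tenant"}),
    ("genai.cost.total", 2.0, {"strands.tenant_id": "prod-tenant"}),
    ("genai.cost.total", 0.5, {"strands.tenant_id": "dev-tenant"}),
    ("genai.tokens.input", 500.0, {"strands.tenant_id": "prod-tenant"}),
]
print(total_by(points, "genai.cost.total", "strands.tenant_id"))
```

In practice this grouping would be expressed as a query in whatever backend receives the OTLP data rather than in application code.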

## Budget Scopes and Priority

Budgets can be defined at multiple scopes, with higher-priority scopes taking precedence:

1. **Global** (lowest priority) - Default limits for all

2. **Tenant** - Organization-level limits

3. **Strand** - Agent definition limits

4. **Workflow** (highest priority) - Specific workflow limits

When multiple budgets match, constraints are merged with more specific budgets taking priority.
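One way to picture the merge is to apply matching budgets from lowest to highest priority so that more specific values overwrite general ones. The sketch below does this with plain dicts. `SCOPE_PRIORITY` and `merge_constraints` are hypothetical names, and the per-key override is an assumed reading of "merged", not confirmed library behavior.

```python
from typing import Any, Dict, List

# Priority order taken from the list above (higher number wins).
SCOPE_PRIORITY = {"global": 0, "tenant": 1, "strand": 2, "workflow": 3}


def merge_constraints(budgets: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge constraints key by key; more specific scopes overwrite general ones."""
    merged: Dict[str, Any] = {}
    for budget in sorted(budgets, key=lambda b: SCOPE_PRIORITY[b["scope"]]):
        merged.update(budget.get("constraints", {}))
    return merged


matching = [
    {"scope": "strand", "constraints": {"max_iterations_per_run": 8}},
    {"scope": "global", "constraints": {"max_iterations_per_run": 20,
                                        "max_tool_calls_per_run": 50}},
]
print(merge_constraints(matching))
```

Here the strand budget overrides `max_iterations_per_run` (8 instead of 20), while `max_tool_calls_per_run` is inherited from the global budget.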

## Threshold Actions

When budget soft thresholds are exceeded:

| Action | Effect |
|--------|--------|
| `LOG_ONLY` | Log warning, continue normally |
| `DOWNGRADE_MODEL` | Switch to fallback models |
| `LIMIT_CAPABILITIES` | Reduce max tokens/iterations |
| `HALT_NEW_RUNS` | Reject new runs |

When hard limits are exceeded:

| Action | Effect |
|--------|--------|
| `HALT_RUN` | Stop the current run |
| `REJECT_NEW_RUNS` | Reject new runs only |
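The following sketch shows one way the soft and hard checks could fit together. `evaluate_budget` is a hypothetical helper, not part of the library. It assumes that soft thresholds are fractions of `max_cost`, that the hard limit sits at 100% utilization, and that "exceeded" means greater than or equal to the threshold.

```python
from typing import List, Optional


def evaluate_budget(
    spent: float,
    max_cost: float,
    soft_thresholds: List[float],
    hard_limit: bool,
    on_soft: str,
    on_hard: str,
) -> Optional[str]:
    """Return the action to apply, or None while below every threshold."""
    utilization = spent / max_cost
    # The hard limit is checked first so it takes precedence over soft actions.
    if hard_limit and utilization >= 1.0:
        return on_hard
    if any(utilization >= threshold for threshold in soft_thresholds):
        return on_soft
    return None


# Values from the "tenant-default" budget in budgets.yaml.
for spent in (650.0, 750.0, 1000.0):
    action = evaluate_budget(
        spent, 1000.0, [0.7, 0.9, 1.0], True, "DOWNGRADE_MODEL", "REJECT_NEW_RUNS"
    )
    print(spent, action)
```

With these inputs, 650.0 yields no action, 750.0 yields `DOWNGRADE_MODEL`, and 1000.0 yields `REJECT_NEW_RUNS`.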

## Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Type checking
mypy src/

# Linting
ruff check src/
```

## License

Apache-2.0
