{"id":37063036,"url":"https://github.com/ubermorgenland/logcost","last_synced_at":"2026-01-14T07:01:46.341Z","repository":{"id":326555765,"uuid":"1102419326","full_name":"ubermorgenland/LogCost","owner":"ubermorgenland","description":"Find and fix expensive log statements to reduce cloud logging costs","archived":false,"fork":false,"pushed_at":"2025-11-28T17:28:39.000Z","size":93,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-11-30T21:48:36.292Z","etag":null,"topics":["cost-analysis","logging","monitoring","observability","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ubermorgenland.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-23T12:35:57.000Z","updated_at":"2025-11-29T00:55:49.000Z","dependencies_parsed_at":"2025-12-02T08:01:08.678Z","dependency_job_id":null,"html_url":"https://github.com/ubermorgenland/LogCost","commit_stats":null,"previous_names":["ubermorgenland/logcost"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/ubermorgenland/LogCost","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubermorgenland%2FLogCost","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubermorgenland%2FLogCost/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repo
sitories/ubermorgenland%2FLogCost/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubermorgenland%2FLogCost/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ubermorgenland","download_url":"https://codeload.github.com/ubermorgenland/LogCost/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ubermorgenland%2FLogCost/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28412482,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T05:26:33.345Z","status":"ssl_error","status_checked_at":"2026-01-14T05:21:57.251Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cost-analysis","logging","monitoring","observability","python"],"created_at":"2026-01-14T07:01:45.666Z","updated_at":"2026-01-14T07:01:46.335Z","avatar_url":"https://github.com/ubermorgenland.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LogCost\n\n**Your cloud logging bill is $2,000/month. One debug statement in a hot path is responsible for $800 of it.**\n\nLogCost finds expensive log statements by tracking and aggregating logs at the source code level. 
Drop-in instrumentation (just `import logcost`) pinpoints which lines generate the most data, helping you cut cloud logging costs by 40-60% without guessing.\n\n**See exactly what's costing you:**\n\n```\nsrc/memory_utils.py:338\n  DEBUG: Processing step: %s\n  315 GB  |  $157.50  |  1.2M calls\n```\n\nInstead of wondering where your logging costs go, LogCost shows the exact file:line, bytes logged, cost, and call count. Fix the top offenders and save hundreds monthly.\n\n## Features\n\n- **Zero-config tracking** - Monkey-patches `logging` and `print` to measure file/line, level, message template, call count, and bytes\n- **Aggregation by location** - Logs from the same file:line:level are aggregated together, regardless of message content. This means a loop logging 1000 times shows as one entry with count=1000, not 1000 separate entries\n- **Thread-safe** - Lock-protected tracking works across concurrent requests\n- **Framework support** - Examples for Flask, FastAPI, Django, Kubernetes\n- **Export options** - JSON, CSV, Prometheus, HTML reports\n- **Cost analysis** - Compute GCP/AWS/Azure cost estimates and identify anti-patterns\n- **Performance** - Low overhead design for production use\n- **GCloud source attribution** - Preserves actual source file:line in logs, not wrapper attribution\n\n## Quick Start\n\n```bash\npip install logcost\n```\n\n```python\nimport logcost\nimport logging\n\nlogging.getLogger().setLevel(logging.INFO)\nlogging.info(\"Processing user %s\", 123)\n\nstats_file = logcost.export(\"/tmp/logcost_stats.json\")\nprint(\"Exported to\", stats_file)\n```\n\nAnalyze the results:\n\n```bash\npython -m logcost.cli analyze /tmp/logcost_stats.json --provider gcp --top 5\n```\n\n**Example Output:**\n\n```\nProvider: GCP  Currency: USD\nTotal bytes: 900,000,000,000  Estimated cost: 450.00 USD\n\nTop 5 cost drivers:\n- src/memory_utils.py:338 [DEBUG] Processing step: %s... 157.5000 USD\n- _trace.py:87 [INFO] connect_tcp.started host='api.github...' 
112.5000 USD\n- _base_client.py:452 [DEBUG] Request options: %s... 67.5000 USD\n- connectionpool.py:544 [DEBUG] %s://%s:%s \"%s %s %s\" %s %s... 45.0000 USD\n- streamable_http.py:385 [DEBUG] Sending client message: root=JSONRPCRequest... 67.5000 USD\n\nDetected anti-patterns:\n  * DEBUG level logs producing non-zero cost\n  * High-frequency logs (\u003e1000 calls) in hot paths\n  * Large payload logging (\u003e5KB per call)\n\nRecommendations:\n  * Remove DEBUG statement at src/memory_utils.py:338 - potential $157.50/month savings\n  * Silence httpcore DEBUG logging - potential $112.50/month savings\n```\n\n**Real-world example:** A typical service logging 900 GB/month (30 GB/day average) would see costs like this with GCP, with debug statements and library tracing accounting for the majority of the bill.\n\n## Installation\n\n### From PyPI (Recommended)\n\n```bash\npip install logcost\n```\n\n### From Source\n\n```bash\ngit clone https://github.com/ubermorgenland/LogCost.git\ncd LogCost\npip install -e .\n\n# Run tests\npytest tests/\n\n# Install with development dependencies\npip install -e \".[dev]\"\n```\n\n## Usage\n\n### Basic Tracking\n\n```python\nimport logcost  # auto-installs tracker on import\nimport logging\n\nlogger = logging.getLogger(__name__)\n\n# Your normal logging code\nlogger.info(\"Processing order %s\", order_id)\nlogger.debug(\"User data for %s\", user_id)\nprint(\"Debug output\")  # print() is also tracked\n\n# Export stats (automatically on exit, or manually)\nstats_path = logcost.export(\"/tmp/logcost_stats.json\")\n```\n\n### Controlling External Library Logging\n\nExternal libraries often emit DEBUG-level logs that can inflate logging costs without providing production value. 
Silence them selectively:\n\n```python\nimport logging\nimport logcost\n\n# Silence external library debug logs (production best practice)\nlogging.getLogger(\"httpcore\").setLevel(logging.WARNING)      # HTTP connection tracing\nlogging.getLogger(\"httpx\").setLevel(logging.WARNING)         # HTTP client\nlogging.getLogger(\"anthropic\").setLevel(logging.WARNING)     # Anthropic SDK\nlogging.getLogger(\"urllib3\").setLevel(logging.WARNING)       # urllib3\n\n# Your code's logs still tracked - just reduced noise from dependencies\nlogger = logging.getLogger(__name__)\nlogger.info(\"Important app event\")  # Still tracked by LogCost\n```\n\nLogCost will still track these suppressed logs if they somehow get through, but setting appropriate levels prevents the noisy ones from being generated in the first place.\n\n### Skipping Helper Modules\n\nIf you wrap logging in helper utilities:\n\n```python\nimport logcost\n\n# Ignore helper frames to attribute cost to original caller\nlogcost.ignore_module(\"myapp.logging_helpers\")\n```\n\n### Long-Running Services\n\nFor services that don't exit:\n\n```python\nimport signal\nimport logcost\n\ndef handle_sigusr1(signum, frame):\n    logcost.export(\"/tmp/logcost_snapshot.json\")\n\nsignal.signal(signal.SIGUSR1, handle_sigusr1)\n```\n\nOr use the CLI:\n\n```bash\npython -m logcost.cli capture /tmp/logcost_stats.json\n```\n\n### Slack Notifications\n\nGet proactive alerts about logging costs in your Slack channel:\n\n**Setup:**\n1. Create a Slack Incoming Webhook:\n   - Go to https://api.slack.com/messaging/webhooks\n   - Click \"Create your Slack app\" → \"Incoming Webhooks\"\n   - Activate and create a webhook for your channel\n   - Copy the webhook URL (e.g., `https://hooks.slack.com/services/T00.../B00.../XXX...`)\n\n2. 
Configure environment variables:\n```bash\nexport LOGCOST_SLACK_WEBHOOK=\"https://hooks.slack.com/services/YOUR/WEBHOOK/URL\"\nexport LOGCOST_PROVIDER=\"gcp\"  # or \"aws\", \"azure\"\nexport LOGCOST_NOTIFICATION_TOP_N=\"5\"  # number of top logs to show\n```\n\n**Usage:**\n\nAutomatic notifications with periodic flush:\n```python\nimport logcost\n\n# Start periodic flush - automatically sends Slack notifications\nlogcost.start_periodic_flush(\"/var/log/logcost/stats.json\")\n# Stats flushed every 5 minutes (LOGCOST_FLUSH_INTERVAL=300)\n# Notifications sent every 1 hour (LOGCOST_NOTIFICATION_INTERVAL=3600, configurable)\n# Optional: set LOGCOST_NOTIFICATION_TEST_DELAY (seconds) to send a one-time [Test] message after startup\n```\n\nManual notification:\n```python\nimport logcost\nfrom logcost import send_notification_if_configured\n\nstats = logcost.get_stats()\nsend_notification_if_configured(stats)  # Uses LOGCOST_SLACK_WEBHOOK env var\n```\n\n**Notification includes:**\n- Total logging cost and volume\n- Top N most expensive log statements with file:line references\n- Anti-pattern warnings (DEBUG in production, high-frequency loops, large payloads)\n- Week-over-week trend (if available)\n\n**Example Slack Notification:**\n\n```\nLogCost Report - GCP\nTotal: 900.00 GB ($450.00)\nLog calls: 2,847,000\nTrend: 📈 +12% from previous period\n\n🔥 Top 5 Most Expensive Logs:\n1. src/memory_utils.py:338 - $157.50 (315.00 GB, 1.2M calls)\n   Processing step: %s...\n2. _trace.py:87 - $112.50 (225.00 GB, 2.8M calls)\n   connect_tcp.started host='api.github...'\n3. _base_client.py:452 - $67.50 (135.00 GB, 8.4K calls)\n   Request options: %s...\n4. connectionpool.py:544 - $45.00 (90.00 GB, 1.1M calls)\n   %s://%s:%s \"%s %s %s\" %s %s...\n5. 
streamable_http.py:385 - $67.50 (135.00 GB, 850K calls)\n   Sending client message: root=JSONRPCRequest...\n\n⚠️  Warnings:\n• DEBUG level logs producing non-zero cost\n• High-frequency logs (\u003e1000 calls) in hot paths\n• Large payload logging (\u003e5KB per call)\n\nTotal logs tracked: 45 unique locations | Analyzed with LogCost\n```\n\n**Security Note:** The webhook URL is a credential - treat it like a password. Never commit it to version control. Use environment variables, Kubernetes secrets, or secrets managers.\n\n## CLI Commands\n\n### Analyze\n\nShow top expensive log statements:\n\n```bash\npython -m logcost.cli analyze stats.json --top 10 --provider gcp\n```\n\n### Report\n\nExport analysis to JSON:\n\n```bash\npython -m logcost.cli report stats.json reports/analysis.json\n```\n\n### Estimate ROI\n\nCalculate potential savings:\n\n```bash\npython -m logcost.cli estimate stats.json --reduction 0.4 --hours 12 --rate 120\n```\n\n- `--reduction`: Expected cost reduction (0.4 = 40%)\n- `--hours`: Engineering hours to fix\n- `--rate`: Hourly rate in USD\n\n### Diff\n\nCompare before/after:\n\n```bash\npython -m logcost.cli diff stats_before.json stats_after.json\n```\n\n### Capture\n\nSnapshot running service:\n\n```bash\npython -m logcost.cli capture /tmp/logcost_stats.json\n```\n\n## Framework Integration\n\n### Flask / WSGI\n\n```python\nimport logcost\nfrom flask import Flask\n\napp = Flask(__name__)\nlogger = app.logger\n\n@app.route(\"/\")\ndef hello():\n    logger.info(\"Homepage accessed\")\n    return \"Hello\"\n\nif __name__ == \"__main__\":\n    app.run()\n    # Stats exported automatically on exit\n```\n\nSee `examples/flask_app/` for the full example.\n\n### FastAPI / ASGI\n\n```python\nimport logging\n\nimport logcost\nfrom fastapi import FastAPI\n\napp = FastAPI()\nlogger = logging.getLogger(__name__)  # module-level logger used by the route below\n\n@app.get(\"/\")\nasync def root():\n    logger.info(\"Root endpoint hit\")\n    return {\"message\": \"Hello\"}\n```\n\nThe tracker works with async code since it hooks the core `logging` 
machinery.\nSee `examples/fastapi_app/` for complete demo.\n\n### Django\n\nImport in `settings.py` so tracker attaches before middleware:\n\n```python\n# settings.py\nimport logcost\n\n# ... rest of settings\n```\n\nRun your app and export stats:\n\n```bash\npython manage.py runserver\n# In another terminal\npython -m logcost.cli capture /tmp/django_logcost.json\n```\n\nSee `examples/django_app/` for full setup.\n\n### Docker \u0026 Kubernetes (Sidecar Pattern)\n\nFor production deployments, LogCost uses a sidecar architecture that separates logging from monitoring:\n\n**Architecture:**\n- **App Container**: Your application with LogCost library installed, writes stats to shared volume\n- **Sidecar Container**: LogCost monitoring container that watches stats, aggregates data, stores history, and sends notifications\n\n**Benefits:** Separation of concerns, reusable sidecar, no application code changes after setup\n\n#### Build and Publish Docker Image\n\nBuild locally:\n```bash\ncd LogCost/\ndocker build -t logcost/logcost:latest .\n```\n\nPublish to Docker Hub (requires Docker Hub account):\n```bash\n# Login to Docker Hub\ndocker login\n\n# Build and push\ndocker build -t your-username/logcost:latest .\ndocker push your-username/logcost:latest\n\n# Or build for multiple architectures (recommended)\ndocker buildx build --platform linux/amd64,linux/arm64 \\\n  -t your-username/logcost:latest \\\n  -t your-username/logcost:v0.1.0 \\\n  --push .\n```\n\nSee [DOCKER.md](DOCKER.md) for complete publishing guide including GitHub Actions automation, other registries (GCR, ECR, ACR), security scanning, and versioning strategy.\n\n#### Kubernetes Deployment\n\nAdd LogCost sidecar to your deployment:\n\n```yaml\napiVersion: apps/v1\nkind: Deployment\nmetadata:\n  name: myapp-with-logcost\nspec:\n  template:\n    spec:\n      containers:\n      # Your application\n      - name: app\n        image: your-registry/myapp:latest\n        env:\n        - name: LOGCOST_OUTPUT\n      
    value: /var/log/logcost/stats.json\n        - name: LOGCOST_FLUSH_INTERVAL\n          value: \"300\"  # 5 minutes\n        volumeMounts:\n        - name: logcost-data\n          mountPath: /var/log/logcost\n\n      # LogCost sidecar\n      - name: logcost-sidecar\n        image: logcost/logcost:latest\n        env:\n        - name: LOGCOST_NOTIFICATION_INTERVAL\n          value: \"3600\"  # 1 hour\n        - name: LOGCOST_PROVIDER\n          value: gcp  # or aws, azure\n        - name: LOGCOST_SLACK_WEBHOOK\n          valueFrom:\n            secretKeyRef:\n              name: logcost-slack-webhook\n              key: webhook-url\n        volumeMounts:\n        - name: logcost-data\n          mountPath: /var/log/logcost\n        resources:\n          requests:\n            memory: \"64Mi\"\n            cpu: \"50m\"\n          limits:\n            memory: \"128Mi\"\n            cpu: \"100m\"\n\n      volumes:\n      - name: logcost-data\n        emptyDir: {}\n```\n\nYour app code needs one line:\n\n```python\nimport logcost\nlogcost.start_periodic_flush(\"/var/log/logcost/stats.json\")\n```\n\nThe sidecar will automatically:\n- Watch for stats updates\n- Store historical snapshots (7 days retention)\n- Send hourly Slack notifications with trends\n- Detect anti-patterns (DEBUG in production, high-frequency logs, large payloads)\n\n**File Permissions (Important):** The sidecar container runs as UID 1000 and needs read access to stats.json. Solution: Use an init container to set up the shared volume with permissive permissions:\n\n```yaml\ninitContainers:\n- name: setup-logcost-perms\n  image: busybox:latest\n  command: ['sh', '-c', 'mkdir -p /var/log/logcost \u0026\u0026 chmod 777 /var/log/logcost']\n  volumeMounts:\n  - name: logcost-data\n    mountPath: /var/log/logcost\n```\n\nThis ensures both containers can read/write to the shared volume. 
See `examples/kubernetes/deployment-with-init-container.yaml` for a complete working example.\n\n## Cost Calculation\n\nThe analyzer estimates cost using:\n\n```\ncost = (bytes_emitted / 1GB) × price_per_gb\n```\n\n**Default Pricing:**\n- GCP: $0.50/GB\n- AWS: $0.57/GB\n- Azure: $0.63/GB\n\nOverride pricing:\n\n```python\nfrom logcost.analyzer import CostAnalyzer\n\nanalyzer = CostAnalyzer(stats, price_per_gb=0.75)\n```\n\nOr via CLI:\n\n```bash\npython -m logcost.cli analyze stats.json --provider gcp\n```\n\n### Anti-Pattern Detection\n\nThe analyzer flags:\n\n- **High-frequency logs** - Statements executed \u003e1,000 times (likely tight loops)\n- **Debug logs in production** - DEBUG level logs producing non-zero cost\n- **Large payloads** - Messages exceeding 5 KB per call\n\n### ROI Calculation\n\n```\npotential_savings = total_cost × reduction_percent\neffort_cost = hours_to_fix × hourly_rate\nroi = (potential_savings - effort_cost) / effort_cost\n```\n\nExample:\n\n```bash\npython -m logcost.cli estimate stats.json --reduction 0.5 --hours 8 --rate 100\n```\n\nOutput:\n```\nPotential monthly savings: $250.00\nEffort cost: $800.00\nROI: -68.75% (not worth it)\n```\n\n## FAQ\n\n**Does LogCost change my logging behavior?**\nNo. It wraps `logging.Logger._log` and `print` but always calls the original implementation after recording stats.\n\n**What about other logging libs (structlog, loguru)?**\nMost delegate to Python's `logging` module. If not, you can manually call `logcost.tracker._track_call()`. Adapters are planned.\n\n**How often should I export?**\nFor scripts, rely on the built-in `atexit` export. For long-running services, export on intervals (cron, signal handler, or sidecar) to avoid losing stats on crashes.\n\n**Is tracking configurable?**\nUse `logcost.ignore_module(\"module.prefix\")` to skip helper frames. 
Anti-pattern thresholds are constants in `logcost/analyzer.py` (PRs welcome for config support).\n\n**Performance impact?**\nDesigned for low overhead (lock-protected dict updates + string formatting). Run the benchmark to measure on your hardware:\n\n```bash\npython benchmarks/tracker_benchmark.py --iterations 100000\n```\n\n## Examples\n\n- **`examples/flask_app/`** - Classic Flask app with tracked routes\n- **`examples/fastapi_app/`** - Async FastAPI integration\n- **`examples/django_app/`** - Minimal Django project with LogCost\n- **`examples/kubernetes/`** - K8s deployment + sidecar pattern\n\n## Contributing\n\nContributions welcome! See `CONTRIBUTING.md` for guidelines.\n\n## License\n\nMIT License - see `LICENSE` file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fubermorgenland%2Flogcost","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fubermorgenland%2Flogcost","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fubermorgenland%2Flogcost/lists"}