{"id":28485662,"url":"https://github.com/jonaskahn/azure-invoice","last_synced_at":"2026-06-19T23:31:00.511Z","repository":{"id":297199170,"uuid":"995944099","full_name":"jonaskahn/azure-invoice","owner":"jonaskahn","description":"Simple app for simple life","archived":false,"fork":false,"pushed_at":"2025-06-08T15:35:59.000Z","size":300,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-01-17T05:10:29.240Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://azure-invoice.streamlit.app","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jonaskahn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-04T08:26:09.000Z","updated_at":"2025-06-08T15:36:03.000Z","dependencies_parsed_at":"2025-06-04T16:41:31.679Z","dependency_job_id":"47413797-816e-4f0e-9c4e-05a6bc3e8841","html_url":"https://github.com/jonaskahn/azure-invoice","commit_stats":null,"previous_names":["jonaskahn/azure-invoice"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/jonaskahn/azure-invoice","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonaskahn%2Fazure-invoice","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonaskahn%2Fazure-invoice/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonaskahn%2Fazure-invoice/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonaskahn%2Fazure-invoice/mani
fests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jonaskahn","download_url":"https://codeload.github.com/jonaskahn/azure-invoice/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jonaskahn%2Fazure-invoice/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34552295,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-19T02:00:06.005Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-08T00:30:38.082Z","updated_at":"2026-06-19T23:31:00.502Z","avatar_url":"https://github.com/jonaskahn.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Azure Invoice Analyzer Pro - Complete Strategy Document\n\n## Executive Summary\n\nAzure Invoice Analyzer Pro is an advanced Streamlit application that transforms Azure billing data into actionable business intelligence. 
The system provides comprehensive cost analysis through 9 business categories, interactive drill-down capabilities, and complete financial reconciliation with PDF export functionality.\n\n## Quick Chart Reference\n\n### Chart Calculation Summary\n\n| Chart Type                    | Calculation Method                            | Key Metrics                         |\n| ----------------------------- | --------------------------------------------- | ----------------------------------- |\n| **Cost Category Pie/Bar**     | `df.groupby('CostCategory')['Cost'].sum()`    | Category percentages of total cost  |\n| **Service Provider Analysis** | `df.groupby('ConsumedService')['Cost'].sum()` | Total cost per Azure service        |\n| **Resource Group Costs**      | `df.groupby('ResourceGroup')['Cost'].sum()`   | Total cost per resource group       |\n| **Machine Efficiency**        | `Cost / Quantity` per resource                | Cost per unit/hour analysis         |\n| **Interactive Drill-Down**    | Multi-level grouping + pattern matching       | Resource Group → Machine → Category |\n| **Cost Reconciliation**       | `abs(original_total - categorized_total)`     | Financial validation \u0026 accuracy     |\n\n### Core Calculations\n\n**Cost Reconciliation:**\n\n```python\noriginal_total = df['Cost'].sum()\ncategorized_total = classified_df['Cost'].sum()\ncoverage = (categorized_cost / original_total) * 100\n```\n\n**Category Classification:**\n\n```python\n# 9 business categories based on Azure service patterns\nif MeterSubcategory in ['Premium SSD Managed Disks']:\n    return 'Managed Disks'\nelif ConsumedService == 'Microsoft.Compute' and not disk_category:\n    return 'VM Compute'\n```\n\n**Machine Cost Breakdown:**\n\n```python\n# Includes related resources (disks, NICs, etc.)\nmachine_data = df[df['ResourceName'] == machine_name]\nrelated_data = df[df['ResourceName'].str.contains(machine_name)]\ntotal_cost = combined_data['Cost'].sum()\n```\n\n### Chart Types and 
Business Value\n\n| Chart                         | Purpose                                | Key Insight                         | Optimization Action                       |\n| ----------------------------- | -------------------------------------- | ----------------------------------- | ----------------------------------------- |\n| **Cost Category Pie**         | Budget allocation by business function | Which category consumes most budget | Focus optimization on largest segments    |\n| **Service Provider Bar**      | Azure service spending distribution    | Which Microsoft services cost most  | Negotiate better rates, optimize usage    |\n| **Resource Group Costs**      | Environment/team cost allocation       | Which groups spend most             | Budget allocation, cost accountability    |\n| **Machine Efficiency**        | Resource utilization analysis          | Cost per hour performance           | Right-size over/under-utilized resources  |\n| **Drill-Down Analysis**       | Granular cost investigation            | Machine-level cost breakdown        | Target specific machines for optimization |\n| **Reconciliation Validation** | Financial accuracy verification        | Data integrity and completeness     | Ensure 100% cost accountability           |\n\n## Core Architecture\n\n### Data Processing Pipeline\n\n1. **Data Ingestion**: CSV file upload with validation and error handling\n2. **Cost Classification**: 9-category business intelligence classification system\n3. **Cost Reconciliation**: Mathematical validation ensuring 100% cost accountability\n4. **Interactive Analysis**: Multi-level drill-down from resource groups to individual machines\n5. **Export Generation**: Professional PDF reports and individual chart downloads\n\n### Classification Engine\n\n#### 9 Business Cost Categories\n\n1. 
**Managed Disks** (typically 60-80% of costs)\n\n   - Premium SSD Managed Disks\n   - Standard HDD Managed Disks\n   - Standard SSD Managed Disks\n   - Ultra SSD Managed Disks\n\n2. **VM Compute** (10-25% of costs)\n\n   - Virtual machine runtime costs\n   - CPU and memory consumption\n   - Excludes storage components\n\n3. **CDN** (5-15% of costs)\n\n   - Content Delivery Network services\n   - Data transfer and caching\n\n4. **Network/IP** (3-10% of costs)\n\n   - Virtual Network services\n   - Public IP addresses\n   - Network security groups\n\n5. **Backup** (2-8% of costs)\n\n   - Recovery Services vault\n   - Backup storage and operations\n\n6. **Load Balancer** (1-5% of costs)\n\n   - Standard and Basic load balancers\n   - Application gateways\n\n7. **Other Storage** (\u003c2% of costs)\n\n   - Blob storage, File storage\n   - Table storage, Queue storage\n\n8. **Bandwidth** (\u003c1% of costs)\n\n   - Data transfer charges\n   - Inter-region communication\n\n9. **Key Vault** (\u003c1% of costs)\n   - Secret management services\n   - Certificate operations\n\n### Interactive Drill-Down System\n\n#### Three-Level Analysis Hierarchy\n\n1. **Resource Group Level**\n\n   - Total cost, machine count, usage hours\n   - High-level resource allocation view\n\n2. **Machine Level**\n\n   - Individual machine costs and usage\n   - Service provider breakdown\n   - Related resource identification\n\n3. 
**Category Level**\n   - Detailed cost breakdown by business category\n   - Service-level attribution\n   - Optimization recommendations\n\n### Enhanced Machine Analysis\n\n#### Related Resource Detection\n\n- **Pattern Matching**: Identifies associated resources using naming conventions\n- **Resource Relationships**: Links VMs with their disks, NICs, and other components\n- **Complete Cost Attribution**: Ensures all machine-related costs are captured\n\n#### Categories Include:\n\n```\nvm-name → Primary virtual machine\nvm-name-disk → Managed disks\nvm-name-nic → Network interfaces\nvm-name_OsDisk → Operating system disks\nvm-name-backup → Backup services\n```\n\n## Chart Calculations and Analysis\n\n### 1. Executive Summary \u0026 Validation\n\n#### Cost Reconciliation Engine\n\n**Mathematical Foundation:**\n\n```python\n# Core reconciliation calculation\noriginal_total = df['Cost'].sum()\ncategorized_total = classified_df['Cost'].sum()\ndifference = abs(original_total - categorized_total)\nreconciliation_success = difference \u003c 0.01\n\n# Coverage calculation\ncategorized_cost = total_cost - uncategorized_cost\ncoverage_percentage = (categorized_cost / original_total) * 100\n```\n\n**Key Metrics Explained:**\n\n- **Original Invoice Total**: Sum of all Cost column values from uploaded CSV\n- **Categorized Total**: Sum of all costs after classification processing\n- **Difference**: Mathematical variance between original and processed totals\n- **Coverage Percentage**: (Categorized costs / Original total) × 100\n- **Reconciliation Status**: ✅ Success if difference \u003c $0.01, ❌ Failed otherwise\n\n**Business Interpretation:**\n\n- **100% Coverage**: All costs properly classified, complete financial accountability\n- **95-99% Coverage**: Good classification, minor uncategorized items need review\n- **\u003c95% Coverage**: Significant classification gaps, immediate attention required\n\n#### Executive Metrics Dashboard\n\n**Calculation 
Method:**\n\n```python\ntotal_cost = df['Cost'].sum()\ntotal_quantity = df['Quantity'].sum()\nunique_resource_groups = df['ResourceGroup'].nunique()\nunique_machines = df['ResourceName'].nunique()\n```\n\n**Metrics Meaning:**\n\n- **Total Cost**: Complete Azure spending for the invoice period\n- **Total Usage**: Sum of all quantity values (typically hours)\n- **Top Category**: Highest cost category from 9-category classification\n- **Resource Groups**: Number of distinct Azure resource groups\n- **Services**: Number of unique Azure service providers (Microsoft.Compute, etc.)\n\n### 2. Cost Category Analysis\n\n#### Cost Category Pie Chart\n\n**Calculation Method:**\n\n```python\ncategory_summary = classified_df.groupby('CostCategory').agg({\n    'Cost': 'sum',\n    'Quantity': 'sum'\n}).round(4)\n\ncategory_percentage = (category_summary['Cost'] / total_cost * 100).round(2)\n```\n\n**Chart Elements:**\n\n- **Pie Segments**: Each represents one of 9 business categories\n- **Percentage Labels**: Category cost as percentage of total invoice\n- **Dollar Values**: Absolute cost amount for each category\n- **Color Coding**: Consistent colors across all charts for category identification\n\n**Business Interpretation:**\n\n- **Dominant Categories**: Largest segments indicate primary cost drivers\n- **Distribution Balance**: Even distribution suggests diversified infrastructure\n- **Anomalies**: Unusually large percentages may indicate optimization opportunities\n\n#### Cost Category Bar Chart (Detailed View)\n\n**Calculation Method:**\n\n```python\ncategory_breakdown = classified_df.groupby('CostCategory').agg({\n    'Cost': ['sum', 'count', 'mean'],\n    'Quantity': 'sum'\n})\n\nrecord_count = category_breakdown['Cost']['count']\navg_cost = category_breakdown['Cost']['mean']\n```\n\n**Data Points Explained:**\n\n- **Horizontal Bars**: Length represents total cost for each category\n- **Cost Values**: Dollar amount displayed on each bar\n- **Percentage Labels**: Category 
percentage of total costs\n- **Record Count**: Number of invoice line items in each category (hover data)\n\n**Key Insights:**\n\n- **Bar Length Comparison**: Visual representation of spending distribution\n- **High Record Count**: Many small transactions vs few large transactions\n- **Cost Concentration**: Identifies which categories drive majority of spending\n\n### 3. Service Provider Analysis\n\n#### Service Provider Chart\n\n**Calculation Method:**\n\n```python\nprovider_summary = df.groupby('ConsumedService').agg({\n    'Cost': ['sum', 'count'],\n    'Quantity': 'sum'\n}).round(4)\n\nprovider_percentage = (provider_summary['Cost'] / total_cost * 100).round(2)\n```\n\n**Chart Components:**\n\n- **X-Axis**: Azure service providers (Microsoft.Compute, Microsoft.Storage, etc.)\n- **Y-Axis**: Total cost in USD for each provider\n- **Bar Height**: Proportional to spending on each service\n- **Text Labels**: Dollar amounts displayed above bars\n\n**Service Provider Breakdown:**\n\n- **Microsoft.Compute**: Virtual machines, disks, compute resources\n- **Microsoft.Network**: Virtual networks, load balancers, IP addresses\n- **Microsoft.Storage**: Blob storage, file storage, queues\n- **Microsoft.RecoveryServices**: Backup and disaster recovery\n- **Microsoft.Cdn**: Content delivery network services\n\n**Business Analysis:**\n\n- **Compute Dominance**: High Microsoft.Compute costs indicate VM-heavy workloads\n- **Storage Intensity**: High Microsoft.Storage suggests data-intensive applications\n- **Network Costs**: Significant Microsoft.Network indicates complex networking requirements\n\n### 4. 
Interactive Drill-Down Analysis\n\n#### Resource Group Selection Metrics\n\n**Calculation Method:**\n\n```python\nrg_data = df[df['ResourceGroup'] == selected_rg]\nrg_cost = rg_data['Cost'].sum()\nrg_machines = rg_data['ResourceName'].nunique()\nrg_quantity = rg_data['Quantity'].sum()\n```\n\n**Metrics Explanation:**\n\n- **Total Cost**: Sum of all costs within selected resource group\n- **Machines**: Count of unique ResourceName values in the group\n- **Total Usage**: Sum of quantity values (usually compute hours)\n\n#### Machine Analysis Table\n\n**Calculation Method:**\n\n```python\nmachine_summary = rg_data.groupby('ResourceName').agg({\n    'Cost': 'sum',\n    'Quantity': 'sum',\n    'ConsumedService': lambda x: ', '.join(x.unique()),\n    'MeterCategory': lambda x: ', '.join(x.unique())\n})\n\ncost_percentage = (machine_summary['Cost'] / rg_cost * 100).round(2)\n```\n\n**Table Columns:**\n\n- **Machine Name**: ResourceName from Azure invoice\n- **Total Cost**: Sum of all costs for this specific machine\n- **Total Usage**: Sum of quantity values for this machine\n- **Services Used**: List of Azure services consumed by this machine\n- **Meter Categories**: Types of Azure meters associated with this machine\n- **Cost %**: This machine's percentage of total resource group costs\n\n#### Individual Machine Analysis\n\n**Calculation Method:**\n\n```python\n# Enhanced resource detection\nmachine_data = classified_df[classified_df['ResourceName'] == selected_machine]\nrelated_data = classified_df[\n    (classified_df['ResourceName'].str.contains(selected_machine, case=False)) |\n    (classified_df['ResourceName'].str.startswith(selected_machine + '-')) |\n    (classified_df['ResourceName'].str.startswith(selected_machine + '_'))\n]\n\ncombined_data = pd.concat([machine_data, related_data]).drop_duplicates()\n```\n\n**Machine Metrics:**\n\n- **Total Cost**: Sum of costs for machine and all related resources\n- **Total Usage**: Sum of quantity values across all related 
resources\n- **Categories**: Number of cost categories this machine uses\n- **Cost/Hour**: Total cost divided by total usage hours\n\n**Related Resources Detection:**\n\n- **Primary Resource**: Exact match of machine name\n- **Associated Disks**: Resources with patterns like \"vm-name-disk\", \"vm-name_OsDisk\"\n- **Network Components**: Resources like \"vm-name-nic\" (network interfaces)\n- **Backup Resources**: Resources like \"vm-name-backup\"\n\n#### Machine Cost Breakdown Charts\n\n**Pie Chart (Cost Distribution):**\n\n```python\ncategory_breakdown = combined_data.groupby('CostCategory').agg({\n    'Cost': 'sum',\n    'Quantity': 'sum'\n})\n\nmachine_percentage = (category_breakdown['Cost'] / machine_cost * 100).round(2)\n```\n\n**Elements:**\n\n- **Pie Segments**: Proportional to category costs for this machine\n- **Percentage Labels**: Category percentage of machine's total cost\n- **Dollar Values**: Absolute cost for each category\n\n**Bar Chart (Category Details):**\n\n- **Horizontal Bars**: Cost amount for each category\n- **Category Colors**: Consistent color scheme matching pie chart\n- **Hover Data**: Additional details including quantity used\n\n### 5. 
Resource Efficiency Analysis\n\n#### Efficiency Metrics Chart\n\n**Calculation Method:**\n\n```python\nefficiency_df = df.groupby('ResourceName').agg({\n    'Cost': 'sum',\n    'Quantity': 'sum'\n})\n\nefficiency_df['CostPerUnit'] = efficiency_df['Cost'] / efficiency_df['Quantity']\nefficiency_df['EfficiencyScore'] = efficiency_df['Cost'] / efficiency_df['Quantity']\n```\n\n**Dual-Axis Chart Components:**\n\n- **Left Y-Axis**: Total cost in USD (bars)\n- **Right Y-Axis**: Cost per unit/hour (line)\n- **X-Axis**: Resource names (top 15 by cost)\n- **Bar Height**: Total spending on each resource\n- **Line Points**: Efficiency score (cost per hour)\n\n**Efficiency Interpretation:**\n\n- **High Cost + Low Efficiency**: Expensive resources with high cost per hour\n- **High Cost + High Efficiency**: Expensive but efficient resources\n- **Low Cost + Low Efficiency**: Cheap resources but poor cost per hour ratio\n- **Low Cost + High Efficiency**: Cost-effective resources\n\n**Optimization Targets:**\n\n- **High Efficiency Score**: Resources with cost per hour above average\n- **Right-Sizing Candidates**: Resources with consistently high efficiency scores\n- **Optimization Opportunities**: Resources showing poor cost per unit ratios\n\n### 6. 
Traditional Resource Analysis\n\n#### Cost by Resource Group Chart\n\n**Calculation Method:**\n\n```python\ncost_by_rg = df.groupby('ResourceGroup')['Cost'].sum().sort_values(ascending=False)\n```\n\n**Chart Elements:**\n\n- **X-Axis**: Resource group names\n- **Y-Axis**: Total cost in USD\n- **Bar Height**: Proportional to total spending per resource group\n- **Text Labels**: Dollar amounts displayed above each bar\n\n**Business Usage:**\n\n- **Environment Comparison**: Compare prod, dev, test environment costs\n- **Department Allocation**: Understand which teams/projects drive costs\n- **Budget Attribution**: Assign costs to specific business units\n\n#### Top Machines by Cost Chart\n\n**Calculation Method:**\n\n```python\ncost_by_machine = df.groupby('ResourceName')['Cost'].sum().sort_values(ascending=False)\ntop_machines = cost_by_machine.head(Config.TOP_ITEMS_COUNT)\n```\n\n**Analysis Focus:**\n\n- **Top N Machines**: Configurable from 5-50 most expensive resources\n- **Cost Ranking**: Machines ordered by total spending\n- **Optimization Targets**: Highest-cost machines for immediate attention\n\n#### Cost vs Usage Comparison Chart\n\n**Calculation Method:**\n\n```python\nagg_data = df.groupby('ResourceGroup').agg({\n    'Cost': 'sum',\n    'Quantity': 'sum'\n})\n```\n\n**Dual-Axis Visualization:**\n\n- **Primary Y-Axis (Bars)**: Total cost per resource group\n- **Secondary Y-Axis (Line)**: Total usage hours per resource group\n- **Correlation Analysis**: Relationship between cost and usage\n\n**Business Insights:**\n\n- **High Cost + Low Usage**: Potentially over-provisioned resources\n- **Low Cost + High Usage**: Efficient resource utilization\n- **Correlation Patterns**: Expected vs unexpected cost-usage relationships\n\n### 7. 
Uncategorized Items Analysis\n\n#### Uncategorized Cost Metrics\n\n**Calculation Method:**\n\n```python\nuncategorized_items = classified_df[classified_df['CostCategory'] == 'Other']\nuncategorized_cost = uncategorized_items['Cost'].sum()\nuncategorized_percentage = (uncategorized_cost / total_cost * 100)\n```\n\n**Key Indicators:**\n\n- **Uncategorized Cost**: Dollar amount not classified into business categories\n- **Percentage of Total**: Uncategorized amount as percentage of total invoice\n- **Number of Items**: Count of line items that couldn't be classified\n- **Status Assessment**: ✅ Excellent (\u003c1%), ⚠️ Good (1-5%), ❌ Needs Review (\u003e5%)\n\n#### Service Type Breakdown\n\n**Calculation Method:**\n\n```python\nservice_breakdown = uncategorized_items.groupby([\n    'ConsumedService', 'MeterCategory', 'MeterSubcategory'\n]).agg({\n    'Cost': 'sum',\n    'ResourceName': 'count'\n})\n```\n\n**Table Analysis:**\n\n- **Service Provider**: Azure service generating uncategorized costs\n- **Meter Category**: Type of Azure service meter\n- **Meter Subcategory**: Specific service subtype\n- **Total Cost**: Dollar amount for this service combination\n- **Item Count**: Number of invoice lines for this service type\n- **% of Uncategorized**: Percentage of uncategorized costs from this service\n\n### 8. 
Enhanced Service Breakdown (Machine Level)\n\n#### Comprehensive Service Analysis\n\n**Calculation Method:**\n\n```python\nenhanced_breakdown = all_machine_data.groupby([\n    'ConsumedService', 'MeterCategory', 'MeterSubcategory'\n]).agg({\n    'Cost': 'sum',\n    'Quantity': 'sum',\n    'ResourceName': lambda x: ', '.join(x.unique())\n})\n```\n\n**Detailed Breakdown Components:**\n\n- **Service Provider**: Azure service (Microsoft.Compute, Microsoft.Storage)\n- **Meter Category**: Service category (Virtual Machines, Storage, Network)\n- **Meter Subcategory**: Specific service type (Premium SSD, D2s v3, etc.)\n- **Cost**: Total spending for this service type\n- **Quantity**: Total usage amount (hours, GB, transactions)\n- **Resource Names**: Which specific resources use this service\n- **Suggested Category**: What business category this should be classified as\n\n#### Storage Analysis for Machines\n\n**Calculation Method:**\n\n```python\nstorage_data = all_machine_data[\n    all_machine_data['MeterCategory'].str.contains('Storage|Disk', case=False)\n]\n\nstorage_breakdown = storage_data.groupby(['MeterSubcategory', 'ResourceName']).agg({\n    'Cost': 'sum',\n    'Quantity': 'sum'\n})\n\nstorage_percentage = (storage_cost / total_machine_cost * 100)\n```\n\n**Storage Metrics:**\n\n- **Storage Type**: Specific disk type (Premium SSD, Standard HDD)\n- **Resource Name**: Individual disk or storage resource\n- **Cost**: Dollar amount for this storage component\n- **Quantity**: Storage usage (typically GB-hours or disk-hours)\n- **Storage Percentage**: Storage costs as percentage of total machine costs\n\n**Optimization Insights:**\n\n- **Disk Tier Analysis**: Premium vs Standard SSD usage\n- **Storage Efficiency**: Cost per GB analysis\n- **Right-Sizing Opportunities**: Over-provisioned storage identification\n\n## Technical Implementation\n\n### Required CSV 
Structure\n\n```\nDate,Cost,Quantity,ResourceGroup,ResourceName,ConsumedService,MeterCategory,MeterSubcategory\n2024-01-01,45.67,720,prod-rg,web-server-01,Microsoft.Compute,Virtual Machines,D2s v3\n2024-01-01,89.23,744,prod-rg,web-server-01-disk,Microsoft.Compute,Storage,Premium SSD Managed Disks\n```\n\n### Classification Logic Implementation\n\n```python\n# Managed Disks Classification\nif MeterSubcategory in ['Premium SSD Managed Disks', 'Standard HDD Managed Disks']:\n    return 'Managed Disks'\n\n# VM Compute Classification\nif ConsumedService == 'Microsoft.Compute' and MeterSubcategory not in disk_categories:\n    return 'VM Compute'\n\n# Network Classification\nif MeterCategory == 'Virtual Network':\n    return 'Network/IP'\n```\n\n### Cost Reconciliation Engine\n\n- **Mathematical Validation**: Original total = Sum of categorized costs\n- **Coverage Tracking**: Percentage of costs successfully categorized\n- **Gap Identification**: Uncategorized items analysis and recommendations\n- **Error Detection**: Data quality issues and missing classifications\n\nThis comprehensive strategy document provides the technical foundation and business rationale for implementing Azure Invoice Analyzer Pro as a critical tool for Azure cost management and financial governance.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonaskahn%2Fazure-invoice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjonaskahn%2Fazure-invoice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjonaskahn%2Fazure-invoice/lists"}