{"id":19308694,"url":"https://github.com/mtulio/cloud-costs-parser","last_synced_at":"2026-02-27T09:37:27.248Z","repository":{"id":89055212,"uuid":"300126569","full_name":"mtulio/cloud-costs-parser","owner":"mtulio","description":"PROPOSAL: The Cloud Costs processor (Import/Export to common DB)","archived":false,"fork":false,"pushed_at":"2020-10-01T04:26:54.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-02-24T03:17:44.707Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mtulio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-01T03:17:51.000Z","updated_at":"2020-10-01T04:26:56.000Z","dependencies_parsed_at":"2023-06-13T17:54:18.258Z","dependency_job_id":null,"html_url":"https://github.com/mtulio/cloud-costs-parser","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mtulio/cloud-costs-parser","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtulio%2Fcloud-costs-parser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtulio%2Fcloud-costs-parser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtulio%2Fcloud-costs-parser/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtulio%2Fcloud-costs-parser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mtuli
o","download_url":"https://codeload.github.com/mtulio/cloud-costs-parser/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mtulio%2Fcloud-costs-parser/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29889474,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-27T08:34:21.514Z","status":"ssl_error","status_checked_at":"2026-02-27T08:32:38.035Z","response_time":57,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-10T00:16:07.901Z","updated_at":"2026-02-27T09:37:27.219Z","avatar_url":"https://github.com/mtulio.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# cloud-costs-parser (State:Proposal)\n\nThe Cloud Costs processor: an importer/exporter to a unified database.\n\n## Description\n\nCloud Costs parser imports cost data from CSV files produced by cloud providers (initially AWS and Azure) and exports it to a common database - preferably NoSQL.\n\n## Command (SPEC suggested)\n\n### Parser `parser`\n\nCommand to start the parser.\n\n#### Subcommand\n\n- `--vendor`\n\nThe cloud provider, used to create the metadata field `vendor`\n\n- `--in`\n\nThe CSV file path. 
The source driver could be discovered from the source (s3, gzip, csv, blob, etc.)\n\nExample of AWS CUR v2: `--in s3://my-bucker-reported-by-cur/path/to/monthlyReport/20201001-20201101`\n\nWill try to read all CSV files inside the path. See the `--latest` option to save extra processing.\n\n- `--latest`\n\nRead only the latest data. By default, latest means D-1 (the last day). See `--latest-days N`\n\nThis option could save extra operations on the destination storage (fewer reads/comparisons/upserts)\n\n- `--latest-days N`\n\nOverrides `--latest` to define how many days of data to parse. By default it will get N days from now until the end of the file.\n\nIf the monthly range being processed is older than the current month, an option could be created to consider N days from the last day of that month. (v2)\n\nThis option could save extra operations on the destination storage (fewer reads/comparisons/upserts)\n\n- `--override-char --override-from CHAR_ORIGIN --override-to CHAR_DEST`\n\nEnable character override in field names. Useful when the destination storage does not support special characters in field names, or when you just want a better view without nesting the first level of dictionaries on NoSQL databases (it reduces query complexity).\n\nExample on AWS CUR v2: `--override-char --override-from / --override-to _`\n\nThe field `identity/LineItemId` in the CSV will be saved as `identity_LineItemId`\n\n- `--key-fields`\n\nComma-separated fields that will be used to build the UID (`_key`) of the cost item.\n\nFor example, on AWS CUR the following fields could be used to identify the cost item across the whole file: `identity_LineItemId,identity_TimeInterval`\n\n- `--transform-field-data`\n\nDefault: `false`\n\nFor the AWS vendor, the `identity/TimeInterval` field (current day interval of the cost item) will be split into `identity/TimeStart` and `identity/TimeEnd`.\n\nThis option could be very helpful for queries on the database.\n\n\n- `--filter-keys KEYS`\n\nDefault: `None`\n\nComma-separated keys to be filtered on 
the destination database - could save space, but could limit the insights.\n\n- `--out`\n\nThe output storage; could be MySQL, PGSQL, ArangoDB, S3.\n\n\nExample:\n\n`--out arangodb://server:5432/database/collection?username=x,password=y`\n\n`--out psql://server:5432/database/table?username=x,password=y`\n\n`--out aztables://stgAccount/TableName?storageKey=x`\n\n- `--out-builk-size N`\n\nDefault: 100\n\nBy default, writes to the output storage are done in bulk inserts of 100 items; set this option to change the size.\n\n- `--out-indexes FIELDS`\n\nThe field names on which to create an index on the destination storage (if applicable)\n\n- `--out-meta`\n\nDisplay the metadata fields.\n\nThe fields inserted by the parser:\n\n`_key`: the unique cost item identifier. Could be the PK on relational DBs.\n\n`_vendor`: the vendor from the source file\n\n`_datetime`: datetime of the import\n\n### Fields `fields`\n\nExtract the fields from the source file and show the output (without parsing).\n\nSubcommands allowed for this command:\n\n- `--in`\n\nThe output will be the fields in the input file and the metadata fields.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtulio%2Fcloud-costs-parser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmtulio%2Fcloud-costs-parser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmtulio%2Fcloud-costs-parser/lists"}