{"id":18638500,"url":"https://github.com/anthonyharrison/mlbomdoc","last_synced_at":"2025-04-11T10:31:30.299Z","repository":{"id":215134414,"uuid":"726777121","full_name":"anthonyharrison/mlbomdoc","owner":"anthonyharrison","description":"Document generator for ML-BOM (ML Bill of Materials)","archived":false,"fork":false,"pushed_at":"2024-07-28T17:30:10.000Z","size":20,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2024-09-30T00:13:19.971Z","etag":null,"topics":["ai","cyclonedx","mlbom","supply-chain","transparency"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/anthonyharrison.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null},"funding":{"github":["anthonyharrison"],"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":null,"custom":null}},"created_at":"2023-12-03T11:24:43.000Z","updated_at":"2024-08-15T01:57:41.000Z","dependencies_parsed_at":"2024-03-25T21:32:25.822Z","dependency_job_id":"b0cac9de-3d1d-4b61-ac2f-c325f514e350","html_url":"https://github.com/anthonyharrison/mlbomdoc","commit_stats":null,"previous_names":["anthonyharrison/mlbomdoc"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anthonyharrison%2Fmlbomdoc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anthonyharriso
n%2Fmlbomdoc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anthonyharrison%2Fmlbomdoc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/anthonyharrison%2Fmlbomdoc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/anthonyharrison","download_url":"https://codeload.github.com/anthonyharrison/mlbomdoc/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223440184,"owners_count":17145334,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","cyclonedx","mlbom","supply-chain","transparency"],"created_at":"2024-11-07T05:42:07.762Z","updated_at":"2024-11-07T05:42:08.236Z","avatar_url":"https://github.com/anthonyharrison.png","language":"Python","funding_links":["https://github.com/sponsors/anthonyharrison"],"categories":[],"sub_categories":[],"readme":"# MLBOMDoc\n\nMLBOMDOC is a human-readable document generator for an ML-BOM (ML Bill of Materials). MLBOMs document Machine Learning model components\nwhich are typically contained within an SBOM (Software Bill of Materials). MLBOMs are supported for [CycloneDX](https://www.cyclonedx.org).\n\n## Installation\n\nTo install use the following command:\n\n`pip install mlbomdoc`\n\nAlternatively, just clone the repo and install dependencies using the following command:\n\n`pip install -U -r requirements.txt`\n\nThe tool requires Python 3 (3.8+). It is recommended to use a virtual python environment especially\nif you are using different versions of python. 
`virtualenv` is a tool for setting up virtual Python environments, which\nallows you to have all the dependencies for the tool set up in a single environment, or have different environments set\nup for testing using different versions of Python.\n\n## Usage\n\n```\nusage: mlbomdoc [-h] [-i INPUT_FILE] [--debug] [-f {console,json,markdown,pdf}] [-o OUTPUT_FILE] [-V]\n\nMLBOMdoc generates documentation for a MLBOM.\n\noptions:\n  -h, --help            show this help message and exit\n  -V, --version         show program's version number and exit\n\nInput:\n  -i INPUT_FILE, --input-file INPUT_FILE\n                        Name of MLBOM file\n\nOutput:\n  --debug               add debug information\n  -f {console,json,markdown,pdf}, --format {console,json,markdown,pdf}\n                        Output format (default: output to console)\n  -o OUTPUT_FILE, --output-file OUTPUT_FILE\n                        output filename (default: output to stdout)\n```\n\n## Operation\n\nThe `--input-file` option is used to specify the MLBOM to be processed. The format of the MLBOM is determined according to\nthe following filename conventions.\n\n| SBOM      | Format    | Filename extension |\n| --------- | --------- |--------------------|\n| CycloneDX | JSON      | .json              |\n\nThe `--output-file` option is used to control the destination of the output generated by the tool. 
The\ndefault is to report to the console, but it can also be stored in a file (specified using `--output-file` option).\n\n## Example\n\nGiven the following MLBOM (test.json), the following output is produced to the console.\n\n**NOTE** that the data is purely fictitious in order to demonstrate the capability of the tool.\n\n```bash\n{\n  \"$schema\": \"http://cyclonedx.org/schema/bom-1.5.schema.json\",\n  \"bomFormat\": \"CycloneDX\",\n  \"specVersion\": \"1.5\",\n  \"serialNumber\": \"urn:uuid:997191f5-6c2b-4572-9a73-5e0f2d03cedd\",\n  \"version\": 1,\n  \"metadata\": {\n    \"timestamp\": \"2024-01-02T11:02:22Z\",\n    \"tools\": {\n      \"components\": [\n        {\n          \"name\": \"lib4sbom\",\n          \"version\": \"0.6.0\",\n          \"type\": \"application\"\n        }\n      ]\n    },\n    \"component\": {\n      \"type\": \"application\",\n      \"bom-ref\": \"CDXRef-DOCUMENT\",\n      \"name\": \"MLApp\"\n    }\n  },\n  \"components\": [\n    {\n      \"type\": \"library\",\n      \"bom-ref\": \"1-glibc\",\n      \"name\": \"glibc\",\n      \"version\": \"2.15\",\n      \"supplier\": {\n        \"name\": \"gnu\"\n      },\n      \"cpe\": \"cpe:/a:gnu:glibc:2.15\",\n      \"licenses\": [\n        {\n          \"license\": {\n            \"id\": \"GPL-3.0-only\",\n            \"url\": \"https://www.gnu.org/licenses/gpl-3.0-standalone.html\"\n          }\n        }\n      ]\n    },\n    {\n      \"type\": \"operating-system\",\n      \"bom-ref\": \"2-almalinux\",\n      \"name\": \"almalinux\",\n      \"version\": \"9.0\",\n      \"supplier\": {\n        \"name\": \"alma\"\n      },\n      \"cpe\": \"cpe:/o:alma:almalinux:9.0\",\n      \"licenses\": [\n        {\n          \"license\": {\n            \"id\": \"Apache-2.0\",\n            \"url\": \"https://www.apache.org/licenses/LICENSE-2.0\"\n          }\n        }\n      ]\n    },\n    {\n      \"type\": \"library\",\n      \"bom-ref\": \"3-glibc\",\n      \"name\": \"glibc\",\n      \"version\": 
\"2.29\",\n      \"supplier\": {\n        \"name\": \"gnu\"\n      },\n      \"cpe\": \"cpe:/a:gnu:glibc:2.29\",\n      \"licenses\": [\n        {\n          \"license\": {\n            \"id\": \"GPL-3.0-only\",\n            \"url\": \"https://www.gnu.org/licenses/gpl-3.0-standalone.html\"\n          }\n        }\n      ],\n      \"properties\": [\n        {\n          \"name\": \"language\",\n          \"value\": \"C\"\n        }\n      ]\n    },\n    {\n      \"type\": \"library\",\n      \"bom-ref\": \"4-tomcat\",\n      \"name\": \"tomcat\",\n      \"version\": \"9.0.46\",\n      \"supplier\": {\n        \"name\": \"apache\"\n      },\n      \"cpe\": \"cpe:/a:apache:tomcat:9.0.46\",\n      \"licenses\": [\n        {\n          \"license\": {\n            \"id\": \"Apache-2.0\",\n            \"url\": \"https://www.apache.org/licenses/LICENSE-2.0\"\n          }\n        }\n      ]\n    },\n    {\n      \"type\": \"machine-learning-model\",\n      \"bom-ref\": \"5-resnet-50\",\n      \"name\": \"resnet-50\",\n      \"version\": \"1.5\",\n      \"supplier\": {\n        \"name\": \"microsoft\"\n      },\n      \"description\": \"ResNet (Residual Network) is a convolutional neural network that democratized the concepts of residual learning and skip connections. 
This enables to train much deeper models.\",\n      \"licenses\": [\n        {\n          \"license\": {\n            \"id\": \"Apache-2.0\",\n            \"url\": \"https://www.apache.org/licenses/LICENSE-2.0\"\n          }\n        }\n      ],\n      \"modelCard\": {\n        \"bom-ref\": \"5-resnet-50-model\",\n        \"modelParameters\": {\n          \"approach\": {\n            \"type\": \"supervised\"\n          },\n          \"task\": \"classification\",\n          \"architectureFamily\": \"Convolutional neural network\",\n          \"modelArchitecture\": \"ResNet-50\",\n          \"datasets\": [\n            {\n              \"type\": \"dataset\",\n              \"name\": \"ImageNet\",\n              \"contents\": {\n                \"url\": \"https://huggingface.co/datasets/imagenet-1k\"\n              },\n              \"classification\": \"public\",\n              \"sensitiveData\": \"no personal data\",\n              \"description\": \"ILSVRC 2012, commonly known as \\\"ImageNet\\\" is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or word phrases, is called a \\\"synonym set\\\" or \\\"synset\\\". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). ImageNet aims to provide on average 1000 images to illustrate each synset. 
Images of each concept are quality-controlled and human-annotated.\",\n              \"governance\": {\n                \"owners\": [\n                  {\n                    \"organization\": {\n                      \"name\": \"microsoft\"\n                    },\n                    \"contact\": {\n                      \"email\": \"sales@microsoft.com\"\n                    }\n                  },\n                  {\n                    \"organization\": {\n                      \"name\": \"microsoft\"\n                    },\n                    \"contact\": {\n                      \"email\": \"consulting@microsoft.com\"\n                    }\n                  }\n                ]\n              }\n            }\n          ],\n          \"inputs\": [\n            {\n              \"format\": \"image\"\n            }\n          ],\n          \"outputs\": [\n            {\n              \"format\": \"image class\"\n            }\n          ]\n        },\n        \"quantitativeAnalysis\": {\n          \"performanceMetrics\": [\n            {\n              \"type\": \"CPU\",\n              \"value\": \"10%\",\n              \"confidenceInterval\": {\n                \"lowerBound\": \"8\",\n                \"upperBound\": \"12\"\n              }\n            }\n          ],\n          \"graphics\": {\n            \"description\": \"Test data\",\n            \"collection\": [\n              {\n                \"name\": \"cat\",\n                \"image\": {\n                  \"contentType\": \"text/plain\",\n                  \"encoding\": \"base64\",\n                  \"content\": \"cat.jpg\"\n                }\n              },\n              {\n                \"name\": \"dog\",\n                \"image\": {\n                  \"contentType\": \"text/plain\",\n                  \"encoding\": \"base64\",\n                  \"content\": \"dog.jpg\"\n                }\n              }\n            ]\n          }\n        },\n        \"considerations\": {\n  
        \"users\": [\n            \"Researcher\"\n          ],\n          \"technicalLimitations\": [\n            \"To be used in the EU.\",\n            \"To be used in the UK.\"\n          ],\n          \"ethicalConsiderations\": [\n            {\n              \"name\": \"User from prohibited location\",\n              \"mitigationStrategy\": \"Use geolocation to validate source of request.\"\n            }\n          ]\n        },\n        \"properties\": [\n          {\n            \"name\": \"num_channels\",\n            \"value\": \"3\"\n          }\n        ]\n      }\n    }\n  ]\n}\n\n```\n\nThe following commands will generate a summary of the contents of the MLBOM to the console.\n\n```bash\nmlbomdoc --input test.json \n\n╭───────────────╮\n│ MLBOM Summary │\n╰───────────────╯\n┏━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Item       ┃ Details                                                      ┃\n┡━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ MLBOM File │ test.json                                                    │\n│ MLBOM Type │ cyclonedx                                                    │\n│ Version    │ 1.5                                                          │\n│ Name       │ MLApp                                                        │\n│ Creator    │ tool:lib4sbom#0.6.0                                          │\n│ Created    │ 2024-01-02T11:02:22Z                                         │\n└────────────┴──────────────────────────────────────────────────────────────┘\n\n╭───────────────────────────╮\n│ Model Details - resnet-50 │\n╰───────────────────────────╯\n┏━━━━━━━━━━┳━━━━━━━━━━━━┓\n┃ Item     ┃ Value      ┃\n┡━━━━━━━━━━╇━━━━━━━━━━━━┩\n│ Version  │ 1.5        │\n│ Supplier │ microsoft  │\n│ License  │ Apache-2.0 │\n└──────────┴────────────┘\n╭──────────────────╮\n│ Model Parameters │\n╰──────────────────╯\n┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ 
Parameter           ┃ Value                        ┃\n┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ Approach            │ supervised                   │\n│ Task                │ classification               │\n│ Architecture Family │ Convolutional neural network │\n│ Model Architecture  │ ResNet-50                    │\n│ Input               │ image                        │\n│ Output              │ image class                  │\n└─────────────────────┴──────────────────────────────┘\n╭───────────────╮\n│ Model Dataset │\n╰───────────────╯\n┏━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Parameter      ┃ Value                                                                                                                                                                                     ┃\n┡━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ Type           │ dataset                                                                                                                                                                                   │\n│ Contents URL   │ https://huggingface.co/datasets/imagenet-1k                                                                                                                                               │\n│ Classification │ public                                                                                                                                                                                    │\n│ Sensitive Data │ no personal data                                                                                                                                                              
            │\n│ Description    │ ILSVRC 2012, commonly known as \"ImageNet\" is an image dataset organized according to the WordNet hierarchy. Each meaningful concept in WordNet, possibly described by multiple words or   │\n│                │ word phrases, is called a \"synonym set\" or \"synset\". There are more than 100,000 synsets in WordNet, majority of them are nouns (80,000+). ImageNet aims to provide on average 1000       │\n│                │ images to illustrate each synset. Images of each concept are quality-controlled and human-annotated.                                                                                      │\n└────────────────┴───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘\n╭────────────────────╮\n│ Dataset Governance │\n╰────────────────────╯\n┏━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Category ┃ Organization ┃ Contact                  ┃\n┡━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ Owner    │ microsoft    │ sales@microsoft.com      │\n│ Owner    │ microsoft    │ consulting@microsoft.com │\n└──────────┴──────────────┴──────────────────────────┘\n╭───────────────────────╮\n│ Quantitative Analysis │\n╰───────────────────────╯\n╭─────────────────────╮\n│ Performance Metrics │\n╰─────────────────────╯\n┏━━━━━━┳━━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓\n┃ Type ┃ Value ┃ Slice ┃ Lower BOund ┃ Upper Bound ┃\n┡━━━━━━╇━━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩\n│ CPU  │ 10%   │       │ 8           │ 12          │\n└──────┴───────┴───────┴─────────────┴─────────────┘\n╭──────────────────────╮\n│ Graphics - Test data │\n╰──────────────────────╯\n┏━━━━━━┳━━━━━━━━━┓\n┃ Name ┃ Content ┃\n┡━━━━━━╇━━━━━━━━━┩\n│ cat  │ cat.jpg │\n│ dog  │ dog.jpg │\n└──────┴─────────┘\n╭────────────────╮\n│ Considerations 
│\n╰────────────────╯\n┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n┃ Category                                     ┃ Value                                          ┃\n┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n│ Users                                        │ Researcher                                     │\n│ Technical Limitations                        │ To be used in the EU.                          │\n│ Technical Limitations                        │ To be used in the UK.                          │\n│ Ethical Considerations                       │ User from prohibited location                  │\n│ Ethical Considerations - Mitigation Strategy │ Use geolocation to validate source of request. │\n└──────────────────────────────────────────────┴────────────────────────────────────────────────┘\n╭────────────╮\n│ Properties │\n╰────────────╯\n┏━━━━━━━━━━━━━━┳━━━━━━━┓\n┃ Name         ┃ Value ┃\n┡━━━━━━━━━━━━━━╇━━━━━━━┩\n│ num_channels │ 3     │\n└──────────────┴───────┘\n                                                                   \n```\n\n## Licence\n\nLicenced under the Apache 2.0 Licence.\n\n## Limitations\n\nThe tool has the following limitations\n\n- Invalid SBOMs will result in unpredictable results.\n\n## Feedback and Contributions\n\nBugs and feature requests can be made via GitHub Issues.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanthonyharrison%2Fmlbomdoc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fanthonyharrison%2Fmlbomdoc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fanthonyharrison%2Fmlbomdoc/lists"}