{"id":31777792,"url":"https://github.com/opensource-observer/code-analyzer","last_synced_at":"2026-02-15T22:03:30.225Z","repository":{"id":309933168,"uuid":"1037607407","full_name":"opensource-observer/code-analyzer","owner":"opensource-observer","description":"Working title — subject to renaming","archived":false,"fork":false,"pushed_at":"2025-08-14T15:51:10.000Z","size":6,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-14T17:34:41.010Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opensource-observer.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-13T20:49:01.000Z","updated_at":"2025-08-14T15:51:14.000Z","dependencies_parsed_at":"2025-08-14T17:34:43.361Z","dependency_job_id":"f4958273-317c-425d-a0e6-a5e5f38465f0","html_url":"https://github.com/opensource-observer/code-analyzer","commit_stats":null,"previous_names":["opensource-observer/code-analyzer"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/opensource-observer/code-analyzer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opensource-observer%2Fcode-analyzer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opensource-observer%2Fcode-analyzer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opensource-observer%2Fcode-analyzer/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/opensource-observer%2Fcode-analyzer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opensource-observer","download_url":"https://codeload.github.com/opensource-observer/code-analyzer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opensource-observer%2Fcode-analyzer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002970,"owners_count":26083488,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-10T02:00:06.843Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-10-10T06:20:18.865Z","updated_at":"2025-10-10T06:20:20.164Z","avatar_url":"https://github.com/opensource-observer.png","language":null,"readme":"# Code Analyzer\n\nIndex GitHub repositories from OSO `artifacts_v1` and collect direct dependencies (npm, Python/PyPI, Rust crates), git submodules, and Foundry libs into Parquet snapshots and an events stream.\n\n## Quick start (local)\n\n```bash\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -r requirements.txt\n\n# Configure credentials (use .env locally if you prefer)\nexport OSO_API_KEY=\"\u003cyour_api_key\u003e\"\n# optional for higher clone rate limits\nexport GITHUB_TOKEN=\"\u003cyour_gh_pat\u003e\"\n\n# Small smoke test run (default limit applies if no 
--only) to test setup end to end without assuming any specific owner\n./.venv/bin/python scripts/sbom_fetcher.py --output-dir data/sbom --incremental\n\n# Focused owner run (no default limit when --only is provided) to test the actual OSO-scoped command\n./.venv/bin/python scripts/sbom_fetcher.py --output-dir data/sbom --only opensource-observer/ --limit 0\n\n# Leave the virtual environment (deactivate is a shell function defined by activate, not a script)\ndeactivate\n```\n\n## GitHub Actions\n\n1. Add repository secrets: `OSO_API_KEY` and (optionally) `GITHUB_TOKEN`.\n2. Run the \"SBOM fetcher\" workflow manually or wait for the daily schedule.\n\nOutputs are written under `data/sbom/` and committed back to the repo.\n\nNotes:\n- Provide `OSO_API_KEY` and optionally `GITHUB_TOKEN` as repository Secrets.\n- `.env` files are for local development only; Actions will not read them.\n\n## Supported ecosystems (direct deps)\n\n- npm (`package.json`): dependencies, devDependencies, peerDependencies\n- Python/PyPI (`pyproject.toml`, `requirements*.txt`): PEP 621 and Poetry\n- Rust/Cargo (`Cargo.toml`): dependencies, dev-dependencies, build-dependencies\n- Git submodules (`.gitmodules`)\n- Foundry (`foundry.toml`, `lib/*`)\n\n## Output layout\n\n- `data/sbom/snapshots/{artifact_namespace__artifact_name}/{YYYY-MM-DD}/*.parquet`\n- `data/sbom/events/{YYYY-MM-DD}/*.parquet`\n- `data/sbom/state.json`\n- `data/sbom/run_summaries/*.parquet`\n\n## Data model\n\n- Snapshots (subset of columns):\n  - `artifact_id`, `artifact_source`, `artifact_namespace`, `artifact_name`, `artifact_url`, `repo_head_sha`\n  - `package_manager`, `dependency_name`, `dependency_version_requirement`, `dependency_scope`, `manifest_path`, `direct`\n  - `time_collected`\n\n- Events columns:\n  - `artifact_namespace`, `artifact_name`, `package_manager`, `dependency_name`, `change_type` (added/removed/updated), `previous_version`, `current_version`, `event_time_collected`\n\n## Runtime controls\n\n- Flags:\n  - `--only \u003cowner\u003e`: filter by 
owner/namespace (trailing slash optional)\n  - `--limit \u003cN\u003e`: cap number of repos (use `0` for all; recommended with `--only`)\n  - `--incremental`: skip repos whose HEAD SHA is unchanged\n  - `--max-workers \u003cN\u003e`: concurrent clones/parsers (default via `SBOM_MAX_WORKERS`)\n\n- Environment variables:\n  - `SBOM_MAX_WORKERS` (default 8)\n  - `SBOM_GIT_CLONE_RETRIES` (default 2)\n  - `SBOM_GIT_CLONE_TIMEOUT` (seconds, default 120)\n  - `SBOM_GIT_CMD_TIMEOUT` (seconds, default 60)","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopensource-observer%2Fcode-analyzer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopensource-observer%2Fcode-analyzer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopensource-observer%2Fcode-analyzer/lists"}