{"id":50396948,"url":"https://github.com/tg12/phantomcreds","last_synced_at":"2026-05-30T21:30:28.396Z","repository":{"id":358652079,"uuid":"1242319107","full_name":"tg12/phantomcreds","owner":"tg12","description":"Automated detection and tracking of credential-harvesting and unsafe credential-storage repos on GitHub. Scores suspicious repos daily, captures evidence, and files issues only when fixable.","archived":false,"fork":false,"pushed_at":"2026-05-26T07:35:09.000Z","size":450,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-26T09:29:27.188Z","etag":null,"topics":["appsec","automated-scanning","credential-harvesting","credential-security","github-actions","osint","python","security","security-research","threat-detection"],"latest_commit_sha":null,"homepage":"https://labs.jamessawyer.co.uk/ai-slop-intelligence-dashboards/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tg12.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-18T10:15:06.000Z","updated_at":"2026-05-26T07:35:13.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tg12/phantomcreds","commit_stats":null,"previous_names":["tg12/phantomcreds"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/tg12/phantomcreds","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tg12%2Fpha
ntomcreds","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tg12%2Fphantomcreds/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tg12%2Fphantomcreds/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tg12%2Fphantomcreds/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tg12","download_url":"https://codeload.github.com/tg12/phantomcreds/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tg12%2Fphantomcreds/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33711018,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-30T02:00:06.278Z","response_time":92,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["appsec","automated-scanning","credential-harvesting","credential-security","github-actions","osint","python","security","security-research","threat-detection"],"created_at":"2026-05-30T21:30:26.468Z","updated_at":"2026-05-30T21:30:28.388Z","avatar_url":"https://github.com/tg12.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/phantomcreds-v0.1.0-0f766e?style=for-the-badge\" alt=\"phantomcreds\"\u003e\n  \u003cimg 
src=\"https://img.shields.io/badge/python-3.14-blue?style=for-the-badge\" alt=\"Python 3.14\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/license-Apache--2.0-green?style=for-the-badge\" alt=\"Apache 2.0\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/CI-GitHub%20Actions-orange?style=for-the-badge\" alt=\"GitHub Actions\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/runs-daily-brightgreen?style=for-the-badge\" alt=\"Daily\"\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003ephantomcreds\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003cstrong\u003eAutomated detection and tracking of credential-harvesting and unsafe credential-storage repos on GitHub\u003c/strong\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n  A \u003ca href=\"https://labs.jamessawyer.co.uk/\"\u003eJS Labs\u003c/a\u003e project -\n  part of the \u003ca href=\"https://labs.jamessawyer.co.uk/ai-slop-intelligence-dashboards/\"\u003eAI Slop Intelligence\u003c/a\u003e initiative.\u003cbr\u003e\n  Runs every day. Scores suspicious repos. Captures evidence. Files issues only when the code looks fixable.\n\u003c/p\u003e\n\n---\n\n## Why this exists\n\nThe counterintuitive move here is restraint.\n\nThe easy version of this project is a giant crawler that flags every repo mentioning `token`, `cookie`, or `OAuth`. That path loses trust immediately because the maintenance tax becomes larger than the signal. Legitimate software stores tokens. Legitimate tools proxy requests. Legitimate integrations use OAuth callbacks.\n\nThe useful version is narrower: detect repos whose docs and code jointly suggest credential harvesting, unsafe persistence, replay posture, or exposed management surfaces. Record the evidence every day. File issues only when the target still looks like a maintainable software project rather than an overt abuse kit.\n\nThat is what **phantomcreds** does.\n\nIt is built around one premise: operator trust is the product. 
If the scanner cannot explain *why* a repo was flagged and *which lines* created that judgment, it is not finished.\n\n---\n\n## What it does\n\n**phantomcreds** runs a daily GitHub Actions job that:\n\n1. Searches GitHub repositories for posture phrases such as `multi-account`, `no API key needed`, `auth file`, `shared subscription`, session reuse, provider relays, and imported browser-auth language\n2. Searches code across Go, Python, JavaScript, and TypeScript for credential-risk fingerprints such as token or session persistence, raw `Authorization` forwarding, management auth bypass wrappers, wildcard management exposure, callback listeners bound to `0.0.0.0`, and committed secret-bearing `.env`, `.netrc`, `.pypirc`, Docker auth config, Terraform credential, private-key, service-account, and connection-string material\n3. Fetches targeted high-signal files plus a bounded sweep of broadly text-like repo files directly from the GitHub API\n4. Scores each repo against a repo-level evidence model that prefers multi-family matches over single-query noise, then biases toward recently pushed non-archived non-fork repos\n5. Writes append-only ledgers to this repo:\n   - [`data/repos.jsonl`](data/repos.jsonl) for per-repo scan outcomes\n   - [`data/findings.jsonl`](data/findings.jsonl) for concrete findings with evidence\n6. Updates the README dashboard automatically\n7. Opens or updates one issue per target repo **only** when the findings are specific and fixable\n8. Leaves overt abuse-oriented repos as `report_only` records instead of spamming them with issues\n\nNo servers. No database. 
No dashboard backend.\n\n---\n\n## Detection model\n\nThe scanner combines five evidence classes:\n\n| Evidence class | What it means |\n|---|---|\n| Harvest posture | README or description markets shared subscriptions, relays, auth-file import, or \"no API key needed\" positioning |\n| Credential persistence | Code writes token-like material to local auth files or serialized session stores |\n| Direct secret exposure | Current repo files appear to contain committed cloud, model-provider, CI, package-registry, webhook, SSH, service-account, registry-auth, Terraform, or database-connection credentials; evidence is redacted in stored findings and issue bodies |\n| Unsafe exposure | Callback listeners bind broadly, management routes use wildcard CORS, or auth bypass wrappers weaken the control plane |\n| Centralized leakage | Request logging or telemetry paths appear to forward raw credential-bearing headers |\n\nNot every hit is issue-worthy.\n\nThe product rule is deliberate:\n\n- `file_issue`: concrete technical defect with defensible evidence and a plausible maintainer remediation path\n- `report_only`: repo posture looks overtly abusive, or the scan can defend the risk but issue filing is unlikely to improve behavior\n- `watch`: suspicious signals exist, but the evidence is not strong enough for automated external action\n\nThis is the main maintenance-tax control. It avoids treating every suspicious repo as a workflow target.\n\n---\n\n## Code smell and maintenance tax\n\nThree uncomfortable truths drive the design:\n\n1. The biggest failure mode is not false negatives. It is false-positive automation with weak evidence. That destroys the product faster than missing a repo.\n2. Repo families matter more than individual repos. Once one credential-harvesting codebase is confirmed, the next high-leverage step is searching for reused paths and symbol names across derivatives.\n3. The project should prefer append-only evidence over complicated state machines. 
Daily JSONL ledgers and deterministic README updates are lower-maintenance than a bespoke datastore.\n\n### Devil's-advocate view\n\nThe comfortable answer is \"scan everything and file everything.\"\n\nWhy that loses:\n\n- GitHub code search is rate-limited and noisy.\n- Most repositories that mention tokens are normal software.\n- Bulk issue creation on overt abuse repos creates work without changing outcomes.\n- A complex crawler increases breakage surface and lowers operator confidence.\n\nThe winning move is smaller:\n\n- search-first discovery\n- multi-language query families\n- targeted file fetches\n- repo-level scoring\n- one issue per repo at most\n- explicit `report_only` for abuse-heavy cases\n\nThat is less dramatic and more durable.\n\n---\n\n## What you will regret not knowing later\n\n- Which repo families cloned the same unsafe credential paths\n- Which findings were recurring but never issue-worthy\n- Which wording in README posture was a leading indicator before the code confirmed it\n\nThe data model is structured so those questions can be answered from the ledger later without redesigning the project.\n\n---\n\n## Three questions to ask next\n\n1. Which clone-family fingerprints should graduate from \"interesting\" to \"hard finding\" after recurring across multiple repos?\n2. Which issue classes actually lead to maintainer response, and which are operational dead ends that should stay `report_only`?\n3. 
At what scale does GitHub Search API noise justify adding a local corpus or scheduled seed list?\n\n---\n\n## Live dashboard\n\n\u003c!-- STATS:START --\u003e\n| Date | Scanned | Flagged | High Risk | Issue-Worthy | Report Only | New High Risk |\n|------|---------|---------|-----------|--------------|-------------|---------------|\n| 2026-05-30 | 31 | 18 | 12 | 16 | 8 | 1 |\n| 2026-05-29 | 41 | 19 | 14 | 17 | 9 | 2 |\n| 2026-05-27 | 36 | 17 | 13 | 14 | 5 | 0 |\n| 2026-05-26 | 34 | 19 | 13 | 16 | 6 | 0 |\n| 2026-05-25 | 39 | 20 | 11 | 17 | 5 | 0 |\n| 2026-05-24 | 33 | 17 | 11 | 14 | 5 | 1 |\n| 2026-05-23 | 33 | 18 | 14 | 17 | 9 | 0 |\n| 2026-05-22 | 34 | 15 | 13 | 14 | 7 | 0 |\n| 2026-05-21 | 33 | 18 | 12 | 17 | 7 | 0 |\n| 2026-05-20 | 30 | 14 | 11 | 14 | 5 | 0 |\n| 2026-05-19 | 55 | 21 | 14 | 16 | 7 | 1 |\n| 2026-05-18 | 115 | 69 | 45 | 46 | 28 | 45 |\n\u003c!-- STATS:END --\u003e\n\n---\n\n## Highest-risk repos today\n\n\u003c!-- REPO_STATS:START --\u003e\n| Repo | Score | Findings | Action | Stars | Updated |\n|------|-------|----------|--------|-------|---------|\n| jeffnash/CLIProxyAPI | 1.000 | 7 | report_only | 0 | 2026-05-30 |\n| leic4u/CLIProxyAPIPlus | 1.000 | 7 | report_only | 3 | 2026-05-29 |\n| lzt404/CLIProxyAPI-RUM | 1.000 | 7 | report_only | 1 | 2026-05-29 |\n| fxzer/CLIProxyAPI | 1.000 | 7 | report_only | 0 | 2026-05-27 |\n| router-for-me/CLIProxyAPI | 1.000 | 7 | report_only | 35434 | 2026-05-30 |\n| 6enta0/CPAplus | 1.000 | 6 | file_issue | 16 | 2026-05-27 |\n| kdjahdiel-code/c-pipe-engine | 1.000 | 6 | file_issue | 0 | 2026-05-29 |\n| rituprodhan-ops/c-channel-engine | 1.000 | 6 | file_issue | 0 | 2026-05-29 |\n| BlueSkyXN/CPA-Core-LTS | 1.000 | 6 | report_only | 2 | 2026-05-29 |\n| Zeuyel/Proxy-me | 1.000 | 6 | report_only | 1 | 2026-05-29 |\n| kittors/CliRelay | 1.000 | 5 | file_issue | 764 | 2026-05-30 |\n| Tatsumaki123123/cliproxyapi | 1.000 | 5 | file_issue | 0 | 2026-05-28 |\n| SylphAI-Inc/AdalFlow | 0.560 | 2 | file_issue | 4154 | 
2026-05-30 |\n| Wei-Shaw/claude-relay-service | 0.410 | 2 | report_only | 11923 | 2026-05-30 |\n| kunish/wheel | 0.390 | 2 | watch | 1 | 2026-04-29 |\n| vlad-levchenko/claude-skills | 0.390 | 2 | watch | 0 | 2026-05-30 |\n| winstonwilliamsiii/BBBot | 0.350 | 1 | file_issue | 2 | 2026-05-29 |\n| PlanExeOrg/PlanExe | 0.200 | 1 | file_issue | 381 | 2026-05-28 |\n\u003c!-- REPO_STATS:END --\u003e\n\n---\n\n## Data format\n\n**repos.jsonl** - one row per scanned repo per run:\n\n```json\n{\n  \"full_name\": \"owner/repo\",\n  \"composite\": 0.82,\n  \"classification\": \"high_risk\",\n  \"action\": \"file_issue\",\n  \"finding_count\": 4,\n  \"issue_worthy_count\": 3,\n  \"stars\": 431,\n  \"scan_date\": \"2026-05-18\",\n  \"created_at\": \"2026-04-29T20:14:00Z\",\n  \"updated_at\": \"2026-05-18T08:42:11Z\",\n  \"discovery_sources\": [\"auth-bypass\", \"callback-exposure\", \"shared-subscription-posture\"],\n  \"finding_types\": [\"callback_exposure\", \"credential_persistence\", \"management_auth_bypass\"]\n}\n```\n\n**findings.jsonl** - one row per concrete finding:\n\n```json\n{\n  \"repo_full_name\": \"owner/repo\",\n  \"finding_type\": \"exposed_secret\",\n  \"title\": \"Secret-bearing credential material appears committed in current repository files\",\n  \"severity\": \"high\",\n  \"confidence\": \"confirmed\",\n  \"summary\": \"Current repository files appear to contain committed cloud, model-provider, CI, package-registry, webhook, SSH, or service-account credential material. Evidence is redacted in the report output.\",\n  \"issue_worthy\": true,\n  \"scan_date\": \"2026-05-18\",\n  \"evidence\": [\n    \".env:1 - OPENAI_API_KEY=[REDACTED:sk-pro...3456]\",\n    \"deploy/id_rsa:1 - [REDACTED:-----BEGIN OPENSSH PRIVATE KEY-----]\"\n  ]\n}\n```\n\n---\n\n## Setup\n\n### 1. Create or fork the repo\n\nThis repo commits its own ledgers back to `main` after each successful scan.\n\n### 2. 
Add a GitHub PAT secret\n\nCreate a **classic** Personal Access Token with scopes:\n\n- `public_repo`\n- `read:user`\n\nAdd it as `GH_TOKEN` under:\n\n**Settings -\u003e Secrets and variables -\u003e Actions -\u003e New repository secret**\n\n### 3. Enable Actions\n\nThe workflow runs at **07:00 UK time daily** using the `Europe/London` clock:\n\n- `06:00 UTC` during British Summer Time\n- `07:00 UTC` during Greenwich Mean Time\n\nGitHub cron is UTC-only, so the workflow triggers at both UTC hours and only proceeds when local London time is `07`.\n\nManual trigger:\n\n**Actions -\u003e Daily Phantomcreds Scan -\u003e Run workflow**\n\n### 4. Run locally\n\nSafe local test run:\n\n```bash\ngit clone https://github.com/YOUR_USERNAME/phantomcreds.git\ncd phantomcreds\npython -m venv venv \u0026\u0026 source venv/bin/activate\npip install -e .[dev]\nPHANTOMCREDS_LOCAL_MODE=1 GH_TOKEN=ghp_your_token phantomcreds\n```\n\nThis uses the same scan logic locally but:\n- disables external GitHub issue creation by default\n- does not rewrite the main `README.md`\n- writes results under `.local/phantomcreds/`\n- keeps the same GitHub API fetch, heuristic scoring, and issue-decision logic as the hosted run\n\nProduction-style local run:\n\n```bash\nGH_TOKEN=ghp_your_token \\\nPHANTOMCREDS_NOTIFY_EXTERNAL=1 \\\nPHANTOMCREDS_UPDATE_README=1 \\\nphantomcreds\n```\n\nUseful local overrides:\n- `PHANTOMCREDS_OUTPUT_DIR=/tmp/phantomcreds-run`\n- `PHANTOMCREDS_NOTIFY_EXTERNAL=0|1`\n- `PHANTOMCREDS_UPDATE_README=0|1`\n- `PHANTOMCREDS_REPORTS_FILE=/tmp/repos.jsonl`\n- `PHANTOMCREDS_FINDINGS_FILE=/tmp/findings.jsonl`\n- `PHANTOMCREDS_README_PATH=/tmp/README.md`\n\nOperational difference from GitHub Actions:\n- same discovery, fetch, scoring, and notification code paths\n- no scheduler wrapper\n- no Actions step summary unless `GITHUB_STEP_SUMMARY` is set\n- local mode is the safer way to test scanner changes before allowing external issue creation\n\n---\n\n## False positives and 
exclusions\n\nIf a repo is repeatedly benign but matches the search posture, add it to [`data/allowlist.txt`](data/allowlist.txt), one `owner/repo` per line. Allowlisted repos are skipped entirely in future runs.\n\nThe scanner also applies built-in context filters before raising secret findings:\n- redacted evidence snippets are ignored\n- test, fixture, and docs paths are not treated as live secret exposure\n- template files such as `.env.example` remain non-issues when they contain placeholders, but still raise findings if they contain real credential material\n- Docker auth evidence must decode to printable `user:password` material before it is treated as a committed secret\n- credential-persistence findings require nearby write or serialization behavior, not just words like `session` or `cookie`\n\nThis is a repo-level scanner. It does not store individual user identities, and it does not attempt attribution beyond public repository content.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftg12%2Fphantomcreds","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftg12%2Fphantomcreds","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftg12%2Fphantomcreds/lists"}