{"id":36688098,"url":"https://github.com/converged-computing/scheduler-sniffer","last_synced_at":"2026-01-12T11:16:41.040Z","repository":{"id":234847681,"uuid":"789614694","full_name":"converged-computing/scheduler-sniffer","owner":"converged-computing","description":"Basic setup for recording Kubernetes scheduler decisions","archived":false,"fork":false,"pushed_at":"2024-04-23T21:51:43.000Z","size":396,"stargazers_count":1,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-09-10T14:50:08.789Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/converged-computing.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-04-21T03:54:35.000Z","updated_at":"2024-04-24T18:24:26.000Z","dependencies_parsed_at":"2024-04-21T04:42:26.966Z","dependency_job_id":"dd7d6b17-9dc8-4ae3-a743-5c9f84ca60be","html_url":"https://github.com/converged-computing/scheduler-sniffer","commit_stats":null,"previous_names":["converged-computing/scheduler-sniffer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/converged-computing/scheduler-sniffer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/converged-computing%2Fscheduler-sniffer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/converged-computing%2Fscheduler-sniffer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/converged-computing%2Fscheduler-sniffer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/converged-computing%2Fscheduler-sniffer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/converged-computing","download_url":"https://codeload.github.com/converged-computing/scheduler-sniffer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/converged-computing%2Fscheduler-sniffer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28338970,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T10:58:46.209Z","status":"ssl_error","status_checked_at":"2026-01-12T10:58:42.742Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-01-12T11:16:40.309Z","updated_at":"2026-01-12T11:16:41.034Z","avatar_url":"https://github.com/converged-computing.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Scheduler Sniffer\n\n\u003e Smells like HPC!👃️\n\nThe scheduler-sniffer is an attempt to build a custom-scheduler plugin that mimics using the default scheduler, but adds in the ability\nto see what is going on. This is a work in progress and I've already changed the design several times - expect that to happen again! Currently I'm taking an approach of creating a custom scheduler plugin, but largely making it empty. It only serves as a means to provide a custom entrypoint to the main scheduler code, and I'll customize this to add pings to a gRPC service that can record decisions at each step.\n\n![docs/img/sniffer.png](docs/img/sniffer.png)\n\n🚧️ Under Development 🚧️\n\n## Development\n\nTry making a kind cluster first.\n\n```bash\nkind create cluster\n```\n\nCourtesy scripts are provided to build the sniffer and load into your local kind cluster.\n\n```bash\n./hack/kind-build.sh\n```\n\nTry submitting a job.\n\n```bash\nkubectl apply -f example/job.yaml\n```\n\nYou can see from events that it was scheduled to the sniffer:\n\n```bash\nkubectl get events -o wide |  awk {'print $4\" \" $5\" \" $6'} | column -t | grep sniffer\n```\n\nYou can also look at the sniffer logs to see the binding events printed:\n\n```bash\n$ kubectl exec -it sniffer-74c76ff4f8-hrqdn -c watcher -- cat /tmp/logs/sniffer.log\n```\n```console\n{\"object\":\"Pod\",\"name\":\"local-path-provisioner-7577fdbbfb-w84m2\",\"endpoint\":\"podUpdate\",\"event\":\"ContainersReady\",\"timestamp\":\"2024-04-20 18:22:03 +0000 UTC\"}\n{\"object\":\"Pod\",\"name\":\"local-path-provisioner-7577fdbbfb-w84m2\",\"endpoint\":\"podUpdate\",\"event\":\"PodScheduled\",\"timestamp\":\"2024-04-20 18:22:00 +0000 UTC\"}\n```\n\nYou can grep for \"Node\" or \"Pod\" to filter events:\n\n```bash\n$ kubectl exec -it sniffer-74c76ff4f8-hrqdn -c watcher -- cat /tmp/logs/sniffer.log | grep Node\n```\n```console\n{\"object\":\"Node\",\"name\":\"kind-control-plane\",\"endpoint\":\"nodeUpdate\",\"reason\":\"KubeletReady\",\"message\":\"kubelet is posting ready status\",\"event\":\"Ready\",\"timestamp\":\"2024-04-22 01:08:24 +0000 UTC\"}\n{\"object\":\"Node\",\"name\":\"kind-control-plane\",\"endpoint\":\"nodeUpdate\",\"extra\":{\"capacity-cpu\":\"12\",\"capacity-ephemeral-storage\":\"1921208544Ki\",\"capacity-hugepages-1Gi\":\"0\",\"capacity-hugepages-2Mi\":\"0\",\"capacity-memory\":\"32512128Ki\",\"capacity-pods\":\"110\",\"allocatable-hugepages-2Mi\":\"0\",\"allocatable-memory\":\"32512128Ki\",\"allocatable-pods\":\"110\",\"allocatable-cpu\":\"12\",\"allocatable-ephemeral-storage\":\"1921208544Ki\",\"allocatable-hugepages-1Gi\":\"0\"}}\n```\n\nThis is still early in development, more soon. I think likely we will need to update the output file to an actual database.\n\n### Build Logic\n\nThe main scheduler container for the \"sniffer\" is built from the [Dockerfile](Dockerfile) here.\nThis logic used to be under the sig-scheduler-plugins repository with a `hack/build-images.sh`\nscript, but I found this logic complex, and there was no easy way to avoid building the controller\n(which takes extra time). To speed things up, make the build more transparent, and allow\nbuilding a custom kube-scheduler command, I moved this up to the root. You can see the build logic in\nthe [Dockerfile](Dockerfile). When you run `make` to do the build, or use one of the [hack](hack)\nscripts for a full build-\u003edeploy, the following will happen:\n\n - An upstreams directory with kubernetes and sig-scheduler-plugins is prepared\n - Changed files are copied into it\n - The sidecar [sniffer](sniffer) and scheduler image are built\n\nMy hope is that this scheduler will be able to support pointing to other plugin extensions to use.\nWhat isn't clear to me is/will be the best way to combine them. \n\n### Organization\n\n- [src](src) has code intending to be moved into the sig-scheduler-plugins repository\n- [scheduler](scheduler) is slightly tweaked Kubernetes code with added gRPC for the sniffer\n- [sniffer](sniffer) is the sniffer service (sidecar) to the scheduler\n\nIt would be cool to do this with eBPF, but I haven't found a good, working container base yet.\n\n## Notes\n\nThings to track for the simulator:\n- keep track of pod to node mappings (this is the basic unit of what we need)\n- node occupancy and time (which nodes contain which pods at what point)\n - TODO we likely want to use a database proper instead of a log file for larger runs\n\nThis will be a replacement for the in-tree Kubernetes scheduler, with the intention of adding a small service to ping and communicate\nscheduling decisions. This is not meant for production use cases, but rather understanding what is happening in the scheduler. It \nwill also serve as a prototype for me to understand developing an in-tree scheduler so we can eventually do one, for realsies.\n\n## License\n\nHPCIC DevTools is distributed under the terms of the MIT license.\nAll new contributions must be made under this license.\n\nSee [LICENSE](https://github.com/converged-computing/cloud-select/blob/main/LICENSE),\n[COPYRIGHT](https://github.com/converged-computing/cloud-select/blob/main/COPYRIGHT), and\n[NOTICE](https://github.com/converged-computing/cloud-select/blob/main/NOTICE) for details.\n\nSPDX-License-Identifier: (MIT)\n\nLLNL-CODE- 842614\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconverged-computing%2Fscheduler-sniffer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fconverged-computing%2Fscheduler-sniffer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fconverged-computing%2Fscheduler-sniffer/lists"}