{"id":17360618,"url":"https://github.com/ivanyu/pyheap","last_synced_at":"2026-03-07T18:01:30.357Z","repository":{"id":60578208,"uuid":"530935511","full_name":"ivanyu/pyheap","owner":"ivanyu","description":"A heap dumper and analyzer for CPython based on GDB","archived":false,"fork":false,"pushed_at":"2025-01-24T03:54:09.000Z","size":6652,"stargazers_count":42,"open_issues_count":19,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-04-15T00:33:55.481Z","etag":null,"topics":["cpython","gdb","memory","python","python3"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ivanyu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"github":null,"patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":"ivanyu","thanks_dev":null,"custom":null}},"created_at":"2022-08-31T04:33:21.000Z","updated_at":"2025-03-21T15:18:46.000Z","dependencies_parsed_at":"2025-04-15T00:29:53.366Z","dependency_job_id":"487de469-be49-4fda-8e04-adcde51354db","html_url":"https://github.com/ivanyu/pyheap","commit_stats":null,"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/ivanyu/pyheap","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ivanyu%2Fpyheap","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ivanyu%2Fpyheap/tags","releases
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ivanyu%2Fpyheap/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ivanyu%2Fpyheap/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ivanyu","download_url":"https://codeload.github.com/ivanyu/pyheap/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ivanyu%2Fpyheap/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30225406,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-07T17:00:40.062Z","status":"ssl_error","status_checked_at":"2026-03-07T17:00:39.026Z","response_time":53,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpython","gdb","memory","python","python3"],"created_at":"2024-10-15T19:26:46.516Z","updated_at":"2026-03-07T18:01:30.340Z","avatar_url":"https://github.com/ivanyu.png","language":"Python","funding_links":["https://buymeacoffee.com/ivanyu"],"categories":[],"sub_categories":[],"readme":"# PyHeap\n\nA heap dumper and analyzer for CPython based on GDB.\n\nThe product consists of two parts:\n1. The dumper which uses GDB.\n2. The Flask-based UI for heap dump visualization.\n\nIt does not require any modification to the target program code, CPython code or installation. A restart is also not needed. 
In case the target is dockerized, the container also doesn't need any modification.\n\n## Requirements\n\nThe dumper needs the following:\n1. GDB must be installed where the dumper runs (e.g. on the machine host), but is not needed near a target process (e.g. in a container).\n2. CPython 3.8 - 3.12.\n3. Docker CLI for working with Docker containers directly (e.g. calling `docker inspect`).\n\n## Compatibility\n\n**Only Linux** is supported at the moment.\n\nThe dumper has been tested on x86 (x86_64) and ARM (AArch64) processors.\n\nThe dumper is compatible with a target process running on CPython 3.8 - 3.12.\n\nThe target process was tested on the following OSes:\n- Alpine Linux;\n- Ubuntu;\n- Fedora;\n- Debian.\n\nSome popular libraries were tested:\n- Django;\n- FastAPI;\n- Flask;\n- SQLAlchemy;\n- Jupyter.\n\n## Usage\n\n### Heap Dumping\nFind the PID of a running CPython process you're interested in.\n\nRun:\n```bash\n$ python3 pyheap_dump --pid \u003cpid\u003e --file heap.pyheap\n```\n\nThe heap file is written by the target process in its `/tmp`, but is subsequently moved to the specified path.\n\nIf the target process belongs to a different user, use `sudo`.\n\nSee \n```bash\n$ python3 pyheap_dump -h\n```\nfor additional options.\n\n#### Running in a Docker Container\n\nThe dumper can also be run in a Docker container.\n\nIf the target process is also running in a Docker container, it's possible to attach the dumper container directly to it:\n\n```bash\ndocker run \\\n  --rm \\\n  --pid=container:\u003ccontainer_name_or_id\u003e \\\n  --cap-add=SYS_PTRACE \\\n  --volume $(pwd):/heap-dumps \\\n  ivanyu/pyheap-dumper:latest \\\n  --pid 1 \\\n  --file /heap-dumps/heap.pyheap\n```\n\nYou can replace `latest` with a release version.\n\nIf you need to run it against a process on the host, use `--pid=host` instead.\n\n### Containers and Namespaces\n\nPyHeap can attach to targets that are running in Linux namespaces. 
Docker containers are the most common example of this situation.\n\n**Note:** Some Docker setups don't have real processes running on the same (virtual) machine where `docker ...` control commands are executed. One example is WSL 2 + Docker Desktop on Windows. PyHeap doesn't work in such environments.\n\nIf you want to use PyHeap on the root process in a Docker container, use `--docker-container` instead of `--pid/-p` and specify the name or ID:\n\n```bash\n$ sudo python3 pyheap_dump --docker-container \u003ccontainer_name\u003e --file heap.pyheap\n```\n\nIf it's not the root process in the container, or you work with another container system (e.g. systemd-nspawn) or just generic Linux namespaces, you need to find the target PID. Please mind that this must be the PID from the dumper's point of view: processes in namespaces can have their own PID numbers. For example, if you're about to run the dumper on a Linux host and the target process is running in a container, check the process list with `ps` or `top` on the host. Use `--pid/-p` for the dumper.\n\nIf the target process is running under a different user (normal for Docker), you need to use `sudo` with `python3 pyheap_dump ...`.\n\nPyHeap dumper will automatically transfer the heap file from the target namespace to the specified location.\n\n### Browser-Based UI\n\nThe browser-based PyHeap UI is a convenient way to explore heap dumps. It can show threads and the objects with the most retained heap. 
It allows exploring individual objects as well.\n\n![Thread view](doc/screenshot1.png)\n\n\u003cdetails\u003e\n  \u003csummary\u003eMore screenshots\u003c/summary\u003e\n\n![Heap view](doc/screenshot2.png)\n\n![Object view - Attributes](doc/screenshot3.png)\n\n![Object view - Referents](doc/screenshot4.png)\n\n\u003c/details\u003e\n\n#### Running with Docker\n\nRunning the PyHeap UI with Docker is simple:\n\n```bash\ndocker run -it --rm \\\n  --userns=host --user=$(id -u):$(id -g) \\\n  -v ${PWD}:/pyheap-workdir \\\n  --name pyheap-ui -p 5000:5000 \\\n  ivanyu/pyheap-ui:latest \\\n  heap.pyheap\n```\nand open [http://127.0.0.1:5000](http://127.0.0.1:5000).\n\nYou can replace `latest` with a release version.\n\nThe images are published on [Docker Hub](https://hub.docker.com/repository/docker/ivanyu/pyheap-ui).\n\n#### Running as a Python Program\n\nYou need a Python installation with Flask to run it. There's a Poetry environment for your convenience in [pyheap-ui/](pyheap-ui/).\n\nTo view the heap dump with the browser-based UI, go to [pyheap-ui/](pyheap-ui/) and run:\n```bash\nPYTHONPATH=src poetry run python -m pyheap_ui --file heap.pyheap\n```\nand open [http://127.0.0.1:5000](http://127.0.0.1:5000).\n\n### Command-Line Heap Analyzer\n\n\u003cdetails\u003e\n  \u003csummary\u003eIn case you cannot use the browser-based UI\u003c/summary\u003e\n\nAnalyze the heap with the `analyzer` module:\n```bash\n$ PYTHONPATH=src poetry run python -m analyzer retained-heap --file heap.pyheap\n\n[2022-09-07 09:40:46,594] INFO Loading file heap.json.gz\n[2022-09-07 09:40:46,633] INFO Loading file finished in 0.04 seconds\n[2022-09-07 09:40:46,633] INFO Heap dump contains 18269 objects\n[2022-09-07 09:40:46,646] INFO 1761 unknown objects filtered\n[2022-09-07 09:40:46,681] INFO Indexing inbound references\n[2022-09-07 09:40:46,695] INFO Inbound references indexed in 0.01 seconds\n[2022-09-07 09:40:46,701] INFO Loaded retained heap cache\n  
heap.json.gz.ce7ade900911c6edac5fe332a36d43d0a76ac103.retained_heap\nAddress         | Object type     | Retained heap size | String representation  \n--------------------------------------------------------------------------------\n140494124474176 | dict            |            1101494 | {'__name__': '__main__'\n140494121988112 | str             |            1000049 | xxxxxxxxxxxxxxxxxxxxxxx\n140494125217792 | list            |             100113 | ['xxxxxxxxxxxxxxxxxxxxx\n94613255597520  | str             |             100049 | xxxxxxxxxxxxxxxxxxxxxxx\n140494126265024 | dict            |              89546 | {'/usr/lib/python310.zi\n140494124519104 | dict            |              70465 | {'__name__': 'os', '__d\n140494123404608 | dict            |              64157 | {'__name__': 'typing', \n140494126265984 | dict            |              35508 | {'__name__': 'builtins'\n140494125686720 | dict            |              32920 | {94613227788704: \u003cweakr\n94613255487824  | ABCMeta         |              32790 | \u003cclass 'collections.Use\n140494125072000 | dict            |              31566 | {'__module__': 'collect\n140494124621856 | _Printer        |              28111 | Type license() to see t\n140494124550272 | dict            |              28063 | {'_Printer__name': 'lic\n140494105358656 | list            |              27229 | ['A. 
HISTORY OF THE SOF\n140494125744640 | frozenset       |              25447 | frozenset({'_curses', '\n140494124629056 | FileFinder      |              22804 | FileFinder('/usr/lib/py\n140494124679104 | dict            |              22756 | {'_loaders': [('.cpytho\n...\n```\n(in the repo root directory).\n\u003c/details\u003e\n\n## How It Works\n\nPyHeap uses GDB to attach to a running CPython process.\n\nAfter the debugger is attached, a breakpoint is set at the [`_PyEval_EvalFrameDefault`](https://github.com/python/cpython/blob/3594ebca2cacf5d9b5212d2c487fd017cd00e283/Python/ceval.c#L1577) function inside CPython, which CPython calls to execute a Python stack frame. It's a good spot to intervene in CPython's normal operation.\n\nWhen the breakpoint is hit by one of the threads, the Python script `injector.py` is loaded and executed (as the `$dump_python_heap` function) in the context of GDB's own Python interpreter. The main purpose of this script is to make the target CPython process load the `dumper_inferior.py` script and execute it in the context of the target process.\n\nThe dumper script uses the Python standard modules `gc` and `sys` to collect some information about heap objects and their sizes. It takes some care to keep the garbage it creates itself out of the heap dump, but some traces of it will remain.\n\nA dump is not an exact point-in-time snapshot, as some threads and the garbage collector continue working while it is being taken.\n\n### What Objects Are Dumped\n\nCurrently, the dumper sees objects traced by the CPython garbage collector and the objects they reference (more precisely, the ones they report through their [`tp_traverse`](https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_traverse)).\n\n## Development\n\n### Integration Tests\n\nIntegration tests run on CI. However, end-to-end tests that use the real GDB cannot be run in GitHub Actions. 
You can run them locally using\n```bash\nmake clean integration-tests\n```\n\nYou need [pyenv](https://github.com/pyenv/pyenv) with Python 3.8, 3.9, 3.10, 3.11, and 3.12 installed and [Poetry](https://python-poetry.org/).\n\n## License\n\n[Apache License, Version 2.0](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fivanyu%2Fpyheap","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fivanyu%2Fpyheap","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fivanyu%2Fpyheap/lists"}