{"id":20940420,"url":"https://github.com/fedora-python/marshalparser","last_synced_at":"2025-05-13T23:30:35.073Z","repository":{"id":40302741,"uuid":"257908925","full_name":"fedora-python/marshalparser","owner":"fedora-python","description":"Simple parser for Python marshal serialization and pyc files","archived":false,"fork":false,"pushed_at":"2024-10-23T18:22:33.000Z","size":9935,"stargazers_count":18,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-30T09:23:52.079Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fedora-python.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-22T13:25:27.000Z","updated_at":"2024-10-23T18:22:37.000Z","dependencies_parsed_at":"2023-01-21T04:16:01.778Z","dependency_job_id":"c340c319-6b85-429a-bbcc-37bf0de8ce92","html_url":"https://github.com/fedora-python/marshalparser","commit_stats":{"total_commits":99,"total_committers":4,"mean_commits":24.75,"dds":"0.11111111111111116","last_synced_commit":"76baf14b788b3cc9661b235b473c99c6f46cb51a"},"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fedora-python%2Fmarshalparser","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fedora-python%2Fmarshalparser/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fedora-python%2Fmarshalparser/releases","manifest
s_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fedora-python%2Fmarshalparser/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fedora-python","download_url":"https://codeload.github.com/fedora-python/marshalparser/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254043215,"owners_count":22004911,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-18T23:10:20.082Z","updated_at":"2025-05-13T23:30:33.017Z","avatar_url":"https://github.com/fedora-python.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Marshal parser\n\n[`marshal`](https://docs.python.org/3/library/marshal.html)\nis an internal Python object serialization which is internally used\nfor serialization of [code objects](https://docs.python.org/3/c-api/code.html) into `*.pyc` files.\n\nIn the foreseeable future, this kinda useless brain exercise should\nhelp me to solve issues with non-reproducible `.pyc` files.\n\n## Installation\n\nMarshal parser is available on PyPI:\n\n```\npip install marshalparser\n```\n\nand as a Fedora RPM package:\n\n```\nsudo dnf install python3-marshalparser\n```\n\n## Parser in action\n\n### Printing parsed content in a human-readable way\n\nThe current version of parser creates a human-readable list of parsed objects\nwith info about bytes where objects start and about their content:\n\n```\n$ python3 -m marshalparser -p test/pure_marshal/list_of_simple_objects.dat\nn=0/0x0 byte=(b'5b', b'[', 0b1011011) TYPE_LIST 
REF[0]\n  tuple/list/set size: 4\n  n=5/0x5 byte=(b'54', b'T', 0b1010100) TYPE_TRUE\n  result=True, type=\u003cclass 'bool'\u003e\n  n=6/0x6 byte=(b'46', b'F', 0b1000110) TYPE_FALSE\n  result=False, type=\u003cclass 'bool'\u003e\n  n=7/0x7 byte=(b'4e', b'N', 0b1001110) TYPE_NONE\n  result=None, type=\u003cclass 'NoneType'\u003e\n  n=8/0x8 byte=(b'2e', b'.', 0b101110) TYPE_ELLIPSIS\n  result=Ellipsis, type=\u003cclass 'ellipsis'\u003e\nresult=[True, False, None, Ellipsis], type=\u003cclass 'list'\u003e\n```\n\nThe same for `.pyc` files but they are more complex as they contain code objects which are reported as dictionaries:\n\n```\n$ python3 -m marshalparser -p test/python_stdlib/3.9/doctest.cpython-39.opt-1.pyc | head -n 30\nn=16/0x10 byte=(b'63', b'c', 0b1100011) TYPE_CODE REF[0]\n  n=41/0x29 byte=(b'73', b's', 0b1110011) TYPE_STRING\n  result=b'd\\x00Z\\x00d\\x01Z\\x01g\\x00d\\x02\\xa2\\x01Z\\x02d\\x03d\\x04l\\x03Z\\x03d\\x03d\\x04l\\x04Z\\x04d\\x03d\\x04l\\x05Z\\x05d\\x03d\\x04l\\x06Z\\x06d\\x03d\\x04l\\x07Z\\x07d\\x03d\\x04l\\x08Z\\x08d\\x03d\\x04l\\tZ\\td\\x03d\\x04l\\nZ\\nd\\x03d\\x04l\\x0bZ\\x0bd\\x03d\\x04l\\x0cZ\\x0cd\\x03d\\x05l\\rm\\x0eZ\\x0e\\x01\\x00d\\x03d\\x06l\\x0fm\\x10Z\\x10\\x01\\x00e\\x10d\\x07d\\x08\\x83\\x02Z\\x11i\\x00Z\\x12d\\td\\n\\x84\\x00Z\\x13e\\x13d\\x0b\\x83\\x01Z\\x14e\\x13d\\x0c\\x83\\x01Z\\x15e\\x13d\\r\\x83\\x01Z\\x16e\\x13d\\x0e\\x83\\x01Z\\x17e\\x13d\\x0f\\x83\\x01Z\\x18e\\x13d\\x10\\x83\\x01Z\\x19e\\x14e\\x15B\\x00e\\x16B\\x00e\\x17B\\x00e\\x18B\\x00e\\x19B\\x00Z\\x1ae\\x13d\\x11\\x83\\x01Z\\x1be\\x13d\\x12\\x83\\x01Z\\x1ce\\x13d\\x13\\x83\\x01Z\\x1de\\x13d\\x14\\x83\\x01Z\\x1ee\\x13d\\x15\\x83\\x01Z\\x1fe\\x1be\\x1cB\\x00e\\x1dB\\x00e\\x1eB\\x00e\\x1fB\\x00Z d\\x16Z!d\\x17Z\"d\\x18d\\x19\\x84\\x00Z#drd\\x1bd\\x1c\\x84\\x01Z$d\\x1dd\\x1e\\x84\\x00Z%d\\x1fd 
\\x84\\x00Z\u0026dsd\"d#\\x84\\x01Z\\'d$d%\\x84\\x00Z(G\\x00d\u0026d\\'\\x84\\x00d\\'e\\x0e\\x83\\x03Z)d(d)\\x84\\x00Z*d*d+\\x84\\x00Z+d,d-\\x84\\x00Z,G\\x00d.d/\\x84\\x00d/e\\x08j-\\x83\\x03Z.d0d1\\x84\\x00Z/G\\x00d2d3\\x84\\x00d3\\x83\\x02Z0G\\x00d4d5\\x84\\x00d5\\x83\\x02Z1G\\x00d6d7\\x84\\x00d7\\x83\\x02Z2G\\x00d8d9\\x84\\x00d9\\x83\\x02Z3G\\x00d:d;\\x84\\x00d;\\x83\\x02Z4G\\x00d\u003cd=\\x84\\x00d=\\x83\\x02Z5G\\x00d\u003ed?\\x84\\x00d?e6\\x83\\x03Z7G\\x00d@dA\\x84\\x00dAe6\\x83\\x03Z8G\\x00dBdC\\x84\\x00dCe4\\x83\\x03Z9d\\x04a:dtdFdG\\x84\\x01Z;dDd\\x04d\\x04d\\x04d\\x04dDd\\x03d\\x04dEe2\\x83\\x00d\\x04f\\x0bdHdI\\x84\\x01Z\u003cdudKdL\\x84\\x01Z=d\\x03a\u003edMdN\\x84\\x00Z?G\\x00dOdP\\x84\\x00dPe\\x0cj@\\x83\\x03ZAG\\x00dQdR\\x84\\x00dReA\\x83\\x03ZBG\\x00dSdT\\x84\\x00dTe\\x0cjC\\x83\\x03ZDdvdUdV\\x84\\x01ZEG\\x00dWdX\\x84\\x00dXeA\\x83\\x03ZFdDd\\x04d\\x04e2\\x83\\x00d\\x04f\\x05dYdZ\\x84\\x01ZGd[d\\\\\\x84\\x00ZHd]d^\\x84\\x00ZId_d`\\x84\\x00ZJdwdadb\\x84\\x01ZKdxdcdd\\x84\\x01ZLdydedf\\x84\\x01ZMG\\x00dgdh\\x84\\x00dh\\x83\\x02ZNeNdidjdkdldmdn\\x9c\\x06ZOdodp\\x84\\x00ZPeQdqk\\x02\\x90\\x03r2e\\n\\xa0ReP\\x83\\x00\\xa1\\x01\\x01\\x00d\\x04S\\x00', type=\u003cclass 'bytes'\u003e\n  n=868/0x364 byte=(b'29', b')', 0b101001) TYPE_SMALL_TUPLE\n    Small tuple size: 122\n    n=870/0x366 byte=(b'61', b'a', 0b1100001) TYPE_ASCII\n    result=b'Module doctest -- a framework for running examples in docstrings.\\n\\nIn simplest use, end each module M to be tested with:\\n\\ndef _test():\\n    import doctest\\n    doctest.testmod()\\n\\nif __name__ == \"__main__\":\\n    _test()\\n\\nThen running the module as a script will cause the examples in the\\ndocstrings to get executed and verified:\\n\\npython M.py\\n\\nThis won\\'t display anything unless an example fails, in which case the\\nfailing example(s) and the cause(s) of the failure(s) are printed to stdout\\n(why not stderr? 
because stderr is a lame hack \u003c0.2 wink\u003e), and the final\\nline of output is \"Test failed.\".\\n\\nRun it with the -v switch instead:\\n\\npython M.py -v\\n\\nand a detailed report of all examples tried is printed to stdout, along\\nwith assorted summaries at the end.\\n\\nYou can force verbose mode by passing \"verbose=True\" to testmod, or prohibit\\nit by passing \"verbose=False\".  In either of those cases, sys.argv is not\\nexamined by testmod.\\n\\nThere are a variety of other ways to run doctests, including integration\\nwith the unittest framework, and support for running non-Python text\\nfiles containing doctests.  There are also many ways to override parts\\nof doctest\\'s default behaviors.  See the Library Reference Manual for\\ndetails.\\n', type=\u003cclass 'bytes'\u003e\n    n=2096/0x830 byte=(b'7a', b'z', 0b1111010) TYPE_SHORT_ASCII\n    result=b'reStructuredText en', type=\u003cclass 'bytes'\u003e\n    n=2117/0x845 byte=(b'29', b')', 0b101001) TYPE_SMALL_TUPLE\n      … etc …\n```\n\n### Unused `FLAG_REF`s\n\nNewer versions of the parser can also list unused `FLAG_REF`s: objects that are flagged as referenceable\nbut are never actually referenced.\n\nWe use the same example as before, so you can try to find the unused `FLAG_REF`\nmanually in the first listing on this page.\n\n```\n$ python3 -m marshalparser -u test/pure_marshal/list_of_simple_objects.dat\nUnused FLAG_REFs:\n0 - Flag_ref(byte=0, type='TYPE_LIST', content=[True, False, None, Ellipsis], usages=0)\n```\n\nIf we can detect it, we can also fix it. 
With the `-f` option, Marshal parser produces a new\nfile where all unused `FLAG_REF`s are removed and all useful references are recalculated.\n\n```\n# Fix it\n$ python3 -m marshalparser -f test/pure_marshal/list_of_simple_objects.dat\n# Check the fixed file\n$ python3 -m marshalparser -u test/pure_marshal/list_of_simple_objects.fixed.dat\n# Print it\n$ python3 -m marshalparser -p test/pure_marshal/list_of_simple_objects.fixed.dat\nn=0/0x0 byte=(b'5b', b'[', 0b1011011) TYPE_LIST\n  tuple/list/set size: 4\n  n=5/0x5 byte=(b'54', b'T', 0b1010100) TYPE_TRUE\n  result=True, type=\u003cclass 'bool'\u003e\n  n=6/0x6 byte=(b'46', b'F', 0b1000110) TYPE_FALSE\n  result=False, type=\u003cclass 'bool'\u003e\n  n=7/0x7 byte=(b'4e', b'N', 0b1001110) TYPE_NONE\n  result=None, type=\u003cclass 'NoneType'\u003e\n  n=8/0x8 byte=(b'2e', b'.', 0b101110) TYPE_ELLIPSIS\n  result=Ellipsis, type=\u003cclass 'ellipsis'\u003e\nresult=[True, False, None, Ellipsis], type=\u003cclass 'list'\u003e\n```\n\nIt's also possible to overwrite the existing file with `-fo`.\n\n## Tests\n\nTests use pytest, and `/test/python_stdlib/3.X` contains around a hundred random `pyc` files from the Python standard library\n(taken from the python3-libs, python36, etc. RPM packages in Fedora) for each supported Python version.\n\nTests check that the parser can parse and fix a `pyc` file, and then that the unmarshaled code object is the same\nin both files (original and fixed).\n\nThe tests ensure that MarshalParser running (for example) with Python 3.9 is able to parse and fix pyc files for other supported\nPython versions. 
However, checking whether the original and fixed pyc files hold the same content requires running `marshal_content_check.py`\nwith the Python version that compiled the files.\n\n## Python support\n\nThe code is tested with Python 3.6+ and it's also able to fix pyc files produced by Python 3.6+.\nRunning the parser on Python 3.6 itself requires the [`dataclasses`](https://pypi.org/project/dataclasses/) backport.\n\n## Supported object types\n\n* ✓ TYPE_NULL (as a null terminator for TYPE_DICT)\n* ✓ TYPE_NONE\n* ✓ TYPE_FALSE\n* ✓ TYPE_TRUE\n* ✓ TYPE_STOPITER\n* ✓ TYPE_ELLIPSIS\n* ✓ TYPE_INT\n* ✘ TYPE_INT64 (is not generated anymore)\n* ✘ TYPE_FLOAT (only in marshal version 1)\n* ✓ TYPE_BINARY_FLOAT\n* ✘ TYPE_COMPLEX (only in marshal version 1)\n* ✓ TYPE_BINARY_COMPLEX\n* ✓ TYPE_LONG (Parsed to digits but not reconstructed to PyLong)\n* ✓ TYPE_STRING\n* ✓ TYPE_INTERNED\n* ✓ TYPE_REF\n* ✓ TYPE_TUPLE\n* ✓ TYPE_LIST\n* ✓ TYPE_DICT\n* ✓ TYPE_CODE\n* ✓ TYPE_UNICODE\n* ? TYPE_UNKNOWN (no idea how to test unknown bytes-like objects as a bytes object)\n* ✓ TYPE_SET\n* ✓ TYPE_FROZENSET\n* ✓ TYPE_SLICE\n* ✓ FLAG_REF (recognized as a flag for all complex types)\n* ✓ TYPE_ASCII\n* ✓ TYPE_ASCII_INTERNED\n* ✓ TYPE_SMALL_TUPLE\n* ✓ TYPE_SHORT_ASCII\n* ✓ TYPE_SHORT_ASCII_INTERNED\n\n## References\n\n* [distutils is not reproducible](https://bugs.python.org/issue34033)\n* [python-3.6 packages do not build reproducibly](https://bugzilla.opensuse.org/show_bug.cgi?id=1049186)\n* [PEP 552](https://www.python.org/dev/peps/pep-0552/)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffedora-python%2Fmarshalparser","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffedora-python%2Fmarshalparser","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffedora-python%2Fmarshalparser/lists"}