{"id":13532116,"url":"https://github.com/cloudpipe/cloudpickle","last_synced_at":"2025-05-14T21:00:32.289Z","repository":{"id":30327903,"uuid":"33880218","full_name":"cloudpipe/cloudpickle","owner":"cloudpipe","description":"Extended pickling support for Python objects","archived":false,"fork":false,"pushed_at":"2025-03-25T09:32:53.000Z","size":922,"stargazers_count":1749,"open_issues_count":97,"forks_count":178,"subscribers_count":29,"default_branch":"master","last_synced_at":"2025-05-03T02:11:28.979Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cloudpipe.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-04-13T16:33:00.000Z","updated_at":"2025-05-02T08:51:03.000Z","dependencies_parsed_at":"2023-02-14T17:45:31.719Z","dependency_job_id":"9d470bff-063b-404a-b13a-b37146af06f0","html_url":"https://github.com/cloudpipe/cloudpickle","commit_stats":{"total_commits":389,"total_committers":61,"mean_commits":6.377049180327869,"dds":0.7043701799485862,"last_synced_commit":"6220b0ce83ffee5e47e06770a1ee38ca9e47c850"},"previous_names":[],"tags_count":45,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudpipe%2Fcloudpickle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudpipe%2Fcloudpickle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cloudpipe%2Fcloudpickle/releases","manifests_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/cloudpipe%2Fcloudpickle/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cloudpipe","download_url":"https://codeload.github.com/cloudpipe/cloudpickle/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252835026,"owners_count":21811469,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T07:01:08.315Z","updated_at":"2025-05-07T07:38:31.423Z","avatar_url":"https://github.com/cloudpipe.png","language":"Python","readme":"# cloudpickle\n\n[![Automated Tests](https://github.com/cloudpipe/cloudpickle/workflows/Automated%20Tests/badge.svg?branch=master\u0026event=push)](https://github.com/cloudpipe/cloudpickle/actions)\n[![codecov.io](https://codecov.io/github/cloudpipe/cloudpickle/coverage.svg?branch=master)](https://codecov.io/github/cloudpipe/cloudpickle?branch=master)\n\n`cloudpickle` makes it possible to serialize Python constructs not supported\nby the default `pickle` module from the Python standard library.\n\n`cloudpickle` is especially useful for **cluster computing** where Python\ncode is shipped over the network to execute on remote hosts, possibly close\nto the data.\n\nAmong other things, `cloudpickle` supports pickling for **lambda functions**\nalong with **functions and classes defined interactively** in the\n`__main__` module (for instance in a script, a shell or a Jupyter notebook).\n\nCloudpickle can only be used to send objects between the **exact same version\nof Python**.\n\nUsing `cloudpickle` for **long-term object 
storage is not supported and\nstrongly discouraged.**\n\n**Security notice**: one should **only load pickle data from trusted sources** as\notherwise `pickle.load` can lead to arbitrary code execution resulting in a critical\nsecurity vulnerability.\n\n\nInstallation\n------------\n\nThe latest release of `cloudpickle` is available from\n[pypi](https://pypi.python.org/pypi/cloudpickle):\n\n    pip install cloudpickle\n\n\nExamples\n--------\n\nPickling a lambda expression:\n\n```python\n\u003e\u003e\u003e import cloudpickle\n\u003e\u003e\u003e squared = lambda x: x ** 2\n\u003e\u003e\u003e pickled_lambda = cloudpickle.dumps(squared)\n\n\u003e\u003e\u003e import pickle\n\u003e\u003e\u003e new_squared = pickle.loads(pickled_lambda)\n\u003e\u003e\u003e new_squared(2)\n4\n```\n\nPickling a function interactively defined in a Python shell session\n(in the `__main__` module):\n\n```python\n\u003e\u003e\u003e CONSTANT = 42\n\u003e\u003e\u003e def my_function(data: int) -\u003e int:\n...     return data + CONSTANT\n...\n\u003e\u003e\u003e pickled_function = cloudpickle.dumps(my_function)\n\u003e\u003e\u003e depickled_function = pickle.loads(pickled_function)\n\u003e\u003e\u003e depickled_function\n\u003cfunction __main__.my_function(data:int) -\u003e int\u003e\n\u003e\u003e\u003e depickled_function(43)\n85\n```\n\n\nOverriding pickle's serialization mechanism for importable constructs:\n----------------------------------------------------------------------\n\nAn important difference between `cloudpickle` and `pickle` is that\n`cloudpickle` can serialize a function or class **by value**, whereas `pickle`\ncan only serialize it **by reference**. 
Serialization by reference treats\nfunctions and classes as attributes of modules, and pickles them through\ninstructions that trigger the import of their module at load time.\nSerialization by reference is thus limited in that it assumes that the module\ncontaining the function or class is available/importable in the unpickling\nenvironment. This assumption breaks when pickling constructs defined in an\ninteractive session, a case that is automatically detected by `cloudpickle`,\nwhich then pickles such constructs **by value**.\n\nAnother case where the importability assumption is expected to break is when\ndeveloping a module in a distributed execution environment: the worker\nprocesses may not have access to the said module, for example if they live on a\ndifferent machine than the process in which the module is being developed. By\nitself, `cloudpickle` cannot detect such \"locally importable\" modules and\nswitch to serialization by value; instead, it relies on its default mode, which\nis serialization by reference. 
However, since `cloudpickle 2.0.0`, one can\nexplicitly specify modules for which serialization by value should be used,\nusing the\n`register_pickle_by_value(module)`/`unregister_pickle_by_value(module)` API:\n\n```python\n\u003e\u003e\u003e import cloudpickle\n\u003e\u003e\u003e import my_module\n\u003e\u003e\u003e cloudpickle.register_pickle_by_value(my_module)\n\u003e\u003e\u003e cloudpickle.dumps(my_module.my_function)  # my_function is pickled by value\n\u003e\u003e\u003e cloudpickle.unregister_pickle_by_value(my_module)\n\u003e\u003e\u003e cloudpickle.dumps(my_module.my_function)  # my_function is pickled by reference\n```\n\nUsing this API, there is no need to re-install the new version of the module on\nall the worker nodes nor to restart the workers: restarting the client Python\nprocess with the new source code is enough.\n\nNote that this feature is still **experimental**, and may fail in the following\nsituations:\n\n- If the body of a function/class pickled by value contains an `import` statement:\n  ```python\n  \u003e\u003e\u003e def f():\n  \u003e\u003e\u003e ... from another_module import g\n  \u003e\u003e\u003e ... # calling f in the unpickling environment may fail if another_module\n  \u003e\u003e\u003e ... # is unavailable\n  \u003e\u003e\u003e ... 
return g() + 1\n  ```\n\n- If a function pickled by reference uses a function pickled by value during its execution.\n\n\nRunning the tests\n-----------------\n\n- With `tox`, to run the tests for all the supported versions of\n  Python and PyPy:\n\n      pip install tox\n      tox\n\n  or alternatively for a specific environment:\n\n      tox -e py312\n\n\n- With `pytest`, to only run the tests for your current version of\n  Python:\n\n      pip install -r dev-requirements.txt\n      PYTHONPATH='.:tests' pytest\n\nHistory\n-------\n\n`cloudpickle` was initially developed by [picloud.com](http://web.archive.org/web/20140721022102/http://blog.picloud.com/2013/11/17/picloud-has-joined-dropbox/) and shipped as part of\nthe client SDK.\n\nA copy of `cloudpickle.py` was included as part of PySpark, the Python\ninterface to [Apache Spark](https://spark.apache.org/). Davies Liu, Josh\nRosen, Thom Neale and other Apache Spark developers improved it significantly,\nmost notably to add support for PyPy and Python 3.\n\nThe aim of the `cloudpickle` project is to make that work available to a wider\naudience outside of the Spark ecosystem and to make it easier to improve it\nfurther, notably with the help of a dedicated non-regression test suite.\n","funding_links":[],"categories":["Python","Data Format \u0026 I/O","Data Serialization"],"sub_categories":["For Python"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudpipe%2Fcloudpickle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcloudpipe%2Fcloudpickle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcloudpipe%2Fcloudpickle/lists"}