{"id":13532184,"url":"https://github.com/explosion/srsly","last_synced_at":"2025-10-08T18:30:18.967Z","repository":{"id":34059545,"uuid":"159904634","full_name":"explosion/srsly","owner":"explosion","description":"🦉 Modern high-performance serialization utilities for Python (JSON, MessagePack, Pickle)","archived":false,"fork":false,"pushed_at":"2025-01-16T23:10:20.000Z","size":829,"stargazers_count":461,"open_issues_count":7,"forks_count":32,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-04-10T00:07:06.104Z","etag":null,"topics":["json","msgpack","pickle","python","python-2","python-3","serialization","ujson","yaml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/explosion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-12-01T03:21:56.000Z","updated_at":"2025-04-09T16:08:13.000Z","dependencies_parsed_at":"2023-01-15T04:17:59.282Z","dependency_job_id":"f0c00d5d-eaa6-403b-9b2a-b531169b8c1f","html_url":"https://github.com/explosion/srsly","commit_stats":{"total_commits":264,"total_committers":16,"mean_commits":16.5,"dds":0.4734848484848485,"last_synced_commit":"e621b2b6891823951a2b09e0d8b1e30ead218cee"},"previous_names":[],"tags_count":37,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fsrsly","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fsrsly/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explo
sion%2Fsrsly/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fsrsly/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/explosion","download_url":"https://codeload.github.com/explosion/srsly/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251520147,"owners_count":21602436,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["json","msgpack","pickle","python","python-2","python-3","serialization","ujson","yaml"],"created_at":"2024-08-01T07:01:08.853Z","updated_at":"2025-10-08T18:30:13.932Z","avatar_url":"https://github.com/explosion.png","language":"Python","readme":"\u003ca href=\"https://explosion.ai\"\u003e\u003cimg src=\"https://explosion.ai/assets/img/logo.svg\" width=\"125\" height=\"125\" align=\"right\" /\u003e\u003c/a\u003e\n\n# srsly: Modern high-performance serialization utilities for Python\n\nThis package bundles some of the best Python serialization libraries into one\nstandalone package, with a high-level API that makes it easy to write code\nthat's correct across platforms and Pythons. This allows us to provide all the\nserialization utilities we need in a single binary wheel. 
Currently supports\n**JSON**, **JSONL**, **MessagePack**, **Pickle** and **YAML**.\n\n[![tests](https://github.com/explosion/srsly/actions/workflows/tests.yml/badge.svg)](https://github.com/explosion/srsly/actions/workflows/tests.yml)\n[![PyPi](https://img.shields.io/pypi/v/srsly.svg?style=flat-square\u0026logo=pypi\u0026logoColor=white)](https://pypi.python.org/pypi/srsly)\n[![conda](https://img.shields.io/conda/vn/conda-forge/srsly.svg?style=flat-square\u0026logo=conda-forge\u0026logoColor=white)](https://anaconda.org/conda-forge/srsly)\n[![GitHub](https://img.shields.io/github/release/explosion/srsly/all.svg?style=flat-square\u0026logo=github)](https://github.com/explosion/srsly)\n[![Python wheels](https://img.shields.io/badge/wheels-%E2%9C%93-4c1.svg?longCache=true\u0026style=flat-square\u0026logo=python\u0026logoColor=white)](https://github.com/explosion/wheelwright/releases)\n\n## Motivation\n\nSerialization is hard, especially across Python versions and multiple platforms.\nAfter dealing with many subtle bugs over the years (encodings, locales, large\nfiles) our libraries like [spaCy](https://github.com/explosion/spaCy) and\n[Prodigy](https://prodi.gy) had steadily grown a number of utility functions to\nwrap the multiple serialization formats we need to support (especially `json`,\n`msgpack` and `pickle`). These wrapping functions ended up duplicated across our\ncodebases, so we wanted to put them in one place.\n\nAt the same time, we noticed that having a lot of small dependencies was making\nmaintenance harder, and making installation slower. To solve this, we've made\n`srsly` standalone, by including the component packages directly within it. 
This\nway we can provide all the serialization utilities we need in a single binary\nwheel.\n\n`srsly` currently includes forks of the following packages:\n\n- [`ujson`](https://github.com/esnme/ultrajson)\n- [`msgpack`](https://github.com/msgpack/msgpack-python)\n- [`msgpack-numpy`](https://github.com/lebedov/msgpack-numpy)\n- [`cloudpickle`](https://github.com/cloudpipe/cloudpickle)\n- [`ruamel.yaml`](https://github.com/pycontribs/ruamel-yaml) (without unsafe\n  implementations!)\n\n## Installation\n\n\u003e ⚠️ Note that `v2.x` is only compatible with **Python 3.6+**. For 2.7+\n\u003e compatibility, use `v1.x`.\n\n`srsly` can be installed from pip. Before installing, make sure that your `pip`,\n`setuptools` and `wheel` are up to date.\n\n```bash\npython -m pip install -U pip setuptools wheel\npython -m pip install srsly\n```\n\nOr from conda via conda-forge:\n\n```bash\nconda install -c conda-forge srsly\n```\n\nAlternatively, you can also compile the library from source. You'll need to make\nsure that you have a development environment with a Python distribution\nincluding header files, a compiler (XCode command-line tools on macOS / OS X or\nVisual C++ build tools on Windows), pip and git installed.\n\nInstall from source:\n\n```bash\n# clone the repo\ngit clone https://github.com/explosion/srsly\ncd srsly\n\n# create a virtual environment\npython -m venv .env\nsource .env/bin/activate\n\n# update pip\npython -m pip install -U pip setuptools wheel\n\n# compile and install from source\npython -m pip install .\n```\n\nFor developers, install requirements separately and then install in editable\nmode without build isolation:\n\n```bash\n# install in editable mode\npython -m pip install -r requirements.txt\npython -m pip install --no-build-isolation --editable .\n\n# run test suite\npython -m pytest --pyargs srsly\n```\n\n## API\n\n### JSON\n\n\u003e 📦 The underlying module is exposed via `srsly.ujson`. 
However, we normally\n\u003e interact with it via the utility functions only.\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.json_dumps`\n\nSerialize an object to a JSON string. Falls back to `json` if `sort_keys=True`\nis used (until it's fixed in `ujson`).\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\njson_string = srsly.json_dumps(data)\n```\n\n| Argument    | Type | Description                                            |\n| ----------- | ---- | ------------------------------------------------------ |\n| `data`      | -    | The JSON-serializable data to output.                  |\n| `indent`    | int  | Number of spaces used to indent JSON. Defaults to `0`. |\n| `sort_keys` | bool | Sort dictionary keys. Defaults to `False`.             |\n| **RETURNS** | str  | The serialized string.                                 |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.json_loads`\n\nDeserialize unicode or bytes to a Python object.\n\n```python\ndata = '{\"foo\": \"bar\", \"baz\": 123}'\nobj = srsly.json_loads(data)\n```\n\n| Argument    | Type        | Description                     |\n| ----------- | ----------- | ------------------------------- |\n| `data`      | str / bytes | The data to deserialize.        |\n| **RETURNS** | -           | The deserialized Python object. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_json`\n\nCreate a JSON file and dump contents or write to standard output.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nsrsly.write_json(\"/path/to/file.json\", data)\n```\n\n| Argument | Type         | Description                                            |\n| -------- | ------------ | ------------------------------------------------------ |\n| `path`   | str / `Path` | The file path or `\"-\"` to write to stdout.             |\n| `data`   | -            | The JSON-serializable data to output.                  |\n| `indent` | int          | Number of spaces used to indent JSON. Defaults to `2`. 
|\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_json`\n\nLoad JSON from a file or standard input.\n\n```python\ndata = srsly.read_json(\"/path/to/file.json\")\n```\n\n| Argument    | Type         | Description                                |\n| ----------- | ------------ | ------------------------------------------ |\n| `path`      | str / `Path` | The file path or `\"-\"` to read from stdin. |\n| **RETURNS** | dict / list  | The loaded JSON content.                   |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_gzip_json`\n\nCreate a gzipped JSON file and dump contents.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nsrsly.write_gzip_json(\"/path/to/file.json.gz\", data)\n```\n\n| Argument | Type         | Description                                            |\n| -------- | ------------ | ------------------------------------------------------ |\n| `path`   | str / `Path` | The file path.                                         |\n| `data`   | -            | The JSON-serializable data to output.                  |\n| `indent` | int          | Number of spaces used to indent JSON. Defaults to `2`. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_gzip_jsonl`\n\nCreate a gzipped JSONL file and dump contents.\n\n```python\ndata = [{\"foo\": \"bar\"}, {\"baz\": 123}]\nsrsly.write_gzip_jsonl(\"/path/to/file.jsonl.gz\", data)\n```\n\n| Argument          | Type         | Description                                                                                                                                                                                                             |\n| ----------------- | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |\n| `path`            | str / `Path` | The file path.                                 
                                                                                                                                                                         |\n| `lines`           | -            | The JSON-serializable contents of each line.                                                                                                                                                                            |\n| `append`          | bool         | Whether or not to append to the location. Appending to .gz files is generally not recommended, as it doesn't allow the algorithm to take advantage of all data when compressing - files may hence be poorly compressed. |\n| `append_new_line` | bool         | Whether or not to write a new line before appending to the file.                                                                                                                                                        |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_gzip_json`\n\nLoad gzipped JSON from a file.\n\n```python\ndata = srsly.read_gzip_json(\"/path/to/file.json.gz\")\n```\n\n| Argument    | Type         | Description              |\n| ----------- | ------------ | ------------------------ |\n| `path`      | str / `Path` | The file path.           |\n| **RETURNS** | dict / list  | The loaded JSON content. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_gzip_jsonl`\n\nLoad gzipped JSONL from a file.\n\n```python\ndata = srsly.read_gzip_jsonl(\"/path/to/file.jsonl.gz\")\n```\n\n| Argument    | Type         | Description               |\n| ----------- | ------------ | ------------------------- |\n| `path`      | str / `Path` | The file path.            |\n| **RETURNS** | dict / list  | The loaded JSONL content. 
|\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_jsonl`\n\nCreate a JSONL file (newline-delimited JSON) and dump contents line by line, or\nwrite to standard output.\n\n```python\ndata = [{\"foo\": \"bar\"}, {\"baz\": 123}]\nsrsly.write_jsonl(\"/path/to/file.jsonl\", data)\n```\n\n| Argument          | Type         | Description                                                                                                            |\n| ----------------- | ------------ | ---------------------------------------------------------------------------------------------------------------------- |\n| `path`            | str / `Path` | The file path or `\"-\"` to write to stdout.                                                                             |\n| `lines`           | iterable     | The JSON-serializable lines.                                                                                           |\n| `append`          | bool         | Append to an existing file. Will open it in `\"a\"` mode and insert a newline before writing lines. Defaults to `False`. |\n| `append_new_line` | bool         | Defines whether a new line should first be written when appending to an existing file. Defaults to `True`.             |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_jsonl`\n\nRead a JSONL file (newline-delimited JSON) or JSONL data from standard\ninput and yield contents line by line. Blank lines will always be skipped.\n\n```python\ndata = srsly.read_jsonl(\"/path/to/file.jsonl\")\n```\n\n| Argument   | Type       | Description                                                          |\n| ---------- | ---------- | -------------------------------------------------------------------- |\n| `path`     | str / Path | The file path or `\"-\"` to read from stdin.                           |\n| `skip`     | bool       | Skip broken lines and don't raise `ValueError`. Defaults to `False`. 
|\n| **YIELDS** | -          | The loaded JSON contents of each line.                               |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.is_json_serializable`\n\nCheck if a Python object is JSON-serializable.\n\n```python\nassert srsly.is_json_serializable({\"hello\": \"world\"}) is True\nassert srsly.is_json_serializable(lambda x: x) is False\n```\n\n| Argument    | Type | Description                              |\n| ----------- | ---- | ---------------------------------------- |\n| `obj`       | -    | The object to check.                     |\n| **RETURNS** | bool | Whether the object is JSON-serializable. |\n\n### msgpack\n\n\u003e 📦 The underlying module is exposed via `srsly.msgpack`. However, we normally\n\u003e interact with it via the utility functions only.\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.msgpack_dumps`\n\nSerialize an object to a msgpack byte string.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nmsg = srsly.msgpack_dumps(data)\n```\n\n| Argument    | Type  | Description            |\n| ----------- | ----- | ---------------------- |\n| `data`      | -     | The data to serialize. |\n| **RETURNS** | bytes | The serialized bytes.  |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.msgpack_loads`\n\nDeserialize msgpack bytes to a Python object.\n\n```python\nmsg = b\"\\x82\\xa3foo\\xa3bar\\xa3baz{\"\ndata = srsly.msgpack_loads(msg)\n```\n\n| Argument    | Type  | Description                                                                             |\n| ----------- | ----- | --------------------------------------------------------------------------------------- |\n| `data`      | bytes | The data to deserialize.                                                                |\n| `use_list`  | bool  | Don't use tuples instead of lists. Can make deserialization slower. Defaults to `True`. |\n| **RETURNS** | -     | The deserialized Python object.                                                        
 |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_msgpack`\n\nCreate a msgpack file and dump contents.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nsrsly.write_msgpack(\"/path/to/file.msg\", data)\n```\n\n| Argument | Type         | Description            |\n| -------- | ------------ | ---------------------- |\n| `path`   | str / `Path` | The file path.         |\n| `data`   | -            | The data to serialize. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_msgpack`\n\nLoad a msgpack file.\n\n```python\ndata = srsly.read_msgpack(\"/path/to/file.msg\")\n```\n\n| Argument    | Type         | Description                                                                             |\n| ----------- | ------------ | --------------------------------------------------------------------------------------- |\n| `path`      | str / `Path` | The file path.                                                                          |\n| `use_list`  | bool         | Don't use tuples instead of lists. Can make deserialization slower. Defaults to `True`. |\n| **RETURNS** | -            | The loaded and deserialized content.                                                    |\n\n### pickle\n\n\u003e 📦 The underlying module is exposed via `srsly.cloudpickle`. However, we\n\u003e normally interact with it via the utility functions only.\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.pickle_dumps`\n\nSerialize a Python object with pickle.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\npickled_data = srsly.pickle_dumps(data)\n```\n\n| Argument    | Type  | Description                                            |\n| ----------- | ----- | ------------------------------------------------------ |\n| `data`      | -     | The object to serialize.                               |\n| `protocol`  | int   | Protocol to use. `-1` for highest. Defaults to `None`. |\n| **RETURNS** | bytes | The serialized object.                                 
|\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.pickle_loads`\n\nDeserialize bytes with pickle.\n\n```python\npickled_data = b\"\\x80\\x04\\x95\\x19\\x00\\x00\\x00\\x00\\x00\\x00\\x00}\\x94(\\x8c\\x03foo\\x94\\x8c\\x03bar\\x94\\x8c\\x03baz\\x94K{u.\"\ndata = srsly.pickle_loads(pickled_data)\n```\n\n| Argument    | Type  | Description                     |\n| ----------- | ----- | ------------------------------- |\n| `data`      | bytes | The data to deserialize.        |\n| **RETURNS** | -     | The deserialized Python object. |\n\n### YAML\n\n\u003e 📦 The underlying module is exposed via `srsly.ruamel_yaml`. However, we\n\u003e normally interact with it via the utility functions only.\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.yaml_dumps`\n\nSerialize an object to a YAML string. See the\n[`ruamel.yaml` docs](https://yaml.readthedocs.io/en/latest/detail.html?highlight=indentation#indentation-of-block-sequences)\nfor details on the indentation format.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nyaml_string = srsly.yaml_dumps(data)\n```\n\n| Argument          | Type | Description                                |\n| ----------------- | ---- | ------------------------------------------ |\n| `data`            | -    | The JSON-serializable data to output.      |\n| `indent_mapping`  | int  | Mapping indentation. Defaults to `2`.      |\n| `indent_sequence` | int  | Sequence indentation. Defaults to `4`.     |\n| `indent_offset`   | int  | Indentation offset. Defaults to `2`.       |\n| `sort_keys`       | bool | Sort dictionary keys. Defaults to `False`. |\n| **RETURNS**       | str  | The serialized string.                     
|\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.yaml_loads`\n\nDeserialize unicode or a file object to a Python object.\n\n```python\ndata = 'foo: bar\\nbaz: 123'\nobj = srsly.yaml_loads(data)\n```\n\n| Argument    | Type       | Description                     |\n| ----------- | ---------- | ------------------------------- |\n| `data`      | str / file | The data to deserialize.        |\n| **RETURNS** | -          | The deserialized Python object. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.write_yaml`\n\nCreate a YAML file and dump contents or write to standard output.\n\n```python\ndata = {\"foo\": \"bar\", \"baz\": 123}\nsrsly.write_yaml(\"/path/to/file.yml\", data)\n```\n\n| Argument          | Type         | Description                                |\n| ----------------- | ------------ | ------------------------------------------ |\n| `path`            | str / `Path` | The file path or `\"-\"` to write to stdout. |\n| `data`            | -            | The JSON-serializable data to output.      |\n| `indent_mapping`  | int          | Mapping indentation. Defaults to `2`.      |\n| `indent_sequence` | int          | Sequence indentation. Defaults to `4`.     |\n| `indent_offset`   | int          | Indentation offset. Defaults to `2`.       |\n| `sort_keys`       | bool         | Sort dictionary keys. Defaults to `False`. |\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.read_yaml`\n\nLoad YAML from a file or standard input.\n\n```python\ndata = srsly.read_yaml(\"/path/to/file.yml\")\n```\n\n| Argument    | Type         | Description                                |\n| ----------- | ------------ | ------------------------------------------ |\n| `path`      | str / `Path` | The file path or `\"-\"` to read from stdin. |\n| **RETURNS** | dict / list  | The loaded YAML content.                   
|\n\n#### \u003ckbd\u003efunction\u003c/kbd\u003e `srsly.is_yaml_serializable`\n\nCheck if a Python object is YAML-serializable.\n\n```python\nassert srsly.is_yaml_serializable({\"hello\": \"world\"}) is True\nassert srsly.is_yaml_serializable(lambda x: x) is False\n```\n\n| Argument    | Type | Description                              |\n| ----------- | ---- | ---------------------------------------- |\n| `obj`       | -    | The object to check.                     |\n| **RETURNS** | bool | Whether the object is YAML-serializable. |\n","funding_links":[],"categories":["Python","Language specific","Data Serialization"],"sub_categories":["Python"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fsrsly","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexplosion%2Fsrsly","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fsrsly/lists"}