{"id":18298297,"url":"https://github.com/s3rgeym/sqldump2json","last_synced_at":"2025-04-05T13:33:16.001Z","repository":{"id":199982333,"uuid":"704302875","full_name":"s3rgeym/sqldump2json","owner":"s3rgeym","description":"Converts SQL dump to a JSON stream.","archived":false,"fork":false,"pushed_at":"2024-02-25T02:30:29.000Z","size":963,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-04-25T20:20:56.895Z","etag":null,"topics":["converter","jq","json","jsonl","parser","python","python3","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/s3rgeym.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-13T01:21:45.000Z","updated_at":"2024-06-29T04:25:41.060Z","dependencies_parsed_at":"2023-11-15T06:32:44.453Z","dependency_job_id":"eb094778-f3eb-4137-ad5f-85eccd4cbad7","html_url":"https://github.com/s3rgeym/sqldump2json","commit_stats":null,"previous_names":["s3rgeym/sqldump2json"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3rgeym%2Fsqldump2json","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3rgeym%2Fsqldump2json/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3rgeym%2Fsqldump2json/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/s3rgeym%2Fsqldump2json/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/owners/s3rgeym","download_url":"https://codeload.github.com/s3rgeym/sqldump2json/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223192691,"owners_count":17103564,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["converter","jq","json","jsonl","parser","python","python3","sql"],"created_at":"2024-11-05T15:05:42.739Z","updated_at":"2024-11-05T15:05:43.573Z","avatar_url":"https://github.com/s3rgeym.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# sqldump2json\n\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/sqldump2json)]() [![PyPI - Version](https://img.shields.io/pypi/v/sqldump2json)]() [![Total Downloads](https://static.pepy.tech/badge/sqldump2json)]()\n\nConverts SQL dump to a JSON stream.\n\nA tool for administrators, data scientists and hackers. With this tool you no longer need to import dumps into Databases. You can extract INSERT data as JSON and analyze them with [jq](https://github.com/jqlang/jq) or insert into MongoDB/Elastic/etc. The dump is not read entirely into RAM, so this utility can be used to process files of any size. And it can even parse corrupted dumps. 
No dependencies!\n\nSupported DBMS: MySQL, SQL Server, PostgreSQL and some others (not all formats).\n\nRESTRICTIONS:\n\n- Syntax is checked only for `INSERT INTO` and `CREATE TABLE`.\n- A generic SQL syntax is used, which does not fully correspond to either MySQL or Postgres.\n- Function calls and subqueries in INSERT statements are not supported.\n\nInstallation for normal Arch-based Linux distros:\n\n```bash\n# install pipx\nyay -S python-pipx\n\n# install package from PyPI\npipx install sqldump2json\n\n# install latest version from GitHub\npipx install git+https://github.com/s3rgeym/sqldump2json.git\n```\n\nFor other distros like Ubuntu you need a few more steps:\n\n- Install pyenv or asdf-vm.\n- Install the latest Python version and make it global via pyenv or asdf-vm.\n- Install sqldump2json.\n- Or use Docker.\n\n## CLI\n\nUsage:\n\n```bash\nsqldump2json [ -h ] [ -i INPUT ] [ -o OUTPUT ] [ ... ]\n```\n\nThe output format is JSONL:\n\n```bash\necho \"INSERT INTO db.data VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');\" | sqldump2json\n{\"table\": \"data\", \"schema\": \"db\", \"values\": [1, \"foo\"]}\n{\"table\": \"data\", \"schema\": \"db\", \"values\": [2, \"bar\"]}\n{\"table\": \"data\", \"schema\": \"db\", \"values\": [3, \"baz\"]}\n```\n\nValues are converted to a dict only if the `INSERT INTO` contains a list of fields or the fields are declared in `CREATE TABLE`:\n\n```bash\n$ sqldump2json \u003c\u003c\u003c \"INSERT INTO data VALUES (NULL, 3.14159265, FALSE, 'Привет', 0xDEADBEEF);\" | jq\n{\n  \"table\": \"data\",\n  \"values\": [\n    null,\n    3.14159265,\n    false,\n    \"Привет\",\n    \"3q2+7w==\"\n  ]\n}\n\n$ sqldump2json \u003c\u003c\u003c 'INSERT INTO `page` (title, contents) VALUES (\"Title\", \"Text goes here\");' | jq\n{\n  \"table\": \"page\",\n  \"values\": {\n    \"title\": \"Title\",\n    \"contents\": \"Text goes here\"\n  }\n}\n```\n\nUsing it together with grep:\n\n```bash\ngrep 'INSERT INTO `users`' /path/to/dump.sql | sqldump2json | jq -r '.values | 
[.username, .email, .password] | @tsv' \u003e output.csv\n```\n\n## Scripting\n\nIf you are looking for a way to import data from SQL dumps into NoSQL databases and the like:\n\n```python\n#!/usr/bin/env python\nfrom sqldump2json import DumpParser\n...\nif __name__ == '__main__':\n    parse = DumpParser()\n    for val in parse(\"/path/to/dump.sql\"):\n        do_something(val)\n```\n\n## Development\n\nRun tests:\n\n```bash\npoetry run python -m unittest\n```\n\n## TODO LIST\n\n- Add support for [MySQL strings with charset introducers](https://dev.mysql.com/doc/refman/8.0/en/charset-introducer.html) (e.g., `_binary '\\x00...'`), as well as `X'...'`.\n- Adjacent string literals should be concatenated.\n- Speed up parsing.\n\n## Notes\n\nAfter creating this package, I came across [sqldump-to](https://github.com/arjunmehta/sqldump-to) by chance. That project is abandoned, and that utility CANNOT PARSE DUMPS THAT ARE A GAZILLION GIGABYTES IN SIZE.\n\nI tried to speed up parsing with orjson (a JSON library implemented in Rust), but instead of the advertised 10x speedup I got a slowdown: parsing a 23 GB dump containing 160 million inserts went from 5 hours to 7.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs3rgeym%2Fsqldump2json","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fs3rgeym%2Fsqldump2json","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fs3rgeym%2Fsqldump2json/lists"}