{"id":13489685,"url":"https://github.com/juancarlospaco/faster-than-csv","last_synced_at":"2025-03-17T08:37:56.608Z","repository":{"id":51312785,"uuid":"157566267","full_name":"juancarlospaco/faster-than-csv","owner":"juancarlospaco","description":"Faster CSV for Python","archived":false,"fork":false,"pushed_at":"2022-01-18T17:59:04.000Z","size":16221,"stargazers_count":101,"open_issues_count":1,"forks_count":8,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-16T10:23:29.839Z","etag":null,"topics":["csv","csv-data","csv-parser","csv-parsing","csv-to-html","csv-to-json","cython","faster-than-csv","process-csv","python","python3","speed","speedup","static-memory-allocation","static-typing","tabular-data","tsv","tsv-parser","type-safe"],"latest_commit_sha":null,"homepage":"https://juancarlospaco.github.io/faster-than-csv","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/juancarlospaco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"custom":["https://gist.github.com/juancarlospaco/37da34ed13a609663f55f4466c4dbc3e"]}},"created_at":"2018-11-14T15:06:07.000Z","updated_at":"2025-03-12T03:56:27.000Z","dependencies_parsed_at":"2022-09-18T08:31:26.770Z","dependency_job_id":null,"html_url":"https://github.com/juancarlospaco/faster-than-csv","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juancarlospaco%2Ffaster-than-csv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juancarlospaco%2Ffaster-than-csv/t
ags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juancarlospaco%2Ffaster-than-csv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/juancarlospaco%2Ffaster-than-csv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/juancarlospaco","download_url":"https://codeload.github.com/juancarlospaco/faster-than-csv/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244001685,"owners_count":20381820,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","csv-data","csv-parser","csv-parsing","csv-to-html","csv-to-json","cython","faster-than-csv","process-csv","python","python3","speed","speedup","static-memory-allocation","static-typing","tabular-data","tsv","tsv-parser","type-safe"],"created_at":"2024-07-31T19:00:33.273Z","updated_at":"2025-03-17T08:37:56.147Z","avatar_url":"https://github.com/juancarlospaco.png","language":"Python","funding_links":["https://gist.github.com/juancarlospaco/37da34ed13a609663f55f4466c4dbc3e"],"categories":["Resources"],"sub_categories":["Libraries"],"readme":"# Faster-than-CSV\n\n[![Benchmark Results](https://raw.githubusercontent.com/juancarlospaco/faster-than-csv/master/results_graph.png \"Benchmark 
Results\")](https://youtu.be/QiKwnlyhKrk?t=5)\n\n![](https://img.shields.io/github/languages/top/juancarlospaco/faster-than-csv?style=for-the-badge)\n![](https://img.shields.io/github/languages/count/juancarlospaco/faster-than-csv?logoColor=green\u0026style=for-the-badge)\n![](https://img.shields.io/github/stars/juancarlospaco/faster-than-csv?style=for-the-badge \"Star faster-than-csv on GitHub!\")\n![](https://img.shields.io/maintenance/yes/2022?style=for-the-badge)\n![](https://img.shields.io/github/languages/code-size/juancarlospaco/faster-than-csv?style=for-the-badge)\n![](https://img.shields.io/github/issues-raw/juancarlospaco/faster-than-csv?style=for-the-badge \"Bugs\")\n![](https://img.shields.io/github/issues-pr-raw/juancarlospaco/faster-than-csv?style=for-the-badge \"PRs\")\n![](https://img.shields.io/github/commit-activity/y/juancarlospaco/faster-than-csv?style=for-the-badge)\n![](https://img.shields.io/github/last-commit/juancarlospaco/faster-than-csv?style=for-the-badge \"Commits\")\n\n| Library                       | Time (Speed) |\n|-------------------------------|--------------|\n| Pandas `read_csv()`           | `20.09`      |\n| NumPy `fromfile()`            | `3.88`       |\n| NumPy `genfromtxt()`          |  `4.00`      |\n| NumPy `loadtxt()`             |  `1.26`      |\n| csv (std lib)                 |  `0.40`      |\n| csv (list)                    |  `0.38`      |\n| csv (map)                     |  `0.37`      |\n| Faster_than_csv               |  `0.08`      |\n\n- This CSV Lib is ~300 Lines of Code.\n\n\u003cdetails\u003e\n\n- Benchmarks run on Docker from Dockerfile on this repo.\n- Speed is IRL time to complete 10000 CSV Parsings.\n- Lines Of Code counted using [CLOC](https://github.com/AlDanial/cloc).\n- Direct dependencies of the package when ready to run.\n- Benchmarks run on Docker from Dockerfile on this repo.\n- Stats as of year 2021.\n- x86_64 64Bit AMD, SSD, Arch Artix Linux.\n\n\u003c/details\u003e\n\n\n# 
Use\n\n```python\nimport faster_than_csv as csv\n\ncsv.csv2list(\"example.csv\")                     # See Docs for more info.\n                                                # Custom Separators supported.\ncsv.csv2json(\"example.csv\", indentation=4)      # CSV to JSON, Pretty-Printed.\n\ncsv.csv2htmltable(\"example.csv\")                # CSV to HTML+CSS Table (No JavaScript).\n\ncsv.read_clipboard()                            # CSV from the Clipboard.\n\ncsv.diff_csvs(\"example.csv\", \"anotherfile.csv\") # Diff optimized for CSVs.\n```\n- Input:  CSV, TSV, Clipboard, File, URL, Custom.\n- Output: CSV, TSV, HTML, JSON, NDJSON, Diff, File, Custom.\n\n\n# csv2dict()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns a list of dictionaries.\nThis is very similar to `pandas.read_csv(filename)`.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:**\nData from the CSV, `dict` type.\n\n\u003c/details\u003e\n\n\n\n# csv2list()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns a list.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:**\nData from the CSV, `list` type.\n\n\u003c/details\u003e\n\n\n\n# read_clipboard()\n\u003cdetails\u003e\n\n**Description:**\nReads CSV string from Clipboard, process CSV and returns a list of dictionaries.\nThis is very similar to 
`pandas.read_clipboard()`. This works on Linux, Mac, Windows.\n\n**Arguments:**\n\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:**\nData from the CSV, `dict` type.\n\n\u003c/details\u003e\n\n\n# csv2json()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns JSON.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n- `indentation` Pretty-Printed or Minified JSON output, `int` type, optional, `0` is Minified, `4` is Pretty-Printed, you can use any integer to adjust the indentation.\n\n**Returns:**\nData from the CSV as JSON Minified Single-line string computer-friendly, `str` type.\n\n\u003c/details\u003e\n\n\n# csv2ndjson()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns NDJSON.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `ndjson_file_path` path of the NDJSON file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:** None.\nData from the CSV as NDJSON https://github.com/ndjson/ndjson-spec, `str` type.\n\n\u003c/details\u003e\n\n\n# csv2htmltable()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns the data rendered on HTML 
Table.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `html_file_path` path of the HTML file to write, `str` type, optional, defaults to `\"\"`, if it is an empty string then no file is written.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n- `header_html` HTML Header, `str` type, optional, defaults to Bulma CSS, can be empty string.\n\n**Returns:**\nData from the CSV as HTML Table, `str` type, raw HTML (no style at all).\n\n\u003c/details\u003e\n\n\n# csv2karax()\n\u003cdetails\u003e\n\n![](https://user-images.githubusercontent.com/22755228/117183486-482b2a00-ade0-11eb-88e6-d8eeb28951ca.png)\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns the data rendered as a [Karax](https://github.com/karaxnim/karax#karax) HTML Table.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:** Karax DSL, `str` type.\n\n\u003c/details\u003e\n\n\n# csv2terminal()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and prints a colored pretty-printed table to the terminal.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `column_width` column width of the widest column, `int` type, required, must not be `0`, must not be negative.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- 
`quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:** None.\n\n\u003c/details\u003e\n\n\n# csv2xml()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns a Valid XML string.\nOutput is guaranteed to be always Valid XML.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator` Separator character of the CSV data, `str` type, optional, defaults to `','`, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n- `header_xml` XML Header of the XML string, `str` type, optional, can be empty string, defaults to `\"\u003c?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\" ?\u003e\\n\"`.\n\n**Returns:** XML, `str` type.\n\n\u003c/details\u003e\n\n\n# tsv2csv()\n\u003cdetails\u003e\n\n**Description:**\nTakes a path of a CSV file string, process CSV and returns a TSV.\n\n**Arguments:**\n- `csv_file_path` path of the CSV file, `str` type, required, must not be empty string.\n- `separator1` Separator character of the CSV data, `str` type, optional, must not be empty string.\n- `separator2` Separator character of the CSV data, `str` type, optional, must not be empty string.\n- `quote` Quote character of the CSV data, `str` type, optional, defaults to `'\"'`, must not be empty string.\n\n**Returns:**\nData from the CSV as TSV, `str` type.\n\n\u003c/details\u003e\n\n\n# diff_csvs()\n\u003cdetails\u003e\n\n**Description:**\nTakes 2 paths of 2 CSV files, process CSV and returns the Diff of the 2 CSV.\n\n**Arguments:**\n- `csv_file_path0` path of the CSV file, `str` type, required, must not be empty string, file must exist.\n- `csv_file_path1` path of the CSV file, `str` type, required, must not be empty string, file must exist.\n\n**Returns:** Diff.\n\n\u003c/details\u003e\n\n\n[**For more Examples check the Examples and 
Tests.**](https://github.com/juancarlospaco/faster-than-csv/blob/master/examples/example.py)\n\nInstead of having a couple of functions with a lot of arguments that you must provide to make them work,\nwe have tiny functions with very few arguments that do one thing and do it as fast as possible.\n\n\n# Install\n\n- `pip install faster_than_csv`\n\n\n# Docker\n\n- Take a quick test drive on Docker!\n\n```bash\n$ ./build-docker.sh\n$ ./run-docker.sh\n$ ./run-benchmark.sh  # Inside Docker.\n```\n\n\n# Dependencies\n\n- **None**\n\n\n# Platforms\n\n- ✅ Linux\n- ✅ Windows\n- ✅ Mac\n- ✅ Android\n- ✅ Raspberry Pi\n- ✅ BSD\n\n\n# Requisites\n\n- Python 3.6+ 64Bit.\n\n\n# Windows\n\n- If installation fails on Windows, just use the Source Code:\n\n![win-compile](https://user-images.githubusercontent.com/1189414/63147831-b8bf6100-bfd5-11e9-9e6e-91d61040f139.png \"Git Clone and Compile on Windows 10 with only Git and Nim installed, just 2 commands!\")\n\n- Git Clone and Compile on Windows 10 in just 2 commands!\n- [Alternatively you can try Docker for Windows.](https://docs.docker.com/docker-for-windows)\n- [Alternatively you can try WSL for Windows.](https://docs.microsoft.com/en-us/windows/wsl/about)\n- **The file extension must be `.pyd`, NOT `.dll`.**\n\n\n# Stars\n\n![Star faster-than-csv on GitHub](https://starchart.cc/juancarlospaco/faster-than-csv.svg \"Star faster-than-csv on GitHub!\")\n\n\n# Contributors\n\n- [SekouDiaoNlp](https://github.com/SekouDiaoNlp)\n\n\n# FAQ\n\n\u003cdetails\u003e\n\n- Whats the idea, inspiration, reason, etc ?.\n\n[Feel free to Fork, Clone, Download, Improve, Reimplement, Play with this Open Source. Make it 10 times faster, 10 times smaller.](http://tonsky.me/blog/disenchantment)\n\n- This requires Cython ?.\n\nNo.\n\n- This runs on PyPy ?.\n\nNo.\n\n- This runs on Python2 ?.\n\nI dunno. 
(Not supported)\n\n- How can I Install it ?.\n\nhttps://github.com/juancarlospaco/faster-than-csv/releases\n\nIf you dont understand how to install it, you can just download, extract, put the files on the same folder as your `*.py` file and you are good to go.\n\n- How can be faster than NumPy ?.\n\nI dunno.\n\n- How can be faster than Pandas ?.\n\nI dunno.\n\n- Why needs 64Bit ?.\n\nMaybe it works on 32Bit, but is not supported, integer sizes are too small, and performance can be worse.\n\n- Why needs Python 3 ?.\n\nMaybe it works on Python 2, but is not supported, and performance can be worse, we suggest to migrate to Python3.\n\n- Can I wrap the functions on a `try: except:` block ?.\n\nFunctions do not have internal `try: except:` blocks,\nso you can wrap them inside `try: except:` blocks if you need very resilient code.\n\n- PIP fails to install or fails build the wheel ?.\n\nAdd at the end of the PIP install command:\n\n` --isolated --disable-pip-version-check --no-cache-dir --no-binary :all: `\n\nNot my Bug.\n\n- How to Build the project ?.\n\n`build.sh`\n\n- How to Package the project ?.\n\n`package.sh`\n\n- This requires Nimble ?.\n\nNo.\n\n- Whats the unit of measurement for speed ?.\n\nUnmmodified raw output of Python `timeit` module.\n\nPlease send Pull Request to Python to improve the output of `timeit`.\n\n\u003c/details\u003e\n\n\n# Send Crypto, request features, donate today\n\n\u003cdetails\u003e \n\u003csummary title=\"Send Bitcoin\"\u003e\u003ckbd\u003e Bitcoin BTC \u003c/kbd\u003e\u003c/summary\u003e\n\n**BEP20 Binance Smart Chain Network BSC**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**BTC Bitcoin Network**\n```\n1Pnf45MgGgY32X4KDNJbutnpx96E4FxqVi\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e \n\u003csummary\u003e\u003ckbd\u003e Ethereum ETH \u003c/kbd\u003e \u003ckbd\u003e Dai DAI \u003c/kbd\u003e \u003ckbd\u003e Uniswap UNI \u003c/kbd\u003e \u003ckbd\u003e Axie Infinity AXS \u003c/kbd\u003e \u003ckbd\u003e Smooth Love Potion 
SLP \u003c/kbd\u003e \u003c/summary\u003e\n\n**BEP20 Binance Smart Chain Network BSC**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**ERC20 Ethereum Network**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n\u003c/details\u003e\n\u003cdetails\u003e \n\u003csummary title=\"Send Tether\"\u003e\u003ckbd\u003e Tether USDT \u003c/kbd\u003e\u003c/summary\u003e\n\n**BEP20 Binance Smart Chain Network BSC**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**ERC20 Ethereum Network**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**TRC20 Tron Network**\n```\nTWGft53WgWvH2mnqR8ZUXq1GD8M4gZ4Yfu\n```\n\u003c/details\u003e\n\u003cdetails\u003e \n\u003csummary title=\"Send Solana\"\u003e\u003ckbd\u003e Solana SOL \u003c/kbd\u003e\u003c/summary\u003e\n\n**BEP20 Binance Smart Chain Network BSC**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**SOL Solana Network**\n```\nFKaPSd8kTUpH7Q76d77toy1jjPGpZSxR4xbhQHyCMSGq\n```\n\u003c/details\u003e\n\u003cdetails\u003e \n\u003csummary title=\"Send Cardano\"\u003e\u003ckbd\u003e Cardano ADA \u003c/kbd\u003e\u003c/summary\u003e\n\n**BEP20 Binance Smart Chain Network BSC**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n**ADA Cardano Network**\n```\nDdzFFzCqrht9Y1r4Yx7ouqG9yJNWeXFt69xavLdaeXdu4cQi2yXgNWagzh52o9k9YRh3ussHnBnDrg7v7W2hSXWXfBhbo2ooUKRFMieM\n```\n\u003c/details\u003e\n\u003cdetails\u003e \n\u003csummary title=\"Send Sandbox\"\u003e\u003ckbd\u003e Sandbox SAND \u003c/kbd\u003e \u003ckbd\u003e Decentraland MANA \u003c/kbd\u003e\u003c/summary\u003e\n\n**ERC20 Ethereum Network**\n```\n0xb78c4cf63274bb22f83481986157d234105ac17e\n```\n\u003c/details\u003e\n\u003cdetails\u003e \n\u003csummary title=\"Send Algorand\"\u003e\u003ckbd\u003e Algorand ALGO \u003c/kbd\u003e\u003c/summary\u003e\n\n**ALGO Algorand Network**\n```\nWM54DHVZQIQDVTHMPOH6FEZ4U2AU3OBPGAFTHSCYWMFE7ETKCUUOYAW24Q\n```\n\u003c/details\u003e\n\n\u003cdetails\u003e \n\u003csummary title=\"Send via Binance Pay\"\u003e Binance 
\u003c/summary\u003e\n  \nhttps://pay.binance.com/en/checkout/e92e536210fd4f62b426ea7ee65b49c3\n\u003c/details\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuancarlospaco%2Ffaster-than-csv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjuancarlospaco%2Ffaster-than-csv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjuancarlospaco%2Ffaster-than-csv/lists"}