{"id":19665507,"url":"https://github.com/zytedata/flattering","last_synced_at":"2025-04-28T22:31:12.289Z","repository":{"id":109679610,"uuid":"388121219","full_name":"zytedata/flattering","owner":"zytedata","description":"Flatten, format, and export any JSON-like data to CSV (or any other string output).","archived":false,"fork":false,"pushed_at":"2021-09-13T17:41:23.000Z","size":972,"stargazers_count":17,"open_issues_count":1,"forks_count":3,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-04-05T11:34:14.041Z","etag":null,"topics":["csv","flatten","json","stringio"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zytedata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-21T13:12:09.000Z","updated_at":"2024-08-14T22:46:12.000Z","dependencies_parsed_at":"2023-03-22T20:47:31.465Z","dependency_job_id":null,"html_url":"https://github.com/zytedata/flattering","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zytedata%2Fflattering","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zytedata%2Fflattering/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zytedata%2Fflattering/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zytedata%2Fflattering/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zytedata","download_
url":"https://codeload.github.com/zytedata/flattering/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251397577,"owners_count":21583034,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","flatten","json","stringio"],"created_at":"2024-11-11T16:23:08.466Z","updated_at":"2025-04-28T22:31:12.262Z","avatar_url":"https://github.com/zytedata.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Flattering\n\n\u0026nbsp;\n\n\u003cp align=\"center\"\u003e\n\u003cimg src=\"/images/flatlogo.png\" alt=\"Flattering\" title=\"Flattering\" /\u003e\n\u003c/p\u003e\n\nFlattering is a tool to flatten, format, and export any JSON-like data to CSV (or any other output), no matter how complex or mixed the data is.\n\nSo, items like this:\n\n```yaml\n{\n    \"name\": \"Product\",\n    \"offers\": [{\"price\": \"154.95\", \"currency\": \"$\"}],\n    \"sku\": 9204,\n    \"images\": [\n        \"https://m.site.com/i/9204_1.jpg\",\n        \"https://m.site.com/i/9204_2.jpg\",\n        \"https://m.site.com/i/9204_3.jpg\"\n    ],\n    \"description\": \"Custom description\\non multiple lines.\",\n    \"additionalProperty\": [\n        {\"name\": \"size\", \"value\": \"XL\"}, {\"name\": \"color\", \"value\": \"blue\"}\n    ],\n    \"aggregateRating\": {\"ratingValue\": 5.0, \"reviewCount\": 3}\n}\n```\n\nwill look like this:\n\n| \u003csub\u003eName\u003c/sub\u003e| \u003csub\u003ePrice\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e \u003c/sub\u003e | 
\u003csub\u003eSku\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003eDescription\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u003c/sub\u003e| \u003csub\u003eAdditionalProperty\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003eRatingValue\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003eReviewCount\u003c/sub\u003e  |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e| \u003csub\u003e154.95\u003c/sub\u003e| \u003csub\u003e$\u003c/sub\u003e| \u003csub\u003e9204\u003c/sub\u003e| \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e| \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003esize: XL\u003cbr\u003ecolor:blue\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003e5\u003c/sub\u003e \u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e  |\n\n\u0026nbsp;\n\n## Contents\n\n- [Flattering](#flattering)\n  - [Contents](#contents)\n  - [Quickstart](#quickstart)\n  - [CLI](#cli)\n  - [What you can do](#what-you-can-do)\n    - [1. Flatten data](#1-flatten-data)\n    - [2. Rename columns](#2-rename-columns)\n    - [3. Format data](#3-format-data)\n    - [4. Filter columns](#4-filter-columns)\n    - [5. Order columns](#5-order-columns)\n    - [6. Process invalid data](#6-process-invalid-data)\n    - [7. Process complex data](#7-process-complex-data)\n    - [8. 
Export data](#8-export-data)\n  - [Arguments](#arguments)\n    - [StatsCollector](#statscollector)\n    - [Exporter](#exporter)\n  - [Requirements](#requirements)\n\n\u0026nbsp;\n\n## Quickstart\n\nFlattering consists of two elements:\n\n- `StatsCollector`, to understand how many columns are required, what headers they'll have, and what data is mixed/invalid (to skip or stringify).\n- `Exporter`, to format and beautify the data, fit it in columns, and export it (as `.csv` or flat data).\n\n```python\nitem_list = [{\"some_field\": \"some_value\", \"another_field\": [1, 2, 3]}]\nsc = StatsCollector()\nsc.process_items(item_list)\nexporter = Exporter(sc.stats[\"stats\"], sc.stats[\"invalid_properties\"])\nexporter.export_csv_full(item_list, \"example.csv\")\n```\n\nYou can use both parts together or separately. For example, collect stats during a running job, and then provide them (tiny `JSON` with numbers) to the backend when a user wants to export the data.\n\nAlso, stats and **items can be processed one by one** (use `append=True` to append rows, if needed):\n\n```python\nitem_list = [{\"some_field\": \"some_value\", \"another_field\": [1, 2, 3]}]\nsc = StatsCollector()\n# Collect stats item by item\nfor item in item_list:\n    sc.process_object(item)\nexporter = Exporter(sc.stats[\"stats\"], sc.stats[\"invalid_properties\"])\nexporter.export_csv_headers(\"example.csv\")\nfor item in item_list:\n    exporter.export_csv_row(item, \"example.csv\", append=True)\n```\n\nWhen you provide the filename, the file will be opened to write/append automatically. If you want to open the file manually or write to any other form of `StringIO`, `TextIO`, etc., see the [8. Export data](#8-export-data) section.\n\n\n## CLI\n\nYou can also use the tool through the CLI:\n\n```bash\nflattering --path=\"example.json\" --outpath=\"example.csv\"\n```\n\nThe CLI supports all the same parameters; you can get a complete list with the `flattering -h` command.\n\n\u0026nbsp;\n\n## What you can do\n\n### 1. 
Flatten data\n\nLet's pick an initial item to explain what parameters and formatting options do.\n\n```yaml\n{\n    \"name\": \"Product\",\n    \"offers\": [{\"price\": \"154.95\", \"currency\": \"$\"}],\n    \"sku\": 9204,\n    \"images\": [\n        \"https://m.site.com/i/9204_1.jpg\",\n        \"https://m.site.com/i/9204_2.jpg\",\n        \"https://m.site.com/i/9204_3.jpg\"\n    ],\n    \"description\": \"Custom description\\non multiple lines.\",\n    \"additionalProperty\": [\n        {\"name\": \"size\", \"value\": \"XL\"}, {\"name\": \"color\", \"value\": \"blue\"}\n    ],\n    \"aggregateRating\": {\"ratingValue\": 5.0, \"reviewCount\": 3}\n}\n```\n\nIf you don't provide any custom options:\n\n```python\nitem_list = [item]\nsc = StatsCollector()\nsc.process_items(item_list)\nexporter = Exporter(sc.stats[\"stats\"], sc.stats[\"invalid_properties\"])\nexporter.export_csv_full(item_list, \"example.csv\")\n```\n\nthe export will look like this:\n\n| \u003csub\u003ename\u003c/sub\u003e | \u003csub\u003eoffers[0]-\u003eprice\u003c/sub\u003e | \u003csub\u003eoffers[0]-\u003ecurrency\u003c/sub\u003e | \u003csub\u003esku\u003c/sub\u003e | \u003csub\u003eimages[0]\u003c/sub\u003e | \u003csub\u003eimages[1]\u003c/sub\u003e | \u003csub\u003eimages[2]\u003c/sub\u003e | \u003csub\u003edescription\u003c/sub\u003e | \u003csub\u003eadditionalProperty[0]-\u003ename\u003c/sub\u003e | \u003csub\u003eadditionalProperty[0]-\u003evalue\u003c/sub\u003e | \u003csub\u003eadditionalProperty[1]-\u003ename\u003c/sub\u003e | \u003csub\u003eadditionalProperty[1]-\u003evalue\u003c/sub\u003e | \u003csub\u003eaggregateRating-\u003eratingValue\u003c/sub\u003e | \u003csub\u003eaggregateRating-\u003ereviewCount\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | 
\u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_2.jpg\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003esize\u003c/sub\u003e | \u003csub\u003eXL\u003c/sub\u003e | \u003csub\u003ecolor\u003c/sub\u003e | \u003csub\u003eblue\u003c/sub\u003e | \u003csub\u003e5.0\u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e |\n\n\u0026nbsp;\n\n### 2. Rename columns\n\nLet's make it a bit more readable with `headers_renaming`:\n\n```python\nrenaming = [\n    (r\"^offers\\[0\\]-\u003e\", \"\"),\n    (r\"^aggregateRating-\u003e\", \"\"),\n    (r\"^additionalProperty-\u003e(.*)-\u003evalue\", r\"\\1\")\n]\nexporter = Exporter(\n    sc.stats[\"stats\"],\n    sc.stats[\"invalid_properties\"],\n    headers_renaming=renaming)\n```\n\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003eImages[0]\u003c/sub\u003e | \u003csub\u003eImages[1]\u003c/sub\u003e | \u003csub\u003eImages[2]\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[0]-\u003ename\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[0]-\u003evalue\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[1]-\u003ename\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[1]-\u003evalue\u003c/sub\u003e | \u003csub\u003eRatingValue\u003c/sub\u003e | \u003csub\u003eReviewCount\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_2.jpg\u003c/sub\u003e | 
\u003csub\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003esize\u003c/sub\u003e | \u003csub\u003eXL\u003c/sub\u003e | \u003csub\u003ecolor\u003c/sub\u003e | \u003csub\u003eblue\u003c/sub\u003e | \u003csub\u003e5.0\u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e |\n\n\u0026nbsp;\n\n### 3. Format data\n\nBetter, but images take too much place. Let's **group them in a single cell**, using the name of the field and `field_options`. Fields could be `grouped` (all data in a single cell), `named` (create columns based on an object property), or both.\n\n```python\noptions = {\"images\": {\"named\": False, \"grouped\": True}}\nexporter = Exporter(\n    sc.stats[\"stats\"],\n    sc.stats[\"invalid_properties\"],\n    headers_renaming=renaming,\n    field_options=options)\n```\n\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[0]-\u003ename\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[0]-\u003evalue\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[1]-\u003ename\u003c/sub\u003e | \u003csub\u003eAdditionalProperty[1]-\u003evalue\u003c/sub\u003e | \u003csub\u003eRatingValue\u003c/sub\u003e | \u003csub\u003eReviewCount\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | 
\u003csub\u003esize\u003c/sub\u003e | \u003csub\u003eXL\u003c/sub\u003e | \u003csub\u003ecolor\u003c/sub\u003e | \u003csub\u003eblue\u003c/sub\u003e | \u003csub\u003e5.0\u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e |\n\n\u0026nbsp;\n\nLooks even better, but we still have a lot of `additionalProperty` columns. Let's make them `named`, by using `name` property as the name of the column to make it better:\n\n```python\noptions = {\n    \"images\": {\"named\": False, \"grouped\": True},\n    \"additionalProperty\": {\n        \"named\": True, \"name\": \"name\", \"grouped\": False\n    }\n}\n```\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eSize\u003c/sub\u003e | \u003csub\u003eColor\u003c/sub\u003e | \u003csub\u003eRatingValue\u003c/sub\u003e | \u003csub\u003eReviewCount\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003eXL\u003c/sub\u003e | \u003csub\u003eblue\u003c/sub\u003e | \u003csub\u003e5.0\u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e |\n\n\u0026nbsp;\n\nNow we have a column with a value for each `additionalProperty`. 
But if you don't need separate columns for that, you can go even further and format them as both `named` and `grouped`:\n\n```python\n\"additionalProperty\": {\n    \"named\": True, \"name\": \"name\", \"grouped\": True\n}\n```\n\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eAdditionalProperty\u003c/sub\u003e | \u003csub\u003eRatingValue\u003c/sub\u003e | \u003csub\u003eReviewCount\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003esize: XL\u003cbr\u003ecolor: blue\u003c/sub\u003e | \u003csub\u003e5.0\u003c/sub\u003e | \u003csub\u003e3\u003c/sub\u003e |\n\n\u0026nbsp;\n\n### 4. 
Filter columns\n\nAlso, let's assume we don't really need `ratingValue` and `reviewCount` in this export, so we want to filter them out with `headers_filters`:\n\n```python\nfilters = [r\".*ratingValue.*\", r\".*reviewCount.*\"]\nexporter = Exporter(\n    sc.stats[\"stats\"],\n    sc.stats[\"invalid_properties\"],\n    headers_renaming=renaming,\n    headers_filters=filters,\n    field_options=options\n)\n```\n\nIt's important to remember that filters are regular expressions and work with the initial headers, so we're filtering out `aggregateRating-\u003eratingValue` and `aggregateRating-\u003ereviewCount` here.\n\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eAdditionalProperty\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003esize: XL\u003cbr\u003ecolor: blue\u003c/sub\u003e |\n\n\u0026nbsp;\n\n### 5. Order columns\n\nAnd, to add a final touch, let's reorder the headers with `headers_order`. 
For example, I want `Name` and `Sku` as the first two columns:\n\n```python\norder = [\"name\", \"sku\"]\nexporter = Exporter(\n    sc.stats[\"stats\"],\n    sc.stats[\"invalid_properties\"],\n    headers_renaming=renaming,\n    headers_filters=filters,\n    headers_order=order,\n    field_options=options\n)\n```\nAll headers present in the `headers_order` list will be ordered, and other headers will be provided in the natural order they appear in your data. Also, we're sorting initial headers, so using `name` and `sku` in lowercase.\n\n| \u003csub\u003eName\u003c/sub\u003e | \u003csub\u003eSku\u003c/sub\u003e | \u003csub\u003ePrice\u003c/sub\u003e | \u003csub\u003eCurrency\u003c/sub\u003e | \u003csub\u003eImages\u003c/sub\u003e | \u003csub\u003eDescription\u003c/sub\u003e | \u003csub\u003eAdditionalProperty\u003c/sub\u003e |\n| :--- | :--- | :--- | :--- | :--- | :--- | :--- |\n| \u003csub\u003eProduct\u003c/sub\u003e | \u003csub\u003e9204\u003c/sub\u003e | \u003csub\u003e154.95\u003c/sub\u003e | \u003csub\u003e$\u003c/sub\u003e | \u003csub\u003ehttps://m.site.com/i/9204_1.jpg\u003cbr\u003ehttps://m.site.com/i/9204_2.jpg\u003cbr\u003ehttps://m.site.com/i/9204_3.jpg\u003c/sub\u003e | \u003csub\u003eCustom description\u003cbr\u003eon multiple lines.\u003c/sub\u003e | \u003csub\u003esize: XL\u003cbr\u003ecolor: blue\u003c/sub\u003e |\n\n\u0026nbsp;\n\n### 6. Process invalid data\n\nIf your input has mixed types or invalid data, it could be hard to flatten it properly. 
So you can choose to either `skip` such columns or `stringify` them.\n\nFor example, here the property changed type from `dict` to `list`:\n\n```python\nitem_list = [\n    {\"a\": \"a_1\", \"b\": {\"c\": \"c_1\"}},\n    {\"a\": \"a_2\", \"b\": [1, 2, 3]}\n]\nsc = StatsCollector()\nsc.process_items(item_list)\nexporter = Exporter(sc.stats[\"stats\"], sc.stats[\"invalid_properties\"])\nexporter.export_csv_full(item_list, \"example.csv\")\n```\n\nBy default, invalid properties are stringified, so you'll get:\n\n| \u003csub\u003ea\u003c/sub\u003e | \u003csub\u003eb\u003c/sub\u003e |\n| :--- | :--- |\n| \u003csub\u003ea_1\u003c/sub\u003e | \u003csub\u003e{'c': 'c_1'}\u003c/sub\u003e |\n| \u003csub\u003ea_2\u003c/sub\u003e | \u003csub\u003e[1, 2, 3]\u003c/sub\u003e |\n\n\u0026nbsp;\n\nBut if you want to skip them, you can set the `stringify_invalid` parameter to `False`. It works at all levels of nesting and affects only the invalid property, so items like this:\n\n```python\nitem_list = [\n    {\"a\": \"a_1\", \"b\": {\"c\": \"c_1\", \"b\": \"b_1\"}},\n    {\"a\": \"a_1\", \"b\": {\"c\": \"c_2\", \"b\": [1, 2, 3]}},\n]\nsc = StatsCollector()\nsc.process_items(item_list)\nexporter = Exporter(\n    sc.stats[\"stats\"],\n    sc.stats[\"invalid_properties\"],\n    stringify_invalid=False\n)\nexporter.export_csv_full(item_list, \"example.csv\")\n```\n\nwill export like this:\n\n| \u003csub\u003ea\u003c/sub\u003e | \u003csub\u003eb-\u003ec\u003c/sub\u003e |\n| :--- | :--- |\n| \u003csub\u003ea_1\u003c/sub\u003e | \u003csub\u003ec_1\u003c/sub\u003e |\n| \u003csub\u003ea_1\u003c/sub\u003e | \u003csub\u003ec_2\u003c/sub\u003e |\n\n\n\u0026nbsp;\n\n### 7. Process complex data\n\nFollowing the nesting, you can export and format data with any number of nesting levels. 
So, let's create a somewhat unrealistic item with multiple levels, arrays of arrays, and so on:\n\n```yaml\n{\n    \"a\": {\n        \"nested_a\": [[\n            {\n                \"2x_nested_a\": {\n                    \"3x_nested_a\": [\n                        {\"name\": \"parameter1\", \"value\": \"value1\"},\n                        {\"name\": \"parameter2\", \"value\": \"value2\"},\n                    ]\n                }\n            },\n        ]],\n        \"second_nested_a\": \"some_value\",\n    }\n}\n```\n\nIf we try to flatten it as is, it will work. However, the headers will be rather unwieldy, so let's show them as code:\n\n```python\n[\n    \"a-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a[0]-\u003ename\",\n    \"a-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a[0]-\u003evalue\",\n    \"a-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a[1]-\u003ename\",\n    \"a-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a[1]-\u003evalue\",\n    \"a-\u003esecond_nested_a\",\n]\n[\"parameter1\", \"value1\", \"parameter2\", \"value2\", \"some_value\"]\n```\n\nBut the best part is that we can format data (`grouped`, `named`) at any level, so with a bit of `field_options` magic:\n\n```python\n\"a-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a\": {\n    \"named\": True, \"name\": \"name\", \"grouped\": True\n}\n```\n\nit will look like this:\n\n| \u003csub\u003ea-\u003enested_a[0][0]-\u003e2x_nested_a-\u003e3x_nested_a\u003c/sub\u003e | \u003csub\u003ea-\u003esecond_nested_a\u003c/sub\u003e |\n| :--- | :--- |\n| \u003csub\u003eparameter1: value1\u003cbr\u003eparameter2: value2\u003c/sub\u003e | \u003csub\u003esome_value\u003c/sub\u003e |\n\n\n\u0026nbsp;\n\n\n### 8. 
Export data\n\nBy default, all the data is exported to `.csv`, either in one go:\n\n```python\nexporter = Exporter(sc.stats[\"stats\"], sc.stats[\"invalid_properties\"])\nexporter.export_csv_full(item_list, \"example.csv\")\n```\n\nor one-by-one:\n\n```python\nexporter.export_csv_headers(\"example.csv\")\nfor item in item_list:\n    exporter.export_csv_row(item, \"example.csv\", append=True)\n```\n\nAlso, you can use any writable target, like `TextIO`, `StringIO`, and so on, so all of the examples below will work:\n\n```python\nimport io\n\n# StringIO\nbuffer = io.StringIO()\nexporter.export_csv_full(item_list, buffer)\n\n# File objects\nwith open(\"example.csv\", \"w\") as f:\n    exporter.export_csv_full(item_list, f)\n\n# Path-like objects\nfilename = tmpdir.join(\"example\")\nexporter.export_csv_full(item_list, filename)\n```\n\nWe plan to support other formats, but for now you can also get flattened items **one by one** through the `export_item_as_row` method and write them wherever you want:\n\n```python\n# [{\"property_1\": \"value\", \"property_2\": {\"nested_property\": [1, 2, 3]}}]\nflattened_items = [exporter.export_item_as_row(x) for x in item_list]\n# [['value', '1', '2', '3']]\n```\n\n\u0026nbsp;\n\n## Arguments\n\n### StatsCollector\n\n- **named_columns_limit** `int(default=50)` \n  \n  How many named columns can be created for a single field. For example, you have a set of objects like `{\"name\": \"color\", \"value\": \"blue\"}`. If you decide to create a separate column for each `name` (\"color\", \"size\", etc.), the limit defines how much data is collected to make it work. If the limit is hit (too many columns), no named columns are created in the export. The limit is needed to control memory usage and data size during stats collection (no need to collect stats for 1000 columns if you don't plan to have 1000 columns anyway).\n\n- **cut_separator** `str(default=\"-\u003e\")`\n  \n  Separator to organize values from items to required columns. 
Used instead of default \"`.`\" separator. If your properties' names include the separator - replace it with a custom one.\n\n\u0026nbsp;\n\n### Exporter\n\n- **stats** `Dict[str, Header]`\n  \n  Item stats collected by `StatsCollector` (`stats_collector.stats[\"stats\"]`).\n\n- **invalid_properties** `Dict[str, str]`\n  \n  Invalid properties data provided by `StatsCollector` (`stats_collector.stats[\"invalid_properties\"]`)\n\n- **stringify_invalid** `bool(default=True)`\n  \n  If `True` - columns with invalid data would be stringified. If `False` - columns with invalid data would be skipped\n\n- **field_options** `Dict[str, FieldOption]`\n    \n    Field options to format data.\n    - Options could be `named` (`named=True, name=\"property_name\"`), so the exporter will try to create columns based on the values of the property provided in the `\"name\"` attribute.\n    - Options could be `grouped` (`grouped=True`), so the exporter will try to fit all the data for this field into a single cell.\n    - Options could be both `named` and `grouped`, so the exporter will try to get data collected for each named property and fit all this data in a single field.\n\n- **array_limits** `Dict[str, int]`\n  \n  Limit for the array fields to export only first N elements (`{\"images\": 1}`).\n\n- **headers_renaming** `List[Tuple[str, str]]`\n  \n   Set of RegExp rules to rename existing item columns (`[\".*_price\", \"regularPrice\"]`). The first value is the pattern to replace, while the second one is the replacement.\n\n- **headers_order** `List[str]`\n  \n  List to sort columns headers. All headers that are present both in this list and actual data - would be sorted. All other headers would be appended in a natural order. Headers should be provided in the form before renaming (`\"offers[0]-\u003eprice\"`, not `\"Price\"`).\n\n- **headers_filters** `List[str]`\n  \n  List of RegExp statements to filter columns. 
Headers that match any of these statements would be skipped (`[\"name.*\", \"_key\"]`).\n\n- **grouped_separator** `str`\n  \n  Separator to divide values when grouping data in a single cell (if `grouped=True`).\n\n- **cut_separator** `str(default=\"-\u003e\")`\n  \n  Separator to organize values from items to required columns. Used instead of default \"`.`\" separator. If your properties' names include the separator - replace it with a custom one.\n  \n- **capitalize_headers**  `bool(default=False)`\n\n  Capitalize the first letter of CSV headers when exporting.\n\n\u0026nbsp;\n\n## Requirements\n- Python 3.7+\n- Works on Linux, Windows, macOS, BSD\n\u003cbr\u003e\u003cbr\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzytedata%2Fflattering","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzytedata%2Fflattering","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzytedata%2Fflattering/lists"}