{"id":15654934,"url":"https://github.com/lucacappelletti94/ugly_csv_generator","last_synced_at":"2025-08-21T15:32:31.238Z","repository":{"id":62586137,"uuid":"243039018","full_name":"LucaCappelletti94/ugly_csv_generator","owner":"LucaCappelletti94","description":"Python package to generate ugly real-looking csvs.","archived":false,"fork":false,"pushed_at":"2024-09-02T13:13:49.000Z","size":116,"stargazers_count":28,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-08T04:46:04.155Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/LucaCappelletti94.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"LucaCappelletti94"}},"created_at":"2020-02-25T15:49:11.000Z","updated_at":"2024-12-03T21:37:59.000Z","dependencies_parsed_at":"2024-10-23T04:21:46.899Z","dependency_job_id":"69252bc6-9427-41bf-9c84-ac3309350457","html_url":"https://github.com/LucaCappelletti94/ugly_csv_generator","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucaCappelletti94%2Fugly_csv_generator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucaCappelletti94%2Fugly_csv_generator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/LucaCappelletti94%2Fugly_csv_generator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/LucaCappelletti94%2Fugly_csv_generator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/LucaCappelletti94","download_url":"https://codeload.github.com/LucaCappelletti94/ugly_csv_generator/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230520393,"owners_count":18238948,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-03T12:55:00.584Z","updated_at":"2024-12-20T01:15:40.867Z","avatar_url":"https://github.com/LucaCappelletti94.png","language":"Python","readme":"# Ugly CSV generator\n[![Pypi project](https://badge.fury.io/py/ugly-csv-generator.svg)](https://badge.fury.io/py/ugly-csv-generator)\n[![Pypi total project downloads](https://pepy.tech/badge/ugly-csv-generator)](https://pepy.tech/projects/ugly-csv-generator)\n[![LICENSE](https://img.shields.io/pypi/l/ugly-csv-generator)](https://github.com/LucaCappelletti94/ugly-csv-generator/blob/main/LICENSE)\n[![Python version](https://img.shields.io/pypi/pyversions/ugly-csv-generator)](https://img.shields.io/pypi/pyversions/ugly-csv-generator)\n[![Github Actions](https://github.com/LucaCappelletti94/ugly_csv_generator/actions/workflows/python.yml/badge.svg)](https://github.com/LucaCappelletti94/ugly_csv_generator/actions/)\n[![Codacy Badge](https://app.codacy.com/project/badge/Grade/e6fe64db1c9042bbaa4c0a20bde585dc)](https://app.codacy.com/gh/LucaCappelletti94/ugly_csv_generator/dashboard?utm_source=gh\u0026utm_medium=referral\u0026utm_content=\u0026utm_campaign=Badge_grade)\n\nPython package 
to automatically uglify CSVs. Why? To improve the testing capabilities of pipelines that must be able to support strongly malformed input data.\n\nAll the malformations automated here are non-destructive, meaning they introduce confusion in the data but do not mangle or destroy information.\n\n**The inspirations for the automated malformations all come from real-life CSVs (sigh)**\n\nHumans will always surprise us with ever-new malformed input data, but hey, we can try our best to ruin the test CSVs!\n\n## How do I install this package?\nAs usual, just download it using pip:\n\n```shell\npip install ugly_csv_generator\n```\n\n## Usage example\n\nTo ruin a CSV you can use the following snippet. In the following example we use a [random_csv_generator](https://github.com/LucaCappelletti94/random_csv_generator) to generate a random \"healthy\" csv.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(csv)\n```\n\nThe initial CSV will look something like:\n\n| region    | province  | surname  |\n|-----------|-----------|----------|\n| Calabria  | Catanzaro | Rossi    |\n| Sicilia   | Ragusa    | Pinna    |\n| Lombardia | Varese    | Sbrana   |\n| Lazio     | Roma      | Mair     |\n| Sicilia   | Messina   | Ferrari  |\n\nThe resulting uglified CSV will look something like this:\n\n|     | 1                                     | 2                   | 3        | 4        | 5                                      | 6    |\n|-----|---------------------------------------|---------------------|----------|----------|----------------------------------------|------|\n| 0   | ////                                  | #RIF!               | #RIF!    | 0        | ....                                   
| 0    |\n| 1   | \"('surname',)('.',)(0,)\"              | region              | province | surname  | \"('province',)('_',)(1,)\"              |      |\n| 2   | ////////                              | region              | \"province                                   \" | \"surname                                   \" | 0                                      | 0    |\n| 3   | ///////                               | \"region                                         \" | \"province                                   \" | \"surname                                     \" | #RIF!                                   | #RIF!     |\n| 4   |                                       | Calabria            | \"Catanzaro                                   \" | \"Rossi                                     \" | 0                                      | -------- |\n| 5   | \"                                     \" | Sicilia            | Ragusa   | \"Pinna                                     \" | \"                                            \" |          |\n| 6   | -------                               |                     | #RIF!    | #RIF!    | 0                                      | \"                                        \" |\n| 7   | /////////                             | \"Lombardia                                      \" | \"Varese                                     \" | Sbrana                                  | ///////////                             |          |\n| 8   | ---------                             | \"Lazio                                         \" | \"Roma                                       \" | \"Mair                                       \" |                                        |          |\n| 9   | --------                              | 0                   | /////    | ---      | 0                                      | ///// |\n| 10  | #RIF!                                 
| \"Sicilia                                     \" | Messina  | \"Ferrari                                     \" | 0                                      |          |\n| 11  | 0                                     |                     | -----    | \"                                             \" | --------                                | 0    |\n\n## Available uglifications\nLet's take a look at the available uglifications! All of these options are available as keyword arguments in the `uglify` function.\n\nWe start by taking a look at the same example from before, but now we expand all of the available options:\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\nugly = uglify(\n    csv,\n    empty_columns = True,\n    empty_rows = True,\n    duplicate_schema = True,\n    empty_padding = True,\n    nan_like_artefacts = True,\n    replace_zeros = True,\n    replace_ones = True,\n    satellite_artefacts = False,\n    random_spaces = True,\n    include_unicode = True,\n    verbose = True,\n    seed = 42,\n)\n```\n\nLet's break down all of the available options with adequate examples. In all cases, we will use the following CSV as a starting point,\nobtained from the `random_csv_generator` package:\n\n```python\nfrom random_csv_generator import random_csv\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n```\n\nThe initial CSV will look something like:\n\n|   | region  | province   | surname |\n|---|---------|------------|---------|\n| 0 | Veneto  | Vicenza    | Sacco   |\n| 1 | Abruzzo | L' Aquila  | Sala    |\n| 2 | Sicilia | Messina    | Sanna   |\n| 3 | Marche  | Ancona     | Gallo   |\n| 4 | Lazio   | Frosinone  | Gallo   |\n\n### Empty columns\nIn the following example we will solely add empty columns to the CSV. 
This phenomenon is common when the data-entry person leaves empty columns in the middle of the table.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = True,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   | region_2 | region_0 1 | region  | region_0 | province   | surname |\n|---|----------|------------|---------|----------|------------|---------|\n| 0 |          |            | Veneto  |          | Vicenza    | Sacco   |\n| 1 |          |            | Abruzzo |          | L Aquila   | Sala    |\n| 2 |          |            | Sicilia |          | Messina    | Sanna   |\n| 3 |          |            | Marche  |          | Ancona     | Gallo   |\n| 4 |          |            | Lazio   |          | Frosinone  | Gallo   |\n\n### Empty rows\nIn the following example we will solely add empty rows to the CSV. 
This phenomenon is common when the data-entry person leaves empty rows in the middle of the table.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = True,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   | region  | province   | surname |\n|---|---------|------------|---------|\n| 0 | Veneto  | Vicenza    | Sacco   |\n| 1 | Abruzzo | L Aquila   | Sala    |\n| 2 | Sicilia | Messina    | Sanna   |\n| 3 |         |            |         |\n| 4 | Marche  | Ancona     | Gallo   |\n| 5 | Lazio   | Frosinone  | Gallo   |\n| 6 |         |            |         |\n\n### Duplicate schema\nIn the following example we will solely duplicate the schema of the CSV. 
This phenomenon is common when the data-entry person copies the header of the table multiple times, or several CSVs are concatenated together without removing the header.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = True,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   | region  | province   | surname |\n|---|---------|------------|---------|\n| 0 | Veneto  | Vicenza    | Sacco   |\n| 1 | Abruzzo | L Aquila   | Sala    |\n| 2 | Sicilia | Messina    | Sanna   |\n| 3 | region  | province   | surname |\n| 4 | Marche  | Ancona     | Gallo   |\n| 5 | Lazio   | Frosinone  | Gallo   |\n| 6 | region  | province   | surname |\n\n### Empty padding\nIn the following example we will solely add empty padding to the CSV. 
Padding in this context means adding empty cells around the CSV, representing the case where the data-entry person started the table somewhere in the middle of a spreadsheet.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = True,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   |   0 | 1       | 2        | 3       | 4  | 5  |\n|---|-----|---------|----------|---------|----|----|\n| 0 |     | region  | province | surname |    |    |\n| 1 |     | Veneto  | Vicenza  | Sacco   |    |    |\n| 2 |     | Abruzzo | L Aquila | Sala    |    |    |\n| 3 |     | Sicilia | Messina  | Sanna   |    |    |\n| 4 |     | Marche  | Ancona   | Gallo   |    |    |\n| 5 |     | Lazio   | Frosinone| Gallo   |    |    |\n| 6 |     |         |          |         |    |    |\n| 7 |     |         |          |         |    |    |\n| 8 |     |         |          |         |    |    |\n| 9 |     |         |          |         |    |    |\n| 10|     |         |          |         |    |    |\n| 11|     |         |          |         |    |    |\n\n### NaN-like artefacts\nIn the following example we will solely add NaN-like artefacts to the CSV. This phenomenon is common when the data-entry person follows some custom notation, which may be their own or office standard, to represent missing values. In some cases, this may be a string like \"N/A\", \"NaN\", \"NULL\", or even (one or more) \"-\", \"\\n\", or \"\\t\". 
Since the objective of this package is to not destroy information, it will solely replace NaN values with NaN-like artefacts.\n\nIn the example we considered earlier, we do not have any NaN values, so we will add some to the CSV by also enabling the `empty_rows` option.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = True,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = True,\n    satellite_artefacts = False,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   | region  | province   | surname |\n|---|---------|------------|---------|\n| 0 | Veneto  | Vicenza    | Sacco   |\n| 1 | Abruzzo | L Aquila   | Sala    |\n| 2 | Sicilia | Messina    | Sanna   |\n| 3 | \" \"     | ...        | ----    |\n| 4 | Marche  | Ancona     | Gallo   |\n| 5 | Lazio   | Frosinone  | Gallo   |\n| 6 |         | \"          | ------- |\n\n\n#### Unicode variant\nThe NaN-like artefacts can also be applied with unicode characters. 
This is useful to test the robustness of the CSV reader to unicode characters.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = True,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = True,\n    satellite_artefacts = False,\n    random_spaces = False,\n    include_unicode = True,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | region    | province   | surname   |\n|---:|:----------|:-----------|:----------|\n|  0 | Calabria  | Catanzaro  | Rossi     |\n|  1 | Sicilia   | Ragusa     | Pinna     |\n|  2 | Lombardia | Varese     | Sbrana    |\n|  3 | .         | ᴑ          | 0         |\n|  4 | Lazio     | Roma       | Mair      |\n|  5 | Sicilia   | Messina    | Ferrari   |\n|  6 | ₀         | ________   | ᪐         |\n\n### Replace zeros\nIn the following example we will solely replace zeros with a custom value. In different places in the world and different offices, zeros may be represented in different ways. Characters for zero from different alphabets, or even different symbols, may be used to represent zero. 
Note that this latter functionality is only enabled if the `include_unicode` option is set to `True`.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\n# We add a column with zeros\ncsv[\"zero\"] = 0\n\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    replace_zeros = True,\n    include_unicode = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | region    | province   | surname   | zero   |\n|---:|:----------|:-----------|:----------|:-------|\n|  0 | Calabria  | Catanzaro  | Rossi     | 0      |\n|  1 | Sicilia   | Ragusa     | Pinna     | o      |\n|  2 | Lombardia | Varese     | Sbrana    | 0      |\n|  3 | Lazio     | Roma       | Mair      | 0      |\n|  4 | Sicilia   | Messina    | Ferrari   | O      |\n\n#### Unicode variant\nThe replace zeros can also be applied with unicode characters. 
This is useful to test the robustness of the CSV reader to unicode characters.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\n# We add a column with zeros\ncsv[\"zero\"] = 0\n\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    replace_zeros = True,\n    include_unicode = True,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | region    | province   | surname   | zero   |\n|---:|:----------|:-----------|:----------|:-------|\n|  0 | Calabria  | Catanzaro  | Rossi     | o      |\n|  1 | Sicilia   | Ragusa     | Pinna     | ᪐      |\n|  2 | Lombardia | Varese     | Sbrana    | ο      |\n|  3 | Lazio     | Roma       | Mair      | 𝟘      |\n|  4 | Sicilia   | Messina    | Ferrari   | ᥆      |\n\n### Replace ones\nIn the following example we will solely replace ones with a custom value. In different places in the world and different offices, ones may be represented in different ways. Characters for one from different alphabets, or even different symbols, may be used to represent one. 
Note that this latter functionality is only enabled if the `include_unicode` option is set to `True`.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\n# We add a column with ones\ncsv[\"one\"] = 1\n\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    replace_ones = True,\n    include_unicode = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | region    | province   | surname   | one   |\n|---:|:----------|:-----------|:----------|:------|\n|  0 | Calabria  | Catanzaro  | Rossi     | 1     |\n|  1 | Sicilia   | Ragusa     | Pinna     | l     |\n|  2 | Lombardia | Varese     | Sbrana    | 1     |\n|  3 | Lazio     | Roma       | Mair      | 1     |\n|  4 | Sicilia   | Messina    | Ferrari   | I     |\n\n#### Unicode variant\nThe replace ones can also be applied with unicode characters. 
This is useful to test the robustness of the CSV reader to unicode characters.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\n\n# We add a column with ones\ncsv[\"one\"] = 1\n\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = False,\n    replace_ones = True,\n    include_unicode = True,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | region    | province   | surname   | one   |\n|---:|:----------|:-----------|:----------|:------|\n|  0 | Calabria  | Catanzaro  | Rossi     | ¹     |\n|  1 | Sicilia   | Ragusa     | Pinna     | ₁     |\n|  2 | Lombardia | Varese     | Sbrana    | l     |\n|  3 | Lazio     | Roma       | Mair      | 1     |\n|  4 | Sicilia   | Messina    | Ferrari   | ⓵     |\n\n\n### Satellite artefacts\nIn the following example we will solely add satellite artefacts to the CSV. A satellite artefact is likely the quirkiest and most annoying artefact to deal with. It represents the situation where the data-entry person adds some notes on the side of the table. 
A real-world example of this which I have encountered is when the data-entry person adds the office lunch order on the side of the table and forgets to remove it.\n\nThe package offers a few satellite artefacts encountered in the wild.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = True,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = True,\n    random_spaces = False,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|    | 0       | 1         | 2                | 3       | 4  |\n|----|---------|-----------|------------------|---------|----|\n| 0  |         |           |                  | random  |    |\n| 1  |         |           | random           |         |    |\n| 2  |         | caso      |                  |         |    |\n| 3  | region  | province  | surname          |         |    |\n| 4  | Veneto  | Vicenza   | Sacco            |         |    |\n| 5  | Abruzzo | L Aquila  | Sala             |         |    |\n| 6  | Sicilia | Messina   | Sanna            |         |    |\n| 7  | Marche  | Ancona    | Gallo            |         |    |\n| 8  | Lazio   | Frosinone | Gallo            |         |    |\n| 9  |         |           |                  |         |    |\n| 10 |         |           |                  |         |    |\n| 11 |         |           |                  |         |    |\n| 12 |         |           |                  |         |    |\n| 13 |         |           |                  |         |    |\n| 14 |         |           |                  |         |    |\n| 15 | person  | food      |                  |         |    |\n| 16 | Jerry   | kebab     |                  |         |    |\n| 17 | Steven  | rice 
with paprika |          |         |    |\n| 18 | Vale    | pizza mit ananas |          |         |    |\n\n### Random spaces\nIn the following example we will solely add random spaces around the values in the CSV. This phenomenon is common when the data-entry person is not careful with the spaces around the values in the table and adds some random spaces, for instance to visually align the values.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = True,\n    seed = 424,\n)\n```\n\nThe result will look something like:\n\n|   | region               | province         | surname         |\n|---|----------------------|------------------|-----------------|\n| 0 | \"    Veneto          \" | \"  Vicenza      \" | \" Sacco        \" |\n| 1 | \" Abruzzo            \" | \" L Aquila      \" | \" Sala         \" |\n| 2 | \" Sicilia            \" | \" Messina       \" | \" Sanna        \" |\n| 3 | \" Marche             \" | \" Ancona        \" | \" Gallo        \" |\n| 4 | \" Lazio              \" | \" Frosinone     \" | \" Gallo        \" |\n\n\n#### Unicode variant\nThe random spaces uglification can also be applied with unicode characters. 
This is useful to test the robustness of the CSV reader to unicode characters.\n\n```python\nfrom random_csv_generator import random_csv\nfrom ugly_csv_generator import uglify\n\ncsv = random_csv(5) # CSV with 5 lines\ncsv = csv[csv.columns[:3]] # We will use only the first 3 columns for this example\nugly = uglify(\n    csv,\n    empty_columns = False,\n    empty_rows = False,\n    duplicate_schema = False,\n    empty_padding = False,\n    nan_like_artefacts = False,\n    satellite_artefacts = False,\n    random_spaces = True,\n    include_unicode = True,\n    seed = 424,\n)\n```\n\nDue to limitations of the markdown rendering, we cannot show the result here. You can run the code snippet to see the result. It's just that damn cursed!\n\n## Contributing\nYou have encountered a new type of uglification that you would like to add to the package? You have a suggestion for a new feature or improvement? You have found a bug? Open an issue or a pull request, I will be happy to help you!\n\n## License\nThis project is licensed under the MIT License - see the [LICENSE](https://github.com/LucaCappelletti94/ugly_csv_generator/blob/master/LICENSE) file for details.\n","funding_links":["https://github.com/sponsors/LucaCappelletti94"],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucacappelletti94%2Fugly_csv_generator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucacappelletti94%2Fugly_csv_generator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucacappelletti94%2Fugly_csv_generator/lists"}