{"id":51433261,"url":"https://github.com/tha-guy-nate/tha-utils-helper","last_synced_at":"2026-07-05T05:04:06.006Z","repository":{"id":358391078,"uuid":"1241225694","full_name":"tha-guy-nate/tha-utils-helper","owner":"tha-guy-nate","description":"A Tabular Helper utility library with general-purpose helpers for the tha-* ecosystem.","archived":false,"fork":false,"pushed_at":"2026-07-05T02:05:06.000Z","size":189,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-07-05T04:07:04.769Z","etag":null,"topics":["csv","date","numeric","python","string","tabular-helper","utilities"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tha-guy-nate.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-17T05:37:01.000Z","updated_at":"2026-07-05T03:42:19.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tha-guy-nate/tha-utils-helper","commit_stats":null,"previous_names":["tha-guy-nate/tha-utils-helper"],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/tha-guy-nate/tha-utils-helper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tha-guy-nate%2Ftha-utils-helper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tha-guy-nate%2Ftha-utils-helper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tha-guy-nate%2Ftha-utils-helper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tha-guy-nate%2Ftha-utils-helper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tha-guy-nate","download_url":"https://codeload.github.com/tha-guy-nate/tha-utils-helper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tha-guy-nate%2Ftha-utils-helper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35143837,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-05T02:00:06.290Z","response_time":100,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","date","numeric","python","string","tabular-helper","utilities"],"created_at":"2026-07-05T05:04:05.468Z","updated_at":"2026-07-05T05:04:05.990Z","avatar_url":"https://github.com/tha-guy-nate.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tha-utils-helper\n\n[![CI](https://github.com/tha-guy-nate/tha-utils-helper/actions/workflows/ci.yml/badge.svg)](https://github.com/tha-guy-nate/tha-utils-helper/actions/workflows/ci.yml)\n[![codecov](https://codecov.io/gh/tha-guy-nate/tha-utils-helper/graph/badge.svg)](https://codecov.io/gh/tha-guy-nate/tha-utils-helper)\n[![PyPI](https://img.shields.io/pypi/v/tha-utils-helper)](https://pypi.org/project/tha-utils-helper/)\n[![Python](https://img.shields.io/pypi/pyversions/tha-utils-helper)](https://pypi.org/project/tha-utils-helper/)\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit)](https://github.com/pre-commit/pre-commit)\n[![wheel size](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fpypi.org%2Fpypi%2Ftha-utils-helper%2Fjson\u0026label=wheel%20size\u0026query=%24.urls%5B0%5D.size\u0026suffix=%20B)](https://pypi.org/project/tha-utils-helper/#files)\n\nA Tabular Helper utility library for the `tha-*` ecosystem. Includes general-purpose dict/list/type helpers, string normalization and slugification, numeric string parsing, and date format conversion — all with row-level error handling for CSV pipeline use.\n\n## Install\n\n```bash\npip install tha-utils-helper\n```\n\n## Quick start\n\n```python\nfrom tha_utils_helper import ThaDict, ThaList, ThaType, ThaStr, ThaNum, ThaDT\n\n# Structural helpers — work on single values or lists of row dicts\nThaDict.pick({\"a\": 1, \"b\": 2, \"c\": 3}, [\"a\", \"c\"])         # {\"a\": 1, \"c\": 3}\nThaDict.rename_keys_rows(rows, {\"studentUniqueId\": \"id\"})   # rename across all rows\n\n# String normalization\nThaStr.format_str(\"  HELLO WORLD  \", case=\"lower\")            # \"hello world\"\nThaStr.slugify(\"Hello World!\")                                 # \"hello-world\"\n\n# Numeric parsing\nThaNum.format_num(\"$1,234.56\")                                 # 1234.56\nThaNum.format_num(\"(£500)\", cast=\"int\")                        # -500\n\n# Date formatting\nThaDT.format_date(\"Apr 15, 2024\", \"%Y-%m-%d\")                 # \"2024-04-15\"\n\n# Row-level processing with on_error and skip_statuses\nformatter = ThaNum()\nrows = formatter.format_num_rows(rows, column=\"Budget\", cast=\"float\", round_to=2)\n```\n\n---\n\n## API\n\n### `ThaDict`\n\nStatic methods for single dicts and lists of row dicts.\n\n```python\nThaDict.pick(d, keys)               # new dict with only the specified keys\nThaDict.omit(d, keys)               # new dict with the specified keys removed\nThaDict.safe_get(d, *keys)          # traverse nested dicts safely — returns None on miss\nThaDict.rename_keys(d, mapping)     # rename keys; unmapped keys are preserved\n\nThaDict.pick_rows(rows, keys)       # pick() applied to every row\nThaDict.omit_rows(rows, keys)       # omit() applied to every row\nThaDict.rename_keys_rows(rows, mapping)  # rename_keys() applied to every row\n```\n\n```python\nThaDict.pick({\"a\": 1, \"b\": 2, \"c\": 3}, [\"a\", \"c\"])\n# {\"a\": 1, \"c\": 3}\n\nThaDict.safe_get({\"student\": {\"id\": 42}}, \"student\", \"id\")\n# 42\n\nThaDict.rename_keys_rows(rows, {\"studentUniqueId\": \"student_id\"})\n# [{\"student_id\": ..., ...}, ...]\n```\n\n---\n\n### `ThaList`\n\nStatic methods for lists.\n\n```python\nThaList.chunk(lst, size)   # split into consecutive chunks of size\nThaList.flatten(lst)       # flatten one level of nesting\n```\n\n```python\nThaList.chunk([1, 2, 3, 4, 5], 2)    # [[1, 2], [3, 4], [5]]\nThaList.flatten([[1, 2], [3, 4]])     # [1, 2, 3, 4]\n```\n\n`chunk` also works on lists of row dicts directly: `ThaList.chunk(rows, 100)`.\n\n---\n\n### `ThaType`\n\nStatic methods for coercing values. Row methods return `None` on failure (consistent with `safe_int` / `safe_float`).\n\n```python\nThaType.normalize_bool(val)                                   # bool or raises ValueError\nThaType.safe_int(val)                                         # int | None\nThaType.safe_float(val)                                       # float | None\n\nThaType.normalize_bool_rows(rows, column, *, out_column=None) # None on failure\nThaType.safe_int_rows(rows, column, *, out_column=None)\nThaType.safe_float_rows(rows, column, *, out_column=None)\n```\n\n`normalize_bool` recognizes:\n\n| Truthy | Falsy |\n|---|---|\n| `True`, `1`, `\"true\"`, `\"yes\"`, `\"1\"`, `\"t\"`, `\"y\"` | `False`, `0`, `\"false\"`, `\"no\"`, `\"0\"`, `\"f\"`, `\"n\"` |\n\nString matching is case-insensitive and strips whitespace.\n\n```python\nThaType.normalize_bool(\"Yes\")     # True\nThaType.safe_int(\"3.14\")          # None  (not an integer string)\nThaType.safe_float(\"abc\")         # None\n\nThaType.safe_int_rows(rows, \"count\", out_column=\"count_int\")\n# adds \"count_int\" column; original \"count\" column preserved\n```\n\n---\n\n### `ThaStr`\n\nString normalization and slugification. `format_str` and `slugify` are static methods callable without instantiation. Row methods require an instance and store results in `self.rows`.\n\n```python\nThaStr.format_str(\n    value: str,\n    *,\n    strip: bool = True,\n    case: str | None = None,     # \"upper\" | \"lower\" | \"title\" | None\n    replace: dict[str, str] | None = None,\n    regex: bool = False,\n) -\u003e str\n```\n\n```python\nThaStr.slugify(\n    value: str,\n    *,\n    sep: str = \"-\",\n    prefix: str = \"\",\n    suffix: str = \"\",\n) -\u003e str\n```\n\n```python\nrunner = ThaStr()\n\nrunner.format_str_rows(\n    rows,\n    column,\n    *,\n    strip=True,\n    case=None,\n    replace=None,\n    regex=False,\n    out_column=None,\n    on_error=\"error\",            # \"error\" | \"skip\" | \"blank\"\n    skip_statuses=None,          # default: [\"error\", \"warning\"]\n) -\u003e list[dict]\n\nrunner.slugify_rows(\n    rows,\n    columns,                     # str or list[str] — multiple columns are joined with sep\n    out_column,\n    *,\n    sep=\"-\",\n    prefix=\"\",\n    suffix=\"\",\n    on_error=\"error\",\n    skip_statuses=None,\n) -\u003e list[dict]\n```\n\n```python\nThaStr.format_str(\"  HELLO WORLD  \", case=\"lower\")    # \"hello world\"\nThaStr.slugify(\"Hello World!\")                          # \"hello-world\"\nThaStr.slugify(\"café résumé\", sep=\"_\")                  # \"cafe_resume\"\n\nrunner = ThaStr()\nrunner.format_str_rows(rows, \"Name\", case=\"lower\", out_column=\"Name Slug\")\nrunner.slugify_rows(rows, [\"First\", \"Last\"], out_column=\"id\")\n```\n\nRaises `StrError` on invalid `case` or `on_error`. Unicode is converted to ASCII via NFKD normalization.\n\n---\n\n### `ThaNum`\n\nNumeric string parsing. `format_num` is a static method callable without instantiation. `format_num_rows` requires an instance and stores results in `self.rows`.\n\n```python\nThaNum.format_num(\n    value: str | int | float,\n    *,\n    strip_currency: bool = True,   # removes $€£¥₹₩₽₺₫฿₱₴\n    strip_commas: bool = True,\n    round_to: int | None = None,\n    cast: str = \"float\",           # \"float\" | \"int\"\n) -\u003e float | int\n```\n\n```python\nrunner = ThaNum()\n\nrunner.format_num_rows(\n    rows,\n    column,\n    *,\n    strip_currency=True,\n    strip_commas=True,\n    round_to=None,\n    cast=\"float\",\n    out_column=None,\n    on_error=\"error\",\n    skip_statuses=None,\n) -\u003e list[dict]\n```\n\n```python\nThaNum.format_num(\"$1,234.56\")          # 1234.56\nThaNum.format_num(\"(£500)\", cast=\"int\") # -500\nThaNum.format_num(\"€9.99\", round_to=1)  # 10.0\n```\n\nParenthetical negatives (`(100)`) are converted automatically. Raises `NumError` on unparseable input, `bool` input, or invalid `cast`.\n\n---\n\n### `ThaDT`\n\nDate format auto-detection and conversion. `format_date` and `now` are static methods. `format_date_rows` requires an instance and stores results in `self.rows`.\n\n```python\nThaDT.now(fmt=\"%Y_%m_%d_%H_%M_%S\") -\u003e str\n\nThaDT.format_date(value: str, to_fmt: str) -\u003e str\n\nrunner = ThaDT()\n\nrunner.format_date_rows(\n    rows,\n    column,\n    to_fmt,\n    *,\n    out_column=None,\n    on_error=\"error\",\n    skip_statuses=None,\n) -\u003e list[dict]\n```\n\nAuto-detects: ISO 8601 (with/without time, with/without ms/Z), compact ISO (`20240415`), year-month (`2024-04`), US `MM/DD/YYYY`, US `MM/DD/YY`, `MM/DD`, long and short month names (`April 15, 2024` / `Apr 15, 2024`).\n\n```python\nThaDT.format_date(\"Apr 15, 2024\", \"%Y-%m-%d\")   # \"2024-04-15\"\nThaDT.format_date(\"04/15/2024\", \"%m/%d/%y\")      # \"04/15/24\"\nThaDT.now()                                       # \"2024_04_15_13_30_00\"\n```\n\nRaises `DateError` on unrecognized formats or invalid `on_error`.\n\n---\n\n### `on_error` (all row methods)\n\n| Value | Behaviour |\n|---|---|\n| `\"error\"` | `row status=\"error\"`, `message=...`, output column set to `\"\"` |\n| `\"skip\"` | Row returned unchanged |\n| `\"blank\"` | Output column set to `\"\"`, row status untouched |\n\n### `skip_statuses`\n\nRows whose `\"row status\"` value is in this list are passed through unchanged. Default: `[\"error\", \"warning\"]`. Pass `[]` to process all rows regardless of status.\n\n---\n\n### Error classes\n\n| Class | Raised by |\n|---|---|\n| `UtilsError` | Base class — catch all tha-utils-helper errors |\n| `StrError` | `ThaStr` methods |\n| `NumError` | `ThaNum` methods |\n| `DateError` | `ThaDT` methods |\n\n```python\nfrom tha_utils_helper import StrError, NumError, DateError, UtilsError\n```\n\n---\n\n## Composing with `tha-csv-runner`\n\n```python\nfrom tha_csv_runner import ThaCSV\nfrom tha_utils_helper import ThaNum, ThaStr, ThaDT\n\ncsv = ThaCSV()\ncsv.read(\"Load\", \"input.csv\", [\"Org BK\", \"Budget\", \"Start Date\", \"Name\"])\n\nrows = ThaNum().format_num_rows(csv.rows, column=\"Budget\", cast=\"float\", round_to=2)\nrows = ThaDT().format_date_rows(rows, column=\"Start Date\", to_fmt=\"%Y-%m-%d\")\nrows = ThaStr().format_str_rows(rows, column=\"Name\", case=\"lower\")\n\ncsv.write(\"Write\", \"output.csv\", rows=rows)\n```\n\n---\n\n## Alternatives\n\nThis library is intentionally limited in scope — it exists as a zero-dependency utility layer for the `tha-*` ecosystem. If you need something more comprehensive, these are the go-to options:\n\n**General utilities:**\n- [**toolz**](https://toolz.readthedocs.io) — covers most of what's here and much more: chunking, flattening, pick, omit, nested get, and functional composition\n- [**funcy**](https://funcy.readthedocs.io) — functional helpers including `pick`, `omit`, `chunks`, and silent type coercions\n\n**String normalization / slugification:**\n- [**python-slugify**](https://github.com/un33k/python-slugify) — full-featured slugification with transliteration support and configurable stop words\n- [**Unidecode**](https://github.com/avian2/unidecode) — broad unicode-to-ASCII transliteration\n\n**Numeric parsing:**\n- [**babel**](https://babel.pocoo.org) — locale-aware number parsing that handles locale-specific decimal and grouping separators\n- [**price-parser**](https://github.com/scrapinghub/price-parser) — extracts prices and currency from arbitrary text\n\n**Date parsing:**\n- [**python-dateutil**](https://dateutil.readthedocs.io) — flexible date parsing including fuzzy matching; no row-level error handling\n- [**pendulum**](https://pendulum.eustace.io) — timezone-aware datetime with parsing and formatting\n\nChoose this library when you want all of the above in a single zero-dependency install with consistent row-level error capture that slots into the `tha-*` pipeline.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftha-guy-nate%2Ftha-utils-helper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftha-guy-nate%2Ftha-utils-helper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftha-guy-nate%2Ftha-utils-helper/lists"}