{"id":16265537,"url":"https://github.com/bruth/strconv","last_synced_at":"2025-03-19T23:30:39.223Z","repository":{"id":12068380,"uuid":"14655363","full_name":"bruth/strconv","owner":"bruth","description":"String type inference and conversion","archived":false,"fork":false,"pushed_at":"2022-05-24T01:37:51.000Z","size":28,"stargazers_count":23,"open_issues_count":5,"forks_count":14,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-17T11:55:03.209Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"http://bruth.github.io/strconv","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bruth.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2013-11-24T03:53:50.000Z","updated_at":"2023-01-05T14:01:10.000Z","dependencies_parsed_at":"2022-09-19T19:03:12.489Z","dependency_job_id":null,"html_url":"https://github.com/bruth/strconv","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruth%2Fstrconv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruth%2Fstrconv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruth%2Fstrconv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bruth%2Fstrconv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bruth","download_url":"https://codeload.github.com/bruth/strconv/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244524505,"owners_count":204
66437,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-10T17:09:36.234Z","updated_at":"2025-03-19T23:30:38.977Z","avatar_url":"https://github.com/bruth.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# strconv \n\n[![Build Status](https://travis-ci.org/bruth/strconv.png?branch=master)](https://travis-ci.org/bruth/strconv) [![Coverage Status](https://coveralls.io/repos/bruth/strconv/badge.png?branch=master)](https://coveralls.io/r/bruth/strconv?branch=master) [![Bitdeli Badge](https://d2weczhvl823v0.cloudfront.net/bruth/strconv/trend.png)](https://bitdeli.com/free \"Bitdeli Badge\")\n\nLibrary for inferring and converting strings into native Python types. The original use case for this was reading CSV data with unknown types and converting it into native types for further manipulation.\n\n## Install\n\n**Supports Python 2.7, 3.2, and 3.3**\n\n```\npip install strconv\n```\n\n## Usage\n\n### Conversion\n\n**convert(s, include_type=False)**\n\nAttempts to convert string `s` into a non-string type. If `include_type` is true, the type name is returned as a second value.\n\n```python\n\u003e\u003e\u003e import strconv\n\u003e\u003e\u003e strconv.convert('1.2')\n1.2\n\u003e\u003e\u003e strconv.convert('true')\nTrue\n\u003e\u003e\u003e strconv.convert('2013-03-01', include_type=True)\n(date(2013, 3, 1), 'date')\n```\n\n**convert_series(i, include_type=False)**\n\nTakes an iterable and returns a generator. Each value will be converted independently. 
If `include_type` is true, each value will be paired with its type name.\n\n```python\n\u003e\u003e\u003e list(strconv.convert_series(['1', '1.2', 't', '2013-01-01']))\n[1, 1.2, True, date(2013, 1, 1)]\n```\n\n**convert_matrix(m, include_type=False)**\n\nTakes a matrix (iterable of iterables) and returns a generator. Each value will be converted independently. If `include_type` is true, each value will be paired with its type name.\n\n_A CSV reader can be directly passed into this function._\n\n```python\n\u003e\u003e\u003e import csv\n\u003e\u003e\u003e r = csv.reader(open('data.csv', 'rb'))\n\u003e\u003e\u003e for row in strconv.convert_matrix(r):\n...     ...\n```\n\n### Inference\n\nThese functions are convenience wrappers around the above `convert*` functions that return only the converter type or the converted value's type.\n\n**infer(s, converted=False)**\n\nReturns the converter type name for the string value. If `converted` is true, the type of the converted value will be returned instead.\n\n```python\n\u003e\u003e\u003e strconv.infer('1')\n'int'\n\u003e\u003e\u003e strconv.infer('1', converted=True)\nint\n```\n\n**infer_series(i, n=None, size=10)**\n\nInfers the types of a series of values. The original use case for this was to take a column of data and infer all the types that exist in the data. This would confirm whether the data contains heterogeneous values.\n\nThe output is a `Types` instance that stores information and a sample of the values for inspection. If `n` is an integer, only N values will be evaluated. 
`size` is the number of values per type that will be stored as a sample set for inspection (a larger `size` uses more memory).\n\n```python\n\u003e\u003e\u003e info = strconv.infer_series(['10', '5', '', '-1'])\n\u003e\u003e\u003e info\n\u003cTypes: int=3, unknown=1\u003e\n\u003e\u003e\u003e info.most_common(1)\n[('int', 3)]\n\u003e\u003e\u003e info.types['int'].freq()\n0.75\n```\n\n**infer_matrix(m, n=None, size=10)**\n\nSame as `infer_series` except it will take a matrix of values. Type information will be stored per column, not per row. The output will be a list of `Types` instances of length M where `m` is of size NxM.\n\n```python\n\u003e\u003e\u003e import csv\n\u003e\u003e\u003e r = csv.reader(open('data.csv', 'rb'))\n\u003e\u003e\u003e col_types = strconv.infer_matrix(r)\n```\n\n## Converters\n\nConverters are registered by name and are evaluated in order. Converters should be ordered from the most specific and least complex to the least specific and most complex, since evaluation stops at the first converter that matches a value. Below are the built-in converters listed in order.\n\n- `int`\n- `float`\n- `bool` - case-insensitive conversion: `t`, `true`, `yes` to `True` and `f`, `false`, `no` to `False`\n- `date` - see `strconv.DATE_FORMATS` for the default date formats\n- `time` - see `strconv.TIME_FORMATS` for the default time formats\n- `datetime` - converts using each combination of the date and time formats with either `T` or a single space as the separator, e.g. '2013-03-20T13:05:32'\n\n## Customize\n\nType inference of strings is a very difficult thing to generalize. Often there are subtle nuances to the data that require domain knowledge in order to infer the correct type. `strconv` makes it as simple as possible to customize the behavior of the inference and conversion.\n\n**Register Converter**\n\n```python\n\u003e\u003e\u003e def convert_none(s):\n...     if s.upper() in (r'\\N', 'NA', 'N/A', '', 'UNK'):\n...         return\n...     
raise ValueError\n...\n\u003e\u003e\u003e strconv.register_converter('none', convert_none, priority=0)\n\u003e\u003e\u003e list(strconv.convert_series([r'\\N', '', 'na', 'unk']))\n[None, None, None, None]\n```\n\n**Unregister Converter**\n\nAny of the default converters can be unregistered by name. This is recommended if the data is known not to have certain types.\n\n```python\nstrconv.unregister_converter('datetime')\n```\n\n**Strconv Class**\n\nThe `Strconv` class encapsulates all of the above functionality, which makes it possible to create separate instances for different kinds of files or processing. All the above functions are simply references to the default instance. Instantiate a new empty instance:\n\n```python\nmystrconv = strconv.Strconv()\n```\n\nThe built-in converters are defined in the module:\n\n```python\nmystrconv.register_converter('int', strconv.convert_int)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbruth%2Fstrconv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbruth%2Fstrconv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbruth%2Fstrconv/lists"}