{"id":24297044,"url":"https://github.com/jayclassless/tabfilereader","last_synced_at":"2025-10-12T12:17:50.377Z","repository":{"id":69477045,"uuid":"307009224","full_name":"jayclassless/tabfilereader","owner":"jayclassless","description":null,"archived":false,"fork":false,"pushed_at":"2020-11-08T06:05:36.000Z","size":5909,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-16T19:53:12.285Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jayclassless.png","metadata":{"files":{"readme":"README.rst","changelog":"CHANGES.rst","contributing":null,"funding":null,"license":"LICENSE.rst","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-10-25T02:24:57.000Z","updated_at":"2020-11-08T06:03:28.000Z","dependencies_parsed_at":"2023-04-22T04:13:24.930Z","dependency_job_id":null,"html_url":"https://github.com/jayclassless/tabfilereader","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jayclassless%2Ftabfilereader","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jayclassless%2Ftabfilereader/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jayclassless%2Ftabfilereader/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jayclassless%2Ftabfilereader/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jayclassless","download_url":"https://codeload.github.com/jayclas
sless/tabfilereader/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242179252,"owners_count":20084940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-16T19:49:59.923Z","updated_at":"2025-10-12T12:17:45.323Z","avatar_url":"https://github.com/jayclassless.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"*******\nWelcome\n*******\n\n.. image:: https://img.shields.io/pypi/v/tabfilereader.svg\n   :target: https://pypi.python.org/pypi/tabfilereader\n.. image:: https://img.shields.io/pypi/l/tabfilereader.svg\n   :target: https://pypi.python.org/pypi/tabfilereader\n.. image:: https://github.com/jayclassless/tabfilereader/workflows/Test/badge.svg\n   :target: https://github.com/jayclassless/tabfilereader/actions\n.. image:: https://github.com/jayclassless/tabfilereader/workflows/Docs/badge.svg\n   :target: https://jayclassless.github.io/tabfilereader/\n\n\nOverview\n========\n``tabfilereader`` is a small library to make reading flat, tabular data from\nfiles a bit less tedious.\n\nAt its base, to use ``tabfilereader``, you simply define your Schema, then use\nit to open a Reader. You can then iterate through the Reader to retrieve\nrecords from the file.\n\n    \u003e\u003e\u003e import tabfilereader as tfr\n    \u003e\u003e\u003e class MySchema(tfr.Schema):\n    ...     column1 = tfr.Column('column_1')\n    ...     
column2 = tfr.Column('column_2', data_type=tfr.IntegerType(), data_required=True)\n    \u003e\u003e\u003e reader = tfr.CsvReader.open('test/data/simple_header.csv', MySchema)\n    \u003e\u003e\u003e for record, errors in reader:\n    ...     print(record)\n    Record(column1='foo', column2=123)\n    Record(column1='bar', column2=None)\n\n\nSchemas\n=======\nSchema classes tell ``tabfilereader`` what columns to expect in the file, and\nwhat datatypes the values contained in them should be cast as. You create your\nschemas by defining a class that inherits from ``tabfilereader.Schema``. In\nthis class, you define properties that are instances of\n``tabfilereader.Column``, which specify where columns are in the file, and what\ntheir datatype is. An example::\n\n    \u003e\u003e\u003e import re\n    \u003e\u003e\u003e class ExampleSchema(tfr.Schema):\n    ...     first = tfr.Column('First Name')\n    ...     last = tfr.Column('Last Name', data_required=True)\n    ...     birthdate = tfr.Column(re.compile(r'^Birth.*'), data_type=tfr.DateType())\n    ...     weight = tfr.Column('Weight', data_type=tfr.FloatType(), required=False)\n\nColumns require at least one argument that tells ``tabfilereader`` how to find\nthe column in the file. For files where the first record contains column names,\nyou can specify either:\n\n* The exact name of the column as a string.\n* An ``re.Pattern`` that will match the column name.\n* A sequence of strings or ``re.Pattern`` objects that the column could\n  possibly be named as.\n\nFor files that do not contain a header record, you specify the column's\nlocation with a zero-based integer index.\n\nColumns also take a series of optional parameters:\n\n``required``\n    To indicate whether or not it is required that this column exists in the\n    file. Defaults to ``True``.\n\n``data_required``\n    To indicate whether or not the column must have a value for every record in\n    the file. 
Defaults to ``False``.\n\n``data_type``\n    With this parameter, you can provide a ``callable`` that will receive a\n    string value from the file and return a parsed and properly-typed value. If\n    the value is invalid, the callable should throw a ``ValueError``.\n    ``tabfilereader`` provides an array of pre-defined Types that you can use\n    here for the most common data types (numbers, dates, strings, etc).\n    See the API documentation for all the available pre-defined Types. This\n    parameter defaults to ``tabfilereader.StringType()`` if not specified.\n\nThere are also a handful of optional parameters that can be declared on the\nSchema itself. The available options are:\n\n``ignore_unknown_columns``\n    To indicate what should be done if a Reader finds columns in the file that\n    are not declared in the Schema. Defaults to ``False``, which means the\n    Reader will throw an exception.\n\n``ignore_empty_records``\n    To indicate what should be done if a Reader encounters a record with no\n    columns whatsoever. Defaults to ``False``, which means the reader will\n    return a record that is full of errors. This option is particularly useful\n    for CSV files when people are a bit sloppy with their newlines at the end\n    of a file.\n\nTo set these Schema-level options, pass them as keyword arguments in the class\ndeclaration::\n\n    \u003e\u003e\u003e class SchemaWithOptions(tfr.Schema, ignore_unknown_columns=True):\n    ...     
column1 = tfr.Column('column_1')\n\n\nReaders\n=======\nReaders use the Schemas to interpret the contents of the tabular files.\n``tabfilereader`` provides the following Readers to handle various types of\nfiles:\n\n``CsvReader``\n    Handles Comma Separated Value files (or similarly-constructed files; TSV,\n    etc).\n\n``ExcelReader``\n    Handles Excel spreadsheets; either XLS- or XLSX-formatted.\n\n``OdsReader``\n    Handles OpenDocumentFormat spreadsheets.\n\nReaders can be created by either calling the ``open()`` classmethod on the\nspecific Reader class you want to use, or by defining your own Reader class\nthat inherits from one provided by ``tabfilereader`` like so::\n\n    \u003e\u003e\u003e class MyReader(tfr.CsvReader):\n    ...     schema = MySchema\n    ...     delimiter = '|'\n\n    \u003e\u003e\u003e reader = MyReader('test/data/simple_header_pipe.csv')\n\nEach reader allows for a variety of optional parameters (like ``delimiter`` in\nthe example above). See the API documentation for a full listing of the options\nfor each.\n\nReaders are iterable. Each iteration returns a tuple of two values. The first\nvalue is a Record that contains the values from the file. The second value is\na collection of all the errors encountered when trying to parse the values in\nthe columns.\n\n    \u003e\u003e\u003e record, errors = next(reader)\n    \u003e\u003e\u003e record.column1\n    'foo'\n    \u003e\u003e\u003e record['column2']\n    123\n    \u003e\u003e\u003e bool(errors)\n    False\n\n    \u003e\u003e\u003e record, errors = next(reader)\n    \u003e\u003e\u003e record.column1\n    'bar'\n    \u003e\u003e\u003e record['column2'] is None\n    True\n    \u003e\u003e\u003e bool(errors)\n    True\n    \u003e\u003e\u003e errors['column2']\n    'A value is required'\n\n\nLicense\n=======\nThis project is released under the terms of the `MIT License`_.\n\n.. 
_MIT License: https://opensource.org/licenses/MIT\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjayclassless%2Ftabfilereader","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjayclassless%2Ftabfilereader","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjayclassless%2Ftabfilereader/lists"}