{"id":13541962,"url":"https://github.com/reubano/meza","last_synced_at":"2025-05-14T23:07:03.594Z","repository":{"id":36736927,"uuid":"41043522","full_name":"reubano/meza","owner":"reubano","description":"A Python toolkit for processing tabular data","archived":false,"fork":false,"pushed_at":"2025-02-27T13:29:11.000Z","size":4720,"stargazers_count":417,"open_issues_count":12,"forks_count":29,"subscribers_count":17,"default_branch":"main","last_synced_at":"2025-05-11T01:04:11.261Z","etag":null,"topics":["csv","data","excel","featured","functional-programming","library","pandas","tabular-data","xlsx","xml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/reubano.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2015-08-19T15:42:52.000Z","updated_at":"2025-04-06T02:37:20.000Z","dependencies_parsed_at":"2024-12-20T04:03:16.528Z","dependency_job_id":"69863f4b-d393-433a-8a9a-ec4ef6808a7e","html_url":"https://github.com/reubano/meza","commit_stats":{"total_commits":578,"total_committers":9,"mean_commits":64.22222222222223,"dds":0.01384083044982698,"last_synced_commit":"40739525f3286912eba70de33602746fce370653"},"previous_names":[],"tags_count":94,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reubano%2Fmeza","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reubano%2Fmeza/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reubano%2Fmeza/releases"
,"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/reubano%2Fmeza/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/reubano","download_url":"https://codeload.github.com/reubano/meza/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254243362,"owners_count":22038046,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","data","excel","featured","functional-programming","library","pandas","tabular-data","xlsx","xml"],"created_at":"2024-08-01T10:00:59.424Z","updated_at":"2025-05-14T23:06:58.585Z","avatar_url":"https://github.com/reubano.png","language":"Python","readme":"meza: A Python toolkit for processing tabular data\n======================================================\n\n|GHA| |versions| |pypi|\n\nIndex\n-----\n\n`Introduction`_ | `Requirements`_ | `Motivation`_ | `Hello World`_ | `Usage`_ |\n`Interoperability`_ | `Installation`_ | `Project Structure`_ |\n`Design Principles`_ | `Scripts`_ | `Contributing`_ | `Credits`_ |\n`More Info`_ | `License`_\n\nIntroduction\n------------\n\n**meza** is a Python `library`_ for reading and processing tabular data.\nIt has a functional programming style API, excels at reading/writing large files,\nand can process 10+ file types.\n\nWith meza, you can\n\n- Read csv/xls/xlsx/mdb/dbf files, and more!\n- Type cast records (date, float, text...)\n- Process Uñicôdë text\n- Lazily stream files by default\n- and much more...\n\nRequirements\n------------\n\nmeza has been tested and is known to work on Python 3.7, 3.8, and 3.9; 
and PyPy3.7.\n\nOptional Dependencies\n^^^^^^^^^^^^^^^^^^^^^\n\n===============================  ==============  ==============================  =======================\nFunction                         Dependency      Installation                    File type / extension\n===============================  ==============  ==============================  =======================\n``meza.io.read_mdb``             `mdbtools`_     ``sudo port install mdbtools``   Microsoft Access / mdb\n``meza.io.read_html``            `lxml`_ [#]_    ``pip install lxml``             HTML / html\n``meza.convert.records2array``   `NumPy`_ [#]_   ``pip install numpy``            n/a\n``meza.convert.records2df``      `pandas`_       ``pip install pandas``           n/a\n===============================  ==============  ==============================  =======================\n\nNotes\n^^^^^\n\n.. [#] If ``lxml`` isn't present, ``read_html`` will default to the builtin Python html reader\n\n.. [#] ``records2array`` can be used without ``numpy`` by passing ``native=True`` in the function call. This will convert ``records`` into a list of native ``array.array`` objects.\n\nMotivation\n----------\n\nWhy I built meza\n^^^^^^^^^^^^^^^^\n\npandas is great, but installing it isn't exactly a `walk in the park`_, and it\ndoesn't play nice with `PyPy`_. I designed **meza** to be a lightweight, easy to install, less featureful alternative to\npandas. I also optimized **meza** for low memory usage, PyPy compatibility, and functional programming best practices.\n\nWhy you should use meza\n^^^^^^^^^^^^^^^^^^^^^^^\n\n``meza`` provides a number of benefits / differences from similar libraries such\nas ``pandas``. 
Namely:\n\n- a functional programming (instead of object oriented) API\n- `iterators by default`_ (reading/writing)\n- `PyPy compatibility`_\n- `geojson support`_ (reading/writing)\n- `seamless integration`_ with sqlalchemy (and other libs that work with iterators of dicts)\n\nFor more detailed information, please check out the `FAQ`_.\n\nHello World\n-----------\n\nA simple data processing example is shown below:\n\nFirst create a simple csv file (in bash)\n\n.. code-block:: bash\n\n    echo 'col1,col2,col3\\nhello,5/4/82,1\\none,1/1/15,2\\nhappy,7/1/92,3\\n' \u003e data.csv\n\nNow we can read the file, manipulate the data a bit, and write the manipulated\ndata back to a new file.\n\n.. code-block:: python\n\n    \u003e\u003e\u003e from meza import io, process as pr, convert as cv\n    \u003e\u003e\u003e from io import open\n\n    \u003e\u003e\u003e # Load the csv file\n    \u003e\u003e\u003e records = io.read_csv('data.csv')\n\n    \u003e\u003e\u003e # `records` are iterators over the rows\n    \u003e\u003e\u003e row = next(records)\n    \u003e\u003e\u003e row\n    {'col1': 'hello', 'col2': '5/4/82', 'col3': '1'}\n\n    \u003e\u003e\u003e # Let's replace the first row so as not to lose any data\n    \u003e\u003e\u003e records = pr.prepend(records, row)\n\n    # Guess column types. Note: `detect_types` returns a new `records`\n    # generator since it consumes rows during type detection\n    \u003e\u003e\u003e records, result = pr.detect_types(records)\n    \u003e\u003e\u003e {t['id']: t['type'] for t in result['types']}\n    {'col1': 'text', 'col2': 'date', 'col3': 'int'}\n\n    # Now type cast the records. 
Note: most `meza.process` functions return\n    # generators, so let's wrap the result in a list to view the data\n    \u003e\u003e\u003e casted = list(pr.type_cast(records, result['types']))\n    \u003e\u003e\u003e casted[0]\n    {'col1': 'hello', 'col2': datetime.date(1982, 5, 4), 'col3': 1}\n\n    # Cut out the first column of data and merge the rows to get the max value\n    # of the remaining columns. Note: since `merge` (by definition) will always\n    # contain just one row, it is returned as is (not wrapped in a generator)\n    \u003e\u003e\u003e cut_recs = pr.cut(casted, ['col1'], exclude=True)\n    \u003e\u003e\u003e merged = pr.merge(cut_recs, pred=bool, op=max)\n    \u003e\u003e\u003e merged\n    {'col2': datetime.date(2015, 1, 1), 'col3': 3}\n\n    # Now write merged data back to a new csv file.\n    \u003e\u003e\u003e io.write('out.csv', cv.records2csv(merged))\n\n    # View the result\n    \u003e\u003e\u003e with open('out.csv', encoding='utf-8') as f:\n    ...     f.read()\n    'col2,col3\\n2015-01-01,3\\n'\n\nUsage\n-----\n\nmeza is intended to be used directly as a Python library.\n\nUsage Index\n^^^^^^^^^^^\n\n- `Reading data`_\n- `Processing data`_\n\n  + `Numerical analysis (à la pandas)`_\n  + `Text processing (à la csvkit)`_\n  + `Geo processing (à la mapbox)`_\n\n- `Writing data`_\n- `Cookbook`_\n\nReading data\n^^^^^^^^^^^^\n\nmeza can read both filepaths and file-like objects. Additionally, all readers\nreturn equivalent `records` iterators, i.e., a generator of dictionaries with\nkeys corresponding to the column names.\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e from io import open, StringIO\n    \u003e\u003e\u003e from meza import io\n\n    \"\"\"Read a filepath\"\"\"\n    \u003e\u003e\u003e records = io.read_json('path/to/file.json')\n\n    \"\"\"Read a file like object and de-duplicate the header\"\"\"\n    \u003e\u003e\u003e f = StringIO('col,col\\nhello,world\\n')\n    \u003e\u003e\u003e records = io.read_csv(f, dedupe=True)\n\n    \"\"\"View the first row\"\"\"\n    \u003e\u003e\u003e next(records)\n    {'col': 'hello', 'col_2': 'world'}\n\n    \"\"\"Read the 1st sheet of an xls file object opened in text mode.\"\"\"\n    # Also, sanitize the header names by converting them to lowercase and\n    # replacing whitespace and invalid characters with `_`.\n    \u003e\u003e\u003e with open('path/to/file.xls', encoding='utf-8') as f:\n    ...     for row in io.read_xls(f, sanitize=True):\n    ...         # do something with the `row`\n    ...         pass\n\n    \"\"\"Read the 2nd sheet of an xlsx file object opened in binary mode\"\"\"\n    # Note: sheets are zero indexed\n    \u003e\u003e\u003e with open('path/to/file.xlsx', 'rb') as f:\n    ...     records = io.read_xls(f, encoding='utf-8', sheet=1)\n    ...     first_row = next(records)\n    ...     # do something with the `first_row`\n\n    \"\"\"Read any recognized file\"\"\"\n    \u003e\u003e\u003e records = io.read('path/to/file.geojson')\n    \u003e\u003e\u003e f.seek(0)\n    \u003e\u003e\u003e records = io.read(f, ext='csv', dedupe=True)\n\nPlease see `readers`_ for a complete list of available readers and recognized\nfile types.\n\nProcessing data\n^^^^^^^^^^^^^^^\n\nNumerical analysis (à la pandas) [#]_\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIn the following example, ``pandas`` equivalent methods are preceded by ``--\u003e``.\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e import itertools as it\n    \u003e\u003e\u003e import random\n\n    \u003e\u003e\u003e from io import StringIO\n    \u003e\u003e\u003e from meza import io, process as pr, convert as cv, stats\n\n    # Create some data in the same structure as what the various `read...`\n    # functions output\n    \u003e\u003e\u003e header = ['A', 'B', 'C', 'D']\n    \u003e\u003e\u003e data = [(random.random() for _ in range(4)) for x in range(7)]\n    \u003e\u003e\u003e df = [dict(zip(header, d)) for d in data]\n    \u003e\u003e\u003e df[0]\n    {'A': 0.53908..., 'B': 0.28919..., 'C': 0.03003..., 'D': 0.65363...}\n\n    \"\"\"Sort records by the value of column `B` --\u003e df.sort_values(by='B')\"\"\"\n    \u003e\u003e\u003e next(pr.sort(df, 'B'))\n    {'A': 0.53520..., 'B': 0.06763..., 'C': 0.02351..., 'D': 0.80529...}\n\n    \"\"\"Select column `A` --\u003e df['A']\"\"\"\n    \u003e\u003e\u003e next(pr.cut(df, ['A']))\n    {'A': 0.53908170489952006}\n\n    \"\"\"Select the first three rows of data --\u003e df[0:3]\"\"\"\n    \u003e\u003e\u003e len(list(it.islice(df, 3)))\n    3\n\n    \"\"\"Select all data whose value for column `A` is less than 0.5\n    --\u003e df[df.A \u003c 0.5]\n    \"\"\"\n    \u003e\u003e\u003e next(pr.tfilter(df, 'A', lambda x: x \u003c 0.5))\n    {'A': 0.21000..., 'B': 0.25727..., 'C': 0.39719..., 'D': 0.64157...}\n\n    # Note: since `aggregate` and `merge` (by definition) return just one row,\n    # they return them as is (not wrapped in a generator).\n    \"\"\"Calculate the mean of column `A` across all data --\u003e df.mean()['A']\"\"\"\n    \u003e\u003e\u003e pr.aggregate(df, 'A', stats.mean)['A']\n    0.5410437473067938\n\n    \"\"\"Calculate the sum of each column across all data --\u003e df.sum()\"\"\"\n    \u003e\u003e\u003e pr.merge(df, pred=bool, op=sum)\n    {'A': 3.78730..., 'C': 2.82875..., 'B': 3.14195..., 'D': 5.26330...}\n\nText processing (à la csvkit) 
[#]_\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIn the following example, ``csvkit`` equivalent commands are preceded by ``--\u003e``.\n\nFirst create a few simple csv files (in bash)\n\n.. code-block:: bash\n\n    echo 'col_1,col_2,col_3\\n1,dill,male\\n2,bob,male\\n3,jane,female' \u003e file1.csv\n    echo 'col_1,col_2,col_3\\n4,tom,male\\n5,dick,male\\n6,jill,female' \u003e file2.csv\n\nNow we can read the files, manipulate the data, convert the manipulated data to\njson, and write the json back to a new file. Also, note that since all readers\nreturn equivalent `records` iterators, you can use them interchangeably (in\nplace of ``read_csv``) to open any supported file. E.g., ``read_xls``,\n``read_sqlite``, etc.\n\n.. code-block:: python\n\n    \u003e\u003e\u003e import itertools as it\n\n    \u003e\u003e\u003e from meza import io, process as pr, convert as cv\n\n    \"\"\"Combine the files into one iterator\n    --\u003e csvstack file1.csv file2.csv\n    \"\"\"\n    \u003e\u003e\u003e records = io.join('file1.csv', 'file2.csv')\n    \u003e\u003e\u003e next(records)\n    {'col_1': '1', 'col_2': 'dill', 'col_3': 'male'}\n    \u003e\u003e\u003e next(it.islice(records, 4, None))\n    {'col_1': '6', 'col_2': 'jill', 'col_3': 'female'}\n\n    # Now let's create a persistent records list\n    \u003e\u003e\u003e records = list(io.read_csv('file1.csv'))\n\n    \"\"\"Sort records by the value of column `col_2`\n    --\u003e csvsort -c col_2 file1.csv\n    \"\"\"\n    \u003e\u003e\u003e next(pr.sort(records, 'col_2'))\n    {'col_1': '2', 'col_2': 'bob', 'col_3': 'male'}\n\n    \"\"\"Select column `col_2` --\u003e csvcut -c col_2 file1.csv\"\"\"\n    \u003e\u003e\u003e next(pr.cut(records, ['col_2']))\n    {'col_2': 'dill'}\n\n    \"\"\"Select all data whose value for column `col_2` contains `jan`\n    --\u003e csvgrep -c col_2 -m jan file1.csv\n    \"\"\"\n    \u003e\u003e\u003e next(pr.grep(records, [{'pattern': 'jan'}], ['col_2']))\n    {'col_1': '3', 'col_2': 'jane', 
'col_3': 'female'}\n\n    \"\"\"Convert a csv file to json --\u003e csvjson -i 4 file1.csv\"\"\"\n    \u003e\u003e\u003e io.write('file.json', cv.records2json(records))\n\n    # View the result\n    \u003e\u003e\u003e with open('file.json', encoding='utf-8') as f:\n    ...     f.read()\n    '[{\"col_1\": \"1\", \"col_2\": \"dill\", \"col_3\": \"male\"}, {\"col_1\": \"2\",\n    \"col_2\": \"bob\", \"col_3\": \"male\"}, {\"col_1\": \"3\", \"col_2\": \"jane\",\n    \"col_3\": \"female\"}]'\n\nGeo processing (à la mapbox) [#]_\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n\nIn the following example, ``mapbox`` equivalent commands are preceded by ``--\u003e``.\n\nFirst create a geojson file (in bash)\n\n.. code-block:: bash\n\n    echo '{\"type\": \"FeatureCollection\",\"features\": [' \u003e file.geojson\n    echo '{\"type\": \"Feature\", \"id\": 11, \"geometry\": {\"type\": \"Point\", \"coordinates\": [10, 20]}},' \u003e\u003e file.geojson\n    echo '{\"type\": \"Feature\", \"id\": 12, \"geometry\": {\"type\": \"Point\", \"coordinates\": [5, 15]}}]}' \u003e\u003e file.geojson\n\nNow we can open the file, split the data by id, and finally convert the split data\nto a new geojson file-like object.\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e from meza import io, process as pr, convert as cv\n\n    # Load the geojson file and peek at the results\n    \u003e\u003e\u003e records, peek = pr.peek(io.read_geojson('file.geojson'))\n    \u003e\u003e\u003e peek[0]\n    {'lat': 20, 'type': 'Point', 'lon': 10, 'id': 11}\n\n    \"\"\"Split the records by feature ``id`` and select the first feature\n    --\u003e geojsplit -k id file.geojson\n    \"\"\"\n    \u003e\u003e\u003e splits = pr.split(records, 'id')\n    \u003e\u003e\u003e feature_records, name = next(splits)\n    \u003e\u003e\u003e name\n    11\n\n    \"\"\"Convert the feature records into a GeoJSON file-like object\"\"\"\n    \u003e\u003e\u003e geojson = cv.records2geojson(feature_records)\n    \u003e\u003e\u003e geojson.readline()\n    '{\"type\": \"FeatureCollection\", \"bbox\": [10, 20, 10, 20], \"features\": '\n    '[{\"type\": \"Feature\", \"id\": 11, \"geometry\": {\"type\": \"Point\", '\n    '\"coordinates\": [10, 20]}, \"properties\": {\"id\": 11}}], \"crs\": {\"type\": '\n    '\"name\", \"properties\": {\"name\": \"urn:ogc:def:crs:OGC:1.3:CRS84\"}}}'\n\n    # Note: you can also write back to a file as shown previously\n    # io.write('file.geojson', geojson)\n\nWriting data\n^^^^^^^^^^^^\n\nmeza can persist ``records`` to disk via the following functions:\n\n- ``meza.convert.records2csv``\n- ``meza.convert.records2json``\n- ``meza.convert.records2geojson``\n\nEach function returns a file-like object that you can write to disk via\n``meza.io.write('/path/to/file', result)``.\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e from meza import io, convert as cv\n    \u003e\u003e\u003e from io import StringIO, open\n\n    # First let's create a simple tsv file like object\n    \u003e\u003e\u003e f = StringIO('col1\\tcol2\\nhello\\tworld\\n')\n    \u003e\u003e\u003e f.seek(0)\n\n    # Next create a records list so we can reuse it\n    \u003e\u003e\u003e records = list(io.read_tsv(f))\n    \u003e\u003e\u003e records[0]\n    {'col1': 'hello', 'col2': 'world'}\n\n    # Now we're ready to write the records data to file\n\n    \"\"\"Create a csv file like object\"\"\"\n    \u003e\u003e\u003e cv.records2csv(records).readline()\n    'col1,col2\\n'\n\n    \"\"\"Create a json file like object\"\"\"\n    \u003e\u003e\u003e cv.records2json(records).readline()\n    '[{\"col1\": \"hello\", \"col2\": \"world\"}]'\n\n    \"\"\"Write back csv to a filepath\"\"\"\n    \u003e\u003e\u003e io.write('file.csv', cv.records2csv(records))\n    \u003e\u003e\u003e with open('file.csv', encoding='utf-8') as f_in:\n    ...     f_in.read()\n    'col1,col2\\nhello,world\\n'\n\n    \"\"\"Write back json to a filepath\"\"\"\n    \u003e\u003e\u003e io.write('file.json', cv.records2json(records))\n    \u003e\u003e\u003e with open('file.json', encoding='utf-8') as f_in:\n    ...     f_in.readline()\n    '[{\"col1\": \"hello\", \"col2\": \"world\"}]'\n\nCookbook\n^^^^^^^^\n\nPlease see the `cookbook`_ or ipython `notebook`_ for more examples.\n\nNotes\n^^^^^\n\n.. [#] http://pandas.pydata.org/pandas-docs/stable/10min.html#min\n.. [#] https://csvkit.readthedocs.org/en/0.9.1/cli.html#processing\n.. [#] https://github.com/mapbox?utf8=%E2%9C%93\u0026query=geojson\n\nInteroperability\n----------------\n\nmeza plays nicely with NumPy and friends out of the box\n\nsetup\n^^^^^\n\n.. code-block:: python\n\n    \u003e\u003e\u003e from meza import process as pr\n\n    # First create some records and types. 
Also, convert the records to a list\n    # so we can reuse them.\n    \u003e\u003e\u003e records = [{'a': 'one', 'b': 2}, {'a': 'five', 'b': 10, 'c': 20.1}]\n    \u003e\u003e\u003e records, result = pr.detect_types(records)\n    \u003e\u003e\u003e records, types = list(records), result['types']\n    \u003e\u003e\u003e types\n    [\n        {'type': 'text', 'id': 'a'},\n        {'type': 'int', 'id': 'b'},\n        {'type': 'float', 'id': 'c'}]\n\n\nfrom records to pandas.DataFrame to records\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. code-block:: python\n\n    \u003e\u003e\u003e import pandas as pd\n    \u003e\u003e\u003e from meza import convert as cv\n\n    \"\"\"Convert the records to a DataFrame\"\"\"\n    \u003e\u003e\u003e df = cv.records2df(records, types)\n    \u003e\u003e\u003e df\n            a   b   c\n    0   one   2   NaN\n    1  five  10  20.1\n    # Alternatively, you can do `pd.DataFrame(records)`\n\n    \"\"\"Convert the DataFrame back to records\"\"\"\n    \u003e\u003e\u003e next(cv.df2records(df))\n    {'a': 'one', 'b': 2, 'c': nan}\n\nfrom records to arrays to records\n^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n.. 
code-block:: python\n\n    \u003e\u003e\u003e import numpy as np\n\n    \u003e\u003e\u003e from array import array\n    \u003e\u003e\u003e from meza import convert as cv\n\n    \"\"\"Convert records to a structured array\"\"\"\n    \u003e\u003e\u003e recarray = cv.records2array(records, types)\n    \u003e\u003e\u003e recarray\n    rec.array([('one', 2, nan), ('five', 10, 20.100000381469727)],\n              dtype=[('a', 'O'), ('b', '\u003ci4'), ('c', '\u003cf4')])\n    \u003e\u003e\u003e recarray.b\n    array([ 2, 10], dtype=int32)\n\n    \"\"\"Convert records to a native array\"\"\"\n    \u003e\u003e\u003e narray = cv.records2array(records, types, native=True)\n    \u003e\u003e\u003e narray\n    [[array('u', 'a'), array('u', 'b'), array('u', 'c')],\n    [array('u', 'one'), array('u', 'five')],\n    array('i', [2, 10]),\n    array('f', [0.0, 20.100000381469727])]\n\n    \"\"\"Convert a 2-D NumPy array to a records generator\"\"\"\n    \u003e\u003e\u003e data = np.array([[1, 2, 3], [4, 5, 6]], np.int32)\n    \u003e\u003e\u003e data\n    array([[1, 2, 3],\n           [4, 5, 6]], dtype=int32)\n    \u003e\u003e\u003e next(cv.array2records(data))\n    {'column_1': 1, 'column_2': 2, 'column_3': 3}\n\n    \"\"\"Convert the structured array back to a records generator\"\"\"\n    \u003e\u003e\u003e next(cv.array2records(recarray))\n    {'a': 'one', 'b': 2, 'c': nan}\n\n    \"\"\"Convert the native array back to records generator\"\"\"\n    \u003e\u003e\u003e next(cv.array2records(narray, native=True))\n    {'a': 'one', 'b': 2, 'c': 0.0}\n\nInstallation\n------------\n\n(You are using a `virtualenv`_, right?)\n\nAt the command line, install meza using either ``pip`` (*recommended*)\n\n.. code-block:: bash\n\n    pip install meza\n\nor ``easy_install``\n\n.. code-block:: bash\n\n    easy_install meza\n\nPlease see the `installation doc`_ for more details.\n\nProject Structure\n-----------------\n\n.. 
code-block:: bash\n\n    ┌── CONTRIBUTING.rst\n    ├── LICENSE\n    ├── MANIFEST.in\n    ├── Makefile\n    ├── README.rst\n    ├── data\n    │   ├── converted/*\n    │   └── test/*\n    ├── dev-requirements.txt\n    ├── docs\n    │   ├── AUTHORS.rst\n    │   ├── CHANGES.rst\n    │   ├── COOKBOOK.rst\n    │   ├── FAQ.rst\n    │   ├── INSTALLATION.rst\n    │   └── TODO.rst\n    ├── examples\n    │   ├── usage.ipynb\n    │   └── usage.py\n    ├── helpers/*\n    ├── manage.py\n    ├── meza\n    │   ├── __init__.py\n    │   ├── convert.py\n    │   ├── dbf.py\n    │   ├── fntools.py\n    │   ├── io.py\n    │   ├── process.py\n    │   ├── stats.py\n    │   ├── typetools.py\n    │   └── unicsv.py\n    ├── optional-requirements.txt\n    ├── py2-requirements.txt\n    ├── requirements.txt\n    ├── setup.cfg\n    ├── setup.py\n    ├── tests\n    │   ├── __init__.py\n    │   ├── standard.rc\n    │   ├── test_fntools.py\n    │   ├── test_io.py\n    │   └── test_process.py\n    └── tox.ini\n\nDesign Principles\n-----------------\n\n- prefer functions over objects\n- provide enough functionality out of the box to easily implement the most common data analysis use cases\n- make conversion between ``records``, ``arrays``, and ``DataFrames`` dead simple\n- whenever possible, lazily read objects and stream the result [#]_\n\n.. [#] Notable exceptions are ``meza.process.group``, ``meza.process.sort``, ``meza.io.read_dbf``, ``meza.io.read_yaml``, and ``meza.io.read_html``. These functions read the entire contents into memory up front.\n\nScripts\n-------\n\nmeza comes with a built in task manager ``manage.py``\n\nSetup\n^^^^^\n\n.. code-block:: bash\n\n    pip install -r dev-requirements.txt\n\nExamples\n^^^^^^^^\n\n*Run python linter and nose tests*\n\n.. 
code-block:: bash\n\n    manage lint\n    manage test\n\nContributing\n------------\n\nPlease mimic the coding style/conventions used in this repo.\nIf you add new classes or functions, please add the appropriate doc blocks with\nexamples. Also, make sure the python linter and nose tests pass.\n\nPlease see the `contributing doc`_ for more details.\n\nCredits\n-------\n\nShoutouts to `csvkit`_, `messytables`_, and `pandas`_ for heavily inspiring meza.\n\nMore Info\n---------\n\n- `FAQ`_\n- `cookbook`_\n- ipython `notebook`_\n\nLicense\n-------\n\nmeza is distributed under the `MIT License`_.\n\n.. |GHA| image:: https://github.com/reubano/meza/actions/workflows/main.yml/badge.svg\n    :target: https://github.com/reubano/meza/actions?query=workflow%3A%22tests%22\n\n.. |versions| image:: https://img.shields.io/pypi/pyversions/meza.svg\n    :target: https://pypi.python.org/pypi/meza\n\n.. |pypi| image:: https://img.shields.io/pypi/v/meza.svg\n    :target: https://pypi.python.org/pypi/meza\n\n.. _mdbtools: https://github.com/mdbtools/mdbtools\n.. _lxml: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser\n.. _library: #usage\n.. _NumPy: https://github.com/numpy/numpy\n.. _PyPy: https://github.com/pydata/pandas/issues/9532\n.. _walk in the park: http://pandas.pydata.org/pandas-docs/stable/install.html#installing-pandas-with-anaconda\n.. _csvkit: https://github.com/onyxfish/csvkit\n.. _messytables: https://github.com/okfn/messytables\n.. _pandas: https://github.com/pydata/pandas\n.. _MIT License: http://opensource.org/licenses/MIT\n.. _virtualenv: http://www.virtualenv.org/en/latest/index.html\n.. _contributing doc: https://github.com/reubano/meza/blob/master/CONTRIBUTING.rst\n.. _FAQ: https://github.com/reubano/meza/blob/master/docs/FAQ.rst\n.. _iterators by default: https://github.com/reubano/meza/blob/master/docs/FAQ.rst#memory\n.. _PyPy compatibility: https://github.com/reubano/meza/blob/master/docs/FAQ.rst#pypy\n.. 
_geojson support: https://github.com/reubano/meza/blob/master/docs/FAQ.rst#geojson\n.. _seamless integration: https://github.com/reubano/meza/blob/master/docs/FAQ.rst#convenience\n.. _notebook: http://nbviewer.jupyter.org/github/reubano/meza/blob/master/examples/usage.ipynb\n.. _readers: https://github.com/reubano/meza/blob/master/docs/FAQ.rst#what-readers-are-available\n.. _installation doc: https://github.com/reubano/meza/blob/master/docs/INSTALLATION.rst\n.. _cookbook: https://github.com/reubano/meza/blob/master/docs/COOKBOOK.rst\n","funding_links":[],"categories":["Python","Data Manipulation"],"sub_categories":["Pipelines"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freubano%2Fmeza","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Freubano%2Fmeza","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Freubano%2Fmeza/lists"}