{"id":13856774,"url":"https://github.com/otsaloma/dataiter","last_synced_at":"2026-04-12T21:02:54.501Z","repository":{"id":35138308,"uuid":"211702437","full_name":"otsaloma/dataiter","owner":"otsaloma","description":"Simple, light-weight data frames for Python","archived":false,"fork":false,"pushed_at":"2025-04-18T20:03:48.000Z","size":3105,"stargazers_count":26,"open_issues_count":1,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-19T08:07:14.300Z","etag":null,"topics":["data-frame","json","numba","numpy","python"],"latest_commit_sha":null,"homepage":"https://dataiter.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/otsaloma.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-09-29T17:49:00.000Z","updated_at":"2025-04-18T20:03:51.000Z","dependencies_parsed_at":"2024-04-10T20:33:26.228Z","dependency_job_id":"68ea526a-8ec6-4efe-8815-cda1432be704","html_url":"https://github.com/otsaloma/dataiter","commit_stats":{"total_commits":624,"total_committers":3,"mean_commits":208.0,"dds":0.06730769230769229,"last_synced_commit":"bb7c732ed81f59e0526b6d7a65bd4da59658fbe0"},"previous_names":[],"tags_count":61,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otsaloma%2Fdataiter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otsaloma%2Fdataiter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otsaloma%2Fdataiter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/otsaloma%2Fdataiter/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/otsaloma","download_url":"https://codeload.github.com/otsaloma/dataiter/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252784525,"owners_count":21803699,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-frame","json","numba","numpy","python"],"created_at":"2024-08-05T03:01:12.729Z","updated_at":"2026-04-12T21:02:54.478Z","avatar_url":"https://github.com/otsaloma.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"Simple, Light-Weight Data Frames for Python\n===========================================\n\n[![PyPI](https://img.shields.io/pypi/v/dataiter.svg)](https://pypi.org/project/dataiter)\n[![Downloads](https://pepy.tech/badge/dataiter/month)](https://pepy.tech/project/dataiter)\n\nDataiter's **`DataFrame`** is a class for tabular data similar to R's\n`data.frame`, implementing all common operations to manipulate data. It\nis under the hood a dictionary of NumPy arrays and thus capable of fast\nvectorized operations. You can consider it to be a light-weight\nalternative to Pandas with a simple and consistent API. Performance-wise\nDataiter relies on NumPy and Numba and is likely to be at best\ncomparable to Pandas.\n\n## Installation\n\n```bash\n# Latest stable version\npip install -U dataiter\n\n# Latest development version\npip install -U git+https://github.com/otsaloma/dataiter\n\n# Numba (optional)\npip install -U numba\n```\n\nRecommended NumPy version is currently \u003e= 2.4.0 due to various\nStringDType fixes that have landed in NumPy 2.2.1 and 2.4.0.\n\nDataiter optionally uses **Numba** to speed up certain operations. If\nyou have Numba installed, Dataiter will use it automatically. It's\ncurrently not a hard dependency, so you need to install it separately.\n\n## Quick Start\n\n```python\n\u003e\u003e\u003e import dataiter as di\n\u003e\u003e\u003e data = di.read_csv(\"data/listings.csv\")\n\u003e\u003e\u003e data.filter(hood=\"Manhattan\", guests=2).sort(price=1).head()\n.\n        id      hood zipcode guests    sqft price\n     int64    string  string  int64 float64 int64\n  ──────── ───────── ─────── ────── ─────── ─────\n0 42279170 Manhattan   10013      2     nan     0\n1 42384530 Manhattan   10036      2     nan     0\n2 18835820 Manhattan   10021      2     nan    10\n3 20171179 Manhattan   10027      2     nan    10\n4 14858544 Manhattan              2     nan    15\n5 31397084 Manhattan   10002      2     nan    19\n6 22289683 Manhattan   10031      2     nan    20\n7  7760204 Manhattan   10040      2     nan    22\n8 43292527 Manhattan   10033      2     nan    22\n9 43268040 Manhattan   10033      2     nan    23\n.\n```\n\n## Documentation\n\nhttps://dataiter.readthedocs.io/\n\nIf you're familiar with either dplyr (R) or Pandas (Python), the\ncomparison table in the documentation will give you a quick overview of\nthe differences and similarities in common operations.\n\nhttps://dataiter.readthedocs.io/en/stable/comparison.html\n\n## Development\n\nTo install a virtualenv for development, use\n\n    make venv\n\nor, for a specific Python version\n\n    make PYTHON=python3.X venv\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotsaloma%2Fdataiter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fotsaloma%2Fdataiter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fotsaloma%2Fdataiter/lists"}