{"id":15546716,"url":"https://github.com/dataship/python-dataship","last_synced_at":"2025-04-23T18:09:41.655Z","repository":{"id":62566802,"uuid":"115739049","full_name":"dataship/python-dataship","owner":"dataship","description":"Lightweight tools for reading, writing and storing data, locally and over the internet for python","archived":false,"fork":false,"pushed_at":"2019-05-09T19:25:42.000Z","size":17,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-23T18:09:36.585Z","etag":null,"topics":["column-store","data-science","machine-learning","numpy","pandas"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dataship.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-29T16:50:56.000Z","updated_at":"2020-04-12T23:30:19.000Z","dependencies_parsed_at":"2022-11-03T16:30:23.244Z","dependency_job_id":null,"html_url":"https://github.com/dataship/python-dataship","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataship%2Fpython-dataship","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataship%2Fpython-dataship/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataship%2Fpython-dataship/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dataship%2Fpython-dataship/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dataship","download_url":"https:/
/codeload.github.com/dataship/python-dataship/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250487531,"owners_count":21438612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["column-store","data-science","machine-learning","numpy","pandas"],"created_at":"2024-10-02T13:03:48.424Z","updated_at":"2025-04-23T18:09:41.635Z","avatar_url":"https://github.com/dataship.png","language":"Python","readme":"# dataship\n\nLightweight tools for reading, writing and storing data, locally and over the internet.\n\nAllows easy interaction with browser- and Node-based data visualization and analysis tools.\nBuilt on numpy and works with pandas.\n\n# install\n`pip install dataship`\n\n# example\n\nWrite files locally like this:\n```python\nimport numpy as np\nfrom dataship import beam\n\nnames = ['eeny', 'meeny', 'miney', 'moe']\ncounts = np.array([1, 2, 3, 4], dtype=\"int8\")\n\ncolumns = {\n    \"name\" : names,\n    \"count\" : counts\n}\n\nbeam.write(\"./toeses\", columns)\n```\n\nRead the data back into pandas like this:\n```python\ncolumns = beam.read(\"./toeses\")\nframe = beam.to_dataframe(columns) # pandas DataFrame\n```\n\nThe variable `frame` now contains a pandas DataFrame that looks like this:\n\nname | count\n-----|-------\neeny | 1\nmeeny | 2\nminey | 3\nmoe | 4\n\n\nand the directory `./toeses` contains these files:\n\n```shell\nindex.json # special file describing columns (json)\nname.json # data for name column (json)\ncount.i8 # data for count column (binary)\n```\n\nYou can also serialize an existing pandas DataFrame like 
this:\n```python\ncolumns = beam.from_dataframe(frame)\nbeam.write(\"./toeses\", columns)\n```\n\nData files can be viewed from the command line with [arrayviewer](https://github.com/waylonflinn/arrayviewer).\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataship%2Fpython-dataship","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdataship%2Fpython-dataship","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdataship%2Fpython-dataship/lists"}