Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dataship/python-dataship
Lightweight tools for reading, writing and storing data, locally and over the internet for python
column-store data-science machine-learning numpy pandas
- Host: GitHub
- URL: https://github.com/dataship/python-dataship
- Owner: dataship
- License: mit
- Created: 2017-12-29T16:50:56.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-09T19:25:42.000Z (over 5 years ago)
- Last Synced: 2024-10-09T13:13:31.495Z (4 months ago)
- Topics: column-store, data-science, machine-learning, numpy, pandas
- Language: Python
- Size: 16.6 KB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# dataship
Lightweight tools for reading, writing and storing data, locally and over the internet.
Allows easy interaction with browser and node based data visualization and analysis tools.
Built on numpy and works with pandas.

# install
`pip install dataship`

# example
Write files locally like this,
```python
import numpy as np
from dataship import beam

names = ['eeny', 'meeny', 'miney', 'moe']
counts = np.array([1, 2, 3, 4], dtype="int8")

columns = {
    "name" : names,
    "count" : counts
}

beam.write("./toeses", columns)
```

Read that into pandas like this,
```python
columns = beam.read("./toeses")
frame = beam.to_dataframe(columns) # Dataframe
```

The variable `frame` now contains a pandas DataFrame that looks like this:

name | count
-----|-------
eeny | 1
meeny | 2
miney | 3
moe | 4

and the directory `./toeses` contains these files:
```shell
index.json # special file describing columns (json)
name.json # data for name column (json)
count.i8 # data for count column (binary)
```

You can also serialize an existing pandas DataFrame like this,
```python
columns = beam.from_dataframe(frame)
beam.write("./toeses", columns)
```

Data files can be viewed from the command line with [arrayviewer](https://github.com/waylonflinn/arrayviewer).
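
To make the one-file-per-column layout from the example more concrete, here is a minimal standard-library sketch. It does not use dataship, and it rests on assumptions the README does not confirm: that `name.json` holds a plain JSON array of strings, and that `count.i8` holds raw signed 8-bit integers with no header. The helpers `write_columns` and `read_columns` are hypothetical names made up for this illustration, and `index.json` is left out because its format is not described here.

```python
import json
import os
import tempfile
from array import array


def write_columns(directory, names, counts):
    """Hypothetical helper: mimic the file names from the example above.

    Assumes a JSON array for strings and raw int8 bytes for counts;
    this is not dataship's actual implementation.
    """
    os.makedirs(directory, exist_ok=True)
    with open(os.path.join(directory, "name.json"), "w") as f:
        json.dump(names, f)
    with open(os.path.join(directory, "count.i8"), "wb") as f:
        array("b", counts).tofile(f)  # "b" = signed 8-bit integers


def read_columns(directory):
    """Hypothetical helper: read both column files back into Python lists."""
    with open(os.path.join(directory, "name.json")) as f:
        names = json.load(f)
    path = os.path.join(directory, "count.i8")
    counts = array("b")
    with open(path, "rb") as f:
        # One byte per value, so the file size equals the item count.
        counts.fromfile(f, os.path.getsize(path))
    return {"name": names, "count": counts.tolist()}


with tempfile.TemporaryDirectory() as tmp:
    directory = os.path.join(tmp, "toeses")
    write_columns(directory, ["eeny", "meeny", "miney", "moe"], [1, 2, 3, 4])
    columns = read_columns(directory)
    files = sorted(os.listdir(directory))

print(columns)
print(files)
```

Under the same raw-bytes assumption, `np.fromfile(path, dtype="int8")` would load the binary column straight into a numpy array.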