https://github.com/dataship/python-dataship

Lightweight tools for reading, writing and storing data, locally and over the internet for python
column-store data-science machine-learning numpy pandas

# dataship

Lightweight tools for reading, writing and storing data, locally and over the internet.

Allows easy interaction with browser- and Node-based data visualization and analysis tools.
Built on numpy and works with pandas.

# install
`pip install dataship`

# example

Write files locally like this:
```python
import numpy as np
from dataship import beam

names = ['eeny', 'meeny', 'miney', 'moe']
counts = np.array([1, 2, 3, 4], dtype="int8")

columns = {
    "name": names,
    "count": counts
}

beam.write("./toeses", columns)
```

Read that into pandas like this:
```python
from dataship import beam

columns = beam.read("./toeses")  # load the columns written above
frame = beam.to_dataframe(columns)  # pandas DataFrame
```

The variable `frame` now contains a pandas DataFrame that looks like this:

name | count
-----|-------
eeny | 1
meeny | 2
miney | 3
moe | 4

and the directory `./toeses` contains these files:

```shell
index.json # special file describing columns (json)
name.json # data for name column (json)
count.i8 # data for count column (binary)
```
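
The listing above suggests one file per column. As a rough sketch of what that layout implies, the snippet below builds the same two data files by hand and reads them back with only numpy and the standard library. It assumes that `count.i8` holds the raw int8 values with no header and that `name.json` is a plain JSON array; neither detail is confirmed here, and `index.json` is left out because its schema is not described. The temporary directory is a stand-in for `./toeses`.

```python
import json
import os
import tempfile

import numpy as np

# Hypothetical stand-in for "./toeses", built by hand so the sketch
# runs without dataship installed.
directory = tempfile.mkdtemp()
count_path = os.path.join(directory, "count.i8")
name_path = os.path.join(directory, "name.json")

# Assumption: a binary column is the array's raw bytes with no header.
np.array([1, 2, 3, 4], dtype="int8").tofile(count_path)

# Assumption: a JSON column is a plain JSON array of values.
with open(name_path, "w") as f:
    json.dump(["eeny", "meeny", "miney", "moe"], f)

# Read both columns back; the ".i8" extension is taken to mean int8.
counts = np.fromfile(count_path, dtype=np.int8)
with open(name_path) as f:
    names = json.load(f)

print(names, counts.tolist())
```

If the assumptions hold, any tool that can read raw typed arrays and JSON could consume these files without going through `beam`.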

You can also serialize an existing pandas DataFrame like this:
```python
columns = beam.from_dataframe(frame)
beam.write("./toeses", columns)
```
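
For intuition about the two conversions, here is a minimal sketch of moving between a dict of columns and a DataFrame using pandas alone. It is not dataship's implementation; it only illustrates the kind of mapping that `beam.to_dataframe` and `beam.from_dataframe` presumably perform, and the variable `round_tripped` is a name chosen for this example.

```python
import numpy as np
import pandas as pd

# The same columns as in the example above.
columns = {
    "name": ["eeny", "meeny", "miney", "moe"],
    "count": np.array([1, 2, 3, 4], dtype="int8"),
}

# Dict of columns -> DataFrame: each key becomes a column.
frame = pd.DataFrame(columns)

# DataFrame -> dict of columns: one numpy array per column.
# (Assumption: an illustrative equivalent, not the library's own code.)
round_tripped = {name: frame[name].to_numpy() for name in frame.columns}

print(frame)
```

The int8 dtype of the `count` column survives the trip through pandas, which matters if the dtype decides the binary file format on write.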

Data files can be viewed from the command line with [arrayviewer](https://github.com/waylonflinn/arrayviewer).