https://github.com/poga/hyperspark

Decentralized data processing platform



Host: GitHub
URL: https://github.com/poga/hyperspark
Owner: poga
Created: 2016-10-17T15:32:27.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-03-01T22:01:32.000Z (about 8 years ago)
Last Synced: 2025-03-18T14:05:57.365Z (about 2 months ago)
Language: JavaScript
Homepage:
Size: 43 KB
Stars: 6
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

awesome-dat - hyperspark - decentralized data processing platform for dat archives, inspired by `spark` (Outdated / Other Related Dat Project Modules)

README

# Hyperspark

Hyperspark is a decentralized data processing tool for [Dat](http://dat-data.com), inspired by [Spark](https://spark.apache.org/).

Basically, it's just a fancy wrapper around [Dat Archive](https://datproject.org).

**This is a work in progress. Ideas and suggestions are welcome.**

### Goal

* Reuse intermediate data.

* Minimize bandwidth usage.

* Share computation power.

## How to use

#### Data Owner

It's simple! Just share your data with dat: `dat .`

#### Data Scientist

Define your ideas with [transforms and actions](https://github.com/poga/dat-transform) without worrying about fetching and storing data.

#### Computation Provider

Run transformations defined by researchers. Cache and share intermediate data so everyone can re-use the knowledge without having their own computation cluster.
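The idea of caching intermediate data can be illustrated with a small in-memory sketch. This is not how hyperspark stores or shares results (the README does not describe that); `cacheKey`, `runCached`, and the `Map`-based cache are hypothetical names used only to show the principle that a result keyed by its input and its transform needs to be computed only once.

```js
// Sketch of reusing intermediate results.
// All names here (cacheKey, runCached, the in-memory Map) are hypothetical.
const crypto = require('crypto')

const cache = new Map()

// Derive a cache key from the input's identifier and the transform's source text.
function cacheKey (inputId, transform) {
  return crypto.createHash('sha256')
    .update(inputId)
    .update('\0')
    .update(transform.toString())
    .digest('hex')
}

// Run transform(data) once per (inputId, transform) pair;
// later calls with the same pair reuse the stored result.
function runCached (inputId, data, transform) {
  const key = cacheKey(inputId, transform)
  if (!cache.has(key)) cache.set(key, transform(data))
  return cache.get(key)
}

let calls = 0
const countWords = text => {
  calls++
  return text.split(/\s+/).filter(Boolean).length
}

const first = runCached('archive-a', 'foo bar bar baz', countWords)
const second = runCached('archive-a', 'foo bar bar baz', countWords)
console.log(first, second, calls) // 4 4 1
```

In a decentralized setting the cache would presumably live in a shared archive rather than in process memory, but the keying idea stays the same.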

---

## Synopsis

Define an RDD on a dat archive with [dat-transform](https://github.com/poga/dat-transform).

Word counting:

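```js
const hs = require('hyperspark')

// NOTE: `kv` (a key-value pair helper) is not defined in this snippet;
// it is presumably provided by dat-transform.
var rdd = hs()

// define transforms
var result = rdd
  .splitBy(/[\n\s]/)        // split text on newlines and whitespace
  .filter(x => x !== '')    // drop empty strings
  .map(word => kv(word, 1)) // emit a (word, 1) pair per word

// actual run(action)
result.reduceByKey((x, y) => x + y) // sum the counts for each word
  .toArray(res => {
    console.log(res) // [{bar: 2, baz: 1, foo: 1}]
  })
```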
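To make the semantics of the pipeline above concrete, here is a minimal plain-JavaScript sketch that performs the same steps on an in-memory string, without Dat or hyperspark. The helpers `kv`, `splitBy`, `reduceByKey`, and `wordCount` are hypothetical local functions written for this illustration; they are not the dat-transform API, and the result is returned as a plain object rather than passed to a callback.

```js
// Plain-JavaScript sketch of the word-count steps; no Dat or hyperspark involved.
// kv, splitBy, reduceByKey and wordCount are hypothetical local helpers.

// Build a key-value pair.
const kv = (key, value) => ({ key, value })

// Split a chunk of text on a separator pattern.
const splitBy = (text, sep) => text.split(sep)

// Combine the values of all pairs that share a key using the reducer fn.
function reduceByKey (pairs, fn) {
  const acc = new Map()
  for (const { key, value } of pairs) {
    acc.set(key, acc.has(key) ? fn(acc.get(key), value) : value)
  }
  return Object.fromEntries(acc)
}

function wordCount (text) {
  const pairs = splitBy(text, /[\n\s]/)
    .filter(x => x !== '')
    .map(word => kv(word, 1))
  return reduceByKey(pairs, (x, y) => x + y)
}

const counts = wordCount('foo bar\nbar baz')
console.log(counts) // { foo: 1, bar: 2, baz: 1 }
```

The `filter` step matters because splitting on single whitespace characters yields empty strings wherever two separators are adjacent.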

## Related Modules

* RDD-style data transformation with js. [dat-transform](https://github.com/poga/dat-transform)

* Analyze data inside a dat archive with an RDD-style API. [dat-ipynb](https://github.com/poga/dat-ipynb-demo), using [nel](https://github.com/poga/nel)

* Convert IPython notebooks to Markdown. [ipynb2md](https://github.com/poga/ipynb2md)

* Attach files to Markdown with dat. [markdown-attachment-p2p](https://github.com/poga/markdown-attachment-p2p)
