https://github.com/pompierninja/parallelism-test

beam bam boom
apache-beam dataflow


**Test: trying to improve parallelism when processing a 20-million-row .csv file with Dataflow**

[Dataset](https://grouplens.org/datasets/movielens/20m/)

[Why should we reshuffle?](https://stackoverflow.com/questions/54121642/apache-beam-dataflow-reshuffle)
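
A minimal sketch of where a reshuffle could sit in such a pipeline, not the code from this repository. It assumes the input is the dataset's `ratings.csv` with the header `userId,movieId,rating,timestamp`. The names `parse_rating` and `run`, and the per-movie mean, are hypothetical and chosen only for illustration. `beam.Reshuffle()` redistributes elements and prevents the steps around it from being fused into one stage. Whether that improves throughput depends on the source and the job, so it is worth measuring with and without it.

```python
def parse_rating(line):
    """Parse one 'userId,movieId,rating,timestamp' row into (movie_id, rating)."""
    _user_id, movie_id, rating, _timestamp = line.split(",")
    return int(movie_id), float(rating)


def run(input_path, output_prefix, argv=None):
    """Build and run the pipeline (hypothetical helper; argv holds Beam flags)."""
    # Imported here so parse_rating can be tried without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    with beam.Pipeline(options=PipelineOptions(argv)) as pipeline:
        (
            pipeline
            # Skip the CSV header row.
            | "Read" >> beam.io.ReadFromText(input_path, skip_header_lines=1)
            # Redistribute rows across workers and break fusion with the read.
            | "Reshuffle" >> beam.Reshuffle()
            | "Parse" >> beam.Map(parse_rating)
            # Illustrative aggregation only: mean rating per movie.
            | "MeanPerMovie" >> beam.CombinePerKey(beam.combiners.MeanCombineFn())
            | "Format" >> beam.Map(lambda kv: f"{kv[0]},{kv[1]:.3f}")
            | "Write" >> beam.io.WriteToText(output_prefix)
        )
```

Calling `run("ratings.csv", "out/mean_rating")` would use the local runner by default. Passing flags such as `--runner=DataflowRunner` together with the project, region, and temp location in `argv` would submit the job to Dataflow instead.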