https://github.com/goatcheesesaladwithpeanutoildressing/parallelism-test
beam bam boom
- Host: GitHub
- URL: https://github.com/goatcheesesaladwithpeanutoildressing/parallelism-test
- Owner: goatcheesesaladwithpeanutoildressing
- Created: 2019-09-13T15:04:30.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-09-13T15:09:16.000Z (almost 7 years ago)
- Last Synced: 2025-02-23T23:34:06.495Z (over 1 year ago)
- Topics: apache-beam, dataflow
- Language: Go
- Size: 3.91 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
**test: Trying to improve parallelism when processing a 20-million-row .csv file with Dataflow**
[Dataset](https://grouplens.org/datasets/movielens/20m/)
[Why should we reshuffle?](https://stackoverflow.com/questions/54121642/apache-beam-dataflow-reshuffle)
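
A rough illustration of the idea, in plain Go rather than Beam: when a single reader feeds rows straight into the next step, everything can end up on one worker, and redistributing the rows lets several workers share the load. The sketch below is only an analogy under stated assumptions. It uses goroutines and a channel instead of a Dataflow pipeline, `parseMovieID` and `countByMovie` are hypothetical helper names, and the sample rows merely follow the `userId,movieId,rating,timestamp` column layout assumed for the ratings file. In a real pipeline the redistribution would be done by a reshuffle step (the Beam Go SDK provides `beam.Reshuffle`), not by a channel.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// parseMovieID is a hypothetical per-row step: it splits a CSV line shaped
// like "userId,movieId,rating,timestamp" and returns the movieId field.
func parseMovieID(line string) string {
	fields := strings.Split(line, ",")
	if len(fields) < 2 {
		return ""
	}
	return fields[1]
}

// countByMovie hands the rows to `workers` goroutines through a shared
// channel and merges the per-worker counts afterwards. With workers == 1 it
// mimics a stage where a single worker reads and processes every row.
func countByMovie(rows []string, workers int) map[string]int {
	if workers < 1 {
		workers = 1
	}
	in := make(chan string)
	partials := make([]map[string]int, workers)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		partials[w] = make(map[string]int)
		wg.Add(1)
		// Each goroutine writes only to its own map, so no lock is needed.
		go func(local map[string]int) {
			defer wg.Done()
			for line := range in {
				if id := parseMovieID(line); id != "" {
					local[id]++
				}
			}
		}(partials[w])
	}

	// The "redistribution": whichever worker is free picks up the next row.
	for _, r := range rows {
		in <- r
	}
	close(in)
	wg.Wait()

	// Merge the per-worker results into one map.
	total := make(map[string]int)
	for _, p := range partials {
		for k, v := range p {
			total[k] += v
		}
	}
	return total
}

func main() {
	// Illustrative sample rows, not the real dataset.
	rows := []string{
		"1,2,3.5,1112486027",
		"1,29,3.5,1112484676",
		"2,2,4.0,974820598",
	}
	fmt.Println(countByMovie(rows, 1))
	fmt.Println(countByMovie(rows, 4))
}
```

Both calls should produce the same counts; only the number of workers sharing the rows differs. On a tiny input like this there is no speed-up to observe, and the per-row work has to be heavy enough to outweigh the cost of moving rows between workers before extra workers pay off.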