https://github.com/dgryski/dmrgo
Go library for writing standalone Map/Reduce jobs or for use with Hadoop's streaming protocol
- Host: GitHub
- URL: https://github.com/dgryski/dmrgo
- Owner: dgryski
- Created: 2011-09-11T11:15:52.000Z (about 13 years ago)
- Default Branch: master
- Last Pushed: 2014-03-06T20:54:01.000Z (over 10 years ago)
- Last Synced: 2024-10-14T08:13:42.856Z (29 days ago)
- Language: Go
- Homepage:
- Size: 172 KB
- Stars: 104
- Watchers: 16
- Forks: 8
- Open Issues: 0
Metadata Files:
- Readme: README
README
dmrgo is a Go library for writing map/reduce jobs.

It can be used with Hadoop's streaming protocol, but also includes a standalone
map/reduce implementation (including partitioner) for 'small' jobs (~5G-10G).

It is partially based on ideas from Yelp's MrJob package for Python, but since
Go is statically typed I've tried to make the API match more closely with
Hadoop's Java API.

The traditional "word count" example is in the examples directory.
This code is licensed under the GPLv3, or at your option any later version.
Further reading:

MrJob:
  http://packages.python.org/mrjob/
  https://github.com/Yelp/mrjob

Hadoop map/reduce tutorial:
  http://hadoop.apache.org/common/docs/current/mapred_tutorial.html

Hadoop streaming protocol:
  http://hadoop.apache.org/common/docs/current/streaming.html