https://github.com/johnnymo87/word_count
Word count scripts for hadoop practice
- Host: GitHub
- URL: https://github.com/johnnymo87/word_count
- Owner: johnnymo87
- Created: 2013-10-20T23:15:17.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2013-10-21T00:32:56.000Z (about 11 years ago)
- Last Synced: 2024-10-18T20:17:09.087Z (20 days ago)
- Language: Python
- Size: 102 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Hadoop setup guides:
* http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
* http://www.drdobbs.com/database/pydoop-writing-hadoop-programs-in-python/240156473?pgno=1

Word count map reduce scripts:
* http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
* https://github.com/glennklockwood/hpchadoop/tree/master/wordcount.py

Equivalent series of pipes in terminal:
```
cat /home/hduser/data/ulysses | /home/hduser/scripts/mapper.py | sort -k1,1 \
  | /home/hduser/scripts/reducer.py | sort -nrk2 | head -10
```
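
The same pipeline can be sketched in plain Python to show what each stage contributes. This is a rough illustration, not the repository's actual `mapper.py` or `reducer.py` (their contents are not shown here). It assumes the usual Hadoop streaming convention where the mapper emits one `(word, 1)` pair per token and the reducer sums counts over input already sorted by word. The function names `mapper`, `reducer`, and `top_words` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reducer(sorted_pairs):
    """Sum the counts for each run of identical words.

    Like a streaming reducer, this relies on the input being sorted by word.
    """
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def top_words(lines, n=10):
    """Mirror the shell pipeline: map, sort by key, reduce, sort by count, keep n."""
    pairs = sorted(mapper(lines), key=itemgetter(0))  # sort -k1,1
    counts = reducer(pairs)
    ranked = sorted(counts, key=itemgetter(1), reverse=True)  # sort -nrk2
    return ranked[:n]  # head -10


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    for word, count in top_words(sample, n=2):
        print(f"{word}\t{count}")
```

The first sort is what lets the reducer work in a single pass: all pairs for a given word arrive next to each other, so only one running total is needed at a time.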