https://github.com/sepppenner/mapreducewithhadoop
MapReduce implementation for a word count with Apache Hadoop.
- Host: GitHub
- URL: https://github.com/sepppenner/mapreducewithhadoop
- Owner: SeppPenner
- License: MIT
- Created: 2017-04-22T11:54:24.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2020-06-04T15:18:04.000Z (about 6 years ago)
- Last Synced: 2025-02-24T03:30:52.857Z (over 1 year ago)
- Language: Java
- Homepage:
- Size: 17.6 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: Changelog.md
- License: License.txt
MapReduce
=========
MapReduce implementation for a word count with Apache Hadoop.
[Build status](https://ci.appveyor.com/project/SeppPenner/mapreducewithhadoop)
[Issues](https://github.com/SeppPenner/MapReduceWithHadoop/issues)
[Forks](https://github.com/SeppPenner/MapReduceWithHadoop/network)
[Stars](https://github.com/SeppPenner/MapReduceWithHadoop/stargazers)
[License](https://raw.githubusercontent.com/SeppPenner/MapReduceWithHadoop/master/License.txt)
## First step:
Download or create the input file whose words you want to count. As an example, I used: https://www.dropbox.com/s/6yg2xtg10uri3qx/movies.list?dl=0
## Basic usage:
```bash
# Point Hadoop at the JDK so it can find the Java compiler (tools.jar)
export JAVA_HOME=/usr/java/default
export PATH=$JAVA_HOME/bin:$PATH
export HADOOP_CLASSPATH=$JAVA_HOME/lib/tools.jar

# Compile WordCount.java and package the classes into a jar
/usr/bin/hadoop com.sun.tools.javac.Main WordCount.java
jar cf mwc.jar WordCount*.class

# Create the HDFS input directory and upload the input file
hadoop fs -mkdir -p /user/YourFolder/InputWordCount
hadoop fs -copyFromLocal /home/YourFolder/WordCount/movies.list /user/YourFolder/InputWordCount/movies.list
hdfs dfs -ls /user/YourFolder/InputWordCount/

# Run the job (the output directory must not exist yet)
hadoop jar mwc.jar WordCount /user/YourFolder/InputWordCount/ /user/YourFolder/output

# Copy the results to the local file system and inspect them
hadoop fs -get /user/YourFolder/output /home/YourFolder/WordCount/
hdfs dfs -ls /user/YourFolder/output
hdfs dfs -cat /user/YourFolder/output/part-r-00000
```
```
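The commands above run the word count as a Hadoop job. To illustrate what such a job computes, here is a minimal, Hadoop-free sketch of the map and reduce steps in plain Java. It is not the `WordCount.java` from this repository: the class name `WordCountSketch` and its methods are hypothetical, and it assumes words are split on whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

/** Local illustration of the word-count map and reduce steps (no Hadoop required). */
public class WordCountSketch {

    /** Map step: emit one (word, 1) pair per whitespace-separated token. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(Map.entry(tokens.nextToken(), 1));
        }
        return pairs;
    }

    /** Shuffle and reduce step: group the pairs by word and sum the counts. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        // TreeMap keeps the words sorted by key.
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    /** Runs the map step on every line, then reduces all emitted pairs. */
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.addAll(map(line));
        }
        return reduce(pairs);
    }

    public static void main(String[] args) {
        List<String> lines = List.of("the matrix", "the matrix reloaded");
        // Print one "word<TAB>count" line per distinct word.
        countWords(lines).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

Running `main` prints `matrix 2`, `reloaded 1` and `the 2`, one tab-separated pair per line. A real Hadoop job distributes the map and reduce steps across the cluster instead of running them in one process.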
Change history
--------------
See the [Changelog](https://github.com/SeppPenner/MapReduceWithHadoop/blob/master/Changelog.md).