https://github.com/maastaar/mapreduceimplementation
A simple MapReduce implementation in C based on Google's paper "MapReduce: Simplified Data Processing on Large Clusters"
- Host: GitHub
- URL: https://github.com/maastaar/mapreduceimplementation
- Owner: MaaSTaaR
- Created: 2016-10-01T17:12:19.000Z
- Default Branch: master
- Last Pushed: 2016-10-01T17:13:05.000Z
- Last Synced: 2024-12-27T08:12:06.410Z
- Topics: bigdata, mapreduce
- Language: C
- Size: 66.4 KB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Simple MapReduce Implementation
===============================
A simple MapReduce implementation in C based on Google's paper "[MapReduce: Simplified Data Processing on Large Clusters](http://jayurbain.com/msoe/cs4230/Readings/MapReduce%20-%20Simplified%20Data%20Processing%20on%20Large%20Clusters.pdf)" under the supervision of Prof. [Hussain Almohri](http://www.halmohri.com).
In this implementation, the Map and Reduce functions are simple TCP/IP servers: each one receives a line from its worker (map or reduce), processes it, and sends the result back to the worker. For now, only the search and identity functions are implemented. The path of the input data is defined in the file "workers/map/map.c".
Running the "make" command generates the binaries in the "bin" directory:
* client.o: A test program that asks MapReduce to process a request.
* map_function_search.o: Map function implementation that searches the files.
* reduce_function_identity.o: Reduce function implementation (the identity function).
* map_worker.o: Map worker. Must be run on Map nodes.
* reduce_worker.o: Reduce worker. Must be run on Reduce nodes.
* reader.o: The reader used by reduce workers to read map output.
* master.o: The master of the nodes.
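As a rough illustration of the per-line work a search-style Map function such as map_function_search.o might do, the sketch below forwards a line only when it contains a query string. The README does not say how the search is defined, so the substring match, the `search_line` name, and its parameters are all assumptions rather than the project's actual behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-line search: copy `line` into `out` when it contains `needle`.
   Returns the number of bytes copied (excluding the terminating NUL),
   or 0 when there is no match or the line does not fit in `out`. */
size_t search_line(const char *line, const char *needle, char *out, size_t out_cap)
{
    size_t len = strlen(line);

    if (strstr(line, needle) == NULL)
        return 0; /* no match: emit nothing for this line */
    if (len + 1 > out_cap)
        return 0; /* not enough room for the line plus its NUL */

    memcpy(out, line, len + 1);
    return len;
}
```

Calling `search_line("map reduce paper\n", "reduce", out, sizeof out)` copies the line to `out`, while a line without the query yields 0 so nothing is sent back for it.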
The cluster workers are defined in the file "master/workers_list.c".
License: GNU GPL.