https://github.com/malisha4065/hadoopproject
Map reducing task with apache hadoop.
apache apache-hadoop hadoop-yarn map-reduce
- Host: GitHub
- URL: https://github.com/malisha4065/hadoopproject
- Owner: Malisha4065
- Created: 2025-05-25T18:35:42.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-05T05:04:23.000Z (about 1 year ago)
- Last Synced: 2025-06-05T07:15:46.156Z (about 1 year ago)
- Topics: apache, apache-hadoop, hadoop-yarn, map-reduce
- Language: Java
- Homepage:
- Size: 17.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Analysing Top IP Addresses and Error Codes Using Apache Hadoop
The top IP addresses and error codes were extracted from the NASA web server logs of July 1995 ([dataset link](https://ita.ee.lbl.gov/html/contrib/NASA-HTTP.html)) using Apache Hadoop. The MapReduce logic is implemented in Java.
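To illustrate the idea, here is a minimal sketch of the two counting passes in plain Java, with no Hadoop dependency. It is not the code from this repository: the class name `LogCountSketch`, the regular expression, the sample log lines, and the choice to treat 4xx and 5xx status codes as "error codes" are all assumptions made for this example. The "map" step emits one key per log line and the "reduce" step sums the occurrences of each key, which is the same shape a Hadoop `Mapper`/`Reducer` pair would take.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LogCountSketch {
    // Assumed line layout: host, ..., "request", 3-digit status, bytes (or "-").
    private static final Pattern LINE = Pattern.compile("^(\\S+) .*\" (\\d{3}) (\\S+)$");

    /** "Map" step for hosts: emit the client host, or nothing for a malformed line. */
    public static Optional<String> host(String line) {
        Matcher m = LINE.matcher(line);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /** "Map" step for errors: emit the status code only if it is 4xx or 5xx (an assumption). */
    public static Optional<String> errorCode(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        String status = m.group(2);
        return status.charAt(0) >= '4' ? Optional.of(status) : Optional.empty();
    }

    /** "Reduce" step: add 1 to the running total of every emitted key. */
    public static Map<String, Integer> countBy(List<String> lines,
                                               Function<String, Optional<String>> mapper) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            mapper.apply(line).ifPresent(key -> counts.merge(key, 1, Integer::sum));
        }
        return counts;
    }

    /** Returns the n keys with the highest counts, highest first. */
    public static List<String> top(Map<String, Integer> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    /** Hypothetical sample lines, made up for this example. */
    public static List<String> sampleLines() {
        return Arrays.asList(
            "192.0.2.10 - - [01/Jul/1995:00:00:01 -0400] \"GET /index.html HTTP/1.0\" 200 6245",
            "192.0.2.10 - - [01/Jul/1995:00:00:06 -0400] \"GET /missing.html HTTP/1.0\" 404 -",
            "host-a.example.com - - [01/Jul/1995:00:00:09 -0400] \"GET /logo.gif HTTP/1.0\" 304 0",
            "192.0.2.10 - - [01/Jul/1995:00:00:11 -0400] \"GET /other.html HTTP/1.0\" 404 -",
            "host-a.example.com - - [01/Jul/1995:00:00:12 -0400] \"GET /cgi-bin/x HTTP/1.0\" 500 -");
    }

    public static void main(String[] args) {
        List<String> lines = sampleLines();
        Map<String, Integer> hosts = countBy(lines, LogCountSketch::host);
        Map<String, Integer> errors = countBy(lines, LogCountSketch::errorCode);
        System.out.println(hosts);          // {192.0.2.10=3, host-a.example.com=2}
        System.out.println(errors);         // {404=2, 500=1}
        System.out.println(top(hosts, 1));  // [192.0.2.10]
    }
}
```

In a real Hadoop job the summing would be distributed across reducers and the top-N selection would typically run as a second pass over the reducer output; the sketch keeps everything in memory only to show the logic.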
## Custom Hadoop cluster with YARN resource management, built on the original apache/hadoop:3.4.1 Docker image
Feel free to reuse the configuration. Note that the bash/batch scripts in this repository are written specifically for the MapReduce task described above.
### Linux
```bash
./initial.sh
```
### Windows
Make sure to convert `compile.sh` and `run-jobs.sh` inside the `scripts` folder to LF line endings before running the script.
```bat
.\initial.bat
```
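If no line-ending tool is at hand on Windows, the LF conversion mentioned above can be done with a few lines of Java. This is only a sketch and is not part of the repository: the class name `CrlfToLf` is hypothetical, and it assumes the scripts are UTF-8 text and live at `scripts\compile.sh` and `scripts\run-jobs.sh`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper. With Java 11 or later it can be run directly as:
//   java CrlfToLf.java scripts\compile.sh scripts\run-jobs.sh
public class CrlfToLf {
    /** Replaces every Windows line ending (CRLF) with a Unix one (LF). */
    public static String toLf(String text) {
        return text.replace("\r\n", "\n");
    }

    /** Rewrites the file in place with LF line endings (assumes UTF-8 content). */
    public static void convert(Path file) throws IOException {
        String original = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        Files.write(file, toLf(original).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Each command-line argument is treated as a path to convert.
        for (String arg : args) {
            convert(Paths.get(arg));
        }
    }
}
```

Most editors can also switch a file between CRLF and LF, which may be simpler for two files.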