https://github.com/yyassif/docker-spark-hadoop-for-recommendation
Using All Big Data Technologies In Order To Apply ALS Algorithm To Recommend Amazon Products
Last synced: 28 days ago
- Host: GitHub
- URL: https://github.com/yyassif/docker-spark-hadoop-for-recommendation
- Owner: yyassif
- Created: 2023-12-21T22:51:02.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-01T20:34:25.000Z (5 months ago)
- Last Synced: 2024-10-15T03:52:47.612Z (2 months ago)
- Language: Scala
- Size: 25.4 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
# How to use HDFS/Spark
## Installation
### Base Images
```sh
docker pull bde2020/hadoop-namenode:1.1.0-hadoop2.8-java8
docker pull bde2020/hadoop-datanode:1.1.0-hadoop2.8-java8
docker pull bde2020/spark-base:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-master:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-worker:2.1.0-hadoop2.8-hive-java8
docker pull bde2020/spark-notebook:2.1.0-hadoop2.8-hive
docker pull bde2020/hdfs-filebrowser:3.11
```

### Docker Compose File
To start an HDFS/Spark Workbench, run:
```sh
docker-compose up -d
```

## Interfaces
- Namenode: http://localhost:50070
- Datanode: http://localhost:50075
- Spark-master: http://localhost:8080
- Spark-notebook: http://localhost:9001
- Hue (HDFS Filebrowser): http://localhost:8088/home

# Recommendation Spark Application
The JAR file is packaged under the `jarfile` directory.
- Compiled with Scala 2.11.11
- For Spark 2.1.0

## How to run (using Makefile)
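Before running the targets, it can help to confirm that the web interfaces listed under Interfaces respond. The following is a minimal shell sketch, not part of the repository: the function name `check_interfaces`, the `CURL` override, and the 5-second timeout are assumptions made for illustration, and it assumes `curl` is installed on the host.

```sh
#!/bin/sh
# Web interfaces from the Interfaces section, as "name=url" pairs.
INTERFACES="
namenode=http://localhost:50070
datanode=http://localhost:50075
spark-master=http://localhost:8080
spark-notebook=http://localhost:9001
hue=http://localhost:8088/home
"

# check_interfaces (hypothetical helper): print "<name> up" or "<name> down".
# CURL can be overridden for a dry run; the 5-second timeout is arbitrary.
check_interfaces() {
  for entry in $INTERFACES; do
    name=${entry%%=*}   # text before the first "="
    url=${entry#*=}     # text after the first "="
    if "${CURL:-curl}" --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "$name up"
    else
      echo "$name down"
    fi
  done
}
```

Calling `check_interfaces` after `docker-compose up -d` prints one line per service. A `down` line may simply mean that the container is still starting.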
Run the `make` targets in the order below. This order appears to be mandatory for the job to complete properly.
### Start the Preprocessing
```
make prepare-raw-dataset
```

### Start the Data Ingestion into HDFS
```
make ingest-hdfs
```

### Create the Fat-JAR File
```
make jar
```

### Start the Prediction
```
make prediction
```

### Save the Results into a result directory
```
make prediction-result
```

### Clean the Output directory in HDFS
```
make clean-output
```

### Clean the Input directory in HDFS
```
make clean-input
```
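
### Run all steps in order (sketch)
Since the targets are meant to run in a fixed order, they can be chained so that a failure stops the sequence. The following is a minimal sketch under stated assumptions: the target names are the ones listed above, while `run_pipeline`, `clean_hdfs`, `PIPELINE_TARGETS`, and the `MAKE_CMD` override are hypothetical names that do not exist in the repository. It assumes the script is sourced from the directory that contains the Makefile.

```sh
#!/bin/sh
# Targets from this README, in the order it lists them.
PIPELINE_TARGETS="prepare-raw-dataset ingest-hdfs jar prediction prediction-result"

# run_pipeline (hypothetical helper): run each target, stop at the first failure.
# MAKE_CMD can be overridden, e.g. MAKE_CMD=echo for a dry run.
run_pipeline() {
  for target in $PIPELINE_TARGETS; do
    "${MAKE_CMD:-make}" "$target" || {
      echo "step failed: $target" >&2
      return 1
    }
  done
}

# clean_hdfs (hypothetical helper): remove the HDFS output, then the input.
clean_hdfs() {
  "${MAKE_CMD:-make}" clean-output && "${MAKE_CMD:-make}" clean-input
}
```

A dry run with `MAKE_CMD=echo run_pipeline` prints the target names without invoking `make`. The cleanup targets are kept in a separate function so that results are not removed by accident.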