https://github.com/hamid-rezaei/could-computing

# Cloud Computing Projects

This repository collects three cloud computing projects that apply distributed computing and machine learning with Apache Hadoop, Apache Spark, and supporting tools such as Elasticsearch, Docker, and Kubernetes.

## Projects:
1. [Music Recommender](https://github.com/Hamid-Rezaei/Music-Recommender)
2. [Search Movie](https://github.com/Hamid-Rezaei/Search-Movie)
3. [Hadoop and Spark](https://github.com/Hamid-Rezaei/Hadoop-Spark-Project)

### 1. Music Recommender
The Music Recommender project is designed to provide personalized music recommendations using collaborative filtering algorithms. It uses cloud computing resources to handle large datasets and train machine learning models efficiently.

- **Key Features**:
  - Collaborative filtering algorithm for music recommendations.
  - Scalable architecture using cloud resources.
  - Efficient handling of user data for personalized suggestions.

- **Technologies Used**:
  - Python
  - Apache Spark
  - AWS EC2 for cloud resources
  - Jupyter Notebooks for development and visualization
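
To make the collaborative filtering idea concrete, here is a minimal sketch in plain Python. It is not the project's code: the project runs on Apache Spark, while this example uses a simple user-based approach with cosine similarity on a tiny in-memory dataset. The `ratings` data and the `cosine_similarity` and `recommend` helpers are hypothetical names introduced only for illustration.

```python
from math import sqrt

# Hypothetical play ratings (user -> {track: rating}); not data from the project.
ratings = {
    "alice": {"track_a": 5, "track_b": 3, "track_c": 4},
    "bob": {"track_a": 4, "track_b": 1, "track_d": 5},
    "carol": {"track_b": 4, "track_c": 5, "track_d": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[t] * v[t] for t in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings, top_n=3):
    """Rank tracks the user has not rated by similarity-weighted ratings of other users."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap
        for track, rating in other_ratings.items():
            if track in ratings[user]:
                continue  # only suggest unseen tracks
            scores[track] = scores.get(track, 0.0) + sim * rating
            weights[track] = weights.get(track, 0.0) + sim
    # Normalize by total similarity so the score is a weighted average rating.
    ranked = sorted(((scores[t] / weights[t], t) for t in scores), reverse=True)
    return [track for _, track in ranked[:top_n]]


print(recommend("alice", ratings))
```

On a real dataset the same idea is usually expressed as a distributed job (for example, matrix factorization in Spark) rather than nested Python loops; the sketch only shows how similar users' ratings turn into suggestions.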

### 2. Search Movie
This project provides a cloud-based movie search engine. It leverages a distributed architecture to handle large datasets of movie information, allowing users to search for movies by title, genre, and year.

- **Key Features**:
  - Fast and efficient search functionality.
  - Cloud-based deployment for high scalability.
  - Support for search filters on title, genre, and year.

- **Technologies Used**:
  - Python
  - Elasticsearch for search indexing
  - Flask for the web interface
  - Docker for containerization
  - Kubernetes for orchestration
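
As an illustration of how the title, genre, and year filters could map onto Elasticsearch, the sketch below builds a `bool` query body in the Elasticsearch Query DSL. This is an assumption about the index layout, not the project's actual code: the field names `title`, `genre`, and `year` and the helper `build_movie_query` are hypothetical, and `genre` is assumed to be indexed as an exact-match keyword field.

```python
import json


def build_movie_query(title=None, genre=None, year=None):
    """Build an Elasticsearch query body from optional search filters.

    Field names (title, genre, year) are assumed for illustration.
    """
    must, filters = [], []
    if title:
        # Full-text match on the title, contributes to relevance scoring.
        must.append({"match": {"title": title}})
    if genre:
        # Exact-match filter, does not affect scoring.
        filters.append({"term": {"genre": genre}})
    if year is not None:
        filters.append({"term": {"year": year}})

    if not must and not filters:
        # No criteria given: return every document.
        return {"query": {"match_all": {}}}

    bool_query = {}
    if must:
        bool_query["must"] = must
    if filters:
        bool_query["filter"] = filters
    return {"query": {"bool": bool_query}}


print(json.dumps(build_movie_query(title="matrix", genre="sci-fi", year=1999), indent=2))
```

The resulting dictionary is the kind of body a search request would send to the index. Keeping genre and year in `filter` rather than `must` lets Elasticsearch cache them and skip scoring for those clauses.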

### 3. Hadoop and Spark Project
This project demonstrates the use of Apache Hadoop and Apache Spark for processing large-scale data. It covers tasks like data ingestion, transformation, and analysis using MapReduce and Spark’s distributed computing model.

- **Key Features**:
  - Distributed data processing using Hadoop's MapReduce.
  - In-memory data analysis using Apache Spark.
  - Integration with HDFS (Hadoop Distributed File System) for scalable storage.

- **Technologies Used**:
  - Apache Hadoop
  - Apache Spark
  - HDFS
  - Python/Scala for scripting
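
The MapReduce model mentioned above can be sketched in plain Python with the classic word count. This runs in a single process and only mimics the map, shuffle, and reduce phases that Hadoop distributes across a cluster; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are hypothetical and not taken from the project.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


print(word_count(["spark and hadoop", "Spark on HDFS"]))
```

In a real cluster the map and reduce functions run on many nodes in parallel and the shuffle moves data over the network; Spark expresses the same pattern with in-memory transformations instead of writing intermediate results to disk.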