Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists by longNguyen010203

A curated list of projects in awesome lists by longNguyen010203.

https://github.com/longnguyen010203/zillow-home-value-prediction

🌈📊📈 The Zillow Home Value Prediction project employs linear regression models on Kaggle datasets to forecast house prices. 📉💰 Using Apache Spark (PySpark) within a Docker setup enables efficient data preprocessing, exploration, analysis, visualization, and model building, with distributed computing for parallel computation.

analysis apache-spark distributed-computing docker docker-compose feature-engineering jupyter-notebook jupyterlab linear-regression machine-learning models parallel-computing prediction-model preprocessing pyspark visualization

Last synced: 24 Sep 2024

https://github.com/longnguyen010203/spark-kafka-self-learning

📚🌊🎓 A third-year student is self-studying Spark and Kafka as part of their 👷 data engineering journey, with the goal of securing an 📬 internship or fresher job in 2024.

apache-kafka apache-spark cluster docker docker-compose zookeeper

Last synced: 24 Sep 2024

https://github.com/longnguyen010203/spark-processing-aws

👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and integrating Airflow to automate workflows. 🥊

apache-airflow apache-spark aws aws-ec2 aws-s3 aws-services cloud-computing data-pipeline emr-cluster iam pyspark redshift spark-cluster spark-master spark-worker terraform

Last synced: 24 Sep 2024