Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by longNguyen010203
A curated list of projects in awesome lists by longNguyen010203.
https://github.com/longnguyen010203/zillow-home-value-prediction
The Zillow Home Value Prediction project applies linear regression models to Kaggle datasets to forecast house prices. It runs Apache Spark (PySpark) in a Docker setup for data preprocessing, exploration, analysis, visualization, and model building, using distributed computing for parallel computation.
analysis apache-spark distributed-computing docker docker-compose feature-engineering jupyter-notebook jupyterlab linear-regression machine-learning models parallel-computing prediction-model preprocessing pyspark visualization
Last synced: 24 Sep 2024
https://github.com/longnguyen010203/spark-kafka-self-learning
A third-year student is self-studying Spark and Kafka as part of their data engineering journey, with the goal of securing an internship or fresher job in 2024.
apache-kafka apache-spark cluster docker docker-compose zookeeper
Last synced: 24 Sep 2024
https://github.com/longnguyen010203/spark-processing-aws
Set up and build a big data processing pipeline with Apache Spark and AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow integration to automate workflows.
apache-airflow apache-spark aws aws-ec2 aws-s3 aws-services cloud-computing data-pipeline emr-cluster iam pyspark redshift spark-cluster spark-master spark-worker terraform
Last synced: 24 Sep 2024