https://github.com/rickymiura/slack-posts-eda

Performing EDA on a large dataset of Slack posts using Apache Spark and AWS to efficiently uncover trends and insights at scale.

big-data distributed-computing spark

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/rickymiura/slack-posts-eda
Owner: RickyMiura
Created: 2025-01-26T08:18:13.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-26T08:25:54.000Z (over 1 year ago)
Last Synced: 2025-03-23T00:47:31.804Z (about 1 year ago)
Topics: big-data, distributed-computing, spark
Language: Jupyter Notebook
Homepage:
Size: 447 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Overview

This project performs exploratory data analysis (EDA) on a large dataset of Slack posts using **Apache Spark** and **AWS**. The primary objective was to process and analyze the dataset efficiently, uncovering trends, patterns, and insights from Slack messages at scale.

## Key Highlights
- **Big Data Processing**: Leveraged Spark for distributed data processing, enabling efficient handling of large Slack datasets.
- **Cloud Integration**: Utilized AWS services for data storage, processing, and scaling the analysis infrastructure.
- **Insights and Trends**: Explored key metrics such as message frequency, user activity patterns, and common topics across Slack posts.

This project demonstrates how combining **big data tools** like Spark with **cloud computing** makes it practical to analyze large datasets and generate actionable insights.
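
To make the message-frequency and user-activity metrics above more concrete, here is a minimal plain-Python sketch of the "count per partition, then merge" pattern that distributed engines such as Spark apply when aggregating. It is not code from this repository: the record fields (`user`, `ts`, `text`), the sample posts, and all function names are hypothetical, since the README does not describe the dataset schema.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample records; the real dataset's schema is not given in the README.
posts = [
    {"user": "u1", "ts": 1700000000, "text": "deploy is done"},
    {"user": "u2", "ts": 1700000060, "text": "thanks!"},
    {"user": "u1", "ts": 1700003600, "text": "rolling back"},
    {"user": "u3", "ts": 1700007200, "text": "what happened?"},
    {"user": "u1", "ts": 1700007260, "text": "fixed now"},
]


def split_into_partitions(records, n):
    """Round-robin records into n chunks, loosely mimicking data partitioning."""
    return [records[i::n] for i in range(n)]


def count_partition(partition):
    """Per-partition step: count posts per user and per UTC hour of day."""
    per_user, per_hour = Counter(), Counter()
    for post in partition:
        per_user[post["user"]] += 1
        hour = datetime.fromtimestamp(post["ts"], tz=timezone.utc).hour
        per_hour[hour] += 1
    return per_user, per_hour


def merge_counts(partials):
    """Merge step: add up the partial counters from every partition."""
    total_user, total_hour = Counter(), Counter()
    for per_user, per_hour in partials:
        total_user.update(per_user)
        total_hour.update(per_hour)
    return total_user, total_hour


partitions = split_into_partitions(posts, 2)
posts_per_user, posts_per_hour = merge_counts(count_partition(p) for p in partitions)

print(posts_per_user.most_common(3))
print(sorted(posts_per_hour.items()))
```

Because counting is associative, the merged totals do not depend on how the records are partitioned, which is what lets this kind of aggregation scale out across workers. In PySpark the same per-user count would typically be written as a DataFrame aggregation along the lines of `df.groupBy("user").count()`, assuming a `user` column exists.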

# Contributors
1. Ricky Miura
2. Gopi Maguluri
3. Nihal Karim
4. Pooja Baralu Umesh
