{"id":30749227,"url":"https://github.com/manula-fernando/tweetpulse-pro","last_synced_at":"2026-04-13T03:03:20.198Z","repository":{"id":310086421,"uuid":"1038692526","full_name":"Manula-Fernando/TweetPulse-Pro","owner":"Manula-Fernando","description":"TweetPulse Pro: Real-time Twitter sentiment analytics using Kafka, Spark (PySpark), MongoDB, Flask API, and a Django dashboard. Docker-ready.","archived":false,"fork":false,"pushed_at":"2025-08-15T17:34:05.000Z","size":0,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-15T17:41:55.362Z","etag":null,"topics":["analytics","django","docker","flask","kafka","mongodb","pyspark","real-time","sentiment","spark","twitter"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Manula-Fernando.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-15T16:53:19.000Z","updated_at":"2025-08-15T17:26:01.000Z","dependencies_parsed_at":"2025-08-15T17:42:04.027Z","dependency_job_id":null,"html_url":"https://github.com/Manula-Fernando/TweetPulse-Pro","commit_stats":null,"previous_names":["manula-fernando/tweetpulse-pro"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/Manula-Fernando/TweetPulse-Pro","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Manula-Fernando%2FTweetPulse-Pro","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Manula-Fernando%2FTweetPulse-Pro/ta
gs","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Manula-Fernando%2FTweetPulse-Pro/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Manula-Fernando%2FTweetPulse-Pro/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Manula-Fernando","download_url":"https://codeload.github.com/Manula-Fernando/TweetPulse-Pro/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Manula-Fernando%2FTweetPulse-Pro/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273561342,"owners_count":25127396,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-04T02:00:08.968Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","django","docker","flask","kafka","mongodb","pyspark","real-time","sentiment","spark","twitter"],"created_at":"2025-09-04T06:03:01.560Z","updated_at":"2026-04-13T03:03:20.103Z","avatar_url":"https://github.com/Manula-Fernando.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n\n\n# 🚀 TweetPulse Pro - Advanced Real-Time Twitter Sentiment 
Analytics\n\n[![Python](https://img.shields.io/badge/Python-3.10+-brightgreen.svg)](https://python.org)\n[![Docker](https://img.shields.io/badge/Docker-Ready-blue.svg)](https://docker.com)\n[![Performance](https://img.shields.io/badge/Performance-Optimized-00ff41.svg)]()\n[![AI](https://img.shields.io/badge/AI-PySpark%20ML-00cc33.svg)]()\n[![Release](https://img.shields.io/github/v/release/Manula-Fernando/TweetPulse-Pro)](https://github.com/Manula-Fernando/TweetPulse-Pro/releases)\n[![API Image](https://img.shields.io/badge/GHCR-tweetpulse--api-2ea043?logo=github)](https://ghcr.io/manula-fernando/tweetpulse-api)\n[![Producer Image](https://img.shields.io/badge/GHCR-tweetpulse--producer-2ea043?logo=github)](https://ghcr.io/manula-fernando/tweetpulse-producer)\n[![Consumer Image](https://img.shields.io/badge/GHCR-tweetpulse--consumer-2ea043?logo=github)](https://ghcr.io/manula-fernando/tweetpulse-consumer)\n[![Dashboard Image](https://img.shields.io/badge/GHCR-tweetpulse--dashboard-2ea043?logo=github)](https://ghcr.io/manula-fernando/tweetpulse-dashboard)\n\n## 🎯 Overview\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/Flow_DIagram.png\" alt=\"Project Architecture\" width=\"800\"/\u003e\n\u003c/p\u003e\n\n**TweetPulse Pro** is a production-ready analytics platform for real-time sentiment analysis of Twitter data, with a dashboard styled in a black \u0026 green theme. 
It leverages industry-standard technologies—**Apache Kafka**, **Apache Spark**, **MongoDB**, **Django**, **Flask REST API**, and **Docker**—to deliver scalable, reliable, and extensible analytics and visualization.\n\n**Author:** Manula Fernando  \n**Last Updated:** August 15, 2025\n\n---\n\n## Key Features\n\n- **Real-Time Data Pipeline**: Kafka ingests tweets, Spark Streaming processes and classifies sentiment, MongoDB stores results.\n- **RESTful Analytics API**: Flask-based API exposes analytics endpoints for dashboards and external integrations.\n- **Modern Dashboard**: Django + Bootstrap 5 dashboard with advanced, interactive visualizations (Chart.js, matplotlib, seaborn).\n- **Modular, Configurable Code**: All scripts use YAML config, logging, and CLI overrides for easy customization and deployment.\n- **Full Docker Orchestration**: One-command startup with Docker Compose for all services (Kafka, Zookeeper, MongoDB, Producer, Consumer, API, Dashboard).\n- **Production-Ready Practices**: Error handling, logging, environment variables, and clear separation of concerns.\n\n---\n\n\n\n## Repository Structure\n\n```text\nReal-Time-Twitter-Sentiment-Analysis/\n├── tweetpulse-dashboard/       # Django dashboard (Bootstrap, Chart.js, user features)\n│   ├── manage.py\n│   ├── dashboard/              # Django app code\n│   ├── templates/              # HTML templates\n│   └── logistic_regression_model.pkl/  # Model for dashboard\n├── tweetpulse-pipeline/        # Kafka producer \u0026 Spark consumer (YAML-configurable)\n│   ├── kafka_producer.py\n│   ├── kafka_spark_consumer.py\n│   ├── producer_config.yaml\n│   ├── consumer_config.yaml\n│   ├── analytics_api.py        # Flask REST API for analytics\n│   ├── Dockerfile.producer\n│   ├── Dockerfile.consumer\n│   ├── Dockerfile.api\n│   └── docker-compose.analytics.yml\n├── tweetpulse-ml-model/        # Jupyter notebooks, datasets, trained models\n│   ├── Big_Data.ipynb\n│   ├── twitter_training.csv\n│   ├── 
twitter_validation.csv\n│   └── logistic_regression_model.pkl/\n├── imgs/                       # Architecture and dashboard images\n│   ├── Flow_DIagram.png\n│   ├── Dashboard_1.png, Dashboard_2.png, Dashboard_3.png, Dashboard_4.png\n│   ├── Login_Page.png, Register_Page.png\n│   ├── MongoDB_Connection.png, Docker_Container.png\n│   └── Confusion_matrix.png, Text_Classifer.png\n├── requirements.txt            # Python dependencies\n├── zk-single-kafka-single.yml  # Kafka/Zookeeper Docker Compose\n└── README.md                   # Project documentation\n```\n\n---\n\n\n\n## Quick Start (Recommended: Dockerized Workflow)\n\n\u003e **Recommended:** Use Docker Compose for a reproducible, production-like environment. All dependencies and services are containerized.\n\n### 1. Prerequisites\n\n- [Docker Desktop](https://www.docker.com/products/docker-desktop/) (Windows/Mac/Linux)\n- [Git](https://git-scm.com/)\n\n\n### 2. Clone the Repository\n\n```bash\ngit clone \u003cyour-repo-url\u003e\ncd TweetPulse-Pro\n```\n\n\n### 3. Build and Start the Full Analytics Stack\n\n```powershell\ndocker compose -f tweetpulse-pipeline/docker-compose.analytics.yml up --build\n```\n\n\nThis will launch:\n- Zookeeper \u0026 Kafka (real-time ingestion)\n- MongoDB (storage)\n- Producer (tweets to Kafka)\n- Consumer (Spark streaming, sentiment analysis)\n- REST API (analytics endpoints)\n- Django Dashboard (visualization)\n\n\n### 4. Access the Platform\n\n- **Dashboard:** [http://localhost:8000](http://localhost:8000)\n- **REST API:** [http://localhost:5000](http://localhost:5000)\n- **MongoDB Compass:** Connect to `mongodb://localhost:27017`\n\n---\n\n\n## Full Setup \u0026 Manual Steps (For Advanced Users)\n\n### 1. Python Environment (Windows)\n\n- Install Python 3.10+ and create a virtual environment:\n  ```powershell\n  python -m venv .venv\n  .\\.venv\\Scripts\\Activate.ps1\n  pip install -r requirements.txt\n  ```\n\n\n### 2. 
Kafka \u0026 Zookeeper\n\n- Start with Docker Compose:\n  ```powershell\n  docker compose -f zk-single-kafka-single.yml up -d\n  ```\n\n\n### 3. MongoDB\n\n- Start MongoDB (Docker or local install). MongoDB Compass can be used as an optional GUI.\n\n\n### 4. Producer \u0026 Consumer\n\n- Edit `tweetpulse-pipeline/producer_config.yaml` and `tweetpulse-pipeline/consumer_config.yaml` as needed.\n- Run producer:\n  ```powershell\n  python tweetpulse-pipeline/kafka_producer.py --config tweetpulse-pipeline/producer_config.yaml\n  ```\n- Run consumer:\n  ```powershell\n  $env:JAVA_HOME = \"C:\\Program Files\\Java\\jdk-17\"  # adjust to your JDK install path\n  $env:PATH = \"$env:JAVA_HOME\\bin;$env:PATH\"\n  python tweetpulse-pipeline/kafka_spark_consumer.py --config tweetpulse-pipeline/consumer_config.yaml\n  ```\n\n\n### 5. Analytics API\n\n- Run Flask API:\n  ```powershell\n  python tweetpulse-pipeline/analytics_api.py\n  ```\n\n\n### 6. Django Dashboard\n\n- Collect static files:\n  ```powershell\n  python tweetpulse-dashboard/manage.py collectstatic --noinput\n  ```\n- Run server:\n  ```powershell\n  python tweetpulse-dashboard/manage.py runserver\n  ```\n\n### Notes for Windows\n\n- Ensure Docker Desktop is running and the WSL2 backend is enabled.\n- If running services outside Docker, install Java 17 (required by Spark) and set `JAVA_HOME`.\n- If Kafka runs inside Docker and the apps run on the host, use `localhost:9092`. If the apps also run inside Docker, they reach Kafka at `kafka:9092` through the Compose network.\n\n---\n\n## Advanced Usage \u0026 Manual Workflow\n\n\u003e For development, debugging, or custom deployments, you can run individual services/scripts manually. 
See each folder's README or script docstrings for details.\n\n---\n\n\n\n## Best Practices \u0026 Industry Standards\n\n- **Containerization:** All services are Dockerized for reproducibility and scalability.\n- **Configuration Management:** Use YAML config files and environment variables for all scripts/services.\n- **Logging \u0026 Monitoring:** All components use structured logging; integrate with ELK/Prometheus for production.\n- **Modular Codebase:** Producer, consumer, and API are fully modular and independently deployable.\n- **Security:** Never commit secrets; use `.env` files and Docker secrets for credentials.\n- **Testing:** Unit/integration tests recommended for all modules (see `/tests` if present).\n- **Documentation:** Keep this README and all configs up to date; use docstrings and comments in code.\n- **Naming Consistency:** Use the project name \"TweetPulse Pro\" in all documentation, scripts, and UI for clarity and branding.\n- **Author:** Manula Fernando (2025)\n\n---\n\n\n## Data \u0026 Model\n\n- **Dataset:** [Kaggle Twitter Entity Sentiment Analysis](https://www.kaggle.com/datasets/jp797498e/twitter-entity-sentiment-analysis)\n- **ML Model:** Trained with PySpark; see `tweetpulse-ml-model/` for notebooks and details.\n\n---\n\n\n## Screenshots\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/Dashboard_1.png\" alt=\"Dashboard Home\" width=\"800\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/Dashboard_2.png\" alt=\"Sentiment Distribution \u0026 Trends\" width=\"800\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/Docker_Container.png\" alt=\"Docker Containers\" width=\"800\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/MongoDB_Connection.png\" alt=\"MongoDB Connection\" width=\"800\"/\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"imgs/Confusion_matrix.png\" alt=\"Model Confusion Matrix\" 
width=\"600\"/\u003e\n\u003c/p\u003e\n\n\n## Author\n\n- **Manula Fernando**\n\nFor previous contributors and academic context, see project history.\n\n---\n\n\n\n## Support \u0026 Contribution\n\n- Open issues or pull requests for improvements, bugfixes, or questions.\n- For custom deployments, advanced analytics, or consulting, contact the author via GitHub.\n\n---\n\n\n**Happy coding! Explore, extend, and build on TweetPulse Pro for your own analytics needs.**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanula-fernando%2Ftweetpulse-pro","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmanula-fernando%2Ftweetpulse-pro","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmanula-fernando%2Ftweetpulse-pro/lists"}