Projects in Awesome Lists by subhashpolisetti
A curated list of projects in awesome lists by subhashpolisetti.
https://github.com/subhashpolisetti/autogluon_ml_end-to-end_implementations
This repository features practical AutoGluon implementations across tasks such as Kaggle competition solutions, tabular classification and regression, multimodal analysis, feature engineering, and multi-label classification. Each notebook runs on real-world datasets (text, image, time series) in Google Colab and includes complete outputs for reproducibility.
Last synced: 12 Apr 2025
https://github.com/subhashpolisetti/autogluon_ml_end-to-end_implementations_part-2
Last synced: 12 Apr 2025
https://github.com/subhashpolisetti/mqtt-reservoir-data-pipeline
California Reservoir Water Monitoring System – A data aggregation and monitoring solution using MQTT to collect and summarize daily water mark levels across California's reservoirs. This system transforms CSV data into JSON, generates daily reports, and provides insights into water availability.
Last synced: 28 Feb 2025
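The CSV-to-JSON summarization step described above could look roughly like the sketch below. The column names (`date`, `reservoir`, `level`), the function name, and the sample rows are hypothetical and chosen only for illustration; the repository's actual schema and its MQTT publishing step are not shown here.

```python
import csv
import io
import json
from collections import defaultdict


def summarize_daily_levels(csv_text):
    """Group readings by date and return a JSON string of per-day summaries.

    Assumes hypothetical columns: date, reservoir, level.
    """
    by_date = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        by_date[row["date"]].append(float(row["level"]))

    # One summary object per day, ordered by date string.
    report = [
        {
            "date": date,
            "reservoirs_reporting": len(levels),
            "mean_level": round(sum(levels) / len(levels), 2),
            "max_level": max(levels),
        }
        for date, levels in sorted(by_date.items())
    ]
    return json.dumps(report, indent=2)


# Made-up sample data, not taken from the repository.
SAMPLE_CSV = """date,reservoir,level
2025-01-01,A,1010.5
2025-01-01,B,850.0
2025-01-02,A,1011.0
"""

if __name__ == "__main__":
    print(summarize_daily_levels(SAMPLE_CSV))
```

In a pipeline like the one described, the resulting JSON string would be the payload handed to an MQTT client for publishing.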
https://github.com/subhashpolisetti/hoppingwindow-crudeoil
This project analyzes West Texas Intermediate (WTI) Crude Oil Prices using a hopping window strategy to calculate weekly mean and maximum prices, offering detailed insights into price fluctuations.
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/dimensionality_reduction
This repository demonstrates various dimensionality reduction techniques on image and tabular datasets. It explores and visualizes methods like PCA, t-SNE, UMAP, ISOMAP, and Autoencoders, comparing their performance and effectiveness with interactive visualizations.
databricks dimensionality-reduction pandas tabular-data
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/timeseriesdataaggregation
A Python project demonstrating time-series data aggregation using a hopping window function to calculate mean temperature readings from simulated sensor data. Ideal for streaming applications that need low-latency, real-time insights over overlapping time intervals.
Last synced: 26 Mar 2025
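A hopping window has a fixed size and advances by a fixed hop; when the hop is smaller than the size, consecutive windows overlap. The sketch below illustrates the idea in plain Python. The function name, the window and hop values, and the sample readings are assumptions for illustration and are not taken from the repository.

```python
def hopping_window_means(readings, window_size, hop_size):
    """Return (window_start, mean) for each non-empty hopping window.

    readings: list of (timestamp_seconds, value) pairs.
    Each window covers [start, start + window_size) and advances by hop_size,
    so windows overlap whenever hop_size < window_size.
    """
    if not readings:
        return []
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")

    readings = sorted(readings)
    first_ts, last_ts = readings[0][0], readings[-1][0]
    results = []
    start = first_ts
    while start <= last_ts:
        in_window = [v for ts, v in readings if start <= ts < start + window_size]
        if in_window:
            results.append((start, sum(in_window) / len(in_window)))
        start += hop_size
    return results


# Simulated temperature readings, one per second (made-up values).
SAMPLE = [(0, 20.0), (1, 22.0), (2, 24.0), (3, 26.0), (4, 28.0)]

if __name__ == "__main__":
    for start, mean in hopping_window_means(SAMPLE, window_size=4, hop_size=2):
        print(f"window starting at t={start}s: mean={mean:.1f}")
```

With a 4-second window and a 2-second hop, each reading can contribute to two windows. Swapping the mean for `max(in_window)` gives a per-window maximum in the same way.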
https://github.com/subhashpolisetti/apache-beam-and-eda-projects
Apache Beam and EDA Projects: Showcasing real-time data processing with Apache Beam, interactive visualizations with D3.js, and automated EDA with Sweetviz and PyCaret. Includes Jupyter notebooks and outputs for learning and exploration.
apachebeam d3js matplotlib pandas pycaret seaborn-plots sweetviz
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/eda-timeseries-tabular
This repository contains two machine learning projects: Air Quality Prediction, which predicts CO(GT) levels from environmental and pollutant data using AutoML, and NYC Taxi Fare Prediction, which predicts taxi fares from trip data using automated machine learning. Both projects showcase data analysis, preprocessing, and predictive modeling techniques.
automl eda tabular-data time-series
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/crispdm-semma-kdd-workflows
This repository demonstrates three data mining methodologies applied to various real-world datasets: CRISP-DM (Weather Analysis), KDD (Social Media Ads Analysis), and SEMMA (Spotify Recommendation System). Each project includes data exploration, preprocessing, modeling, and evaluation steps, along with comprehensive documentation, supporting files,
clustering-algorithm crisp-dm kdd latex-template machine-learning numpy pandas random-forest research-paper semma
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/python-mqtt-message-stream
This project demonstrates asynchronous message streaming using MQTT for Internet of Things (IoT) applications. We’ll set up a Publisher and a Subscriber to handle 1,000,000 messages via the Mosquitto MQTT broker.
mqtt mqtt-broker mqtt-client python
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/realtime-websocket-client
This project demonstrates a WebSocket-based real-time communication system with a client-server setup and a basic user interface. The client sends 10,000 messages to the server, which responds to each message, showcasing efficient, long-lasting, full-duplex communication.
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/flask_real_chat_application
This is a real-time chat application built with Flask, Flask-SocketIO, and SQLite. Users can join chat rooms, send messages, and view message history in real time. The app features a simple, responsive UI that provides a smooth chat experience.
css flask flasksocketio html python sqlite
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/cmpe255-assignment-1
Data Mining Assignment
Last synced: 26 Mar 2025
https://github.com/subhashpolisetti/cat-dog-classification-cnn
CNN-Based Dogs vs. Cats Image Classifier: A Convolutional Neural Network (CNN) model for classifying images of cats and dogs, built using TensorFlow and Keras. This project includes data preprocessing, data augmentation, and transfer learning techniques to achieve high classification accuracy.
Last synced: 02 Apr 2025
https://github.com/subhashpolisetti/llm-threat-model
This project addresses security challenges in implementing a Large Language Model (LLM) system for the Food and Agriculture Organization (FAO). It focuses on securing their document retrieval and analysis platform, which processes sensitive food security and nutrition data.
Last synced: 02 Apr 2025
https://github.com/subhashpolisetti/rag-llm-vecdb-survey
This repository explores how Large Language Models (LLMs) and Vector Databases (VecDBs) work together, based on insights from the referenced survey paper. It focuses on showing how their combination can solve practical problems, such as improving how AI retrieves and uses information (retrieval-augmented generation) and building more efficient systems.
Last synced: 02 Apr 2025
https://github.com/subhashpolisetti/decision-tree-ensemble-algorithms
A Python implementation of ensemble learning algorithms from scratch, including Gradient Boosting Machine (GBM), Random Forest, AdaBoost, and Decision Trees. This repository also showcases XGBoost, CatBoost, and LightGBM for classification, regression, and ranking tasks, with visualizations and performance comparisons.
Last synced: 02 Apr 2025
https://github.com/subhashpolisetti/clustering-techniques-and-embeddings
This repository includes Colab notebooks demonstrating various clustering algorithms, ranging from from-scratch implementations to advanced deep learning models and embeddings. Each notebook features explanations, visualizations, and quality evaluation metrics for clustering performance.
anomaly-detection clustering-algorithm hierarchical-clustering kmeans-clustering multimodal time-series
Last synced: 02 Apr 2025
https://github.com/subhashpolisetti/network-traffic-visualizer
A Python-based tool for analyzing and visualizing network traffic from .pcap files using Wireshark's tshark. It extracts key metrics such as protocols, IP addresses, and packet sizes, and provides visualizations for exploring network activity.
Last synced: 26 Feb 2025