An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by subhashpolisetti

A curated list of projects in awesome lists by subhashpolisetti.

https://github.com/subhashpolisetti/autogluon_ml_end-to-end_implementations

This repository features practical AutoGluon implementations across tasks like Kaggle competition solutions, tabular classification/regression, multimodal analysis, feature engineering, and multi-label classification. Each notebook runs on real-world datasets (text, image, time series) in Google Colab, with complete outputs included for reproducibility.

Last synced: 12 Apr 2025

https://github.com/subhashpolisetti/mqtt-reservoir-data-pipeline

California Reservoir Water Monitoring System – A data aggregation and monitoring solution using MQTT to collect and summarize daily water mark levels across California's reservoirs. This system transforms CSV data into JSON, generates daily reports, and provides insights into water availability.

csv json mqtt python

Last synced: 28 Feb 2025
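The CSV-to-JSON and daily-report steps described for the reservoir pipeline above can be sketched with the standard library alone. This is a minimal illustration, not the project's code: the column names (`date`, `reservoir`, `level_ft`), the sample rows, and both function names are assumptions, and the MQTT transport itself is left out so the sketch runs without a broker.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample rows; column names and values are illustrative only.
SAMPLE_CSV = """date,reservoir,level_ft
2025-01-01,reservoir_a,1010.5
2025-01-01,reservoir_b,850.0
2025-01-02,reservoir_a,1011.5
"""


def csv_to_json_messages(csv_text):
    """Turn each CSV row into a JSON string that could serve as a message payload."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, sort_keys=True) for row in reader]


def daily_summary(messages):
    """Group decoded payloads by date and report the count and mean level per day."""
    levels = defaultdict(list)
    for payload in messages:
        record = json.loads(payload)
        levels[record["date"]].append(float(record["level_ft"]))
    return {
        day: {"reservoirs": len(vals), "mean_level_ft": sum(vals) / len(vals)}
        for day, vals in sorted(levels.items())
    }


messages = csv_to_json_messages(SAMPLE_CSV)
report = daily_summary(messages)
print(json.dumps(report, indent=2))
```

In a real deployment each JSON string would be published to a topic and the summary would run on the subscriber side; here the list of strings stands in for that message stream.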

https://github.com/subhashpolisetti/hoppingwindow-crudeoil

This project analyzes West Texas Intermediate (WTI) Crude Oil Prices using a hopping window strategy to calculate weekly mean and maximum prices, offering detailed insights into price fluctuations.

hopping-window pandas python

Last synced: 26 Mar 2025
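A hopping window like the one described for the crude oil project above takes a fixed number of observations and advances by a fixed step, so consecutive windows may overlap. The sketch below is one plain-Python way to compute a mean and maximum per window; the function names, the window size of 7, the hop of 3, and the price values are all made up for illustration and are not real WTI data or the project's code.

```python
from statistics import mean


def hopping_windows(values, size, hop):
    """Yield (start_index, window) for every full window of `size` items,
    advancing `hop` items each time. Windows overlap when hop < size."""
    for start in range(0, len(values) - size + 1, hop):
        yield start, values[start:start + size]


def summarize(values, size, hop):
    """Return the mean and maximum of each hopping window."""
    return [
        {"start": start, "mean": mean(window), "max": max(window)}
        for start, window in hopping_windows(values, size, hop)
    ]


# Made-up daily prices, for illustration only.
prices = [70.0, 71.0, 69.0, 72.0, 74.0, 73.0, 75.0, 76.0, 74.0, 77.0]
weekly = summarize(prices, size=7, hop=3)
for row in weekly:
    print(row)
```

Setting `hop` equal to `size` would turn this into a tumbling (non-overlapping) window, which is the usual point of comparison.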

https://github.com/subhashpolisetti/dimensionality_reduction

This repository demonstrates various dimensionality reduction techniques on image and tabular datasets. It explores and visualizes methods like PCA, t-SNE, UMAP, ISOMAP, and Autoencoders, comparing their performance and effectiveness with interactive visualizations.

databricks dimensionality-reduction pandas tabular-data

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/timeseriesdataaggregation

A Python project demonstrating time-series data aggregation using a hopping window function to calculate mean temperature readings from simulated sensor data. Ideal for streaming applications that need low-latency, real-time insights over overlapping time intervals.

hopping-window python

Last synced: 26 Mar 2025
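For timestamped sensor readings such as those in the project above, a hopping window can also be defined in time rather than in item counts: each reading is assigned to every window `[start, start + size)` that contains its timestamp. The following is a sketch under assumptions: timestamps are non-negative integer seconds, windows start at multiples of `hop`, and the readings and function names are hypothetical.

```python
from collections import defaultdict


def window_starts(timestamp, size, hop):
    """Return the start times of every window [start, start + size)
    that contains `timestamp`, with windows starting at multiples of `hop`."""
    start = (timestamp // hop) * hop
    starts = []
    while start > timestamp - size and start >= 0:
        starts.append(start)
        start -= hop
    return sorted(starts)


def mean_per_window(readings, size, hop):
    """Accumulate a running sum and count per window, then return the means."""
    totals = defaultdict(lambda: [0.0, 0])
    for timestamp, temperature in readings:
        for start in window_starts(timestamp, size, hop):
            totals[start][0] += temperature
            totals[start][1] += 1
    return {start: total / count for start, (total, count) in sorted(totals.items())}


# Hypothetical (timestamp_seconds, temperature) pairs.
readings = [(0, 20.0), (5, 22.0), (10, 24.0), (15, 26.0)]
means = mean_per_window(readings, size=10, hop=5)
print(means)
```

Because only a sum and a count are kept per window, readings can be processed one at a time as they arrive, which suits a streaming setting.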

https://github.com/subhashpolisetti/apache-beam-and-eda-projects

Apache Beam and EDA Projects: Showcasing real-time data processing with Apache Beam, interactive visualizations with D3.js, and automated EDA with Sweetviz and PyCaret. Includes Jupyter notebooks and outputs for learning and exploration.

apachebeam d3js matplotlib pandas pycaret seaborn-plots sweetviz

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/eda-timeseries-tabular

This repository contains two machine learning projects: Air Quality Prediction, which predicts CO(GT) levels from environmental and pollutant data using AutoML, and NYC Taxi Fare Prediction, which predicts taxi fares from trip data using automated machine learning. Both projects showcase data analysis, preprocessing, and predictive modeling techniques.

automl eda tabular-data time-series

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/crispdm-semma-kdd-workflows

This repository demonstrates three data mining methodologies applied to various real-world datasets: CRISP-DM (Weather Analysis), KDD (Social Media Ads Analysis), and SEMMA (Spotify Recommendation System). Each project includes data exploration, preprocessing, modeling, and evaluation steps, along with comprehensive documentation, supporting files,

clustering-algorithm crisp-dm kdd latex-template machine-learning numpy pandas random-forest research-paper semma

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/python-mqtt-message-stream

This project demonstrates asynchronous message streaming using MQTT for Internet of Things (IoT) applications. We’ll set up a Publisher and a Subscriber to handle 1,000,000 messages via the Mosquitto MQTT broker.

mqtt mqtt-broker mqtt-client python

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/realtime-websocket-client

This project demonstrates a WebSocket-based real-time communication system with a client-server setup and a basic user interface. The client sends 10,000 messages to the server, which responds to each message, showcasing efficient, long-lasting, full-duplex communication.

python tkinter-gui websocket

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/flask_real_chat_application

This is a real-time chat application built with Flask, Flask-SocketIO, and SQLite. Users can join chat rooms, send messages, and view message history in real time. The app features a simple, responsive UI that provides a smooth chat experience.

css flask flasksocketio html python sqlite

Last synced: 26 Mar 2025
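The message-history part of a chat application like the one above can be illustrated with Python's built-in `sqlite3` module. This sketch covers storage only and leaves out Flask and Flask-SocketIO; the table layout, column names, and function names are assumptions made for the example, not the project's schema.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a SQLite database and make sure the (hypothetical) messages table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "room TEXT NOT NULL, username TEXT NOT NULL, body TEXT NOT NULL)"
    )
    return conn


def save_message(conn, room, username, body):
    """Insert one message; the `with` block commits on success and rolls back on error."""
    with conn:
        conn.execute(
            "INSERT INTO messages (room, username, body) VALUES (?, ?, ?)",
            (room, username, body),
        )


def room_history(conn, room, limit=50):
    """Return the most recent `limit` messages for a room, oldest first."""
    rows = conn.execute(
        "SELECT username, body FROM messages WHERE room = ? ORDER BY id DESC LIMIT ?",
        (room, limit),
    ).fetchall()
    return rows[::-1]


conn = open_store()
save_message(conn, "general", "ana", "hi")
save_message(conn, "random", "cy", "anyone here?")
save_message(conn, "general", "ben", "hello")
print(room_history(conn, "general"))
```

A socket event handler would typically call `save_message` when a message arrives and `room_history` when a user joins a room, so newcomers see earlier messages.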

https://github.com/subhashpolisetti/cmpe255-assignment-1

Data Mining Assignment

Last synced: 26 Mar 2025

https://github.com/subhashpolisetti/cat-dog-classification-cnn

CNN-Based Dogs vs. Cats Image Classifier: A Convolutional Neural Network (CNN) model for classifying images of cats and dogs, built using TensorFlow and Keras. This project includes data preprocessing, data augmentation, and transfer learning techniques to achieve high classification accuracy.

Last synced: 02 Apr 2025

https://github.com/subhashpolisetti/llm-threat-model

This project addresses security challenges in implementing a Large Language Model (LLM) system for the Food and Agriculture Organization (FAO). It focuses on securing their document retrieval and analysis platform, which processes sensitive food security and nutrition data.

architecture-components

Last synced: 02 Apr 2025

https://github.com/subhashpolisetti/rag-llm-vecdb-survey

This repository explores how Large Language Models (LLMs) and Vector Databases (VecDBs) work together, based on insights from the referenced survey paper. It focuses on showing how their combination can solve practical problems, like improving how AI retrieves and uses information (retrieval-augmented generation) and building more efficient systems

Last synced: 02 Apr 2025
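The retrieval step of retrieval-augmented generation, as mentioned in the entry above, amounts to ranking stored passages by vector similarity to the query and placing the best matches in the prompt. The toy sketch below uses cosine similarity over hand-written three-dimensional vectors; a real system would obtain embeddings from a model and store them in a vector database. All passages, vectors, and function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored passages most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, passages):
    """Assemble a prompt that places the retrieved passages ahead of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


# Hand-written toy "embeddings"; real ones would come from an embedding model.
store = [
    ("Vector databases index embeddings for similarity search.", [0.9, 0.1, 0.0]),
    ("Language models generate text from a prompt.", [0.1, 0.9, 0.0]),
    ("An unrelated passage about cooking.", [0.0, 0.0, 1.0]),
]
query_vec = [0.8, 0.2, 0.0]
passages = top_k(query_vec, store, k=2)
print(build_prompt("How are embeddings searched?", passages))
```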

https://github.com/subhashpolisetti/decision-tree-ensemble-algorithms

A Python implementation of ensemble learning algorithms from scratch, including Gradient Boosting Machine (GBM), Random Forest, AdaBoost, and Decision Trees. This repository also showcases XGBoost, CatBoost, LightGBM for classification, regression, and ranking tasks, with visualizations and performance comparisons.

Last synced: 02 Apr 2025
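Tree ensembles such as Random Forest, AdaBoost, and gradient boosting are built from many small decision trees, so a from-scratch implementation usually starts with the base learner. As a hedged illustration, here is a one-split decision stump on a single numeric feature, chosen by weighted Gini impurity. It is not the repository's code; the function names and the tiny dataset are made up.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def fit_stump(xs, ys):
    """Pick the threshold on one numeric feature that minimizes the weighted
    Gini impurity of the two sides, and label each side by majority vote."""
    best = None
    values = sorted(set(xs))
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if best is None or score < best[0]:
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            best = (score, threshold, left_label, right_label)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, x):
    """Classify one value with a fitted (threshold, left_label, right_label) stump."""
    threshold, left_label, right_label = stump
    return left_label if x <= threshold else right_label


# Tiny made-up dataset: one feature, two classes.
stump = fit_stump([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1])
print(stump, predict(stump, 1.2), predict(stump, 3.9))
```

An ensemble would train many such learners, on bootstrap samples for bagging or on reweighted data for boosting, and combine their votes.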

https://github.com/subhashpolisetti/clustering-techniques-and-embeddings

This repository includes Colab notebooks demonstrating various clustering algorithms, ranging from from-scratch implementations to advanced deep learning models and embeddings. Each notebook features explanations, visualizations, and quality evaluation metrics for clustering performance.

anomaly-detection clustering-algorithm hierarchical-clustering kmeans-clustering multimodal time-series

Last synced: 02 Apr 2025
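As an example of what a from-scratch clustering method with a quality metric can look like, the sketch below implements Lloyd's k-means algorithm in plain Python together with inertia (the sum of squared distances from each point to its assigned centroid). It is an illustration under assumptions, not the notebooks' code: initial centroids are passed in by the caller to keep the run deterministic, and the points and function names are made up.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points to the nearest
    centroid and moving each centroid to the mean of its members."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        assignments = [
            min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            for p in points
        ]
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, assignments


def inertia(points, centroids, assignments):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sq_dist(p, centroids[a]) for p, a in zip(points, assignments))


# Made-up 2-D points forming two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids, assignments, inertia(points, centroids, assignments))
```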

https://github.com/subhashpolisetti/network-traffic-visualizer

A Python-based tool for analyzing and visualizing network traffic from .pcap files using Wireshark's tshark. It extracts key metrics such as protocols, IP addresses, and packet sizes, and provides visualizations for exploring network activity.

Last synced: 26 Feb 2025
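One way a tool like the traffic visualizer above could gather its metrics is to ask tshark for tab-separated fields and tally them in Python. The sketch below builds such a command and summarizes its output format. The choice of fields is an assumption (they are standard Wireshark field names, but not necessarily the ones the project extracts), the function names are hypothetical, and the sample lines use documentation-range IP addresses rather than captured traffic.

```python
from collections import Counter

# Assumed field selection; tshark prints one tab-separated line per packet.
FIELDS = ["ip.src", "ip.dst", "frame.len", "frame.protocols"]


def tshark_command(pcap_path):
    """Build the argument list for reading a capture file and printing FIELDS."""
    cmd = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in FIELDS:
        cmd += ["-e", field]
    return cmd


def summarize(lines):
    """Tally source addresses, top-level protocols, and total bytes
    from tab-separated tshark field output."""
    sources, protocols = Counter(), Counter()
    total_bytes = packets = 0
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != len(FIELDS):
            continue  # skip lines that do not match the expected layout
        src, _dst, length, protos = parts
        if src:  # non-IP packets leave ip.src empty
            sources[src] += 1
        protocols[protos.split(":")[-1]] += 1  # last entry is the innermost protocol
        total_bytes += int(length)
        packets += 1
    return {
        "packets": packets,
        "total_bytes": total_bytes,
        "sources": dict(sources),
        "protocols": dict(protocols),
    }


# Hypothetical output lines, not taken from a real capture.
sample = [
    "192.0.2.1\t198.51.100.7\t60\teth:ethertype:ip:tcp",
    "192.0.2.1\t198.51.100.7\t1500\teth:ethertype:ip:tcp",
    "198.51.100.7\t192.0.2.1\t90\teth:ethertype:ip:udp:dns",
]
summary = summarize(sample)
print(tshark_command("capture.pcap"))
print(summary)
```

With tshark installed, the command list could be passed to `subprocess.run(..., capture_output=True, text=True)` and the resulting stdout split into lines for `summarize`; the counters then feed directly into bar or pie charts.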