An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/joel-beck/airbnb-oslo

Price Prediction Models for Airbnb Apartments in Oslo | Winter Term 2021/22

prediction python pytorch scikit-learn

Last synced: 04 May 2026

https://github.com/bhawnamehbubani/airline-passenger-referral-program-development-with-classification-techniques

Prediction of airline passenger referrals using Logistic Regression, GridSearchCV, and TF-IDF vectorization with Python, Pandas, Scikit-learn, and Excel.

excel gridsearchcv logistic-regression pandas python3 scikit-learn tf-idf-vectorization

Last synced: 04 May 2026

https://github.com/keven-rdr/rio-airbnb-predictor

AI study using prediction models, such as a regressor, to determine property values

airbnb ia kaggle php price regression-models scikit-learn

Last synced: 04 May 2026

https://github.com/suguru-n/temp_easyai

Hands-on machine learning program for undergraduate students

google-colab jupyter-notebook linearregression python scikit-learn

Last synced: 04 May 2026

https://github.com/dakii24/credit-card-fraud-detection

This repository contains a machine learning project focused on detecting fraudulent credit card transactions. The project includes data preprocessing, model training, and evaluation to identify and prevent fraudulent activities.

capstone-project class-imbalance classification-algorithm credit-card credit-card-fraud data-science decision-trees fraud machine-learning open-data python scikit-learn svm svm-classifier

Last synced: 04 May 2026

https://github.com/madhu26sree/diabetes-prediction

This project uses the Support Vector Machine (SVM) algorithm to predict whether a person is likely to have diabetes, using the Diabetes dataset. It covers data preprocessing, model building, and evaluation using Python.

machine-learning python scikit-learn

Last synced: 04 May 2026

https://github.com/drod75/nyc-arrests-analysis

This is a simple Data Science Project made to analyze and display data and trends found within the NYC Arrests Year to Date Dataset.

data-analysis data-visualization folium jupyter-notebook matplotlib-pyplot nyc-opendata nypd python scikit-learn seaborn

Last synced: 04 May 2026

https://github.com/msikorski93/protein-tertiary-structure

Performing a regression task for estimating residue size based on given physicochemical properties of protein tertiary structures (CASP 5-9).

bioinformatics gradient-boosting multilayer-perceptron-network protein-structure-prediction regression-algorithms scikit-learn tensorflow

Last synced: 04 May 2026

https://github.com/chathumiamarasinghe/nn-training-model

A comprehensive project for training neural networks to solve real-world problems. This repository includes customizable code for building, training, and evaluating neural network architectures using popular deep learning frameworks.

jupyter-notebook matplotlib numpy phyton scikit-learn

Last synced: 04 May 2026

https://github.com/aqueeqazam/machine-learning-using-scikit

This repository contains the algorithms used to train machine learning models with the scikit-learn library.

numpy scikit-learn

Last synced: 04 May 2026

https://github.com/siddhantborse/atmosviz

Atmos Viz is a Python-based project designed to analyze, visualize, and predict global temperature trends across various cities and countries using time-series analysis and advanced data science techniques. Leveraging historical climate data, this project integrates machine learning models, geospatial mapping, and interactive visualizations to unco

geopandas geospatial-analysis gis matplotlib numpy pandas plotly python scikit-learn seaborn shapefiles time timeseries-analysis timeseries-data

Last synced: 05 May 2026

https://github.com/sxv357/xtern-artificial-intelligence-work-based-assessment

This application takes in data regarding undergraduate college students in the state of Indiana such as their year, what major they're pursuing, which university they attend, and makes a prediction about their food order.

jupyter-notebook matplotlib pandas pickle scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/pierrealexandre78/deathpredict

Predict hospital mortality for patients admitted to the ICU (Intensive Care Unit) using machine learning

healthcare hospital machine-learning predictions python random-forest-classifier scikit-learn xgboost-classifier

Last synced: 05 May 2026

https://github.com/thekartikeyamishra/resumeevaluatorapp

The Automated Resume Evaluator is a Python-based application that helps evaluate resumes against job descriptions. It calculates an Applicant Tracking System (ATS) score, which is the percentage of keywords from the job description found in the resume.

flask machine-learning matplotlib nlp nltk pypdf python scikit-learn spacy textblob

Last synced: 05 May 2026

https://github.com/himanshkr03/comparative_performance_on_fashionmnist

This repository explores various machine learning and deep learning models for classifying images from the Fashion MNIST dataset. It includes data exploration, model training, evaluation, and visualization techniques to gain insights into the classification task.

deep-learning fashion-mnist fine hybrid-model image-classification keras machine-learning scikit-learn tensorflow xgboost-algorithm

Last synced: 05 May 2026

https://github.com/simpl1fy/spam-classifier-project

A web application to classify spam texts or emails.

multinomial-naive-bayes nltk python render scikit-learn text-classification

Last synced: 05 May 2026

https://github.com/s-matke/eco-forecast

Machine learning model for predicting the European country that generates the most surplus green energy

data-science green-energy machine-learning scikit-learn supervised-learning

Last synced: 05 May 2026

https://github.com/hallowshaw/text-emotion-classification-using-lstm-and-tokenization

This repository provides a machine learning and deep learning pipeline for text emotion detection. It includes a pretrained LSTM model, tokenizer, and preprocessing steps to classify emotions such as joy, sadness, and anger from text input. Easily deployable with provided resources and scripts.

emotion-classification emotion-detection feature-engineering lstm nltk nltk-python scikit-learn scikitlearn-machine-learning sentiment-analysis sequential-models text-classification text-classification-multi-label tokenization tokenizer

Last synced: 05 May 2026

https://github.com/marconicivitavecchia/stazione-monitoraggio-ambientale

MicroPython code for the ESP32, written for the course our school runs for teachers on building an environmental monitoring station, covering Python, IoT, and Artificial Intelligence.

ai esp32 micropython micropython-esp32 python school-project scikit-learn

Last synced: 05 May 2026

https://github.com/zafir100100/cancer-stage-prediction

This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.

cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn

Last synced: 05 May 2026

https://github.com/hitthecodelabs/petalanalyticsstreamlit

Web application developed with Streamlit that predicts the Iris flower type based on its physical features

matplotlib model numpy pickle python scikit-learn sklearn streamlit

Last synced: 05 May 2026

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/smaddanki/pattern-pursuit-challenge

A personal challenge to build a production-ready trading signal system for S&P 500 stocks using deep learning. This project progresses from basic ML models to a complete trading infrastructure, focusing on 5-day forward return prediction and signal generation.

deep-learning machine-learning pytorch quantative-trading quantitative-finance quantitative-research scikit-learn

Last synced: 05 May 2026

https://github.com/markdouthwaite/lingo-demo

A demo project showing how to effectively deploy Scikit-Learn Linear Models in Go into Google Cloud Run.

go golang google-cloud-platform python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/teja-1403/coursera-machine-learning-with-python-honors

This project involves building a classifier to predict rainfall for the next day based on weather data from the Australian Government's Bureau of Meteorology. Various machine learning techniques such as Linear Regression, KNN, Decision Trees, Logistic Regression, and SVM were implemented and evaluated.

classification hierarchical-clustering machine-learning regression scikit-learn scipy

Last synced: 05 May 2026

https://github.com/zuhairzia/customer-segmentation

Customer segmentation using KMeans clustering to analyze demographics, income, and spending. Helps businesses with targeted marketing and customer insights.

joblib matplotlib numpy pandas scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohra-mehak/sciencesync

System for Personalized Google Scholar Alerts Processing and Data Management, and provision of ML based clustering analysis

agglomerative-clustering clustering crossref-api customtkinter google-api google-scholar graph-api machine-learning numpy pandas python3 scientific-article-analysis scikit-learn sqlite3

Last synced: 05 May 2026

https://github.com/nandinimarepalli/ai_ml_internship_projects

Projects completed during my AI/ML and Data Expert internship, including EDA, machine learning models, and dashboard development using Python, pandas, scikit-learn, and visualization libraries.

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 05 May 2026

https://github.com/rohansardar/iris_flower

A basic ML project on the iris flower classification

data-science iris-classification iris-dataset ml python scikit-learn

Last synced: 05 May 2026

https://github.com/gbourniq/cnn-multiclass-classification-gear

Using Machine Learning and Deep Learning to predict the category of outdoor equipment

image-classification keras-tensorflow multiclass-classification python scikit-learn tensorboard-visualizations

Last synced: 05 May 2026

https://github.com/aysenurcftc/breast_cancer_streamlit

Breast Cancer Wisconsin Dataset Classifier with Scikit-learn and Streamlit

breast-cancer classification gridsearch scikit-learn streamlit

Last synced: 05 May 2026

https://github.com/aryar-06/linear-regression

A Python project demonstrating basic linear regression with gradient descent and matrix operations, alongside scikit-learn comparison.

data-analysis data-preprocessing educational-project gradient-descent linear-regression machine-learning python regression-algorithms scikit-learn

Last synced: 05 May 2026

https://github.com/supernovasatsangi23/modifying-biomarker-gene-identification-for-effective-cancer-categorization

A project that focuses on implementing a hybrid approach that modifies the identification of biomarker genes for better categorization of cancer. The methodology is a fusion of MRMR filter method for feature selection, steady state genetic algorithm and a MLP classifier.

dataset deep-learning deep-neural-networks feature-selection genetic-algorithm machine-learning machine-learning-algorithms mlp-classifier mrmr neural-network numpy pandas-dataframe python python3 scikit-learn scikit-learn-python tkinter-gui tkinter-python

Last synced: 05 May 2026

https://github.com/antoniskl/un-general-debate-corpus-classification

The aim of this project is to classify UNGDC speeches with regard to climate change. As a secondary objective, the correlation between these speeches, forestation, and the happiness index of the countries is examined.

classification data-science jupyter-notebook machine-learning nlp python regression scikit-learn text-classification text-preprocessing

Last synced: 05 May 2026

https://github.com/kefrankk/ml-fraud-detection

I built a predictive model to detect fraud in financial transactions.

pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 05 May 2026

https://github.com/kunalpisolkar24/dsbda_lab

Collection of practical codes for Savitribai Phule Pune University's Data Science and Big Data Analytics Laboratory (310256).

data-analytics data-preprocessing data-science data-wrangling descriptive-statistics linear-regression logistic-regression mapreduce scala scikit-learn sppu-computer-engineering tf-idf

Last synced: 05 May 2026

https://github.com/vanilladucky/housing-prediction

This is a data analytics and machine learning project that I undertook using a housing dataset on Kaggle in order to put my machine learning knowledge to practice and some practical application.

data-science machine-learning python scikit-learn

Last synced: 05 May 2026

https://github.com/divinenaman/color-extraction-api

Extract colours from images using K-means, along with FastAPI pipeline.

fastapi k-means-clustering scikit-learn

Last synced: 05 May 2026

https://github.com/sevilaymuni/project-no.6-tree-based-models

Random Forest Assisted Suggestions for Salifort Motors Employee Retention: Plan, Analyze, Construct and Execute

data-science decision-trees evaluation-metrics gridsearchcv logistic-regression machine-learning matplotlib python random-forest-classifier scikit-learn seaborn-plots

Last synced: 05 May 2026

https://github.com/zenitsu272/fault-detection-ml

Machine Learning based Fault Detection in machines using sensor data

artificial-intelligence decsion-tree machine-learning pandas pandas-dataframe pandas-python scikit-learn

Last synced: 05 May 2026

https://github.com/pjj11005/ml_with_pytorch_study

Repository of code written while studying [Machine Learning Textbook: PyTorch Edition]

deep-learning graph-neural-networks machine-learning neural-networks pytorch scikit-learn transformer

Last synced: 06 May 2026

https://github.com/grandechowhiskey/fcc-machine_learning-boilerplates

A collection of projects completed as part of the FreeCodeCamp "Machine Learning with Python" certification. These projects focus on implementing machine learning models, data preprocessing, and predictive analysis using libraries like scikit-learn and TensorFlow.

ai ml python3 scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/fahrettinsolak/ai-based-salary-scale-calculation-project

This project demonstrates a Polynomial Regression model using a dataset related to experience and salary. The model is built using Python with the pandas, matplotlib, and sklearn libraries. The dataset includes information on years of experience and corresponding salary.

artificial-intelligence deep-learning jupyter-notebook machine-learning matplotlib pandas pyhton scikit-learn

Last synced: 05 May 2026

https://github.com/sadmansakib93/mental-resilience-analysis-using-machine-learning

Used supervised and unsupervised ML techniques to analyze the mental health and resilience levels of medical students [Project completed in December 2019]

artificial-intelligence classification clustering correlation linear-regression machine-learning machine-learning-algorithms mental-health python regression resilience scikit-learn statistical-analysis

Last synced: 06 May 2026

https://github.com/rishisolanke/twitter-sentiment-analysis-using-machine-learning-

A research project that classifies tweets as positive, negative, or neutral using ML algorithms (Logistic Regression, Naïve Bayes, SVM) with NLP preprocessing.

data-science data-visualization logistic-regression machine-learning ml-models naive-bayes natural-language-processing nlp scikit-learn sentiment-analysis svm text-classification twitter-data

Last synced: 06 May 2026

https://github.com/keneandita/iris-intel

Iris Flower Classifier is a simple web app built with Streamlit that predicts the species of an Iris flower based on user-input flower features. It uses pre-trained machine learning models including Logistic Regression, K-Nearest Neighbors, SVM, and Decision Tree to make real-time predictions.

iris-classification jupyter-notebook machine-learning python scikit-learn streamlit

Last synced: 06 May 2026

https://github.com/eshansugeesh/fico-score-loan-default-modeling-project

Credit risk assessment using FICO score segmentation, loan default modeling, discretization techniques, and log-likelihood evaluation for predictive analytics in financial services.

bucketing classification credit-risk customer-segmentation data-science discretization fico-score financial-analytics loan-analysis loan-default log-likelihood machine-learning numpy pandas predictive-modeling risk-modeling scikit-learn segmentation statistical-modelling

Last synced: 06 May 2026

https://github.com/nicolas-giacomelli/modelo_regressao_linear_vendas

Linear regression model for sales forecasting. Challenge from RocketSeat's AI course.

matplotlib pandas python3 scikit-learn

Last synced: 06 May 2026

https://github.com/radoslawregula/binary-classification-metrics

A model implementing a solution to the binary classification problem along with several accuracy metrics.

binary-classification classification jupyter-notebook machine-learning matplotlib pandas python scikit-learn stochastic-gradient-descent

Last synced: 06 May 2026

https://github.com/billgewrgoulas/recommendation-systems

Algorithms for joke rating prediction using the joke data-set from Kaggle.

algorithm clustering collaborative-filtering machine-learning numpy pandas recommender-system scikit-learn scypi

Last synced: 06 May 2026

https://github.com/kaoutarmi/predition_price-old-cars

This car price prediction project uses machine learning to estimate vehicle values based on their characteristics.

car-price-prediction data-preprocessing data-science decision-tree feature-engineering machine-learning regression scikit-learn

Last synced: 06 May 2026

https://github.com/lazarust/jupyternotebooks

Storage spot for all my Jupyter Notebooks. Check some of them out!!

jupyter-notebook jupyter-notebooks keras scikit-learn sklearn

Last synced: 06 May 2026

https://github.com/erick957/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

🏠 Predict house prices using advanced regression techniques with this comprehensive analysis and cleaning project, from data loading to model deployment.

data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 06 May 2026

https://github.com/andrewsy1004/logistic-regression-spam-classifier

This project implements a spam email classifier using Logistic Regression.

numpy pandas scikit-learn

Last synced: 06 May 2026

https://github.com/5hraddha/optimize-oil-well-locations

In the quest for harnessing valuable energy resources, the OilyGiant mining company wants to expand its operations by discovering new oil well locations. To achieve this, a data-driven approach is adopted, leveraging geological exploration data from three distinct regions and employing techniques in data analysis and modeling.

linear-regression numpy pandas scikit-learn supervised-learning

Last synced: 06 May 2026

https://github.com/sabin74/boston_house_prediction

This project aims to predict the median value of owner-occupied homes in Boston suburbs using various machine learning regression models. Multiple regression techniques were applied, including Linear Regression, Decision Tree, Random Forest, Gradient Boosting and dimensionality reduction with PCA. Hyperparameter tuning was performed.

boston-housing-price-prediction hyperparameter-tuning kaggle-dataset pca-analysis python3 regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/adesartika33/proyek-analisis-data-dataset-iris

This project aims to analyze the Iris dataset, one of the classic datasets in Machine Learning and Data Science. The dataset consists of 150 Iris flower samples from three species (Setosa, Versicolor, and Virginica).

classification data-science data-visualization eda exploratory-data-analysis iris-dataset machine-learning python random-forest scikit-learn

Last synced: 06 May 2026

https://github.com/felipesbonatti/case-credit-risk-prediction

Credit risk classification project built with Python, Scikit-learn, and Pandas. Demonstrates an end-to-end Machine Learning workflow: data preprocessing, feature engineering, training of multiple algorithms, and performance evaluation with metrics such as AUC-ROC.

credit-risk machine-learning predictive-modeling python scikit-learn

Last synced: 06 May 2026

https://github.com/pimakarov/textkd-p4-fewshot-distilbert

📊 Compare few-shot text classification with DistilBERT and TF-IDF + SVM using IMDB data, analyzing performance across various sample sizes.

bert distilbert few-shot-learning nlp python pytorch scikit-learn text-classification transfer-learning trasformer

Last synced: 06 May 2026

https://github.com/dwade-eng/customer-lead-conversion-analysis

This project explores a real-world lead conversion dataset, using a structured machine learning pipeline to classify leads into likely or unlikely converters. It includes complete steps from data wrangling and visualization to feature engineering and model evaluation.

html matplotlib pandas python3 scikit-learn seaborn

Last synced: 06 May 2026

https://github.com/ejw-data/ml-playground

Testing the limitations, inabilities, and strengths of models with synthetic data

machine-learning python scikit-learn

Last synced: 06 May 2026

https://github.com/tharunkumar666/employee_attrition_prediction

Predict employee attrition using Logistic Regression. Use Python with Pandas and Scikit-learn to analyze factors like salary, satisfaction, and promotion history. Model classifies if an employee will stay or leave, helping HR take proactive retention measures.

pandas python regression-models scikit-learn

Last synced: 06 May 2026

https://github.com/douglas-data-analyst/predictive-analysis

Predictive model for sales forecasting using scikit-learn and machine learning

data-science machine-learning predictive-analytics python sales-forecasting scikit-learn time-series

Last synced: 06 May 2026

https://github.com/pradeep-r04/spam-email-classification

Spam Email Classification Using NLP and Machine Learning involves building a system to identify and categorize emails as either spam or non-spam (ham). This process typically uses Natural Language Processing (NLP) techniques to analyze and preprocess text data and machine learning algorithms to train a model for classification.

artificial-intelligence machine-learning naive-bayes-classifier nlp pkl python scikit-learn streamlit

Last synced: 06 May 2026

https://github.com/cycle-sync-ai/student-score-analysis

A data-driven student performance analysis project using UCI dataset (396 students, 33 features). Implements machine learning models (K-means, PCA, Decision Tree, Random Forest, Linear Regression) to analyze academic patterns and predict student scores based on lifestyle, health, and study habits.

clustering clustering-algorithm decision-trees feature-engineering learning-management-system linear-regression machine-learning machine-learning-algorithms matplotlib numpy pandas pca pickle prediction prediction-algorithm scikit-learn score seaborn student

Last synced: 06 May 2026

https://github.com/williyam-m/company-registration-trends

Utilized Linear Regression from scikit-learn to predict future company registration trends.

flask matplotlib numpy pandas-python scikit-learn

Last synced: 06 May 2026

https://github.com/lintangwisesa/pdb_mti_ui_lab1_k6

Lab 1 assignment for Big Data Management (Pengelolaan Data Besar), MTI UI 2023

machine-learning python3 scikit-learn

Last synced: 06 May 2026

https://github.com/barbarpotato/applied-data-science-with-python-specialization

This skills-based specialization is intended for learners who have a basic python or programming background, and want to apply statistical, machine learning, information visualization, text analysis, and social network.

data-science matplotlib pandas scikit-learn

Last synced: 06 May 2026

https://github.com/bhavyac16/flairifyme

FlairifyMe is a Reddit Flair Detector for r/india subreddit, that takes a post's URL as user input and predicts the flair for the post using a model generated by Logistic Regression.

flair-prediction flask hacktoberfest linear-svm logistic-regression naive-bayes-classifier nltk praw-reddit reddit-flair-detector scikit-learn scraped-data subreddit text-classification

Last synced: 06 May 2026

https://github.com/kartheekdama/salary-prediction

This salary prediction model leverages machine learning techniques, including Random Forest, Decision Tree, and Linear Regression, to estimate salaries based on individual attributes such as age, gender, education level, job title, and years of experience. The Random Forest model outperforms the others, achieving the highest R-squared score.

decision-tree exploratory-data-analysis feature-importance linear-regression machine-learning random-forest scikit-learn

Last synced: 06 May 2026

https://github.com/sahilmate/ebm-breast-cancer-classifier

This repository implements an Explainable Boosting Machine (EBM) model for breast cancer classification using scikit-learn and interpret. The project includes data preprocessing, model training, accuracy evaluation, and feature importance visualization.

breast-cancer-classification data-visualization explainable-boosting-machine feature-importance interpret machine-learning scikit-learn

Last synced: 06 May 2026

https://github.com/josepablodmg/python--linear-regression-advertising

A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.

advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization

Last synced: 06 May 2026

https://github.com/rafay-imraan/recommendation-system

A machine learning model that outputs personalized similar movie recommendations for people based on the ones they have rated positively.

machine-learning pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/ccastleberry/hands_on_machine_learning

Notebooks and files created while working through the book Hands on Machine Learning

data-science jupyter-notebook scikit-learn tensorflow

Last synced: 06 May 2026

https://github.com/avtorgenii/ml-playground

A repository for exploring and experimenting with datasets, building machine learning models, and testing various techniques in data preprocessing, feature engineering, and model evaluation.

matplotlib ml pandas scikit-learn

Last synced: 06 May 2026

https://github.com/galaxy092/samsung-innovation-campus-big-data-capstone-project

Samsung Innovation Campus Big Data Capstone Project - Weather Prediction

hadoop jupyter-notebook pandas pyspark scikit-learn sparksql

Last synced: 06 May 2026

https://github.com/blacknahil/spam-detection

A simple web application for detecting spam messages using a machine learning model. The application is built using Flask and provides an interactive interface for users to input a message and get a prediction whether it is spam or ham along with the probability.

flask html-css-javascript pandas scikit-learn

Last synced: 06 May 2026

https://github.com/samudraneel05/stanford-open-policing

The Stanford Open Policing Project (SOPP) aims to bring transparency to police interactions by collecting and analyzing data on traffic stops across the United States. It accumulates a vast dataset on traffic stops, encompassing details such as demographics, location, and outcomes.

clustering heirarchical-clustering k-means-clustering machine-learning matplotlib pandas python scikit-learn

Last synced: 06 May 2026

https://github.com/jbizzlefoshizzle/ibm_capstone_project

Used K-means clustering and mapping libraries to determine the best cities in San Diego to open a Mexican restaurant

beautifulsoup4 folium-maps geopy pandas-python scikit-learn

Last synced: 06 May 2026

https://github.com/kianaabrisham/svm-from-scratch

Linear SVM from scratch with hinge loss + decision boundaries

classification from-scratch fundamentals hinge-loss numpy optimization scikit-learn svm

Last synced: 07 May 2026