An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/stefan-schroedl/molecule_classification

"Auto-sklearn for chemistry" - train and run machine-learned classifiers for molecular classification tasks.

automatic automl cheminformatics chemistry chemoinformatics hyperparameter-optimization machine-learning molecule python rdkit scikit-learn

Last synced: 25 Sep 2025

https://github.com/chris-santiago/steps

A SciKit-Learn style feature selector using best subsets and stepwise regression.

best-subset-selection data-science python scikit-learn stepwise-selection

Last synced: 28 Jun 2025

https://github.com/shubhamkumar-op/crop_recommendation

A rule-based system that uses rules to suggest crops based on weather patterns and other relevant factors. This type of system is relatively simple to implement and may be suitable for smaller farms with less complex data needs.

logistic-regression pandas random-forest random-forest-classifier scikit-learn

Last synced: 17 Jun 2025

https://github.com/fabianacampanari/iris-dataanalysis-seaborn-

🌸 The provided code snippet is a Python script that uses matplotlib to plot the numerical and exact derivatives of a function f4 over a range of values. The script generates a sequence of values x from -5 to 5, calculates the derivatives using two different methods, and then plots the results for comparison.

iris-dataset jupyter-notebook machine-learning matplotlib numpy pandas pyplot python-lambda python3 scikit-learn seaborn seaborn-plots

Last synced: 12 Sep 2025

https://github.com/ngupta23/more

This is a helper package for pandas, visualizations and scikit-learn

data-science helpers pandas python scikit-learn visualization visualizations

Last synced: 24 Oct 2025

https://github.com/mattlianje/climactic

ML ... Highlight creator for e-sports videos using audio analysis.

audio-analysis highlights ml scikit-learn

Last synced: 09 Apr 2025

https://github.com/tanaybhadula/phishing-website-detection

A web application to predict whether a URL/website is phishing or not by extracting its lexical features.

classification descision-tree flask machine-learning pandas phishing-detection python random-forest scikit-learn stacking-classifier svm-classifier xgboost

Last synced: 21 Jul 2025

https://github.com/yaricom/english-article-correction

An experiment in applying NLP to the correction of definite/indefinite articles in an English text corpus

data-analysis glove-vectors nlp nlp-machine-learning numpy pandas scikit-learn umbc-webbase-corpus

Last synced: 05 Apr 2025

https://github.com/ascender1729/iris-flower-classification-2024

An exploratory data analysis and machine learning project using the Iris dataset to classify flower species with a K-Nearest Neighbors classifier. It includes data visualization, feature scaling, model training, and evaluation with 100% accuracy on the test set.

classification data-analysis iris-dataset k-nearest-neighbors machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 20 Jul 2025

https://github.com/zongxr/8th-national-ai-training-competition

AI Trainer event of the 8th National Workers' Vocational Skills Competition

gemma-2b machine-learning matplotlib numpy pandas python scikit-learn seaborn transformers yolov8

Last synced: 09 May 2025

https://github.com/crflynn/sklearn-instrumentation

Generalized scikit-learn machine learning model instrumentation library

instrumentation machine-learning scikit-learn

Last synced: 15 Apr 2025

https://github.com/altescy/nlpstack

📚 NLPSTACK: A Python Library for Natural Language Processing

nlp python pytorch scikit-learn

Last synced: 14 Apr 2025

https://github.com/shreyamalogi/credit-card-fraud-detection-system

A Credit Card Fraud Detection System using Adaboost and Majority Voting, designed to identify fraudulent credit card transactions by combining the strength of multiple classifiers.

adaboost ensemble-learning majority-voting python scikit-learn

Last synced: 11 Apr 2025

https://github.com/cpeoples/powerpredict

🔮 AI-powered Powerball & Mega Millions lottery number prediction using deep learning (Transformer + LSTM), Markov chains, and statistical analysis. Built with TensorFlow/Keras 3.

artificial-intelligence data-science deep-learning keras lottery-prediction lstm machine-learning markov-chain megamillions neural-network powerball python scikit-learn statistical-analysis tensorflow texas-lottery transformer

Last synced: 21 May 2026

https://github.com/kkhushhalr2405/moviereview

ML model to classify user reviews as positive or negative

machinelearning-python natural-language-processing python-3 scikit-learn spacy-nlp

Last synced: 19 Oct 2025

https://github.com/evilpegasus/apstats

A statistical analysis of WA state teacher salaries and SAT scores

data data-science excel matplotlib numpy pandas python scikit-learn statistics

Last synced: 14 Apr 2026

https://github.com/vhnegrisoli/materiais-pos-graduacao

Repository with scripts and notebooks using Python 3 and relational and non-relational databases (Oracle, MongoDB, Redis, Neo4J), as study material for the postgraduate program in Data Science & Big Data at the Pontifícia Universidade Católica de Minas Gerais (PUC-MG)

business-intelligence data-science dataviz jupyter-notebook matplotlib mongodb pandas powerbi python scikit-learn

Last synced: 07 May 2025

https://github.com/senior-sigan/ml_workshop_2017.06.10

Small workshop on solving a Dota 2 competition

kaggle-competition machine-learning scikit-learn

Last synced: 14 Oct 2025

https://github.com/janszewczyk/hands-control-system

Hands Control System (HCS) - an application/system that uses AI/computer vision to control a computer mouse by means of movement and hand gestures.

ai cv2 maschine-learning mediapipe python38 scikit-learn

Last synced: 04 May 2026

https://github.com/lucasmsa/ia-ufpb

🧠 Projects for the Artificial Intelligence course

pandas python scikit-learn

Last synced: 30 Apr 2026

https://github.com/mjahmadee/machinelearning2023

Welcome to the official GitHub repository for the "Machine Learning" course 2023! In this course, we explore the fascinating world of machine learning, diving deep into the algorithms, techniques, and tools that enable computers to learn from data and make intelligent decisions.

machine-learning python scikit-learn

Last synced: 01 May 2025

https://github.com/adityashrm21/adult-income-prediction

An end-to-end data analysis pipeline including model deployment

data-science eda flask heroku logistic-regression r scikit-learn tidyverse

Last synced: 29 Apr 2026

https://github.com/deepmancer/tweet-disaster-detection

Fine-tuned BERT and scikit-learn models for real-time classification of disaster-related tweets, using TensorFlow, Keras, and Transformers.

bert bert-fine-tuning classification fine-tuning huggingface-transformers keras keras-tensorflow natural-language-processing nlp scikit-learn tensorflow tensorflow2 tokenizer transformers

Last synced: 01 Apr 2025

https://github.com/ndgigliotti/cluster-tuner

A GridSearchCV-like hyperparameter tuner for clustering algorithms.

clustering gridsearchcv hyperparameter-tuning parameter-search scikit-learn scikit-learn-compatible unsupervised

Last synced: 03 Feb 2026

https://github.com/praveen1664/solved-python-machine-learning-book-book-1st-edition

Solved problems from a famous book on machine learning and deep learning, for learners

logistic-regression machine machine-learning model-evaluation nbviewer scikit-learn

Last synced: 20 May 2026

https://github.com/punitkumar4871/fake_news_prediction

A simple Jupyter Notebook project 📓 to classify news articles as 🧐 Fake or ✅ Real using machine learning.

matplotlib pandas python scikit-learn

Last synced: 15 Feb 2026

https://github.com/pxlairobotics/anr_ml_docker

This repository contains the necessary elements (code and artifacts) to build a Machine Learning container suitable to execute all the ML exercises of the AnR course.

fastai-v07 flask juypter keras machine-learning ml numpy pandas scikit-learn scipy tensorflow

Last synced: 30 Mar 2025

https://github.com/ashwin-rajeev/face-recognition

A simple face recognition system using scikit-learn and OpenCV

face-detection face-recognition machine-learning python scikit-learn

Last synced: 20 May 2026

https://github.com/samarthgarge/30-days-of-datascience

📊 30 Days of Data Science is a daily challenge to guide you through Data Science essentials. From basics to advanced, this repo offers clear examples, practical exercises, and resources to help you master Data Science, one day at a time. Whether you're new or refining your skills, this challenge has something for you. Join the journey now! 🚀

data-science decision-trees exploratory-data-analysis hyperparameter-tuning joblib k-means-clustering linear-regression logistic-regression machine-learning natural-language-processing numpy pandas pipeline pipelines python3 scikit-learn seaborn time-series-forecasting

Last synced: 10 Apr 2025

https://github.com/xuefeng-xu/fedps

Federated data Preprocessing via aggregated Statistics

data-preprocessing federated-learning python scikit-learn statistics

Last synced: 26 Jun 2025

https://github.com/francescopaolol/sentimentanalysis

Sentiment analysis on the IMDB Dataset of 50K Movie Reviews

jupyter-notebook kaggle machine-learning ml pandas scikit-learn sentiment-analysis

Last synced: 18 Apr 2026

https://github.com/chinaskidev/ml-prediccion-lluvia-brazil

MLOps using Docker, Airflow, TensorFlow, and Streamlit

python3 scikit-learn streamlit tensorflow

Last synced: 22 Apr 2025

https://github.com/alexeyev/hse-spb-bigdata-python-fall2016

Materials for a course on programming and data analysis tools, taught at the St. Petersburg campus of HSE University (NRU HSE) in fall 2016

course-materials data-analysis numpy pandas python scikit-learn sklearn

Last synced: 07 Apr 2026

https://github.com/karenwky/predictive_modeling_hong_kong_horse_racing

Predict the winning horse with supervised machine learning models (lucky to reach 100% accuracy on a small test set)

joblib knn-classifier lightgbm-models pandas predictive-modeling scikit-learn

Last synced: 10 Apr 2025

https://github.com/learninghouseservice/learninghouse

The LearningHouse Service provides machine learning algorithms based on the scikit-learn python library as a RESTful API. Its purpose is to offer smart home enthusiasts an easy way to teach their homes.

angular fastapi rest-api scikit-learn smarthome smarthome-api smarthome-app

Last synced: 01 May 2025

https://github.com/rvandewater/recipies

🥧 Easily define reproducible preprocessing steps for ML on Polars and Pandas dataframes.

data-science machine-learning pandas polars python scikit-learn tidymodels

Last synced: 18 Aug 2025

https://github.com/satyamtripathi8/handwritten_digit_classifier

Explore this repository for a CNN-based handwritten digit classification project. Utilizes TensorFlow/Keras to train and evaluate models, providing a practical example of deep learning in image recognition.

cnn-for-visual-recognition cnn-keras mnist pillow scikit-learn tensorflow

Last synced: 27 Feb 2025

https://github.com/pratham-verma/web_application_firewall

This project presents a powerful Web Application Firewall (WAF) designed to protect web applications from malicious activities. By leveraging machine learning algorithms, the WAF efficiently filters and detects potentially harmful requests before they reach the website, ensuring robust security.

burpsuite http-handling juypter-notebook logestic-regression machine-learning pandas proxy-server python scikit-learn supervised-learning web-security zaproxy

Last synced: 15 Apr 2025

https://github.com/reekrajroy/selflearning_chatbot

Self-learning chatbot using Python.

python scikit-learn

Last synced: 10 Apr 2026

https://github.com/nazchanel/fake-news-detection-webapp

A Flask webapp that detects fake news with a given text input using the power of Natural Language Processing. Deployment on Heroku failed due to the program's large memory consumption.

data-science dataset keras keras-tensorflow machine-learning natural-language-processing nlp nlp-machine-learning python scikit-learn tensorflow

Last synced: 06 Mar 2026

https://github.com/jrbourbeau/madpy-ml-sklearn-2018

MadPy supervised machine learning talk

machine-learning python scikit-learn

Last synced: 04 Sep 2025

https://github.com/ascender1729/quantumwaste

A Quantum-Inspired Molecular Recycling Simulator utilizing quantum algorithms and machine learning to optimize polymer recycling.

flask machine-learning polymer-recycling quantum-algorithms quantum-computing reactjs scikit-learn sustainable-technology threejs

Last synced: 24 Jan 2026

https://github.com/llamzonamazon/pc-data-dash

An interactive dashboard for PlanCatalyst’s redesigned website forecasting country-level development.

azure-blob-storage azure-container-instances azure-container-registry azure-functions azure-logic-apps numpy pandas power-bi python restful-api scikit-learn

Last synced: 14 Feb 2026

https://github.com/smola/fastcountvectorizer

FastCountVectorizer is a faster alternative to scikit-learn CountVectorizer.

natural-language-processing python scikit-learn

Last synced: 09 Mar 2026

https://github.com/rudolfwilliam/torch-kde

A differentiable implementation of kernel density estimation in PyTorch.

kernel-density-estimation python pytorch scikit-learn

Last synced: 31 Oct 2025

https://github.com/rjlovespy/house-price-predictor

A Tkinter GUI whose predictions are based on an ML model trained with a Random Forest Regressor

cx-freeze gui-development jupyter-notebook machine-learning-models numpy pandas py-to-exe python random-forest-regression scikit-learn tkinter-gui

Last synced: 19 Apr 2025

https://github.com/monasri001/ai-based-job-recommendation-system

An AI-powered job recommendation system leveraging machine learning and MongoDB to match users with suitable job opportunities based on skills, experience, and locations.

ai automation-recommendation-engine data-science job-recommendation-system machine-learning mongodb naive-bayes python scikit-learn

Last synced: 30 Oct 2025

https://github.com/bkamapantula/discover-workshop

Code search utility to assist developer workflows via code discovery. Currently uses tf-idf estimator.

developer-tools pycon python scikit-learn tf-idf

Last synced: 18 Oct 2025

https://github.com/iterative/sagemaker-pipeline

An example project, showcasing a DVC pipeline using SageMaker SDK for data preparation and model training

dvc dvc-pipeline example sagemaker scikit-learn xgboost

Last synced: 18 Jun 2025

https://github.com/pr38/numbadecisiontrees

A novel Numba-based recreation of scikit-learn's decision tree algorithm

decision-tree decision-trees machine-learning numba python scikit-learn

Last synced: 02 Sep 2025

https://github.com/binste/pipecutter

pipecutter provides a few tools for luigi such that it works better with data science libraries and environments such as pandas, scikit-learn, and Jupyter notebooks.

jupyter-notebook luigi luigi-targets luigi-tasks pandas scikit-learn

Last synced: 14 Jan 2026

https://github.com/tma15/bunruija

A text classification toolkit

neural-networks pytorch scikit-learn text-classification

Last synced: 26 Oct 2025

https://github.com/gsganden/model_inspector

A uniform interface to a curated set of methods for inspecting machine learning models

data-science machine-learning scikit-learn visualization

Last synced: 26 Oct 2025

https://github.com/verlias/melodymatch-codefest2024

MelodyMatch combines genre filtering and audio analysis techniques to offer personalized music recommendations, implemented with Python's scikit-learn, Flask, and React.

flask python reactjs scikit-learn

Last synced: 02 Jul 2025

https://github.com/alexandregazagnes/scikit-transformers

Very useful package that provides custom transformers such as LogColumnTransformer, BoolColumnTransformers, and other fancy transformers.

data data-science log python scikit-learn transformer

Last synced: 26 Oct 2025

https://github.com/reverendbayes/bmw-mhd-log-anomaly-detector

BMW MHD log anomaly detector — a damn useful tool for tuners and enthusiasts. Detects log anomalies using Isolation Forests trained on real driving data with binary features for AFR spikes, throttle faults, and RPM anomalies.

afr-analysis anomaly-detection automotive bmw car-logs data-driven-diagnostics diagnostics ecu-logging engine-tuning isolation-forest machine-learning mhd performance-monitoring python scikit-learn signal-processing throttle-analysis unsupervised-learning vehicle-data

Last synced: 10 Mar 2026

https://github.com/ptyadana/ml-music-recommender

Machine Learning Project for recommendations of music genre based on age and gender

graphviz graphviz-dot joblib jupyter-notebook machine-learning machinelearning-python pandas python3 scikit-learn

Last synced: 12 Apr 2025

https://github.com/kr1shnasomani/bloodprint

Blood group detection from fingerprint using TensorFlow and PyTorch.

computer-vision deep-learning neural-network numpy pypi pytorch scikit-learn tensorflow

Last synced: 12 Apr 2025

https://github.com/obirikan/cellphone-price-prediction

This project uses Linear Regression to predict cellphone prices based on various features such as resolution, RAM, battery capacity, and more. The best model is selected through multiple training iterations and saved for future use.

machine-learning matplotlib pandas scikit-learn

Last synced: 16 May 2025

https://github.com/pavanvaranasi02/virtual-air-writing-and-mouse

Introducing an innovative virtual air mouse system for seamless computer interaction. Users control devices with hand gestures, while advanced features include air writing, text-to-speech, and easy text saving. Our project enhances usability, efficiency, and the digital experience.

mediapipe opencv-python pyautogui python3 scikit-learn tensorflow

Last synced: 12 Apr 2025

https://github.com/omanshu209/ann-classifier-hub

Artificial Neural Network (ANN) 🧠 classification models developed using the PyTorch and Scikit-Learn libraries for Python 🐍!

ann artificial-intelligence artificial-neural-networks deep-learning jupyter-notebook machine-learning ml mlp-classifier neural-network nn pandas python3 pytorch pytorch-ann scikit-learn sklearn

Last synced: 27 Jul 2025

https://github.com/sunnyadn/comprisk

Python toolkit for competing risks: forest (RSF) today; Fine-Gray + Aalen-Johansen + Gray's test + cause-specific Cox in v0.4. Scales to n=10⁶ in ~1 min, 10–22× faster than randomForestSRC on real EHR data, sklearn-compatible.

biostatistics competing-risks machine-learning numba python random-forest random-survival-forest scikit-learn survival-analysis

Last synced: 23 May 2026

https://github.com/kshula/kwachaforexapp

Kwacha Forex App built with Streamlit and powered by scikit-learn machine learning

numpy pandas scikit-learn streamlit

Last synced: 01 Mar 2026

https://github.com/lenoben/review-detention-model

A simple Python ML model trained with scikit-learn on review datasets to predict whether a given input is positive or negative.

machine-learning nlp python scikit-learn sentiment-analysis

Last synced: 06 Sep 2025

https://github.com/katharineshapcott/rank-similarity

Rank Similarity is a set of nonlinear classification and transform tools for large datasets.

classification machine-learning nonlinear scikit-learn transformer

Last synced: 24 Oct 2025

https://github.com/zackakil/iris-dataset-3d-marbles

What if every row/flower from the Iris dataset was represented as a marble in a physics simulation? Using Scikit-Learn with Blender to render the iris dataset in 3d and create a physical simulation of a marble machine to classify the dataset.

3d blender data-visualization flower iris-dataset machine-learning physics-simulation scikit-learn

Last synced: 02 May 2026

https://github.com/g0bel1n/tinyautoml

TinyAutoML is a comprehensive pipeline classifier project designed as a scikit-learn plugin

automl-pipeline machine-learning scikit-learn

Last synced: 21 Jul 2025

https://github.com/bethanyjep/manipulating-and-cleaning-talk

Gold Ambassador and MSFT Reactor Tel Aviv talk on Data Cleaning and Manipulation for Student Ambassadors

matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Sep 2025

https://github.com/srinivasrm/mutual-funds-analysis-and-prediction

In this project I performed analysis and prediction of 1-, 3-, and 5-year returns for 1064 mutual funds in India. I scraped the data from the most visited website for mutual fund investments. I tested the following regression models: linear model, SGD Regressor, Random Forest Regressor, Decision Tree Regressor, Ridge, MLP Regressor, and linear model (Lasso). I then selected the best-performing model, performed hyperparameter tuning, and deployed an interactive application that can generate the visualization and send it by email to the user's email address.

beautifulsoup data-analysis data-base data-cleaning data-science deployment etl finanace frontend funds machine-learning mutual mutual-funds pgsql python scikit-learn sql streamlit web webapplication

Last synced: 27 Oct 2025

https://github.com/nguyenanht/john-toolbox

This is my own toolbox to explore data science

data-science machine-learning pipeline python pytorch scikit-learn

Last synced: 10 Apr 2025

https://github.com/luca-parisi/m_arcsinh

m-arcsinh: A Reliable and Efficient Function for Supervised Machine Learning (scikit-learn, TensorFlow, and Keras) and Feature Extraction (scikit-learn)

activation activation-function activation-functions activations arcsinh classification dimensionality-reduction feature-extraction keras keras-tensorflow machine-learning machinelearning mlp mlp-classifier neural-network python scikit-learn svm svm-classifier tensorflow

Last synced: 28 Jun 2025

https://github.com/farahibrar/programming-in-python

Explore a comprehensive collection of Python programming for diverse data analysis and data science projects. This repository covers data exploration, visualization, statistical analysis, machine learning, NLP, and model deployment. Perfect for enthusiasts looking to delve into practical examples and advanced techniques.

beautifulsoup dataanalysis docker flask folium jupyter-notebook machine-learning matplotlib nltk numpy pandas python pytorch scikit-learn scikitlearn scipy seaborn spacy statsmodels tensorflow

Last synced: 16 Jul 2025

https://github.com/jameschapman19/scikit-prox

A package for fitting regularized models from scikit-learn via proximal gradient descent

proximal-gradient-descent regularization scikit-learn scikit-learn-api

Last synced: 10 Apr 2025

https://github.com/queirozfcom/python-ds-util

Collection of useful helper methods for interactive data science work in Python, usually in Jupyter notebooks, using the basic Python scientific stack.

data-science matplotlib pandas python scikit-learn utils

Last synced: 19 Oct 2025

https://github.com/nicofilips/cs50ai-harvard

Harvard University Online Course | CS50-AI | Artificial Intelligence with Python | Project Solution

artificial-intelligence harvardcs50 natural-language-processing neuronal-network nltk python scikit-learn tensorflow

Last synced: 15 Apr 2025

https://github.com/brkcvlk/mlfcrafter

ML Pipeline Automation Tool - Chain together data processing, model training, and deployment with minimal code. Build production-ready ML workflows in minutes, not hours.

ai automation automl beginner-friendly data-processing data-science framework library machine-learning ml-framework mlops model-training pipeline production-ready python python3 scikit-learn toolkit workflow xgboost

Last synced: 14 Jan 2026