An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/akhundmuzzammil/energyconsumptionprediction

This repository contains code and resources for training a linear regression model to predict energy consumption based on various building parameters.

data-analysis energy-consumption linear-regression machine-learning python scikit-learn streamlit visualization

Last synced: 18 Apr 2026

https://github.com/simrandalal/semantic-book-recommender

A semantic content-based book recommender using sentence-transformer embeddings, cosine similarity, and a Streamlit interface.

dotenv huggingface-transformers nlp-machine-learning pandas python scikit-learn similarity-search streamlit

Last synced: 05 Apr 2026

https://github.com/taqsblaze/hush

Hush: A lightweight, context-aware text toxicity classifier. It leverages NLP and Random Forest ensemble learning to detect and mitigate harmful language in real time. Built for efficiency, safety, and cleaner digital communication.

content-moderation machine-learning nlp random-forest safety-tools scikit-learn text-classification toxicity-detection

Last synced: 05 Apr 2026

https://github.com/malick08012/heart-disease-prediction

A machine learning project that predicts the risk of heart disease based on patient health data. Includes data cleaning, EDA, visualization, model training, evaluation, and feature importance analysis.

artificial-intelligence heartdisease-prediction logistic-regression machine-learning python scikit-learn

Last synced: 18 Apr 2026

https://github.com/naren1704/ml-approach-for-employee-performance-prediction

A Flask UI that predicts employee performance using a trained XGBoost model.

css flask html python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/nowon1/insurance-claim-prediction_version

This project aims to predict insurance claim amounts from various customer attributes using machine learning techniques. The project involves data preprocessing, exploratory data analysis, feature engineering, and model training and evaluation.

data-preprocessing data-science data-visualization exploratory-data-analysis feature-engineering insurance jupyter-notebook machine-learning numpy pandas predictive-modeling python random-forest regression-analysis scikit-learn

Last synced: 05 Apr 2026

https://github.com/kaladabrio2020/machine-learning-with-pytorch-and-scikit-learn

Progress through the book Machine Learning with PyTorch and Scikit-Learn.

deep-learning implementation machine-learning python3 pytorch scikit-learn

Last synced: 20 Apr 2026

https://github.com/dahsie/spam_classification

This was my first NLP project, in which I performed spam detection using embedding algorithms to encode my texts. I used Random Forest and Multi-Layer Perceptrons for the classification phase, which yielded accuracies of 97% and 98%, respectively. I also learned to document my code with Sphinx.

doc2vec fasttext-embeddings gensim glove-embeddings python scikit-learn sphinx-doc word2vec-algorithm

Last synced: 20 Apr 2026

https://github.com/tr-3n/-ai-powered-resume-analyzer-multi-source-job-matcher

AI-Powered Resume Analyzer & Multi-Source Job Matcher is a web application built using Python and Streamlit that helps job seekers find the best job opportunities based on their resume. The app extracts text from uploaded resumes, matches it with job listings from multiple sources, and displays the most relevant jobs.

ai api html-css job job-recommendation job-search jobmatching natural-language-processing pandas pypdf2 python resume-analyzer scikit-learn streamlit web-development

Last synced: 20 Apr 2026

https://github.com/chdl17/lead-score-case-study

Lead scoring is the process of assigning a numerical value or score to each lead, based on factors such as demographics and behavior, to determine their potential value as customers.

machine-learning-algorithms matplotlib-pyplot python scikit-learn

Last synced: 20 Apr 2026

https://github.com/adityapradhan202/binge-trend

Media and entertainment recommendation website with an AI-powered recommendation system.

datascience-machinelearning natural-language-processing python scikit-learn spacy-nlp

Last synced: 21 Apr 2026

https://github.com/sayan-mondal2022/mlops-assignment

A project for validating machine learning models.

machine-learning scikit-learn streamlit

Last synced: 22 Apr 2026

https://github.com/jawwad-fida/data-science-salary-estimator

A tool that estimates data science salaries (MAE ~ $11K) to help data scientists negotiate their income when they get a job.

data-science machine-learning project scikit-learn

Last synced: 25 Apr 2026

https://github.com/toscdom/spam_detection

This repository contains a project focused on analyzing and classifying emails to detect spam. It includes training a machine learning classifier for spam detection, identifying key topics in spam emails using NLP techniques, and calculating semantic distances to evaluate topic similarity. Tools used include Python libraries such as NLP frameworks.

classifier nlp nltk scikit-learn semantic-analysis spam-detection

Last synced: 27 Apr 2026

https://github.com/tillscode/personal-finance-ml-analysis

Machine learning analysis of personal financial data with predictive modeling and interactive dashboard

dashboard data-analysis finance machine-learning python scikit-learn

Last synced: 28 Apr 2026

https://github.com/serdaraydem1r/10dayaichallenge101

During the 10-day camp, we learned the basics of machine learning through hands-on coding.

artificial-intelligence machine-learning-algorithms model-evaluation-and-selection scikit-learn

Last synced: 28 Apr 2026

https://github.com/dwade-eng/amazon-product-recommender-prototype-

This project is a content-based product recommendation engine inspired by Amazon's "Customers who viewed this item also viewed" feature. It uses a dataset of product metadata and user interactions to suggest similar items based on product titles, brands, and categories using TF-IDF vectorization and cosine similarity.

html numpy pandas python3 scikit-learn

Last synced: 28 Apr 2026
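The TF-IDF and cosine-similarity approach named in the description above can be sketched in a few lines. This is a minimal standard-library illustration, not code from the repository: the function names (`tfidf_vectors`, `cosine`, `most_similar`), the whitespace tokenizer, the smoothed IDF formula, and the sample product titles are all assumptions made for the example. A real project would more likely rely on scikit-learn's `TfidfVectorizer`.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in docs]  # naive tokenizer (assumption)
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF so that terms present in every document keep a nonzero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, docs, top_n=3):
    """Indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    scores.sort(key=lambda pair: pair[0], reverse=True)
    return [i for _, i in scores[:top_n]]


# Hypothetical product titles, used only to exercise the functions.
titles = [
    "acme wireless mouse",
    "acme wireless keyboard",
    "stainless steel water bottle",
]
recommended = most_similar(0, titles, top_n=1)
```

Items that share no terms with the query score exactly zero, which is why content-based recommenders of this kind usually combine several text fields (title, brand, category) into one document per product.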

https://github.com/emmanuelletocs/steam-game-recommender

A powerful recommendation system for Steam games, combining Content-Based and Collaborative Filtering techniques. Built with Python, Scikit-learn, and Streamlit to deliver accurate, real-time game recommendations. Perfect for gamers and data scientists interested in building intelligent recommendation engines.

als-algorithm data-analysis gaming-industry knn machine-learning mds mysql ncf neural-network pyspark recommendation-engine recommendation-system scikit-learn spark

Last synced: 28 Apr 2026

https://github.com/akash-47-tank/predictive-customer-churn-analyzer

A professional-grade customer churn prediction system that not only predicts customer churn but also provides clear explanations for the predictions. Built with Python, XGBoost, and SHAP.

machine-learning pandas python scikit-learn shap streamlit xgboost

Last synced: 28 Apr 2026

https://github.com/catcoder27/ai-portfolio

Reusable ML scaffold: notebooks, model cards, reports

data-science kaggle machine-learning pandas scikit-learn

Last synced: 28 Apr 2026

https://github.com/incalculable-driverslicence975/data-projects-portfolio

📊 Showcase data projects that highlight analytics, machine learning, and MLOps with reproducible code and clear business insights.

ai computer-vision dashboard data-science-projects data-visualization deep-learning etl excel finance hadoop hiveq keras machine-learning nlp pandas portfolio-project scikit-learn tableau-dashboards

Last synced: 28 Apr 2026

https://github.com/rakibhhridoy/customersegmentation-clustering

Customer segmentation is heavily used in business. It is a necessary skill for business intelligence and applied machine learning engineers. This repository shows a fairly basic way to do customer segmentation. In Python, the task is quite easy.

agglomerative-clustering clustering-algorithm customer ecommerce kmeans-clustering machine-learning scikit-learn scikitlearn-machine-learning segmentation unsupervised-learning unsupervised-machine-learning

Last synced: 28 Apr 2026

https://github.com/arizdn234/spotify-api-with-colab

Crawling, analyzing, and clustering music data from the Spotify API.

machile-learning scikit-learn spotify-api spotipy-library

Last synced: 28 Apr 2026

https://github.com/alessine/predicting_pirate_attack_success

Using machine learning to predict the success or failure of pirate attacks; developed during the Data Science Bootcamp at Propulsion Academy.

bokeh fine-tuning interactive-visualizations machine-learning modelling overfitting plotly prediction scikit-learn

Last synced: 28 Apr 2026

https://github.com/skypse/santander-coders-data_science-course

Data Science course, offered by Santander, using Python!

jupyter-notebook numpy pandas-python python scikit-learn

Last synced: 29 Apr 2026

https://github.com/jarif87/text-key-extractor

A Django web app that uses TF-IDF to extract keywords from text, featuring a modern, responsive UI with animated gradients and glassmorphism.

django-application keywords-extraction pandas python scikit-learn

Last synced: 29 Apr 2026

https://github.com/m-muecke/text-normalizer

Text normalizer integration for the sklearn.pipeline.Pipeline class.

nlp nltk python scikit-learn

Last synced: 29 Apr 2026

https://github.com/gustaminas/ai_primer---flatland

A project from the AI_primer course at Vilnius University.

cnn-keras data-augmentation data-mixup dropout-keras scikit-learn shape-classification

Last synced: 29 Apr 2026

https://github.com/saikumar787/car_price_prediction_using_linear-regression

A machine learning project to predict the selling price of used cars using regression techniques. Includes data preprocessing, model training, evaluation, and testing on new data.

car-price-prediction-with-machine-learning data-analysis joblib jupiter-notebook linear-regression-models model-deployment python scikit-learn standardscaler

Last synced: 29 Apr 2026

https://github.com/matheusvazdata/retail-sales-forecast-linreg-sklearn

Minimal project for retail sales forecasting using linear regression (scikit-learn).

forecasting linear-regression machine-learning matplotlib numpy pandas scikit-learn

Last synced: 29 Apr 2026

https://github.com/shahzadmustafa15/credit-card-fraud-detection

Credit card fraud detection using Random Forest with Stratified K-Fold cross-validation and F1-score evaluation.

classification confusion cross-validation f1-score fraud-detection imbalanced-data kaggle machine-learning python random-forest scikit-learn

Last synced: 29 Apr 2026
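The evaluation scheme named in the description above, stratified folds scored with F1, can be illustrated without any model at all. The sketch below is a standard-library stand-in, not the repository's code and not scikit-learn's `StratifiedKFold` or `f1_score`: the function names, the round-robin fold assignment, and the toy labels are assumptions for illustration. It shows why stratification matters on imbalanced data: every fold keeps at least some of the rare positive class.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Split sample indices into k folds with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin so every fold gets its share.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def f1_score(y_true, y_pred, positive=1):
    """F1 for the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy imbalanced labels (hypothetical): 8 legitimate transactions, 2 fraudulent.
labels = [0] * 8 + [1] * 2
folds = stratified_folds(labels, k=2)

# One true positive, one false positive, one false negative.
score = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

In a full pipeline each fold would serve once as the held-out set while a classifier is trained on the remaining folds, and the per-fold F1 scores would be averaged.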

https://github.com/hexbyte-lab/resumatch

AI-powered resume-to-job matching tool with NLP analysis | Python + Flask + Machine Learning

cosine-similarity flask job-search machine-learning nltk portfolio-project python resume scikit-learn tfidf

Last synced: 29 Apr 2026

https://github.com/jarif87/dna-based-identification-of-e.coli

Django web app predicting E. coli in DNA sequences using a machine learning model, with a responsive interface and client-side validation. Files generated by project.py.

classification django-application dna-sequences html-css-javascript mlp-classifier python3 scikit-learn

Last synced: 29 Apr 2026

https://github.com/rishi-sutar/healwise-ai-your-way-to-wellness

Healwise-AI is a health diagnostic tool that uses a Support Vector Classifier (SVC) model to predict diseases based on user-reported symptoms. After predicting, it offers detailed health advice, including descriptions, diets, medications, and workouts related to the diagnosis.

machine-learning scikit-learn support-vector-machine

Last synced: 30 Apr 2026