Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/jofaval/iris-flowers

Multiclass classification of the famous Iris flowers dataset, introduced by Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 21 Oct 2024

https://github.com/rakibhhridoy/appliedmachinelearninghousing-regression

Uses the Housing dataset, which contains information about houses in Boston. The data was originally part of the UCI Machine Learning Repository and has since been removed from it. According to the project, it can also be accessed through the scikit-learn library. The objective is to predict house prices from the given features.

deep-learning housing-market housing-prices machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 06 Nov 2024

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 21 Oct 2024

https://github.com/khanovico/python-stock-analyzer

A web app implemented in Python with several data science frameworks, enabling online stock trend analysis.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/pierrekieffer/datapreprocessing

Custom data preprocessing library made for machine learning

data-preparation data-preprocessing machine-learning preprocessing scikit-learn

Last synced: 26 Oct 2024

https://github.com/markdouthwaite/py-lingo

Utilities for helping you deploy Scikit-Learn models in Go (with lingo!)

hdf5 linear-models scikit-learn

Last synced: 11 Oct 2024

https://github.com/zsailer/skspline

A Scikit-learn interface on Scipy's spline.

scikit-learn scipy

Last synced: 05 Nov 2024

https://github.com/mkdirer/depression-data-analysis

This project analyzes a Kaggle depression dataset using data preprocessing, clustering, classification, and outlier detection techniques. Python libraries like pandas, numpy, matplotlib, seaborn, and scikit-learn are used to extract insights.

classification clustering matplotlib numpy pandas scikit-learn seaborn vizualization

Last synced: 10 Oct 2024

https://github.com/bilgenurbekar/turkishcyberbullying

Contains fine-tuned BERT models and results in the text classification category using Turkish social media data

bert-fine-tuning huggingface-transformers matplotlib numpy pandas python pytorch scikit-learn transformers

Last synced: 10 Oct 2024

https://github.com/scikit-learn/pairwise-distances-reductions-asv-suite

A dedicated asv suite for scikit-learn private PairwiseDistancesReductions

asv benchmarks cython scikit-learn

Last synced: 29 Oct 2024

https://github.com/sergeimakarovv/energy-data-analytics-ml

Analyzing global data on sustainable energy, predicting CO2 emissions per capita

machine-learning pandas plotly python scikit-learn streamlit

Last synced: 10 Oct 2024

https://github.com/alexsolov28/ml_course

Course "Machine Learning Technology", bachelor's program, 6th semester

colab-notebooks jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/oneapi-src/customer-churn-prediction

AI Starter Kit for customer churn prediction using Intel® Extension for Scikit-learn*

machine-learning scikit-learn

Last synced: 05 Nov 2024

https://github.com/hokagem/damagedlogginganalyzer

A project analyzing statistics on damaged logging (wood) in Germany using Python.

analysis csv csv-parser k-fold-cross-validation numpy pandas pandas-dataframe pandas-python polynomial-regression scikit-learn statistics wood

Last synced: 20 Oct 2024

https://github.com/heyitsjoealongi/fantasy-football-qbwr-model

Fantasy Football: Quarterback / Wide Receiver - Gaussian Process Regression (GPR) Machine Learning Model

machine-learning matplotlib model numpy python scikit-learn

Last synced: 26 Oct 2024

https://github.com/chaitanya1436/student_performance_analysis

A project focused on analyzing college student performance using data on department, assessment scores, and performance labels. Implemented in Google Colab, the analysis includes data preprocessing, feature scaling, and exploratory data analysis to uncover insights and prepare the data for further analysis or modeling.

ata-preprocessing data-preparation exploratory-data-analysis feature-scaling google-colab numpy pandas scikit-learn

Last synced: 03 Nov 2024

https://github.com/NoName115/Bachelor-thesis

Bachelor thesis - Determination of Gun Type and Position in Image Scene

bachelor-thesis classification computer-vision fit gun keras machine-learning scikit-image scikit-learn vut

Last synced: 23 Oct 2024

https://github.com/ccastleberry/hands_on_machine_learning

Notebooks and files created while working through the book Hands on Machine Learning

data-science jupyter-notebook scikit-learn tensorflow

Last synced: 28 Oct 2024

https://github.com/stella4444/linear-regression

Learning about linear regression and working with data (currently a work in progress)

linear-regression machine-learning numpy scikit-learn

Last synced: 03 Nov 2024

https://github.com/rickcontreras/modelos1

Classification model to predict student performance on the Saber Pro tests in Colombia. Includes exploratory data analysis, preprocessing, and machine learning models.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 10 Oct 2024

https://github.com/adi3042/thyroid-disease-detection

🔍🌟 Discover Thyroid Disease Detection! Dive into our advanced system designed to identify and predict thyroid disorders using cutting-edge machine learning techniques. Leverage our comprehensive models and data analysis tools to make informed decisions about thyroid health. 🩺🔬🚀 ThyroidHealthTech

classification css detection-model functools html ipykernel javascript jupyter-notebook machine-learning matplotlib numpy pandas python3 scikit-learn setuptools thyroid-dataset thyroid-disease thyroid-disease-detection venv

Last synced: 13 Oct 2024

https://github.com/alaazameldev/text-based-search-engine

Implementation of a search engine using TF-IDF and Word Embedding-based vectorization techniques for efficient document retrieval

chromadb fastapi gensim-word2vec nltk numpy precision-recall python scikit-learn tf-idf-vectorizer

Last synced: 10 Oct 2024

https://github.com/tamk-kol/project_orbital_data_analysis

The goal of this project is to develop an automatic method to detect orbital maneuvers using machine learning.

matplotlib numpy pandas scikit-learn

Last synced: 31 Oct 2024

https://github.com/sckonung/crab-age-regression

ML regression model for a crab age dataset competition on Kaggle

keras machine-learning pandas python scikit-learn tensorflow

Last synced: 03 Nov 2024

https://github.com/ismaelvr1999/air-quality-clustering

This project focuses on analyzing air quality data and categorizing it into clusters using the K-Means algorithm.

jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/allanreda/automated-k-means-clustering-engine

An interactive K-Means clustering tool built with Flask and Scikit-Learn, supporting Excel file uploads, cluster analysis, and data export, deployed on Google Cloud Run via Docker with CI/CD integration.

cicd css data-visualization deployment docker flask google-cloud-run html javascript k-means-clustering machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/bilalm04/email-spam-classifier

A machine learning project that classifies emails as spam or not spam using Logistic Regression, with a deployable Flask API for real-time classification.

api flask jupyter-notebook machine-learning matplotlib nlp numpy pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 03 Nov 2024

https://github.com/idaraabasiudoh/svm_cell_classification

This repository contains code for classifying cell samples using Support Vector Machine (SVM) with Scikit-learn.

machine-learning python3 scikit-learn svm-classifier

Last synced: 02 Nov 2024

https://github.com/sergeimakarovv/ml-powerlifting

Predicting the weight lifted by athletes using machine learning

machine-learning pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/oroszgy/cookiecutter-ml-flask

Cookiecutter template for training and serving machine learning models with scikit-learn, spacy, Flask and Docker

docker flask flask-application machine-learning nlp rest-api scikit-learn spacy

Last synced: 19 Oct 2024

https://github.com/mohd-faizy/preprocess_ml

This repository hosts Python code that utilizes the Scikit-learn preprocessing API for data preprocessing. The code presents a comprehensive range of tools that handle missing data, scale data, encode categorical variables, and perform other functions.

data-science feature-engineering feature-engineering-algorithm feature-extraction feature-selection machine-learning outlier-detection preprocessing-data preprocessor scikit-learn

Last synced: 11 Oct 2024

https://github.com/r-gg/ml-37

Amazon Reviews ~ Sentiment analysis evaluation: fine-tuned BERT vs LSTM. (+ Extensive Data Mining & Visualization)

bert deep-learning ipynb-jupyter-notebook lstm machine-learning python scikit-learn uni-project

Last synced: 11 Oct 2024

https://github.com/the-developer-306/house-price-predictor

House Price Predictor: Harnessing machine learning algorithms to forecast housing prices in Boston, empowering buyers and sellers with accurate predictions based on key factors like location, crime rate, rooms, accessibility, and more.

csv ipynb-jupyter-notebook joblib matplotlib numpy pandas python scikit-learn

Last synced: 11 Oct 2024

https://github.com/chris-santiago/tsfeast

A collection of Scikit-Learn compatible time series transformers and tools.

data-science feature-engineering python scikit-learn time-series timeseries-features transformers

Last synced: 27 Oct 2024

https://github.com/abdullahashfaq-ds/sms-spam-detection

A machine learning application designed to classify SMS messages as spam or non-spam, offering real-time analysis to identify potentially harmful content.

css3 docker flask html5 javascript matplotlib nltk numpy pandas python scikit-learn seaborn tailwindcss xgboost

Last synced: 29 Oct 2024

https://github.com/henriqueotogami/imersao-dados-3-alura

Third edition of Alura's Imersão Dados (3 to 7 May 2021). This edition's project was inspired by a challenge from the Laboratory for Innovation Science at Harvard, made available on Kaggle.

alura bioinformatics data-science drug-discovery google-collab harvard-university imersaodados jupyter-notebook kaggle-challenge laboratory-innovation-science matplotlib pandas python3 scikit-learn seaborn

Last synced: 05 Nov 2024

https://github.com/aneeshmurali-n/ann-diabetes-prediction

Predicting diabetes progression using an Artificial Neural Network (ANN). This project leverages the scikit-learn diabetes dataset for training and evaluation. Includes data preprocessing, model building, and performance visualization.

ann data-preprocessing data-visualization deep-learning diabetes-prediction exploratory-data-analysis keras machine-learning matplotlib neural-network numpy pandas regression scikit-learn seaborn tensorflow visualization

Last synced: 31 Oct 2024

https://github.com/brenofariasdasilva/dagster-education-model

Dagster Education Model using Dagster 1.3.11 and Python 3.7.17.

dagster makefile matplotlib pandas pyenv python3 scikit-learn seaborn shellscript

Last synced: 16 Oct 2024

https://github.com/oneapi-src/predictive-asset-health-analytics

AI Starter Kit for Predictive Asset Maintenance using Intel® optimized version of XGBoost

machine-learning scikit-learn

Last synced: 05 Nov 2024

https://github.com/messierandromeda/sentiment-analysis

Sentiment analysis with the IMDB movie review dataset.

imdb-dataset python scikit-learn sentiment-analysis

Last synced: 10 Oct 2024

https://github.com/marconicivitavecchia/stazione-monitoraggio-ambientale

MicroPython code for the ESP32, written for a course our school runs for teachers on building an environmental monitoring station. The course covers Python, IoT, and artificial intelligence.

ai esp32 micropython micropython-esp32 python school-project scikit-learn

Last synced: 05 Nov 2024

https://github.com/darenr/gradientboostingmachines

Notebooks exploring strengths and weaknesses of GBM based classifiers

jupyter-notebook lightgbm pandas scikit-learn xgboost

Last synced: 23 Oct 2024

https://github.com/rakibhhridoy/visualmachinelearning-yellowbrick

Yellowbrick wraps scikit-learn and matplotlib to create publication-ready figures and interactive data explorations. It is a diagnostic visualization platform for machine learning that lets us steer the model selection process by helping to evaluate the performance, stability, and predictive value of our models, and it further assists in diagnosing problems in our workflow.

classification hyperparameter-tuning machine-learning model-evaluation model-view-presenter model-visualization python random-forest random-forest-classifier scikit-learn visualization xgboost xgboost-algorithm yellowbrick

Last synced: 06 Nov 2024

https://github.com/idaraabasiudoh/telco-churn-logistic-regression

A predictive model using logistic regression to identify customers likely to churn from a telecommunications company.

logistic-regression machine-learning python3 scikit-learn

Last synced: 03 Nov 2024

https://github.com/bkamapantula/discover

Code search utility to assist developer workflows via code discovery. Currently uses TF-IDF estimator.

developer-tools python scikit-learn tf-idf

Last synced: 16 Oct 2024

https://github.com/soumyapro/parkinson-disease-prediction

This project predicts Parkinson's disease using machine learning models.

logistic-regression numpy pandas scikit-learn svc xgboost

Last synced: 03 Nov 2024

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 03 Nov 2024

https://github.com/idaraabasiudoh/credit_card_fraud_detection

This repository contains a machine learning project focused on detecting credit card fraud using Decision Tree and Support Vector Machine (SVM) classifiers.

data-analysis jupyter-notebook machine-learning python3 scikit-learn snapml

Last synced: 03 Nov 2024

https://github.com/sanjiv856/machine_learning_scikit-learn

Repository for machine learning in Python using Scikit-learn.

pipelines python scikit-learn sklearn titanic-kaggle titanic-survival-prediction

Last synced: 03 Nov 2024

https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer

Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics like classification report showcase the effectiveness of the optimized model.

breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm

Last synced: 31 Oct 2024

https://github.com/katjaweb/king-county-house-price-prediction

This project aims to predict house prices based on various features such as square footage, number of rooms or location.

machine-learning python regression scikit-learn

Last synced: 03 Nov 2024

https://github.com/manjit-baishya-datascience/spam-email-detection

This project demonstrates how to build a spam detection system using Natural Language Processing (NLP) and machine learning techniques.

imblearn nlp nlp-machine-learning nltk scikit-learn spam-detection

Last synced: 03 Nov 2024

https://github.com/jeus0522/7-explore-different-classifier-ml-app

A project exploring various classification algorithms, showcasing their implementation, comparison, and evaluation using Python and scikit-learn.

k-nearest-neighbours knn random-forest scikit-learn streamlit support-vector-machine svm

Last synced: 03 Nov 2024

https://github.com/kavyachouhan/fake-news-detection-dravidian-language

This repository contains the code and resources for a machine learning project focused on detecting fake news in the Malayalam language, developed as part of the IITM-PAN BS AI-ML Challenge.

jupyter-notebook machine-learning numy pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/manome/python-supervised-learning

This project provides sample code for performing supervised learning.

conformal-prediction scikit-learn supervised-learning

Last synced: 03 Nov 2024

https://github.com/pejpero/machine_learning

This repository contains two comprehensive machine learning projects using scikit-learn, demonstrating ensemble learning with a Voting Classifier and the comparison of linear and polynomial regression models on different datasets.

ensemble-learning linear-regression logistic-regression machine-learning polynomial-regression random-forest scikit-learn svm

Last synced: 03 Nov 2024

https://github.com/amiriiw/text_classification

Welcome to the Text Classification Project! This project is designed to train a model for classifying texts based on their emotional content and then using it to categorize new texts into corresponding emotional categories.

keras numpy pandas pickle scikit-learn tensorflow text-classification

Last synced: 03 Nov 2024

https://github.com/raulmaulidhino-dev/ml_modelling_regression

Many factors influence students' grades and scores; one of them is study hours. In this mini analysis project, three models learn and predict the relationship between students' study hours and their scores on an exam or test. The project identifies the best of these ML models for the problem.

data data-analysis-python data-science eda machine-learning scikit-learn

Last synced: 03 Nov 2024

https://github.com/gangula-karthik/altitude-analytics

Developed a powerful model that predicts airline review sentiments—promoter, passive, or detractor—to help airlines sharpen their marketing strategies and boost customer loyalty 🚀 ✨

airlines data-science machine-learning python scikit-learn sentiment-analysis supervised-learning

Last synced: 05 Nov 2024

https://github.com/iamriteshkoushik/skrun

18hrs Scikit Learn Course Speedrun Repo

freecodecamp machine-learning scikit-learn

Last synced: 05 Nov 2024

https://github.com/pranav-tank/heart-disease-prediction-model

I created this project as my Python term assignment. In it, I trained an ML model to predict heart disease using the Scikit-learn library in Python.

google-colaboratory jupyter-notebook machine-learning prediction-model python scikit-learn

Last synced: 03 Nov 2024

https://github.com/aysh2603/twitter-sentiment-analysis

The Twitter Sentiment Analysis project employs Natural Language Processing (NLP) techniques to classify tweets into positive or negative sentiments. By analyzing the tone of tweets, this project provides insights into public sentiment on various topics.

hyperparameter-tuning nlp-machine-learning numpy pandas python3 scikit-learn

Last synced: 03 Nov 2024

https://github.com/sudothearkknight/15-machinelearningprojects

A curated collection of 15 machine learning projects in various fields that are helping me gain a better understanding of the different machine learning tools, techniques, algorithms, and methodologies.

classification-algorithm machine-learning machine-learning-algorithms natural-language-processing pycharm-ide python3 regression-models scikit-learn scikitlearn-machine-learning spam-detection

Last synced: 31 Oct 2024

https://github.com/ayushtiwari134/machine_learning_models

A repo where I upload all the models I train during my journey of learning machine learning from scratch

linear-regression logistic-regression machinelearning matplotlib numpy pandas python random-forest scikit-learn

Last synced: 06 Nov 2024

https://github.com/mahdibehoftadeh/polynomial-regression-co2-emissions

A simple polynomial regression model trained on a large dataset to predict a car's CO2 emissions from build features such as engine size and number of cylinders

machine-learning matplotlib numpy nural-network pandas polynomial-regression python scikit-learn

Last synced: 03 Nov 2024

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to forecast sales based on TV, Newspaper, and Online Advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 03 Nov 2024

https://github.com/aysh2603/credit-card-fraud-detection

The Credit Card Fraud Detection project aims to identify fraudulent transactions from a dataset of credit card transactions. The project addresses the challenge of class imbalance and employs advanced machine learning techniques to build an effective fraud detection model.

ensemble-learning hyperparameter-tuning numpy pandas python3 scikit-learn streamlit

Last synced: 03 Nov 2024

https://github.com/cego669/dirtycategoriesencoding

Repository containing two classes (StringAgglomerativeEncoder and StringDistanceEncoder) useful for grouping or visualizing the distance between dirty categorical variables. They are compatible with the scikit-learn API.

category clustering dimensionality-reduction dirty hierarchical-clustering machine-learning scikit-learn singular-value-decomposition svd

Last synced: 03 Nov 2024

https://github.com/umasivakumar14/real_estate_ml_model

Predicts the price of a home in Bengaluru, Karnataka based on location, urbanization, total square feet, bedrooms, bathrooms, and balconies.

flask gridsearchcv http-requests machine-learning machine-learning-algorithms pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/christianconchari/bike-sharing-demand

This repository contains the final practical assignment for the course Machine Learning II (Aprendizaje de Máquina II) of the Specialization in Artificial Intelligence (CEIA) at the Faculty of Engineering of the University of Buenos Aires (FIUBA).

airflow docker fastapi machine-learning mlflow python scikit-learn

Last synced: 03 Nov 2024

https://github.com/joel-beck/claims-prediction

Car Insurance Claims Prediction

python regression scikit-learn

Last synced: 05 Nov 2024

https://github.com/joel-beck/airbnb-oslo

Price Prediction Models for Airbnb Apartments in Oslo | Winter Term 2021/22

prediction python pytorch scikit-learn

Last synced: 05 Nov 2024

https://github.com/miguellopezvirues/text_sentiment_classification_gamestop

A notebook on NLP sentiment analysis that classifies game reviews as "positive", "neutral", or "negative".

machine-learning nlp pandas python scikit-learn sentiment-analysis

Last synced: 06 Nov 2024

https://github.com/gabrielmazzotta/nlp-clustering--movie-similarity-from-plot-summaries

A Python-based movie recommendation system leveraging NLP and clustering techniques. This project includes data processing, vectorization of plot summaries, and the implementation of recommendation algorithms to suggest similar movies based on user input.

clustering cosine-similarity hierarchical-clustering kmeans lemmatization nlp recommendation-engine scikit-learn similarity-score spacy tokenization

Last synced: 03 Nov 2024

https://github.com/laoluadewoye/skloverlay

This repository is the official location of the SKLOverlay Project. It holds everything used for the package on PyPI, including source files.

classification classification-algorithm data-science data-wrangling evaluation-metrics excel graphics graphs machine-learning machine-learning-algorithms matplotlib modeling pandas preprocessing scikit-learn

Last synced: 03 Nov 2024

https://github.com/chengetanaim/high-school-alcoholism-and-academic-performance

Student Alcoholism and Academic Performance Data Analysis

jupyter-notebook scikit-learn

Last synced: 05 Nov 2024

https://github.com/ccastleberry/sk-autobots

Custom data transformers using the scikit-learn API.

scikit-learn sklearn sklearn-api

Last synced: 03 Nov 2024

https://github.com/chengetanaim/customerpersonalityanalysis

Customer Personality Analysis involves a thorough examination of a company's optimal customer profiles. This analysis facilitates a deeper understanding of customers, enabling businesses to tailor products to meet the distinct needs, behaviors, and concerns of various customer types

kmeans-clustering pandas scikit-learn

Last synced: 05 Nov 2024