Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/shliakhovai/house-price-prediction

This repository contains a complete machine learning pipeline for predicting housing prices. It includes data preprocessing, feature engineering, and model training and evaluation components, designed to provide a robust solution for regression tasks.

data-science machine-learning matplotlib numpy pandas prediction python regression scikit-learn seaborn

Last synced: 03 Nov 2024

https://github.com/oneapi-src/credit-card-fraud-detection

AI Starter Kit for Credit Card Fraud Detection model using Intel® Extension for Scikit-learn*

machine-learning scikit-learn

Last synced: 05 Nov 2024

https://github.com/kingabzpro/mlops-with-jenkins

From data ingestion to deploying the model using Jenkins.

classification fastapi jenkins mlops scikit-learn

Last synced: 13 Oct 2024

https://github.com/rajaprerak/ml_dl_webapp

Machine learning and Deep learning project

flask heroku keras knearest-neighbors python scikit-learn tensorflow

Last synced: 06 Nov 2024

https://github.com/labrijisaad/chefclub-data-internship

Repository showcasing my Data Engineer / Scientist internship at Chefclub, contributing to data infrastructure enhancement and fostering data-driven insights.

airflow chefclub data-engineering data-science gcp scikit-learn

Last synced: 06 Nov 2024

https://github.com/labrijisaad/monthly-daily-energy-forecasting-docker-api

This repository houses an Energy Forecasting API that uses Machine Learning to predict daily and monthly energy consumption from historical data. It's designed as a practical demonstration of a Machine Learning Engineering workflow, from initial analysis to a deployable API packaged with Docker.

api docker jupyter-notebooks machine-learning makefile python random-forest scikit-learn xgboost

Last synced: 06 Nov 2024

https://github.com/rs2416/Detecting_Social_Anxiety

This repository contains the full dataset and code needed to recreate the classification models and reproduce the results within this paper: https://formative.jmir.org/2021/10/e32656/

jupyter-notebook machine-learning python scikit-learn social-anxiety

Last synced: 03 Aug 2024

https://github.com/vaibhavs10/learn-ml

Modified notebooks (single) from kaggle.com/learn with added nuances

decision-trees machine-learning pandas random-forest scikit-learn

Last synced: 25 Oct 2024

https://github.com/paulj1989/bulgarian-constitutional-court-decisions

Developing NLP models for text and sentence classification using legal texts from the Bulgarian constitutional court.

keras neural-network nlp scikit-learn tensorflow tesseract

Last synced: 06 Nov 2024

https://github.com/ivanyu/kaggle-digit-recognizer

Kaggle's "Digit Recognizer" competition

kaggle keras machine-learning scikit-learn

Last synced: 15 Oct 2024

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, and asthma.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 11 Oct 2024

https://github.com/rakibhhridoy/machinelearning-featureselection

Before training or feeding a model, the first priority is the data, not the model. The more the data is preprocessed and engineered, the more the model will learn. Feature selection is one of the methods for processing data before feeding the model. Various feature selection techniques are shown here.

extratreesclassifier feature-selection gridsearchcv lasso-regression logistic-regression machine-learning numpy pandas pca rfe rfecv scikit-learn selectkbest

Last synced: 06 Nov 2024

https://github.com/rakibhhridoy/supportvectormachinein-medical

Support vector machines in medical disease detection. Both linear and non-linear data can be fitted with an SVM through its choice of kernel. In medical applications we focus on precision or recall rather than accuracy.

diabetes-prediction machine-learning medical precision-medicine recall-precision scikit-learn support-vector-machines svm

Last synced: 06 Nov 2024

https://github.com/rakibhhridoy/easywaydiveinto-datascience

Data science is not as easy as it seems at first. The main problems new learners face are a lack of knowledge about resources and confusion about how to use them. I hope this repository will help confused learners.

algorithms algorithms-implemented bayesian-statistics data-science deep-learning deep-neural-networks linear-algebra machine-learning matplotlib multivariate-calculus numpy optimization pandas python scikit-learn scipy seaborn statistics statsmodels tensorflow

Last synced: 06 Nov 2024

https://github.com/ksatrajit0/heart-disease-prediction-ml

Predicts the risk of heart attack in a patient using their medical record

heart-disease-prediction machine-learning matplotlib numpy pandas scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/bacross/datamunger

Python package for handling NaNs and outliers

data data-frame datamunger knn nan outliers python scikit-learn

Last synced: 19 Oct 2024

https://github.com/bistcuite/plainml

Painless Machine Learning Library for python based on scikit-learn

machine-learning ml plainml python scikit-learn

Last synced: 19 Oct 2024

https://github.com/karimosman89/legal-document-nlp

Create a tool that uses NLP to extract key information from legal documents, contracts, or agreements. Use NLP techniques for named entity recognition and text classification. Streamline the review process for legal teams by automating information extraction.

nltk python scikit-learn spacy

Last synced: 07 Nov 2024

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 10 Oct 2024

https://github.com/spamfromaditya/drugs-consumption-prediction-model-eda-bagging-classifier

Drug consumption prediction models are like crystal balls for public health. By analyzing vast amounts of data, these models can identify individuals or communities at higher risk of drug use. They consider factors like demographics, social media activity, prescription history, and even economic indicators.

bagging-classifier machine-learning matplotlib numpy python scikit-learn

Last synced: 08 Nov 2024

https://github.com/jordandeklerk/pygridge

A scikit-learn compatible Python package for data-driven group regularized ridge regression

python regression regularized-regression scikit-learn

Last synced: 31 Oct 2024

https://github.com/asosnovsky/analyzing-blood-vessel-aneurysm

A few simple scripts to identify aneurysm in a blood-vessel (research projects)

machine-learning meanshift medical-image-processing scikit-learn

Last synced: 13 Oct 2024

https://github.com/mindful-ai-assistants/credit-card-prediction

💳 This repository focuses on building a predictive model to assess the likelihood of credit card defaults. The project includes data analysis, feature engineering, and machine learning to provide accurate default predictions.

jupyter logistic-regression machine-learning python3 scikit-learn

Last synced: 21 Oct 2024

https://github.com/gappeah/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 10 Oct 2024

https://github.com/rakshit-vasava/predictive-analytics-for-insurance-purchase

Predicting customer insurance purchases using stacking models and SMOTE for the Homesite Quote Conversion Problem on Kaggle.

k-nearest-neighbours kaggle-competition multilayer-perceptron python random-forest scikit-learn smote support-vector-machines

Last synced: 31 Oct 2024

https://github.com/lechemi/machine-learning-vademecum

A notebook containing basic concepts and practical examples in Python on machine learning.

machine-learning python scikit-learn

Last synced: 31 Oct 2024

https://github.com/zen204/airbnb_availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 03 Nov 2024

https://github.com/prithivsakthiur/data-board

Data Boards - Visualization of various plots ( Analysis )

data-analysis gradio huggingface keras mathplotlib pandas plots pyplot scikit-learn seaborn spaces

Last synced: 03 Nov 2024

https://github.com/md-emon-hasan/ml-project-car-price-prediction

🚗 End-to-end ML project for predicting car prices based on various features. Includes data preprocessing, model training, and a Flask web app for predictions.

car-price-prediction car-price-predictor data-science feature-engineering ml predictive-modeling scikit-learn

Last synced: 10 Oct 2024

https://github.com/an-exodus/dubai-real-estate-price-prediction-ml

This repository contains a comparative analysis of machine learning algorithms to predict real estate prices in Dubai. Using data from Bayut, we evaluate Decision Tree, Linear Regression, Random Forest, and Gradient Boosting models based on their predictive accuracy.

decision-tree gradient-boosting linear-regression machine-learning random-forest scikit-learn

Last synced: 03 Nov 2024

https://github.com/aarryasutar/logistic_regression_on_age_prediction

This code evaluates a logistic regression model for age prediction, using various features to predict a binary target variable and calculating metrics to measure its performance. It compares the results, identifies the most useful features, and visualizes the ROC curve and its AUC to determine the best-performing model.

accuracy-score confusion-matrix f1-score feature-selection logistic-regression model-training numpy pandas precision recall rmse roc-auc-curve scikit-learn visualization

Last synced: 03 Nov 2024

https://github.com/nirmalyabag20/breast-cancer-prediction-using-machine-learning

This project leverages machine learning to classify breast cancer as malignant or benign based on tumor characteristics. By applying and evaluating multiple algorithms, the model achieves high accuracy, demonstrating the practical application of data-driven solutions in medical diagnostics.

logistic-regression matplotlib numpy pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/bestmahdi2/uni__dataminningstackoverflowproject

A university project for a data mining course, analyzing Stack Overflow website data with Python

cart csv data-mining logistic-regression matplotlib mlp naive-bayes nltk numpy pandas python scikit-learn scipy seaborn stackoverflow svc textblob tqdm xgboost

Last synced: 14 Oct 2024

https://github.com/corentinth/ml-gender_classification

[Machine Learning] The Hello World of Machine Learning using sklearn

body-metrics gender-classification machine-learning scikit-learn

Last synced: 02 Nov 2024

https://github.com/bhuvaneshwarguttula/student-performance-indicator

To understand and predict how the student's performance (test scores) is affected by the other variables (Gender, Ethnicity, Parental level of education, Lunch, Test preparation course).

exploratory-data-analysis machine-learning pandas python scikit-learn student-performance-analysis

Last synced: 10 Oct 2024

https://github.com/troublem1/mle

MultiLabel-Transformer (MLE) is an extended version of LabelEncoder that encodes multiple categorical columns as numeric values in any workflow or pipeline

packages python3 scikit-learn sklearn

Last synced: 14 Oct 2024

https://github.com/soumya6tiwari/customer-segmentation-using-rfm-analysis

This project focuses on customer segmentation using RFM (Recency, Frequency, Monetary) analysis and K-Means clustering. It enables businesses to identify high-value customers, optimize marketing strategies, and improve customer retention through data-driven insights.

backend clustering flask frontend kmeans-clustering matplotlib numpy pandas python rfm-analysis scikit-learn unsupervised-learning

Last synced: 03 Nov 2024

https://github.com/chrislemke/scikit-tabtrans

TabTransformer ready for Scikit learn 🧑‍🔬

deep-learning machine-learning python scikit-learn transformer

Last synced: 11 Oct 2024

https://github.com/farahibrar/programming-in-python

Explore a comprehensive collection of Python programming for diverse data analysis and data science projects. This repository covers data exploration, visualization, statistical analysis, machine learning, NLP, and model deployment. Perfect for enthusiasts looking to delve into practical examples and advanced techniques.

beautifulsoup dataanalysis docker flask folium jupyter-notebook machine-learning matplotlib nltk numpy pandas python pytorch scikit-learn scikitlearn scipy seaborn spacy statsmodels tensorflow

Last synced: 15 Oct 2024

https://github.com/sd338/fractureai

This tool helps people upload X-rays to find broken bones. It uses a machine to mark where the breaks are and gives users marked pictures to download. A smart computer also helps people understand their broken bones and gives them advice.

css cv2 flask gorq html javascript matplotlib npm numpy pandas pydantic python react scikit-learn torch torchvision ultralytics

Last synced: 31 Oct 2024

https://github.com/aahnik/gdsc-ml-ds-bootcamp-2023

This repo contains files given by my seniors as well as assignments and final project done by me during the bootcamp.

data-science machine-learning ml numpy pandas python3 scikit-learn

Last synced: 11 Oct 2024

https://github.com/swimshahriar/heart-attack-prediction

Heart attack prediction from 13 features.

jupyter-notebook pandas python3 scikit-learn

Last synced: 02 Nov 2024

https://github.com/priboy313/pandasflow

A set of custom python modules for friendly workflow on pandas

catboost data-analysis data-science pandas phik python scikit-learn shap

Last synced: 03 Nov 2024

https://github.com/pankajarm/tabular_ml_toolkit

A helper library to jumpstart your machine learning project based on tabular or structured data.

data-science feature-engineering hyperparameter-tuning machine-learning parallelism python scikit-learn structured-data tabular xgboost

Last synced: 03 Nov 2024

https://github.com/vishal-038/attendance_by_face_recogination

This project is a face recognition-based attendance system that uses Python, OpenCV, Scikit-learn, Streamlit, and various other libraries like Pandas, Numpy, Datetime, and OS for different functionalities. It enables adding faces to the database, taking attendance based on face recognition, and showing live attendance through a web interface built

opencv python scikit-learn

Last synced: 09 Oct 2024

https://github.com/timzatko/fifa-19-dataset-machine-learning

Player's value prediction and game position classification on FIFA 19 dataset.

data-analysis fifa19 machine-learning scikit-learn

Last synced: 11 Oct 2024

https://github.com/kieranlitschel/kerassearchcv

Built for the implementation of Keras in TensorFlow. Behaves similarly to GridSearchCV and RandomizedSearchCV in scikit-learn, but allows progress to be saved between folds and folds to be fitted and scored in parallel.

classification grid-search keras keras-tensorflow multithreading randomized-search scikit-learn

Last synced: 11 Oct 2024

https://github.com/samarpan-rai/serveitlearn

Creates an extremely thin layer around the FastAPI library that lets you create an endpoint very quickly.

fastapi inference ml pypi scikit-learn

Last synced: 30 Oct 2024

https://github.com/mohammadreza-mohammadi94/data-analysis-and-machine-learning-projects

A comprehensive collection of data analysis and machine learning projects, showcasing techniques and models for various data challenges. Dive in to explore code examples, analyses, and machine learning workflows.

data-analysis data-science dataframes exploratory-data-analysis pandas python scikit-learn visualization

Last synced: 07 Nov 2024

https://github.com/kohlerhector/trex-tree-reward-exploration

Uses tree estimators of the MDP models to count the leaves that group similar transitions, then performs count-based exploration.

decision-trees drl exploration rl scikit-learn stable-baselines3

Last synced: 08 Nov 2024

https://github.com/george-gca/ai_papers_search_tool

Automatic paper clustering and search tool using fastText from Facebook Research

fasttext fasttext-embeddings fasttext-python nlp python scikit-learn

Last synced: 11 Oct 2024

https://github.com/ayushshahh/fespn

A neural network made to predict final exam scores of students

mlp mlp-regressor multilayer-perceptron neural-network prediction-model scikit-learn

Last synced: 18 Oct 2024

https://github.com/grampers-dev/co2oracle

The CO2 Oracle project uses machine learning and AI to analyze and predict CO2 emissions for environmental management. Using a Kaggle dataset, it demonstrates predictive analytics to understand and forecast emissions. Written in Python, it employs libraries like Pandas, NumPy, and Scikit-Learn.

artificial-intelligence machine-learning numpy pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/pymc-learn/pymc-learn-sphinx-theme

Sphinx theme for Pymc-learn documentation

pymc3 pymc4 scikit-learn sphinx sphinx-theme

Last synced: 19 Oct 2024

https://github.com/kefrankk/ml-fraud-detection

I built a predictive model to detect fraud in financial transactions.

pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/yuji1702/ai--powered-triage-system

This project implements a machine learning-based triage system for emergency rooms, which classifies patients based on their symptoms and vitals using a Random Forest Classifier. The system features real-time patient data integration, a user-friendly GUI built with Tkinter, and secure patient data encryption using Fernet from the cryptography lib

cryptography data-imputation data-preprocessing data-security encryption gui healthcare machine-learning matplotlib medical-data python random-forest realt-time scikit-learn seaborn tkinter triage-system

Last synced: 31 Oct 2024

https://github.com/alainlebret/python-et-ia-1

Personal resources for the "Python & IA" course in the 2nd year of GPSE at ENSICAEN

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 31 Oct 2024

https://github.com/rtmigo/skifts_py

Search for the most relevant documents containing words from a query. Uses Scikit-learn and Numpy

cosine-similarity information-retrieval numpy python scikit-learn text-mining tf-idf

Last synced: 14 Oct 2024

https://github.com/anty-filidor/cyberbullying-detector

NLP bullying detector for tweets, with an ML model training pipeline, deployed as a web app with CI/CD

deployment-system flask-api machine-learning nlp python scikit-learn

Last synced: 14 Oct 2024

https://github.com/alainlebret/python-et-ia-2

Personal resources for the "Python & IA" course in the 2nd year of GPSE at ENSICAEN

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 31 Oct 2024

https://github.com/gaurangdave/house_price_predictions

Machine Learning Application to predict House Prices

hands-on learning-by-doing machine-learning numpy pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/bkamapantula/discover

Code search utility to assist developer workflows via code discovery. Currently uses TF-IDF estimator.

developer-tools python scikit-learn tf-idf

Last synced: 16 Oct 2024

https://github.com/ashishsingh789/bcg_virtual_internship

This repository showcases my BCG X virtual internship project on customer churn analysis for PowerCo, covering business understanding, EDA, feature engineering, and modeling using Python and machine learning.

data-manipulation data-science dataanalysis datavisualization eda machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/jainish-prajapati/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 31 Oct 2024

https://github.com/kheriberto/linear_regression_ecommerce

Simple project showcasing how to build a linear regression model with scikit-learn

data-analysis jupyter-notebook linear-regression pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/henrytseng/example_docker_scikit-learn

A quick example of using Scikit-Learn from a Docker container

docker scikit-learn

Last synced: 14 Oct 2024

https://github.com/karthikarajagopal44/data-analysis-using-python-libraries-

The COVID-19 pandemic has significantly impacted India, necessitating a detailed analysis of the virus’s spread within the country. In this project, we explore an India-specific COVID-19 dataset, leveraging Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-cleaning data-visualization matplotlib numpy pandas python python3 scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/ksasi/boston_housing

Predicting Boston Housing Prices - Udacity

machine-learning numpy pandas python scikit-learn

Last synced: 15 Oct 2024