Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

scikit-learn

scikit-learn is a widely-used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/drkbluescience/wids2024_challenge2_metastaticdiagnosisregression

This notebook presents an exploratory data analysis (EDA) and regression modeling approach for the WiDS Datathon 2024 Challenge #2.

catboost data data-visualization ensemble-learning exploratory-data-analysis gradient-boosting imputation-methods lgbm machine-learning scikit-learn women-in-data-science

Last synced: 31 Oct 2024

https://github.com/belzebu013/prever_nivel_colesterol

Projeto de IA com algoritmo de Regressão Linear múltipla para prever o nível de colesterol de um individuo.

ia jupiter-notebook pandas python regressao-linear-multipla scikit-learn

Last synced: 31 Oct 2024

https://github.com/kheriberto/logistic_regression_project

A project that analyses dummie data from an advertising company using logistic regression

data-analysis logistic-regression pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/akansharajput280799/data-driven-insights-into-job-satisfaction-and-compensation-trends

This project analyzes 2020 employee data to identify factors influencing job satisfaction, performance, and salary differences, offering insights for improving engagement and workplace strategies.

cluster-analysis colab-notebook data-cleaning descriptive-statistics factor-analysis hypothesis-testing jupyter-notebook matplotlib python scikit-learn seaborn t-test visualization

Last synced: 31 Oct 2024

https://github.com/fahrettinsolak/ai-based-salary-scale-calculation-project

This project demonstrates a Polynomial Regression model using a dataset related to experience and salary. The model is built using Python with the pandas, matplotlib, and sklearn libraries. The dataset includes information on years of experience and corresponding salary.

artificial-intelligence deep-learning jupyter-notebook machine-learning matplotlib pandas pyhton scikit-learn

Last synced: 31 Oct 2024

https://github.com/karthikarajagopal44/data-analysis-using-python-libraries-

The COVID-19 pandemic has significantly impacted India, necessitating a detailed analysis of the virus’s spread within the country. In this project, we explore an India-specific COVID-19 dataset, leveraging Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-cleaning data-visualization matplotlib numpy pandas python python3 scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/kheriberto/linear_regression_ecommerce

Simple project showcasing crafting a linear regression model with SciKit Learn

data-analysis jupyter-notebook linear-regression pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/jainish-prajapati/solar-flare-prediction

This repository contains code and data for predicting solar flare energy ranges using machine learning, based on NASA's RHESSI mission data. It includes preprocessing of FITS files into a unified CSV dataset and implements models like Gradient Boosting, Random Forest, and Decision Tree classifiers, achieving accuracies up to 87%.

data-visualization machine-learning numpy pandas python scikit-learn solar-flare-prediction

Last synced: 31 Oct 2024

https://github.com/ashishsingh789/bcg_virtual_internship

This repository showcases my BCG X virtual internship project on customer churn analysis for PowerCo, covering business understanding, EDA, feature engineering, and modeling using Python and machine learning.

data-manipulation data-science dataanalysis datavisualization eda machine-learning matplotlib numpy pandas python random-forest scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/gaurangdave/house_price_predictions

Machine Learning Application to predict House Prices

hands-on learning-by-doing machine-learning numpy pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/alainlebret/python-et-ia-2

Ressources personnelles du cours "Python & IA" en 2e année GPSE à l'ENSICAEN

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 31 Oct 2024

https://github.com/alainlebret/python-et-ia-1

Ressources personnelles du cours "Python & IA" en 2e année GPSE à l'ENSICAEN

artificial-intelligence image-processing machine-learning matplotlib numpy python scikit-image scikit-learn

Last synced: 31 Oct 2024

https://github.com/yuji1702/ai--powered-triage-system

This project implements a machine learning-based triage system for emergency rooms, which classifies patients based on their symptoms and vitals using a Random Forest Classifier. The system features real-time patient data integration, a user-friendly GUI built with Tkinter, and secure patient data encryption using Fernet from the cryptography lib

cryptography data-imputation data-preprocessing data-security encryption gui healthcare machine-learning matplotlib medical-data python random-forest realt-time scikit-learn seaborn tkinter triage-system

Last synced: 31 Oct 2024

https://github.com/kefrankk/ml-fraud-detection

I built a predictive model to detect fraud in financial transactions.

pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/jianninapinto/bandersnatch

This project implements a machine learning model using Random Forest, XGBoost, and Support Vector Machines algorithms with oversampling and undersampling techniques to handle imbalanced classes for classification tasks in the context of predicting the rarity of monsters.

altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost

Last synced: 26 Sep 2024

https://github.com/pymc-learn/pymc-learn-sphinx-theme

Sphinx theme for Pymc-learn documentation

pymc3 pymc4 scikit-learn sphinx sphinx-theme

Last synced: 19 Oct 2024

https://github.com/m-muecke/text-normalizer

Text normalizer integration for sklearn.pipeline.Pipeline class

nlp nltk python scikit-learn

Last synced: 28 Oct 2024

https://github.com/enyaude/california_house_price_prediction

Developed a California house price prediction model utilizing linear regression and Random Forest, and applied machine learning techniques such as Ridge, and Lasso for optimization in Python.

jupyter-notebook linear-regression python random-forest scikit-learn streamlit

Last synced: 31 Oct 2024

https://github.com/s0fft/airline-passenger-satisfaction

Airline-Customer-Model — Machine Learning Project on: Scikit-learn / Pandas / Matplotlib / Seaborn

jupyter-notebook mashine-learning matplotlib pandas python3 scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/nits2612/data-science-projects

Portfolio of data science projects completed by me during PGP AI/ML, self learning, and hobby purposes.

data data-science dataanalysis deep deep-learning keras machine-learning matplotlib numpy opencv pandas python scikit-learn seaborn surprise-python tensorflow transfer-learning

Last synced: 31 Oct 2024

https://github.com/tapas-gope/telecommunication-customer-churn

This project involves predicting customer churn in a telecommunications company using machine learning techniques, exploring various features' impact, optimizing models, and identifying key factors influencing churn.

feature-engineering matplotlib-pyplot model-evaluation-and-validation numpy pandas python scikit-learn

Last synced: 31 Oct 2024

https://github.com/alexsolov28/vkrb

Выпускная квалификационная работа бакалавра

matplotlib numpy pandas python scikit-learn seaborn streamlit

Last synced: 31 Oct 2024

https://github.com/tamk-kol/project_orbital_data_analysis

The goal of this project is to develop an automatic method to detect orbital maneuvers using machine learning.

matplotlib numpy pandas scikit-learn

Last synced: 31 Oct 2024

https://github.com/oroszgy/cookiecutter-ml-flask

Cookiecutter template for training and serving machine learning models with scikit-learn, spacy, Flask and Docker

docker flask flask-application machine-learning nlp rest-api scikit-learn spacy

Last synced: 19 Oct 2024

https://github.com/chris-santiago/tsfeast

A collection of Scikit-Learn compatible time series transformers and tools.

data-science feature-engineering python scikit-learn time-series timeseries-features transformers

Last synced: 27 Oct 2024

https://github.com/abhipatel35/diabetes_ml_classification

Predict diabetes using machine learning models. Experiment with logistic regression, decision trees, and random forests to achieve accurate predictions based on health indicators. Complete lifecycle of ML project included.

classification data-analysis data-science data-visualization descision-tree diabetes-prediction jupiter-notebook logistic-regression machine-learning model-evaluation open-source pandas pycharm-ide python random-forest scikit-learn

Last synced: 31 Oct 2024

https://github.com/darenr/gradientboostingmachines

Notebooks exploring strengths and weaknesses of GBM based classifiers

jupyter-notebook lightgbm pandas scikit-learn xgboost

Last synced: 23 Oct 2024

https://github.com/shahaba83/airplane-ticket-cancellation

In this project, we try to predict the possibility of canceling the plane ticket by the buyer

datatime numpy pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/callesjuan/ninjalprm

Protótipo de ferramenta de agrupamento de dispositivos Android por geolocalização (Server)

python scikit-learn xmpp

Last synced: 24 Oct 2024

https://github.com/shubhamsoni98/classification-with-random-forest-1

To classify sales into categories (Low, Moderate, High) using Random Forests to inform strategic decisions and optimize marketing strategies.

algorithms anaconda data data-science datacleaning eda jupyter-notebook machine-learning pyhton random-forest scikit-learn visualization

Last synced: 31 Oct 2024

https://github.com/alexsolov28/ml_course

Курс "Технология машинного обучения", бакалавриат, 6 семестр

colab-notebooks jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/priyanshulathi/air-quality-index-prediction

Machine learning based air quality index prediction using environmental and pollutant data to classify and forecast pollution levels.

machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 31 Oct 2024

https://github.com/pratishtha-abrol/sentimentanalysis

Logistic Regression: A sentiment analysis case study

logistic-regression nltk-python scikit-learn sentiment-analysis

Last synced: 24 Oct 2024

https://github.com/jofaval/sonar

Binary Classification of Sonar Signals of Rocks and Metal cylinders in 1987

data-analysis data-science data-visualization machine-learning python scikit-learn sonar uci

Last synced: 21 Oct 2024

https://github.com/s-matke/eco-forecast

Machine learning model used for predicting European country with most green surplus energy generated

data-science green-energy machine-learning scikit-learn supervised-learning

Last synced: 21 Oct 2024

https://github.com/sudothearkknight/15-machinelearningprojects

A curation of 15 Machine Learning projects in various fields that are helping me gain a better understanding of the different machine learning tools, techniques, algorithms and methodalogies.

classification-algorithm machine-learning machine-learning-algorithms natural-language-processing pycharm-ide python3 regression-models scikit-learn scikitlearn-machine-learning spam-detection

Last synced: 31 Oct 2024

https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer

Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics like classification report showcase the effectiveness of the optimized model.

breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm

Last synced: 31 Oct 2024

https://github.com/ccastleberry/hands_on_machine_learning

Notebooks and files created while working through the book Hands on Machine Learning

data-science jupyter-notebook scikit-learn tensorflow

Last synced: 28 Oct 2024

https://github.com/baponkar/scikit-logisticregression-application

A simple and detail application analysis of sci kit learn LogisticRegression model .

classification-algorithm logistic-regression machine-learning python3 scikit-learn

Last synced: 07 Nov 2024

https://github.com/samkazan/fraud-detection-ml

Machine learning models for enhanced fraud detection in e-commerce transactions, exploring feature engineering, distance prediction, and clustering analysis.

clustering data-science data-visualization dataanalytics dbscan eda hierarchical-clustering kmeans-clustering knn-imputer matplotlib mlxtend python scikit-learn seaborn xgboost

Last synced: 05 Nov 2024

https://github.com/avik-pal/kaggle-titanic

Predicting whether a given set of people survive on the Titanic

machine-learning numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 12 Oct 2024

https://github.com/tanaybhadula/ml-preprocessing-cli

A CLI tool with python to preprocess datasets for performing supervised learning to save time for users. Input data can be preprocessed using simple commands and preprocessed dataset can be downloaded later

cli data-cleaning data-preprocessing machine-learning pandas python scikit-learn

Last synced: 11 Oct 2024

https://github.com/samkazan/structural_discovery_of_macromolecules_data_analysis

This research project uses machine learning techniques and neural network to uncover key factors that contribute to successful protein structure discovery using Python and R

classification clustering ipython-notebook jupyter-notebook keras-neural-networks keras-tensorflow machine-learning neural-network numpy python r rmarkdown scikit-learn scipy tensorflow

Last synced: 11 Oct 2024

https://github.com/jonad/finding_donors

Predicting income with UCI Census Income Dataset using supervised machine learning algorithms

numpy pandas scikit-learn scikitlearn-machine-learning

Last synced: 07 Nov 2024

https://github.com/sonu275981/flight-fare-prediction

End to end implementation of Machine Learning Airline Flight Fare Prediction using python

flight-price-prediction gradient-boosting machine-learning predicts-flight-fares python3 regression-models scikit-learn

Last synced: 05 Nov 2024

https://github.com/gsmafra/sklearn-dummies

Scikit-learn label binarizer with support for missing values

data-science machine-learning pandas python scikit-learn

Last synced: 27 Oct 2024

https://github.com/messierandromeda/sentiment-analysis

Sentiment analysis with the IMDB movie review dataset.

imdb-dataset python scikit-learn sentiment-analysis

Last synced: 10 Oct 2024

https://github.com/henriqueotogami/imersao-dados-3-alura

Terceira edição da Imersão Dados da Alura (03 a 07/05/21). O projeto dessa edição foi inspirado em um desafio do Laboratory Innovation Science at Harvard disponibilizado no Kaggle.

alura bioinformatics data-science drug-discovery google-collab harvard-university imersaodados jupyter-notebook kaggle-challenge laboratory-innovation-science matplotlib pandas python3 scikit-learn seaborn

Last synced: 05 Nov 2024

https://github.com/the-developer-306/house-price-predictor

House Price Predictor: Harnessing machine learning algorithms to forecast housing prices in Boston, empowering buyers and sellers with accurate predictions based on key factors like location, crime rate, rooms, accessibility, and more.

csv ipynb-jupyter-notebook joblib matplotlib numpy pandas python scikit-learn

Last synced: 11 Oct 2024

https://github.com/r-gg/ml-37

Amazon Reviews ~ Sentiment analysis evaluation: fine-tuned BERT vs LSTM. (+ Extensive Data Mining & Visualization)

bert deep-learning ipynb-jupyter-notebook lstm machine-learning python scikit-learn uni-project

Last synced: 11 Oct 2024

https://github.com/mohd-faizy/preprocess_ml

This repository hosts Python code that utilizes the Scikit-learn preprocessing API for data preprocessing. The code presents a comprehensive range of tools that handle missing data, scale data, encode categorical variables, and perform other functions.

data-science feature-engineering feature-engineering-algorithm feature-extraction feature-selection machine-learning outlier-detection preprocessing-data preprocessor scikit-learn

Last synced: 11 Oct 2024

https://github.com/sergeimakarovv/ml-powerlifting

Predicting a weight lifted by athletes using Machine Learning

machine-learning pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/sarowarahmed/predicting-kolkata-house-price

🏠 Predicting Kolkata House Price: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to predict house prices in Kolkata. Deployed on Streamlit Cloud for easy access and real-time predictions.

app kolkata linear-regression machine-learning numpy pandas scikit-learn streamlit

Last synced: 03 Nov 2024

https://github.com/bilalm04/email-spam-classifier

A machine learning project that classifies emails as spam or not spam using Logistic Regression, with a deployable Flask API for real-time classification.

api flask jupyter-notebook machine-learning matplotlib nlp numpy pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/ismaelvr1999/air-quality-clustering

This project focuses on analyzing air quality data and categorizing it into clusters using the K-Means algorithm.

jupyter-notebook machine-learning matplotlib pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/sckonung/crab-age-regression

ML model for regression with a crab age dataset Competition in Kaggle

keras machine-learning pandas python scikit-learn tensorflow

Last synced: 03 Nov 2024

https://github.com/alaazameldev/text-based-search-engine

Implementation of a search engine using TF-IDF and Word Embedding-based vectorization techniques for efficient document retrieval

chromadb fastapi gensim-word2vec nltk numpy precision-recall python scikit-learn tf-idf-vectorizer

Last synced: 10 Oct 2024

https://github.com/rickcontreras/modelos1

Modelo de clasificación para predecir el desempeño de estudiantes en las Pruebas Saber Pro en Colombia. Incluye análisis exploratorio de datos, preprocesamiento y modelos de machine learning.

classification colombia data-analysis data-science education educational-assessment exploratory-data-analysis jupyter-notebook machine-learning python saber-pro scikit-learn student-performance

Last synced: 10 Oct 2024

https://github.com/stella4444/linear-regression

learning about linear regression (currently a work in progress) ~ working with data

linear-regression machine-learning numpy scikit-learn

Last synced: 03 Nov 2024

https://github.com/chaitanya1436/student_performance_analysis

A project focused on analyzing college student performance using data on department, assessment scores, and performance labels. Implemented in Google Colab, the analysis includes data preprocessing, feature scaling, and exploratory data analysis to uncover insights and prepare the data for further analysis or modeling.

ata-preprocessing data-preparation exploratory-data-analysis feature-scaling google-colab numpy pandas scikit-learn

Last synced: 03 Nov 2024

https://github.com/sergeimakarovv/energy-data-analytics-ml

Analyzing global data on sustainable energy, predicting CO2 emissions per capita

machine-learning pandas plotly python scikit-learn streamlit

Last synced: 10 Oct 2024

https://github.com/bilgenurbekar/turkishcyberbullying

Contains fine-tuned BERT models and results in the text classification category using Turkish social media data

bert-fine-tuning huggingface-transformers matplotlib numpy pandas python pytorch scikit-learn transformers

Last synced: 10 Oct 2024

https://github.com/mkdirer/depression-data-analysis

This project analyzes a Kaggle depression dataset using data preprocessing, clustering, classification, and outlier detection techniques. Python libraries like pandas, numpy, matplotlib, seaborn, and scikit-learn are used to extract insights.

classification clustering matplotlib numpy pandas scikit-learn seaborn vizualization

Last synced: 10 Oct 2024

https://github.com/khanovico/python-stock-analyzer

This is a Webapp implemented by python and several data science frameworks, enabling online stock trend analyzing.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 03 Nov 2024

https://github.com/alejoduarte23/si_bayesianmixturemodel

Implementation of a two-stage fast Bayesian system identification for separated Modes. This repository expands the usage of this technique by adding a mixture model fit to obtain modal parameters from the posterior distribution.

matplotlib numpy scikit-learn scipy

Last synced: 10 Oct 2024

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 10 Oct 2024

https://github.com/davgiles/ut-austin-data-science-program

This repository contains my projects from the Data Science & Business Analytics Post-Graduate Program through UT Austin.

eda matplotlib numpy pandas python scikit-learn scipy seaborn visualization xgboost

Last synced: 03 Nov 2024

https://github.com/rishavp15/aivshuman_text

In this project make user to decide that the text which is entered in text box is a human generated or a computer generated text.

django pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/chrispsang/customerchurnanalysis

Predicting customer churn using a RandomForestClassifier with detailed EDA, model evaluation, and visualization. Includes a Tableau dashboard for interactive insights.

customerchurn data-analysis data-visualization datapreprocessing machine-learning python scikit-learn tableau

Last synced: 10 Oct 2024

https://github.com/adriantomin/bulldozer-price-prediction

Predicting the Sale Price of Bulldozers Using Machine Learning 🚜💰 This project uses machine learning to predict bulldozer sale prices based on historical data from the Kaggle Bluebook for Bulldozers competition. The goal is to minimize the RMSLE between actual and predicted prices.

data-science jupyter-notebook machine-learning matplotlib-pyplot numpy pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/achronus/data-exploration

A repository dedicated to interesting data exploration projects I've completed

data-analysis exploratory-data-analysis machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 13 Oct 2024

https://github.com/enayar478/nomad_machine_learning_dash_app

An interactive Machine Learning app built with Dash and Plotly, developed as part of the Data Analytics Bootcamp at Le Wagon Bordeaux. It allows users to visualize data, make real-time predictions, and explore various model insights.

analytics cachetools dash dashboard-application data-analysis data-science deployment gunicorn interactive-visualization machine-learning pandas plotly plotly-dash prediction-model python python3 render scikit-learn web-application

Last synced: 12 Oct 2024