Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/magnuss0/movie-rec-system

The project extracts movie data using TheMovieDB API, processes it using TF-IDF and cosine similarity for generating recommendations, and stores the data in a DuckDB database. The system is encapsulated within a FastAPI web application and can be deployed using Docker. It provides movie recommendations in JSON format.

cosine-similarity docker duckdb movies-recommendation moviesdb-api ploomber poetry-python scikit-learn streamlit tf-idf

Last synced: 25 Nov 2024
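
As a rough illustration of the TF-IDF and cosine-similarity approach mentioned in the entry above, here is a minimal sketch using scikit-learn. The movie titles, overviews, and the `recommend` helper are hypothetical and are not taken from the linked repository, which may structure its recommender differently.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical movie overviews; a real system would load these from a database.
movies = {
    "Space Quest": "astronauts travel through space to explore a distant planet",
    "Planet Zero": "a crew of astronauts lands on a distant planet in deep space",
    "Love in Paris": "two strangers fall in love during a rainy week in Paris",
}


def recommend(title, top_n=2):
    """Return the top_n titles most similar to `title` by TF-IDF cosine similarity."""
    titles = list(movies)
    # Turn each overview into a TF-IDF weighted term vector.
    matrix = TfidfVectorizer(stop_words="english").fit_transform(list(movies.values()))
    # Compare the query movie's vector against every movie's vector.
    scores = cosine_similarity(matrix[titles.index(title)], matrix).ravel()
    ranked = sorted(zip(titles, scores), key=lambda pair: pair[1], reverse=True)
    # Drop the query movie itself and keep the best matches.
    return [(t, float(s)) for t, s in ranked if t != title][:top_n]


print(recommend("Space Quest"))
```

Overviews that share many weighted terms score close to 1, while overviews with no terms in common score 0.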

https://github.com/troublem1/mle

MultiLabel-Transformer (MLE) is an extended version of LabelEncoder that encodes multiple categorical columns as numeric values in any workflow or pipeline.

packages python3 scikit-learn sklearn

Last synced: 01 Dec 2024
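
To show what encoding several categorical columns at once can look like, the sketch below uses scikit-learn's built-in `OrdinalEncoder` rather than the package listed above, whose API is not described here. The sample rows are made up for illustration.

```python
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical rows with two categorical columns: (color, size).
rows = [
    ["red", "small"],
    ["green", "large"],
    ["blue", "small"],
    ["green", "medium"],
]

# OrdinalEncoder learns one sorted category list per column,
# e.g. blue=0, green=1, red=2 for the first column.
encoder = OrdinalEncoder()
encoded = encoder.fit_transform(rows)

print(encoder.categories_)
print(encoded)

# The mapping is reversible, which is useful when inspecting predictions.
decoded = encoder.inverse_transform(encoded)
```

Because the encoder is a regular scikit-learn transformer, it can be placed directly inside a `Pipeline` or `ColumnTransformer`.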

https://github.com/nirmalyabag20/diabetes-prediction-using-machine-learning

This project focuses on predicting diabetes using machine learning algorithms based on health metrics like glucose levels, blood pressure, and BMI. By comparing different models, the goal is to identify the most accurate approach for early diabetes detection, showcasing the potential of machine learning in healthcare.

decision-tree-classifier jupyter-notebook kneighborsclassifier logistic-regression matplotlib numpy pandas python random-forest-classifier scikit-learn seaborn svc

Last synced: 19 Dec 2024

https://github.com/francescopaolol/sentimentanalysis

Sentiment analysis on the IMDB Dataset of 50K Movie Reviews.

jupyter-notebook kaggle machine-learning ml pandas scikit-learn sentiment-analysis

Last synced: 22 Dec 2024

https://github.com/francescopaolol/titaniccompetition

My first Kaggle competition: predicting survival on the Titanic and getting familiar with ML basics.

jupyter-notebook kaggle-competition machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/francescopaolol/favoritatimeseriesforecasting

See: https://www.kaggle.com/competitions/store-sales-time-series-forecasting

jupyter-notebook kaggle-competition machine-learning pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/sarthak-1408/rain-fall-prediction

This repository contains an end-to-end machine learning project (rainfall prediction in Australia).

heroku heroku-deployment machine-learning numpy pandas rain-fall rain-fall-prediction scikit-learn xgboost-algorithm

Last synced: 16 Nov 2024

https://github.com/francescopaolol/logisticregression

Predicting survival on the Titanic and getting familiar with ML basics.

jupyter-notebook kaggle logistic-regression machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/chanioxaris/german-credit-data

Experimental classification algorithms on German credit data, implemented using the scikit-learn library.

classification classifier cross-validation dataset information-entropy information-gain naive-bayes prediction random-forest scikit-learn support-vector-machines

Last synced: 02 Nov 2024

https://github.com/shaadclt/data-preprocessing-pipeline

This project contains a data preprocessing pipeline implemented in Python using the pandas and numpy libraries. The pipeline handles missing values, outliers, and normalizes numeric features in a dataset.

numpy pandas scikit-learn

Last synced: 07 Dec 2024
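
The three preprocessing steps named in the entry above (missing values, outliers, normalization) can be sketched as a scikit-learn pipeline. This is an assumption-laden illustration, not the linked project's code: the median imputation, the 1.5 × IQR clipping rule, the `clip_outliers` helper, and the sample array are all choices made for this example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def clip_outliers(X):
    """Clip each column to [Q1 - 1.5*IQR, Q3 + 1.5*IQR].

    Stateless: the bounds are recomputed from whatever array is passed in.
    """
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    iqr = q3 - q1
    return np.clip(X, q1 - 1.5 * iqr, q3 + 1.5 * iqr)


pipeline = make_pipeline(
    SimpleImputer(strategy="median"),    # fill missing values with the column median
    FunctionTransformer(clip_outliers),  # limit the influence of extreme values
    StandardScaler(),                    # rescale to zero mean and unit variance
)

# Hypothetical data: one missing value and one obvious outlier (100.0).
X = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 12.0], [4.0, 11.0], [100.0, 13.0]])
X_clean = pipeline.fit_transform(X)
print(X_clean)
```

The order matters: imputing first means the percentile computation never sees NaN, and clipping before scaling keeps a single outlier from dominating the mean and standard deviation.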

https://github.com/jdavydovportfolio/careerpredictor

A project leveraging AI and Machine Learning (Logistic Regression) to predict graduate job placements. Includes data preprocessing, exploratory analysis, and predictive modeling.

artificial-intelligence exploratory-data-analysis jupyter-notebook linear-regression logistic-regression machine-learning machine-learning-algorithms machine-learning-models matplotlib ml numpy pandas pandas-dataframe predictive-modeling programming python scikit-learn

Last synced: 07 Dec 2024

https://github.com/m-rishab/credbet

A loan prediction web app that tells you whether or not you are eligible for a loan.

decision-tree-classifier matplotlib numpy pandas python scikit-learn

Last synced: 21 Nov 2024

https://github.com/matsunagalab/tutorial_analyzingmddata

Google Colab notebooks for typical MD trajectory analysis routines with Python

mdtraj molecular-dynamics scikit-learn tutorial

Last synced: 19 Nov 2024

https://github.com/nemeslaszlo/sale-price-of-bulldozers

The goal is to predict the sale price of bulldozers: how well can we predict the future sale price of a bulldozer, given its characteristics and previous examples of how much similar bulldozers have sold for? (Archived Kaggle competition)

matplotlib numpy pandas random-forest-regressor regression scikit-learn seaborn

Last synced: 01 Dec 2024

https://github.com/nemeslaszlo/heart-disease

Heart disease classification project with different models (LogisticRegression, KNeighborsClassifier, RandomForestClassifier) and detailed reports.

classification knearest-neighbor-classifier logistic-regression mathplotlib numpy pandas randomforest-classification scikit-learn seaborn

Last synced: 01 Dec 2024

https://github.com/francescopaolol/decisiontree

Classifying iris plants into three species using this classic dataset.

decision-tree-classifier jupyter-notebook kaggle machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/pankajarm/tabular_ml_toolkit

A helper library to jumpstart your machine learning project based on tabular or structured data.

data-science feature-engineering hyperparameter-tuning machine-learning parallelism python scikit-learn structured-data tabular xgboost

Last synced: 21 Dec 2024

https://github.com/ax-va/numpy-pandas-matplotlib-scikit-learn-vanderplas-2023

These examples provide an introduction to Data Science and classic Machine Learning using NumPy, Pandas, Matplotlib, and scikit-learn. They are taken, with some changes, from the book "Python Data Science Handbook: Essential Tools for Working with Data", Second Edition, written by Jake VanderPlas and published by O'Reilly Media in 2023.

ax-va classic-machine-learning data-science machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 18 Nov 2024

https://github.com/priboy313/pandasflow

A set of custom Python modules for a friendlier pandas workflow

catboost data-analysis data-science pandas phik python scikit-learn shap

Last synced: 21 Dec 2024

https://github.com/md-emon-hasan/6-classification-iris-ml-apps

An ML project on the classification of the Iris dataset, demonstrating data preprocessing, model training, and evaluation using Python and scikit-learn.

classification data-science iris-classification iris-dataset iris-flower-classification predictive-modeling scikit-learn

Last synced: 13 Nov 2024

https://github.com/ewertondrigues02/previsao-de-vendas

Sales forecasting for a fictitious company, with analysis carried out using tools such as Jupyter Notebook, Google Colab, and Python, along with machine learning libraries and techniques such as linear regression, decision trees, and scikit-learn.

analise-de-dados analise-exploratoria arvore-de-decisao ciencia-de-dados colab excel google-colab jupyter jupyter-notebook machine-learning previsao previsao-de-vendas python3 regressao-linear scikit-learn

Last synced: 18 Nov 2024

https://github.com/rayyan9477/machine-learning-driven-backorder-prediction-system

Experience a state-of-the-art Django web application designed to predict product backorders with exceptional accuracy. This platform leverages advanced machine learning techniques, incorporating pre-trained Random Forest Classifier, Decision Tree, and LGBM models.

matplotlib notebook numpy pandas python scikit-learn

Last synced: 11 Nov 2024

https://github.com/j-i-l/tfb-prediction

Transcription factor binding prediction

bioinformatics machine-learning pandas python scikit-learn

Last synced: 20 Dec 2024

https://github.com/an-exodus/dubai-real-estate-price-prediction-ml

This repository contains a comparative analysis of machine learning algorithms to predict real estate prices in Dubai. Using data from Bayut, we evaluate Decision Tree, Linear Regression, Random Forest, and Gradient Boosting models based on their predictive accuracy.

decision-tree gradient-boosting linear-regression machine-learning random-forest scikit-learn

Last synced: 21 Dec 2024

https://github.com/md-emon-hasan/ai-from-university

🎓 Collection of academic resources, projects, and exercises related to artificial intelligence concepts learned in university coursework.

ai artificial-intelligence linear-regression logestic-regression mahcine-learning ml scikit-learn

Last synced: 13 Nov 2024

https://github.com/rayyan9477/data-driven-house-price-prediction-and-property-recommendation-app

The app leverages algorithms to accurately predict house prices and recommend similar properties based on a saved dataset through content-based filtering. It is tailored for homebuyers seeking their dream house and real estate investors looking for profitable opportunities, providing powerful insights and data-driven decision-making support.

data-science eda html machine-learning numpy pandas python scikit-learn

Last synced: 11 Nov 2024

https://github.com/aditya172926/text_summarization

Project to generate summaries and perform Named Entity Recognition from multiple types of text bodies.

glove machine-learning nlp python scikit-learn spacy

Last synced: 24 Nov 2024

https://github.com/guoshijiang/scikit-learn

Learn scikit-learn together, step by step

nlp-machine-learning scikit-learn

Last synced: 24 Nov 2024

https://github.com/upul/chocolate-quality-analysis

This repository contains a Jupyter notebook that describes how to use basic machine learning tools such as scikit-learn, pandas, and NumPy for building models.

machine-learning numpy pandas predictive-analytics scikit-learn

Last synced: 18 Nov 2024

https://github.com/asosnovsky/analyzing-blood-vessel-aneurysm

A few simple scripts to identify an aneurysm in a blood vessel (research projects)

machine-learning meanshift medical-image-processing scikit-learn

Last synced: 23 Nov 2024

https://github.com/spamfromaditya/drugs-consumption-prediction-model-eda-bagging-classifier

Drug consumption prediction models are like crystal balls for public health. By analyzing vast amounts of data, these models can identify individuals or communities at higher risk of drug use. They consider factors like demographics, social media activity, prescription history, and even economic indicators.

bagging-classifier machine-learning matplotlib numpy python scikit-learn

Last synced: 31 Dec 2024

https://github.com/aarryasutar/logistic_regression_on_age_prediction

This code evaluates the performance of a logistic regression model on age prediction using various features to predict a binary target variable, calculating metrics to determine the performance. It evaluates the comparison, identifies favorable features, and visualizes the ROC-AUC curve to determine the best model performance.

accuracy-score confusion-matrix f1-score feature-selection logistic-regression model-training numpy pandas precision recall rmse roc-auc-curve scikit-learn visualization

Last synced: 21 Dec 2024

https://github.com/khaymanii/multiple-disease-prediction-system

This system predicts whether a patient has heart disease, Parkinson's disease, or diabetes.

matplotlib numpy pandas python scikit-learn

Last synced: 20 Nov 2024

https://github.com/edikedik/eboruta

Flexible and transparent Python Boruta implementation

ensemble-models feature-selection machine-learning python scikit-learn

Last synced: 13 Oct 2024

https://github.com/shliakhovai/house-price-prediction

This repository contains a complete machine learning pipeline for predicting housing prices. It includes data preprocessing, feature engineering, and model training and evaluation components, designed to provide a robust solution for regression tasks.

data-science machine-learning matplotlib numpy pandas prediction python regression scikit-learn seaborn

Last synced: 21 Dec 2024

https://github.com/farrajota/kaggle_titanic

My solutions to the "Titanic: Machine Learning from Disaster" Kaggle competition

docker docker-compose kaggle kaggle-competition kaggle-titanic notebook pyspark python scikit-learn

Last synced: 17 Nov 2024

https://github.com/dinhanhx/determination

Scripts to set a global random seed for several machine learning frameworks

determination deterministic keras pytorch randomness scikit-learn tensorflow2

Last synced: 30 Nov 2024

https://github.com/lfenzo/ml-solar-sao-paulo

Implementation of a scientific project on the use of machine learning in solar radiation prediction

forecasting machine-learning python scikit-learn

Last synced: 17 Nov 2024

https://github.com/monzerdev/fake-news-detection

Project implementing machine learning models to detect fake news articles. Utilizes Deep Neural Networks, Support Vector Machines (SVM), and Ensemble methods (Random Forest). Developed using Python with scikit-learn, PyTorch, and nltk.

dnn fakenewsdetection machinelearning nlp nltk python pytorch random-forest scikit-learn svm

Last synced: 21 Dec 2024

https://github.com/oneapi-src/credit-card-fraud-detection

AI Starter Kit for Credit Card Fraud Detection model using Intel® Extension for Scikit-learn*

machine-learning scikit-learn

Last synced: 05 Nov 2024

https://github.com/kingabzpro/github-actions-for-machine-learning-beginners

A project on automating ML workflow using scikit-learn pipelines, CML, and GitHub actions.

cml github-actions machine-learning mlops scikit-learn

Last synced: 17 Nov 2024

https://github.com/ivanyu/kaggle-digit-recognizer

Kaggle's "Digit Recognizer" competition

kaggle keras machine-learning scikit-learn

Last synced: 06 Dec 2024

https://github.com/nekruzash/regression-correlation

From the CS2023 AI/DS/ML class: a model trained on different categories of data, using linear regression to identify the feature with the greatest effect on housing prices.

jupyter-notebook python scikit-learn

Last synced: 15 Nov 2024

https://github.com/kingabzpro/ml-workflow-orchestration-with-prefect

An introductory project to streamline the machine learning pipeline using Prefect and Discord Notifications, from data ingestion to model saving

discord mlops prefect scikit-learn

Last synced: 17 Nov 2024

https://github.com/sralter/happy_customers

Predicting whether a customer is happy based on the results from a survey.

eda ensemble-classifier hyperopt lazypredict ml scikit-learn

Last synced: 17 Nov 2024

https://github.com/jesly-joji/spam-ham-classifier

A spam/ham classifier built with the Naive Bayes algorithm and NLP text preprocessing techniques

naive-bayes-classifier nlp scikit-learn streamlit text-preprocessing

Last synced: 20 Dec 2024

https://github.com/sralter/classifire

Wildfire Prediction Model: Samuel Alter's BrainStation 2023 Data Science Capstone Project

qgis scikit-learn tensorflow

Last synced: 17 Nov 2024

https://github.com/vishal-038/attendance_by_face_recogination

This project is a face recognition-based attendance system that uses Python, OpenCV, Scikit-learn, Streamlit, and various other libraries like Pandas, Numpy, Datetime, and OS for different functionalities. It enables adding faces to the database, taking attendance based on face recognition, and showing live attendance through a web interface built

opencv python scikit-learn

Last synced: 17 Dec 2024

https://github.com/alisonmitchell/boston-housing

Investigation of the Boston housing dataset to evaluate, train and test a regression model to predict house prices.

data-science machine-learning matplotlib numpy pandas python scikit-learn scipy seaborn

Last synced: 15 Nov 2024

https://github.com/ahmedshahriar/telco-customer-churn-prediction-streamlit-app

This streamlit app predicts the churn rate using Gradient Boosting models (XGBoost, Catboost, LightGBM) on IBM Customer Churn Dataset

binary-classification binary-classifiers data-science jupyter-notebook machine-learning pandas python scikit-learn sklearn stacking-ensemble streamlit streamlit-webapp

Last synced: 16 Nov 2024

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, and asthma

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 15 Nov 2024

https://github.com/nirmalyabag20/crop-yield-prediction-using-machine-learning

This project uses machine learning to predict crop yields based on factors like region, crop type, rainfall, temperature, and pesticide use. By analyzing a dataset of over 28,000 records, the models provide accurate yield forecasts, helping optimize farming decisions and resource management, ultimately contributing to sustainable agriculture.

jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn

Last synced: 19 Dec 2024

https://github.com/kookmin-sw/capstone-2023-29

Any seats left? - A seat availability prediction system for Gyeonggi Province metropolitan buses

fastapi lstm postgresql python3 pytorch react scikit-learn sqlalchemy

Last synced: 13 Nov 2024

https://github.com/george-gca/ai_papers_search_tool

Automatic paper clustering and search tool using fastText from Facebook Research

fasttext fasttext-embeddings fasttext-python nlp python scikit-learn

Last synced: 14 Nov 2024

https://github.com/mrapp-ke/examplewisef1maximizer

A scikit-learn meta-estimator for multi-label classification that aims to maximize the example-wise F1 measure

machine-learning multilabel-classification scikit-learn

Last synced: 24 Dec 2024

https://github.com/mpolinowski/isometric-mapping

Non-linear dimensionality reduction through Isometric Mapping

isomap matplotlib-pyplot python scikit-learn

Last synced: 30 Nov 2024

https://github.com/rajikaimal/emma

:santa: Intelligent mention bot for GitHub organizations

bot emma machine-learning python scikit-learn

Last synced: 14 Dec 2024

https://github.com/saro0307/pre-doctor-ai-model

Pre-Doctor is an AI-driven health advisor using scikit-learn, offering quick medical advice based on user-input symptoms, making healthcare accessible and user-friendly. Utilizing Flask and pyttsx3, it seamlessly integrates machine learning for informed well-being.

artificial-intelligence css flask generative-ai generative-model html machine-learning python reinforcement-learning scikit-learn

Last synced: 13 Nov 2024

https://github.com/bhimrazy/iris-species-prediction-using-decision-tree-algorithm-grip

Iris Species Intelligence: Classifying Iris Species with Confidence using Decision Trees | The Sparks Foundation: GRIP

decision-tree-classifier fastapi gripjan23 machine-learning python scikit-learn sparkfoundation

Last synced: 16 Nov 2024

https://github.com/somjit101/nlp-casestudy-quora-question-similarity

An application of NLP and classical ML algorithms to an interesting real-world use case of predicting similarity between two questions on Quora. This allows the platform to combine similar questions into one and combine their answers to avoid duplication and unnecessary confusion.

cross-validation feature-engineering feature-extraction gradient-boosting kaggle logistic-regression machine-learning model-calibration natural-language-processing nlp quora-question-pairs scikit-learn svm text-mining xgboost

Last synced: 16 Nov 2024

https://github.com/siam29/exploring-explainable-ai-demystifying-dt-rf-knn-xgbc

Implemented XAI techniques to enhance transparency in fraud detection models. I employed techniques such as SHAP, LIME on DT, RF, XGBC, and KNN to offer lucid explanations for transactions that were flagged.

machine-learning matplotlib pandas scikit-learn xai

Last synced: 06 Dec 2024

https://github.com/corentinth/ml-gender_classification

[Machine Learning] The Hello World of Machine Learning using sklearn

body-metrics gender-classification machine-learning scikit-learn

Last synced: 20 Dec 2024

https://github.com/kingabzpro/mlops-with-jenkins

From data ingestion to deploying the model using Jenkins.

classification fastapi jenkins mlops scikit-learn

Last synced: 13 Oct 2024

https://github.com/fohlen/stats-experiment

A tiny stats experiment with GENESIS data

matplotlib python3 scikit-learn

Last synced: 22 Nov 2024