Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


scikit-learn

scikit-learn is a widely used Python module for classic machine learning. It is built on top of SciPy.

https://github.com/grachale/predict_titanik

Predicting the survival of Titanic passengers (binary classification) using a decision tree and KNN from scikit-learn.

classification decision-tree-classifier knn-classifier matplotlib pandas python scikit-learn titanic-survival-prediction

Last synced: 13 Jan 2025

https://github.com/ghufranbarcha/codsoft-machine-learning-internship

This repository contains all the Machine Learning & NLP tasks completed during my internship at Codsoft.

jupyter-notebook machinelearning nlp nltk python scikit-learn

Last synced: 02 Dec 2024

https://github.com/mehmoodulhaq570/machine-learning-models

A repository of machine learning models for predicting future instances. More specifically, this repository is a machine learning course for those interested in learning the basics of machine learning algorithms.

decision-trees gradient-descent gradient-descent-algorithm knn-algorithm linear-regression linear-regression-models logistic-regression-algorithm machine-learning-algorithms machine-learning-models ml naive-bayes-algorithm one-hot-encoding pca python random-forest-classifier scikit-learn svm-model

Last synced: 22 Dec 2024

https://github.com/grachale/predict_pass_exam

Creating an AdaBoost classifier with decision trees to predict whether a student will pass or fail an exam (classification), based on the number of study hours and the student's score in the previous exam.

adaboost cross-validation decision-tree jupyter-notebook matplotlib python scikit-learn seaborn

Last synced: 13 Jan 2025
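
A minimal sketch of the technique this entry names, AdaBoost over decision trees for a pass/fail label. It is not taken from the repository: the data is synthetic, and the labelling rule, feature ranges, and variable names are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# Synthetic data (hypothetical): study hours and previous exam score.
rng = np.random.default_rng(0)
hours = rng.uniform(0, 10, size=200)
prev_score = rng.uniform(30, 100, size=200)
X = np.column_stack([hours, prev_score])

# Made-up labelling rule: pass (1) when a weighted sum crosses a threshold.
y = (5 * hours + 0.5 * prev_score > 55).astype(int)

# By default AdaBoostClassifier boosts depth-1 decision trees (stumps).
model = AdaBoostClassifier(n_estimators=50, random_state=0)

# 5-fold cross-validated accuracy, then a fit on all data for predictions.
scores = cross_val_score(model, X, y, cv=5)
model.fit(X, y)
pred = model.predict([[9.0, 95.0], [0.5, 35.0]])
```

Here `scores` holds one accuracy value per fold, and `pred` gives the predicted label for a heavily prepared student and a barely prepared one.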

https://github.com/gmontamat/quora-question-pairs

Code for the Kaggle competition "Quora Question Pairs"

kaggle-competition quora-question-pairs scikit-learn spell-checker xgboost

Last synced: 30 Oct 2024

https://github.com/aurelienmorgan/online_retail_growth

Laying some of the "Know Your Customers" / "Know Your Markets" foundations for strategizing on how to grow a business (from insights dug out of the business data).

customer-journey customer-segmentation dashboard elasticsearch elk growth-marketing jupyter-notebook kibana lifetime-value prediction python ranking scikit-learn scoring

Last synced: 15 Dec 2024

https://github.com/aurelienmorgan/french_text_sentiment

Sentiment analysis of texts written in French using TensorFlow/Keras (and using XGBoost for hyperparameter optimization)

beautifulsoup dask fasttext french gru hyperparameters-optimization jupyter-notebook keras multiprocessing nlp python rnn scikit-learn sentiment-analysis tensorflow transfer-learning web-scraping xgboost

Last synced: 15 Dec 2024

https://github.com/uhstray-io/pyrizon

Data Collection, Analysis, Mapping, Pipelining & Transformation, & API using Python

api data-engineering etl numpy pandas plotly python pytorch raw-data scikit-learn seaborne sql sqlite tensorflow

Last synced: 08 Dec 2024

https://github.com/grampers-dev/co2oracle

The CO2 Oracle project uses machine learning and AI to analyze and predict CO2 emissions for environmental management. Using a Kaggle dataset, it demonstrates predictive analytics to understand and forecast emissions. Written in Python, it employs libraries like Pandas, NumPy, and Scikit-Learn.

artificial-intelligence machine-learning numpy pandas python scikit-learn

Last synced: 10 Oct 2024

https://github.com/2003harsh/house-price-prediction-using-machine-learning

This project features a web app that predicts house prices using a linear regression model. Users can input details like location, square footage, bathrooms, and bedrooms through an HTML form. I've added a CI/CD pipeline with GitHub Actions, unit testing with pytest, and automated Docker containerization to improve deployment and robustness.

ci-cd data-analysis docker-image flask linear-regression machine-learning matplotlib mlops-workflow requests scikit-learn

Last synced: 10 Oct 2024

https://github.com/grachale/predict_life_expect

Predicting life expectancy (regression) using a custom random forest, linear regression, and a decision tree regressor from scikit-learn.

decision-tree-regression jupyter-notebook linear-regression pandas python random-forest regression scikit-learn

Last synced: 13 Jan 2025

https://github.com/ultrasage-danz/scikit-learn-ml

Machine Learning with scikit-learn by Data School

ai data data-school machine-learning macos ml scikit-learn ultrasage-dan

Last synced: 02 Dec 2024

https://github.com/aakanksha1406/fake-news-classifier

Identifies when an article might be fake news.

keras lstm lstm-neural-networks nltk python scikit-learn tensorflow

Last synced: 10 Oct 2024

https://github.com/bacross/datamunger

Python package for handling NaNs and outliers

data data-frame datamunger knn nan outliers python scikit-learn

Last synced: 08 Dec 2024

https://github.com/vidhi1290/text-classification-model-with-attention-mechanism-nlp

This Python project utilizes PyTorch to perform text classification with an attention mechanism. Pre-trained GloVe embeddings are processed for word representation, and a custom attention model is trained on consumer complaint data to categorize complaints into product categories.🎯

attention-mechanism deeplearning machine-learning nlp nltk numpy pandas python pytorch scikit-learn text-classification tqdm

Last synced: 08 Dec 2024

https://github.com/adzialocha/notebook

Jupyter notebooks for random experiments with audio processing, data analysis and machine learning

jupyter-notebook keras learning librosa music21 scikit-learn

Last synced: 22 Dec 2024

https://github.com/vidhi1290/zomato-data-analysis

Zomato Data Analysis - Explore the world of Zomato restaurant data through Python and data analysis. Uncover trends and insights using Pandas for data manipulation and Matplotlib for visualization. Join us in this journey to reveal the hidden stories within the data!

data-analysis data-analysis-python data-science data-visualization dataprocessing machine-learning machine-learning-algorithms matplotlib numpy pandas python scikit-learn zomato-data-analysis

Last synced: 08 Dec 2024

https://github.com/aryank1511/wattwise

WattWise is an innovative energy-saving app that uses an Arduino-powered device to monitor and predict household electricity usage and bills in real-time.

arduino docker flask machine-learning mqtt nextjs scikit-learn

Last synced: 10 Oct 2024

https://github.com/bhuvaneshwarguttula/student-performance-indicator

To understand and predict how the student's performance (test scores) is affected by the other variables (Gender, Ethnicity, Parental level of education, Lunch, Test preparation course).

exploratory-data-analysis machine-learning pandas python scikit-learn student-performance-analysis

Last synced: 10 Oct 2024

https://github.com/soumya6tiwari/customer-segmentation-using-rfm-analysis

This project focuses on customer segmentation using RFM (Recency, Frequency, Monetary) analysis and K-Means clustering. It enables businesses to identify high-value customers, optimize marketing strategies, and improve customer retention through data-driven insights.

backend clustering flask frontend kmeans-clustering matplotlib numpy pandas python rfm-analysis scikit-learn unsupervised-learning

Last synced: 21 Dec 2024
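
To make the RFM plus K-Means idea concrete, here is a small sketch under stated assumptions. The transaction table, its column names (`customer_id`, `date`, `amount`), and the choice of two clusters are hypothetical and are not drawn from the repository.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical transaction log: one row per purchase.
tx = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 4, 4, 4, 4],
    "date": pd.to_datetime([
        "2024-01-05", "2024-02-10", "2024-03-01", "2023-06-01", "2023-07-15",
        "2023-01-20", "2024-02-20", "2024-02-25", "2024-03-02", "2024-03-05",
    ]),
    "amount": [50.0, 70.0, 65.0, 20.0, 25.0, 10.0, 200.0, 150.0, 180.0, 220.0],
})

# Reference date for recency: the day after the last recorded purchase.
snapshot = tx["date"].max() + pd.Timedelta(days=1)

# Recency = days since last purchase, Frequency = purchase count,
# Monetary = total spend, all computed per customer.
rfm = tx.groupby("customer_id").agg(
    recency=("date", lambda d: (snapshot - d.max()).days),
    frequency=("date", "count"),
    monetary=("amount", "sum"),
)

# Standardize so no single RFM dimension dominates the distance metric,
# then assign each customer to one of two clusters.
scaled = StandardScaler().fit_transform(rfm)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
rfm["segment"] = kmeans.fit_predict(scaled)
```

The cluster ids in `segment` are arbitrary labels; interpreting a cluster as "high value" would require inspecting its mean recency, frequency, and monetary values.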

https://github.com/elcorto/gp_playground

Explore selected topics related to Gaussian processes

gaussian-processes gpy gpytorch kernel-ridge-regression machine-learning scikit-learn tinygp

Last synced: 28 Nov 2024

https://github.com/sandeepbalachandran/predictor

A collection of prediction algorithms for different purposes

collection jupyter-notebook machine-learning notebook predictor regression-models scikit-learn

Last synced: 03 Dec 2024

https://github.com/prithivsakthiur/data-board

Data Boards - Visualization of various plots (Analysis)

data-analysis gradio huggingface keras mathplotlib pandas plots pyplot scikit-learn seaborn spaces

Last synced: 21 Dec 2024

https://github.com/vectominist/mednlp

Mandarin Medical Dialogue Analysis with PyTorch.

dialog huggingface mandarin medical pytorch scikit-learn transformers

Last synced: 02 Dec 2024

https://github.com/zen204/airbnb_availability

A machine learning model that predicts Airbnb listing availability, utilizing feature engineering and supervised learning techniques to improve guest experience and optimize host management.

binary-classification data-analysis data-preprocessing data-visualization feature-engineering machine-learning matplotlib model-evaluation nlp pandas predictive-modeling python scikit-learn seaborn supervised-learning

Last synced: 03 Nov 2024

https://github.com/gappeah/income-prediction-ml

This is a machine learning project aimed at predicting whether an individual's annual income exceeds $50,000 based on their demographic and personal information.

data data-science machine-learning ml numpy pandas python random-forest scikit-learn

Last synced: 10 Oct 2024

https://github.com/mg380/ibm-applied-data-science-capstone

This capstone is the 10th (final) course in the IBM Data Science Professional Certificate specialization; it summarises, in the form of a project, all the material learned during the specialization

capstone data data-analysis data-science datascience ibm machine-learning plotly python scikit-learn sql

Last synced: 10 Oct 2024

https://github.com/juselara1/bregclus

Python implementation of Bregman Hard Clustering and Bregman Soft Clustering as a scikit-learn module.

bregman-divergence clustering numpy scikit-learn unsupervised-learning

Last synced: 31 Dec 2024

https://github.com/prajwalsinha/unveiling-climate-change-dynamics-through-earth-surface-temperature-analysis

Climate change analysis through global surface temperature data. Includes data preprocessing, statistical analysis, visualizations, and forecasting. Python-based project using Pandas, Matplotlib, and Scikit-learn.

data dataanalysis dynamic-mapping pyplot python scikit-learn seaborn

Last synced: 21 Dec 2024

https://github.com/tsu2000/audit_risk

Machine learning web app in Streamlit about classifying fraudulent companies using various classification models.

machine-learning plotly python random-forest scikit-learn streamlit-webapp

Last synced: 23 Jan 2025

https://github.com/uea-geral/rna-perceptron-exercise

🤖 Artificial Neural Networks (RNA) course: training a Perceptron neuron.

jupyter-notebook neural-network numpy perceptron python scikit-learn

Last synced: 26 Nov 2024

https://github.com/offchan42/thai-thesis-classification

Classify each document in the corpus using the Python machine learning module scikit-learn

nlp python python2 scikit-learn segment thai thai-language thai-thesis-classification

Last synced: 29 Oct 2024

https://github.com/md-emon-hasan/ml-project-car-price-prediction

🚗 End-to-end ML project for predicting car prices based on various features. Includes data preprocessing, model training, and a Flask web app for predictions.

car-price-prediction car-price-predictor data-science feature-engineering ml predictive-modeling scikit-learn

Last synced: 10 Oct 2024

https://github.com/jersongb22/datascience_ibm_stockpredictionlstm_project

In the IBM Advanced Data Science specialization, an interactive real-time web application was developed using LSTM networks in TensorFlow to predict stock market trends for global companies.

apache-spark data-science deep-learning lstm-neural-networks machine machine-learning plotly python scikit-learn streamlit tensorflow

Last synced: 25 Nov 2024

https://github.com/palak-463/tablataalrecognitionsystem

Software built using Python which makes use of CNN and FNN to detect the Taals of the Tabla, an Indian classical music instrument. 🎛️

cnn deep-learning flask fnn librosa numpy os pickle python scikit-learn

Last synced: 09 Jan 2025

https://github.com/sethios-notebook/__ia_learnig__

A French-language Python course focused on machine learning. Learn Python in 30 files that cover NumPy, Pandas, Matplotlib, SciPy, Sklearn, Seaborn, H5py, and many other techniques. Python is the language of choice for machine learning, deep learning, and data science.

iac machine-learning matplotlib numpy python scikit-learn

Last synced: 21 Dec 2024

https://github.com/rickiepark/ml-ko

Repository of Korean translations of machine learning and deep learning material

deep-learning keras machine-learning python scikit-learn tensorflow

Last synced: 21 Dec 2024

https://github.com/rickiepark/ml-with-python-cookbook-2nd

<Solving ML Problems in Practice with Python> (Korean book title: 실무로 통하는 ML 문제 해결 with 파이썬)

deep-learning machie-learning pytorch scikit-learn

Last synced: 21 Dec 2024

https://github.com/troublem1/mle

MultiLabel-Transformer (MLE) is an extended version of LabelEncoder that encodes multiple categorical columns to numeric values in any workflow or pipeline

packages python3 scikit-learn sklearn

Last synced: 01 Dec 2024

https://github.com/ishanoshada/matplot3dex

A Matplotlib 3D Extension package for enhanced data visualization

data data-science matplotlib python-packages scikit-learn

Last synced: 01 Dec 2024

https://github.com/magnuss0/movie-rec-system

The project extracts movie data using TheMovieDB API, processes it using TF-IDF and cosine similarity for generating recommendations, and stores the data in a DuckDB database. The system is encapsulated within a FastAPI web application and can be deployed using Docker. It provides movie recommendations in JSON format.

cosine-similarity docker duckdb movies-recommendation moviesdb-api ploomber poetry-python scikit-learn streamlit tf-idf

Last synced: 25 Nov 2024
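
The TF-IDF and cosine-similarity step described in this entry can be sketched as follows. This is an illustration only: the movie titles, overviews, and the `recommend` helper are invented here, and the API calls, DuckDB storage, and FastAPI layer are left out.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical catalogue: title -> short plot overview.
movies = {
    "Space Quest": "astronauts explore a distant planet in space",
    "Galaxy Run": "a pilot escapes across space to a distant galaxy",
    "Baking Day": "a pastry chef opens a bakery in a small town",
}
titles = list(movies)

# Turn each overview into a TF-IDF vector, dropping English stop words.
tfidf = TfidfVectorizer(stop_words="english").fit_transform(list(movies.values()))

# Pairwise cosine similarity between all overview vectors.
sim = cosine_similarity(tfidf)


def recommend(title, k=1):
    """Return the k titles most similar to `title` (hypothetical helper)."""
    i = titles.index(title)
    ranked = sim[i].argsort()[::-1]  # indices sorted by descending similarity
    return [titles[j] for j in ranked if j != i][:k]
```

For example, `recommend("Space Quest")` returns the one title whose overview shares the most weighted vocabulary with it.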

https://github.com/h-fuzzy-logic/python-finding-nsf-award-themes

Using NLP to find themes and concepts in NSF Awards

nltk pandas python scikit-learn

Last synced: 15 Dec 2024

https://github.com/dustinmichels/bayesian-values-guesser

Uses some user input, data from the World Values Survey <www.worldvaluessurvey.org>, and Bayes Rule to guess a number of beliefs the user might have. STATUS: In progress.

bayes-rule bayesian-values-guesser naive-bayes-classifier pandas python scikit-learn values-survey

Last synced: 14 Dec 2024

https://github.com/aahnik/gdsc-ml-ds-bootcamp-2023

This repo contains files given by my seniors as well as assignments and final project done by me during the bootcamp.

data-science machine-learning ml numpy pandas python3 scikit-learn

Last synced: 11 Oct 2024

https://github.com/spamfromaditya/drugs-consumption-prediction-model-eda-bagging-classifier

Drug consumption prediction models are like crystal balls for public health. By analyzing vast amounts of data, these models can identify individuals or communities at higher risk of drug use. They consider factors like demographics, social media activity, prescription history, and even economic indicators.

bagging-classifier machine-learning matplotlib numpy python scikit-learn

Last synced: 31 Dec 2024

https://github.com/canayter/unsupervised-machine-learning

Utilizing Python and unsupervised learning to predict if cryptocurrencies are affected by 24-hour or 7-day price changes.

k-means-clustering python scikit-learn unsupervised-machine-learning

Last synced: 09 Jan 2025

https://github.com/ayushshahh/fespn

A neural network made to predict final exam scores of students

mlp mlp-regressor multilayer-perceptron neural-network prediction-model scikit-learn

Last synced: 07 Dec 2024

https://github.com/francescopaolol/favoritatimeseriesforecasting

See: https://www.kaggle.com/competitions/store-sales-time-series-forecasting

jupyter-notebook kaggle-competition machine-learning pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/swimshahriar/heart-attack-prediction

Heart attack prediction from 13 features.

jupyter-notebook pandas python3 scikit-learn

Last synced: 20 Dec 2024

https://github.com/f-aguzzi/chemfusekit

Chemometrics library for data fusion, model training and prediction of data from multiple sensor sources.

chemometrics datafusion knn lda pca plsda scikit-learn svm

Last synced: 01 Jan 2025

https://github.com/francescopaolol/titaniccompetition

My first Kaggle competition, about predicting survival on the Titanic and getting familiar with ML basics

jupyter-notebook kaggle-competition machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/shaadclt/data-preprocessing-pipeline

This project contains a data preprocessing pipeline implemented in Python using the pandas and numpy libraries. The pipeline handles missing values, outliers, and normalizes numeric features in a dataset.

numpy pandas scikit-learn

Last synced: 07 Dec 2024
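
One way the three steps named in this entry (missing values, outliers, normalization) might look with pandas and NumPy. The specific choices here, median imputation, clipping to 1.5 × IQR fences, and min-max scaling, are assumptions for illustration; the entry does not say which methods the repository uses, and `preprocess` is a hypothetical function name.

```python
import numpy as np
import pandas as pd


def preprocess(df):
    """Impute, clip outliers, and min-max scale numeric columns (sketch)."""
    out = df.copy()
    num = out.select_dtypes(include="number").columns

    # 1. Missing values: fill with each column's median (assumed strategy).
    out[num] = out[num].fillna(out[num].median())

    # 2. Outliers: clip to the 1.5 * IQR fences (assumed strategy).
    q1, q3 = out[num].quantile(0.25), out[num].quantile(0.75)
    iqr = q3 - q1
    out[num] = out[num].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr, axis=1)

    # 3. Normalization: min-max scale to [0, 1]; constant columns map to 0.
    lo = out[num].min()
    span = (out[num].max() - lo).replace(0, 1)
    out[num] = (out[num] - lo) / span
    return out


# Hypothetical input with one missing value and one extreme value.
raw = pd.DataFrame({
    "age": [25, 30, np.nan, 35, 400],
    "city": ["a", "b", "c", "d", "e"],
})
clean = preprocess(raw)
```

Non-numeric columns such as `city` pass through unchanged, and the input frame is left unmodified because the function works on a copy.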

https://github.com/nemeslaszlo/heart-disease

Heart disease classification project with different models (LogisticRegression, KNeighborsClassifier, RandomForestClassifier) and detailed reports.

classification knearest-neighbor-classifier logistic-regression mathplotlib numpy pandas randomforest-classification scikit-learn seaborn

Last synced: 01 Dec 2024

https://github.com/francescopaolol/decisiontree

Classifying iris plants into three species using this classic dataset

decision-tree-classifier jupyter-notebook kaggle machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/khaymanii/loan_status_predictor

This is a loan status predictor model built using the Support Vector Machine algorithm and the Gradio UI library

matplotlib numpy pandas python scikit-learn

Last synced: 21 Jan 2025

https://github.com/khaymanii/movie-recommendation-model

This is a model built using Python and the cosine similarity algorithm

matplotlib numpy pandas python scikit-learn

Last synced: 21 Jan 2025

https://github.com/rusiruchapana/blood-group-prediction

Creating a machine learning project to predict blood groups from fingerprint patterns

asp-net-web-api keras matplotlib numpy opencv-python pandas pillow scikit-learn tensorflow

Last synced: 21 Jan 2025

https://github.com/tharindanimnajith/deep-learning-spam-detection

Deep Learning classifiers to detect spam SMS messages - LSTM Model, DenseNet CNN Models - NLP, Python, Jupyter Notebook, Tensorflow, Keras, Numpy, Pandas, Matplotlib, Scikit-Learn

deep-learning densenet keras lstm nlp python3 scikit-learn tensorflow

Last synced: 25 Nov 2024

https://github.com/jdavydovportfolio/careerpredictor

A project leveraging AI and Machine Learning (Logistic Regression) to predict graduate job placements. Includes data preprocessing, exploratory analysis, and predictive modeling.

artificial-intelligence exploratory-data-analysis jupyter-notebook linear-regression logistic-regression machine-learning machine-learning-algorithms machine-learning-models matplotlib ml numpy pandas pandas-dataframe predictive-modeling programming python scikit-learn

Last synced: 07 Dec 2024

https://github.com/nemeslaszlo/sale-price-of-bulldozers

The goal is to predict the sale price of bulldozers: how well can we predict the future sale price of a bulldozer, given its characteristics and previous examples of how much similar bulldozers have sold for? (Archived Kaggle competition)

matplotlib numpy pandas random-forest-regressor regression scikit-learn seaborn

Last synced: 01 Dec 2024

https://github.com/francescopaolol/logisticregression

Predicting survival on the Titanic and getting familiar with ML basics

jupyter-notebook kaggle logistic-regression machine-learning ml pandas scikit-learn

Last synced: 22 Dec 2024

https://github.com/fohlen/stats-experiment

A tiny stats experiment with GENESIS data

matplotlib python3 scikit-learn

Last synced: 23 Jan 2025

https://github.com/nafisalawalidris/logistic-regression-model-for-breast-cancer-recurrence-prediction

Predicting Breast Cancer Recurrence - A logistic regression model using patient attributes to classify recurrence risk. Dataset analysis and model evaluation. Contributions welcome.

breast-cancer classification-model data-analysis data-science healthcare logistic-regression machine-learning python recurrence-prediction scikit-learn

Last synced: 23 Jan 2025

https://github.com/francescopaolol/sentimentanalysis

Sentiment analysis on the IMDB Dataset of 50K Movie Reviews

jupyter-notebook kaggle machine-learning ml pandas scikit-learn sentiment-analysis

Last synced: 22 Dec 2024

https://github.com/alessiochen/setiment-analysis-ai-project

Application of sentiment analysis for the Artificial Intelligence class at UNIFI

ai andrew dataset movie-reviews scikit-learn sentiment-analysis

Last synced: 05 Jan 2025

https://github.com/nirmalyabag20/diabetes-prediction-using-machine-learning

This project focuses on predicting diabetes using machine learning algorithms based on health metrics like glucose levels, blood pressure, and BMI. By comparing different models, the goal is to identify the most accurate approach for early diabetes detection, showcasing the potential of machine learning in healthcare.

decision-tree-classifier jupyter-notebook kneighborsclassifier logistic-regression matplotlib numpy pandas python random-forest-classifier scikit-learn seaborn svc

Last synced: 19 Dec 2024

https://github.com/aditya172926/text_summarization

Project to generate summaries and perform Named Entity Recognition from multiple types of text bodies.

glove machine-learning nlp python scikit-learn spacy

Last synced: 24 Nov 2024

https://github.com/chanioxaris/german-credit-data

Experimental classification algorithms on German credit data, implemented using the scikit-learn library

classification classifier cross-validation dataset information-entropy information-gain naive-bayes prediction random-forest scikit-learn support-vector-machines

Last synced: 02 Nov 2024

https://github.com/ccharlesss/financeml

A machine learning web application using Python's FastAPI and scikit-learn to predict S&P 500 stock price trends and cluster stocks based on average annual returns and volatility. Utilised the MVC design pattern to structure the application effectively. Implemented a decision tree classifier with 84% accuracy.

cicd docker fastapi finance javascript jenkins machine-learning restful-api scikit-learn webapplication

Last synced: 13 Jan 2025

https://github.com/guoshijiang/scikit-learn

Learn scikit-learn together with us

nlp-machine-learning scikit-learn

Last synced: 24 Nov 2024

https://github.com/ax-va/numpy-pandas-matplotlib-scikit-learn-vanderplas-2023

These examples provide an introduction to Data Science and classic Machine Learning using NumPy, Pandas, Matplotlib, and scikit-learn. They are taken, with some changes, from the book "Python Data Science Handbook: Essential Tools for Working with Data", Second Edition, written by Jake VanderPlas and published by O'Reilly Media in 2023.

ax-va classic-machine-learning data-science machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 18 Nov 2024