An open API service indexing awesome lists of open source software.

NumPy

NumPy is an open source library for the Python programming language, adding support for large, multidimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
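As a brief illustration of the array support and mathematical functions described above, here is a minimal sketch. The array values are made up for the example; the calls used (`np.array`, `np.sqrt`, the `@` operator, `ndarray.sum`) are standard NumPy.

```python
import numpy as np

# A 2-D array (matrix) built from nested lists; values are illustrative only.
a = np.array([[1.0, 2.0], [3.0, 4.0]])

# A high-level math function applied elementwise to the whole array.
roots = np.sqrt(a)

# Matrix product and a reduction along the first axis (column sums).
product = a @ a
column_sums = a.sum(axis=0)

print(a.shape)      # (2, 2)
print(product)      # [[ 7. 10.] [15. 22.]]
print(column_sums)  # [4. 6.]
```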

https://github.com/rlxchap2/crypto-miner

🔨Crypto Miner is a Python project designed to encrypt and decrypt files, especially images, using the powerful cryptography library

crypto cryptography csv numpy pillow python

Last synced: 08 May 2026

https://github.com/adityakumarda/kmeans-web-analytics

Built with Python, Pandas, and Scikit-learn, this machine learning project uses K-Means to cluster website users by behavior. It reveals patterns in engagement and bounce, helping drive data-informed decisions.

cluster-analysis elbow-curves elbow-method elbow-plot jupyter-notebook kmeans-clustering machine-learning matplotlib numpy pandas python python3 relationship scikit-learn seaborn sklearn

Last synced: 10 Apr 2026

https://github.com/dineshdhamodharan24/amazon-reviews-sentiment-analysis

This is a sentiment analysis project that classifies Amazon product reviews as positive or negative using machine learning techniques.

matplotlib numpy pandas python scikit-learn

Last synced: 10 Apr 2026

https://github.com/broodhoney/heart-disease-prediction

This is a machine learning project with a trained model that classifies whether or not a patient has heart disease.

kaggle-dataset matplotlib numpy pandas python scikit-learn scikitlearn-machine-learning uci

Last synced: 10 Apr 2026

https://github.com/smirnovlad/data-science-notebooks

A collection of various data analysis approaches

data-science deep-learning kaggle machine-learning numpy pandas pytorch

Last synced: 10 Apr 2026

https://github.com/ameykasbe/credit-card-fraud-detection-on-imbalanced-dataset

Examined data preprocessing techniques and the performance of six different predictive models in Python, applied to the credit card fraud detection problem on an imbalanced dataset. Algorithms implemented: Logistic Regression, K Nearest Neighbours, Support Vector Classification, Naïve Bayes Classifier, Decision Tree Classifier, and Random Forest Classifier.

classification machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/anshpg/linearluminary

Greetings! I've developed a straightforward linear regression model from scratch to predict house prices in Bangalore. But before delving into coding, let me walk you through the algorithm's conceptualization. I considered various factors such as location, ocean proximity, plot size, finished state, and flat type.

algo linea mathematics matplotlib numpy pandas pyth

Last synced: 13 May 2026
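The linear regression "from scratch" idea in the entry above can be sketched with batch gradient descent on mean squared error. This is not the repository's code: the function name `fit_linear_regression`, the learning rate, the epoch count, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_regression(X, y, lr=0.1, epochs=2000):
    """Fit y ~ X @ w + b by batch gradient descent on mean squared error."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    b = 0.0
    for _ in range(epochs):
        error = X @ w + b - y                          # residuals for all samples
        w -= lr * (2.0 / n_samples) * (X.T @ error)    # gradient step for weights
        b -= lr * (2.0 / n_samples) * error.sum()      # gradient step for intercept
    return w, b

# Synthetic, noise-free data (hypothetical): y = 3*x0 - 2*x1 + 5
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

w, b = fit_linear_regression(X, y)
print(np.round(w, 3), round(b, 3))
```

On real housing data the features would typically need scaling first, since gradient descent with a fixed learning rate is sensitive to feature magnitudes.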

https://github.com/jkosla/neural_network_from_scratch_numpy

Neural Network From Scratch in Python | Build a simple neural network from scratch using pure Python and NumPy. Learn about forward propagation, backpropagation, and training with gradient descent. Accompanies my Medium article.

ai aritificial-intelligence medium nerual-networks numpy python3 tutorial

Last synced: 10 Apr 2026

https://github.com/munawar-code/car_price_predictor

This project is an ML-based car price prediction system. The model is built using Jupyter Notebook for training and evaluation, while a simple one-page website was developed using PyCharm to provide an interface for users to input car details and get price predictions.

datapreprocessing datavisualization exploratory-data-analysis feature-engineering flask-application html-css-javascript linear-regression machine-learning-algorithms matplotlib numpy pandas python scikitlearn-machine-learning

Last synced: 13 Apr 2026

https://github.com/ryan-bendelson/2024-summer-research

This is Python code that I worked with during my summer 2024 research project involving quantum physics.

density-matrices kronecker-product linear-algebra miniconda3 numpy numpy-arrays partial-trace python quantum-information

Last synced: 16 Apr 2026

https://github.com/aneeshmurali-n/project-ml-data-preprocessing

The main objective of this project is to design and implement a robust data preprocessing system that addresses common challenges such as missing values, outliers, inconsistent formatting, and noise. By performing effective data preprocessing, the project aims to enhance the quality, reliability, and usefulness of the data for machine learning.

data-analysis data-cleaning data-encoding data-exploration feature-scaling label-encoding matplotlib minmaxscaler numpy one-hot-encoding outlier-detection pandas standardscaler

Last synced: 02 May 2026

https://github.com/khaymanii/spam_mail_detection_model

This model was built using Python and the Logistic Regression algorithm.

matplotlib numpy pandas python sckiit-learn

Last synced: 10 Apr 2026

https://github.com/anubhavkumar31/simple-heart_disease_prediction-using-logisticregression

It's a simple yet good model that predicts whether or not a person has heart disease. This is a binary classification model, i.e. its output is either 0 (does not have heart disease) or 1 (has heart disease).

logistic-regression machine-learning numpy python sklearn sklearn-linear-model sklearn-metrics

Last synced: 10 Apr 2026

https://github.com/semihbugrasezer/rockvsmine

Rock vs Mine Prediction with Python | Machine Learning Project

numpy pandas python

Last synced: 05 May 2026

https://github.com/babagata/racunalna_fizika

Math and physics solved with Python

matplotlib numpy random scipy sympy

Last synced: 10 Apr 2026

https://github.com/badranalyst/titanic-survival-prediction-full-data-science-project-classification

This project predicts Titanic survivors using classification models. It includes data cleaning, pre-processing, exploratory data analysis (EDA), categorical feature conversion, model building, and evaluation. Python libraries like Pandas, NumPy, Matplotlib, and Seaborn are used to analyze and predict survival outcomes.

classification data-analysis data-science eda exploratory-data-analysis machine-learning matplo matplotlib-pyplot ml model numpy pandas predictive-modeling python seaborn

Last synced: 06 May 2026

https://github.com/elon-fask/nlp_num1

Natural Language Processing with Disaster Tweets

ai machine-learning nlp nlp-machine-learning numpy pandas python text-processing

Last synced: 10 Apr 2026

https://github.com/ahmedabdalkreem/connected_component_labeling

A technique used to detect small objects in an image, such as shapes and numbers. It can be used in OCR.

computer-vision connected-components matplotlib numpy object-detection python rgb2gray threshold

Last synced: 11 Apr 2026

https://github.com/ahmedabdalkreem/hotel-reservation

Our task is to classify a hotel reservation as either canceled (class 1) or not canceled (class 0), using more than one model to arrive at the best model.

bagging decisiontreeclassifier ensemble extra-trees-classifier logistic-regression matplotlib numpy pandas python3 random-forest sklearn-library svc-model

Last synced: 11 Apr 2026

https://github.com/alejoduarte23/reading_data_from_dewesoft

The following repository retrieves sensor data (acceleration and strains) from both local and cloud databases. It processes the data using classes from another repository called Modal Engine for spectral analysis, modal analysis, and signal processing.

dewesoft matplotlib modal-analysis numpy orm scipy signal-processing sql sqlalchemy

Last synced: 07 Jan 2026

https://github.com/asghar-rizvi/youtube-statistics-project

This project analyzes a dataset of global YouTube statistics to uncover insights about YouTube channels, their ranks, and other attributes. The dataset used for this analysis was obtained from Kaggle.

data-analysis data-analysis-python data-science data-science-projects matplotlib numpy pandas pycharm-ide python seaborn

Last synced: 13 Jun 2026

https://github.com/tsungtsetu122/datamining-cifar10-classification

Data mining project on CIFAR-10 extracted features, applying preprocessing, classification models, and evaluation techniques to improve classification performance.

matplotlib numpy pandas python scikit-learn

Last synced: 10 Apr 2026

https://github.com/hussain-7/emotion_detection-master

Human Emotion Analysis using facial expressions in real-time from webcam feed. Based on the dataset from Kaggle's Facial Emotion Recognition Challenge.

keras-tensorflow matplotlib numpy opencv-python tensorflow

Last synced: 08 May 2026

https://github.com/sc0v0ne/ai-discipline-work

AI Discipline Work - Movie recommendation

jupyter-notebook machine-learning numpy pandas python python3

Last synced: 15 Apr 2025

https://github.com/soumyapro/wine-quality-prediction

This project is about the prediction of wine quality using machine learning algorithms

boxplot matplotlib numpy pandas random-forest smote

Last synced: 10 Apr 2026

https://github.com/mnitin-reddy/collaborative-filtering-based-recommendation-system

This project is a Book Recommendation System that uses two main approaches: Popularity-Based and Collaborative Filtering. It recommends top books based on their rating frequency and average ratings, and also provides personalized book suggestions by analyzing user interactions.

collaborative-filtering numpy pandas popularity-based-recommendation python recommendation-system scikit-learn

Last synced: 11 Apr 2026

https://github.com/farhad-here/data-visualization-analysis-dva

This is my data analysis project. Users can use this project to clean and preprocess data or to visualize it. They can also impute or encode their dataset.

altair bokeh data-analysis data-analysis-python io matplotlib numpy pandas plotly python sklearn streamlit

Last synced: 11 Apr 2026

https://github.com/shivam5509/power-bi-project

Expert in creating interactive dashboards and reports using Power BI, utilizing 10+ visual tools like cards, slicers, and charts. Skilled in cleaning and transforming large datasets with Power Query Editor. Proficient in advanced DAX functions (SUMX, FILTER, CALCULATE) to derive insights and drive data-driven decisions.

advanced-excel computer-science data-analysis data-mining data-visualization engineering mysql numpy pandas powerbi pyhton3 sql sql-server

Last synced: 11 Apr 2026

https://github.com/utkarsh251106/cricket-shot-analyzer

Real-time cricket shot analyzer using Python, OpenCV, and MediaPipe. Processes videos frame-by-frame, overlays pose and biomechanical metrics, and outputs an annotated video with JSON evaluation. The output video may also contain "??" because OpenCV cannot display the degree symbol.

artificial-intelligence computer-vision deep-learning machine-learning mediapipe numpy python real-time

Last synced: 05 May 2026

https://github.com/armahdavi/qff-evalation_code-data-processing-statistics-plotting

Data pipelines and processing code, statistical modelling, descriptive statistics, and plot visualizations for the QFF evaluation phase of Mahdavi et al. (2021) (Environmental Pollution). Project milestone: 2018 - 2021. Full-length article: https://www.sciencedirect.com/science/article/abs/pii/S0269749120370779

data-science data-visualization histogram matplotlib matplotlib-pyplot numpy pandas python

Last synced: 11 Apr 2026

https://github.com/rkarahul/face-detection-using-opencv-

Build a face detection project using OpenCV and Haar cascades, which are the better choice for real-time detection.

haar-cascade-classifier machinelearning numpy pandas-library python3 tkinter

Last synced: 08 May 2026

https://github.com/zuhairzia/titanic-survival-project

This is a Titanic Survival Prediction Model developed using Python, Pandas, Scikit-learn, and Jupyter Notebook. The model predicts whether a passenger survived the Titanic disaster based on features such as age, gender, and passenger class.

csv-dataset flask jupyter-notebook matplotlib numpy pandas pandas-library python scikit-learn seaborn streamlit

Last synced: 11 Apr 2026

https://github.com/djdhairya/crop-recommendation

Crop Recommendation System is a powerful tool for enhancing agricultural decision-making. By leveraging data-driven insights, it empowers farmers to maximize yield and ensure sustainable practices.

adaboostclassifier bagging-classifier csv decision-trees gaussian html knn-classification logistic-regression machine-learning machine-learning-algorithms matplotlib model numpy pandas random-forest random-forest-classifier scikit-learn seaborn svc

Last synced: 11 Apr 2026

https://github.com/ahmed-maher77/diabetes-prediction-app-using-machine-learning

Diabetes Prediction: Using machine learning to classify individuals as diabetic or non-diabetic based on health data, enabling early intervention and improved healthcare outcomes.

ai css data-science gradientboostinclassifier javascript logisticregression machine-learning matplotlib numpy pandas python randomforestclassifier seaborn streamlit supportvectormachine webdevelopment

Last synced: 11 Apr 2026

https://github.com/dmarks84/coursework_project_apache-airflow-kafka-on-toll-booth-data

Project for IBM Data Engineering & Python course on ETL & Big Data -- Read in live toll booth data, wrangled and transformed it, and wrote it into a SQL database

apache-airflow apache-kafka automation dags data-modeling databases eda elt etl mysql numpy pandas pipelines python sql

Last synced: 11 Apr 2026

https://github.com/alexixrugis/perceptronvisualization

Visualization of training and operation of a perceptron written from scratch in numpy

ai machine-learning numpy python

Last synced: 11 Feb 2026
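For readers unfamiliar with the perceptron mentioned in the entry above, the classic learning rule can be sketched in a few lines of NumPy. This is an illustrative sketch, not the repository's implementation; the function names, learning rate, epoch count, and the AND-gate training data are assumptions.

```python
import numpy as np

def train_perceptron(X, y, lr=1.0, epochs=20):
    """Train a single perceptron with a step activation (threshold at 0)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            prediction = 1 if xi @ w + b > 0 else 0
            update = lr * (target - prediction)  # zero when the prediction is correct
            w += update * xi
            b += update
    return w, b

def predict(X, w, b):
    """Apply the trained perceptron to every row of X."""
    return (X @ w + b > 0).astype(int)

# Logical AND is linearly separable, so the perceptron rule converges on it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])

w, b = train_perceptron(X, y)
print(predict(X, w, b))  # [0 0 0 1]
```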

https://github.com/varkenvarken/blempy

small, safe utilities to efficiently transfer Blender property-collection attributes (e.g. vertex coordinates) to/from NumPy arrays and perform vectorized operations with minimal Python overhead.

blender numpy

Last synced: 13 Jan 2026

https://github.com/dhanish03/credit_card_fraud_detection

Developed and implemented an advanced CCFDS using ML algorithms and pattern recognition techniques. Integrated real-time monitoring and adaptive learning capabilities into the system to dynamically adjust fraud detection parameters, ensuring effectiveness in identifying emerging fraud patterns.

kaggle-dataset numpy pandas-dataframe python3 sklearn

Last synced: 16 Apr 2026

https://github.com/iamsaniasingh/heart_disease_prediction

This is my very first machine learning project, where I used a supervised learning algorithm—logistic regression—to predict heart disease. The model was trained and tested entirely on a pre-existing dataset, with no user input involved. The goal was to understand how ML models work and how they can be applied in healthcare predictions.

logistic-regression machine-learning machine-learning-algorithms numpy pandas python sklearn

Last synced: 11 Apr 2026

https://github.com/lucasgleria/seamese-network-algorithm

This project implements an image similarity search system using Siamese networks and Triplet Loss in PyTorch. It generates image embeddings (MNIST with EfficientNet-B0) to find visually similar images. The focus is on visual analysis and representation learning in vector space.

google-colab matplotlib numpy pandas python pytorch timm

Last synced: 11 Apr 2026

https://github.com/abrarshahok/electric-vehicle-charging-station-energy-consumption-prediction

With the rapid adoption of electric vehicles, optimizing energy usage at charging stations has become crucial for improving operational efficiency and ensuring customer satisfaction. This tool leverages predictive modeling to forecast energy consumption for charging sessions based on various input features.

matplotlib numpy pandas plotly python3 scikit-learn xgboost

Last synced: 09 Jun 2026

https://github.com/riju18/from-data-production-to-client-handover

The common tedious problem is to build a data app to demonstrate the data analysis & analytics along with Machine Learning to a client. It was an attempt to do it on small scale in the most powerful & simplest way.

machine-learning matplotlib numpy pandas plotly python seaborn streamlit

Last synced: 30 Apr 2026

https://github.com/lmizner/grokking_data_science

Coding practice for basic data science interview questions in Python

data-science numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/ksharma67/anomaly-detection-on-temperature-device-failure

A typical anomaly detection task, performed using KMeans, PCA, Gaussian distribution, and Isolation Forest.

eda ellipticenvelope feature-engineering gaussian-distribution isolation-forest kmeans-clustering numpy pca python sklearn

Last synced: 11 Apr 2026

https://github.com/gregoritsch3/ml_eda_clustering_aidassessment

An EDA and Machine Learning Clustering exercise on the Country Aid Assessment dataset demonstrating the use of PCA, KMeans and DBSCAN clustering, Elbow Methods, etc. The clustering algorithm successfully demarcates countries that are in most dire need of aid based on their GDPP and Child Mortality rate.

anova dbscan kmeans machine-learning matplotlib numpy pandas pca scikit-learn seaborn statistics

Last synced: 16 Apr 2026

https://github.com/charles-l/rayboi

a raytracer written in futhark/python

futhark numpy pathtracing python3 raytracing

Last synced: 19 Apr 2026

https://github.com/vishnu-vamshii/heart-disease-prediction-using-ml

This project presents an end-to-end data analysis and machine learning pipeline for predicting heart disease using a publicly available dataset. The project includes data exploration, visualization, and implementation of various machine learning models to predict the likelihood of heart disease based on a set of clinical attributes.

machine-learning matplotlib numpy pandas python seaborn sklearn

Last synced: 11 Apr 2026

https://github.com/talapanenivarshithchowdary/asteroid-detection-ml

This project uses Machine Learning to detect and classify asteroids based on trajectory and size, aiding in Near-Earth Object detection and planetary defense.

classification data-science decision-trees jupyter-notebook knn logistic-regression machine-lea matplotlib numpy pandas pillow prediction python3 random-forest scikit-learn

Last synced: 11 Apr 2026

https://github.com/audy21/datacamp

Learning portfolio documenting my progress, while taking Data Analyst & Data Science certifications from DataCamp.

data-analysis data-science machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/amanyadav-07/customer-churn-prediction

Machine Learning project to predict customer churn using Logistic Regression, Random Forest, and XGBoost. Includes data preprocessing, feature engineering, SMOTE balancing, model training, evaluation, and business insights.

accuracy-metrics data-analysis data-visualization logistic-regression machine-learning matplotlib numpy pandas python3 random-forest-classifier seaborn sklearn xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/arthurdsant/dataanalysis-agricultural_raw_material

This Python project performs analysis and visualization of agricultural raw material price data using a Kaggle dataset. Based on Jupyter Notebook and Python.

jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 26 Jan 2026

https://github.com/swarnabhaghosh/house-price-prediction-model

Built an end-to-end regression pipeline to predict house prices using Linear Regression with automated preprocessing (PowerTransform, StandardScaling) via Scikit-learn's Pipeline and ColumnTransformer.

column-transformer linear-regression matplotlib-pyplot numpy pandas pipeline python scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/aksoni07/movie-recommendation

A hybrid movie recommendation system designed to deliver personalized and accurate suggestions by combining user preferences, item attributes, and collaborative patterns, ensuring a seamless and engaging experience.

clustering content-based-filtering data-analysis embeddings jupyter-notebook numpy ollaborative-filtering pandas personalization python recommendation-systems scikit-learn user-item-interactions

Last synced: 11 Apr 2026

https://github.com/chintanboghara/rocket-simulation

A comprehensive web-based orbital mechanics simulator with advanced mission planning, real-time tracking, and educational features.

docker flask html javascript numpy plotly python

Last synced: 11 Apr 2026

https://github.com/winterwind/ecg_signal_classification

Two-part project that involves detecting the R-peaks in an ECG signal to extract the individual ECG beats and making a machine learning model to classify them

csv csv-files data-science decision-trees ecg ecg-classification ecg-signal jupyter jupyter-notebook knearest-neighbors knn machine-learning matplotlib matplotlib-pyplot numpy pandas pyplot python random-forest scipy

Last synced: 11 Apr 2026

https://github.com/eduardoprofe666/mn-api

🐍📦 Python package with implementations of numerical methods

mn-api numerical-methods numpy pandas python scipy simpy tabulate

Last synced: 04 Jan 2026

https://github.com/kahngjoonkoh/randomshapegenerator

A program that will generate images with random shapes and background colours. Can be customized and generated in bulk.

generative-art numpy opencv python threading tkinter

Last synced: 11 Apr 2026

https://github.com/allanreda/telco-customer-churn-predictor-app

A web-based machine learning application that predicts customer churn using a logistic regression model. Built with Scikit-Learn for model training, Gradio for the user interface, and deployed on Google Cloud App Engine. The app allows users to input customer data and receive predictions on churn risk to support business decision-making.

app-engine data-visualization deployment google-cloud gradio hyperparameter-tuning logistic-regression machine-learning numpy pandas scikit-learn

Last synced: 16 Apr 2026

https://github.com/bunu23/image-classification

This repository contains a notebook implementing a Convolutional Neural Network for multi-class image classification using transfer learning with a pre-trained ResNet-50 model. Covers dataset handling, model architecture customization, training, evaluation, fine-tuning, and external image prediction.

keras matplotlib numpy pil python tensorflow

Last synced: 11 Apr 2026

https://github.com/erikbrinkman/hilbert-bytes

A python library for converting between d-dimensional points and indices on a hilbert curve

hilbert-curve numba numpy python

Last synced: 08 May 2025

https://github.com/andersoncrs/prediccion_precio_vehiculos_statsmodels

This project uses a linear regression model to predict vehicle prices based on their main characteristics. The analysis includes problem definition, data exploration and cleaning, conversion of categorical variables to numeric ones, correlation evaluation, and model training.

analisis-de-datos analisis-exploratorio-de-datos matplotlib numpy seaborn statsmodels visualizacion-de-datos

Last synced: 26 Apr 2026

https://github.com/shwetapardhi/assignment-03-q1--hypothesis-testing

Q1. An F&B manager wants to determine whether there is any significant difference in the diameter of the cutlet between two units. A randomly selected sample of cutlets was collected from both units and measured. Analyze the data and draw inferences at 5% significance level. Please state the assumptions and tests that you carried out to check validit

hypothesis-testing numpy p-value pandas python scipy significance-testing stats t-test

Last synced: 11 Apr 2026
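The two-unit comparison described in the entry above is a two-sample test of means. As a rough sketch of the arithmetic, the following computes Welch's t statistic and its approximate degrees of freedom with NumPy alone. The function name `welch_t` and the sample values are hypothetical, and Welch's variant (unequal variances) is an assumption; the repository may use a different test.

```python
import numpy as np

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    a = np.asarray(sample_a, dtype=float)
    b = np.asarray(sample_b, dtype=float)
    # Squared standard errors of each sample mean (unbiased variance / n).
    se2_a = a.var(ddof=1) / a.size
    se2_b = b.var(ddof=1) / b.size
    t_stat = (a.mean() - b.mean()) / np.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (a.size - 1) + se2_b ** 2 / (b.size - 1)
    )
    return t_stat, dof

# Hypothetical measurements, for illustration only.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
print(t_stat, dof)
```

To turn the statistic into a p-value for the 5% significance decision, a t-distribution CDF is needed; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.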

https://github.com/shreyasmehta05/sortsync

A custom sorting algorithm combining parallel merge and count sort, with detailed performance comparisons against standard sorting techniques.

c matplotlib numpy python3

Last synced: 04 Feb 2026

https://github.com/mani-prakash-n-r/stock_market_prediction_system

This project uses LSTM networks to predict stock prices based on historical data, providing insights for informed investment decisions. Built with LSTM, NumPy, Scikit-Learn, Matplotlib, yfinance, and TensorFlow.

lstm matplotlib numpy python sckiit-learn tensorflow yfinance

Last synced: 11 Apr 2026

https://github.com/saliola/nonnegative_integer_matrices

code to generate and count nonnegative integer matrices with prescribed row and column sums (aka contingency tables)

cython cython-examples numpy numpy-examples python3

Last synced: 18 Apr 2026

https://github.com/djdurga/google_play_store_apps_analysis

This data analysis project focuses on exploring and understanding the Google Play Store Apps dataset.

numpy pandas python

Last synced: 11 Apr 2026

https://github.com/cfbastarz/jupyternotebooks

A collection of several Jupyter notebooks.

dask matplotlib numpy python xarray xesmf

Last synced: 18 Jan 2026

https://github.com/chanmeng666/advanced-neural-network-applications

Practical implementations of perceptron and linear neuron models for classification and regression, with mathematical analysis and visualizations in Jupyter notebooks.

classification data-analysis data-science educational gradient-descent jupyter-notebook linear-neuron machine-learning matplotlib neural-network neural-networks numpy perceptron python regression

Last synced: 03 May 2026

https://github.com/apfirebolt/numpy-and-pandas-examples

Some examples and sample datasets to learn numpy, pandas and other data science libraries in Python

data-analysis jupyter-notebook numpy pandas python

Last synced: 17 Apr 2026

https://github.com/mramshaw/intro-to-ml

Intro to Machine Learning - Pattern Recognition for Fun and Profit

machine-learning matplotlib ml numpy pandas pip pip3 python scikit-learn scipy seaborn seaborn-plots sklearn statsmodels tensorflow weka

Last synced: 11 Apr 2026

https://github.com/ikbalcaus/HandSketch

Drawing on Canvas with Hand Gestures + AI for Letter Recognition

mediapipe numpy ocr-recognition opencv python pytorch tkinter

Last synced: 31 Mar 2025

https://github.com/harmanveer-2546/wafer-fault-detection

The goal is to eliminate manual work in identifying faulty wafers. Opening and handling suspected wafers disrupts the entire process. False negatives result in wasted time, manpower, and costs.

clustering data-transformation feature-selection machine-learning matplotlib numpy pandas python random-forest roc-auc-curve roc-auc-score seaborn sklearn svc xgboost

Last synced: 11 Apr 2026

https://github.com/lohiyah/real-estate-price-forecast

A Python-based app predicting real estate prices using machine learning. Built with Pandas, NumPy, Scikit-learn, Matplotlib, and Seaborn for data processing and visualization, and Flask for the web interface.

flask matplotlib numpy pandas python3 scikit-learn seaborn

Last synced: 11 Apr 2026