An open API service indexing awesome lists of open source software.

NumPy

NumPy is an open source library for the Python programming language, adding support for large, multidimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

https://github.com/gregoritsch3/ml_eda_classification_diabetes

An EDA and machine learning classification exercise on the Diabetes dataset, demonstrating SQLAlchemy data import from a SQL database (PostgreSQL), preprocessing pipelines, ANOVA, nine scikit-learn ML models, hyperparameter tuning for the best-performing one, and feature importance.

anova machine-learning matplotlib numpy pandas pipelines scikit-learn seaborn sql sqlalchemy statistics

Last synced: 14 Apr 2026

https://github.com/matkorussovich/student-performance-analysis

This repository contains an analysis of students' academic performance, carried out as part of the "Introducción al Data Science" (Introduction to Data Science) module of the Master's in Data Science at the Universidad Europea de Madrid.

jupyter-notebook matplotlib-pyplot numpy pandas python

Last synced: 14 Apr 2026

https://github.com/sralter/potential_talents

Using NLP techniques (word and sentence embedding tools like SBERT and Learning-to-Rank systems like RankNet and LambdaRank) to rank candidates.

lambdarank learning-to-rank lightgbm matplotlib nlp numpy pandas python pytorch ranknet

Last synced: 09 Apr 2026

https://github.com/karthik9273/revolutionizing-gold-rate-forecasting-for-small-businesses-with-machine-learning

The "Gold Price Prediction" project focuses on predicting gold prices using machine learning techniques. It uses popular Python libraries such as NumPy, pandas, scikit-learn (sklearn), Matplotlib, and Seaborn, together with a Random Forest Regressor, to estimate prices.

data-science google-colab-notebook jupyter-notebook machine-learning matplotlib numpy pandas-dataframe python seaborn sklearn

Last synced: 06 May 2026

https://github.com/al-ghaly/stock-market-simulation

Simulate and visualize stock market behavior

matplotlib numpy python python-visualization

Last synced: 16 May 2026

https://github.com/mjul/scipy-lab

Scientific computation with Python

matplotlib numpy python scipy

Last synced: 04 May 2026

https://github.com/deepcloudlabs/dcl702-2021-jul-12

DCL-702: Data Analytics using Python

data-analytics numpy pandas python3

Last synced: 05 May 2026

https://github.com/neelays/xor-xnor_neural_network

NumPy neural network to approximate XOR/XNOR

numpy

Last synced: 15 May 2026

https://github.com/sharif-minhaz/rag-system

Ingest and vectorize content upon publication, store embeddings, then retrieve and augment user queries with context to generate high-quality responses.

faiss flask mysql2 nodejs numpy rag react transformers

Last synced: 14 Apr 2026

https://github.com/muhammadshavaiz/hand-sketch-recognition--inceptionv3

The Hand Drawn Sketch Classification project uses PyTorch to classify hand-drawn sketches. It evaluates the Inception_v3 model, which achieved the highest accuracy, 57%. The repository features scripts for dataset management, model training, and evaluation.

inception-v3 matplotlib numpy pandas python pytorch

Last synced: 14 Apr 2026

https://github.com/juzershakir/predicting_boston_housing_prices

Built a model to predict the value of a given house in the Boston real estate market using various statistical analysis tools. Identified the best price at which a client can sell their house, using machine learning.

bias-variance boston-housing-price-prediction data-exploration decision-tree-regression gridsearchcv k-fold machine-learning matplotlib mlfnd model-evaluation model-validation numpy pandas python3 r2-score sklearn supervised-learning udacity-nanodegree

Last synced: 22 Oct 2025

https://github.com/mlicamele/neural-network

Project focused on exploring the computations behind neural networks by building one from scratch with only numpy and testing it with the MNIST dataset.

gradient-descent matrix-computations neural-networks numpy python

Last synced: 12 Apr 2026

https://github.com/chirindaopensource/search_benford_law_compatibility

End-to-end, scalable Python forensic accounting toolkit implementing Benford's Law analysis for FTSE financial data. Delivers automated anomaly detection with Chi-Squared/MAD testing, comprehensive validation pipelines, and risk-based prioritization of investigative resources. Replicates Ausloos et al.'s (2025) methodology with full reproducibility.

academic-research anomaly-detection benfords-law chi-squared-test data-validation econometrics financial-analysis financial-data forensic-accounting fraud-detection ftse goodness-of-fit jupyter-notebook numpy pandas python reproducible-research risk-management scipy statistical-testing

Last synced: 12 Apr 2026

https://github.com/kostadinlambov/bitcoin-and-stock-market-correlation

This study uses a quantitative research design to analyze the relationship between Bitcoin prices and the stock market over the past five years with the S&P 500 Index serving as a proxy for the stock market.

bitcoin data-science jupyter-notebook matplotlib matplotlib-pyplot numpy pandas python scipy-stats seaborn sp500-data-analysis

Last synced: 09 Apr 2026

https://github.com/harmanveer-2546/reducing-data-entries

A way to delete data entries from a CSV/Excel file. For an Excel file, use excel instead of csv in the code.

csv data data-entry delete-data excel numpy pandas python

Last synced: 05 May 2026

https://github.com/atul-maurya-30/galaxy

Galaxy Classification is a machine learning project focused on classifying galaxies into two subclasses: 'STARFORMING' and 'STARBURST'. This project demonstrates data preprocessing, model training, and evaluation using advanced machine learning techniques and Python libraries.

flask machine-learning matplotlib numpy pandas python regression-classification seaborn sklearn

Last synced: 09 Mar 2026

https://github.com/manuel-lang/numpymongo

A python package to export NumPy data to MongoDB

mongodb numpy wrapper

Last synced: 23 Feb 2026

https://github.com/ggrbill/phd-plot-scripts

My personal plot scripts used to generate graphs for my PhD Thesis

hacktoberfest matplotlib numpy python

Last synced: 18 Apr 2026

https://github.com/gusenov/max-empty-rect-py

🔲 Python implementation of an algorithm for finding the maximum-area empty rectangle in an image.

algorithm empty-spot graphics numpy python python-image-library python-library rectangle-detection

Last synced: 06 Feb 2026

https://github.com/sonaligill/olympics-analysis

The outcome of this project is an interactive Streamlit web application that visualizes an analysis of Olympic data, letting users explore different aspects of Olympic history, compare country performances, and gain insights into athlete demographics.

numpy plotly python scikit-learn scipy streamlit

Last synced: 28 Jan 2026

https://github.com/harmanveer2546/heart

Predicts the presence of heart disease based on several health-related factors, performing: i) data cleaning, ii) data preprocessing, iii) EDA, and iv) a comparison of five classification algorithms (Logistic Regression, Decision Tree, Random Forest, KNN, and SVC).

data-preprocessing decision-tree eda knn logistic-regression machine-learning numpy pandas random-forest roc-auc-curve svc

Last synced: 03 May 2026

https://github.com/alejandro945/insurance-risk

This project aims to predict the risk of insurance claims using a dataset from Kaggle. The dataset consists of 26 columns and 205 rows, providing various features related to insurance risk. By analyzing this data, we seek to build predictive models that can help insurers assess the risk of claims.

data-analytics ipython-notebook numpy pandas python

Last synced: 06 Feb 2026

https://github.com/subh888999/car-prices--analysis-projects

This repository houses projects focused on data collection, assessment, cleaning, visualization, and analysis. It includes workflows and methodologies for handling data, from initial gathering and evaluation to processing, visualizing insights, and performing in-depth analysis

jupyter-notebook matplotlib numpy panda seaborn statistics

Last synced: 03 May 2026

https://github.com/matheusafonseca/c213-trabalho-1

Repository dedicated to storing and managing the first assignment for C213 - embedded systems.

matplotlib numpy pid-controller python scypi streamlit

Last synced: 29 Jan 2026

https://github.com/jaypanchal9/fraud-detection-case-study

A comprehensive case study applying machine learning techniques to detect fraudulent transactions effectively.

machine-learning matplotlib numpy pandas python3 scikit-learn seaborn xgboost

Last synced: 15 Apr 2026

https://github.com/martincastroalvarez/search-keras-gensim-elasticsearch

Search Engine using Word Embeddings, GloVe, Neural Networks, BART and Elasticsearch

elasticsearch gensim gensim-word2vec keras nlp numpy python scipy spacy word2vec

Last synced: 15 Apr 2026

https://github.com/ishinzoo/songrecommendation

This project is a machine learning-based system that recommends songs based on the user's detected emotions. The application uses facial expression recognition to determine the user's current emotional state and suggests songs that align with that emotion. This system can be particularly useful for personalized music streaming services, helping use

machine-learning mediapipe numpy opencv os python tenserflow

Last synced: 25 Feb 2026

https://github.com/keyurparalkar/breast-cancer-detection

Predict whether the cancer is benign or malignant

gradient-descent logistic-regression machine-learning numpy

Last synced: 26 Apr 2026

https://github.com/amruta33/credit_card_analysis

Loan providers find it hard to lend to people with insufficient or non-existent credit history. Some consumers take advantage of this by defaulting.

numpy pandas python3

Last synced: 15 Apr 2026

https://github.com/hansalemaos/npzigloc

Zig for Numpy

numpy python zig

Last synced: 31 Jan 2026

https://github.com/dipeshgoyal013/salary-data-analysis

Salary analysis by department and agency.

analysis matplotlib numpy pandas salary sklearn-library

Last synced: 15 Apr 2026

https://github.com/samiyaalizaidi/nn-ml-homeworks

Homework solutions for CPE-4903: Neural Networks & Machine Learning at Kennesaw State University.

machine-learning machine-learning-workflow neural-networks numpy scikit-learn

Last synced: 15 Apr 2026

https://github.com/dhruvv1402/document-scanner-python-opencv

Transform smartphone photos into scanned documents in seconds! This Python-based document scanner automatically detects edges, corrects perspective, and enhances document images to produce clean, scanner-like results.

canny-edge-detection contour-detection imutils numpy opencv python scikitlearn-machine-learning warpperspective

Last synced: 15 Apr 2026

https://github.com/anshpg/exploration-in-image-processing-digit-image-generation

This project, developed by Anshuman Pattnaik, explores image processing techniques using Python libraries such as pandas, numpy, matplotlib, and cv2 (OpenCV). The primary objective of the project was to delve into image processing with a focus on creating a unique dataset and algorithm for image generation.

cv2 image-generation image-processing ipynb-jupyter-notebook matplotlib-pyplot numpy opencv pandas

Last synced: 01 Feb 2026

https://github.com/audeering/audmath

General math functions

math numpy

Last synced: 07 Feb 2026

https://github.com/mohamedelashri/lvec

Python package for seamless handling of Lorentz vectors

awkward hep hep-ex numpy physics root root-cern uproot

Last synced: 25 Feb 2026

https://github.com/sarowarahmed/advertising-sales-app

📈 Advertising Sales Predictor: A web app powered by a Machine Learning model, built with Numpy, Pandas, Scikit-learn, and Streamlit, to forecast sales based on TV, Newspaper, and Online Advertising. Deployed on Streamlit Cloud for real-time, easy-to-use predictions.

advertising app machine-learning multiple-linear-regression numpy pandas sales scikit-learn streamlit

Last synced: 07 Feb 2026

https://github.com/chandkund/customer-segmentation

Customer segmentation divides customers into distinct groups based on characteristics and behaviors. This project uses K-Means clustering, an unsupervised machine learning algorithm, to segment customers and provide insights for targeted marketing strategies

kmeans-clustering matplotlib numpy pandas python seaborn

Last synced: 15 Apr 2026

https://github.com/coueghlani/nlp

Natural Language Processing and Data Analysis project

mineria-de-datos nlp nlp-machine-learning nltk numpy procesadores-de-lenguajes sklearn spacy

Last synced: 08 Feb 2026

https://github.com/smahala02/materials-science-introduction

Introduction to Materials Science concepts using Python for array manipulation and visualization with NumPy and Matplotlib.

data-visualization materials-science matplotlib numpy python scientific-computing

Last synced: 09 Feb 2026

https://github.com/rampal-punia/data-science-toolkit

Your Go-To Resource for Essential Data Science Related Commands, Concepts, Quick Overviews and Useful Functions.

artificial-intelligence data-science keras machine-learning matplotlib nlp nlp-machine-learning numpy pandas pythorch sql tensorflow

Last synced: 09 Feb 2026

https://github.com/carterbox/libimage

Provides large (2k) test images as NumPy arrays.

images numpy python

Last synced: 15 Apr 2026

https://github.com/harmanveer-2546/statistics-for-machine-learning

Statistical tools help you clean and organize your data. You can identify outliers, manage missing values, and ensure your data is in a format that the ML algorithms can understand.

inline matplotlib matplotlib-styles numpy pandas probability python seaborn statistics

Last synced: 18 Apr 2026

https://github.com/khinthandarkyaw98/python_for_engineers

This particular Python notebook is designed to provide Engineers with an opportunity to practice scientific computations.

engineering numpy python scientific-computing youtube

Last synced: 16 Apr 2026

https://github.com/ywatanabe1989/scitex-io

Universal scientific data I/O with plugin registry — save/load 30+ formats with one API. Part of SciTeX.

cli csv data-io hdf5 mcp numpy openscience pandas plugin-registry python research scientific-computing scitex

Last synced: 07 Jun 2026

https://github.com/dpgitaccount/project---hospital-readmission-analysis

The goal of this project is to build a predictive model to estimate the likelihood of a hospital readmission based on patient data. By identifying factors that contribute to readmissions, hospitals can optimize care and reduce costs associated with repeated visits.

boxplot confusion-matrix datamodeling exploratory-data-analysis heatmap histplot numpy pandas plotly python random-forest seaborn smote-sampling visualization

Last synced: 16 Apr 2026

https://github.com/lexxai/goit_python_ds_hw_09

Module 9. Neural network hyperparameter tuning. Deep learning. TensorFlow. Keras.

adam-optimizer data-science google-colab keras keras-tensorflow numpy python

Last synced: 16 Apr 2026

https://github.com/vgvr0/analisis-de-datos-con-streamlit-numpy-pandas-y-matplotlib

Complete system for analyzing and visualizing film data that provides detailed insights about movies, including financial analysis, ratings, temporal trends, and a recommendation system. Built with Python and Streamlit, it offers an interactive, user-friendly interface for exploring movie data.

matplotlib numpy pandas plotly plotly-dash recommendation-system streamlit

Last synced: 16 Apr 2026

https://github.com/codersales/machine-learning-classification

machine learning jupyter notebooks | data-science | priority | relevant | significant | green-light | 1 | may-2023-filtered | may-2023-filtered-2 | may-2023-filtered-3 | filtered-4 | frequent

authorized classification current decision-tree ensemble-techniques jupyter machine-learning more-than-100-commits more-than-300-commits numpy pandas python3 ranked repository-5 seaborn sklearn stacking sub-critical supervised-learning workstation

Last synced: 07 Mar 2026

https://github.com/kavayk29/speech-recognition-using-tdnn-and-data-augmentation

Developed a speech recognition system using TDNN, preprocessing audio, extracting MFCC features, and training the model. Fine-tuning with augmented data (19,000 rows) improved accuracy from 9% to 80% on training and 40% on validation. Data augmentation proved crucial for enhancing model performance and generalization. Still working to increase the accuracy.

deep-learning keras-tensorflow numpy os pandas tdnn tensorflow

Last synced: 14 Feb 2026

https://github.com/killervardhan8/gesturedecode

The Sign Language Interpretation project focuses on recognizing and interpreting hand gestures to facilitate communication for individuals who use sign language. This project leverages computer vision and machine learning techniques to accurately identify and translate hand signs into text

csv mediapipe numpy python tensorflow

Last synced: 28 Feb 2026

https://github.com/nagar2nd/ml-regressionmodel---cardekho-price-prediction

This repository features a machine learning model for predicting used car prices using data from CarDekho.com. The project leverages exploratory data analysis and regression techniques to empower sellers and buyers with actionable insights in the Indian used car market.

analytics cleaning-data data linear-regression machine-learning matplotlib numpy pandas python seaborn

Last synced: 16 Apr 2026

https://github.com/type0-1/salary-truth-predictor

A supervised machine learning regression model. Includes problem statement, approach to solution, code, images, dataset, and Jupyter Notebook for interactive analysis.

linear-regression machine-learning matplotlib-pyplot ml numpy pandas polynomial-regression projects scikitlearn-machine-learning support-vector-regression

Last synced: 16 Apr 2026

https://github.com/jessicahora/studies-on-linear-algebra

Repository with studies on linear algebra.

linalg linear-algebra matplotlib-pyplot matrix numpy python scipy

Last synced: 01 Mar 2026

https://github.com/sasanka14/water_quality_predictions

Water Quality Prediction - College Project 🌊💧 Predicts water potability (safe/unsafe) using ML models like XGBoost & Random Forest. Features data preprocessing, feature importance, model evaluation, and visualizations. Built with Python, Pandas, Scikit-learn & Seaborn for analysis. 🚀

anaconda jupyter-notebook matplotlib numpy pandas python scikit-learn seaborn xgboost

Last synced: 16 Apr 2026

https://github.com/anujdutt9/reinforcement_learning

Reinforcement Learning using Numpy and PyTorch.

numpy python3 pytorch reinforcement-learning

Last synced: 16 Apr 2026

https://github.com/pramodyasahan/health-insurance-cost-prediction

This project focuses on predicting health insurance costs using a polynomial regression model. By employing machine learning techniques in Python, the project aims to accurately estimate insurance costs based on various personal attributes. The model takes into account several features including age, sex, BMI, number of children, smoking status etc

machine-learning matplotlib numpy pandas python3 scikit-learn

Last synced: 16 Apr 2026

https://github.com/magnusrodseth/disaster-tweets

📚 Assignments in the course IT3212 - Data Driven Software at NTNU. Our task is to classify whether a tweet is related to a disaster or not.

adaboost jupyter-notebook logistic-regression numpy pandas python python3 random-forest support-vector-machines xgboost

Last synced: 16 Apr 2026

https://github.com/mugambi645/exploring-ebay-car-sales-data

Exploring the eBay car sales dataset

car-sales data-analysis numpy pandas

Last synced: 16 Apr 2026

https://github.com/leftcoastnerdgirl/supervised_learning

This project demonstrates supervised machine learning using scikit-learn.

classification-reports confusion-matrix jupyter-notebook numpy pandas-python pathlib scikit-learn sklearn

Last synced: 16 Apr 2026

https://github.com/rahulchouhan1/spotify-most-popular-songs-data-analysis

🎵 Spotify Songs Analysis using Pandas

matplotlib numpy pandas

Last synced: 16 Apr 2026

https://github.com/marcow2812/zuse-projekt

Python-based software for projecting 3D objects onto a cube

augmented-reality numpy opencv-contrib python

Last synced: 16 Apr 2026

https://github.com/supershivam5/python_projects

💻 Python programming with NumPy, Pandas, Matplotlib. 🌟 Love exploring new technologies. Check out my projects!

matplotlib-pyplot numpy pandas scikit-learn seaborn

Last synced: 17 Apr 2026

https://github.com/yanxue06/housing-price-predictor

Python-based California housing price predictor

jupyter numpy pandas python seaborn

Last synced: 06 Mar 2026

https://github.com/ricobuilds/ml-roadmap

Opinionated roadmap to machine learning in 2023

conda huggingface machine-learning matplotlib numpy pandas python pytorch

Last synced: 06 Mar 2026

https://github.com/radinshahdaei/ce40215-nc

Theoretical and practical assignments for "Numerical Computation".

jupyter-notebook numpy python sympy

Last synced: 17 Apr 2026

https://github.com/neerajcodes888/a-novel-used-car-price-prediction-model-based-on-lindenoise

Welcome to the LinDenoise Repository! LinDenoise offers a smart solution for cleaning noisy data in regression tasks. Integrated seamlessly within the widely-used scikit-learn framework, it effortlessly enhances data quality while improving predictive accuracy

car-price-prediction deep-learning ipynb-notebook machine-learning numpy pandas python3 visualization

Last synced: 06 Mar 2026