An open API service indexing awesome lists of open source software.

NumPy

NumPy is an open source library for the Python programming language, adding support for large, multidimensional arrays, and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

https://github.com/goessl/adjugate

A package for calculating submatricies, minors, adjugate- and cofactor matrices.

adjugate cofactor determinant inverse linear-algebra matrix minor numpy python submatrix

Last synced: 07 Jan 2026

https://github.com/hyperplasma/olympic-visualization-analysis

Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.

data-analysis data-visualization matplotlib numpy pandas python wordcloud

Last synced: 04 May 2026

https://github.com/mehwishferoz/data-analysis-with-python-zero-to-pandas

This repository contains the Python code and projects I created while learning the Data Analysis with Python: Zero to Pandas course. The course covers essential topics such as data cleaning, analysis, and visualization using powerful Python libraries like Pandas, NumPy, Matplotlib, and Seaborn.

exploratory-data-analysis jovian matplotlib numpy pandas seaborn

Last synced: 10 May 2026

https://github.com/prateekrajsrivastav/financial-transition-classification

This project focuses on classifying financial transactions using machine learning techniques. By leveraging labeled data, the model aims to categorize transactions into predefined categories (e.g., "Food," "Transport," "Shopping," etc.).

matplotlib numpy pandas-python scikitlearn-machine-learning seaborn

Last synced: 07 Sep 2025

https://github.com/secary/maths7027

Mathematical Foundations of Data Science

latex mathematics numpy pandas

Last synced: 04 May 2026

https://github.com/als8446/tripleten-data-science-projects

Projects Overview Projects made in the Data Scientist course from TripleTen LatAm

data data-analysis hypothesis-tests machine matplotlib numpy pandas python scipy sklearn

Last synced: 10 Apr 2026

https://github.com/paulo-santos-ds/guia_de_precos_de_veiculos_com_machine_learn

Sistema de predição de preços de carros usados desenvolvido para a Empresa Rusty Bargain

catboost lgbm numpy pandas pyplot python seaborn sklearn time

Last synced: 13 Apr 2026

https://github.com/lexiortiz/ibm-data-engineering-fundamentals

Notes, exercises, and projects from the IBM Data Engineering Fundamentals path via Verizon Skill Forward.

data-engineering numpy pandas postegresql python sql

Last synced: 13 Apr 2026

https://github.com/samestrin/image-manipulation-api-digitalocean

A Python DigitalOcean App Platform based REST image manipulation API using Flask, NumPy, and OpenCV that runs in a Docker container.

api docker flask image image-processing numpy python python3 rest rest-api

Last synced: 13 Apr 2026

https://github.com/abdulsamie10/pythonbasics

This repository contains few tasks, which I developed just to get a strong grip on Python Programming Langauge.

ai labtasks lambda numpy python pythonlab

Last synced: 20 Apr 2026

https://github.com/alyssahusna44/poai-image-classification

An implementation of an AI model that classifies animal subspecies. The project involved data preparation, model training using ResNet50, DenseNet121, and MobileNetV3, and evaluation using metrics: accuracy and mAP. 🗂️

artificial-intelligence image-classification machine-learning model-training-and-evaluation neural-networks numpy pandas python tensorflow

Last synced: 09 Apr 2026

https://github.com/wwwmisla/gerador-mapas-calor

Sistema de visão computacional para gerar mapas de calor com base na movimentação em espaços públicos, auxiliando no planejamento urbano e uso eficiente do espaço.

color computer-vision demo google-colab gradio heatmap marchine-learning matplotlib model numpy opencv people-detection python smart-city ufrn visao-computacional vision-computer yolo yolov8

Last synced: 04 May 2026

https://github.com/ivancaez/analisis_dades_microbit

Data analysis of Micro:bit with maplotlib, numpy and pandas

csv jupyter-notebook matplotlib microbit numpy pandas python

Last synced: 13 Apr 2026

https://github.com/ronitjariwala/-prodigy_ds_01

Prodigy InfoTech Data Science Internship Task-1

matplotlib numpy pandas python seaborn

Last synced: 04 May 2026

https://github.com/snghrsw/kikagaku-ml-learning

Pythonで単回帰分析と重回帰分析、ディープラーニングで回帰と分類

liner-regestion multiple-regression numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/chandkund/predicting-diabetes-onset

The "Predicting Diabetes Onset" project aims to build a machine learning model that predicts whether an individual has diabetes based on various health-related features. The dataset used for this project includes attributes related to medical history and physical measurements.

deep-learning numpy pandas python seaborn sklearn-library sklearn-metrics visualization

Last synced: 13 Apr 2026

https://github.com/hilarionengarejr/movie-recommender-app

Sentiment analysis on user reviews for movie recommendations using Content Based Filtering.

docker flask nltk numpy pandas python3 scikit-learn selenium

Last synced: 10 Apr 2026

https://github.com/nischalkshaj/image-identification

This is a repository for AI image training for beginners.

express mongodb node numpy pillow python3 pytorch reactjs

Last synced: 09 Apr 2026

https://github.com/athari22/investigating-netflix-movies-and-guest-stars-in-the-office

Apply basic Python skills in Introduction to Python and Intermediate Python by processing and visualizing film and television data.

data-analysis data-science data-visualization loop loops matplotlib matplotlib-pyplot netflix numpy office pandas python

Last synced: 11 Apr 2026

https://github.com/swarup-ranjan-mondal/chat-analyzer

A streamlit app to analyze whatsapp personal and group chats.

matplotlib numpy pandas python seaborn streamlit

Last synced: 04 May 2026

https://github.com/dmarks84/coursework_project_text-mining-topic-modeling

Project for University of Michigan Applied Data Science Specialization -- Developed functions to score similarity between text passages.

data-modeling data-reporting data-visualization databases eda nlp numpy pandas python statistics text-mining

Last synced: 12 Apr 2026

https://github.com/pinedah/sleep-data-analysis-exercise

Análisis de un dataset médico sobre el sueño, explorando duración, calidad y factores relacionados. Incluye limpieza de datos, EDA y visualizaciones con Python (pandas, numpy, matplotlib, seaborn, scipy).

data-analysis data-science escom numpy pandas python school-project scipy

Last synced: 13 Apr 2026

https://github.com/kasraskari/tumor-predict

Streamlit app for predicting tumor malignancy using logistic regression.

logistic-regression machine-learning numpy pandas python scikit-learn streamlit tumor-detection

Last synced: 09 Apr 2026

https://github.com/evan-dg31/data-science

Exploratory Data Analysis (EDA), Predictive Modeling (Supervised and Unsupervised), Regression, Classification, Clustering

classification clustering data-analysis data-science data-visualization machine-learning matplotlib numpy pandas python regression-analysis seaborn

Last synced: 13 Apr 2026

https://github.com/joiceo/python

Projetos e exercícios em Python

eda machine-learning numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026

https://github.com/darshanpakhale250-gif/customer-churn-prediction-ml

A machine learning project to predict customer churn using regression and classification models including logistic regression, decision tree, and random forest. Performed EDA, visualizations, and model evaluation. The dataset is taken from Kaggle and implemented in Google Colab.

colab-notebook customer-churn-analysis data-science decision-trees kaggle logistic-regression machine ml numpy pandas python random-forest

Last synced: 13 Apr 2026

https://github.com/kizman-23/supervised_models

Classical prediction of future data using models trained by labeled data

numpy pandas scikit-learn supervised-machine-learning

Last synced: 13 Apr 2026

https://github.com/akshatkmistry/parkinsons_disease_predictor-voice_measures

This project implements a machine learning system to detect Parkinson's disease using voice measurements. The application uses a Random Forest classifier trained on voice feature data to predict the likelihood of Parkinson's disease with high accuracy (94%).

machine-learning matplotlib numpy pandas random-forest-classifier seaborn sklearn streamlit

Last synced: 09 Apr 2026

https://github.com/venkat-0706/accenture-hackathon

Developing an e-commerce recommendation system involves utilizing technologies such as Python for programming, Pandas for data manipulation, SQL for database management, FastAPI for building APIs, PostgreSQL for data storage, and Docker for containerization.

alembic api docker fastapi machinelearningalgorithms matplotlib numpy postgresql pydantic python3 scipy seaborn sqlmodel

Last synced: 13 Apr 2026

https://github.com/asier-ortiz/python-for-data-science-and-machine-learning-bootcamp

Python for Data Science and Machine Learning Bootcamp: NumPy, Pandas, Seaborn, Matplotlib, Plotly, Scikit-Learn, TensorFlow, and more

matplotlib numpy pandas plotty python scikit-learn seaborn tensorflow

Last synced: 13 Apr 2026

https://github.com/mairagalvao/final_grades

An analysis of the final grades of students using Python

matplotlib numpy pandas python3

Last synced: 09 May 2026

https://github.com/jakubfr4czek/apartment-prices-analysis

This repository contains an analysis of apartments prices in Krakow. It was made for "Rachunek prawdopodobieństwa i statystyka" (probability and statistics) course i took at AGH University of Science.

agh agh-wi analysis csv jupyter jupyter-notebook krakow numpy pandas price-analysis price-prediction probability probability-statistics python statistics statistics-learning

Last synced: 04 May 2026

https://github.com/archishmansengupta/dnn

Digit Neural Network is a digit recognition network based on MNIST data set using numpy, pandas and matplotlib

matplotlib mnist neural-network numpy pandas python

Last synced: 13 Apr 2026

https://github.com/harmanveer-2546/house-price-prediction-

We all have experienced a time when we have to look up for a new house to buy. But then the journey begins with a lot of frauds, negotiating deals, researching the local areas and so on. So to deal with this kind of issues Today, I prepared a MACHINE LEARNING Based model, trained on the House Price Prediction Dataset.

catboost-classifier linear-regression machine-learning matplotlib numpy pandas python random-forest seaborn svm

Last synced: 09 Apr 2026

https://github.com/harmanveer-2546/eda-on-indian-railways

Indian Railways is a statutory body under the ownership of the Ministry of Railways of the Government of India that operates India's national railway system. As of 2023, it manages the fourth largest national railway system by size with a track length of 132,310 km, running track length of 106,493 km and route length of 68,584 km.

clean-data eda exploratory-data-analysis geometry geopandas indian-railways json linestring matplotlib numpy os pandas plotly python railway seaborn shapely train visualization

Last synced: 09 Apr 2026

https://github.com/harmanveer-2546/tweets-cleaning-with-python

Twitter is one of the most used data sources for data analysis. The reason is that it’s open and free to collect unless you subscribe to the paid version one. Besides, it’s pretty simple to collect data from it.

filtering function-calling nlptk numpy os pandas punctuation python re removing-links tokenization tweet-cleaning twitter-data-analysis twitter-sentiment-analysis

Last synced: 10 Apr 2026

https://github.com/azaz9026/eda

Exploratory Data Analysis (EDA) refers to the method of studying and exploring record sets to apprehend their predominant traits, discover patterns, locate outliers, and identify relationships between variables. EDA is normally carried out as a preliminary step before undertaking extra formal statistical analyses or modeling.

data-cleaning data-visualization encoding machine-learning matplotlib numpy pandas plotly python3 seaborn sklearn-library

Last synced: 15 Apr 2026

https://github.com/lorenzorottigni/ml-iris-svm

Machine Learning python bootcamp: Support Vector Machines on iris flower dataset

ipynb machine-learning numpy pandas python scikit-learn seaborn support-vector-machines

Last synced: 10 Apr 2026

https://github.com/bagusperdanay7/fcc-da-mean-variance-standard-deviation-calculator

One of Data Analysis with Python (freecodecamp) task, created a Mean Variance Standard Deviation Calculator.

data-analysis freecodecamp-project numpy python

Last synced: 06 May 2026

https://github.com/csengupta1101/rock-paper-scissor-game

Rock Paper Scissor game built with Python 3. Jupyter notebook used as IDE. Code File , Read Me attached herewith.

game if-else-statements numpy python python3 random

Last synced: 11 May 2026

https://github.com/abhinav330/911-emergency-calls-analysis

This Python Notebook analyzes emergency call data from the '911.csv' dataset. It uses various data visualization techniques to explore and gain insights into the emergency call data, including the types of calls, reasons for calls, and call patterns over time.

data-analysis data-science data-visualization eda exploratory-data-analysis exploratory-data-visualizations numpy pandas python

Last synced: 09 Jun 2026

https://github.com/manishkumarpatel07/heartattack_risk_prediction

"Heart Attack Risk Prediction" uses machine learning to estimate the likelihood of a heart attack based on user-provided data like physical attributes, symptoms, and medical history. This system enables remote screening, identifying high-risk individuals, and easing medical system burdens by providing early, data-driven health risk assessments.

boruta knn-algorithm matplotlib numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/abideen-olawuwo/nyc-taxi

Prediction the duration of New York Taxi trip

linear-regression matplotlib numpy pandas python seaborn sklearn

Last synced: 11 Apr 2026

https://github.com/nikhilsree5/netflixcasestudy

Global Content Strategy: Leveraging Data Insights to Optimize Netflix's International Growth

eda numpy pandas python visualization

Last synced: 13 Apr 2026

https://github.com/alex1iv/asr_ru_numbers

Automatic Speech Recognition (ASR) system for Russian digits

audio-processing librosa numpy speech-recognition tensorflow

Last synced: 13 Apr 2026

https://github.com/hansalemaos/hexarray2decimal

Converts a numpy string array with hex values to int

convert hex int numpy python

Last synced: 05 May 2026

https://github.com/shlok-nahar/mnist-cnn-classifier

This repository trains and evaluates three CNN models on MNIST, providing performance comparisons and 5 unique visualizations.

confusion-matrix graph heatmap-visualization json machine matplotlib mnist numpy precision-recall python receiver-operating-characteristic seaborn sklearn tensorflow

Last synced: 13 Apr 2026

https://github.com/abdullah2020/hamoye_stageb

This is my Hamoye Stage B project. The project focuses on Predicting Energy Efficiency of Buildings. It implemented different Machine Learning algorithm technique that are not limited to Linear Regression, LASSO, Ridge etc.

eda lasso-regression linear-regression numpy pandas predictive-modeling regression-models ridge-regression rmse rsquare-values

Last synced: 09 May 2026

https://github.com/rahatmoktadir03/customer-churn-prediction

A machine learning project for predicting customer churn, enabling businesses to identify at-risk customers and develop retention strategies.

business-analytics customer-churn-prediction data-science google-colab machine-learning numpy pandas python sklearn strreamlit xception-model

Last synced: 11 Apr 2026

https://github.com/yuu-eguci/cognitive-services-trial

Try to play with Cognitive Services!! [Cognitive Services] [OpenCV] [Numpy]

cognitive-services dotenv numpy opencv-python pipenv python python3

Last synced: 05 Jul 2025

https://github.com/naveen88112/clustering_customer_invoice_data

Customer Invoice Data Clustering This project uses clustering methods on customer invoice data for segmentation analysis. It preprocesses data, normalizes features, and uses K-Means and DBSCAN to cluster customers according to spending habits and shared locations.

clustering data-preprocessing data-visualization numpy pandas python silhouette-score standardization

Last synced: 13 Apr 2026

https://github.com/mayankmittal29/algovision-statistical_methods_in_ai

Implementation of various machine learning algorithms from scratch, including Linear Regression, K-Nearest Neighbors, Decision Trees, and K-Means clustering. Also done EDA on data, Implemented LSH, IVF, SLIC algorithms also with evaluation metrics

decision-tree-classifier eda gradient-descent image-segmentation ivf knn-classification linear-regression lsh-implementation matplotlib-pyplot numpy pandas python3 seaborn sgd-optimizer sklearn slic-superpixel-algorithm

Last synced: 11 Apr 2026

https://github.com/pardhuu66/college-id-validator

FastAPI-based offline College ID Validator with Docker support

base64 dnn docker easyocr fastapi mobilenetv2 numpy onnx onnxruntime opencv pillow pydantic python tensorflow uvicorn

Last synced: 11 Apr 2026

https://github.com/hari7261/playwithdata-python

This is one of the repository where I have put lot of data science and machine learning related questions on their solutions I hope you will find something better than some other platforms. Thank you Happy exploring

data-analysis data-science data-science-learning machienlearning matplotlib matplotlib-python ml numpy numpy-arrays numpy-library pandas pandas-dataframe pandas-library python python-script sklearn

Last synced: 13 Apr 2026

https://github.com/pabs-code/img-cartoonizer-using-opencv

A streamline app using 3 ways to cartoonized an image using OpenCV and Python.

bilateral-filtering color-quantization edge-detection edge-enhancement laplacian-edge-detection numpy opencv python

Last synced: 04 May 2026

https://github.com/itshyphen/mass-mailing-script

A simple mass mailing script that sends personalized email to multiple emails importing from csv

numpy pandas python smtplib

Last synced: 13 Apr 2026

https://github.com/subhas-pramanik-09/car-detection-and-counting

This project counts the number of cars passing through a designated line in a video file using OpenCV and background subtraction techniques.

machine-learning numpy object-detection opencv

Last synced: 13 Apr 2026

https://github.com/tnleite/loan-approval-prediction

Este repositório apresenta um modelo preditivo de aprovação de empréstimos, focado em minimizar o risco de inadimplência. Utilizando EDA e algoritmos de machine learning (Random Forest, XGBoost), ajustamos o threshold para maximizar o recall de inadimplentes, contribuindo para uma gestão de riscos eficiente.

classification-algorithm data-science exploratory-data-analysis machine-learning-algorithms machine-learning-models matplotlib numpy scikit-learn scipy seaborn xgboost-classifier

Last synced: 13 Apr 2026

https://github.com/aksoni07/movie-recommendation

A hybrid movie recommendation system designed to deliver personalized and accurate suggestions by combining user preferences, item attributes, and collaborative patterns, ensuring a seamless and engaging experience.

clustering content-based-filtering data-analysis embeddings jupyter-notebook numpy ollaborative-filtering pandas personalization python recommendation-systems scikit-learn user-item-interactions

Last synced: 11 Apr 2026

https://github.com/v41bh4vr4jput/python---numpy

A collection of hands-on examples, exercises, and projects to master NumPy — the fundamental package for numerical computing in Python. This repository is perfect for beginners and advanced learners looking to explore array manipulation, mathematical operations, and high-performance data analysis.

numpy python3

Last synced: 20 Apr 2026

https://github.com/chandkund/wine-quality-prediction

This project predicts wine quality based on physicochemical properties using machine learning models. By leveraging Random Forest Classifier, Logistic Regression, and SVM, the goal is to classify wines into quality categories and uncover the key factors that influence wine quality.

logistic-regression matplotlib numpy pandas-python random-forest-classifier svm-classifier

Last synced: 01 May 2026

https://github.com/dmkk01/mlp-python

Implementation of a multilayer perceptron using Pytorch and Numpy libraries

mlp numpy pytorch

Last synced: 05 May 2026

https://github.com/n1k1f0rm/ml-predicts

Place where you can find in transparent way how ML algos works

machine-learning ml numpy python

Last synced: 21 Apr 2026

https://github.com/dbriane208/python-for-data-science

Machine Learning and Data Science repository. Love crafting Machine Learning models.

data-analysis data-science data-visualization machine-learning numpy pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/srevenant/data-science-alpine

A docker container for data science, using alpine linux and python3

alpine data numpy pandas python3 science scipy xgboost

Last synced: 05 May 2026

https://github.com/lin826/nanogpt-demo

Training and finetuning local GPTs.

gpt nanogpt numpy pytorch tqdm transformers

Last synced: 05 May 2026

https://github.com/iv4n-ga6l/GenderDetection

Gender detection using gender classification model

genderclassification genderdetection numpy pil python resnet18 torch torchvision

Last synced: 28 Apr 2025

https://github.com/hajaarh/malaria_hematie_cnn

Algorithme de réseaux de neurones convolutionnels

numpy pandas python resnet-50 sklearn-library tensorflow vgg16

Last synced: 13 Apr 2026

https://github.com/bhaskarkulshrestha/rock-vs-mine-prediction

A machine learning project to predict whether an object detected underwater is a rock or a mine using Logistic Regression.

colab-notebook logistic-regression machine-learning numpy pandas python sckit-learn supervised-learning

Last synced: 05 May 2026

https://github.com/carol-neto/sprint-4-statistical-data-analysis

In this project I had the opportunity to test my knowledge by analyzing a phone plan and creating graphs to compare the plans and determine which ones generate the most revenue.

matplotlib-pyplot numpy pandas pytho scipy-stats seaborn statistical-analysis

Last synced: 09 May 2026

https://github.com/dmarks84/coursework_project_nlp-with-nltk

Project for University of Michigan Applied Data Science Specialization -- Utilized NLTK library to process natural language, and then built several spelling recommenders for a list of misspelled words.

data-modeling databases dataframes eda nlp numpy pandas python reporting statistics text-mining visualization

Last synced: 13 Apr 2026

https://github.com/burhanali2211/marksanalyzer

This is a simple CLI-based tool that helps analyze student marks by calculating key statistics such as the highest and lowest scores.

analysis analyzer arrays marks numpy

Last synced: 05 May 2026

https://github.com/shwetapardhi/assignment-05-multiple-linear-regression-2

Prepare a prediction model for profit of 50_startups data. Do transformations for getting better predictions of profit and make a table containing R^2 value for each prepared model. R&D Spend -- Research and devolop spend in the past few years Administration -- spend on administration in the past few years Marketing Spend -- spend on Marketing in t

collinearity-diagnostics cooks-distance correlation-analysis eda heteroscedasticity homoscedasticity leverage-score multi-linear-regression numpy ols-regression p-value pair-plot python r-square-values regress-exog residual-analysis smf statsmodels vif

Last synced: 09 May 2026

https://github.com/iamwatchdogs/retail-insights-from-superstore-data

An simple Data Analysis Project for finding insight within the Super store data set

conda-environment data-analytics jupyter-notebook matplotlib numpy pandas python

Last synced: 05 May 2026

https://github.com/prashhhant213/cardioflex-treadmill-analysis-using-descriptive-statistics-probability

Description Analysis and Visualization on CardioFlex Treadmill data to provide insights and recommendations to improve their userbase.

colab-notebook numpy pandas probability python stats

Last synced: 11 Apr 2026

https://github.com/ot-code/coca-cola-stock-prediction

This repo compares four predictive models—Linear Regression, ARIMA, XGBoost, and LSTM—to forecast Coca‑Cola FEMSA stock closing prices using Python and five years of historical data.

arima csv linear-regression lstm-neural-networks mae matplotlib mse numpy pandas python r2 scikit-learn seaborn tensorflow-keras xgboost

Last synced: 13 Apr 2026

https://github.com/csengupta1101/netflix-rating

The project revolves around Netflix shows and movies around the world. The problem statement that is being tried to address here is that what kind of show to come up with in future times and how well that will fit with the audience

jupyter-notebook matplotlib movies netflix numpy pandas plotly python python3 rating tvseries

Last synced: 13 Apr 2026

https://github.com/abhisek-13/whatsapp-chat-analyzer

The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.

data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026