An open API service indexing awesome lists of open source software.

NumPy

NumPy is an open source library for the Python programming language, adding support for large, multidimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

https://github.com/smirnovlad/data-science-notebooks

A collection of various data analysis approaches

data-science deep-learning kaggle machine-learning numpy pandas pytorch

Last synced: 10 Apr 2026

https://github.com/ameykasbe/credit-card-fraud-detection-on-imbalanced-dataset

Examined data preprocessing techniques and the performance of six predictive models in Python on the credit card fraud detection problem with an imbalanced dataset. Algorithms implemented: Logistic Regression, K Nearest Neighbours, Support Vector Classification, Naïve Bayes Classifier, Decision Tree Classifier, and Random Forest Classifier.

classification machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/jol79/python_exercises

Solving interesting Python exercises on different topics

matplotlib-pyplot numpy pandas python3 pythonexercises scikit-learn seaborn

Last synced: 10 Apr 2026

https://github.com/jkosla/neural_network_from_scratch_numpy

Neural Network From Scratch in Python | Build a simple neural network from scratch using pure Python and NumPy. Learn about forward propagation, backpropagation, and training with gradient descent. Accompanies my Medium article.

ai aritificial-intelligence medium nerual-networks numpy python3 tutorial

Last synced: 10 Apr 2026

https://github.com/ryan-bendelson/2024-summer-research

This is Python code that I worked with during my summer 2024 research project involving quantum physics.

density-matrices kronecker-product linear-algebra miniconda3 numpy numpy-arrays partial-trace python quantum-information

Last synced: 16 Apr 2026

https://github.com/sahil210695/gradient-descent

A simplified explanation of gradient descent for linear regression in Python using NumPy

gradient-descent gradient-descent-algorithm linear-regression matplotlib mini-batch-gradient-descent numpy python stochastic-gradient-descent

Last synced: 03 May 2026
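
For readers unfamiliar with the technique named in the entry above, here is a minimal sketch of batch gradient descent for one-variable linear regression in NumPy. It is not taken from the linked repository; the function name `fit_linear_gd`, the learning rate, the epoch count, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def fit_linear_gd(x, y, lr=0.1, epochs=2000):
    """Fit y ~ w*x + b by batch gradient descent on mean squared error.

    Hypothetical helper for illustration; lr and epochs are arbitrary choices.
    """
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        err = w * x + b - y                    # residuals for the current parameters
        grad_w = (2.0 / n) * np.dot(err, x)    # d(MSE)/dw
        grad_b = (2.0 / n) * err.sum()         # d(MSE)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Synthetic, noise-free data generated from y = 3x + 1
x = np.linspace(0.0, 1.0, 50)
y = 3.0 * x + 1.0
w, b = fit_linear_gd(x, y)
```

On this noise-free data the fitted `w` and `b` should approach 3 and 1. Mini-batch and stochastic variants differ only in computing the gradient on a subset of the samples at each step.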

https://github.com/khaymanii/spam_mail_detection_model

This model was built using Python and the logistic regression algorithm

matplotlib numpy pandas python sckiit-learn

Last synced: 10 Apr 2026

https://github.com/jaweria-b/eda-basketball

The Streamlit app analyzes NBA player stats with user-selected filters, offering data download and intercorrelation heatmap.

matplotlib numpy python streamlit

Last synced: 10 Apr 2026

https://github.com/niteshchawla/logistics-nn-regression

A case study on India's largest marketplace for intra-city logistics. The dataset contains the data needed to train a regression model that estimates delivery time from the available features.

adam-optimizer data-visualization encoding exploratory-data-analysis feature-engineering hidden-layers hyperparameter-tuning keras-tensorflow kerastuner metrics neural-network numpy pandas regression relu scaling sequential-models

Last synced: 10 Apr 2026

https://github.com/rama1997/lane-line-detection

Uses computer vision to detect lane lines on the road in images and videos taken from the POV of a driving vehicle

numpy opencv opencv-python python

Last synced: 10 Apr 2026

https://github.com/azaz9026/data_cleaning

Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.

data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/alejoduarte23/reading_data_from_dewesoft

The following repository retrieves sensor data (acceleration and strains) from both local and cloud databases. It processes the data using classes from another repository called Modal Engine for spectral analysis, modal analysis, and signal processing.

dewesoft matplotlib modal-analysis numpy orm scipy signal-processing sql sqlalchemy

Last synced: 07 Jan 2026

https://github.com/fedesgh/parkinson_volatility_spread_on_cedears

Creating a function that returns a graph of the difference between Parkinson's volatility and regular volatility, given certain bounds

numpy pandas pickle seaborn

Last synced: 10 Apr 2026
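
As a rough illustration of the quantity the entry above describes, the sketch below computes the Parkinson range-based volatility estimate and subtracts a close-to-close volatility. It is not the repository's code. Reading "regular volatility" as the sample standard deviation of log returns is an assumption, and the function names and price arrays are hypothetical.

```python
import numpy as np

def parkinson_volatility(high, low):
    """Parkinson estimator: sqrt(mean(ln(H/L)^2) / (4 ln 2)), per period."""
    log_range = np.log(np.asarray(high, dtype=float) / np.asarray(low, dtype=float))
    return np.sqrt(np.mean(log_range ** 2) / (4.0 * np.log(2.0)))

def close_to_close_volatility(close):
    """Sample standard deviation of log returns (assumed meaning of 'regular' volatility)."""
    log_returns = np.diff(np.log(np.asarray(close, dtype=float)))
    return np.std(log_returns, ddof=1)

# Hypothetical prices, for illustration only
high = np.array([10.5, 10.8, 11.0, 10.9])
low = np.array([10.0, 10.2, 10.6, 10.4])
close = np.array([10.3, 10.6, 10.8, 10.5])

spread = parkinson_volatility(high, low) - close_to_close_volatility(close)
```

Both estimates here are per-period values. Annualizing them, or computing them over rolling windows, would be a further step that this sketch leaves out.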

https://github.com/navindafernando/feature-extraction

Heart risk level prediction regression model and web app using feature engineering and data preprocessing

categorical-encoding feature-engineering flask handling-outlier html5 joblib label-encoding machine-learning numpy pandas polynomial-features quantile-transformer scaling

Last synced: 10 Apr 2026

https://github.com/hussain-7/emotion_detection-master

Human Emotion Analysis using facial expressions in real-time from webcam feed. Based on the dataset from Kaggle's Facial Emotion Recognition Challenge.

keras-tensorflow matplotlib numpy opencv-python tensorflow

Last synced: 08 May 2026

https://github.com/mohamed15058/text_classification-digital-egypt-pioneers-initiative-project-

Text classification (Digital Egypt Pioneers Initiative project)

depi mlops nlp nltk numpy panadas python3 twnsorflow

Last synced: 10 Apr 2026

https://github.com/omarsaad21/rfm-clustering-

A full data science and deployment project focusing on data analysis and ML (creating a customer segmentation model to recommend the best merchants to each user as targeted offers)

business-solutions data-science eda numpy pandas plotly python sickit-learn streamlit

Last synced: 11 Apr 2026

https://github.com/soumyapro/wine-quality-prediction

This project is about the prediction of wine quality using machine learning algorithms

boxplot matplotlib numpy pandas random-forest smote

Last synced: 10 Apr 2026

https://github.com/paraskevi-kivroglou/rl-pong-agent

A project by Paraskevi Kivroglou as part of exploring deep reinforcement learning applications.

atari atari-games gym-environment gymnasium numpy python3 pytorch q-learning reinforcement-learning reinforcement-learning-agent

Last synced: 11 Apr 2026

https://github.com/yash-rewalia/airbnb_eda_pandas

The goal of the project is to gather and analyze detailed information about the different entries in order to provide insights about the host and price of a property in a particular area, according to your preference, type of room, and number of reviews.

data data-cleaning data-insights data-preprocessing data-visualization matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/farhad-here/data-visualization-analysis-dva

This is my data analysis project. Users can use it to clean and preprocess data or to visualize it. They can also impute or encode their dataset.

altair bokeh data-analysis data-analysis-python io matplotlib numpy pandas plotly python sklearn streamlit

Last synced: 11 Apr 2026

https://github.com/hansalemaos/hexarray2decimal

Converts a NumPy string array of hex values to integers

convert hex int numpy python

Last synced: 05 May 2026
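
One simple way to do the kind of conversion the entry above describes is sketched below. It is not the linked library's API or method: it applies Python's built-in `int(s, 16)` element by element through `np.vectorize`, which is convenient but not a fast path. The array contents are made up for illustration.

```python
import numpy as np

# Hypothetical input: hex values stored as strings, with or without a "0x" prefix
hex_strings = np.array(["ff", "0x1A", "7f", "0"])

# int(s, 16) accepts an optional "0x" prefix; otypes fixes the output dtype
to_int = np.vectorize(lambda s: int(s, 16), otypes=[np.int64])
values = to_int(hex_strings)
```

Here `values` is an `int64` array holding 255, 26, 127, and 0.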

https://github.com/pardhuu66/college-id-validator

FastAPI-based offline College ID Validator with Docker support

base64 dnn docker easyocr fastapi mobilenetv2 numpy onnx onnxruntime opencv pillow pydantic python tensorflow uvicorn

Last synced: 11 Apr 2026

https://github.com/zuhairzia/titanic-survival-project

This is a Titanic Survival Prediction Model developed using Python, Pandas, Scikit-learn, and Jupyter Notebook. The model predicts whether a passenger survived the Titanic disaster based on features such as age, gender, and passenger class.

csv-dataset flask jupyter-notebook matplotlib numpy pandas pandas-library python scikit-learn seaborn streamlit

Last synced: 11 Apr 2026

https://github.com/djdhairya/crop-recommendation

Crop Recommendation System is a powerful tool for enhancing agricultural decision-making. By leveraging data-driven insights, it empowers farmers to maximize yield and ensure sustainable practices.

adaboostclassifier bagging-classifier csv decision-trees gaussian html knn-classification logistic-regression machine-learning machine-learning-algorithms matplotlib model numpy pandas random-forest random-forest-classifier scikit-learn seaborn svc

Last synced: 11 Apr 2026

https://github.com/jigyasag18/fake-news-prediction-app

The Fake News Prediction App Repository offers a machine learning project that classifies news articles as fake or real. It uses a dataset of 20,000 articles and methods such as TF-IDF vectorization and lemmatization, achieving ~95% classification accuracy with a random forest classifier model.

data datapreprocessing logistic-regression machine-learning machine-learning-algorithms numpy pandas prediction stemming streamlit streamlit-webapp vectorization

Last synced: 11 Apr 2026

https://github.com/armahdavi/analytics-data-pipelines-statistics-plotting---dust-extraction-hvac-filters---phase-1

PhD Technical Paper 1 - Phase 1 - Mahdavi & Siegel (2020) (Aerosol Science & Technology; AS&T) - Sharing all the data pipelines, processing code, descriptive statistics, statistical modelling, and plotting/visualizations - Project Milestone: 2017 - 2020 - Full-length article is available

matplotlib numpy pandas pandas-dataframe pyplot python scipy-stats sklearn

Last synced: 13 Apr 2026

https://github.com/kirtipratihar/python_libraries_for_ds

This repository serves as a comprehensive guide to Python programming for Data Science. It covers essential topics like data manipulation, data visualization, machine learning, and statistical analysis using popular libraries such as Pandas, NumPy, Matplotlib, Seaborn, and Scikit-Learn.

artificial-intelligence machine-learning numpy pandas python scikit-learn tensorflow

Last synced: 11 Apr 2026

https://github.com/mehradi-github/ref-jupyter-2510

Using Python in machine learning

matplotlib numpy pandas python sklearn statistics

Last synced: 11 Apr 2026

https://github.com/timothyjan/intro-machine-learning-classifiers

We will use the scikit-learn library, which is a higher-level machine learning library that will work with NumPy data, and Pandas, a library that makes it easier to manipulate data. We will explore a variety of classification algorithms, and compare their performance on a “real-world” dataset, which will introduce its own set of challenges.

numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/vidushibhadana/eda-on-nyc-taxi-data

Conducting an Exploratory Data Analysis (EDA) on New York City taxi data and visualizing it through countplots, distribution plots (displot), and histograms using Python and its libraries.

data data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/varkenvarken/blempy

Small, safe utilities to efficiently transfer Blender property-collection attributes (e.g. vertex coordinates) to/from NumPy arrays and perform vectorized operations with minimal Python overhead.

blender numpy

Last synced: 13 Jan 2026

https://github.com/volf52/deep-neural-net

A simple deep neural net class written to work with NumPy and CuPy

binarized-neural-networks binary-neural-networks bnn cupy deep-learning deep-neural-networks mnist numpy python python3

Last synced: 05 May 2026

https://github.com/dhanish03/credit_card_fraud_detection

Developed and implemented an advanced CCFDS using ML algorithms and pattern recognition techniques. Integrated real-time monitoring and adaptive learning capabilities into the system to dynamically adjust fraud detection parameters, ensuring effectiveness in identifying emerging fraud patterns.

kaggle-dataset numpy pandas-dataframe python3 sklearn

Last synced: 16 Apr 2026

https://github.com/mrktsm/spam-email-recognizer

Long Short-Term Memory (LSTM) network trained to classify emails as spam or non-spam. It processes email content to make accurate predictions and can be integrated into projects for efficient spam detection and email management.

data-preprocessing keras lstm-neural-network model-architecture nltk numpy pandas performance-evaluation scikit-learn spam-classification-model tenserflow training-the-model

Last synced: 09 Apr 2026

https://github.com/ngangawairimu/data-validation-using-python

An agricultural dataset validated for use with Python code. Builds a data pipeline that ingests and cleans data at the press of a button.

jupyter-notebook numpy pandas pytest python

Last synced: 13 Apr 2026

https://github.com/eljandoubi/genre_classification

Create an ML pipeline for Genre Classification using MLflow.

hydra machine-learning mlflow numpy pandas pandas-profiling pytest scikit-learn scipy wandb

Last synced: 11 Apr 2026

https://github.com/nishi1612/knight-tour-problem

IT485 Logic of Inference Project on Knight's Tour. A Hamiltonian path problem: find a path for the knight that covers the entire chessboard, visiting every cell exactly once.

backtracking-algorithm bootstrap flask html knights-tour localhost numpy pygame python tkinter warnsdorff

Last synced: 11 Apr 2026

https://github.com/arthurdsant/dataanalysis-agricultural_raw_material

This Python project performs analysis and visualization of agricultural raw material price data using a Kaggle dataset. Based on Jupyter Notebook and Python.

jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 26 Jan 2026

https://github.com/mgitrov/lifespan-x-population-status

A machine learning project aiming to predict animals' lifespan and population status.

bs4 data-science machine-learning matplotlib numpy pandas python regular-expressions requests seaborn sklearn

Last synced: 11 Apr 2026

https://github.com/chintanboghara/rocket-simulation

A comprehensive web-based orbital mechanics simulator with advanced mission planning, real-time tracking, and educational features.

docker flask html javascript numpy plotly python

Last synced: 11 Apr 2026

https://github.com/nachtfeuer/covid19

Python script(s) for visualizing corona data

csv json matplotlib numpy pandas python requests tkinter

Last synced: 05 May 2026

https://github.com/kahngjoonkoh/randomshapegenerator

A program that will generate images with random shapes and background colours. Can be customized and generated in bulk.

generative-art numpy opencv python threading tkinter

Last synced: 11 Apr 2026

https://github.com/dastogirrudro/machine-learning-and-deep-learning

This is my university thesis project. It uses machine learning and deep learning (an LSTM as the deep learning model) to identify aggressive spam messages. It is written in Python and uses pandas, scikit-learn, and other frameworks. Several algorithms were tried to increase the project's accuracy.

deep-learning lstm machine-learning numpy pandas python scikit-learn

Last synced: 11 Apr 2026

https://github.com/erikbrinkman/hilbert-bytes

A Python library for converting between d-dimensional points and indices on a Hilbert curve

hilbert-curve numba numpy python

Last synced: 08 May 2025

https://github.com/andersoncrs/prediccion_precio_vehiculos_statsmodels

This project uses a linear regression model to predict vehicle prices based on their main characteristics. The analysis includes problem definition, data exploration and cleaning, conversion of categorical variables to numeric ones, correlation evaluation, and model training.

analisis-de-datos analisis-exploratorio-de-datos matplotlib numpy seaborn statsmodels visualizacion-de-datos

Last synced: 26 Apr 2026

https://github.com/shwetapardhi/assignment-03-q1--hypothesis-testing

Q1. An F&B manager wants to determine whether there is any significant difference in the diameter of the cutlet between two units. A randomly selected sample of cutlets was collected from both units and measured. Analyze the data and draw inferences at the 5% significance level. Please state the assumptions and tests that you carried out to check validit

hypothesis-testing numpy p-value pandas python scipy significance-testing stats t-test

Last synced: 11 Apr 2026

https://github.com/lijesh010/ml_project_data_preprocessing

The main objective of this project is to design and implement a robust data preprocessing system that addresses common challenges such as missing values, outliers, inconsistent formatting, and noise. By performing effective data preprocessing, the project aims to enhance the quality, reliability, and usefulness of the data for machine learning.

data-cleaning data-exploration data-preprocessing machine-learning numpy pandas-python python scikit-learn

Last synced: 11 Apr 2026

https://github.com/isabelacaldeira/chutelibre

Crashing into code. Here is a physics problem about free fall solved with Python!

freefall jupyter-notebook matplotlib numpy physics physics-simulation python3

Last synced: 11 Apr 2026

https://github.com/chanmeng666/advanced-neural-network-applications

Practical implementations of perceptron and linear neuron models for classification and regression, with mathematical analysis and visualizations in Jupyter notebooks.

classification data-analysis data-science educational gradient-descent jupyter-notebook linear-neuron machine-learning matplotlib neural-network neural-networks numpy perceptron python regression

Last synced: 03 May 2026

https://github.com/apfirebolt/numpy-and-pandas-examples

Some examples and sample datasets for learning NumPy, pandas, and other data science libraries in Python

data-analysis jupyter-notebook numpy pandas python

Last synced: 17 Apr 2026

https://github.com/mramshaw/intro-to-ml

Intro to Machine Learning - Pattern Recognition for Fun and Profit

machine-learning matplotlib ml numpy pandas pip pip3 python scikit-learn scipy seaborn seaborn-plots sklearn statsmodels tensorflow weka

Last synced: 11 Apr 2026

https://github.com/dmarks84/ind_project_obesity-multi-class-classification--kaggle

Independent Project - Kaggle Competition -- I worked on the obesity classification data set as part of a Kaggle Competition of the same name, scoring (for accuracy) above 0.9

classification correlation-analysis cross-validation data-modeling data-visualization dataframes eda gridsearchcv matplotlib multiclass-classification numpy pandas python seaborn sklearn statistics supervised-ml

Last synced: 11 Apr 2026

https://github.com/harmanveer-2546/wafer-fault-detection

The goal is to eliminate manual work in identifying faulty wafers. Opening and handling suspected wafers disrupts the entire process. False negatives result in wasted time, manpower, and costs.

clustering data-transformation feature-selection machine-learning matplotlib numpy pandas python random-forest roc-auc-curve roc-auc-score seaborn sklearn svc xgboost

Last synced: 11 Apr 2026

https://github.com/lohiyah/real-estate-price-forecast

A Python-based app predicting real estate prices using machine learning. Built with Pandas, NumPy, Scikit-learn, Matplotlib, and Seaborn for data processing and visualization, and Flask for the web interface.

flask matplotlib numpy pandas python3 scikit-learn seaborn

Last synced: 11 Apr 2026

https://github.com/akhileshthite/india-population

ML (simple linear regression) model for predicting India's population.

machine-learning numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/chokzb/covid19_vaccination_analysis

An EDA project examining global COVID-19 vaccination progress. The notebook investigates vaccination trends by country, daily vaccination rates, timeline patterns, and dose distribution. The project includes visualisations created with Matplotlib, Seaborn, and Plotly.

covid-19 data-analysis data-visualization jupyter-notebook matplotlib numpy pandas plotly python seaborn vaccination

Last synced: 07 May 2026

https://github.com/stdlib-js/blas-ext-linspace

Return a new ndarray filled with linearly spaced values over a specified interval along one or more ndarray dimensions.

arange arrange javascript linear linspace math mathematics matlab ndarray node node-js nodejs numpy seq sequence statistics stats stdlib

Last synced: 04 May 2026

https://github.com/4211421036/githubiotpy

GitHubIoT is a comprehensive toolkit designed to simplify the visualization of IoT (Internet of Things) data with seamless GitHub integration. The application provides an intuitive graphical interface for real-time data monitoring, analysis, and configuration

cli esp32 esp8266 github-actions github-iot matplotlib numpy pypi-packages python tkinter

Last synced: 16 Apr 2025

https://github.com/pramodyasahan/model-selection

This repository explores and compares different regression models for predicting continuous outcomes. This repository includes implementations and evaluations of five key regression models. The primary goal is to demonstrate how each model works, evaluate their performance using R-squared values, and guide users in selecting the best model.

machine-learning modelselection numpy pandas python regression scikit-learn

Last synced: 08 Mar 2025

https://github.com/chaudharypraveen98/lungcancerdetection

To distribute the work of doctors and process the large amount of data to produce accurate results on the go

numpy pandas pillow python scipy tenserflow

Last synced: 16 Apr 2026

https://github.com/matheusafonseca/c111

This repository is dedicated to storing and organizing the code developed in the course C111 - Análise de Dados (Data Analysis), offered by the Instituto Nacional de Telecomunicações (INATEL).

data-analysis matplotlib numpy pandas python

Last synced: 06 May 2026