An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/avijit-jana/redbus-data_scraping_and_filtering_with_streamlit_app

A Streamlit-based application leveraging Selenium to automate data scraping from Redbus, enabling efficient collection, analysis, and visualization of bus travel data for improved operational efficiency and strategic planning in the transportation industry.

automation dashboard data-analysis data-visualization datadrivendecisions python3 redbus selenium-python streamlit-application webscrapping

Last synced: 15 Mar 2025

https://github.com/mardavsj/weather-prediction

Weather prediction model which mainly focuses on visualization.

data-analysis data-visualization matplotlib numpy pandas pandas-dataframe

Last synced: 10 Apr 2026

https://github.com/datalopes1/ds_salaries2024_eda

This project carries out exploratory data analysis (EDA) on the Data Science Salaries 2024 dataset, which is available on Kaggle under the Database: Open Database license and was uploaded by Sazidul Islam.

data-analysis data-visualization eda exploratory-data-analysis jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/quantitext/quantitext

Official repository for QuantiText applications in the .NET ecosystem.

api aspnet-core csharp data-analysis dotnet-core mvc-architecture

Last synced: 30 Mar 2025

https://github.com/m-faizan-mahmood/detailed-exploratory-data-analysis-eda-marketing-recomendations.

This project focuses on cleaning, preprocessing, and analyzing data using Pandas and NumPy. Key steps include handling missing values, removing outliers, feature engineering, and exploratory data analysis (EDA). Visualizations with Matplotlib and Seaborn highlight trends in customer spending, campaign performance, and product sales.

big-data data-analysis data-processing data-science eda exploratory-data-analysis numpy pandas python

Last synced: 11 Apr 2026

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time Series Data Analysis project on the daily stock prices of four companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 03 May 2026

https://github.com/alemalvarez/data-analysis-web-project

Web-app providing a simple interface for data storage,

data-analysis data-science javascript react webapp

Last synced: 29 Apr 2026

https://github.com/sunnybibyan/exploratory-data-analysis-eda

Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates.

data-analysis data-visualization jupyter-notebook python titanic-dataset

Last synced: 18 Jan 2026

https://github.com/jossimmar/ensa-scripts_py

Repository for managing consumption data of the major customers (Clientes Mayores) of ENSA, part of Grupo Distriluz.

data-analysis electrical-engineering python sqlite

Last synced: 10 May 2026

https://github.com/iamjuniorb/data_structures_and_algorithms

I'm taking the Data Structures and Algorithms I (C949) class in school and decided to write up all of these searching algorithms, sorting algorithms, structures, and so on to get a better understanding. These can be used with large datasets to test their space and time complexities.

data data-analysis data-science data-structures datastructures datastructures-algorithms datastructuresandalgorithm math mathematics programming python python-app python-library python3

Last synced: 08 Jun 2026

https://github.com/geo-y20/uber-rides-data-analysis

This project aims to analyze Uber ride data to understand various aspects of ride usage, such as the distribution of rides across different categories, purposes, months, days, and times.

dashboard dashboard-templates data data-analysis data-analysis-python data-analytics data-visualization pandas powerbi python recommendation-system rides uber

Last synced: 13 Apr 2026

https://github.com/carolinedotxyz/dp_sgd_classification

A hands-on educational walkthrough of training a CelebA (Eyeglasses) image classifier with Differentially Private SGD using PyTorch and Opacus. The focus of this repo is on clarity and reproducibility through balanced subsets, deterministic preprocessing, and side-by-side baseline vs. DP training, while acknowledging real trade-offs.

celeba-dataset classification data-analysis dp-sgd machine-learning opacus python pytorch

Last synced: 16 May 2026

https://github.com/ahammadshawki8/playing-with-pandas

🐼 Pandas is one of my favourite libraries in Python. It is well-known for "Analyzing" data. Learn the basics and beyond the basics of Pandas from this repository. 🤍🖤

beginner-friendly data-analysis favourite-library pandas python

Last synced: 17 Apr 2026

https://github.com/ddihora1604/iitk_esg

Researching and analyzing key ESG (Environmental, Social, Governance) metrics and their impact on stock performance and market behavior. Leveraging AI techniques (like Machine Learning and NLP) in finance to extract insights from ESG disclosures, enhancing financial predictions and sustainable investment strategies.

data-analysis data-visualization esg python yahoo-finance

Last synced: 24 Apr 2025

https://github.com/aritrakar/statpy

A simple package containing some functions for analysing Gaussian and Binomial distributions. Created for the Udacity AWS MLE Foundations 2021 course.

data-analysis python statistics

Last synced: 24 Oct 2025

https://github.com/virajbhutada/global-universities-success-analysis-powerbi-sql-excel

This capstone project conducts in-depth analysis using Power BI, SQL, and Excel to explore complex dynamics shaping global university success. Integrating data from diverse ranking systems and criteria, our aim is to unravel the factors influencing universities worldwide.

capstone capstoneproject data-analysis data-analytics data-insights data-science data-science-projects data-visualization excel exploratory-data-analysis mece mysql powerbi powerpoint sql

Last synced: 20 Jun 2025

https://github.com/bretsw/beds

Bookdown project for an open education resource (OER) book: Becoming Educational Data Scientists

analytics data-analysis data-analytics data-science

Last synced: 31 Mar 2025

https://github.com/silvano315/med-physics

This is a repository about medical physics. It will be based on 4 paths: medical data to analyse, SOTA programs for medical purposes, computer vision and eXplainability.

computer-vision data-analysis data-science explainable-ai medical-imaging medical-physics medical-tool

Last synced: 24 Mar 2025

https://github.com/ryannapp12/quant_trading_engine

A modular and scalable quantitative trading engine built in Python. This project demonstrates efficient data caching with SQLite, concurrent backtesting, and advanced risk analytics, showcasing best practices in clean code architecture and performance optimization.

algorithmic-trading backtesting dash data-analysis data-visualization fintech lstm machine-learning numpy pandas plotly python quantitative-finance real-time risk-management sqlite technical-analysis tensorflow time-series-analysis trading-strategies

Last synced: 11 Apr 2026

https://github.com/abhinavsharma07/fraud_analytics-credit_card_fraud_detection

The aim of this project is to predict fraudulent credit card transactions with the help of different machine learning models.

banking data-analysis decision-trees hyperparameter-optimization machine-learning-algorithms pipelines random-forest-classifier svm-classifier xgboost-classifier

Last synced: 06 Oct 2025

https://github.com/ayu-hack/ayu-hack

Enthusiastic learner passionate about building software and exploring the world of technology. Eager to contribute to open-source projects and collaborate with the developer community. Continuously developing my skills in Python, SQL, HTML, CSS, Power BI, and macOS. Always open to feedback and excited to keep growing!

config css data-analysis github-config html powerbi-desktop python3 sql

Last synced: 30 Apr 2026

https://github.com/victoriapm/analyze_a-b_test_results

Understand the results of an A/B test run by an e-commerce website.

ab-testing data-analysis ecommerce-website

Last synced: 06 Oct 2025

https://github.com/zeynepcol/data-analysis-visualization

Data visualization and interactive analytics - Olympics Dataset

data-analysis data-science data-visualization matplotlib pandas plotly python scipy seaborn streamlit

Last synced: 03 May 2026

https://github.com/luochang212/weibo-analysis

Data analysis based on Sina Weibo.

data-analysis weibo

Last synced: 03 Apr 2026

https://github.com/airdac/sim-telco_customer_churn

Prediction of customer churn with logistic regression in R. Team project from UPC's Master's Degree in Data Science

classification data-analysis data-science logistic-regression r statistical-models upc

Last synced: 28 May 2026

https://github.com/gorhkdwj/da_portfolio

Kim Jae Chun's DA_Portfolio

data data-analysis python sql

Last synced: 20 Feb 2026

https://github.com/walkerdustin/vergleich-von-messmethoden-fuer-punktwolken

When a physical space is surveyed, the result is a point cloud. This point cloud describes selected points in the space, for example on the walls and the ceiling. When these points are captured in two separate measurements, perhaps even by different devices, the goal afterwards is to determine how closely the two point clouds agree. There are two fundamentally different methods for doing this, and they are compared here.

3d-models accuracy-metrics data-analysis data-visualization kaggle measure-distance numpy point-cloud pointcloudprocessing punkte python science-research simulation statistics

Last synced: 11 Apr 2026

https://github.com/phillbertnevinemmanuel/movieindustryanalysis-correlation

This project is a comprehensive data analysis endeavor within the Movie Industry, spanning from Data Cleaning to Exploratory Data Analysis, Correlation Analysis, and Temporal Analysis. The dataset was sourced from Kaggle, purportedly scraped using the IMDb API. Python was the primary tool utilized for analysis.

data-analysis data-cleaning python

Last synced: 30 Apr 2026

https://github.com/umutsevdi/hr-management

HR Management, Analytics and Salary Determination System

analytics data-analysis java java17 postgresql python spring spring-boot vaadin vaadin-flow

Last synced: 10 Apr 2026

https://github.com/komailmk/instagram-reach-forecasting

This repository provides a Python-based solution for forecasting Instagram reach using historical data and SARIMA modeling techniques.

data-analysis data-visualizations machine-learning

Last synced: 05 Oct 2025

https://github.com/alcestide/scianalytics

Playground for Data Analysis and Visualization for Research and Scientific Purposes with Pandas and Plotly.

csv data-analysis data-science data-visualization pandas plotly python science-research statistics

Last synced: 30 Apr 2026

https://github.com/erickkhosasi/thelook-data_analysis

Final project for my SQL mini bootcamp. This project explores an e-commerce dataset to uncover key business insights. Data insights were queried in Google BigQuery and visualized with Google Sheets.

bigquery data-analysis e-commerce sql

Last synced: 05 Oct 2025

https://github.com/muhammadhilmyputrarisma/ab-test

Python code for A/B testing on Cookie Cats game data. This project analyzes the impact of moving the first gate from level 30 to level 40 on player retention and game rounds, helping to evaluate if delaying the gate improves player engagement and gameplay experience.

ab-testing cookie-cats data-analysis data-visualization game-analytics python statistics

Last synced: 18 May 2026

https://github.com/tks18/pyquery

PyQuery is a local-first data operating system built on lazy execution that processes 100GB+ files while you doomscroll. No cap. 🧢

data-analysis data-science etl hdfs parquet pipeline polars python

Last synced: 14 Jan 2026

https://github.com/webuccinoco/mysql-pivot-tables

Build complex MySQL pivot tables without touching a single line of code. This free PHP tool lets you visually connect your database and map out your data sources with a few simple clicks.

business-analytics business-intelligence crosstab data-analysis data-analytics data-visualization mysql mysql-database mysql-pivot-table mysql-reports mysql-virtualization php php-pivot-table php-reports pivot-tables reporting-tools

Last synced: 04 Feb 2026

https://github.com/rara-ch/data-analysis-portfolio

This repository stores my data analytics projects, showcasing my skills in SQL and Python.

data-analysis mathematics matplotlib numpy pandas portfolio probability python seaborn sql statistics

Last synced: 12 Mar 2025

https://github.com/gustavo-zamai/shop_data_analisys

Analysis of sales at different shopping malls.

data-analysis openpyxl pandas python3 pywin32

Last synced: 01 Mar 2025

https://github.com/priboy313/pandasflow

A set of custom python modules for friendly workflow on pandas

catboost data-analysis data-science pandas phik python scikit-learn shap

Last synced: 20 Jan 2026

https://github.com/jcaperella29/stock_evaluation_python

A Python script to classify companies based on financial metrics like Piotroski F-Score and Stock Valuation, using CSV financial data for analysis and output.

ai-in-finance artificial-intelligence classification csv-processing data-analysis expert-system finance financial-analysis financial-analysis-tools piotroski-f-score python quantitative-analysis rule-based-classifier stock-analysis stock-valuation

Last synced: 07 Sep 2025

https://github.com/hetuvpatel/research-chatgpt

Research and data analysis project evaluating the social, ethical, and educational impacts of ChatGPT using survey-driven insights and Python-powered data analysis. 📚🤖

data-analysis matplotlib pandas python seaborn

Last synced: 01 May 2026

https://github.com/amstuta/cpp-neural-network

Simple implementation of a feedforward neural network in C++.

data-analysis deep-learning machine-learning neural-network

Last synced: 08 Apr 2025

https://github.com/ondrejhruby/countries-of-the-world

Explore global data with this repository, featuring insights, visualizations, and Python code examples on countries worldwide—perfect for enhancing your data analysis and visualization skills.

data-analysis data-science data-visualization geography jupyter-notebook machine-learning matplotlib pandas python statistics

Last synced: 16 Apr 2026

https://github.com/sijuswamy/data-analytics-using-r

Course repository for the add-on course Data Analysis using R.

data-analysis

Last synced: 12 Apr 2025

https://github.com/joannescode/data-series-with-kaggle

Repository of hands-on notebooks on cleaning and analyzing datasets.

data-analysis matplotlib pandas python

Last synced: 13 Mar 2025

https://github.com/0xpr03/clantool

CF Management & Data Analysis Tool, crawler backend in Rust

backend-server crawler data-analysis rust

Last synced: 05 Feb 2026

https://github.com/happybono/sonatasmooth

Provides three different noise reduction algorithms for smoothing out data: Rectangular Averaging, Binomial Median Filtering, and Binomial Averaging. It processes data from a list and displays the results in another list.

algorithms average binomial binomial-coefficient binomial-theorem calibration csharp data-analysis data-calibration dynamic-noise-reduction median noise-algorithms noise-reduction noise-reduction-kernel outliers rectangular-averaging windows-desktop windows-desktop-application windows-forms winforms

Last synced: 30 Oct 2025

https://github.com/as16082023/hotel-booking-analysis-eda-

Exploratory Data Analysis on hotel booking data using Python

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/as16082023/music-store-analysis

This project involves analyzing music store data using SQL queries in MySQL Workbench to enhance decision-making, identify trends, and understand customer behavior.

data-analysis music-store-analysis mysql sql

Last synced: 06 Jul 2025

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 01 Mar 2025

https://github.com/mohamed3nan/udacity

Udacity Data Analysis Nanodegree Program

data-analysis data-visualization numpy pandas python

Last synced: 10 Apr 2026

https://github.com/md-emon-hasan/1-simple-stock-price-ml-app

A simple machine learning application for stock prices, demonstrating data preprocessing, model training, and deployment using scikit-learn.

data-analysis data-science eda ml-app streamlit-webapp time-series time-series-analysis webapp

Last synced: 31 May 2026

https://github.com/sanam2405/chatinfo

Analysing the WhatsApp Chat with my crush over a 6M period

data-analysis data-visualization python

Last synced: 27 Apr 2026

https://github.com/derrickbaruga7/mapping-median-age-europe

An R project that creates an interactive map of the median age across European regions using Eurostat data and spatial visualization packages.

data-analysis data-science data-visualization datascience european-union mapping r

Last synced: 25 Mar 2025

https://github.com/com-480-data-visualization/project-2023-the-vizards

Lausanne Transportation: a data visualization of the Lausanne Transportation network. Developed by the Vizards team as part of the EPFL Data Visualization course project (COM-480).

buses data-analysis data-science data-visualization epfl lausanne map metro public-transport public-transportation switzerland webgl

Last synced: 01 May 2026

https://github.com/jubinjacob03/heartdiseaseclassify-ml

Heart Disease Dataset Analysis & Classification using ML models such as linear regression, support vector machine, k-means, k-nearest neighbors and logistic regression.

data-analysis data-science data-visualization ipython-notebook kaggle-dataset kmeans knn linear-regression logistic-regression machine-learning matplotlib python seaborn support-vector-machine

Last synced: 18 Jan 2026

https://github.com/savinrazvan/heredity

An AI that assesses the likelihood of genetic traits in individuals using a Bayesian Network to analyze family genetic data, modeling genetic inheritance and mutations to infer probabilities of gene presence and trait expression.

ai bayesian-network biological-data-analysis data-analysis educational-project family-genetics genetic-inheritance genetic-traits heredity mutation-modeling probability-calculation python

Last synced: 27 Feb 2025

https://github.com/ahmad-ali-rafique/weather-prediction-fcnn

This project demonstrates a complete pipeline for weather prediction using a Fully Connected Neural Network (FCNN). The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation.

ai artificial-intelligence data-analysis data-science deep-learning deep-neural-networks fully-connected-network machine-learning machine-learning-algorithms weather-information

Last synced: 28 Aug 2025

https://github.com/henrylin03/china-gdp

Analysis and visualisation of China GDP data using Python.

data data-analysis data-visualisation dataset kaggle pandas

Last synced: 01 May 2026

https://github.com/virajbhutada/google-stock-price-forecasting-lstm

Analyzing and predicting Google's stock prices through detailed data exploration and advanced LSTM models. This project involves data preprocessing, creating time-series sequences, constructing and training LSTM networks, and evaluating their performance to forecast future stock prices utilizing Python and Machine Learning libraries.

data-analysis data-science data-visualization future-prediction google-dataset google-stock-price-prediction google-stocks lstm-model lstm-neural-network machine-learning machine-learning-models matplotlib model-building model-training numpy python stock-forecasting

Last synced: 27 Feb 2025

https://github.com/nafisalawalidris/data-analysis-with-python

This repo features Jupyter Notebook labs for learning data analysis with Python. Explore data acquisition, wrangling, visualization, modeling, and evaluation. Enhance your skills in Python data analysis.

data-acquisition data-analysis data-science data-wrangling exploratory-data-analysis feature-engineering machine-learning model-development model-evaluation-and-refinement pandas

Last synced: 02 May 2026

https://github.com/chouaib-629/customersegmentation

Hadoop-based Customer Segmentation project using the Online Retail Dataset. Implements MapReduce for processing and Python for preprocessing to uncover customer purchasing patterns for targeted marketing.

big-data customer-segmentation data-analysis data-science distributed-computing hadoop hadoop-mapreduce java mapreduce marketing-analytics python

Last synced: 04 May 2026

https://github.com/ivanildobarauna-dev/api-to-dataframe

Python library that simplifies obtaining data from API endpoints by converting them directly into Pandas DataFrames. This library offers robust features, including retry strategies for failed requests.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 06 Mar 2025

https://github.com/vidhi1290/zomato-data-analysis

Zomato Data Analysis - Explore the world of Zomato restaurant data through Python and data analysis. Uncover trends and insights using Pandas for data manipulation and Matplotlib for visualization. Join us in this journey to reveal the hidden stories within the data!

data-analysis data-analysis-python data-science data-visualization dataprocessing machine-learning machine-learning-algorithms matplotlib numpy pandas python scikit-learn zomato-data-analysis

Last synced: 11 Apr 2026

https://github.com/narenkhatwani/arkouda-projects

This repository contains the source code of projects built with Arkouda, a software package that allows a user to interactively issue massively parallel computations on distributed data using functions and syntax that mimic NumPy, the underlying computational library used in most Python data science workflows.

arkouda data-analysis data-analytics data-science high-performance high-performance-computing highperformancecomputing numpy pandas parallel-computing parallel-processing parallelization python

Last synced: 17 Apr 2026

https://github.com/nafisalawalidris/hici-african-foods

HiCi African Foods: Excel dashboard & pivot table analysis of EU food rejection data to identify risks & recommend focus areas for market expansion.

data-analysis data-cleaning data-visualization eu-food-rejection excel-dashboard hici-african-foods market-expansion pivot-tables

Last synced: 19 Mar 2026

https://github.com/pradeepchegur/seamantic_web_design

We designed a semantic web for Instagram on the Wix platform.

data-analysis framework instagram semantic-web website-design wix

Last synced: 19 Mar 2026

https://github.com/sathyasris27/statistical-analysis-on-rehoming-time-for-different-dog-breeds-in-animal-shelter

Given a collection of records documenting stray, unwanted, or neglected dogs sent to animal shelters to be rehomed, this project analyses their rehoming patterns by breed.

data-analysis r statistical-analysis statistical-inference statistical-models

Last synced: 05 Jun 2026

https://github.com/leosimoes/datascienceacademy-powerbi-3.0

Projects from the DataScienceAcademy course Microsoft Power BI Para Data Science, Version 3.0. Dashboards for a variety of business cases.

business-intelligence dashboards data-analysis data-visualization microsoft-power-bi

Last synced: 19 Mar 2026

https://github.com/hifza-khalid/book-management-system-sql

A Book Management System SQL project 📚 featuring tables for Authors ✍️, Books 📖, Customers 👤, and Orders 🛒. Includes sample queries for tracking book sales 💰, pricing by genre 🎭, and customer order history 📅.

book-management data-analysis database-management sql sql-queries

Last synced: 03 Feb 2026

https://github.com/discdiver/new-belgium-ratings

Find the most popular New Belgium beers of all time!

beautifulsoup data-analysis pandas python seaborn webscraping

Last synced: 10 Apr 2026

https://github.com/sing-group/bew

Public repository for Biofilms Experiment Workbench (BEW).

aibench data-analysis data-management java jfreechart workbench

Last synced: 03 Jul 2025

https://github.com/gher-uliege/bluecloud-plankton

Spatial interpolation of plankton data using a neural network

data data-analysis data-visualization neural-network oceanography

Last synced: 30 Mar 2025

https://github.com/anushadatta/airbnb-in-seattle

🏨 Understanding the Airbnb rental landscape in Seattle using data science.

airbnb data-analysis data-exploration data-visualization datascience sentiment-analysis

Last synced: 13 Jun 2025

https://github.com/chiemekaifemegbulem/make.com

A curated portfolio of Make.com automation workflows engineered to streamline operations and ensure precision. Featuring solutions for e-commerce, data integration, marketing, and bespoke business processes, it exemplifies expertise in designing scalable, efficient, and dependable automated systems.

api automate automated automation business data-analysis data-science dataengineering integration integromat make scenario software-engineering upwork workflows

Last synced: 15 Feb 2026

https://github.com/aadityatamrakar/futures_spread_chart

Cash Market & Futures Daily Spread Chart - NSE Stocks

data data-analysis data-mining expressjs nodejs requests

Last synced: 10 Apr 2026

https://github.com/lijesh010/globalsuperstoresalesanalysis

The Global Superstore Sales Analysis repository showcases a comprehensive Power BI dashboard that provides valuable insights into sales performance. This project is designed to present key information and trends to stakeholders, enabling informed decision-making.

dashboard data-analysis data-visualization msexcel power-bi sales-analysis

Last synced: 19 Mar 2026