An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/magnus0969/black-friday-sales-analysis

An in-depth analysis of Black Friday sales data to uncover trends, customer behavior, and product insights. Utilizing Python, data visualization, and machine learning techniques, this project provides key business intelligence to optimize sales strategies.

analysis data-analysis data-science python sales-analysis

Last synced: 09 May 2026

https://github.com/zxjahid/matplotlib

A comprehensive guide to mastering data visualization with Matplotlib through hands-on examples and advanced techniques. 🚀📊

candlestick candlestick-chart cheatsheet data-analysis data-visualization gtk jupyter-notebook maps matplotlib-python pandas thesis-template tk tutorial wx

Last synced: 09 May 2026

https://github.com/mmfava/qualesuapergunta-scripts-base-2015-2018

This repository contains R scripts used during my biostatistics consulting work. The scripts cover various statistical analyses and served as the basis for analyses that were carried out. They are not the scripts from the consulting or advisory engagements themselves.

analytics data-analysis r

Last synced: 20 May 2026

https://github.com/salma-mamdoh/exploring-the-evolution-of-linux-project

My Project to learn the Basics of Analysis on DataCamp

data-analysis datacamp pandas python time-series-analysis

Last synced: 09 May 2026

https://github.com/drill-n-bass/ovh-project

The goal of this task is to prepare a statistical analysis of a set of data from disks.

anaconda analysis data-analysis data-analysis-python jupyter-notebook matplotlib-python pandas python3 seaborn-plots

Last synced: 09 May 2026

https://github.com/datalopes1/manufacturing_defects

EDA project using the Manufacturing Defects dataset, which can be found on Kaggle.

data-analysis data-visualization eda exploratory-data-analysis python

Last synced: 09 May 2026

https://github.com/monish-nallagondalla/algerian_forest_fires

This project predicts forest fires in Algeria using machine learning models. The dataset includes various meteorological and environmental features such as temperature, humidity, and wind speed. The app cleans the data and builds models to predict the likelihood of forest fires based on historical data and environmental conditions.

data-analysis data-science datacleaning flask forest-fire-prediction machine-learning meteorological-data python regression-models ridge-regression

Last synced: 09 May 2026

https://github.com/yeopster/datascience_notebook

A compilation of my notebooks based on Kaggle datasets

data-analysis data-science kaggle notebook python

Last synced: 10 May 2026

https://github.com/imrandil/sql_practice_with_analysis

SQL practice using a Postgres database, with Docker as the tool to set up Postgres; loving the SQL way

data-analysis docker markdown postgres sql

Last synced: 10 May 2026

https://github.com/greenpau/esqrunner

Run Elasticsearch queries and create metrics based on the results of the queries in an Elasticsearch database.

data-analysis elasticsearch query-builder querydsl

Last synced: 10 May 2026

https://github.com/datasqlsantosh/global-energy-consumption-renewable-generation-python-data-analysis-portfolio

This project focuses on analyzing global energy consumption patterns and trends in renewable energy generation using Python data analysis libraries such as Seaborn and NumPy. The analysis aims to explore energy consumption data from various regions worldwide and examine the contribution of renewable energy sources over time.

data data-analysis data-visualization pandas seaborn

Last synced: 10 May 2026

https://github.com/andersoncrs/aprendizaje_no_supervisado_kmeans_customers

This repository contains an analysis of shopping mall customer data using unsupervised learning techniques, specifically K-Means and hierarchical clustering. The goal of the project is to segment customers into homogeneous groups to better understand their behaviors and characteristics.

data-analysis kmeans-clustering matplotlib numpy seaborn visualization

Last synced: 10 May 2026

https://github.com/szuzick/us-immigration-presidential-analysis

Power BI dashboard analyzing 40 years of U.S. immigration data across presidential administrations (1981-2020)

dashboard data-analysis data-visualization government-data immigration powerbi powerbi-dashboards powerbi-visuals presidential-analysis

Last synced: 10 Jun 2026

https://github.com/sdley/tp2_datascience

Practical exercise in data processing with Python

data-analysis pandas python

Last synced: 11 May 2026

https://github.com/monarch1108/customerinsights-kmeans

Understanding customers using KMeans and RFM (recency, frequency, and monetary) analysis

data-analysis data-visualization kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn

Last synced: 11 May 2026

https://github.com/szuzick/hr-analytics-pipeline

End-to-end HR analytics solution using PostgreSQL, dbt, and Power BI

data-analysis data-visualization database-maintenance dbt hr-analytics insights postgresql powerbi sql

Last synced: 10 Jun 2026

https://github.com/hrosicka/czechpopulationestimation

This GitHub repository contains Python code for data analysis and population prediction in the Czech Republic up to the year 2050. The code uses the Pandas and Matplotlib libraries.

data-analysis data-visualization matplotlib matplotlib-figures matplotlib-pyplot pandas pandas-dataframe pandas-library pandas-python python python3

Last synced: 11 May 2026

https://github.com/ceia-prefeitura/urban-lit-tracker-etl

UrbanLitTracker collects academic articles on urban change via the OpenAlex API, then processes and stores them in MongoDB. It offers an interactive dashboard built with Dash that displays data such as the most relevant works, authors, and frequent keywords, making it easier to analyze and visualize the urban literature.

academic-research bibliometrics data-analysis data-pipeline data-visualization etl openalex-api urban-studies

Last synced: 11 May 2026

https://github.com/sferez/gradient_descent

Multiple Linear Regression, Gradient Descent with Python

data-analysis data-science gradient-descent linear-regression python

Last synced: 12 May 2026

https://github.com/OdessaZ/Portfolio-Projects

This is a repository I have created to showcase skills, share projects and track my progress in Data Analytics and Data Science

applied-mathematics data-analysis data-science excel jupyter-notebook matplotlib-pyplot pandas portfolio python r r-studio seaborn sql statistics

Last synced: 12 May 2026

https://github.com/leticia-ducatti/sales-dashboard-project

Interactive sales dashboard built with Python and Streamlit — shows KPIs, allows filtering, and visualizes sales data.

data-analysis pandas plotly python streamlit

Last synced: 12 May 2026

https://github.com/jayita11/customer-engagement-insights-for-yelp-restaurant-business-success

This project analyzes Yelp restaurant data using SQLite, Python, and Tableau to explore user engagement, reviews, and ratings. It provides insights into restaurant success across cities, regions, and user behavior.

customer-engagement data-analysis interactive-visualizations json python ratings review sqlite3 tableau-dashboards-for-data-visualization yelp-restaurants

Last synced: 12 May 2026

https://github.com/ggarciajavier/udacity-dalf-project2-wrangle-openstreetmap-data

Work performed for the 2nd project of Udacity Data Analyst Nanodegree: OpenStreetMap data wrangling and analysis.

data-analysis openstreetmap python sql

Last synced: 12 May 2026

https://github.com/ygalvao/bra_scraper_2022

A web scraper bot for the 2nd round of the 2022 Brazilian Federal Elections.

data-analysis data-analytics selenium web-scraper webscraper

Last synced: 12 May 2026

https://github.com/leopeng1995/neuralsql

Make DataStore More Intelligent

data-analysis mongodb sql

Last synced: 12 May 2026

https://github.com/sakan811/honkai-star-rail-a-few-fun-insights-with-data-analysis

The project offers insights into the stats of all Honkai Star Rail characters available as of the given date.

data data-analysis data-science data-visualization docker flask game honkai honkai-star-rail honkai-starrail seaborn webscraping webscraping-data webscraping-selenium

Last synced: 10 Jun 2026

https://github.com/sebastian-diaz-berdecia/analisis-popularidad-de-series-y-generos-de-series

SQL queries for analyzing the popularity of series and series genres in the NetflixDB database.

business-analytics bussiness-intelligence data data-analysis database mysql mysql-database sql

Last synced: 12 May 2026

https://github.com/elishah-john/happiness-report-2019

Analysis of "Happiness Report 2019" using python.

data-analysis data-visualization educational jupyter-notebook python

Last synced: 12 May 2026

https://github.com/priyanshu7639/data_visualization_dashboard

An interactive data visualization tool that combines traditional plotting capabilities with modern AI assistance. It allows users to create and modify visualizations through natural language commands, making data exploration accessible to users of all skill levels.

business-analytics data-analysis data-engineering data-exploration data-science data-visualization datapreprocessing datascience interactive-visualizations matplotlib plotly plotting python research-tool streamlit

Last synced: 12 May 2026

https://github.com/johannaschmidle/amazon-cat-couch

Customer product reviews + ratings analysis and visualization [Python, Excel, Tableau, R]

data-analysis data-visualization jupyter-notebook python-notebook r-markdown sentiment-analysis text-analysis web-scraping

Last synced: 11 Jun 2026

https://github.com/agailloty/preprocess

preprocess is a fast data analysis preprocessing tool.

cli data-analysis preprocessing-data

Last synced: 12 May 2026

https://github.com/sricasea/fundraising-insights-mwpccc

Data storytelling meets impact strategy — a nonprofit fundraising analysis project combining SQL, Python, and Deepnote to uncover donor trends and guide smarter decisions.

data-analysis data-storytelling data-visualization deepnote fundraising nonprofit portfolio-project python sql

Last synced: 12 May 2026

https://github.com/parthds02/-daily-calorie-count-meal-plan-generator-

Welcome to the Daily Calorie Count Meal Plan Generator project! This Streamlit web application is designed to create personalized meal plans based on user inputs such as age, weight, gender, and calorie goals. It also allows users to download their customized meal plans as PDFs.

calories-tracker data-analysis data-science pdf-generation streamlit vscode

Last synced: 13 May 2026

https://github.com/devanshsahu47/prime-content-analytics

Prime Data Explorer analyzes Amazon Prime's content and credits data to uncover trends in release years, genres, and ratings. It cleans, merges, and visualizes the data to provide actionable insights for optimizing content strategy and boosting audience engagement.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python3

Last synced: 13 May 2026

https://github.com/manukot/sturdy-engine-python-

I've learnt not only various theoretical concepts but also practical projects in my Master's coursework

data-analysis data-visualization python3

Last synced: 13 May 2026

https://github.com/mituskillologies/dkte-da-mar25

Programs conducted at DKTE's Engineering Institute, Ichalkaranji, during the Python Data Analytics training in March 2025.

data-analysis matplotlib numpy pandas python-programming tkinter-python

Last synced: 13 May 2026

https://github.com/ibrahimhabibeg/national-university-of-singapore-sms-analysis

Analysis of SMS messages collected by the National University of Singapore

analytics data-analysis data-science nlp python

Last synced: 13 May 2026

https://github.com/deliprofesor/joblocationmapper

JobLocationMapper is a Python tool that visualizes job listings on an interactive map. It uses city and state data to place job markers accurately and color-codes them by occupation (Software, Marketing, Design). The map clusters markers for better organization, and users can click on them to view job details.

clustrered-markers data-analysis data-visualization folium geocoding geographical-visualization interactive-map job-listings map-visualization pandas python

Last synced: 14 May 2026

https://github.com/satvikpraveen/matplotlibmasterpro

📷 MatplotlibMasterPro is a complete, portfolio-ready project to master data visualization using matplotlib. Includes 16 notebooks, real datasets, exportable plots, custom themes, Streamlit dashboard, and Docker support. Ideal for learners and data professionals.

charts custom-plots dashboarding data-analysis data-science data-visualization educational-project interactive-visualizations jupyter-notebook matplotlib notebooks open-source plotting portfolio-project python python-utilities reproducible-research subplots time-series-analysis visualization-tools

Last synced: 14 May 2026

https://github.com/yashsingh43/lung-cancer-biomarker-analysis

Gene expression analysis to identify biomarkers for early lung cancer detection (SCLC & NSCLC)

bioinformatics biomarkers cancer cytoscape data-analysis gene-expression gsea nsclc r sclc

Last synced: 11 Jun 2026

https://github.com/iamsainikhil/web-data-scraping

Data scraping from a webpage using Python

beautiful-soup data-analysis data-scraping python

Last synced: 11 Jun 2026

https://github.com/sambit-mondal/stockx

StockX is a full-stack application designed to help store owners efficiently manage their inventory, track purchases, and analyze stock levels. The system integrates MongoDB, Express, React, and Flask (Python) to provide a seamless experience.

artificial-intelligence data-analysis inventory-management-system machine-learning mern-stack

Last synced: 12 Jun 2026

https://github.com/marialuizaleitao/walmartsalesanalysis

This project explored data collection and preprocessing, advanced application of SQL queries, and feature engineering. Key calculations, such as COGS (Cost of Goods Sold) and VAT (Value Added Tax), were performed to assess the profitability and financial efficiency of the branches.

business-analytics data-analysis mysql-database sql

Last synced: 13 Jun 2026

https://github.com/luizassimoes/q5ga-latency-and-throughput

Quick 5G Analyser: PyQt5 software developed to help with simple graphical analysis and chart generation for ping and iperf3 tests.

data-analysis data-visualization pyqt5 python

Last synced: 13 Jun 2026

https://github.com/gmalbert/immigration

Immigration Data Analysis

data-analysis immigration

Last synced: 14 Jun 2026

https://github.com/nob101/lotto-analyzer

A Node.js and SQLite web application to analyze, track, and evaluate lottery (Euromillionen) and Joker results.

backend css data-analysis express html5 javascript nodejs sqlite statistical-analysis

Last synced: 14 Jun 2026

https://github.com/jkazari/rollercoaster-eda

Repository of a small data-analysis project in R for the Mathematical Software class in the 3rd semester of Mathematics studies at Gdańsk University of Technology

data-analysis r

Last synced: 14 Jun 2026

https://github.com/dipeshgoyal013/crypto-currency-dashboard

This project analyzes historical cryptocurrency data and builds an interactive Power BI dashboard. It includes time-series forecasting of Bitcoin and Ethereum using ARIMA and Power BI’s forecasting model.

data-analysis excel power-bi python

Last synced: 15 Jun 2026

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 15 Jun 2026

https://github.com/anderson-andre-p/uber-data-analysis

This repository contains a comprehensive data analysis project focused on Uber rides. The dataset used in this project is a spreadsheet obtained from Uber, containing data related to ride details, such as pick-up and drop-off locations, date and time of the ride, and the fare amount.

data-analysis data-science data-visualization python

Last synced: 15 Jun 2026

https://github.com/victoryfanfare/car-price-prediction

ML model for estimating the market value of used cars. The project includes data analysis, feature engineering, and a comparison of various machine learning algorithms.

catboost data-analysis jupyter-notebook lightgbm machine-learning pandas python regression

Last synced: 15 Jun 2026

https://github.com/dcs-training/data-wrangling-and-vis-pandas

Introduction to analyzing structured data with the Python libraries pandas, for CSV and TSV data, and ElementTree, for XML data. Go to the readme file

data-analysis data-visualisation data-wrangling python

Last synced: 16 Jun 2026

https://github.com/llnl/cap

HPC workflow that automates the tedious actions of compiling, analyzing, and parsing with bincfg

data-analysis hpc python workflows

Last synced: 17 Jun 2026

https://github.com/juanse0330/registro-pacientes-terapia-python

Python project to automate the registration and analysis of patients in home-based occupational therapy. A tool aimed at the healthcare sector.

automatizacion data-analysis python salud terapia-ocupacional

Last synced: 17 Jun 2026

https://github.com/fahadnasir13/financial_data-analyzer_tool

A Python-based framework for analyzing, cleaning, and reconciling financial data stored in Excel workbooks.

data-analysis excel financial python store

Last synced: 17 Jun 2026

https://github.com/preetesh21/spotme

This repository uses the web-based API provided by Spotify to retrieve data and then analyse it.

api data-analysis

Last synced: 18 Jun 2026

https://github.com/lotfiferaga/amazon-alexa-reviews-sentiment-analysis

Amazon Alexa, developed by Amazon, allows users to interact with technology through voice commands. Analyzing user sentiments about Alexa, with over 40 million users worldwide, is an intriguing data project.

classification data-analysis python sentiment-analysis

Last synced: 18 Jun 2026

https://github.com/duoan/ds-nbs

Data analysis and machine learning notebook.

data-analysis data-scientists deep-learning kaggle-competition machine-learning

Last synced: 18 Jun 2026

https://github.com/ilhanseyhanx/car-price-prediction-with-machine-learning

🚗 ML-powered car price prediction model with 95.88% accuracy using Random Forest and comprehensive data preprocessing

car-price-prediction data-analysis data-science machine-learning pandas python random-forest regression sklearn

Last synced: 19 Jun 2026

https://github.com/shahaf-f-s/feature-space

A modular framework for combining pandas series features

data-analysis data-science feature-engineering

Last synced: 19 Jun 2026

https://github.com/alinababer/covid19-timeseries-cases-and-deaths-forecasting-

This study is based on confirmed cases and deaths collected from Pakistan. The results demonstrate the promising potential of time series models in forecasting COVID-19 cases and highlight the superior performance of the time series models compared to the LSTM. We apply AI-based forecasting models such as time series ARIMA, LSTM, Prophet, and VAR.

arima covid-19 data-analysis data-science data-visualization fbprophet forecasting lstm rnn time-series var vectorautoregression

Last synced: 19 Jun 2026

https://github.com/mahapeth/invest-track

Implementation of a tool for monitoring user activity in the "Invest" information system, for a graduation thesis in the field 01.03.02 Applied Mathematics and Informatics

analitycs app data-analysis data-visualization jupyter-notebook python sites

Last synced: 20 Jun 2026

https://github.com/dcs-training/intro-to-statistics

Intro to Statistics workshop. In this repo, you are going to find the code and files we are going to use for the practical part of the workshop, together with the ppt associated with this training. Go to the readme file

data-analysis data-visualisation data-wrangling r statistics

Last synced: 20 Jun 2026

https://github.com/aonurakman/data-analysis-and-ml-algorithms

An exploration of data analysis techniques and standard ML algorithms on the QSAR oral toxicity dataset. 2021, Yıldız Technical University

classification clustering data-analysis data-mining isolation-forest python regression

Last synced: 20 Jun 2026

https://github.com/evanmathew/northwind-traders

SQL-powered analysis of sales, employee performance, and customer behavior using PostgreSQL window functions. This project uncovers key business insights to optimize decision-making.

case-study data-analysis jupyter-notebook northwind-traders postgresql python-postgresql sql

Last synced: 20 Jun 2026

https://github.com/haseebn19/urban-housing-demand

A full-stack web application for visualizing housing and labour market data

data-analysis data-visualization docker full-stack gradle statistics web webapp

Last synced: 22 Jun 2026

https://github.com/engusseus/warframe-market-set-profit-analyzer

Python tool that analyzes Warframe Market data to find profitable item sets to trade

api data-analysis python trading waframe

Last synced: 23 Jun 2026

https://github.com/ladaegorova18/data_analysis

Learning the basics of data analysis in Python

analytics data-analysis data-visualization steam-games

Last synced: 24 Jun 2026

https://github.com/anburocky3/cbse-schools-data

Fetch CBSE Schools in seconds and use it for your data projects

cbse data data-analysis data-science grabber nextjs

Last synced: 24 Jun 2026

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/parsabordbar/ctx3docs

The documentation for the context Tree project.

ai-tools context ctx3 ctx3-docs data-analysis documentation tree workflow

Last synced: 25 Jun 2026