An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/danielrosehill/data-projects-index

Data apps and datasets deployed to Streamlit Community Cloud, Hugging Face, and elsewhere.

data-analysis data-science data-visualization

Last synced: 16 Mar 2026

https://github.com/dzakwanalifi/stadata-x

Terminal UI for interactively exploring and downloading data from BPS (Statistics Indonesia)

bps-api cli-app data-analysis data-visualization indonesia-statistics indonesian-data open-data python statistics terminal-ui textual tui

Last synced: 20 Jan 2026

https://github.com/abeltavares/postql

Python library and command-line interface (CLI) tool for interacting with PostgreSQL databases, providing simplified database management, query execution, and result export functionalities.

cli command-line-interface data-analysis data-engineering data-export data-management data-processing data-visualization database database-administration database-tools etl oop postgres postgresql psycopg2 python sql sqlalchemy wrapper

Last synced: 19 Jan 2026

https://github.com/aroramrinaal/spotistats

Spotistats is a data analysis and visualization project based on your Spotify streaming history.

data-analysis numbers spotify spotify-history visualization

Last synced: 15 Mar 2025

https://github.com/ayaatmohammed/amazon-sales-analysis-pyspark

In-depth analysis of the Olist E-commerce dataset from Kaggle using PySpark for customer segmentation (RFM) and market basket analysis.

big-data big-data-analytics customer-segmentation data-analysis data-science ecommerce jupyter-notebook kaggle pyspark python rfm-analysis

Last synced: 05 May 2026

https://github.com/NurFakhri/scraping-and-analysis-skincare

Scraping and data analysis of Indonesian skincare reviews.

beutifulsoup data-analysis data-scraping python requests review scraping-websites

Last synced: 12 Oct 2025

https://github.com/iamsainikhil/web-data-scraping

Data scraping from a webpage using Python

beautiful-soup data-analysis data-scraping python

Last synced: 11 Jun 2026

https://github.com/jeffbrennan/analysis-templates

Templates of commonly used graphics/functions/settings to help focus on the bigger picture

data-analysis r rmd

Last synced: 12 Oct 2025

https://github.com/tzerk/esr

R package 'ESR' for plotting and analysing ESR spectra in dating applications

data-analysis data-visualization electron-spin-resonance geochronology r

Last synced: 13 Mar 2026

https://github.com/akash1070/project--uber-data-analysis

Analysis of Uber data from a dataset using Python

data-analysis data-science python

Last synced: 09 May 2026

https://github.com/leosimoes/digitalinnovationone-analise-covid

Hands-on project "Creating models with Python and Machine Learning to predict the evolution of COVID-19 in Brazil" from Digital Innovation One.

arima-models data-analysis data-science python time-series

Last synced: 09 May 2026

https://github.com/zulhaditya/web-scraping-python

A repository that stores various source code and web scraping methods using Python.

data-analysis python3 webscraping

Last synced: 12 Oct 2025

https://github.com/katiesaund/tidy_tuesday

A weekly data project in R from the R4DS online learning community

data-analysis data-visualization datascience plot r rstats tidytuesday

Last synced: 24 Mar 2025

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/0-mostafa-rezaee-0/sandwich_structures

Impact test of Sandwich Structures

composite-materials data-analysis r

Last synced: 09 Aug 2025

https://github.com/agb2k/twitter-analyzer

Project to extract tweets based on searches, analyze their data, and autocorrect potentially incorrect words

data-analysis python tweepy twitter

Last synced: 13 Oct 2025

https://github.com/vinitgurjar/r_lang_exp

A collection of my college Data Analytics lab work and assignments; the files here contain R programs

data-analysis data-visualization r

Last synced: 02 Jul 2025

https://github.com/nahiyanhkhan/sales-insight-dashboard_powerbi

A dashboard that displays sales insights from a company's sales data over a four-year period, including revenue and sales quantity by region over the years.

dashboard data-analysis data-analytics data-visualization powerbi salesdashboard

Last synced: 08 Jan 2026

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 05 May 2026

https://github.com/zachbateman/easy_plot

Easy Statistical Visualization in Python

data-analysis data-visualization graphics matplotlib python seaborn

Last synced: 18 Jan 2026

https://github.com/bocchio01/skyward_recruitment_assignment

Assignment to join the PoliMi SkyWard software team

data-analysis kalman-filter model-rocket

Last synced: 15 Mar 2025

https://github.com/vbhvsingh0/matplotlib__egs

The codes here are examples of Matplotlib

data-analysis matplotlib-pyplot numpy-library pandas-python python3

Last synced: 28 May 2026

https://github.com/shivakumarhl/digital-music-store-analysis

Digital Music Store Data Analysis using SQL

data-analysis sql

Last synced: 10 Mar 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project examining the factors that affect sleep quality and the temporal patterns in the sleep data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026

https://github.com/gmalbert/supreme-court

Data Analysis of the US Supreme Court from 1790 to present

data-analysis data-science supreme-court

Last synced: 31 May 2026

https://github.com/curtisalexander/cramisc

Personal R functions for data analysis

data-analysis r r-pkg

Last synced: 12 Mar 2025

https://github.com/caesaredia/ymusic-project

Exploratory data analysis (EDA) of music streaming behavior in two fictional cities using Python, Pandas, and Jupyter Notebook. It explores user behavior, genre preferences, and listening patterns throughout the week.

data-analysis eda pandas python

Last synced: 05 May 2026

https://github.com/szymon-budziak/real_estate_house_prices_prediction

Predicting real estate house prices using various machine learning algorithms, including data exploration, preprocessing, model training, and evaluation.

data-analysis data-preprocessing data-science eda jupyter-notebook machine-learning matplotlib numpy optuna pandas predictive-modeling price-prediction python random-forest regression scikit-learn seaborn

Last synced: 21 Jan 2026

https://github.com/cyberoctane29/epa-carbon-monoxide-aqi-analysis

This project continues my EPA Air Quality AQI Analysis, focusing on carbon monoxide levels in EPA data. Using Python, I applied statistics, probability analysis, outlier detection, sampling, and hypothesis testing to assess pollution and health impacts. Leveraging Pandas, NumPy, SciPy, and Matplotlib, it supports environmental policy decisions.

data-analysis eda hypothesis-testing probability-distribution sampling sampling-distribution statistical-analysis

Last synced: 24 Mar 2025

https://github.com/wojtekdomino/titanic-eda

Exploratory Data Analysis (EDA) of Titanic dataset using Pandas, Matplotlib, and Seaborn.

data-analysis eda matplotlib pandas python seaborn

Last synced: 10 Jun 2025

https://github.com/atharvapathak/rsvp_movies_case_study

SQL queries performed on IMDb database to provide recommendations to RSVP Movies based on insights.

data-analysis data-cleaning data-science imdb-dataset rsvp-movies sql

Last synced: 28 Jan 2026

https://github.com/sumit9000/submission-of-web-server-log-analysis-assessment

This project analyzes one year of real-world HTTP access logs from the University of Calgary’s computer science server. Using Python, pandas, and regular expressions, we clean and parse the data to extract meaningful insights and answer 10 analytical questions.

data-analysis data-cleaning eda jupyter-notebook log-parsing pandas python realworld-data regex web-log-analysis

Last synced: 14 Apr 2026
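
As a rough illustration of the parsing step the entry above describes (cleaning HTTP access logs with Python and regular expressions), here is a minimal sketch. It assumes the logs follow the Common Log Format, which the listing does not confirm. The sample line and the names `LOG_PATTERN` and `parse_line` are hypothetical and not taken from the repository.

```python
import re
from datetime import datetime

# Assumed Common Log Format: host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A "-" size means no bytes were sent; treat it as 0.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record

# Hypothetical sample line, for illustration only.
sample = 'example.host - - [01/Jan/2000:13:41:41 -0600] "GET index.html HTTP/1.0" 200 150'
parsed = parse_line(sample)
```

Lines that do not match return `None`, so malformed records can be counted or dropped before the parsed rows are loaded into a pandas DataFrame for analysis.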

https://github.com/b-varun-reddy/fairwai-bias-detection

Submission for the FairwAI Hospitality Intern Challenge. This project analyzes bias signals in Yelp hospitality reviews using open-source data, Python, and fairness-focused keyword detection.

bias-detection data-analysis ethical-ai fairness hospitality machine-learning natural-language-processing python social-impact yelp-dataset

Last synced: 19 Apr 2025

https://github.com/manel15279/datamining-project

A university project that aims to explore various data mining techniques like Data Exploration, Association Rule Mining, Supervised and Unsupervised Learning, applied to real-world datasets, focusing on soil fertility analysis and COVID-19 cases evolution over time.

covid-19 data-analysis data-mining data-visualization datascience gradio machine-learning python soil-properties

Last synced: 10 Jun 2025

https://github.com/hemangsharma/job-tracker

A comprehensive Streamlit application for tracking and analyzing job applications.

data-analysis python streamlit-dashboard streamlit-webapp

Last synced: 15 Mar 2025

https://github.com/tolumie/rfm-marketing-analysis

This project focuses on RFM (Recency, Frequency, and Monetary) Analysis, a powerful customer segmentation technique used in marketing and business analytics. The analysis helps businesses identify their most valuable customers, potential loyalists, at-risk customers, and churned users.

business-analytics customer-behavior-analysis customer-loyalty customer-retention customer-segmentation-analysis data-analysis data-driven-decisions ecommerce marketing-analytics python

Last synced: 18 May 2026
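
The three RFM metrics named in the entry above can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the repository's code: the transaction rows, the reference date, and the `rfm_table` helper are hypothetical.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 1), 60.0),
    ("bob", date(2023, 11, 20), 25.0),
]

def rfm_table(rows, today):
    """Compute recency (days), frequency (count), and monetary (sum) per customer."""
    table = {}
    for customer, day, amount in rows:
        entry = table.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0}
        )
        entry["last"] = max(entry["last"], day)  # most recent purchase so far
        entry["frequency"] += 1
        entry["monetary"] += amount
    return {
        customer: {
            "recency": (today - e["last"]).days,
            "frequency": e["frequency"],
            "monetary": e["monetary"],
        }
        for customer, e in table.items()
    }

rfm = rfm_table(transactions, today=date(2024, 3, 31))
```

A real analysis would typically go on to bin each metric into scores (for example quintiles) and combine them into segments such as loyal or at-risk customers; that step is omitted here.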

https://github.com/giseletoledo/case-study-wellness-smart

Project from the Coursera course Google Data Analytics

data-analysis kaggle-dataset r

Last synced: 14 Oct 2025

https://github.com/samkazan/business-analysis-tableau

Business Analysis on Global/Superstore data using Tableau.

analysis data-analysis tableau visualization

Last synced: 08 Feb 2026

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/anushkundu/london-housing-market-analysis

London Housing Market Analysis: An Insightful Power BI Dashboard

data-analysis data-visualization powerbi transformation

Last synced: 27 Jan 2026

https://github.com/ayorick23/python-data-science-cheat-sheet

A quick, practical guide to the essential syntax, commands, and functions of Python for data science. Perfect for remembering how to use the most common libraries, such as NumPy, Pandas, Matplotlib, and Scikit-learn, in your day-to-day analyses.

cheat-sheet data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning matplotlib ml numpy pandas python scikit-learn scipy seaborn statistics sympy tensorflow

Last synced: 07 Apr 2026

https://github.com/asuquoaa/air_bnb_analysis_dashboard-tableau-

Interactive Tableau dashboards to analyze and visualize data, providing actionable insights for better decision-making

dashboard data-analysis interactive-visualization tableau

Last synced: 13 Mar 2026

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 14 Mar 2025

https://github.com/ankitpoddar07/excel-project_back-office

📊 Coffee Sales Analytics – Back Office Excel Project

data-analysis ms-excel

Last synced: 05 Feb 2026

https://github.com/hms75/movie_rating_analysis

A movie rating analysis which identifies trends amongst a dataset of 5000 movies.

data-analysis data-visualization matplotlib-pyplot numpy pandas python

Last synced: 05 May 2026

https://github.com/thbaylson/datascience

All of my past data science assignments put into one singular notebook. Most of this comes from my Machine Learning course.

data-analysis data-science data-visualization decision-tree jupyter-notebook k-nearest-neighbors linear-regression machine-learning neural-network pandas-library python3 scikit-learn

Last synced: 09 May 2026

https://github.com/tszon/data-science-projects

All the noteworthy data science projects from my learning journey with DataCamp.

data-analysis data-science exploratory-data-analysis feature-engineering machine-learning modelling preprocessing-data scikit-learn supervised-learning

Last synced: 15 Mar 2025

https://github.com/ginalamp/covid_dashboard_twitternews

Corona Dashboard & report based on Twitter media outlet news.

dashboard data-analysis data-visualization twitter

Last synced: 28 Jan 2026

https://github.com/chahelgupta/dep-videogames-dataset

The data extraction and processing involved thorough exploration, preprocessing, and visualization of the "Video Game Sales with Ratings" dataset.

data-analysis data-exploration data-extraction data-preparation data-preprocessing data-processing data-science data-visualization

Last synced: 15 Oct 2025

https://github.com/syed-amjad-ali/-bank-churn-ml

Predicting bank customer churn using machine learning. This project includes exploratory data analysis (EDA), feature engineering, classification models (Logistic Regression, Random Forest), and customer segmentation using K-Means clustering.

classification data-analysis data-science eda jupyter-notebook k-means-clustering machine-learning ml python segmentation

Last synced: 09 Mar 2025

https://github.com/alphatwirl/qtwirl

qtwirl (quick-twirl), one-function interface to AlphaTwirl

alphatwirl data-analysis data-frame pandas r root-cern

Last synced: 11 Apr 2026

https://github.com/rohanrony19/movie-recommendation-system

A Python project that uses the Pandas library to compute correlations and recommend the best-matching movies.

data-analysis deep-learning knn-algorithm numpy pandas python recommendation-system

Last synced: 14 Apr 2026

https://github.com/sanjayankur31/20181206-neurofedora

Slides for my NeuroFedora talk at the UH Biocomputation group's weekly seminar

computational-neuroscience data-analysis neurofedora neuroimaging neuroscience open-science

Last synced: 19 Feb 2026

https://github.com/virajbhutada/hollywood-insights-tableau

Strategic cinematic insights through Hollywood's data landscape. Tableau-driven analytics for genre, studio profitability, and audience dynamics. Uncover trends, assess audience reception, and navigate through years of film data, elevating your understanding of the cinematic world.

analystics business-intelligence dashboard data-analysis data-visualization entertainment hollywood storytelling tableau tableau-desktop visualization

Last synced: 05 Feb 2026

https://github.com/skuschel/postexperiment

Postprocessor for experimental (event-based) data.

data-analysis eventstore hacktoberfest postprocessing

Last synced: 12 Jun 2026

https://github.com/kunalpisolkar24/winequalityprediction

Predicting wine quality using machine learning with matplotlib, numpy, pandas, and seaborn for insightful data analysis. 🍇🤖📊

data-analysis data-science data-visualization machine-learning prediction-model

Last synced: 16 Oct 2025

https://github.com/shahriarha/sql

Structured query language

data-analysis mysql mysql-database sql

Last synced: 02 Sep 2025

https://github.com/adilshamim8/eda-on-health-and-sleep-data

Exploratory Data Analysis (EDA) on health and sleep data, uncovering patterns and insights using Python and visualization tools.

data-analysis data-visualization eda health healthcare sleep sleep-analysis

Last synced: 15 Mar 2025

https://github.com/devexpress-examples/wpf-pivot-grid-connect-to-an-olap-datasource

This example shows how to specify connection settings to the server and create fields that relate to specific measures and dimensions of the cube for the Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf xpf

Last synced: 06 May 2026

https://github.com/hase3b/flask-dash-interactive-dashboard

An interactive data visualization dashboard created using Flask and Dash. This project includes comprehensive data preparation, exploratory data analysis (EDA), and dynamic visualizations with Seaborn and Plotly. Explore the multi-page Dash app with features like dropdowns and callbacks for updated plots.

callbacks dash dashboard data-analysis data-visualization dropdown eda flask interactive plotly seaborn web-app

Last synced: 19 May 2026

https://github.com/abelarduu/power_bi_analyst

Power BI project for financial data reporting, with intuitive navigation and interactive features. It offers a complete user experience, combining sophisticated presentation with effective functionality for data analysis.

dashboard data-analysis data-analytics modelagem-de-dados powerbi tratamento-de-dados

Last synced: 08 Sep 2025

https://github.com/prgermux/data-plotter

This Python application provides a graphical user interface (GUI) for analyzing and visualizing data from various sources. It uses the PyQt5 framework for the GUI and Matplotlib for plotting data. The application supports multiple file formats, allows users to select any columns for the X and Y axes, and provides dynamic plots.

automation data-analysis plott python

Last synced: 12 Jun 2026

https://github.com/fatihilhan42/nba-players-data-1950-to-2021

This project examines data on NBA players from 1950 to 2021. The players' seasons, heights, performance, scoring averages, teams, and positions were read from CSV files, then cleaned and visualized to produce the key tables and graphs.

data data-analysis data-engineering data-science data-visualization

Last synced: 16 Oct 2025

https://github.com/supertetelman/coursera-exdata-09

This repo contains several R scripts that were used to analyze, plot, and clean data from various datasets. These projects were part of the Coursera course, Exploratory Data Analysis. The end results of the analysis are included.

big-data course coursera data-analysis r

Last synced: 16 Oct 2025

https://github.com/abdullahashfaqvirk/powerbi-dashboards

A collection of Microsoft Power BI dashboards and reports designed to address business challenges and support data-driven decision-making.

dashboards data-analysis data-driven data-science microsoft powerbi reports visualization

Last synced: 10 Mar 2026

https://github.com/deller23/hotel_booking_data_cleaning

Efficiently transforming raw hotel booking data into actionable insights! This project leverages Python and Pandas for advanced data cleaning—handling missing values, detecting outliers, and optimizing features—ensuring a high-quality dataset ready for analysis and modeling.

data-analysis data-cleaning data-preprocessing data-visualization data-wrangling pandas python

Last synced: 31 Mar 2025

https://github.com/balajimohan18/sql-projects

This repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through queries.

data-analysis data-mining data-science eta microsoft-sql-server query-language sql sql-server sql-server-management-studio sqlqueries

Last synced: 14 Mar 2026

https://github.com/mattdelaune/excel_sales_dashboard

Interactive Excel Dashboard for Coffee Sales Analysis: This project leverages Excel to analyze sales data, uncover seasonal trends, regional preferences, and customer behaviors, providing actionable insights for optimizing inventory and marketing strategies.

data-analysis excel pivot-tables sales-dashboard sales-data

Last synced: 27 Jan 2026

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/benjaminrose/data-analysis-book

A Jupyter Book for my Spring 2025 PHY 5381 class on Data Analysis

book data-analysis data-science data-visualization jupyter-book open-book python r statistics-course

Last synced: 06 May 2026

https://github.com/walid0912/rfm_analysis

RFM Analysis is employed to comprehend and categorize customers according to their purchasing patterns. RFM, an acronym for recency, frequency, and monetary value, comprises three essential metrics that offer insights into customer involvement, allegiance, and significance to a business.

data-analysis data-visualization python rfm-analysis

Last synced: 02 Sep 2025

https://github.com/pizofreude/da-with-r

Data analysis with R, a data-centric programming language

data-analysis r

Last synced: 17 Oct 2025

https://github.com/felpzreiz/stockdata_pipeline

This project develops a data pipeline that consumes financial information from a US stock market API (StockData.org) for analysis and processing. It is built with Python and libraries such as pandas, matplotlib, and pyarrow.

api data-analysis data-science jupyter-notebook pandas python

Last synced: 19 Apr 2026

https://github.com/loginchik/mid_contracts

Analysis of public procurement contracts of the Russian Ministry of Foreign Affairs

data-analysis dataset pandas python

Last synced: 17 Apr 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings empower businesses to optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 20 Feb 2026

https://github.com/sambit-mondal/stockx

StockX is a full-stack application designed to help store owners efficiently manage their inventory, track purchases, and analyze stock levels. The system integrates MongoDB, Express, React, and Flask (Python) to provide a seamless experience.

artificial-intelligence data-analysis inventory-management-system machine-learning mern-stack

Last synced: 12 Jun 2026

https://github.com/ryuzen6/bangalore-real-estate-price-prediction

This is a data science project that predicts the cost of real estate in Bangalore. Requirements: Jupyter Notebook (for data cleaning and building the linear regression model using various Python libraries), PyCharm (Python IDE for creating the Python Flask server), Visual Studio Code (to create the UI with HTML, CSS, and JavaScript).

css3 data-analysis data-science html5 javascript jupyter-notebook machine-learning python3

Last synced: 06 May 2026

https://github.com/abhijeet107/task-4

Design an interactive dashboard for business stakeholders.

data-analysis excel-csv tableau-dashboards tableau-public

Last synced: 22 Jan 2026

https://github.com/omkar2503/credit-risk-dashboard

A SQL-based Credit Risk Scoring System visualized using Metabase

credit-risk dashboard data-analysis data-analytics metabase postgresql sql

Last synced: 01 Jul 2025