An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/msthamizh/phonepe-pulse-data-visualization-and-exploration

A Streamlit application that lets users explore and analyze transaction data from the PhonePe Pulse dataset. The project aims to provide insights into digital payment trends across India.

data-analysis data-visualization dataframe mysql pandas plotly python streamlit

Last synced: 02 May 2026

https://github.com/seankwarren/water-quality-analysis

An examination of water quality in the Atlanta watershed with a focus on identifying neglected areas and potential strategies for improving water quality monitoring

analytics data-analysis jupyter-notebook python

Last synced: 03 May 2026

https://github.com/fybex/chatgpt-conversations-analysis

Analysis of 89,000 ChatGPT conversations to understand interaction patterns and response behaviors.

chatgpt conversation-analysis data-analysis data-visualization language-analysis prompt-patterns sentiment-analysis

Last synced: 02 May 2026

https://github.com/programmer-rd-ai/moviedatascraper

Explore the cinematic universe with our IMDb web scraping project! Dive into movie data with ease, uncovering insights from cast to critical reviews. With dynamic visualizations and reliable data, let's journey through the world of movies like never before. Lights, camera, analysis!

beautifulsoup beautifulsoup4 data data-analysis jupyter-notebook matplotlib numpy pandas programming python python3 scraping seaborn software web

Last synced: 01 Mar 2025

https://github.com/programmer-rd-ai/dimensionality-reduction

DimRed is a comprehensive Python toolkit for advanced dimensionality reduction, integrating with major machine learning libraries and featuring real-time performance monitoring to enhance data analysis and model efficiency.

analytics data-analysis data-science lightgbm machine-learning matplotlib numpy pandas programming python python3 sklearn university xgboost

Last synced: 01 Mar 2025

https://github.com/madhuresh2011/telco-customer-churn-analysis-using-python

The analysis primarily investigates factors influencing customer churn, particularly focusing on payment methods and contract types.

csv data-analysis matplotlib numpy pandas pyhton seaborn vizualisation

Last synced: 02 May 2026

https://github.com/webuccinoco/mysql-pivot-tables

Build complex MySQL pivot tables without touching a single line of code. This free PHP tool lets you visually connect your database and map out your data sources with a few simple clicks.

business-analytics business-intelligence crosstab data-analysis data-analytics data-visualization mysql mysql-database mysql-pivot-table mysql-reports mysql-virtualization php php-pivot-table php-reports pivot-tables reporting-tools

Last synced: 04 Feb 2026

https://github.com/as16082023/hotel-booking-analysis-eda-

Exploratory Data Analysis on hotel booking data using Python

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 29 Apr 2026

https://github.com/allanotieno254/awsome-chocolate-company-sales-analysis-dashboard

This repository contains an in-depth analysis of chocolate consumption trends, focusing on various factors influencing consumer preferences, production, and market performance.

data-analysis data-science data-transformation measures powerbi sales-analysis visualization

Last synced: 23 Feb 2026

https://github.com/anandanraju/youtube-data-api-model

The YouTube Analytics API enables you to generate custom reports containing YouTube Analytics data. The API supports reports for channels and for content owners. Report fields are characterized as either dimensions or metrics

analytics data-analysis data-science metrics model python telemetry youtube youtube-api

Last synced: 03 May 2026

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time series data analysis project on the daily stock prices of four companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 03 May 2026

https://github.com/nikhilash45/live_ipl_report

This repository hosts the source code for an interactive IPL (Indian Premier League) Dashboard built using PowerBI. The dashboard provides real-time updates on ongoing matches, including live scores, batting and bowling statistics for both teams, and the points table.

analysts cleaning-data cricket-data dashboard data data-analysis data-visualization dax powerbi

Last synced: 19 Mar 2026

https://github.com/jossimmar/ensa-scripts_py

Repository for handling consumption data of the major customers (Clientes Mayores) of ENSA, part of Grupo Distriluz.

data-analysis electrical-engineering python sqlite

Last synced: 10 May 2026

https://github.com/theairbend3r/mice-memory-response

Effect of memory on current response in mice using methods from computational neuroscience and machine learning.

computational-neuroscience data-analysis data-science machine-learning neuroscience python

Last synced: 09 Jun 2026

https://github.com/danhenriquex/data-science-project

The main goal of this project was to apply the concepts of data visualization and analysis.

data-analysis data-science numpy pandas python

Last synced: 12 Apr 2026

https://github.com/sijuswamy/data-analytics-using-r

Course repository for the add-on course Data Analysis using R.

data-analysis

Last synced: 12 Apr 2025

https://github.com/nomadsdev/sys-moninsight

System Monitoring and Analysis Tool is a utility for real-time performance tracking. It logs CPU, memory, and disk usage, provides visual graphs, and offers performance recommendations. Perfect for optimizing system efficiency.

automation cpu-usage data-analysis data-visualization disk-usage matplotlib memory-usage performance-analysis performance-optimization psutil python real-time-monitoring resource-management sys-moninsight system-metrics

Last synced: 19 Jun 2026

https://github.com/gappeah/nike_web_crawler

This project involves web scraping Nike's product pages to extract product names, prices and links. The project showcases three different implementations of the web crawler using Selenium and BeautifulSoup. It also includes visualisation of the scraped data using Matplotlib and Seaborn.

beautifulsoup data-analysis data-visualization python selenium web-crawler web-scraper webcrawler webscraper webscraping webscraping-beautifulsoup

Last synced: 04 Jul 2025

https://github.com/0xjeremy/me-18-final

Data collection and Analysis tools for IMUs

data-analysis imu raspberry-pi

Last synced: 03 May 2026

https://github.com/chouaib-629/customersegmentation

Hadoop-based Customer Segmentation project using the Online Retail Dataset. Implements MapReduce for processing and Python for preprocessing to uncover customer purchasing patterns for targeted marketing.

big-data customer-segmentation data-analysis data-science distributed-computing hadoop hadoop-mapreduce java mapreduce marketing-analytics python

Last synced: 04 May 2026

https://github.com/uchida16104/healthanalysis

Derives the health status of each device from its operating time, as calculated from RescueTime, and analyzes the data.

data-analysis portfolio portfolio-website security security-tool

Last synced: 02 Feb 2026

https://github.com/akash1070/project---applied-statistics-

To dive deep into this data & find some valuable insights.

data-analysis data-science python statistics

Last synced: 30 Apr 2026

https://github.com/angelgardt/wlm-sdarp-old

World of Linear Models: Statistics & Data Analysis in R for Psychologists

data-analysis data-visualization gh-pages manim-animations quarto r rstudio statistics

Last synced: 04 May 2026

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist of profanity in pt-BR for data analysis, filters, or text considered "avoidable"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 25 Mar 2025

https://github.com/sanam2405/chatinfo

Analysing the WhatsApp chat with my crush over a 6-month period

data-analysis data-visualization python

Last synced: 27 Apr 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/jeniljani-4444/end-to-end-world-cup-analysis-web-app

Our streamlined Streamlit web app fetches and processes ESPN CricInfo data delivering dynamic graphs for a quick and engaging cricket experience. Deployed on AWS EC2 with CI/CD pipelines.

aws-ec2 data-analysis plotly preprocessing streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/tks18/pyquery

PyQuery is a local-first data operating system built on lazy execution that processes 100GB+ files while you doomscroll. No cap. 🧢

data-analysis data-science etl hdfs parquet pipeline polars python

Last synced: 14 Jan 2026

https://github.com/myounus-codes/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

In this project I have cleaned the data for the model. Project Google Colab Link: https://colab.research.google.com/drive/1vQY-XEFJSdEkW2PQOSf1j13Yk8L-XXNw?usp=sharing

algorithms data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 05 May 2026

https://github.com/githubuseraccountamazing/the-amari-project

A project in which I attempted to push some of the limits of Stable Diffusion while collecting some data along the way

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 05 May 2026

https://github.com/scarblase/homeless-animals-analysis

A data-driven exploration of homeless animal statistics 🐶🐱. Analyze age distribution, shelter dynamics, and adoption patterns using Python, Pandas, and Seaborn.

animals data-analysis data-mining data-science data-science-projects data-visualization matplotlib matplotlib-pyplot numpy pandas plotly python python3 ukraine

Last synced: 06 May 2026

https://github.com/gholamrezadar/favourite-youtube-channels

This program goes through your YouTube watch history and sorts channels based on how many of their videos you have watched!

data-analysis data-visualization python

Last synced: 16 Jan 2026

https://github.com/amirhosseinhonardoust/customer-sentiment-intelligence-platform

An enterprise-grade NLP + Streamlit + SQL platform for analyzing customer feedback. Performs automated sentiment detection, stores labeled reviews in SQLite, and delivers real-time dashboards with probability insights to support business, marketing, and product optimization decisions.

community-project cost-of-living dashboard data-analysis data-visualization economic-analysis inflation-tracking local-data open-data pandas price-tracker public-insight python sqlite streamlit

Last synced: 06 May 2026

https://github.com/namratha2301/best-selling-books

Comprehensive examination of best-selling books, focusing on understanding sales patterns, genre distributions, and the impact of various features on book performance. This project aims to predict book sales and classify genres, providing valuable insights for authors, publishers, and readers.

data-analysis data-visualization matplotlib pandas sckiit-learn seaborn

Last synced: 06 May 2026

https://github.com/flexmonster/svelte-flexmonster

Svelte wrapper for Flexmonster Pivot Table & Charts

data-analysis data-visualization frontend pivot-tables svelte sveltekit

Last synced: 27 Feb 2026

https://github.com/patriloto/intro_r_para_reinventartec_2021

Material for the workshop "Primeros pasos en R para el análisis de datos" (First Steps in R for Data Analysis)

data-analysis rstats

Last synced: 12 Feb 2026

https://github.com/eslamdyab21/imdb-data-analysis

This data set contains information about 10,000 movies collected from The Movie Database (TMDb), including user ratings and revenue

data-analysis pandas python udacity-data-analyst-nanodegree

Last synced: 06 May 2026

https://github.com/gauranshgoel123/predictive-demand-analysis

Demand Forecasting Project: a web application for predicting future demand for part numbers based on historical data. Built with React for the frontend and FastAPI with Python for the backend, this application visualizes demand trends and allows users to input additional data for improved accuracy. On Render, "analyzer" is the frontend and "analysis" is the backend.

chartjs data-analysis data-science data-visualization dataset deployment full-stack machine-learning numpy pandas predictive-analysis prophet-model python reactjs render

Last synced: 13 Apr 2026

https://github.com/thevinh-ha-1710/diabetes-predictive-model

This project aims to train a predictive model to diagnose diabetes in female patients.

data-analysis data-science data-visualization model-training-and-evaluation python

Last synced: 13 Feb 2026

https://github.com/abhinavsharma07/fraud_analytics-credit_card_fraud_detection

The aim of this project is to predict fraudulent credit card transactions with the help of different machine learning models.

banking data-analysis decision-trees hyperparameter-optimization machine-learning-algorithms pipelines random-forest-classifier svm-classifier xgboost-classifier

Last synced: 06 Oct 2025

https://github.com/backdoorali/insider-threat-detection-project

Personal data analysis project combining insider threat detection, cybersecurity, and exploratory data analytics. Built for portfolio showcase and practical skills demonstration.

cybersecurity data-analysis data-analysis-excel data-analysis-project data-analyst data-analytics data-visualization eda excel insider-threat jupyter-lab jupyter-notebook matplotlib numbers pandas portfolio-project python python3 threat-detection threat-intelligence

Last synced: 07 May 2026

https://github.com/gorodroz/crypto-tracker

Realtime Bitcoin price tracker using Binance WebSocket and REST API. Logs prices to CSV and supports Pandas for data analysis.

binance bitcoin crypto csv-logger data-analysis pandas python rest-api websocket

Last synced: 07 May 2026

https://github.com/bcko/ud-da-stroopeffect

Udacity Data Analyst Nanodegree Project : Test a Perceptual Phenomenon (Stroop Effect)

data-analysis data-analyst-nanodegree stroop-effect udacity udacity-data-analyst-nanodegree

Last synced: 04 Jul 2025

https://github.com/bassamn/titanic-data-analysis

Exploratory data analysis (EDA) of the Titanic dataset using Python. Analyzed survival patterns by age, gender, and class with visualizations (seaborn/matplotlib). Non-ML focus—highlighting insights with statistics and plots.

data-analysis eda pandas python seaborn titanic visualization

Last synced: 08 May 2026

https://github.com/muneeb1030/dataannotation

This project streamlines the process of annotating data for machine learning tasks, making it easier and more efficient for teams to create labeled datasets by leveraging Label Studio and Bulk.

bulk data-analysis data-annotation label-studio python

Last synced: 10 May 2026

https://github.com/gab-182/market-analysis-report-for-national-clothing-chain

Using custom M and DAX code in Power BI, I conducted a thorough market analysis for a national clothing chain. The insights gathered from customer data and US Census Bureau statistics led to the formulation of a targeted marketing strategy, contributing to enhanced sales and customer satisfaction.

data-analysis power-bi

Last synced: 19 Mar 2026

https://github.com/geo-y20/loan-approval-automation-using-mongodb-and-pymongo

This project demonstrates the implementation of a loan approval system that utilizes MongoDB for distributed data storage and management, and PyMongo for database operations. The project aims to automate the assessment of loan eligibility using customer details from online applications.

crud-application data data-analysis data-science data-visualization deployment jupyter-notebook loan-default-prediction loan-prediction-analysis machine-learning machine-learning-algorithms matplotlib mongodb pymongo streamlit web

Last synced: 08 May 2026

https://github.com/aravindnathan02/sales-and-customer-analytics

This is a repository for sales and customer performance Tableau dashboard.

customer-dashboard dashboard data-analysis data-visualization sales-analysis sales-dashboard tableau

Last synced: 08 Jan 2026

https://github.com/prashver/dashboard-gallery

These dashboards provide insights across diverse domains, including cryptocurrency sales, workforce challenges, disease impact analysis, and retail trends. Leveraging tools like Power BI and Excel, they offer actionable insights for decision-making.

cryptocurrency dashboards data-analysis data-profession data-visualization market-segmentation-analysis microsoft-excel monkey-pox powerbi product-analysis retail-trends

Last synced: 15 Feb 2026

https://github.com/pradeepchegur/seamantic_web_design

We designed a semantic web for Instagram on the Wix platform.

data-analysis framework instagram semantic-web website-design wix

Last synced: 19 Mar 2026

https://github.com/iguptashubham/ott-churn-eda-ml

Understanding why customers discontinue their subscriptions is crucial to optimizing the user experience, reducing churn, and maximizing customer lifetime value. This project uses a machine learning model to predict customer churn.

data-analysis data-analysis-project data-science data-science-portfolio data-science-projects data-visualization machine-learning python

Last synced: 08 May 2026

https://github.com/jethronap/jstat-gui

Web-based GUI application for data analysis

data-analysis data-visualization java jstat mongodb

Last synced: 08 May 2026

https://github.com/nafisalawalidris/hici-african-foods

HiCi African Foods: Excel dashboard & pivot table analysis of EU food rejection data to identify risks & recommend focus areas for market expansion.

data-analysis data-cleaning data-visualization eu-food-rejection excel-dashboard hici-african-foods market-expansion pivot-tables

Last synced: 19 Mar 2026

https://github.com/miroslav-reiter/kurz_jazyk_sql_analytici_datovi_vedci

Materials for the course SQL Language 1 for Analysts and Data Scientists (Jazyk SQL 1 pre Analytikov a Dátových Vedcov)

analysis analytics data data-analysis data-science database mysql reiter sql

Last synced: 08 May 2026

https://github.com/framebuffers/mindhunter

Wrappers for Pandas DataFrames to add quicker access for common statistical values, utilities and functionality.

data-analysis data-science numpy pandas python utilities-python

Last synced: 08 May 2026

https://github.com/aekanshd/crazytics-suicidesindia

Basic interpretation of the Suicides in India data-set using R.

data-analysis data-science graph india r suicides

Last synced: 10 Jun 2026

https://github.com/edisedis777/duckdb-analyzer

A powerful tool for analyzing large CSV datasets using DuckDB.

csv data-analysis database duckdb

Last synced: 16 Apr 2026

https://github.com/leosimoes/datascienceacademy-powerbi-3.0

Projects from the DataScienceAcademy course Microsoft Power BI for Data Science, Version 3.0. Dashboards for a variety of business cases.

business-intelligence dashboards data-analysis data-visualization microsoft-power-bi

Last synced: 19 Mar 2026

https://github.com/antononcube/wl-quantileregression-paclet

Wolfram Language (aka Mathematica) paclet that provides various Quantile Regression functions.

data-analysis machine-learning quantile-regression time-series time-series-analysis

Last synced: 20 Mar 2026

https://github.com/mxagar/airbnb_data_analysis

An analysis of the AirBnB dataset from Euskadi / the Basque Country.

airbnb data-analysis data-science eda feature-engineering modeling pandas regression

Last synced: 25 Apr 2026

https://github.com/tnleite/projeto_king_lift

This project presents a detailed analysis of the financial data of King Lift, a forklift rental company. Using Microsoft Excel, Power Query, and Power Pivot, I developed an interactive dashboard, also in Excel, that helps the company gain valuable insights to improve operational efficiency and increase revenue.

data-analysis data-science data-visualization excel

Last synced: 19 Mar 2026

https://github.com/sedatdikbas/aefes-time-series-forecasting

This project uses deep learning methods (LSTM, BiLSTM, CNN+LSTM) to forecast future closing prices from market data for Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES). The project details the data preprocessing, model training, and evaluation steps.

bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow

Last synced: 09 May 2026

https://github.com/harshmule1/store-sales-analysis

Sales Analysis Using Power BI

data-analysis powerbi

Last synced: 19 Mar 2026

https://github.com/dina-hosny/explore-us-bike-share-data-project

Explore US Bike Share Data project - FWD Data Analysis Professional Track. In this project, I used Python to explore data related to bike share systems for three major cities in the United States and answer questions about it by computing descriptive statistics.

data-analysis data-science numpy pandas python

Last synced: 09 May 2026

https://github.com/dogan-the-analyst/developer_survey_analysis

Analysis of the 2024 Stack Overflow developer survey. Tools used include Python, Pandas, Matplotlib, and IBM Cognos.

data-analysis data-visualization ibm-cognos-analytics matplotlib pandas python

Last synced: 09 May 2026

https://github.com/sunnybibyan/marketing_campaign_analysis_power_bi_dashboard

Campaign Performance Analysis: this project analyzes the performance of Spring, Summer, and Fall marketing campaigns, revealing key insights and actionable recommendations.

data-analysis data-visualization dax marketing-campaign powerbi

Last synced: 19 Mar 2026

https://github.com/mariam-badr-mb/gtc-ml-project2-diabetes-prediction

This project is part of the GTC Machine Learning Program. It demonstrates the end-to-end ML workflow by building a predictive model for diabetes detection

classification-algorithm data-analysis data-visualization diabetes-prediction gridsearchcv hyperparameter-tuning machine-learning python

Last synced: 09 May 2026

https://github.com/aritrakar/statpy

A simple package containing some functions for analysing Gaussian and Binomial distributions. Created for the Udacity AWS MLE Foundations 2021 course.

data-analysis python statistics

Last synced: 24 Oct 2025

https://github.com/ahammadshawki8/playing-with-pandas

🐼 Pandas is one of my favourite libraries in Python. It is well known for "analyzing" data. Learn the basics of Pandas, and go beyond them, with this repository. 🤍🖤

beginner-friendly data-analysis favourite-library pandas python

Last synced: 17 Apr 2026

https://github.com/santiagortiiz/snowflake-data-warehousing

Snowflake University. Snowflake Data Warehousing. Fundamentals

big-data data-analysis data-warehouse olap snowflake

Last synced: 19 Mar 2026

https://github.com/gabrielmpinho/cs50-sql

Solutions and notes from CS50’s Introduction to Databases with SQL. Covers CRUD operations, data modeling, normalization, joins, views, indexes, and connecting SQL with Python and Java. Begins with SQLite for portability and introduces PostgreSQL and MySQL for scalability.

data-analysis data-structures data-visualization database databases javascript python sql

Last synced: 10 May 2026

https://github.com/whis99/userfunnelanalysis

An ecommerce user funnel conversion data analysis with matplotlib & python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Apr 2026

https://github.com/steno-aarhus/legliv

Substitution of red meat with legumes and risk of primary liver cancer in UK Biobank participants: A prospective cohort study

cancer-research data-analysis epidemiology nutritional-epidemiology nutritional-science open-science reproducibility reproducible-research rstats ukbiobank

Last synced: 03 Mar 2026

https://github.com/sunnybibyan/exploratory-data-analysis-eda

Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates.

data-analysis data-visualization jupyter-notebook python titanic-dataset

Last synced: 18 Jan 2026

https://github.com/pratik-khose/data-analysis-with-pandasai

PandasAI with Llama3 for Interactive Data Analysis

data-analysis llama3 llma pandasai streamlit visualization

Last synced: 11 May 2026

https://github.com/chayandatta/got_script_manipulation

Game of Thrones Script - String & file manipulation

data-analysis data-science pandas python3

Last synced: 11 May 2026

https://github.com/easycris-software/easycris

Professional statistical analysis and RNA-seq for researchers — no coding required

anova bioinformatics data-analysis desktop-app genomics pharmacology research-tools rna-seq statistics tauri

Last synced: 11 May 2026

https://github.com/targetta/ankaflow

YAML-based data pipeline framework that runs both locally and fully in-browser, designed for data engineers, ML teams, and SaaS developers who need flexible, SQL-powered pipelines.

bigquery clickhouse data-analysis dataops deltalake duckdb elt-pipeline etl etl-automation motherduck parquet python sql

Last synced: 09 Oct 2025

https://github.com/muneeb1030/webscrapper_mastodon

The Mastodon Social Platform Scraper is a Python-based web scraping tool designed to explore and extract valuable data from the Mastodon social platform.

data-analysis data-collection mastodon python3 scrapy scrapy-spider selenium-python webscraping

Last synced: 09 Oct 2025

https://github.com/0xpr03/clantool

CF Management & Data Analysis Tool, crawler backend in Rust

backend-server crawler data-analysis rust

Last synced: 05 Feb 2026

https://github.com/sleeplessglory/big-data

Projects regarding big data analysis, presented within Jupyter Notebook

big-data data-analysis data-visualization jupyter python

Last synced: 16 Apr 2026