An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/imrandil/excel_learning_dir

Excel learning practice, working through some sample data

data-analysis datasets excel

Last synced: 27 Jan 2026

https://github.com/extwiii/datascience-jhu

Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization

Last synced: 05 Jul 2025

https://github.com/scarlet-enlight/ml_project

Comparison of different classifiers (KNN, Naive Bayes, Decision Tree) on the Sleep Health and Lifestyle Dataset

data-analysis machine-learning

Last synced: 13 Mar 2026

https://github.com/lucs1590/agidatatest

This is a repository with data analysis and data science tests.

data-analysis data-science python test

Last synced: 13 May 2026

https://github.com/pranav016/exploratory-data-analysis-of-google-app-store-dataset

This is a data analysis done on the Google app store dataset to answer a few questions related to the data through data visualization techniques.

data-analysis

Last synced: 11 Oct 2025

https://github.com/cezlul/analyse-ventes-immobilier

ML solution for Parisian real-estate analysis: automatic classification of apartments vs. commercial premises (K-means, 91%) and price prediction (regression, R²=0.98) on 26K transactions. Values a €169M portfolio with data-driven strategic recommendations.

data-analysis jupyter-notebook machine-learning matplotlib numpy pandas python sklearn

Last synced: 13 Apr 2026

https://github.com/jiwookseo/natural_language_analysis

API sample for Google Natural Language and ECOS (the Bank of Korea's Economic Statistics System)

data-analysis google-natural-language-api text-analysis

Last synced: 11 Oct 2025

https://github.com/montanaz0r/testing-if-mma-math-deduction-works-using-ufc-fighters-data

Probabilistic reasoning about the phenomenon called MMA math, using UFC fighter data and Python.

bayesian-inference data-analysis data-science graphviz jupyter-notebook pandas python scipy statistics

Last synced: 14 Apr 2026

https://github.com/dhruvil-26/tableau-projects

This repository contains Tableau visualization projects focused on data analysis across different domains. Projects include: 1. IPL Visualization - Insights into IPL match, team, and player statistics. 2. EV Analysis - Visualizations exploring the adoption of electric vehicles. 3. Road Accident Analysis - Analysis of road accident patterns

analysis data data-analysis data-analytics electric-vehicles ipl road-accident-analysis tableau tableau-public

Last synced: 19 Jan 2026

https://github.com/silvermete0r/sdu_hackathon_uss_db_analysis

Smart Data Ukimet Hackathon - "Data Modeling" case Solution - Topic: Store Analysis based on Unified Star Schema

data-analysis data-modeling postgresql python sql unified-star-schema

Last synced: 14 Apr 2026

https://github.com/thinzarhninyu/dap

Notes and Projects for Data Analysis with Python course from FreeCodeCamp.org

data-analysis data-analysis-python ipynb jupyter-notebook python

Last synced: 18 Feb 2026

https://github.com/mituskillologies/dkte-da-mar25

Programs from the Python Data Analytics training conducted at DKTE's Engineering Institute, Ichalkaranji, in March 2025.

data-analysis matplotlib numpy pandas python-programming tkinter-python

Last synced: 13 May 2026

https://github.com/treasarose/us_candy_distribution_analysis_project

This project focuses on advanced data analysis and optimization using SQL. It includes queries for analyzing sales, product margins, and shipping efficiency for a US candy distributor.

data-analysis entity-relationship mssql optimization query sql-server sqlproject us-candy-distributor

Last synced: 12 Oct 2025

https://github.com/NurFakhri/scraping-and-analysis-skincare

Scraping and data analysis of Indonesian skincare reviews.

beutifulsoup data-analysis data-scraping python requests review scraping-websites

Last synced: 12 Oct 2025

https://github.com/hassanislam463/data-cleaning-and-modelling-top-5-categories-analysis-forage

This project involves cleaning, merging, and analyzing datasets to identify the top 5 performing categories based on aggregate popularity scores. It includes cleaned datasets, a final merged dataset, visualizations, and a presentation summarizing the tasks and results. Tools used: Microsoft Excel, Python, and PowerPoint.

data-analysis data-visualization microsoft-excel

Last synced: 07 Jan 2026

https://github.com/jeffbrennan/analysis-templates

Templates of commonly used graphics/functions/settings to help focus on the bigger picture

data-analysis r rmd

Last synced: 12 Oct 2025

https://github.com/akash1070/project--uber-data-analysis

Analysis of Uber data from the dataset using Python

data-analysis data-science python

Last synced: 09 May 2026

https://github.com/amoghkori/effect-of-box-office-on-unemployment

Data preparation and cleaning process for movie ratings and reviews dataset and US unemployment rate dataset, involving an 8-step data wrangling process to create an Analytic Base Table (ABT) structure, emphasizing data structuring techniques, cleaning for outliers and missing values, and the importance of accurate and reliable data for analysis.

data-analysis data-cleaning data-preprocessing data-validation data-wrangling model-selection

Last synced: 13 Jun 2025

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/parthds02/e-commerce-data-analysis-with-python

This project focuses on analyzing an e-commerce dataset using Python. The goal is to derive meaningful insights through exploratory data analysis (EDA) and uncover trends and patterns that can drive business decisions.

data-analysis ecommerce exploratory-data-analysis jupyter-notebook pytho sales-analysis visualization

Last synced: 13 Jun 2025

https://github.com/borjamome/accidentes_madrid

Analysis of Accidents in Madrid in SQL (2023)

accidentes-coche data-analysis madrid sql

Last synced: 17 Jan 2026

https://github.com/alexgenovese/react-charts-covid-19-data

Examples with COVID-19 data using different chart libraries: G2, G2Plot, Plotly, ApexCharts

data-analysis data-science data-visualization react reactjs

Last synced: 13 May 2026

https://github.com/alefrp/properties_dbt

A DBT project for analyzing city property data.

data-analysis data-warehouse dbt python sql

Last synced: 13 Oct 2025

https://github.com/gmalbert/supreme-court

Data Analysis of the US Supreme Court from 1790 to present

data-analysis data-science supreme-court

Last synced: 31 May 2026

https://github.com/angelalim88/jakarta-air-quality-index-classification

This project classifies Jakarta's Air Quality Index (AQI) from 2010 to 2023 using machine learning models (Random Forest, MLP, SVM) based on pollutant concentrations.

data-analysis data-visua machine-learning scikit-learn tensorflow

Last synced: 13 Oct 2025

https://github.com/analysisbyvivek/Road-Accident

Analyzes road accident patterns, exploring factors like lighting, weather, speed limits, time of day, and road conditions to uncover trends in severity and frequency.

data-analysis data-visualization eda jupyter-notebook kaggle tableau-public

Last synced: 29 Jan 2026

https://github.com/gmalbert/rugby

Rugby Data Analysis and Sports Betting

data-analysis rugby sports-betting

Last synced: 31 May 2026

https://github.com/darrenjolson/pba-analysis-app

Data analysis and visualization tool for professional bowling tournaments, predicting performance across different oil patterns and venues.

bowling data-analysis data-visualization flask pba predictive-analytics python reactjs sports-analytics

Last synced: 13 Apr 2026

https://github.com/inddrsingh/restaurant_orders_mysql

Complex SQL queries on restaurant data for better, more precise insights

data-analysis insights mysql

Last synced: 28 Jan 2026

https://github.com/ibrahimhabibeg/national-university-of-singapore-sms-analysis

Analysis of SMS messages collected by the National University of Singapore

analytics data-analysis data-science nlp python

Last synced: 13 May 2026

https://github.com/giseletoledo/case-study-wellness-smart

Project from the Coursera course Google Data Analytics

data-analysis kaggle-dataset r

Last synced: 14 Oct 2025

https://github.com/samkazan/business-analysis-tableau

Business Analysis on Global/Superstore data using Tableau.

analysis data-analysis tableau visualization

Last synced: 08 Feb 2026

https://github.com/ironlegion88/media_bias

An end-to-end NLP pipeline to analyze ideological bias in online news media during elections. Uses sentiment analysis, topic modeling (LDA/NMF), and NER to quantify media framing.

data-analysis machine-learning media-bias nlp nltk political-science python scikit-learn sentiment-analysis spacy topic-modeling

Last synced: 13 Apr 2026

https://github.com/nullthefirst/py-notebooks

Jupyter Notebooks holding Data Science projects

data-analysis data-science data-visualization datasets jupyter-notebooks python

Last synced: 26 Apr 2026

https://github.com/achrefbenammar404/quasi-patterned-conversations-analysis

Official implementation of the IEEE EUROCON 2025 paper "A Computational Approach to Modeling Conversational Systems: Analyzing Large-Scale Quasi-Patterned Dialogue Flows". Authors: Mohamed Achref Ben Ammar – National Institute of Applied Science and Technology (INSAT), University of Carthage, Tunisia; Mohamed Taha Bennani – University of Tunis El Manar (FST).

ai computational-linguistics conversational-agent conversational-ai data-analysis graph-algorithms nlp research research-paper

Last synced: 14 Oct 2025

https://github.com/ayorick23/python-data-science-cheat-sheet

Quick, practical guide to essential Python syntax, commands, and functions for data science. Ideal for remembering how to use the most common libraries, such as NumPy, Pandas, Matplotlib, and Scikit-learn, in your day-to-day analyses.

cheat-sheet data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning matplotlib ml numpy pandas python scikit-learn scipy seaborn statistics sympy tensorflow

Last synced: 07 Apr 2026

https://github.com/asuquoaa/air_bnb_analysis_dashboard-tableau-

Interactive Tableau dashboards to analyze and visualize data, providing actionable insights for better decision-making

dashboard data-analysis interactive-visualization tableau

Last synced: 13 Mar 2026

https://github.com/deliprofesor/joblocationmapper

JobLocationMapper is a Python tool that visualizes job listings on an interactive map. It uses city and state data to place job markers accurately and color-codes them by occupation (Software, Marketing, Design). The map clusters markers for better organization, and users can click on them to view job details.

clustrered-markers data-analysis data-visualization folium geocoding geographical-visualization interactive-map job-listings map-visualization pandas python

Last synced: 14 May 2026

https://github.com/saisurajmatta/e-commerce-sales-advanced-data-analysis

Excel-based e-commerce analytics for FNP, a gift company. It covers data extraction, modeling, and visualization, providing actionable insights on revenue, customer behavior, and operations. Key skills include Excel, Power Query, Power Pivot, and DAX. The analysis culminates in data-driven business recommendations.

data-analysis data-visualization dax excel power-pivot power-query

Last synced: 22 Jan 2026

https://github.com/rohanrony19/movie-recommendation-system

A Python project that uses the Pandas library to compute correlations and give the best movie recommendations.

data-analysis deep-learning knn-algorithm numpy pandas python recommendation-system

Last synced: 14 Apr 2026

https://github.com/sanjayankur31/20181206-neurofedora

Slides for my NeuroFedora talk at the UH Biocomputation group's weekly seminar

computational-neuroscience data-analysis neurofedora neuroimaging neuroscience open-science

Last synced: 19 Feb 2026

https://github.com/virajbhutada/hollywood-insights-tableau

Strategic cinematic insights through Hollywood's data landscape. Tableau-driven analytics for genre, studio profitability, and audience dynamics. Uncover trends, assess audience reception, and navigate through years of film data, elevating your understanding of the cinematic world.

analystics business-intelligence dashboard data-analysis data-visualization entertainment hollywood storytelling tableau tableau-desktop visualization

Last synced: 05 Feb 2026

https://github.com/kunalpisolkar24/winequalityprediction

Predicting wine quality using machine learning with matplotlib, numpy, pandas, and seaborn for insightful data analysis. 🍇🤖📊

data-analysis data-science data-visualization machine-learning prediction-model

Last synced: 16 Oct 2025

https://github.com/avratanubiswas/fluorpenplugin

A MATLAB user interface for analysing OJIP curve datasets from the FluorPen instrument, serving as an additional plug-in for "quick categorical analysis".

data-analysis fluorpen ojip-curve

Last synced: 18 Mar 2026

https://github.com/hase3b/flask-dash-interactive-dashboard

An interactive data visualization dashboard created using Flask and Dash. This project includes comprehensive data preparation, exploratory data analysis (EDA), and dynamic visualizations with Seaborn and Plotly. Explore the multi-page Dash app with features like dropdowns and callbacks for updated plots.

callbacks dash dashboard data-analysis data-visualization dropdown eda flask interactive plotly seaborn web-app

Last synced: 19 May 2026

https://github.com/supertetelman/coursera-exdata-09

This repo contains several R scripts that were used to analyze, plot, and clean data from various datasets. These projects were part of the Coursera course, Exploratory Data Analysis. The end results of the analysis are included.

big-data course coursera data-analysis r

Last synced: 16 Oct 2025

https://github.com/fbarffmann/nosql-challenge

Analyzed 28,000+ UK restaurant records using MongoDB and PyMongo. Queried hygiene scores, location data, and customer ratings.

data-analysis data-cleaning database-analysis json mongodb nosql pymongo python restaurant-data

Last synced: 13 Apr 2026

https://github.com/itskshitija/lego-set-explorer

As a part of the Maven Analytics Lego challenge, I developed an interactive Power BI dashboard exploring the evolution of LEGO sets from 1970 to 2022.

data-analysis data-science data-visualization dataanalysis dataset powerbi powerbi-desktop powerbi-report

Last synced: 12 Jun 2025

https://github.com/sumit0ubey/internship

This repository showcases the tasks and projects I completed during various internships. It includes work across diverse domains such as: Data Analysis: Exploratory data analysis, data visualization, and insights generation using Python and libraries like Pandas, Matplotlib, and Seaborn. Backend Development: Designing and implementing RESTful API

backend-development data-analysis python-developer

Last synced: 05 Sep 2025

https://github.com/mattdelaune/excel_sales_dashboard

Interactive Excel Dashboard for Coffee Sales Analysis: This project leverages Excel to analyze sales data, uncover seasonal trends, regional preferences, and customer behaviors, providing actionable insights for optimizing inventory and marketing strategies.

data-analysis excel pivot-tables sales-dashboard sales-data

Last synced: 27 Jan 2026

https://github.com/mindlessmuse666/iris-ml-based-on-decision-trees

The project demonstrates the application of machine learning models based on decision trees and random forests to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results. Intended for learning the basics of machine learning and data analysis.

classification data-analysis data-visualization decision-trees iris-dataset machine-learning model-evaluation python random-forest scikit-learn

Last synced: 17 Oct 2025

https://github.com/fbarffmann/python-challenge

Automated financial and election data analysis using Python. Cleaned and transformed large CSV datasets, calculated key business metrics, and generated automated reports for stakeholders.

automation csv data-analysis data-cleaning election-analysis financial-analysis python reporting

Last synced: 24 Apr 2025

https://github.com/prateek5525/online-shopping-analytics-project

The Online Shopping Analytics Project analyzed product trends and regional sales using SQL and Tableau. Insights from the Sales and Location Dashboards highlighted key trends in demographics, product popularity, and regional performance. These findings empower businesses to optimize strategies, enhance marketing, and improve inventory management.

data-analysis excel kaggle-dataset sql tableau

Last synced: 20 Feb 2026

https://github.com/codeslash21/communicate_data_findings

Analyze and visualize Bay Wheel system data, which contains 2.5M individual trips, and communicate the findings from the dataset in the form of a notebook slide deck.

bay-wheel data-analysis data-visualization explanatory-data-visualization exploratory-data-analysis

Last synced: 22 Jan 2026

https://github.com/abhijeet107/task-4

Design an interactive dashboard for business stakeholders.

data-analysis excel-csv tableau-dashboards tableau-public

Last synced: 22 Jan 2026

https://github.com/casassg/ms_thesis

Social Media Analysis for Crisis Informatics in the Cloud

casassg-thesis data-analysis google-cloud kubernetes

Last synced: 19 Oct 2025

https://github.com/nsandoya/python_scrp_project

This is a tool made specifically for the Dipaso e-commerce website. You can extract data from it, analyze it, and see keyword, brand, and category frequency, price distribution, and other market trends, all in a set of friendly statistical tables and graphics (exported from a Jupyter notebook) :)

beautifulsoup4 data data-analysis jupyter-notebook pandas python3

Last synced: 28 Apr 2026

https://github.com/farhad-here/predict_student_performance

Predict Student Performance is a data analysis and machine learning project aimed at predicting students' final performance (g3) based on demographic, family, and academic features. The project supports both regression (predicting exact grades) and classification (Pass/Fail categories).

classification data-analysis data-visualization linear-regression machine-learning numpy pandas postgresql powerbi scikit-learn streamlit

Last synced: 14 Apr 2026

https://github.com/xza85hrf/excel-comparison-app

Excel Comparison Application is a Python-based tool that compares two Excel files and generates a new Excel file with the differences. It's primarily designed to help in database updating by identifying new clients. The app also has a graphical user interface for easier use and logs operations for potential troubleshooting.

case-sensitive-comparison data-analysis data-difference database-comparison database-updates excel-comparison file-merging file-processing gui-application new-client-detection python

Last synced: 25 Mar 2025

https://github.com/yulia-momotyuk/dla-data-analysis-practice

This repository contains my homework assignments completed during the "Data Analyst in IT" course at Data Loves Academy.

analytics data-analysis data-visualization excel mysql numpy pandas postgres powerbi python seaborn sql tableau

Last synced: 14 Apr 2026

https://github.com/Kaushik-Puttaswamy/Airline-Passenger-Referral-Prediction-Using-Machine-Learning

This project uses a machine learning model to predict if passengers referred by existing customers will book a flight, helping airlines target likely customers. Key factors like service ratings and value for money drive predictions, achieving over 90% accuracy.

airline-marketing customer-referral-prediction customer-satisfaction data-analysis feature-engineering hyperparameter-tuning machine-learning model-evaluation predictive-analytics

Last synced: 20 Oct 2025

https://github.com/shrinidhi857/simpledataanalysisonstartups

The Indian startup ecosystem has experienced remarkable growth over the past decade, becoming a hotbed of innovation and entrepreneurship. In this data analysis, we segregate fields and look for new insights.

data-analysis data-science data-visualization indian-startups

Last synced: 17 Sep 2025

https://github.com/ashwin331133/sql-pizza-outlet-sales-analysis

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

data-analysis sql

Last synced: 24 Feb 2026

https://github.com/smoeding/jmeterplugin-datasketches

A JMeter listener using DataSketches to estimate response time quantiles and histograms

data-analysis jmeter jmeter-listeners jmeter-plugin

Last synced: 06 Mar 2025

https://github.com/wtmcgrew/sql-credit-risk-analysis

Credit Risk Analysis using SQL & Excel – Approval trends by FICO, DTI, PTI, LTV, and delinquencies.

case-study credit-risk data-analysis financial-analysis loan-applications portfolio-project sql sqlite underwriting

Last synced: 04 Jul 2025

https://github.com/muhammed-fazal/student-success-and-early-intervention-analytics-system

Consolidates scattered student performance records into a unified data warehouse in SQL Server, builds interactive Power BI dashboards that visualize academic trends and student performance, and implements predictive analytics.

analysis analytics dashboard data data-analysis data-engineering data-science data-visualization database etl etl-pipeline power-bi powerbi python sql sql-server

Last synced: 29 May 2026

https://github.com/navp7/hr_analysis_excel

This project utilizes Microsoft Excel to conduct a comprehensive analysis of HR data, focusing on identifying the various reasons for employee attrition and evaluating job satisfaction

dashboards data-analysis excel visualization

Last synced: 23 Jan 2026

https://github.com/dcs-training/spatial_dynamics

Use of QGIS and R to analyse first- and second-order geospatial effects. See the README file.

data-analysis geographical-data gis qgis r statistics

Last synced: 23 Oct 2025

https://github.com/changyeop-yang/study-datasciencefoundation

Big data science and analytics play a major role in this decade. Several challenges remain: how to clean and prepare your data for analysis, how to perform basic visualization of your data, how to model your data, how to curve-fit your data, and finally, how to present your findings and wow the audience.

data-analysis ios kyungpook-national-university swift

Last synced: 23 Oct 2025

https://github.com/prakshal0809/power-bi-analytics-dashboard

I have developed a dashboard in Power BI utilizing data from an Excel file. The dashboard effectively visualizes and analyzes the given data.

data-analysis powerbi

Last synced: 22 Feb 2026

https://github.com/gunifiri/duckdb-ghw

🦆 Accelerate analytics with DuckDB's integration for GitHub workflows, enabling efficient data handling and processing directly within your repositories.

analytics analytics-engine big-data columnar-storage data-analysis data-science database duckdb in-memory-database open-source parquet python query-planner r sql

Last synced: 29 Apr 2026

https://github.com/yeonjaee/data-analytics

Converts raw data into actionable insights

data-analysis text-mining

Last synced: 11 Jun 2026