An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/wtmcgrew/sql-credit-risk-analysis

Credit Risk Analysis using SQL & Excel – Approval trends by FICO, DTI, PTI, LTV, and delinquencies.

case-study credit-risk data-analysis financial-analysis loan-applications portfolio-project sql sqlite underwriting

Last synced: 04 Jul 2025

https://github.com/sarthakagg29/sql-share-trading-analysis

Analysis of share trading transactions using SQL. Includes table setup, sample data, and a variety of queries to answer typical business questions about stocks and trading.

data-analysis dbeaver portfolio postgresql share-market sql

Last synced: 04 Jul 2025

https://github.com/xza85hrf/excel-comparison-app

Excel Comparison Application is a Python-based tool that compares two Excel files and generates a new Excel file with the differences. It's primarily designed to help in database updating by identifying new clients. The app also has a graphical user interface for easier use and logs operations for potential troubleshooting.

case-sensitive-comparison data-analysis data-difference database-comparison database-updates excel-comparison file-merging file-processing gui-application new-client-detection python

Last synced: 25 Mar 2025

https://github.com/siddhant2105s/airman-database-system

This repository contains the design and implementation of the AirMan System for managing airport operations at London Biggin Hill Airport. It includes an ERD diagram, MySQL scripts for database creation, data insertion, and queries, as well as detailed data definitions and system requirements documentation.

data-analysis database-design database-normalization entity-relationship-diagram entity-relationship-models mysql relational-databases sql-queries

Last synced: 25 Mar 2025

https://github.com/alinenog/desenvolve_gb_2022

Formação Desenvolve 2022 do Grupo Boticário na área de dados

data-analysis data-science googlesheet machine-learning numpy pandas python

Last synced: 13 Apr 2026

https://github.com/hazim-hf/data-science

This course covers basic data science principles, Python programming, and the concept of big data and its types. It explores algorithms, methods, and analyses in data science with practical Python examples. Additionally, it highlights current data technologies for storing and archiving.

data-analysis data-wrangling time-series

Last synced: 04 Jul 2025

https://github.com/nafiealhilaly/first-dash-app

A simple dash plotly app to explore and analyze imagined students assessment dataset

data-analysis data-analytics data-visualization eda plotly-dash python

Last synced: 02 Apr 2025

https://github.com/analysisbyvivek/Road-Accident

Analyzes road accident patterns, exploring factors like lighting, weather, speed limits, time of day, and road conditions to uncover trends in severity and frequency.

data-analysis data-visualization eda jupyter-notebook kaggle tableau-public

Last synced: 29 Jan 2026

https://github.com/parthds02/e-commerce-data-analysis-with-python

This project focuses on analyzing an e-commerce dataset using Python. The goal is to derive meaningful insights through exploratory data analysis (EDA) and uncover trends and patterns that can drive business decisions.

data-analysis ecommerce exploratory-data-analysis jupyter-notebook pytho sales-analysis visualization

Last synced: 13 Jun 2025

https://github.com/ireneflorez/nypd-mvc

Analysis of NYPD Motor Vehicle Collisions

basemap data-analysis folium jupyter-notebook matplot pandas python

Last synced: 08 May 2026

https://github.com/extwiii/datascience-jhu

Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization

Last synced: 05 Jul 2025

https://github.com/jameswrigley/laph

A node-based data analysis program.

cpp data-analysis nodes qml

Last synced: 05 Jun 2026

https://github.com/mr-chang95/udacity_movie_project

Movie Data Analysis and Visualization Project for Udacity's Data Analyst Program. Using Python in Jupyter Notebook.

data-analysis data-visualization jupyter-notebook movie python

Last synced: 13 Apr 2026

https://github.com/abhisek-13/fake_news_classifier

The Fake News Classifier is a TensorFlow-based machine learning project that detects and classifies fake news with 97% accuracy. The repository includes a single Python file with complete code for building and training the model, which you can use to create and deploy your own model.

colab-notebook data-analysis data-engineering deep-learning eda kaggle keras machine-learning nlp pandas python tensorflow

Last synced: 13 Apr 2026

https://github.com/ray-chew/pycsam

pyCSAM is a robust approach for approximating geodesic subgrid-scale orographic spectra with applications to weather forecasting and broader data analysis

data-analysis gmted icon-model merit-dem orographic spectral-analysis topography weather-forecast

Last synced: 28 Feb 2025

https://github.com/singhrdeep/croppilot

CropPilot is a lightweight, Python-based command-line tool designed to help small-scale farmers, gardeners, and students manage crop data, track profits, and explore sustainable practices. Built for usability and extensibility.

agriculture data-analysis farm-management open-source python

Last synced: 25 Apr 2025

https://github.com/anthonytlei/graphsql

Lightweight SQL-to-GraphQL connector for querying GraphQL endpoints using SQL syntax.

connector data-analysis dbapi graphql graphsql python sql sqlalchemy superset

Last synced: 09 Apr 2026

https://github.com/hemangsharma/streamingcontentanalyzer

This Streamlit application provides an interactive dashboard for analyzing streaming content data. It allows users to explore movie and TV show ratings, distributions, temporal trends, and genre breakdowns through various visualizations and filters.

dashboard data-analysis data-science data-visualization python streamlit-dashboard streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/mchirico/go_slicestore

Pull Data from Slice Store

data-analysis go ibm

Last synced: 16 Mar 2025

https://github.com/luminati-io/Target-dataset-samples

A sample dataset of over 1000 target products, extracted using the Bright Data API, ideal for brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 09 Apr 2025

https://github.com/deliprofesor/virtual-reality-in-education-impact-analysis-and-insights

This project examines the impact of Virtual Reality (VR) on education, focusing on its effects on student engagement, learning outcomes, and creativity. It uses data analysis techniques like descriptive statistics, correlation analysis, and clustering to assess VR's effectiveness in enhancing learning.

clustering data data-analysis data-science data-visualization exploratory-data-analysis hypothesis-testing machine-learning python regression-analysis virtual-reality

Last synced: 14 Jun 2025

https://github.com/shellynagar27/good-cabs-data-analysis-project

This project is part of CodeBasics Challenge #13, where the goal was to provide actionable insights to the Chief of Operations at Goodcabs, a cab service provider in tier-2 cities of India. The project focused on analyzing key metrics like trip volume, repeat passenger rate, and passenger satisfaction.

critical-thinking data-analysis data-visualization excel exploratory-data-analysis power-bi presentation problem-solving sql storytelling

Last synced: 25 Jan 2026

https://github.com/mehedi-hassan81/mastercourse

Data analysis project analysing renewable energy production across 212 countries, visualizing trends with Tableau. Highlights China's dominance (2,894 TWh) and Paraguay's 100% renewable share.

data-analysis pandas python renewable-energy selenium tableau-dashboards tableau-public web-scraping

Last synced: 08 May 2026

https://github.com/ravi-prakash1907/covid-19-china

A data-science research work to understand the growth rate of the novel Coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/giyanellow/time-series-analysis-on-philippine-debt-and-inflation

A Time Series Analysis on the Philippine Inflation Rate with some predictions using RandomForest.

data-analysis data-analysis-python machine-learning python random-forest

Last synced: 18 Mar 2026

https://github.com/diligencefrozen/dcinside-data

Analyzing the Dcinside Frozen Gallery Dataset. #디시

data-analysis dataset

Last synced: 30 May 2026

https://github.com/fbarffmann/vba-challenge

Built an Excel VBA script to automate stock market analysis across multiple years. Programmatically calculated and visualized key financial metrics, reducing manual reporting time and improving data accuracy.

automation data-analysis excel excel-vba financial-analysis reporting stock-market vba

Last synced: 04 Feb 2026

https://github.com/weisswuerste/polars-eurovision-analytics

Analytics example using both the Pandas and Polars libraries

data-analysis data-analytics pandas polars python python-3 python3

Last synced: 08 May 2026

https://github.com/abhisek-13/whatsapp-chat-analyzer

The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.

data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026

https://github.com/isaqueiros/newspapersales-predictions-linearregression_and_regularisation

This notebook is a study on the sales of newspapers of a local stand, with intention to predict the newspaper sales performance based on the different features available. For this, 4 sklearn models are applied: Linear Regression, Lasso Regression, Ridge Regression and Elastic Net Regression.

data-analysis data-science linear-regression machine-learning python regularization-methods sklearn-library sklearn-linear-regression

Last synced: 02 May 2026

https://github.com/lucalullo/monitoring-healthcare-waiting-times-puglia

Monitoring and analysis of public healthcare waiting times in Puglia (Italy), 2024 — based on official open data

data-analysis healthcare italy jupyter-notebook kaggle open-data pandas public-data puglia time-series waiting-times

Last synced: 08 Jan 2026

https://github.com/hyperentangledqubit/shellplot

shellplot -- Generate plot(s) directly from terminal via matplotlib or ggplot2 (plotnine)!

data-analysis ggplot2 graphics matplotlib plotnine plotting pyplot terminal

Last synced: 10 May 2026

https://github.com/karishmagupta05/udemy-course-analysis

This project analyzes Udemy courses using Exploratory Data Analysis (EDA) techniques to uncover insights about course trends, pricing, subscriber counts, and popularity. By leveraging Python, Pandas, and data visualization libraries, we extract meaningful information from the dataset.

data-analysis data-visualization eda jupiter-notebook pandas python

Last synced: 13 Apr 2026

https://github.com/devexpress-examples/winforms-pivot-change-summarydisplaytype-in-context-menu

The following example shows how to customize the field header's context menu in the PivotGridControl.PopupMenuShowing event handler. The event allows you to change the SummaryDisplayType value of the field.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms xtrapivotgrid-suite

Last synced: 06 May 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-custom-summary-values

This example demonstrates how to determine the value type when you calculate custom summary values in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 06 Jul 2025

https://github.com/deypadma2020/dataanalysis-mlalgo

Practice repository for data analysis, feature engineering, statistics, web scraping, and building ML model pipelines in Python.

data-analysis eda feature-engineering machine-learning-algorithms ml-pipeline statistics web-scraping

Last synced: 30 May 2026

https://github.com/codeslash21/wrangle-twitter-archive

Wrangle Twitter Archive WeRateDog. WeRateDog has 8M followers and they rate the dogs with funny comments and unique rating system. Also use dog-breed classifier to predict dog's breed in the tweets.

data-analysis data-wrangling neural-networkt twitter-api twitter-archive

Last synced: 10 Apr 2025

https://github.com/codeslash21/wrangle_twitter_archive

Wrangle Twitter Archive WeRateDog. WeRateDog has 8M followers and they rate the dogs with funny comments and unique rating system. Also use dog-breed classifier to predict dog's breed in the tweets.

data-analysis data-wrangling nanodegree-project neural-network twitter-api twitter-archive

Last synced: 10 Apr 2025

https://github.com/sadratehranian/data-collection-and-machine-learning

create a model using logistic regression to predict whether the fire alarm of a smoke detector should sound or not. Second, predicts whether an electric drive in a production plant may be faulty or not.

data data-analysis data-science datacollection logistic-regression machine-learning ml nn

Last synced: 05 Jan 2026

https://github.com/pratik-khose/realtime-sales-simulation

Power BI: Realtime Sales Simulation using SQL Server and Direct Query

data-analysis data-analytics data-visualization dax-query powerbi sql sql-server sqlserver

Last synced: 10 Jun 2026

https://github.com/esther-poniatowski/multitask-context-dependent-behavior

Data analysis of neuronal recordings in naive and trained animals performing multiple tasks in active and passive attentional states

cognitive-neuroscience computational-neuroscience data-analysis data-visualization information-processing

Last synced: 26 Mar 2025

https://github.com/satyam4229/prediction-of-different-diseases

Prediction of the different diseases with the help of different symptoms express the diseases in the real time. In the dataset, there are 132+ different symptoms on which the model is trained to give the best result of the disease.

data-analysis data-science data-visualization jupyter-notebook kaggle python

Last synced: 13 Apr 2026

https://github.com/mattholy/haka

HaKa is an out-of-the-box tool system designed for data engineers and data analysts in medium-sized enterprises. It is easy to deploy and scale.

celery data-analysis data-engineering fastapi python uvicorn-gunicorn

Last synced: 19 May 2026

https://github.com/fbarffmann/mycitibike

Built an interactive Leaflet.js map visualizing over 750 Citi Bike station locations in NYC. Analyzed usage patterns, station density, and user navigation across the network.

citibike data-analysis data-visualization geojson geospatial interactive-map javascript leaflet nyc web-mapping

Last synced: 07 Jul 2025

https://github.com/mainak-97/weather-data-analysis-using-python

A comprehensive analysis of time-series weather data using Python and Pandas, focusing on data exploration, cleaning, and uncovering insights.

data-analysis jupyter-notebook pandas pandas-dataframe python python3 time-series-analysis

Last synced: 08 May 2026

https://github.com/subratamondal1/heart-attack-prediction

Heart Attack Prediction of patients based on the required data. Data Ingestion - Data Preparation - Exploratory Data Analysis (EDA) - Modelling - Evaluation.

data-analysis data-science data-visualization kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python3 scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/ilke-kas/multivariate-data-analysis

A curated collection of R-based data analysis projects applying regression modeling, clustering, dimensionality reduction, multivariate statistics, and classification. Each project showcases practical data science techniques, interpretability, and domain insights using real-world and academic datasets.

classification data-analysis data-visualization dimensionality-reduction machine-learning multivariate-analysis r regression statistics

Last synced: 05 Oct 2025

https://github.com/seblehner/feldprakt

Collection of plotting routines for a field exercise work using different measurement tools and Hobo weather stations.

data-analysis data-visualization jupyter-notebook python

Last synced: 05 Oct 2025

https://github.com/jianxi-erin/bigdata-machinelearning-lab

本项目是一个综合性的大数据与机器学习实验平台,包含两个主要任务,每个任务涵盖三个关键技术模块:大数据处理、数据分析和机器学习。项目基于真实的竞赛设计,提供完整的数据处理模拟和建模实践。

data-analysis data-visualization hadoop machine-learning python spark sql

Last synced: 03 May 2026

https://github.com/mahmoud2abdallah/improvado-marketing-homework

This Looker Studio dashboard provides a comprehensive analysis of marketing performance for August 2024, transforming raw data into actionable insights for data-driven decision making.

bigquery business-intelligence data-analysis looker-studio marketing

Last synced: 05 Oct 2025

https://github.com/vishal786-commits/target-businesscasestudy-sql

This project analyzes Target’s e-commerce transactions in Brazil between 2016 and 2018 using SQL. The goal was to explore customer behavior, order patterns, payments, delivery times, and freight costs to generate actionable business insights.

bigquery data-analysis sql

Last synced: 05 Oct 2025

https://github.com/affan005-ai/tesla-stock-prediction

This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models

data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn

Last synced: 05 Oct 2025

https://github.com/trismald/eurosoccer1023

Data Analyst - European Soccer 2010 2023

data-analysis data-visualization jupyter-notebook pandas powerbi python

Last synced: 06 May 2026

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/subhamghimire/dataanavis

Learning Data analysis and visualization

data-analysis data-science data-visualization dataset

Last synced: 06 Oct 2025

https://github.com/dacq-trap/dacq

DacQ Score Server

data-analysis go nextjs

Last synced: 14 Jan 2026

https://github.com/nuriadevs/informes-powerbi

Este repositorio contiene informes elaborados con Power BI.

data-analysis powerbi

Last synced: 18 Feb 2026

https://github.com/prarthana-singh/bangalore-house-price-predictor

🏡 Bangalore House Price Prediction – A Machine Learning model to predict house prices in Bangalore using real estate data. Built with Linear Regression, Python, Pandas, NumPy, and Scikit-Learn.

data-analysis eda house-price-prediction linear-regression machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 19 Apr 2026

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 07 Oct 2025

https://github.com/npodlozhniy/podlozhnyy-module

One place for the most useful methods for work

data-analysis data-science pypi

Last synced: 21 Jan 2026

https://github.com/pranavsp108/market_basket_analysis-instacart

Customer segmentation and market basket analysis using the Instacart dataset with Python, Pandas, and K-Means clustering.

customer-segmentation-and-buying-behavior data-analysis data-visualization instacart jupyter-notebook kmeans-clustering market-basket-analysis pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/sorebit/pdrpy-pd-2

Data analysis of various stackechange.com archives.

data-analysis stackexchange time-travel university-project

Last synced: 08 Oct 2025

https://github.com/debjyotisaha/power-bi-projects-phase-2

Created interactive dashboards and reports using Power BI to visualize complex datasets. Demonstrated proficiency in data modelling, DAX calculations, and storytelling through data to provide actionable insights.

dashboards data-analysis data-modeling data-visualisation power-query powerbi

Last synced: 18 Jan 2026

https://github.com/faisal-khann/ipl-analysis

The IPL Analysis project is a comprehensive data-driven exploration of the Indian Premier League (IPL), analyzing historical match data to uncover patterns in team performance, player statistics, and match outcomes.

data-analysis exploratory-data-analysis jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 08 May 2026

https://github.com/takshshah-16/pizza_sales_sql

SQL-powered pizza sales analytics project using MySQL Workbench to derive business insights through data exploration and queries.

business-intelligence data-analysis database-management mysql sql

Last synced: 09 Oct 2025

https://github.com/mirwais-farahi/data-visualization-with-tableau-specialization

The Specialization provides Tableau for data visualization and business intelligence. The series covers skills like assessing data quality, designing visualizations and dashboards, and combining data sources to create compelling, data-driven stories.

dashboard data-analysis geospatial map tableau visualization

Last synced: 16 Feb 2026

https://github.com/adithya2369/safa_public

AI-powered customer feedback analyzer that uses generative AI to transform customer reviews into actionable business insights. Upload review data, get instant summaries, satisfaction scores, detailed reports, and improvement suggestions—all in an easy-to-deploy Docker container.

data-analysis data-visualization docker-containerization full-stack-development generative-ai langchain langchain-groq web-development

Last synced: 10 Oct 2025

https://github.com/samuelsoaress/wkd-default-reduction

reduction of default from 35% to 25% or less with machine learning techniques

data-analysis data-exploration data-science machine-learning-algorithms

Last synced: 10 Oct 2025

https://github.com/loaiwalid07/automation_data_overviwe

This is Streamlit app that gives an overview for a dataset you upload

automation data data-analysis data-exploration data-science data-transformation data-visualization

Last synced: 19 May 2026

https://github.com/salma-mamdoh/the-android-app-market-on-google-play-project

My project aims to practice Data Analysis and Data Visualization on DataCamp

data-analysis data-visualization datacamp jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 12 Apr 2026

https://github.com/jiwookseo/natural_language_analysis

api sample for google natural language and ECOS(한국은행 경제통제시스템)

data-analysis google-natural-language-api text-analysis

Last synced: 11 Oct 2025

https://github.com/vinay-jose/territorial-sales-dashboard

EDA was carried out in the sales data of Atliq Technologies and a Dashboard was created in PowerBI to draw insights.

data-analysis data-visualization powerbi-desktop sql

Last synced: 11 Oct 2025

https://github.com/ahsankhizar5/titanic-eda-visualization

Exploratory Data Analysis and Visualization on the Titanic Dataset using Python, Pandas, Matplotlib, and Seaborn to uncover survival patterns.

data-analysis data-science data-visualization eda kaggle machine-learning matplotlib pandas python seaborn titanic-dataset

Last synced: 31 May 2026

https://github.com/NurFakhri/scraping-and-analysis-skincare

Scraping and data analysis of Indonesian skincare reviews.

beutifulsoup data-analysis data-scraping python requests review scraping-websites

Last synced: 12 Oct 2025

https://github.com/alexondata/daan_eda-exploratory-data-analysis_ecommerce

This project presents an Exploratory Data Analysis (EDA) pipeline for an eCommerce dataset, integrating Python, SQL Server, and Power BI to transform raw transactional data into meaningful business insights. The project was developed as part of an academic assignment at Transilvania University of Brașov, Faculty of Mathematics and Computer Science.

data-analysis data-visualization ecommerce microsoft-sql-server powerbi python

Last synced: 18 May 2026

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/veronsheva/hr_dashboards

Interactive HR dashboard using Tableau & MySQL – explore employee trends, performance, attrition, and salary insights.

calculated-fields charts cte dashboards data-analysis data-cleaning design eda mysql queries tableau window-functions

Last synced: 24 Jan 2026

https://github.com/jsimell/sleepanalysis

A Python data analysis project analyzing the sleep quality affecting factors and temporal patterns in the sleeping data of a single subject.

data-analysis matplotlib numpy pandas python scikit-learn seaborn

Last synced: 14 Apr 2026