An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/parthds02/pizza_sales_sql

SQL project analyzing pizza sales data. Includes creating tables, executing queries, and solving basic to advanced analytical questions to derive insights from sales data.

analytics data-analysis data-science pizza-sales sql sql-query

Last synced: 04 Mar 2026

https://github.com/gaboelc/analysis-of-the-employment-situation-in-costa-rica-2018-2022

This is an analysis with data extracted from the INEC in order to identify the changes that occurred in the Costa Rican labor market before, during and after the COVID-19 pandemic.

costa-rica data-analysis empleo employment

Last synced: 24 Mar 2025

https://github.com/datastalker/survival-cox

This repository contains an R script for performing survival analysis on breast cancer surgery data from the University of Chicago's Billings Hospital. The analysis includes Kaplan-Meier estimation and Cox Proportional Hazards modeling to assess patient survival.

breast-cancer-prediction cox-model data-analysis data-science data-visualization epidemiology kaplan-meier r survival-analysis

Last synced: 02 Apr 2025

https://github.com/mothraa/etl-marketanalysis-webscraping-poo

OC project 2 refactoring (OOP version not yet completed)

data-analysis etl poo python web-scraping

Last synced: 20 Oct 2025

https://github.com/kathkoeh/pimaindian-kk

Logistic regression analysis of diabetes risk using the Pima Indians dataset. Includes prevalence analysis, modeling, ROC/AUC evaluation, and patient testing in Python.

data-analysis diabetes epidemiology logistic-regression machine-learning public-health python

Last synced: 28 Apr 2026

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 04 Feb 2026

https://github.com/andersoncrs/prediccion-del-precio-de-vehiculos-un-enfoque-con-regresion-lineal-y-regularizacion

This project aims to predict the price of used vehicles using linear regression and Lasso regularization. Through data analysis and processing, it builds an accurate and interpretable predictive model based on the most relevant features of each vehicle.

data-analysis data-exploration lasso-regression machine-learning polinomial-regression regularization-methods

Last synced: 03 Jul 2025

https://github.com/leosimoes/nexoseducacao-imersao-powerbi

Activities completed during the Imersão PowerBI (Power BI Immersion) by Nexos Educação with Karine Lago and Leticia Smirelli in September 2023.

business-intelligence dashboards data-analysis microsoft-power-bi

Last synced: 06 Jan 2026

https://github.com/navp7/hr_analysis_excel

This project uses Microsoft Excel to conduct a comprehensive analysis of HR data, focusing on identifying the reasons for employee attrition and evaluating job satisfaction.

dashboards data-analysis excel visualization

Last synced: 23 Jan 2026

https://github.com/juliuspinsker/bioconductor-learning-container

🧬 Containerized development environment for Harvard's Professional Certificate in Data Analysis for Genomics (PH525.x series). Streamlined setup for Bioconductor, R, and genomic data analysis with RStudio and DevContainer support.

bioconductor bioinformatics chip-seq data-analysis data-science devcontainer dna-methylation docker edx functional-genomics genomics harvard harvardx ph525 ph525x r reproducible-research rna-seq rstudio single-cell-rna-seq

Last synced: 14 May 2026

https://github.com/devanshsahu47/prime-content-analytics

Prime Data Explorer analyzes Amazon Prime's content and credits data to uncover trends in release years, genres, and ratings. It cleans, merges, and visualizes the data to provide actionable insights for optimizing content strategy and boosting audience engagement.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python3

Last synced: 13 May 2026

https://github.com/albertobarrago/sentinel

A contribution to the research of Corrado Malanga and Filippo Biondi

data-analysis sar

Last synced: 24 Oct 2025

https://github.com/noturlee/iris-dataanalyis

This project aims to classify Iris flowers into three species—setosa, versicolor, and virginica—based on their sepal and petal measurements using machine learning techniques. The dataset comprises 150 samples evenly distributed among these species

data-analysis data-modeling data-science data-structures-and-algorithms data-visualization

Last synced: 08 Apr 2025

https://github.com/jofaval/sonar

Binary Classification of Sonar Signals of Rocks and Metal cylinders in 1987

data-analysis data-science data-visualization machine-learning python scikit-learn sonar uci

Last synced: 09 Apr 2026

https://github.com/satyacoder29/crm-analytics-power-bi

CRM Analytics Dashboard – An interactive dashboard using Tableau, SQL, and Salesforce CRM Analytics (CRMA) to analyze sales performance, customer segmentation, and churn prediction. Features automated ETL pipelines, predictive analytics, and real-time insights for data-driven decision-making. 🚀📊

advanced-excel data-analysis data-cleaning data-collection data-transformation data-visualization matplotlib numpy pandas powerbi python seaborn sql tableau

Last synced: 14 Apr 2026

https://github.com/alchemine/analysis-tools

Analysis tools for machine learning projects

data-analysis explanatory-data-analysis machine-learning python

Last synced: 06 Aug 2025

https://github.com/ginga1402/youtube_analysis

Exploratory Data Analysis on YouTube data

college-project data-analysis pandas-python

Last synced: 30 Mar 2025

https://github.com/sehgal-vishal/sql-nyc-collision-analysis

This analysis is based on the collisions (accidents) that happened in New York City. I used SQL Server for EDA (exploratory data analysis).

data-analysis database eda sql-server

Last synced: 06 Feb 2026

https://github.com/aroramrinaal/spotistats

Spotistats is a data analysis and visualization project based on your Spotify streaming history.

data-analysis numbers spotify spotify-history visualization

Last synced: 15 Mar 2025

https://github.com/lotfiferaga/amazon-alexa-reviews-sentiment-analysis

Amazon Alexa, developed by Amazon, allows users to interact with technology through voice commands. Analyzing user sentiments about Alexa, with over 40 million users worldwide, is an intriguing data project.

classification data-analysis python sentiment-analysis

Last synced: 18 Jun 2026

https://github.com/vishalsiingh/deloitte-virtual-internship

Submission for the STEM Virtual Program by Deloitte via Forage.

coding cyber-security data-analysis deloitte development forage forensics

Last synced: 23 Jan 2026

https://github.com/bishopce16/pyber_analysis

The purpose of this project was to complete an exploratory analysis and create visualizations of the 2019 ride sharing data from PyBer.

data-analysis data-visualization jupyter-notebook matplotlib pandas python

Last synced: 04 May 2026

https://github.com/dcs-training/r-visualisation-and-stats

This repository contains material from an 8-class course on data visualisation and statistics with R

data-analysis data-visualisation data-wrangling intro-to-programming r statistics

Last synced: 20 Jun 2026

https://github.com/alunera-data/alunera-data

Hi, I’m Yvonne – building data solutions at the intersection of BI, SQL & Service Management

business-intelligence data-analysis data-engineering data-science github-profile portfolio rstats sql

Last synced: 28 Jan 2026

https://github.com/valentinoli/swiss-foodprint

Project in Applied Data Analysis, EPFL 2019

carbon-emissions data-analysis diet foodprint swiss switzerland

Last synced: 24 Jan 2026

https://github.com/divyanshugit/indian-judiciary-analysis

Analysis of Indian district court data across states.

classification data-analysis

Last synced: 02 Jul 2025

https://github.com/hdgiacon/power_bi_projects

Repository containing courses, dashboards, and a project related to data analysis and Power BI.

data-analysis data-engineering data-visualization microsoft-power-bi

Last synced: 24 Jan 2026

https://github.com/anilyigitsel/tourist-attraction-data-analysis

This project analyzes tourism trends from 2017 to 2021, focusing on visitor numbers, ratings, and attraction popularity during these years.

data-analysis data-visualization excel sql tourism

Last synced: 26 Jan 2026

https://github.com/anuragmudgal96/data-warehouse-project

Designing and implementing a modern data warehouse on SQL Server, covering ETL pipelines, dimensional modeling, and analytical reporting.

data-analysis data-engineering data-warehouse datawarehousing etl etl-job etl-pipeline sql sql-server

Last synced: 09 Oct 2025

https://github.com/tasosfotiadis/time-series-forecasting-for-bitcoin

This project forecasts Bitcoin’s daily closing price using time series models. Data from Jan 2021 to Mar 2022 is processed by converting timestamps, resampling, and handling missing values. LSTM and ARIMA models are evaluated on MAE, RMSE, and MAPE, with LSTM achieving better accuracy while ARIMA is faster in training and inference.

arima bitcoin data data-analysis data-science deep-learning forecasting jupyter-notebook neural-networks python time-series

Last synced: 06 May 2026

https://github.com/srimantapal205/dataengineerwireframedesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

data-analysis data-engineering dataflow dataflow-programming datapipeline dataprocessing development visualization

Last synced: 29 Jan 2026

https://github.com/andreicirciumaru/best-of-breed

CSV fundamentals screener: schema validation + market-cap weights

csv data-analysis finance pandas python screener

Last synced: 15 Apr 2026

https://github.com/angchekar28/sales-report-power-bi

A Power BI sales report analyzing country-wise and product-wise sales trends. Includes dashboards, decomposition trees, and key influencers analysis for business insights.

dashboard data-analysis data-cleaning data-visualization powerbi sales-report

Last synced: 16 Mar 2026

https://github.com/wareflowx/excel-toolkit

A powerful command-line toolkit for Excel and CSV data manipulation, analysis, and transformation.

data-analysis data-wrangling excel pandas python uv

Last synced: 29 Jan 2026

https://github.com/badranalyst/residential-unit-prices-data-analysis-application

Python-based analysis of residential unit prices, focusing on data cleaning, visualization, and exploratory data analysis (EDA). Key features include price distribution analysis and correlation analysis between factors such as size, location, and price.

data-analysis data-visualization dataset matplotlib numpy pandas python seaborn

Last synced: 05 May 2026

https://github.com/smahala02/magnetism-lab

This repository contains Python scripts and data for analyzing inductance in toroidal coils to calculate the magnetic permeability of ferrite materials. The project helps classify materials as soft or hard magnets based on experimental data.

data-analysis inductance jupyter-notebook magnetism python toroids

Last synced: 29 Jan 2026

https://github.com/mfakhriazhar/healthcare-dashboard-project

This project is a comprehensive data analysis and visualization of healthcare data using Power BI. It focuses on understanding patient distribution, billing trends, and hospital performance through a clean and interactive dashboard.

dashboard dashboardreporting data-analysis datacleaning excel powerbi powerquery

Last synced: 30 Jan 2026

https://github.com/wojtekdomino/titanic-eda

Exploratory Data Analysis (EDA) of Titanic dataset using Pandas, Matplotlib, and Seaborn.

data-analysis eda matplotlib pandas python seaborn

Last synced: 10 Jun 2025

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 05 May 2026

https://github.com/tralahm/parliament-2017-dataset

Concise, clean datasets of the 2017 Kenyan general election results, covering the composition of the Senate and the National Assembly

csv-parsing data-analysis data-visualization datasets election-data ipynb-jupyter-notebook kaggle-dataset kenya-constituencies kenya-counties matplotlib python3 tralahtek

Last synced: 31 Jan 2026

https://github.com/jujulis18/olympicsmedalsdashboard

Olympic Dashboard – Paris 2024 is an interactive dashboard for exploring the performances of medal-winning athletes at the Paris 2024 Summer Olympic Games.

dashboard data-analysis data-visualization eda olympic python streamlit

Last synced: 31 Jan 2026

https://github.com/traore-07/fedex-sales-analysis

Analysis of FedEx sales transactions

data-analysis data-visualization sales-analysis tabeau

Last synced: 31 Jan 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/malthejorgensen/repx

Python regular expression file transformer

command-line-tool data-analysis text-processing

Last synced: 31 Jan 2026

https://github.com/mikeesto/ausvotes19

:bird: A collection of 67,284 public tweets published on the night of the 2019 Australian election

australia data-analysis data-visualization elections open-data twitter

Last synced: 06 Apr 2025

https://github.com/balajimohan18/foreign-exchange-rate-time-series-datascience-project

This project will use time series analysis to forecast the exchange rate between the euro and the US dollar. The project will use a variety of statistical techniques, such as ARIMA to model the data and forecast the exchange rate.

data-analysis data-analytics data-preprocessing data-science data-transformation data-visualization eda exploratory-data-analysis foreign-exchange-rates machine-learning model-fitting predictive-modeling python3 time-series time-series-analysis

Last synced: 14 May 2026

https://github.com/tusharpandey003/chat_analysis

Analysis of a group chat with respect to each individual member of the group

chat-analysis chat-analyzer data-analysis data-science streamlit whatsapp whatsapp-chat whatsapp-web

Last synced: 01 Feb 2026

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

Analyzes a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 01 Feb 2026

https://github.com/ludreinsalvador/life-expectancy-data-analysis

Contains Power BI dashboards analyzing global life expectancy trends, mortality rates, and health expenditures. Using a dataset sourced from Google Sheets, the project explores the impact of economic and healthcare factors on longevity.

dashboard data-analysis data-visualization healthcare-analysis life-expectancy powerbi

Last synced: 25 Feb 2026

https://github.com/rissh/titanicsurvivalpredictionusingml

Predicting Titanic passenger survival through machine learning. This project includes data preprocessing, exploratory data analysis, feature engineering, and model training using Python. 🚢

data data-analysis data-science data-visualization dataanalysis jupiter-notebook machine-learning machine-learning-algorithms machinelearning matplotlib numpy pandas prediction prediction-model python python3 seaborn tenserflow tflearn titanic

Last synced: 01 Feb 2026

https://github.com/thbaylson/datascience

All of my past data science assignments put into one singular notebook. Most of this comes from my Machine Learning course.

data-analysis data-science data-visualization decision-tree jupyter-notebook k-nearest-neighbors linear-regression machine-learning neural-network pandas-library python3 scikit-learn

Last synced: 09 May 2026

https://github.com/nagar2nd/jenson-usa-mysql-analysis

We are analyzing Jenson USA's dataset to gain valuable insights into customer behavior, staff performance, inventory management, and store operations. By crafting advanced SQL queries, the analysis explores key metrics such as product sales, customer spending, and order patterns, ultimately guiding strategic decision-making and operations.

data-analysis problem-solving sql

Last synced: 01 Feb 2026

https://github.com/ginalamp/covid_dashboard_twitternews

Corona Dashboard & report based on Twitter media outlet news.

dashboard data-analysis data-visualization twitter

Last synced: 28 Jan 2026

https://github.com/mnoalett/cscrawler

BSc degree thesis - crawler for www.couchsurfing.org

bsc-thesis couchsurfing crawler data-analysis database python

Last synced: 02 May 2026

https://github.com/suhail25/hotel-booking-analysis

Analyzed hotel booking cancellations and summarized insights for the hotel manager to increase profit by 30%. Demonstrates data exploration, cleaning, and analysis using Python and its libraries: pandas, seaborn, and matplotlib. Documented the results in a PDF report: reducing cancellations by 30% and releasing discounts for 10 days a month.

data-analysis ipynb-notebook matplotlib pandas python seaborn

Last synced: 08 Feb 2026

https://github.com/motapinto/agent-based-simulation-conquest

Agent-based simulation modeling of the Conquest game mode in Battlefield

agent-based-simulation data-analysis jade java sajas swing

Last synced: 24 Jan 2026

https://github.com/annnieglez/nlp-stock-market-and-news

This project focuses on detecting fake news from news headlines using advanced Natural Language Processing (NLP) techniques. It combines sentiment analysis with news headlines embeddings, generated from Hugging Face transformer models, to train a binary classification model that distinguishes between real and fake news.

classification-model data-analysis embeddings machine-learning machine-learning-models nlp nlp-deep-learning nlp-machine-learning python scraping-websites sentiment-analysis

Last synced: 25 Apr 2026

https://github.com/ejw-data/pandas-school

Analysis of school data with Pandas

data-analysis pandas python

Last synced: 08 May 2026

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 09 Feb 2026

https://github.com/barraharrison/airbnb-price-trends

Looking at how Airbnbs differ in price when it comes to location, room type and host activity

data-analysis data-science pandas plotly python streamlit

Last synced: 09 Feb 2026

https://github.com/27ahmad/amazon-sales-analysis

This repository contains an exploratory data analysis (EDA) and visualization project of Amazon sales data. The goal is to uncover insights and present key metrics through a Tableau dashboard.

data-analysis eda pandas python seaborn tableau

Last synced: 15 Apr 2026

https://github.com/omkar2503/credit-risk-dashboard

A SQL-based Credit Risk Scoring System visualized using Metabase

credit-risk dashboard data-analysis data-analytics metabase postgresql sql

Last synced: 01 Jul 2025

https://github.com/ludreinsalvador/global-covid-19-data-analysis

Contains Power BI dashboards that visualize and analyze global COVID-19 cases, deaths, and vaccination trends using data from the World Health Organization (WHO). The project aims to provide insights into the pandemic’s impact and vaccination progress worldwide through dynamic reports and advanced analytics.

analytics covid-19 covid19-data data data-analysis data-collection data-transformation data-visualization

Last synced: 26 Feb 2026

https://github.com/vanajmoorthy/bibliotype

Find out your bibliotype!

alpinejs data-analysis django goodreads

Last synced: 09 Feb 2026

https://github.com/aakk23/perfomance-dashboard-tableau

This Tableau dashboard provides an interactive analysis of Superstore sales data, covering key metrics like sales, profit, orders, and customer trends. It helps visualize business performance across product categories, customer segments, and geographic regions.

data-analysis data-visualization superstore-data-analysis tableau tableau-dashboards

Last synced: 10 Feb 2026

https://github.com/bcko/ud-da-eda-redwinequality

Udacity Data Analyst Nanodegree Project : Exploratory Data Analysis : Red Wine Quality dataset

data-analysis data-analyst-nanodegree exploratory-data-analysis r-markdown rstudio udacity udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 10 Feb 2026

https://github.com/nickenshidqia/startup-venture-funding-dashboard-data-analysis

The Startup Venture Funding Dashboard is a comprehensive visual representation of the dynamic landscape of startup funding, providing valuable insights into the top startups, funding round types, markets, startup statuses, and investor details.

dashboard data-analysis tableau tableau-dashboards

Last synced: 11 Feb 2026

https://github.com/dhruwsunita/car-sales-dashboard

Car sales dashboard using Tableau visualization tool.

car-sales data-analysis data-visualization excel kpis tableau

Last synced: 27 Feb 2026

https://github.com/fatihilhan42/eda-spacex-launches-falcon9-and-falcon-heavy

In this project, we analyze the space flight data of SpaceX's Falcon 9 rocket.

data-analysis data-science data-visualization eda elonmusk spacex

Last synced: 23 Mar 2025

https://github.com/deeksha-dhawan/pizza-outlet-analysis-using-sql

This project analyzes pizza sales data to gain insights into customer behavior and revenue patterns. Key analyses include customer insights, popular pizza types and sizes, revenue generation, and order trends. The findings help optimize menu offerings, staffing, and marketing strategies to boost overall business performance.

coding-challenge data-analysis data-science microsoft my portfolio-project programming project projects sql sql-analysis sql-project sqlproject sqlserver

Last synced: 23 Mar 2025

https://github.com/iliyasalve/tiktok_claim_classification_model

Develop a predictive model for classifying videos with claims to reduce the backlog of user reports and optimize the content moderation process.

data-analysis machine-learning python regression-models tiktok

Last synced: 21 May 2026

https://github.com/bala-1409/sql-projects

This repository contains Structured Query Language (SQL) scripts for various projects, covering data cleaning, data pre-processing, data processing, data transformation, and gaining insights through queries.

data-analysis data-mining data-science data-transformation database eda etl-framework exploratory-data-analysis microsoft-sql-server query-language sql sql-server sql-server-database sql-server-management-studio

Last synced: 27 Feb 2026

https://github.com/thanaraklee/exploring-and-analyzing-data-in-oracle-database

This project focuses on data analysis using SQL with Oracle Database 21c. It aims to build familiarity with data management and data analysis using SQL commands and Oracle Database 21c.

data-analysis oracle-database sql sql-developer

Last synced: 12 Feb 2026

https://github.com/ryuzen6/bangalore-real-estate-price-prediction

This is a data science project that predicts the cost of real estate in Bangalore. Requirements: Jupyter Notebook (for data cleaning and building the linear regression model with various Python libraries), PyCharm (Python IDE for creating the Python Flask server), and Visual Studio Code (to create the UI with HTML, CSS, and JavaScript).

css3 data-analysis data-science html5 javascript jupyter-notebook machine-learning python3

Last synced: 06 May 2026

https://github.com/syarwinaaa09/exploring-nyc-public-school-test-result-scores

📊 analyzing NYC school test scores with python 🐍 to spot top performers 🏆 & trends 📈

data-analysis education pandas python visualization

Last synced: 06 May 2026

https://github.com/koldlight/bluetab-data-science-2017

Repository for sharing material and publishing the challenges

course data-analysis data-science exercises

Last synced: 12 Feb 2026