An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/alinababer/covid19-timeseries-cases-and-deaths-forecasting-

This study is based on confirmed COVID-19 cases and deaths collected from Pakistan. We apply AI-based forecasting models such as ARIMA, LSTM, Prophet, and VAR. Results demonstrate the promising potential of time series models in forecasting COVID-19 cases and highlight their superior performance compared to the LSTM.

arima covid-19 data-analysis data-science data-visualization fbprophet forecasting lstm rnn time-series var vectorautoregression

Last synced: 19 Jun 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/an4pdm/relatorio-de-vendas

This project was built with the tools offered by Power BI in order to improve my knowledge of ETL. The data used came from the "Kaggle" website.

data-analysis data-visualization database etl powerbi

Last synced: 20 Jun 2026

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to provide English learners with data that allows them to make a data-driven guess when they encounter words whose stress placement they are unsure of.

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 20 Jun 2026

https://github.com/dcs-training/intro-to-statistics

Intro to Statistics workshop. In this repo, you are going to find the code and files we are going to use for the practical part of the workshop, together with the ppt associated with this training. Go to the readme file

data-analysis data-visualisation data-wrangling r statistics

Last synced: 20 Jun 2026

https://github.com/evanmathew/northwind-traders

SQL-powered analysis of sales, employee performance, and customer behavior using PostgreSQL window functions. This project uncovers key business insights to optimize decision-making.

case-study data-analysis jupyter-notebook northwind-traders postgresql python-postgresql sql

Last synced: 20 Jun 2026

https://github.com/jayavarshini-jayakumaran/nba-exploratory-data-analysis

A data analytics project that explores NBA game and player data using Python and Power BI. Features data preprocessing, EDA, feature engineering, and an interactive dashboard for visualizing team and player performance trends.

data-analysis data-visualization exploratory-data-analysis powerbi python3

Last synced: 20 Jun 2026

https://github.com/haseebn19/urban-housing-demand

A full-stack web application for visualizing housing and labour market data

data-analysis data-visualization docker full-stack gradle statistics web webapp

Last synced: 22 Jun 2026

https://github.com/dcs-training/datavisualisationwithr2021

Data Visualisation with R Course (delivered by the Centre in October/November 2021). This workshop focuses on good practice for creating graphs with R and RStudio. Go to the readme file

data-analysis data-visualisation data-wrangling r

Last synced: 23 Jun 2026

https://github.com/emaleckova/emaleckova.github.io

My personal website created with Quarto

biology data-analysis data-viz quarto r

Last synced: 23 Jun 2026

https://github.com/ladaegorova18/data_analysis

Learning the basics of data analysis in Python

analytics data-analysis data-visualization steam-games

Last synced: 24 Jun 2026

https://github.com/vbhvsingh0/coulombic_dyn_formaltetra

The Python code simulates a formaldehyde tetra-cation molecule using Coulombic forces

data-analysis physics-simulation python shell-scripting

Last synced: 24 Jun 2026

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/chdre/data-analyzer

A small package to analyze and preprocess data.

data-analysis python

Last synced: 28 Jun 2026

https://github.com/tyriek-cloud/nyc-mobility-survey-analysis

An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.

aws aws-athena aws-glue aws-glue-crawler aws-quicksight aws-s3 data-analysis data-engineering etl-pipeline json python

Last synced: 09 May 2026

https://github.com/dcs-training/spatial_dynamics

Use of QGIS and R to analyse first and second order geospatial effects. Go to the Readme file

data-analysis geographical-data gis qgis r statistics

Last synced: 23 Oct 2025

https://github.com/jigyasag18/bird-strikes-in-aviation-project

This project analyzes over a decade of U.S. bird strike data (2000–2011) to evaluate safety risks, damage trends, and cost implications in aviation. Using PostgreSQL for database management and Power BI for dashboard visualization, it uncovers critical insights into when, where, and how wildlife impacts aircraft. Key findings inform strategically.

bird-strike-prevention bird-strike-prevention-in-real-airport data data-analysis data-analysis-project data-visualisation data-visualization data-visualization-project data-visualizations database dataset dax-query postgresql postgresql-database powerbi powerbi-desktop powerbi-report powerbi-visuals sql sql-database

Last synced: 09 May 2026

https://github.com/changyeop-yang/study-datasciencefoundation

Big data science and analytics play a major role in this decade. Cleaning and preparing your data for analysis is still a challenge, as are performing basic visualization, modeling your data, curve-fitting your data, and finally presenting your findings to wow the audience.

data-analysis ios kyungpook-national-university swift

Last synced: 23 Oct 2025

https://github.com/albertobarrago/sentinel

A contribution to the research of Corrado Malanga and Filippo Biondi

data-analysis sar

Last synced: 24 Oct 2025

https://github.com/sugumarsrinivasan/sql-datawarehouse-project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and data analytics.

data-analysis data-analytics data-engineering data-lake data-science data-warehouse datawarehousing etl etl-pipeline medallion-architecture sql sql-query sql-server

Last synced: 19 Jun 2026

https://github.com/brianlesko/r_data_science_stat5730

Written by Brian Lesko, this repository contains R scripts demonstrating data science topics largely originating from study at Ohio State. Contents are written in RStudio using R Markdown files. As of 1/21/23, future projects concerning data science, statistics, and machine learning will be in Python in my machine learning repository.

data data-analysis flight-data ggplot2 olympics-data r-markdown tidyverse

Last synced: 23 Jan 2026

https://github.com/satyacoder29/crm-analytics-power-bi

CRM Analytics Dashboard – An interactive dashboard using Tableau, SQL, and Salesforce CRM Analytics (CRMA) to analyze sales performance, customer segmentation, and churn prediction. Features automated ETL pipelines, predictive analytics, and real-time insights for data-driven decision-making. 🚀📊

advanced-excel data-analysis data-cleaning data-collection data-transformation data-visualization matplotlib numpy pandas powerbi python seaborn sql tableau

Last synced: 14 Apr 2026

https://github.com/alessandroryo/bike-rental-data-analysis

A data analysis project focused on understanding and predicting bike rental patterns. This project utilizes data processing, visualization, and predictive modeling techniques to gain insights into bike rental usage, fulfilling the final submission requirement for Dicoding Indonesia's Data Analysis course.

bike-rental data-analysis data-visualization jupyter-notebook machine-learning python streamlit

Last synced: 09 Apr 2026

https://github.com/janiavdv/data-spirits

Analysis of alcohol and sports betting data, including a correlation investigation.

correlation data-analysis data-science machine-learning

Last synced: 11 Nov 2025

https://github.com/psychelzh/cogstruct-old

Data Analysis on Cognitive Structure

cognition data-analysis intelligence psychology

Last synced: 25 Oct 2025

https://github.com/shrutiijoshi/apple_greenhouse_gas_emissions

A breakdown of Apple's greenhouse gas emissions from 2015 to 2022 as they aim to reach net zero emissions by 2030.

dashboard data-analysis data-visualization powerbi

Last synced: 06 Feb 2026

https://github.com/ljadhav25/linear_regression_data_science

Linear regression analysis is used to predict the value of a variable based on the value of another variable. The variable you want to predict is called the dependent variable. The variable you are using to predict the other variable's value is called the independent variable.

data-analysis data-science linear-regression machine-learning

Last synced: 26 Oct 2025

https://github.com/srinibas-masanta/infosys-springboard-internship

An interactive Power BI dashboard developed during my Infosys Springboard Internship to visualize Indian election trends. It integrates historical and live API data to analyze vote shares, turnout patterns, and demographic insights across constituencies, helping news agencies report results in real time.

dashboard data-analysis data-cleaning data-collection data-visualization dax-functions powerbi

Last synced: 25 Feb 2026

https://github.com/vishalsiingh/deloitte-virtual-internship

Submission for the STEM Virtual Program by Deloitte via Forage.

coding cyber-security data-analysis deloitte development forage forensics

Last synced: 23 Jan 2026

https://github.com/campagnucci/exercitando_pandas

Practical pandas exercises using open education data from São Paulo

data-analysis data-science education-data exercises pandas-tutorial

Last synced: 28 Jan 2026

https://github.com/codewithjazmine/bookbot

Python command-line tool that analyzes text files for word count and character statistics

command-line-tool data-analysis learning-project python text-analysis

Last synced: 23 Jan 2026

https://github.com/code-jl/nfl-kicker-predictor

A sophisticated Python application that provides real-time NFL kicker statistics and performance analysis with an intuitive graphical interface.

beautifulsoup data-analysis data-visualization espn football gui nfl prediction python real-time-analytics real-time-data sport-analytics sports-data statistics tkinter web-scraping

Last synced: 01 Jun 2026

https://github.com/alunera-data/alunera-data

Hi, I’m Yvonne – building data solutions at the intersection of BI, SQL & Service Management

business-intelligence data-analysis data-engineering data-science github-profile portfolio rstats sql

Last synced: 28 Jan 2026

https://github.com/garcane/unicorn-companies-analysis

Tracking unicorn startups (valued at $1B+) provides valuable insights for investors and analysts to identify high-growth industries and emerging trends.

data-analysis exploratory-data-analysis financial-analysis investor postgresql sql

Last synced: 24 Jan 2026

https://github.com/valentinoli/swiss-foodprint

Project in Applied Data Analysis, EPFL 2019

carbon-emissions data-analysis diet foodprint swiss switzerland

Last synced: 24 Jan 2026

https://github.com/rahulchouhan1/sql-data-warehouse-project

Building a modern data warehouse with SQL Server, including ETL Processes, data modeling, and analytics.

data-analysis data-cleaning data-engineering data-science data-warehouse datascience etl etl-pipeline sql sql-query sql-server

Last synced: 24 Jan 2026

https://github.com/mysto-007/cyclistic-bike-share-analysis

Analyzed the dataset of Cyclistic bike-share (Capstone project for Google Data Analytics Specialization)

bigquery data-analysis excel ms-sql-server sql tableau tableau-public

Last synced: 16 Mar 2026

https://github.com/annnieglez/fraud-detection-eda

Fraud Detection - Exploratory Data Analysis (EDA). Analyzing financial transactions to detect fraud patterns using Python and Tableau. Libraries: Pandas, Seaborn and Matplotlib. Key Focus: Data cleaning, fraud trends, high-risk transactions, time-based patterns

data-analysis data-science data-visualization eda fraud-detection fraud-prevention matplotlib seaborn

Last synced: 28 Jan 2026

https://github.com/tasosfotiadis/time-series-forecasting-for-bitcoin

This project forecasts Bitcoin’s daily closing price using time series models. Data from Jan 2021 to Mar 2022 is processed by converting timestamps, resampling, and handling missing values. LSTM and ARIMA models are evaluated on MAE, RMSE, and MAPE, with LSTM achieving better accuracy while ARIMA is faster in training and inference.

arima bitcoin data data-analysis data-science deep-learning forecasting jupyter-notebook neural-networks python time-series

Last synced: 06 May 2026

https://github.com/anurag-ghosh-12/library_management_system_sql

This project showcases the development of a comprehensive Library Management System utilizing Structured Query Language (SQL). It demonstrates a practical application of relational database principles to efficiently manage library resources, member information, and borrowing/returning transactions.

data-analysis data-visualisation dbms-project sql

Last synced: 29 Jan 2026

https://github.com/angchekar28/sales-report-power-bi

A Power BI sales report analyzing country-wise and product-wise sales trends. Includes dashboards, decomposition trees, and key influencers analysis for business insights.

dashboard data-analysis data-cleaning data-visualization powerbi sales-report

Last synced: 16 Mar 2026

https://github.com/engineertolulope/us_states_living_ranking_analysis

Python script for analyzing and ranking U.S. states based on factors like cost of living, tax burden, diversity, crime rates, and climate. Uses weighted criteria to identify the best states to live in according to these metrics. Ideal for decision-making on relocation.

data-analysis data-science linear-regression machine-learning python scikit-learn

Last synced: 29 Jan 2026

https://github.com/wareflowx/excel-toolkit

A powerful command-line toolkit for Excel and CSV data manipulation, analysis, and transformation.

data-analysis data-wrangling excel pandas python uv

Last synced: 29 Jan 2026

https://github.com/abhi227070/medical-insurance-predictor

This project implements a machine learning regression model to predict medical insurance charges based on user-provided details such as smoking status, number of children, gender, and age. The user-friendly interface allows individuals to estimate their average insurance price before purchasing medical insurance.

data-analysis machine-learning machine-learning-algorithms machinelearning python3 regression-models

Last synced: 04 May 2026

https://github.com/isaqueiros/newspapersoldout-predictions-logistic_regression

This notebook is a study of the application of sklearn Logistic Regression model and analysis of metric quality with a focus on the impact of imbalanced data. The problem presented is the analysis of sales of newspapers of a local stand in order to classify the probability of the newspaper being Sold Out or Not, given a set of features.

data-analysis data-imbalance data-science logistic-regression machine-learning python sklearn-library sklearn-logistic-regression

Last synced: 18 Apr 2026

https://github.com/joannescode/regex_with_py

Learning by practicing with Regex (Python)

data-analysis python3 regex

Last synced: 30 Jan 2026

https://github.com/surajwate/datalab

DataLab is a versatile toolkit designed to simplify data exploration, analysis, and visualization for data scientists.

data-analysis data-science python visualization

Last synced: 30 Jan 2026

https://github.com/ljadhav25/decision-tree-random-forest-algorithm-data-science-

This repository contains an implementation of decision tree and random forest algorithms from scratch in Python. Decision trees and random forests are popular machine learning algorithms used for classification and regression tasks. The goal of this project is to provide a clear and understandable implementation of these algorithms

data-analysis data-science decision-trees machine-learning-algorithms matplotlib numpy pandas python random-forest-classifier

Last synced: 15 Apr 2026

https://github.com/manishabarse/hr_data_analysis

Used Microsoft SQL Server Management Studio and Power BI

data-analysis powerbi sql ssms

Last synced: 30 Jan 2026

https://github.com/jcaperella29/jc_bioinformatics_hub

A personal hub to showcase my bioinformatics applications including RNA-Seq, ATAC-Seq, and miRNA-Seq analysis tools. Powered by simple HTML, CSS, and JavaScript with a biotech-themed design.

atac-seq bioinformatics biotech data-analysis github-pages portal rna-seq webapp

Last synced: 25 Feb 2026

https://github.com/aavishkarmahajan/sql

SQL code assignments and practice questions from SQL courses, SQL data analysis

data-analysis sql sql-server

Last synced: 07 Feb 2026

https://github.com/auliannee/new-york-uber-pickups-analysis

This repository contains projects related to data collection, quality checking, manipulation, analysis, and visualization.

data-analysis data-science ipython-notebook jupyter-notebook python

Last synced: 07 Feb 2026

https://github.com/tralahm/parliament-2017-dataset

Concise, clean datasets of the 2017 Kenyan General Election results for the composition of the Senate and National Assembly

csv-parsing data-analysis data-visualization datasets election-data ipynb-jupyter-notebook kaggle-dataset kenya-constituencies kenya-counties matplotlib python3 tralahtek

Last synced: 31 Jan 2026

https://github.com/jofaval/titanic-disaster

Data Analysis of the famous Titanic Disaster in 1912 with Machine Learning

classification data-analysis data-science data-visualization google-colab kaggle machine-learning python scikit-learn

Last synced: 15 Apr 2026

https://github.com/jujulis18/olympicsmedalsdashboard

Olympic Dashboard – Paris 2024 is an interactive dashboard for exploring the performances of medal-winning athletes at the Paris 2024 Summer Olympic Games.

dashboard data-analysis data-visualization eda olympic python streamlit

Last synced: 31 Jan 2026

https://github.com/amishidesai04/flipkart-mobile-sales-analysis

Flipkart Mobile Sales Analysis is a Tableau project that visualizes mobile sales data from Flipkart. It highlights trends in brand performance, pricing, ratings, and customer preferences. The interactive dashboard helps users explore key insights for data-driven decisions in e-commerce and retail.

dashboard data-analysis data-visualization storyboard tableau

Last synced: 31 Jan 2026

https://github.com/traore-07/fedex-sales-analysis

Analysis of the FedEx Sales Transaction

data-analysis data-visualization sales-analysis tabeau

Last synced: 31 Jan 2026

https://github.com/cca/panopto-session-data

analyzing Panopto session data for retention purposes

data-analysis ipython-notebook video

Last synced: 07 Feb 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/malthejorgensen/repx

Python regular expression file transformer

command-line-tool data-analysis text-processing

Last synced: 31 Jan 2026

https://github.com/gastonstat/stat133

STAT 133: Concepts in Computing with Data

data-analysis data-science data-visualization r-programming syllabus

Last synced: 25 Feb 2026

https://github.com/alex-pierron/ekip-enedis-genai

Repository for the team "Ekip" during the H-GenAI Hackathon 2025 organized at SIA Partners, Paris, France

amazon-nova artificial-intelligence aws aws-lambda data-analysis database generative-ai mistral nlp

Last synced: 15 Apr 2026

https://github.com/tusharpandey003/chat_analysis

Analysis of a group chat with respect to the individual members of the group

chat-analysis chat-analyzer data-analysis data-science streamlit whatsapp whatsapp-chat whatsapp-web

Last synced: 01 Feb 2026

https://github.com/axsk/geekgraph

parse, cluster and visualize boardgamegeek.com user profiles

data-analysis scraper

Last synced: 01 Feb 2026

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

analyzing a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 01 Feb 2026

https://github.com/tapas-gope/global-superstore-sales

This repository contains a Power BI dashboard designed to provide comprehensive insights into sales performance across various regions, segments, and products. The dashboard utilizes a variety of visualizations, including bar charts, line charts, maps, and tables, to effectively communicate key metrics and trends.

business-intelligence data-analysis data-modeling data-visualization financial-reporting powerbi sales-analysis

Last synced: 07 Feb 2026

https://github.com/ludreinsalvador/life-expectancy-data-analysis

Contains Power BI dashboards analyzing global life expectancy trends, mortality rates, and health expenditures. Using a dataset sourced from Google Sheets, the project explores the impact of economic and healthcare factors on longevity.

dashboard data-analysis data-visualization healthcare-analysis life-expectancy powerbi

Last synced: 25 Feb 2026

https://github.com/asghar-rizvi/world-energy-consumption-analysis-1965-2023-

An in-depth analysis of global energy consumption trends from 1965 to 2023, using data from various countries and regions.

data-analysis data-analysis-python data-science python real-world-data real-world-data-analysis real-world-problem-solving real-world-project visulaization

Last synced: 15 Apr 2026

https://github.com/rohitdusane/healthcare-analytics

Power BI Dashboard is designed to provide valuable insights into patient waiting times across outpatient and inpatient healthcare services. It offers a comprehensive analysis of key factors influencing wait lists, including Age Profile, Specialty, Time Bands, and Patient Case Types.

data-analysis data-visualization dax dax-query healthcare-analysis powerbi-report

Last synced: 01 Feb 2026

https://github.com/vishnu-vamshii/data-science-jobs-salaries

Created an interactive dashboard to analyze data science job salaries by region of the world, experience level, and type of employment, with average salaries in USD and a geographical visual.

data-analysis data-science data-visualization tableau tableau-dashboard

Last synced: 01 Feb 2026

https://github.com/tameronline/ai-financial-analyst

AI-driven financial analyst system utilizing LangChain and Ollama for real-time stock analysis, market trends, and financial insights.

ai data-analysis finance financial-analysis langchain machine-learning nlp ollama stock-market

Last synced: 02 Feb 2026

https://github.com/khanovico/python-stock-analyzer

This is a web app implemented in Python with several data science frameworks, enabling online stock trend analysis.

amcharts-js-charts data-analysis data-visualization flask javascript pandas python scikit-learn

Last synced: 02 Feb 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 classes in 9 parts. Advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/shubham200137/customer-churn-analysis

In this case study, we analyze customer churn for a telecom company serving Southern California. The company faces increased competition and wants to retain customers by understanding the reasons for churn. Our objectives include improving service quality, identifying churn factors, pinpointing attractive services, and retaining high LTV customers.

data-analysis data-visualization numpy-python pandas-python sqlite tableau

Last synced: 15 Apr 2026

https://github.com/devbigboy/excel-power-query-get-transform

Power Query is a feature in Excel that allows you to quickly import data from multiple sources and easily clean, transform, and reshape it to suit your needs.

data-analysis data-science excel

Last synced: 08 Feb 2026

https://github.com/suhail25/hotel-booking-analysis

Analyzed hotel booking cancellations and summarized insights for the hotel manager to increase profit by 30%. Demonstrated data exploration, cleaning, and analysis using Python and its libraries: pandas, seaborn, and matplotlib. Documented the results in a PDF report: reducing cancellations by 30% and releasing discounts for 10 days a month.

data-analysis ipynb-notebook matplotlib pandas python seaborn

Last synced: 08 Feb 2026

https://github.com/mrgeislinger/bike-data-exploration

Data exploration of bike-related data

bicycle bike data-analysis data-science

Last synced: 08 Feb 2026

https://github.com/sroman0/data-analytics

Data Analytics Exercises is a collection of comprehensive university-level exercises aimed at enhancing skills in data analytics. The repository includes practical notebooks covering data manipulation, exploratory data analysis (EDA), statistical analysis, data visualization, and machine learning fundamentals.

data-analysis data-analytics data-science data-visualization education exercises exploratory-data-analysis hands-on-practice jupyter-notebook machine-learning python statistics

Last synced: 15 Apr 2026

https://github.com/grindelfp/datasets-analysis

A Machine Learning and Data Analysis course task dedicated to practicing data normalization and preprocessing.

data-analysis datasets ipynb mlda

Last synced: 05 Mar 2026

https://github.com/siddhant2105s/airline-performance-analysis-dashboard

Enhancing Airline Performance Analysis for the Department of Transport

data-analysis data-visualization tableau

Last synced: 08 Feb 2026