An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/vishalsiingh/deloitte-virtual-internship

Submission for the STEM Virtual Program by Deloitte via Forage.

coding cyber-security data-analysis deloitte development forage forensics

Last synced: 23 Jan 2026

https://github.com/limatix/limatix

Limatix datacollect and processtrak tools

data-analysis python scientific-workflows

Last synced: 23 Jan 2026

https://github.com/9dl/usbfalcon

Automatically copies files from plugged USB drives to a specified location, enabling quick data retrieval for analysis.

automation data-analysis data-retrieval ethical-hacking file-copying usb

Last synced: 27 Oct 2025

https://github.com/aneeshmurali-n/global-superstore-sales-dashboard---power-bi-stunning-dark-theme

This Power BI dashboard provides a comprehensive view of sales data, enabling users to analyze sales trends, identify top-performing regions, and gain insights into customer behavior.

dark-theme dashboard data-analysis data-science data-visualization powerbi salesdashboard

Last synced: 28 Jan 2026

https://github.com/leftcoastnerdgirl/excel_crowdfunding_analysis

This project demonstrates the use of MS Excel for data cleansing & formatting to prepare for data analysis and visualization.

bar-charts conditional-formatting data-analysis data-analytics data-analytics-excel data-preparation data-preprocessing data-visualization excel line-graph

Last synced: 06 Feb 2026

https://github.com/wassimhd/pwc-switzerland-power-bi-in-data-analytics-virtual-case-experience

This project builds a foundation in data analysis and Power BI, as part of the PwC virtual internship.

data-analysis data-visualization datastorytelling powerbi

Last synced: 28 Jan 2026

https://github.com/tasosfotiadis/time-series-forecasting-for-bitcoin

This project forecasts Bitcoin’s daily closing price using time series models. Data from Jan 2021 to Mar 2022 is processed by converting timestamps, resampling, and handling missing values. LSTM and ARIMA models are evaluated on MAE, RMSE, and MAPE, with LSTM achieving better accuracy while ARIMA is faster in training and inference.

arima bitcoin data data-analysis data-science deep-learning forecasting jupyter-notebook neural-networks python time-series

Last synced: 06 May 2026
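
The three error metrics named in the entry above (MAE, RMSE, and MAPE) can be sketched in plain Python. This is a minimal illustration, not code from the repository; the function names and the sample numbers are made up for the example, and MAPE as written assumes no actual value is zero.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (assumes no actual value is 0)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical closing prices and forecasts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 380.0]

print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, which is one reason the two are often reported together.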

https://github.com/srimantapal205/dataengineerwireframedesigns

Data Engineer Wireframe Designs are essential for planning and visualizing data pipelines, architecture, and workflows before implementation.

data-analysis data-engineering dataflow dataflow-programming datapipeline dataprocessing development visualization

Last synced: 29 Jan 2026

https://github.com/andreicirciumaru/best-of-breed

CSV fundamentals screener: schema validation + market-cap weights

csv data-analysis finance pandas python screener

Last synced: 15 Apr 2026
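
As a rough sketch of what "schema validation + market-cap weights" could look like for a CSV input, the example below checks for required columns and then normalizes market caps so the weights sum to 1. The column names `ticker` and `market_cap`, the function names, and the sample rows are assumptions made for illustration; the repository's actual schema and code may differ.

```python
import csv
import io

# Hypothetical schema: the real screener may require different columns.
REQUIRED_COLUMNS = {"ticker", "market_cap"}

def load_rows(csv_text):
    """Parse CSV text and fail early if a required column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)

def market_cap_weights(rows):
    """Weight each ticker by its share of the total market cap."""
    caps = {row["ticker"]: float(row["market_cap"]) for row in rows}
    total = sum(caps.values())
    if total <= 0:
        raise ValueError("total market cap must be positive")
    return {ticker: cap / total for ticker, cap in caps.items()}

# Made-up sample data.
sample = "ticker,market_cap\nAAA,300\nBBB,100\nCCC,100\n"
weights = market_cap_weights(load_rows(sample))
print(weights)
```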

https://github.com/smahala02/magnetism-lab

This repository contains Python scripts and data for analyzing inductance in toroidal coils to calculate the magnetic permeability of ferrite materials. The project helps classify materials as soft or hard magnets based on experimental data.

data-analysis inductance jupyter-notebook magnetism python toroids

Last synced: 29 Jan 2026
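
To illustrate how a measured inductance can yield a permeability, the sketch below uses the textbook formula for an ideal toroid with a rectangular cross-section, L = μ0 · μr · N² · h · ln(b/a) / (2π), solved for μr. Whether the repository uses this exact geometry is an assumption; the function names and the numbers are illustrative only.

```python
import math

MU_0 = 4e-7 * math.pi  # vacuum permeability in H/m (classical value)

def toroid_inductance(mu_r, turns, height_m, inner_radius_m, outer_radius_m):
    """Inductance of an ideal rectangular-cross-section toroid, in henries."""
    geometry = turns ** 2 * height_m * math.log(outer_radius_m / inner_radius_m)
    return MU_0 * mu_r * geometry / (2 * math.pi)

def relative_permeability(inductance_h, turns, height_m, inner_radius_m, outer_radius_m):
    """Invert the formula above to recover mu_r from a measured inductance."""
    geometry = turns ** 2 * height_m * math.log(outer_radius_m / inner_radius_m)
    return 2 * math.pi * inductance_h / (MU_0 * geometry)

# Hypothetical coil: 10 turns, 1 cm tall, 1 cm inner and 2 cm outer radius.
coil = dict(turns=10, height_m=0.01, inner_radius_m=0.01, outer_radius_m=0.02)
inductance = toroid_inductance(2000.0, **coil)
print(inductance, relative_permeability(inductance, **coil))
```

The round trip (permeability to inductance and back) is only a self-consistency check on the formula; real measurements would also carry uncertainty from winding geometry and frequency dependence.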

https://github.com/edumoraes1/comissao-reduzida

Audience segmentation built in SQL for enjoei's new reduced-commission feature

bq data-analysis salesforce sql

Last synced: 06 Feb 2026

https://github.com/surajwate/datalab

DataLab is a versatile toolkit designed to simplify data exploration, analysis, and visualization for data scientists.

data-analysis data-science python visualization

Last synced: 30 Jan 2026

https://github.com/mfakhriazhar/healthcare-dashboard-project

This project is a comprehensive data analysis and visualization of healthcare data using Power BI. It focuses on understanding patient distribution, billing trends, and hospital performance through a clean and interactive dashboard.

dashboard dashboardreporting data-analysis datacleaning excel powerbi powerquery

Last synced: 30 Jan 2026

https://github.com/aygp-dr/values-compass

Tools for exploring and analyzing Anthropic's Values-in-the-Wild dataset for AI ethics research

ai-ethics anthropic-claude data-analysis nlp values

Last synced: 25 Feb 2026

https://github.com/jaseel342/ecommerce_sales_dashboard

The E-commerce Sales Dashboard project offers a comprehensive view of e-commerce sales performance using interactive Power BI dashboards. It focuses on key metrics like YTD Sales, YTD Profit, YTD Profit Margin, and Quantity of Products sold, analyzing data by product categories, states, and regions.

data-analysis data-modelling dax-expression excel power-query powerbi visualization

Last synced: 07 Feb 2026

https://github.com/auliannee/new-york-uber-pickups-analysis

This repository contains projects covering data collection, quality checks, manipulation, analysis, and visualization.

data-analysis data-science ipython-notebook jupyter-notebook python

Last synced: 07 Feb 2026

https://github.com/tralahm/parliament-2017-dataset

Concise, clean data sets of the 2017 Kenyan General Election results, covering members of the Senate and the composition of the National Assembly

csv-parsing data-analysis data-visualization datasets election-data ipynb-jupyter-notebook kaggle-dataset kenya-constituencies kenya-counties matplotlib python3 tralahtek

Last synced: 31 Jan 2026

https://github.com/amishidesai04/flipkart-mobile-sales-analysis

Flipkart Mobile Sales Analysis is a Tableau project that visualizes mobile sales data from Flipkart. It highlights trends in brand performance, pricing, ratings, and customer preferences. The interactive dashboard helps users explore key insights for data-driven decisions in e-commerce and retail.

dashboard data-analysis data-visualization storyboard tableau

Last synced: 31 Jan 2026

https://github.com/allanotieno254/bank-loan-analysis-dashboard-power-bi

An interactive Power BI dashboard that analyzes bank loan data to provide insights into approval trends, default risks, and customer profiles. Designed to assist financial institutions in making data-driven lending decisions.

bank-loans business-intelligence dashboard data-analysis financial-analysis power-bi risk-assessment

Last synced: 31 Jan 2026

https://github.com/alex-pierron/ekip-enedis-genai

Repository for the team "Ekip" during the H-GenAI Hackathon 2025 organized at SIA Partners, Paris, France

amazon-nova artificial-intelligence aws aws-lambda data-analysis database generative-ai mistral nlp

Last synced: 15 Apr 2026

https://github.com/ajmannust41288/data-analyst

Data analyst and Microsoft Professional expert: Power BI Desktop, Tableau, and dashboards, with use of ChatGPT-4 AI

business-analytics data-analysis data-analyst data-analytics eda

Last synced: 01 Feb 2026

https://github.com/bineet-ratna-shakya/data-science-salary-analysis

Analysis of a dataset containing salaries of data science professionals from 2020 to 2023.

data-analysis data-science data-visualization jupyter numpy pandas python

Last synced: 01 Feb 2026

https://github.com/tameronline/ai-financial-analyst

AI-driven financial analyst system utilizing LangChain and Ollama for real-time stock analysis, market trends, and financial insights.

ai data-analysis finance financial-analysis langchain machine-learning nlp ollama stock-market

Last synced: 02 Feb 2026

https://github.com/vladimiracunadev-create/python-data-science-program

Python Data Science Program: 197 lessons in 9 parts. Advanced syllabus derived from Géron, VanderPlas, Huyen, ISLP, and Barocas/Hardt/Narayanan. A personal resource for learning, teaching, and continuous improvement.

bootcamp data-analysis data-science education jupyter machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 01 Jun 2026

https://github.com/shubham200137/customer-churn-analysis

In this case study, we analyze customer churn for a telecom company serving Southern California. The company faces increased competition and wants to retain customers by understanding the reasons for churn. Our objectives include improving service quality, identifying churn factors, pinpointing attractive services, and retaining high LTV customers.

data-analysis data-visualization numpy-python pandas-python sqlite tableau

Last synced: 15 Apr 2026

https://github.com/sroman0/data-analytics

Data Analytics Exercises is a collection of comprehensive university-level exercises aimed at enhancing skills in data analytics. The repository includes practical notebooks covering data manipulation, exploratory data analysis (EDA), statistical analysis, data visualization, and machine learning fundamentals.

data-analysis data-analytics data-science data-visualization education exercises exploratory-data-analysis hands-on-practice jupyter-notebook machine-learning python statistics

Last synced: 15 Apr 2026

https://github.com/mdaltamashalam/uber-fare-prediction-models

Predicts the fare amount of Uber rides based on various factors such as pickup/drop-off coordinates, passenger count, and trip distance.

catboost data-analysis data-cleaning data-visualization lgbm-regressor machine-learning matplotlib numpy pandas python random-forest regression-models skit-learn xgboost-algorithm

Last synced: 26 Feb 2026
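
When trip distance has to be derived from pickup and drop-off coordinates, as in the entry above, a common choice is the haversine great-circle distance. The sketch below is a generic illustration rather than the repository's feature code; the function name and the mean Earth radius of 6371 km are assumptions of the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator.
print(haversine_km(0.0, 0.0, 0.0, 1.0))
```

The straight-line distance ignores the road network, so it is best treated as one input feature among several and not as the actual distance driven.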

https://github.com/josericodata/statisticsapp

Interactive statistics analysis app using Python and Streamlit. Perform key statistical tests, visualise distributions, and explore data with ease.

alpha-value chi-square-test confidence-intervals data-analysis dublin dublin-ireland europe hyphotesis-tests ireland normal-distribution null-hypothesis p-value portfolio python statistics streamlit t-test tech ubuntu z-test

Last synced: 26 Feb 2026

https://github.com/themihirmathur/uber-data-analytics

The goal of this project is to perform comprehensive data analytics on Uber trip data using a modern data engineering stack on Google Cloud Platform (GCP).

bigquery data-analysis data-engineering etl-pipeline google-cloud-platform looker python

Last synced: 09 Feb 2026

https://github.com/barraharrison/airbnb-price-trends

Looking at how Airbnbs differ in price when it comes to location, room type and host activity

data-analysis data-science pandas plotly python streamlit

Last synced: 09 Feb 2026

https://github.com/naninsv/apple-retail-sales-warranty-analysis

An advanced SQL project analyzing over 1 million rows of Apple retail sales data to solve real-world business problems, optimize query performance, and extract actionable insights. The analysis includes sales trends, warranty claims, product performance, and year-over-year growth

business-intelligence data-analysis data-science etl insights retailanalytics sql sqladvance

Last synced: 26 Feb 2026

https://github.com/ninadpatil09/heart_disease_detection_analysis

The Heart Disease Detection Analysis aims to create a predictive model for identifying individuals at risk of heart disease. Using a dataset with attributes like age, sex, and health metrics, the project focuses on distinguishing patients with and without heart disease.

data-analysis data-cleaning data-science data-visualization machine-learning

Last synced: 15 Apr 2026

https://github.com/rajeev2806/netflix-data-analysis

This project implements an ETL pipeline: the Netflix dataset is cleaned and analyzed using PostgreSQL and Python.

data-analysis data-cleaning postgresql python

Last synced: 15 Apr 2026

https://github.com/mathusanm6/critics-vs-players-analysis

This data analysis examines the relationship between critic scores, sales (owners), player engagement, and pricing to determine the ROI of critic reviews.

data-analysis data-science data-visualization game-reviews games-sales jupyter-notebook python-3 steam-games

Last synced: 16 Apr 2026

https://github.com/shruti23-ui/blinkit-powerbi-dashboard

A comprehensive Power BI dashboard analyzing Blinkit's sales performance, outlet metrics, and multi-tier market analytics with interactive visualizations and business intelligence insights.

data-analysis data-visualization microsoft-excel microsoft-power-bi powerbi sales-analysis sql

Last synced: 09 Feb 2026

https://github.com/animesh-chourey/power-bi

Various projects from my attempt to learn Power BI

business-analytics data-analysis data-visualization powerbi

Last synced: 10 Feb 2026

https://github.com/tushar2704/imdb-movie-analysis

This project extracts meaningful insights, trends, and patterns from the data, shedding light on various aspects of the movie industry. By leveraging this analysis, filmmakers, studios, and enthusiasts can gain valuable information to inform decision-making, understand audience preferences, and contribute to the creation of successful movies.

artificial-intelligence data-analysis data-science imdb project tushar2704

Last synced: 10 Feb 2026

https://github.com/bcko/ud-da-eda-redwinequality

Udacity Data Analyst Nanodegree Project : Exploratory Data Analysis : Red Wine Quality dataset

data-analysis data-analyst-nanodegree exploratory-data-analysis r-markdown rstudio udacity udacity-data-analyst-nanodegree udacity-nanodegree

Last synced: 10 Feb 2026

https://github.com/nickenshidqia/startup-venture-funding-dashboard-data-analysis

The Startup Venture Funding Dashboard is a comprehensive visual representation of the dynamic landscape of startup funding, providing valuable insights into the top startups, funding round types, markets, startup statuses, and investor details.

dashboard data-analysis tableau tableau-dashboards

Last synced: 11 Feb 2026

https://github.com/multitagging/benchmarks

Provides benchmarks to test the MultiTagging framework

benchmarks data-analysis ethereum smart-contracts vulnerabilities

Last synced: 11 Feb 2026

https://github.com/ohimoiza1205/mastercard-cybersecurity-simulation

Served as an analyst on Mastercard’s Security Awareness Team to identify and report security threats

cybersecurity data-analysis data-presentation security-awareness-training technical-security-awareness

Last synced: 11 Feb 2026

https://github.com/sharmas1ddharth/mode_of_transport_analysis

This project investigates which mode of transport employees prefer for commuting to their office. The data includes each employee's mode of transport as well as personal and professional details such as age, salary, and work experience. The goal is to predict whether an employee will use private transport and to identify which variables are significant predictors of that decision.

data-analysis r-programming

Last synced: 11 Feb 2026

https://github.com/thlindustries/mortalidade_neonatal_python_react

A data visualization platform built with Python and React, using Plotly's visualization library

data-analysis data-visualization plotly python python3 react reactjs

Last synced: 16 Apr 2026

https://github.com/l1ght14/e-commerce-sales-analysis

Interactive Power BI dashboard analyzing e-commerce sales, profit trends, top products, and customer segments using the Sample Superstore dataset.

dashboard data-analysis powerbi

Last synced: 12 Feb 2026

https://github.com/yalai92/alfalfa_imp_exp_analysis

This repository covers data cleaning, analysis, and visualization of global alfalfa and pellet imports, focusing on trends from 2003 to 2023. It also includes a predictive analysis of global alfalfa demand for 2024-2029, using data science techniques to provide insights for stakeholders in the alfalfa industry.

data-analysis data-cleaning data-visualization matplotlib numpy pandas python sckiit-learn tableau

Last synced: 12 Feb 2026

https://github.com/shrutiijoshi/coffee_sales

This project aims to analyze coffee sales data to identify key trends, patterns, and factors influencing sales performance.

data-analysis microsoft-excel

Last synced: 28 Feb 2026

https://github.com/m-ah07/text-sentiment-analysis-api

A lightweight Python project for analyzing the sentiment of textual data using the TextBlob library. This project provides a simple and effective way to measure the polarity and subjectivity of any given text.

data-analysis machine-learning python python-project sentiment-analysis text-analysis text-mining

Last synced: 14 Feb 2026

https://github.com/mo-elshamy/machine-learning-practice

This repository collects my work and learning in machine learning during my internship at Cellual-Technologies, including algorithm explanations, data preprocessing workflows, and two projects.

data-analysis data-science dbscan decision-trees eda gradient-boosting gxboost hierarchical-clustering kmeans-clustering knn-classification linear-regression logistic-regression machine-learning model pca polynomial-regression preprocessing random-forest support-vector-machines training

Last synced: 14 Feb 2026

https://github.com/balajimohan18/tableau-visualization-project

This repository contains visualization projects built with Tableau. The visualizations surface insights and strategies that can help a business grow its profit margins and, in some cases, provide social value by helping reduce damage from calamities.

data-analysis data-science data-visualization exploratory-data-analysis tableau tableau-public

Last synced: 19 Mar 2026

https://github.com/fhdsl/seattlestatsummer_r

A 4-day introduction to R programming, focused on Fred Hutch Research Interns

beginner beginner-friendly course data-analysis data-science introduction-to-programming r-programming tidyverse

Last synced: 19 Mar 2026

https://github.com/projects-developer/full-stack-network-intrusion-detection-system-using-machine-learning

The project aims to design and develop a full-stack network intrusion detection system using machine learning techniques. It includes source code, a PPT, a synopsis, a report, documents, the base research paper, and video tutorials.

algorithms computerscienceproject cybersecurity data-analysis full-stack-development intrusion-detection-system machine-learning network-intrusion-detection network-security web-development

Last synced: 14 Feb 2026

https://github.com/suhail25/pizza-sales-analysis

Analyzed in detail the sales data presented in Excel by a pizza sales manager and implemented strategic pricing adjustments, resulting in a 25% revenue surge and enhanced profit margins. Explored and cleaned the data set using SQL, then performed data analysis by filtering 12% of the data using SQL commands in MySQL.

data-analysis excel powerpoint-presentations sql

Last synced: 15 Feb 2026

https://github.com/kunalkumar2001/data-analyst-power-bi

Data Analyst Power BI Project for Portfolio

data-analysis data-analyst data-analyst-power-bi powerbi

Last synced: 16 Feb 2026

https://github.com/prekshivyas/cis-595-big-data-analytics

Comprehensive real estate price prediction project, integrating socioeconomic indicators and property features.

data-analysis data-cleaning data-mining data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis web-scraping

Last synced: 16 Feb 2026

https://github.com/k-bloch/car-theft-analysis

A dashboard created to inform the public about car theft, providing insights extracted from real-world police stats.

data-analysis maven-analytics tableau

Last synced: 19 Mar 2026

https://github.com/arunesh-tiwari/sales-analysis

Tableau Data Analysis Project.

data-analysis data-visualization tableau

Last synced: 01 Mar 2026

https://github.com/cuadernin/dispositivos_analisis

Brief analysis of a dataset on mobile devices

data-analysis data-science data-visualization descriptive-statistics jupyter-notebook python-3 seaborn

Last synced: 18 Apr 2026

https://github.com/madusales/powerbi-etl-elt

Through DIO's Data Analytics & Power BI bootcamp, I have been studying the use of SQL to build BI solutions. This repository records what I have learned so far about what BI is, types of analysis, ETL, and ELT.

big-data business-intelligence data-analysis powerbi

Last synced: 19 Mar 2026

https://github.com/chaitanyaprasad60/sql-queries

This is a list of complex SQL Queries I have practiced.

data-analysis sql window-functions

Last synced: 03 Mar 2026

https://github.com/jjfiv/csc212spellchecking

Data Structure Analysis for Spell Checking

data-analysis smith-csc212

Last synced: 03 Mar 2026

https://github.com/anas436/student-performance-analysis

In this project I built a machine learning system that analyzes student performance based on academic records. Note that it works with any student records you provide.

data-analysis jupyter-notebook matplotlib numpy pandas python3 seaborn

Last synced: 16 Apr 2026

https://github.com/asghar-rizvi/eda_student_dataset

This repository contains the results of data analysis and exploratory data analysis (EDA) conducted on the Student_Dataset. The analysis focuses on understanding various factors affecting student grades and visualizing these relationships using Matplotlib and Seaborn.

data-analysis data-analysis-python data-science jupyter-notebook python3

Last synced: 16 Apr 2026

https://github.com/steno-aarhus/mediation-analysis-course

Modern mediation analysis for basic, clinical and epidemiological research in diabetes and endocrinology

data-analysis data-analysis-in-r diabetes diabetes-epidemiology mediation-analysis open-educational-resource

Last synced: 03 Mar 2026

https://github.com/jofaval/melbourne-housing

Data Analysis of the Housing Market in Melbourne, Australia in 2016-2017

data-analysis data-science data-visualization deep-learning google-colab kaggle machine-learning melbourne python xgboost

Last synced: 16 Apr 2026

https://github.com/abhipatel35/gym-performance-analysis

Analyzing gym performance and user engagement in Arizona using Spark SQL, PySpark, and visualization techniques on the Yelp dataset.

apache-spark asu business-insights data-analysis data-processing-at-scale data-visualization dps gym-analysis rating-patterns sql trend-analysis user-insights yelp-dataset

Last synced: 16 Apr 2026

https://github.com/kosuri-indu/allaboutolympics

All About Olympics is an interactive dashboard presenting comprehensive data and insights on Olympic Games from 1896 to 2020.

data-analysis pandas plotly python streamlit

Last synced: 16 Apr 2026

https://github.com/akash-srm/user-engagement-analysis

Analyzed user engagement and feedback data to derive actionable insights for an online learning platform.

analytics-projects data-analysis data-cleaning eda jupyter-notebook pandas python seaborn student-engagement

Last synced: 16 Apr 2026

https://github.com/danpoynor/omdb-api-data-analysis

Gathers data for Oscar-winning movies using their IMDB ids, saves the information to a CSV file, and answers a few data analysis questions about the movies using JupyterLab.

analytics csv data-analysis jupyter-notebook matplotlib omdb-api pandas-dataframe python-dotenv python3 seaborn-plots

Last synced: 16 Apr 2026

https://github.com/yasumorishima/yasumorishima

Manufacturing Engineer & Data Analyst. 17 years exp in MFG. Python, VBA, Automation Specialist. (盛島康徳 / Yasunori Morishima)

automation data-analysis manufacturing portfolio python vba

Last synced: 05 Mar 2026

https://github.com/kheriberto/knn_project

This is a simple project that uses dummy data to practice and demonstrate my knowledge of the KNN algorithm.

data-analysis knn-classifier numpy python scikit-learn seaborn

Last synced: 02 Apr 2026
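
For readers unfamiliar with the KNN algorithm mentioned in the entry above, here is a minimal sketch of k-nearest-neighbours classification in plain Python: find the k closest training points and take a majority vote. The function name and the dummy points are invented for this example, and the repository itself uses scikit-learn, not this hand-rolled version.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Sort (distance, label) pairs so the closest neighbours come first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Dummy data: two well-separated clusters.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (0.5, 0.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```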

https://github.com/humayun-raza-030/restaurant-recommendation-system

This project is a Restaurant Recommendation System that helps users find restaurants in Lahore based on their location, customer reviews, and ratings. The system scrapes restaurant data from Google Maps, analyzes user reviews for sentiment, and provides a visualization dashboard using Tableau.

data-analysis data-science data-visualization python

Last synced: 17 Apr 2026

https://github.com/sharmas1ddharth/data-analysis-with-python

Freecodecamp's Data Analysis with Python Projects Code

data-analysis data-analysis-with-python freecodecamp-project

Last synced: 03 Jun 2026

https://github.com/rishisolanke/pdf_query_langchain

PDF Query LangChain is a tool that extracts and queries information from PDF documents using advanced language processing. Leveraging LangChain, OpenAI, and Cassandra, this app enables efficient, interactive querying of PDF content. Ideal for data analysis, research, and automated reporting, it simplifies detailed document analysis with ease.

artificial-intelligence data-analysis document-query langchain natural-language-processing nlp openai pdf-analysis pdf-extraction python research-tool

Last synced: 17 Apr 2026

https://github.com/victoorv/criminalite_us

An analysis of crime as a function of socio-economic variables, including the selection and comparison of multiple regression models as well as hypothesis tests on the coefficients and on the significance of the models.

data-analysis data-science r regression regression-analysis regression-models statistical-analysis statistical-tests statistics

Last synced: 04 Apr 2026

https://github.com/davidmalko87/steam-library-exporter

Python script to export your Steam game library to CSV — playtime, genres, reviews, metacritic scores, prices, tags & estimated owners via Steam Web API + Store API + SteamSpy

csv-export data-analysis game-data metacritic playtime-tracker python steam steam-api steam-games steam-library steamspy

Last synced: 04 Apr 2026

https://github.com/pawlo77/airline-performance-data-analysis

Preprocessing of structured data - part of IAD study program, Faculty of Mathematics and Information Science, Warsaw University of Technology

data-analysis data-science visualization

Last synced: 10 May 2026

https://github.com/rajeev2806/retail-order-data-analysis

Dataset downloaded via the Kaggle API, followed by data cleaning and analysis

data-analysis data-cleaning postgresql

Last synced: 18 Apr 2026

https://github.com/akhundmuzzammil/energyconsumptionprediction

This repository contains code and resources for training a linear regression model to predict energy consumption based on various building parameters.

data-analysis energy-consumption linear-regression machine-learning python scikit-learn streamlit visualization

Last synced: 18 Apr 2026

https://github.com/nishanthmuruganantham/football-player-wages-eda

This repository uses Python for analyzing football player data, focusing on various aspects such as player positions, league distributions, wages, and the relationship between player age and appearances. It includes visualizations generated using Plotly to provide insights into the dynamics of football player demographics and performance.

data-analysis data-science data-visualization eda football football-analytics football-data kaggle kaggle-dataset pandas plotly python

Last synced: 18 Apr 2026