An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/pradipece/weather_forecast_data_analysis

Using decision tree and random forest algorithms to solve real-world data analysis problems. "sklearn_decision_trees_random_forests"

data-analysis data-science data-visualization git github python python3

Last synced: 19 Apr 2026

https://github.com/melogabriel/nubank-expenses-analysis

This project consolidates monthly credit card statement data from Nubank into a single CSV file using Python, enabling data visualization through a Google Sheets dashboard in Looker Studio.

data-analysis data-visualization googlesheets lookerstudio pandas python

Last synced: 02 May 2026

https://github.com/as16082023/restaurant-order-analysis

Analyzing order data to identify the most and least popular menu items and types of cuisine

data-analysis maven-analytics mysql restaurant-order sql

Last synced: 10 Apr 2025

https://github.com/phomint/udacity_dataanalysis

All projects and activities

data-analysis python udacity-nanodegree

Last synced: 11 Oct 2025

https://github.com/luabagg/worldwide-trends

Worldwide Google Trends visualization and classification

data-analysis data-visualization google-trends trends

Last synced: 03 Feb 2026

https://github.com/gaurav-van/house_price_predictor_streamlit_web_app

Data science project to predict house prices in Bangalore using regression. This repository is used for deploying the project.

data-analysis data-science exploratory-data-analysis machine-learning prediction python regression streamlit

Last synced: 02 May 2026

https://github.com/devandrenicolas/analise-de-vendas

This project is a comprehensive data analysis tool designed to analyze sales performance data. It includes modules for generating fake sales data, cleaning and preprocessing the data, and performing exploratory data analysis (EDA) with advanced visualizations.

data-analysis data-visualization faker-generator matplotlib pandas python

Last synced: 07 May 2026

https://github.com/vimal0156/ruaroa-ai

🧙‍♂️ Zero-Code Machine Learning Wizard - Transform ideas into intelligent solutions without writing code. AI-powered ML pipeline automation with interactive web interface.

ai-agents ai-assistant artificial-intelligence automated-machine-learning code-generation data-analysis data-science deep-learning jupyter machine-learning machine-learning-pipeline neural-networks no-code openai python scikit-learn streamlit visualization

Last synced: 09 Apr 2026

https://github.com/al-ghaly/power-bi-dashboard

A dashboard for analyzing the job market for data specializations.

dashboard data-analysis powerbi

Last synced: 02 Feb 2026

https://github.com/adityakumarsingh01/customer-purchase-behaviour-analysis

A data analysis project exploring online consumer behavior and FOMO effects using EDA on survey data.

consumer-behavior data-analysis eda fomo online-shopping python survey-data

Last synced: 25 Apr 2026

https://github.com/giordano-lucas/tesco-extension

Products clustering and interactive visualization

clustering data-analysis data-visualization tesco

Last synced: 17 Jun 2026

https://github.com/ganesh2409/cricket-player-performance

This repository contains a comprehensive project focused on analyzing cricket player performance using various datasets, including batting, bowling, and match results. The project involves data preprocessing, feature engineering, and model training to predict and evaluate player performance scores. It includes detailed scripts for data analysis

cricket-performance-analysis data-analysis machine-learning sports-analytics

Last synced: 05 Aug 2025

https://github.com/khuyentran1401/sample_datapane_script

This repo shows how to use Datapane to create a simple script that ranks authors or publications by publishing frequency.

data-analysis data-science datapane python

Last synced: 21 May 2026

https://github.com/cuadernin/coffeeanalysis

Data analysis for the third stage of the DataCamp certification.

coffee data-analysis datacamp python

Last synced: 07 Aug 2025

https://github.com/roberto-butti/fit_explorer

FIT File Explorer, written in Go

data-analysis fitness geospatial golang

Last synced: 12 Apr 2025

https://github.com/nafisalawalidris/northwind-traders-sales-analysis

The Northwind Traders Sales Analysis project analyses sales data for a fictitious company. It utilises the Northwind Database and includes SQL queries to provide insights on employees, products, suppliers and revenue. The project aims to help the company gain valuable information for business decision-making.

business-insights data-analysis database northwind-traders sales sql

Last synced: 07 Aug 2025

https://github.com/garcane/nike_web_crawler

This project involves web scraping Nike's product pages to extract product names, prices and links. The project showcases three different implementations of the web crawler using Selenium and BeautifulSoup. It also includes visualisation of the scraped data using Matplotlib and Seaborn.

beautifulsoup data-analysis data-visualization python selenium web-crawler web-scraper webcrawler webscraper webscraping webscraping-beautifulsoup

Last synced: 18 Apr 2026

https://github.com/sarathchandranpm/cleaning-and-exploratory-analysis-of-global-layoff-data

This project involves a thorough data analysis and cleaning process centered on global layoff data. It showcases advanced data management abilities by integrating data cleaning methods with a detailed exploration of workforce reduction patterns across various companies, industries, and countries.

data-analysis data-cleaning mysql sql

Last synced: 22 Sep 2025

https://github.com/turquetti/projeto5-vamoai

Final project for Resilia + iFood <3

data-analysis python tableau

Last synced: 14 May 2026

https://github.com/zachpinto/real-time-indicators

Streamlit-based analytics dashboard visualizing real-time economic indicators. This project uses cron jobs to provide real-time updates of common economic indicators

analytics-engineering data-analysis plotly streamlit visualization

Last synced: 15 May 2026

https://github.com/jen-uis/loan-status-prediction

This repository contains project materials for the Winter STAT 206 class, University of California, Riverside, A. Gary Anderson School of Management.

data data-analysis data-analytics data-cleaning data-visualization descriptive-analytics julia julia-language jupyter-notebook predictive-analytics predictive-modeling team-collaboration

Last synced: 02 Jan 2026

https://github.com/navdeep-g/data-quality-checker

A comprehensive Python tool for data analysis and data quality

data-analysis data-science pandas python

Last synced: 16 May 2026

https://github.com/revan-alqahmi/summarize-talabat-company-reviews

A natural language processing project that analyzes Arabic comments about Talabat Company and classifies them as positive, negative, or neutral using machine learning algorithms and natural language processing techniques.

artificial-intelligence data-analysis machine-learning-algorithms natural-language-processing python

Last synced: 11 Jan 2026

https://github.com/the-tech-idea/beepdm

A library for managing your connections to different data sources. Still in alpha; please be patient.

data-analysis data-management data-management-platform data-science database dataset information

Last synced: 08 Aug 2025

https://github.com/draym/swmanager

Web app to help with your daily raids in SpacesWars, using game statistics and data management

dashboard-application data-analysis data-visualization game-data game-utility

Last synced: 19 Jun 2025

https://github.com/messi10tom/ai-based-grade-prediction

GDSC task-1: Build a model to predict a student’s final grade based on features such as attendance, participation, assignment scores, and exam marks.

ai data-analysis data-science regression streamlit

Last synced: 02 May 2026

https://github.com/simranjeet97/ipl-dataanalysis

Data Analysis performed on IPL Dataset with Data Profiling, Data Pre-Processing, Data Manipulation, and Data Visualization.

artificial-intelligence data-analysis data-manipulation data-mining data-preprocessing data-science data-visualization indian-premier-league-2008-2018 ipl ipl-dataset iplayer python

Last synced: 08 May 2026

https://github.com/ebowwa/chatgpt-export-processor

🤖 Extract, analyze & search your ChatGPT conversations locally | Privacy-first tool for OpenAI ChatGPT data export processing | Python CLI with embeddings support

ai-tools chatgpt chatgpt-export chatgpt-tools cli conversation-analysis data-analysis data-extraction embeddings local-first nlp openai openai-api privacy python

Last synced: 19 May 2026

https://github.com/airscholar/data_analysis_with_ai

A repository showing how to use AI and ChatGPT for Data Analysis with Pandas and Python

chatgpt data-analysis gpt4 openai pandas pandasai python

Last synced: 10 Apr 2026

https://github.com/saksham-jain177/automated-data-analysis-and-visualization

Automated Data Analysis and Visualization is a Streamlit web application designed for quick and insightful data analysis. Users can easily upload CSV files, perform automated preprocessing, and generate interactive visualizations such as histograms, scatter plots, and heatmaps.

automated-reporting data-analysis data-preprocessing data-science data-visualization datasets exploratory-data-analysis interactive-visualizations machine-learning python streamlit

Last synced: 15 May 2026

https://github.com/salman-khan-mohammed/predicting-the-intent-of-online-shoppers

This project aims to predict online shoppers' purchase intentions using browsing history and user data from e-commerce sites. By analyzing clickstream and session information, the goal is to create a machine learning model that accurately forecasts customers' likelihood of making a purchase.

cluster-analysis data-analysis data-pre eda outliers prediction

Last synced: 31 Oct 2025

https://github.com/jayita11/atliqo-bank-credit-card-launch-eda

This project involves exploratory data analysis and statistical testing for AtliQo Bank's new credit card launch. Key insights include targeting high-income occupations and the 18-25 age group. Recommendations focus on tailored marketing campaigns, education, and incentives to enhance credit card adoption and usage among young adults.

data-analysis hypothesis-testing matplotlib p-value pandas python seaborn statistics z-test

Last synced: 09 Apr 2026

https://github.com/JovaniPink/excel-powerbi

The folder of my work with Excel, VBA, and PowerBI for Data Analysis & Visualization.

data-analysis data-visualization dax excel excel-vba power-pivot power-query powerbi vba-macros

Last synced: 20 Jul 2025

https://github.com/leocornus/leocornus-visualdata

JavaScript libraries to make data visualization simpler and easier.

data-analysis data-mining data-visualization data-visualization-simpler javascript-library

Last synced: 10 Aug 2025

https://github.com/deepanshkhurana/cloudsimplifier

Simple helper functions to fetch and read data from various formats stored on Amazon AWS S3 Buckets. Most functions are essentially wrapping over cloudyR.

amazon aws cloudyr data-analysis data-fetching data-science package r rpackage s3

Last synced: 20 May 2026

https://github.com/abhaysingh71/laptop-price-predictor

Laptop Price Predictor is a Dockerized machine learning project that predicts laptop prices based on specs using ensemble models like Random Forest, XGBoost, and Gradient Boosting. It includes a Streamlit UI and full Docker support.

data-analysis data-science deployment docker docker-image ensemble-learning laptop-price-prediction machine-learning-algorithms streamlit xgboost

Last synced: 05 May 2026

https://github.com/ahmednurabdii/data-analytics-portfolio-superstore

My first portfolio project showcasing data cleaning, analysis, and visualization of Superstore sales data.

data-analysis data-visualization jupyter-notebook matplotlib numpy pandas portfolio-project python sales-analysis scipy seaborn superstore-dataset

Last synced: 07 Apr 2026

https://github.com/sayantanidalui/student-mental-health-analysis

A SQL-based analysis project exploring student mental health, stress, and lifestyle patterns. Uncovers key insights using joins, CTEs, and window functions — no other tools used.

data-analysis mental-health mysql sql studentdata

Last synced: 07 Jul 2025

https://github.com/nabilshadman/r-data-analysis

A modular R framework for data analysis, with emphasis on data processing and reproducible workflows.

data-analysis data-cleaning data-manipulation data-science descriptive-statistics programming r r-studio statistical-analysis statistical-computing t-test

Last synced: 04 Apr 2025

https://github.com/atxtechbro/glassdoorwebscraping

"Scraping Glassdoor: A GraphQL Journey" is an advanced data harvesting tool leveraging GraphQL and an API-first strategy to extract and analyze Glassdoor data for business intelligence and predictive analytics.

api-first-approach business-intelligence data-analysis data-harvesting data-mining data-science glassdoor-scraper graphql html machine-learning performance-optimization predictive-analytics python requests-library-python scaleability scraper system-design web-scraping

Last synced: 16 May 2026

https://github.com/jcbritobr/iris

Iris dataset and data analysis with julia language.

data-analysis data-science data-visualization iris-dataset julia-language

Last synced: 06 Apr 2025

https://github.com/thanaphongk37/data-science-and-data-analyst-project

Portfolio of data analysis, data science, and data engineering projects built using Azure services, SQL, and Python.

apache-superset azure-storage dashboards data-analysis data-science databricks dataengineering datafactory datapipeline powerbi python sisense sql sql-server visualization

Last synced: 11 May 2026

https://github.com/filiplangiewicz/businessintelligence

🏭 Data warehouses and business intelligence project

airbnb business-intelligence data-analysis data-warehouse

Last synced: 09 Mar 2026

https://github.com/anurag-kumar-molankala/anurag-kumar-molankala

👋 About Me I'm a Power BI Developer with a passion for data visualization and UI/UX design. I create interactive dashboards that turn data into clear, actionable insights for smarter decision-making.

business-intelligence dashboards data-analysis data-visualization dax-query mlanguage powerbi sqlserver uiuxdesigner

Last synced: 25 Jan 2026

https://github.com/sumidcyber/dataviz-master

This Python application provides a user-friendly interface to load and visualize the contents of a CSV file. Users can choose from various types of graphs and perform analyses on the dataset.

data-analysis data-analysis-project data-analysis-python database databases python python3

Last synced: 02 Jan 2026

https://github.com/shriram-vibhute/digit_classification

This project demonstrates various machine learning techniques for classifying handwritten digits from the MNIST dataset. It covers data preprocessing, model training, evaluation, and advanced classification strategies.

classification data-analysis data-visualization machine-learning matplotlib numpy pandas sk-learn

Last synced: 28 Oct 2025

https://github.com/pymarcus/tcc_sistemasdeinformacao2025

This application is part of a research project that aims to use a Gemini AI agent to identify "atoms of confusion" -- minimal code elements that cause misunderstandings -- in the context of Software Engineering.

atoms-of-code ci-cd clean-architecture concurrent-programming data-analysis design-patterns gemini-api golang ifmg inteligencia-artificial postgresql software-engineering solid tcc tdd workerpool

Last synced: 14 May 2026

https://github.com/denko5/sales-analysis

A complete SQL-based sales analysis project covering Africa, showcasing data cleaning, exploratory analysis, insights, and lessons learned. The project highlights sales trends, regional performances, and marketing effectiveness across multiple platforms.

africa data data-analysis data-science exploratory-data-analysis insights kenya sales sql

Last synced: 24 Jan 2026

https://github.com/theanujsinha01/rainfall-prediction-using-machine-learning

This project predicts whether it will rain or not based on weather features like pressure, humidity, dew point, cloud cover, sunshine, wind direction, and wind speed. We use a Random Forest Classifier, a popular ML algorithm, trained on historical weather data. The model learns patterns and helps us forecast rain chances.

classification data-analysis eda machine-learning-algorithms matplotlib numpy pandas python scikit-learn seaborn supervised-learning

Last synced: 11 Apr 2026

https://github.com/as16082023/nashville-housing-data-cleaning-project

This project involved using MySQL to clean and optimize a Nashville housing dataset, addressing key data quality issues to ensure it was ready for accurate analysis.

data-analysis data-cleaning mysql nashville-housing-data

Last synced: 10 Apr 2025

https://github.com/ghackenberg/kurs-datenanalyse

This repository contains material for my data analysis course. In this course we first introduce the concept of databases and SQL, before diving into OLAP and other data analysis tools.

data-analysis data-structures data-warehouse entity-relationship-diagram etl graph list olap relational-algebra relational-database sql tree

Last synced: 17 Feb 2026

https://github.com/adolbyb/data-science-python

An Introduction to Data Science and Data Visualization with the FAU Data Science and Machine Learning Club

data-analysis data-science data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/ehopperdietzel/billionaires-analysis

Analysis of the number of billionaires per country. Inspired by the article "Russian Billionaires"

bootstrap data-analysis poisson-distribution prediction

Last synced: 18 May 2026

https://github.com/arv-anshul/easy-analysis

A python package to perform Data Analysis easily. (Not Recommended)

arv-dumped data-analysis data-science easy-analysis eda pypi pypi-package python3

Last synced: 14 May 2025

https://github.com/virajbhutada/walmart-retail-analyzer

Gain valuable insights into retail sales with the "Walmart Retail Performance Dashboard" in MS Excel. This user-friendly tool facilitates an in-depth analysis of key sales metrics, providing a comprehensive view of Walmart's performance. Make data-driven decisions for informed and strategic business outcomes.

analytics data-analysis data-science data-visualization excel insights interactive-visualizations performance-analysis retail-sales walmart

Last synced: 04 Mar 2026

https://github.com/fatihilhan42/web_scraping_football_statistics_per_game_data-main

In this notebook I will describe the process of scraping data from the web portal understat.com, which has a lot of statistical information about all games in the top 5 European football leagues.

data-analysis data-manipulation data-science data-scraping data-visualization jupyter-notebook python

Last synced: 19 May 2026

https://github.com/yuvraj0412s/proactive-fraud-detection-using-machine-learning

An end-to-end machine learning project for detecting financial fraud using LightGBM, featuring in-depth EDA, advanced feature engineering, and a focus on actionable business insights.

class-imbalance classification-model data-analysis data-science data-visualization exploratory-data-analysis feature-engineering fintech fraud-detection jupyter-notebook lightgbm machine-learning pandas python scikit-learn smote

Last synced: 02 May 2026

https://github.com/helosantosdesousa/analise-previsao-de-rotatividade-ml

Final project of the Data Girls 2025 Bootcamp, analyzing employee turnover using machine learning. Based on the IBM HR Analytics Attrition dataset, the project identifies the main risk factors and builds predictive models (SVC and Random Forest) with up to 89% accuracy to anticipate departures and support strategic HR decisions.

analise-de-dados analise-exploratoria bootcamp ciencia-de-dados colab-notebook dados data data-analysis data-science dataanalytics dataframe eda machine-learning machine-learning-algorithms pandas python random-forest svc

Last synced: 16 Apr 2026

https://github.com/idhs-song/resume-matcher-agent-cn

🤖 Enhance your job applications with this AI-driven resume matcher that analyzes job descriptions to optimize your resume for better chances of success.

api-integration automation backend-development data-analysis data-visualization github-actions job-search machine-learning natural-language-processing open-source-tools python recommendation-system resume-matching user-interface web-app

Last synced: 18 May 2026

https://github.com/rayyan9477/multiple-disease-prediction-system

This repository contains a Multiple Disease Prediction System leveraging machine learning techniques for accurate predictions. It utilizes Python, Pandas, Scikit-learn, and Flask for data preprocessing, model building, and web deployment. Explore the project and connect on LinkedIn for collaborations.

data-analysis data-science machine-learning python streamlit

Last synced: 10 Apr 2026

https://github.com/nhsdigital/sde_summary_notebooks

Notebooks provided by the Wranglers for users to quickly gain insights on datasets inside the Secure Data Environment (SDE)

data-analysis data-linkage data-quality data-summary metrics statistics

Last synced: 12 Aug 2025

https://github.com/agustinmusanti/sqlchallenge-4

Challenge to create a SQL database for a streaming platform. Includes DDL, DML, and advanced queries.

data-analysis database mysql sql streaming

Last synced: 18 May 2026

https://github.com/Narius2030/Hive-DataWarehouse-Analysis

Implements a Hive data warehouse to store meaningful data and applies machine learning techniques such as clustering or regression to business problems

apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics

Last synced: 12 Aug 2025

https://github.com/akash1070/data-science-advanced-analytics-virtual-experience-program

The BCG Open-Access Data Science & Advanced Analytics Virtual Experience Program

data-analysis data-science machine-learning-algorithms

Last synced: 16 May 2026

https://github.com/jen-uis/la-crime-data-analysis

This repository contains project materials for the Fall 2023 MGT 256 class. This project was completed with assistance from Professor Adem Orsdemir.

business-analytics crime-data crime-data-analysis data-analysis knn la-crimes-from-2020 la-safe r r-markdown r-studio report-generation rmd united-states visualization

Last synced: 14 Mar 2025

https://github.com/mahdi-eth/covid-analysis

Covid-19 data analysis project using python, numpy, pandas, matplotlib

data-analysis data-science python

Last synced: 13 Aug 2025

https://github.com/karatechop/noaa-storm-database-data-analysis

Analysis of population health and economic consequences of events documented in the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database.

data-analysis knitr r rmarkdown

Last synced: 14 Mar 2025

https://github.com/bcko/ud-da-eda-whitewinequality

Udacity Data Analyst Nanodegree Project : Exploratory Data Analysis : White Wine Quality dataset

data-analysis exploratory-data-analysis rmarkdown rstudio udacity udacity-data-analyst-nanodegree

Last synced: 03 Jan 2026

https://github.com/misaghmomenib/stock-momentum-analysis

A Python-based data analysis tool designed to evaluate stock momentum. Leverages historical market data to identify trends, predict price movements, and assist in making informed investment decisions.

data-analysis data-analysis-python data-visualization git open-source python

Last synced: 10 Apr 2025

https://github.com/drcbeatz/aynm-data

Python scripts for data cleaning and processing for AYNM (Pandas/NumPy/Selenium/AWS Textract)

automation aws-textract csv data-analysis data-cleaning ipynb numpy ocr pandas python reverb selenium shopify webscraping xml

Last synced: 07 Mar 2026

https://github.com/naruaika/eruo-data-studio

A powerful yet friendly ETL tool powered by Polars backend

data-analysis data-science desktop-app gnome-desktop gtk4 proof-of-concept python spreadsheet

Last synced: 18 Jul 2025

https://github.com/shibam120302/heart-disease-data-analysis-by-shibam

Read more about heart disease statistics and causes for your own understanding. This project covers manual exploratory data analysis

analysis data-analysis scraper

Last synced: 13 Aug 2025

https://github.com/subhojit45/python3-iphones-x-flipkart-sales-analysis

Six simple questions and the insights derived from a dataset of iPhone sales on Flipkart.

data-analysis jupyter-notebook python3 visual-studio-code visualization

Last synced: 19 May 2026

https://github.com/x1ao4/doc-merger

Merge two relatively incomplete documents into one complete document via a Python script

data-analysis data-merging document-analysis document-comparison document-processing documents filtering filtering-data merge merge-documents

Last synced: 28 Jun 2025

https://github.com/geobatpo07/office-hours-bootcamp

Practical case studies and labs from the Akademi 2025 Data Science & AI Bootcamp office hours.

artificial-intelligence data-analysis data-science data-visualization database deep-learning learning learning-by-doing machine-learning statistics

Last synced: 07 Mar 2026

https://github.com/jasoncobra3/whatsapp_chat_analyzer

WhatsApp Chat Analyzer is a powerful tool that provides insightful analytics from your WhatsApp conversations. Whether you're curious about your chatting habits, want to analyze group dynamics, or need to extract meaningful data from your conversations, this tool has got you covered!

data-analysis data-science data-visualization machine-learning streamlit streamlit-webapp whatsapp-chat whatsapp-chat-analyzer

Last synced: 31 Jan 2026

https://github.com/hafeez-urrehman/mobile-price-classification

In the Mobile Price Classification project, I built a predictive model to categorize mobile phones into different price ranges based on their features by applying machine learning techniques.

data-analysis linear-regression machine-learning mobile-price-prediction model-save-and-load predictive-modeling

Last synced: 15 May 2026

https://github.com/noeyislearning/sharpe-ratio-amazon-facebook

Explore the Sharpe Ratio and its application to evaluate the performance of two tech giants: Amazon and Facebook.

amazon data-analysis data-science data-visualization facebook python3 sharpe-ratio

Last synced: 27 Mar 2025

https://github.com/nelsonkariuki/dataanalysis

This project involves data analysis of video game sales from https://www.kaggle.com/gregorut/videogamesales/download

data-analysis data-visualization python

Last synced: 11 Jun 2026

https://github.com/anurag-kumar-molankala/data-professional-survey

This Power BI dashboard analyzes survey responses from data professionals, covering key aspects such as salary distribution, job satisfaction, and preferred programming languages. The insights help understand trends in the data industry and what matters most to professionals.

dashboard data-analysis data-visualization dax-measures dax-query demographics etl-process excel-import power-bi salary-analysis sql-server survey-analysis trend-analysis

Last synced: 02 Feb 2026