An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 27 Jul 2025

https://github.com/vedantshi/tableau-bike-data-dashboard

London Bike Rides Analysis explores bike usage patterns using data visualization and machine learning. It identifies trends through a dynamic moving average, analyzes weather impact with heatmaps, and provides actionable insights via an interactive Tableau dashboard. Tools: Python, Tableau.

data-analysis data-visualization python tableau weather-data

Last synced: 16 May 2026

https://github.com/drisskhattabi6/meteo-data-mining

This repo applies data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

cart data-analysis data-mining data-visualization decision-making decision-tree extract-data extract-insights insights-analytics insights-data k-means knn machine-learning svm

Last synced: 21 Mar 2025

https://github.com/maxbiostat/diehl_ebola_cell_2016

Supplementary code and data for Diehl et al., 2016 (Cell).

data-analysis data-visualization disease-spread ebola mutation

Last synced: 11 Jul 2025

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/sakan811/gachascope

Evaluate the cost-effectiveness of various in-app purchase bundles available in gacha games.

data data-analysis data-visualization game honkai honkai-star-rail honkai-starrail hoyoverse javascript nextjs tableau tableau-public typescript wutheringwaves

Last synced: 04 May 2026

https://github.com/adnanrahin/nlp-with-disaster-tweets

Kaggle Competition: Predict which Tweets are about real disasters and which ones are not. Natural Language Processing.

data-analysis data-science data-visualization kaggle-competition machine-learning natural-language-processing regular-expression tweets

Last synced: 21 Jun 2025

https://github.com/liebsen/overlemon

Overlemon institutional application

data-analysis design devops sysadmin webdev

Last synced: 21 Jul 2025

https://github.com/capjamesg/personal-notebooks

Notebooks for personal experiments with machine learning and computer vision.

data-analysis machine-learning notebooks

Last synced: 03 Apr 2025

https://github.com/bamresearch/utah-saxs-tools

The Utah SAXS Tools (USToo), adapted for Python 3, originally by David P. Goldenberg, 2009-2012

data-analysis saxs small-angle-scattering small-angle-xray-scattering

Last synced: 17 Jan 2026

https://github.com/pkjjoshi/restaurants-analysis

Performed beginner-level EDA on a restaurant dataset using Python. Analyzed top cuisines, city-wise ratings, price ranges, and online delivery impact using Pandas and Matplotlib. Includes 4 well-structured notebooks with visual insights.

beginner-project data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas python restaurant-data seaborn

Last synced: 21 Jun 2025

https://github.com/teditae/data-analysis-with-pandas

Mini data science projects focused on Pandas-powered analysis.

data-analysis data-manipulation pandas python

Last synced: 30 Apr 2026

https://github.com/atharvkadammm/suicide-prediction-system

A machine learning project predicting suicide risk based on multiple socio-economic and environmental factors using data mining techniques.

csv data-analysis data-science data-visualization datamining exploratory-data-analysis feature-engineering machine-learnin matplotlib mental-health numpy pandas riskassesment seaborn sklearn suicide-prediction supervised-

Last synced: 01 Jul 2025

https://github.com/lavkalsi/tableau-project-stock-market-analysis

The Tableau Project: Stock Market Analysis features a dashboard that combines descriptive, diagnostic, predictive, and prescriptive analytics to provide insights into stock market trends. Using Python for data processing and an LSTM model for forecasting, this project visualizes historical and predicted stock prices, helping users make informed decisions.

dashboard data-analysis deep-learning lstm-model python tableau

Last synced: 18 May 2026

https://github.com/caprogs/paris-events-analyzer

A project to analyze events in Paris using open source data provided by the city.

data data-analysis data-platform dbt docker ingestion python streamlit transformation vizualisation

Last synced: 04 May 2026

https://github.com/rathod-shubham/google-data-analytics

Learning a wide range of skills that are useful in everyday life as well as in work as a data analyst.

data-analysis data-analysis-in-r data-analyst data-analyst-nanodegree data-analytics data-visualization google

Last synced: 03 Feb 2026

https://github.com/dsrodrigovieira/rossmannsales

This repository contains a project developed to practice data analysis and the application of regression models (supervised learning).

data-analysis data-science machine-learning python telegram-bot xgboost-regression

Last synced: 19 May 2026

https://github.com/atharvkadammm/calmlytic

An end-to-end machine learning project that predicts anxiety severity using classification models (Naive Bayes, Decision Tree, SVM, Logistic Regression, XGBoost), based on lifestyle, health, and behavioral features.

anxiety-prediction classification csv data-analysis data-preprocessing-and-cleaning data-science data-visualization ensemble-learning logistic-regression machine-learning-algorithms matplotlib mental-health numpy pandas python sci-kit-learn seaborn supervised-learning svm xgboost

Last synced: 21 Jun 2025

https://github.com/kevin-rsj/the-substance-sentiment-analysis

Analyzes Reddit users' comments on the film The Substance (2024) using NLP techniques. The average sentiment score was 0.19, and keywords such as "horror" and "like" stand out among the opinions.

data-analysis notebook python sentiment-analysis tableau visualization

Last synced: 19 May 2026

https://github.com/kianaasd93/faostat

Builds a multilayer perceptron model for forecasting the export value of crop products for a geographical region three years into the future.

agriculture data-analysis data-science faostat machine-learning ml multiplayer python rnn

Last synced: 19 May 2026

https://github.com/marcogdepinto/olympichistoryanalysis

Python visual analysis of the Olympic Games history. Kaggle gold medal with 15000+ views, 200+ upvotes and 100+ comments.

data-analysis data-science jupyter-notebook olympic-games python seaborn

Last synced: 29 Apr 2026

https://github.com/rezowanrahat/netflix_analysis

Data analysis of Netflix content using Python, Pandas, and Seaborn

data-analysis data-visualization netflix pandas python

Last synced: 07 May 2026

https://github.com/shrunga92/5g_qos_data_transformation_python

Resource Allocation in 5G Network Service

5g-nr data-analysis python

Last synced: 19 May 2026

https://github.com/kushagrakumar04/visual-age-distribution

A bar chart or histogram that depicts the distribution of a categorical or continuous variable, such as the age distribution or gender composition within a population. This graphical representation provides a clear overview of the data's patterns and trends.

data-analysis data-science google-colab

Last synced: 21 Jun 2025

https://github.com/jesusgomez-data/retail-sales-data-analysis

End-to-end retail sales data analysis project using SQL, SQLite and Python (Pandas). Includes data generation, KPIs and business insights.

data-analysis junior-data-analyst pandas portfolio-project python retail-analysis sql sqlite sqlite3

Last synced: 11 Apr 2026

https://github.com/saidabderrahmane/bus_line_supervision

Performance evaluation of the Saint-Sébastien bus line using real data to predict the number of passengers.

beautifulsoup4 data-analysis data-science deep-learning machine-learning python scraper sklearn

Last synced: 11 Apr 2026

https://github.com/jatin-s16/netflix_analysis

This project involves a comprehensive analysis of Netflix's movies and TV shows data using SQL. The goal is to extract valuable insights and answer various business questions based on the dataset. The following README provides a detailed account of the project's objectives, business problems, solutions, findings, and conclusions.

data-analysis excel postgresql sql

Last synced: 19 May 2026

https://github.com/thc1006/taiwan-ai-usage-index

Taiwan AI Usage Index (TAUI): an open-source data analysis framework for measuring and analyzing AI technology adoption across Taiwan's regions.

ai-adoption anthropic-index bilingual data-analysis human-ai-collaboration onet-classification open-source policy-analysis privacy-protection python research taiwan tdd usage-index visualization

Last synced: 03 Oct 2025

https://github.com/jedrzej-wydra/competition-cooperation

Competition, cooperation, and parental effects in larval aggregations formed on carrion by communally breeding beetles Necrodes littoralis (Staphylinidae: Silphinae)

data-analysis non-linear-regression r

Last synced: 20 Aug 2025

https://github.com/jidesamuell/data-analytics-projects

A repository I created to showcase my skills, share projects, and track my progress in data analytics.

data-analysis excel matplotlib powrebi python sql

Last synced: 04 May 2026

https://github.com/first-coding/aidanalyst

AIDAnalyst is an AI-powered data analysis tool that leverages large language models (LLMs) to generate SQL queries from natural language prompts. Upload CSV files, explore the data schema, and retrieve insights with ease. The system ensures error correction in SQL queries, delivering detailed reports and visualizations in a streamlined workflow

data-analysis llm openai prompt-engineering python

Last synced: 19 May 2026

https://github.com/vubacktracking/freecodecamp-data-analysis-with-python

Five projects from the Data Analysis with Python course on freeCodeCamp.

data-analysis freecodecamp freecodecamp-project python

Last synced: 19 May 2026

https://github.com/hamzacham/data_set-projet-8

Analyzing a real-world dataset with SQL and Python.

data-analysis database dataset jupyter-notebook paython sql

Last synced: 19 May 2026

https://github.com/bjornmelin/data-analytics-playground

🧐 Collection of academic data analytics projects showcasing exploratory data analysis, geographic visualization, and interactive dashboards.

data-analysis data-analytics data-visualization geographic-analysis ggplot interactive-maps leaflet r r-programming shiny tidyverse

Last synced: 06 Apr 2025

https://github.com/abdoomohamedd/data-science-projects

A collection of data science projects ranging from exploratory data analysis to predictive modeling and clustering. Each project is designed to solve specific problems or explore particular datasets using various data science techniques and tools.

data-analysis data-analysis-python data-cleaning data-science data-visualization machine-learning machine-learning-algorithms

Last synced: 14 May 2025

https://github.com/alpkanoz/ibm_data_science_professional_certificate

The repository contains projects and training materials carried out throughout the IBM data science professional course.

classification clustering data-analysis data-science data-visualization dataframe ibm ibm-watson machine-learning mathplotlib pandas predictive-modeling python scikit-learn

Last synced: 07 Mar 2026

https://github.com/jgohel9902/toronto-airbnb-snowflake

This project analyzes Airbnb listings in Toronto using Snowflake’s cloud data platform. It follows a Bronze → Silver → Gold medallion architecture and leverages Snowflake Cortex to generate AI-driven executive insights.

data-analysis python snowflake sql

Last synced: 07 Mar 2026

https://github.com/marlysson/craw

A system to show the data collected from various sources using chartjs - ⚡️

chartsjs data-analysis data-science web-scraping

Last synced: 21 Jun 2025

https://github.com/jyrki69pro/pdf-insight-agent

📄 Extract insights from PDFs effortlessly with this AI-powered summarizer, transforming documents into structured, actionable points.

agent-based-model agentic-ai agentic-workflow agents ai-agent data-analysis finance-management financial-analysis generative-ai langchain langgraph llama3 llm multiagent-systems pdf phidata python toolcalling

Last synced: 11 Apr 2026

https://github.com/idaraabasiudoh/drug_prescribtion_decision_tree_model

This repository contains a machine learning project focused on classifying drugs based on patient characteristics using a Decision Tree classifier. The project uses Python and popular data science libraries such as scikit-learn, pandas, and matplotlib.

data-analysis jupyter-notebook machine-learning python3 scikit-learn

Last synced: 10 Apr 2026

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025
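For readers unfamiliar with the running-totals idea in the Pivot Grid entry above, here is a minimal sketch in Python. It is not the DevExpress API: the function name `running_totals_by_row` and the sample data are hypothetical, and the sketch only assumes that each pivot row holds values in column order.

```python
from itertools import accumulate


def running_totals_by_row(pivot):
    """Return a copy of the pivot where each row holds cumulative sums.

    `pivot` maps a row label to a list of values in column order
    (hypothetical layout, chosen only for this illustration).
    """
    return {label: list(accumulate(values)) for label, values in pivot.items()}


# Hypothetical quarterly sales per product.
sales = {
    "Widgets": [10, 20, 5, 15],
    "Gadgets": [3, 0, 7, 2],
}

totals = running_totals_by_row(sales)
for label, row in totals.items():
    print(label, row)
```

Each cell then shows the sum of its own value and all preceding values in the same row, which is what a running total in a pivot table usually means.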

https://github.com/touppercase78/salary-prediction-collection

Salary predictions with ML models and analyses on datasets from several other GitHub repos

data-analysis data-visualization datasets machine-learning python3 regression-models

Last synced: 02 May 2026

https://github.com/ramonanf/tc1002s_semanatec

Computational Tools: The Art of Analytics

data-analysis data-visualization jupiter-notebook pandas-python

Last synced: 15 Jun 2025

https://github.com/bho0920/crime-data-analysis-eu

Crime Data Analysis for Self-Defense Tool Market Entry in the EU.

data data-analysis sql sqlite tableau

Last synced: 21 Jun 2025

https://github.com/eco786786/salaries

This analysis explores the factors influencing salaries for data professionals from 2020 to 2024, including job titles, experience levels, remote work ratios, employment types, company locations and sizes. Using data from Kaggle, the project uncovers trends and insights to guide both companies and professionals in the tech industry.

data-analysis git postgresql powerbi

Last synced: 19 May 2026

https://github.com/dmytrori/himalayan_expeditions

Himalayan expedition stats, 1905–2020

alpinism data-analysis data-visualization pandas-python

Last synced: 21 Jun 2025

https://github.com/mimi-netizen/python-and-machine-learning-in-financial-analysis

This comprehensive repository covers financial data analysis using Python and machine learning techniques, including time series modeling, portfolio optimization, risk assessment, credit risk prediction, and deep learning applications in finance.

data-analysis data-science data-visualization finance financial-analysis financial-data financial-modeling

Last synced: 19 May 2026

https://github.com/jabulente/tanzania-geographical-zones

This project provides a geospatial visualization of Tanzania's geographical zones and regions. It uses geospatial data to map each zone, display regions, and annotate them for easy identification. The visualizations include simulated data to demonstrate thematic mapping techniques.

ai data-analysis data-science data-visualization geopandas geospatial location matplotlib ml python tanzania tanzania-geographic tanzania-locations

Last synced: 19 May 2026

https://github.com/mysftz/statistics-analysis

A statistical analysis of a dataset and probability in Python.

data-analysis matplotlib python python3 statistical-analysis

Last synced: 29 Jun 2025

https://github.com/galahad20/b244006e_analisis_data

Data analysis project for the Dicoding course "Belajar Analisis Data dengan Python". I learned to analyze data and visualize it to extract meaningful insights.

data-analysis data-analytics python streamlit

Last synced: 06 Apr 2026

https://github.com/jabercrombia/invoice-tracker

An invoice tracker built with Next.js, using sample data and data visualizations.

data-analysis nextjs postgres shadcn vercel

Last synced: 07 Apr 2026

https://github.com/iamsainikhil/data-visualization

Visualization of Web data using Python

data-analysis data-visualization python webscraping

Last synced: 13 Jun 2026

https://github.com/srvcl/lung-cancer-survival-analysis

Data Cleaning of a dataset and Survival Analysis in R Language

data-analysis data-science data-visualization r survival-analysis

Last synced: 11 May 2026

https://github.com/ccoolbaugh/individualized_cooling_data_analysis

Matlab code to analyze data collected during a brown adipose tissue individualized cooling protocol.

brown-adipose-tissue cold-exposure data-analysis ibutton matlab skin-temperature thermoregulation

Last synced: 18 Aug 2025

https://github.com/madrury/hot-sauce

Simulation of a Hot Sauce Spiciness Dataset

data-analysis data-science data-visualization dataset machine-learning

Last synced: 16 May 2026

https://github.com/whisplnspace/insightgenie

InsightGenie is an AI-powered data analyst that lets you upload files, ask questions, and get insights with visualizations

data-analysis data-science data-visualization deployment gemini-api huggingface nlp

Last synced: 19 Jun 2025

https://github.com/lucasfloresc/final_project

This is the final project of the Ironhack Bootcamp. In this project I applied the methods and techniques learned in the Bootcamp, such as web scraping and API extraction, data cleaning and processing with Python, Python logic, machine learning, and data visualization. Everything is displayed in Streamlit for a more user-friendly interface.

data-analysis data-visualization machine-learning python streamlit webscraping

Last synced: 08 May 2026

https://github.com/madi-s/tennispredictor

Program to predict outcomes of major tennis matches.

data-analysis prediction-algorithm python scraper tennis webdriver

Last synced: 06 Jul 2025

https://github.com/mkk-1817/cvip-ds-exploratory_data_analysis-terrorism

This repository explores global terrorism trends by analyzing the Global Terrorism Database to uncover temporal patterns, identify top terrorist groups, examine attack types, and gain insights into geographical and success/failure dynamics.

coderscave data-analysis data-science data-visualization eda exploratory-data-analysis python terrorism-analysis

Last synced: 19 Jun 2025

https://github.com/jabulente/kruskall-wallis-test

This repository contains a project that provides a reusable Python function to perform the Kruskal-Wallis H-test across multiple continuous variables, grouped by a categorical feature.

data-analysis data-science eda hypothesis-tests kruskal-wallis kruskals-algorithm scipy-stats statistics

Last synced: 22 Jul 2025
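As an illustration of the idea in the Kruskal-Wallis entry above, the following is a minimal standard-library sketch, not the repository's code. It computes only the tie-corrected H statistic (no p-value) for several continuous variables grouped by one categorical field. The names `kruskal_wallis_h` and `kruskal_wallis_by_group`, and the row-of-dicts data layout, are assumptions made for this example.

```python
from collections import Counter, defaultdict


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kruskal_wallis_h(values, groups):
    """Tie-corrected Kruskal-Wallis H statistic for one variable."""
    n = len(values)
    rank_sums = defaultdict(float)
    counts = Counter(groups)
    for rank, group in zip(average_ranks(values), groups):
        rank_sums[group] += rank
    h = 12.0 / (n * (n + 1)) * sum(
        rank_sums[g] ** 2 / counts[g] for g in counts
    ) - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    ties = Counter(values)
    denominator = n ** 3 - n
    if denominator == 0:
        return float("nan")  # fewer than two observations
    correction = 1 - sum(t ** 3 - t for t in ties.values()) / denominator
    if correction == 0:
        return float("nan")  # all observations identical
    return h / correction


def kruskal_wallis_by_group(rows, group_key, value_keys):
    """Apply the H statistic to each listed variable, grouped by `group_key`."""
    groups = [row[group_key] for row in rows]
    return {
        key: kruskal_wallis_h([row[key] for row in rows], groups)
        for key in value_keys
    }


# Hypothetical data: two measurements per row, grouped by "site".
rows = [
    {"site": "A", "x": 1, "y": 1}, {"site": "A", "x": 2, "y": 2},
    {"site": "A", "x": 3, "y": 1}, {"site": "B", "x": 4, "y": 2},
    {"site": "B", "x": 5, "y": 1}, {"site": "B", "x": 6, "y": 2},
    {"site": "C", "x": 7, "y": 1}, {"site": "C", "x": 8, "y": 2},
    {"site": "C", "x": 9, "y": 1},
]
results = kruskal_wallis_by_group(rows, "site", ["x", "y"])
print(results)
```

To turn H into a p-value, it is usually compared against a chi-squared distribution with (number of groups - 1) degrees of freedom; that step is left out here to keep the sketch dependency-free.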

https://github.com/celineboutinon/lafleche-et-associes

OpenClassrooms Data Analyst 2022-2023 - Project 7 using KNIME Analytics Platform

data-analysis data-analytics data-visualisation knime-analytics-platform no-code rgpd

Last synced: 08 Feb 2026

https://github.com/lorinczakos/sql-projects

A collection of SQL scripts that I wrote, and that were approved, during the GoIT Romania Data Analyst course.

bigquery cte data data-analysis dbeaver marketing-analytics postgresql project-repository sql vscode

Last synced: 16 May 2026

https://github.com/carvalhoandre/coletor-tweets

Created to collect and store tweets using the Twitter API. Initially inspired by the use case in the book Um Voluntário na Campanha de Obama, this project aims to demonstrate the importance of monitoring on X. The collector can search for tweets on any desired term.

data-analysis mongodb python twiter-analysis twitter

Last synced: 19 May 2026

https://github.com/prasad-chavan1/bank_data_analysis_r

Bank data analysis in R language

data data-analysis data-science r

Last synced: 24 Feb 2025

https://github.com/nferno55/mock-data-governance

Working with messy data and using data quality practices to clean it up and practice SQL/Python automation. YAML will be used for Metadata validation soon.

data-analysis database-management metadata python sql sqlite3 yaml

Last synced: 16 May 2026

https://github.com/muneeb706/human_activity_recognition

This project performs data cleaning and data exploration steps for Human Activity Recognition Using Smartphones Data Set in R programming language.

data-analysis data-cleaning data-exploration r-programming

Last synced: 08 Aug 2025

https://github.com/chaganti-reddy/ai-prototype-customer-segmentation

Prototype of a product-based artificial intelligence model for customer segmentation in the e-commerce industry.

artificial-intelligence cluster-analysis customer-segmentation data-analysis machine-learning product-based prototype

Last synced: 13 Mar 2025

https://github.com/ujjwalll/econometrics_analysis_of_india_gdp_misestimation

An econometric analysis of India's GDP to determine whether there is any flaw in its estimation, as claimed by Dr. Arvind Subramanian.

coefficient-estimates data-analysis econometrics economics gdp india r statistics

Last synced: 31 Oct 2025

https://github.com/tabibyte/aoty-highest-rated-albums-data-analysis

Data Analysis of AOTY Highest Rated Albums

albums aoty data-analysis music

Last synced: 10 Sep 2025

https://github.com/sweta-kaundilya/911-calls-capstone-project

For this capstone project we will be analyzing some 911 call data from Kaggle.

data data-analysis data-visualization jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 28 Apr 2026

https://github.com/sweta-kaundilya/sql_projects_data_analytics

This repository contains SQL portfolio projects.

data-analysis mysql-database mysql-workbench

Last synced: 10 Sep 2025

https://github.com/al-ogr/sf_pr2_job_analysis_hh_sql

SkillFactory DataScience PROJECT-2. Analysis of job vacancies from HeadHunter.

data-analysis data-science ipynb plotly python sql

Last synced: 19 May 2026