An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/celineboutinon/lafleche-et-associes

OpenClassrooms Data Analyst 2022-2023 - Project 7 using KNIME Analytics Platform

data-analysis data-analytics data-visualisation knime-analytics-platform no-code rgpd

Last synced: 08 Feb 2026

https://github.com/mkk-1817/cvip-ds-exploratory_data_analysis-terrorism

This repository explores global terrorism trends by analyzing the Global Terrorism Database to uncover temporal patterns, identify top terrorist groups, examine attack types, and gain insights into geographical and success/failure dynamics.

coderscave data-analysis data-science data-visualization eda exploratory-data-analysis python terrorism-analysis

Last synced: 19 Jun 2025

https://github.com/bho0920/crime-data-analysis-eu

Crime Data Analysis for Self-Defense Tool Market Entry in the EU.

data data-analysis sql sqlite tableau

Last synced: 21 Jun 2025

https://github.com/jgohel9902/toronto-airbnb-snowflake

This project analyzes Airbnb listings in Toronto using **Snowflake’s cloud data platform**. It follows a **Bronze → Silver → Gold** medallion architecture and leverages **Snowflake Cortex** to generate **AI-driven executive insights**.

data-analysis python snowflake sql

Last synced: 07 Mar 2026

https://github.com/rezowanrahat/netflix_analysis

Data analysis of Netflix content using Python, Pandas, and Seaborn

data-analysis data-visualization netflix pandas python

Last synced: 07 May 2026

https://github.com/atharvkadammm/suicide-prediction-system

A machine learning project predicting suicide risk based on multiple socio-economic and environmental factors using data mining techniques.

csv data-analysis data-science data-visualization datamining exploratory-data-analysis feature-engineering machine-learnin matplotlib mental-health numpy pandas riskassesment seaborn sklearn suicide-prediction supervised-

Last synced: 01 Jul 2025

https://github.com/pkjjoshi/restaurants-analysis

Performed beginner-level EDA on a restaurant dataset using Python. Analyzed top cuisines, city-wise ratings, price ranges, and online delivery impact using Pandas and Matplotlib. Includes 4 well-structured notebooks with visual insights.

beginner-project data-analysis data-visualization exploratory-data-analysis jupyter-notebook pandas python restaurant-data seaborn

Last synced: 21 Jun 2025

https://github.com/vedantshi/tableau-bike-data-dashboard

London Bike Rides Analysis explores bike usage patterns using data visualization and machine learning. It identifies trends through a dynamic moving average, analyzes weather impact with heatmaps, and provides actionable insights via an interactive Tableau dashboard. Tools: Python, Tableau.

data-analysis data-visualization python tableau weather-data

Last synced: 16 May 2026

https://github.com/balajimohan18/loan-clustering-datascience-project

This project uses machine learning to cluster loans based on their similarities. It uses a dataset of loan applications that includes information about the loan amount and balance, then applies a clustering algorithm to group similar loans together.

clustering-algorithm data-analysis data-science data-visualization eda kmeans-clustering machine-learning sql unsupervised-learning

Last synced: 27 Jul 2025

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 11 Jul 2025

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Gluconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 23 Jun 2025

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 03 Apr 2026

https://github.com/gappeah/credit-card-transactions-fraud-detection-project

The Credit Card Transactions Fraud Detection Project repository is designed to analyse and detect fraudulent transactions in credit card data.

data-analysis postgresql sql

Last synced: 12 Jul 2025

https://github.com/myktorijus/retention-cohort

Extracted cohort data using SQL in BigQuery focusing on weekly retention from week 0 to week 6

bigquery data-analysis data-visualization powerbi sql

Last synced: 13 Jul 2025

https://github.com/jhermienpaul/google-data-analytics-program

Hands-on learning materials from the 8-course Google Data Analytics Professional Certificate program, covering foundational data skills, tools, and real-world business problem-solving

bigquery dashboard data-analysis data-analytics data-modeling data-storytelling data-visualization data-wrangling descriptive-analytics diagnostic-analytics etl-pipeline r-programming rstudio sql tableau

Last synced: 13 Jul 2025

https://github.com/percival33/machine-learning-engineering

University project on enhancing a fictional music streaming service by developing machine learning models to generate popular playlists

data-analysis data-science machine-learning python

Last synced: 14 Jul 2025

https://github.com/bonelesswater/tradingbot

This project is a web application for a trading bot that displays financial data and indicators. It includes functionality for researching financial data, displaying market indicators, and more.

ai azure css d3 data-analysis django html javascript jquery materializecss python stock-market

Last synced: 30 Dec 2025

https://github.com/theo-jenkins/fmri-brain-scan-analyser

MATLAB toolkit for reading, analysing and simulating rs-fMRI brain scans in .nii format.

algorithms data-analysis data-visualization fmri-data-analysis matlab neuroimaging

Last synced: 15 Jul 2025

https://github.com/priyanshubiswas-tech/farmlab-report-and-case-study-iot

This project was developed through live interviews and case studies with farmers in the year 2023 to address key agricultural challenges. The device provides real-time farm insights for better decision-making. Future plans include a digital portal, increased range, more sensors, and improved design. Open to collaboration!

arduino-ide c case case-study data data-analysis iot iot-device serialization

Last synced: 15 Jul 2025

https://github.com/jarrarshahid/nutrition-calculator

Simple Python app to calculate the nutritional content of everyday meals.

data-analysis health json jupyter-notebook logic-programming python

Last synced: 15 Jul 2025

https://github.com/vishnu-vamshii/layoffs-data-analysis-in-sql

This project focuses on the cleaning and exploratory analysis of a dataset containing layoff information. It includes data deduplication, standardization of columns, handling null and blank values, and analyzing layoffs by company, industry, country, and date. Various SQL queries are used to explore trends and patterns in layoffs over time.

data-analysis eda mysql

Last synced: 15 Jul 2025

https://github.com/viper373/lol-dataanalytics

Tencent Games: comprehensive data analysis and prediction for League of Legends tournaments, 2020/21/22

crawler-python data-analysis jupyter-notebook lol python spider

Last synced: 15 Jul 2025

https://github.com/shrutiijoshi/airbnb-listing-reviews

Airbnb is an online marketplace that connects people who want to rent out their homes with travelers seeking accommodations.

data-analysis matplotlib-pyplot pandas-python python seaborn

Last synced: 17 May 2026

https://github.com/gagan8605/zepto_sql_analysis

This project explores and analyzes the inventory data of Zepto, a rapidly growing 10-minute grocery delivery platform in India. The dataset contains over 3,000 SKUs across key product categories such as Fruits & Vegetables, Dairy, Beverages, Packaged Foods, and more. The analysis was performed using PostgreSQL, covering both data cleaning and bus

cleaning-data data-analysis database-management postgresql sql

Last synced: 16 Jul 2025

https://github.com/venkat-023/thyroid-cancer-prediction

This project aims to develop a machine learning pipeline to predict thyroid cancer based on patient data. The dataset was sourced from multiple public repositories, cleaned, and merged to create a comprehensive dataset for modeling. Various classification algorithms were implemented, including Random Forest, Logistic Regression, K-Nearest Neighbors

data-analysis data-cleaning deep-learning ensembling-methods hyperparameter-tuning machine-learning-algorithms nueral-networks

Last synced: 17 May 2026

https://github.com/ofir-frd/predict-success-of-a-restaurant

Apply machine learning to a restaurant database. Study and analyse the data to predict whether a restaurant will succeed.

data-analysis data-science machine-learning visualization

Last synced: 11 Jun 2026

https://github.com/cyblx/clustering

This project explores clustering techniques and supervised learning applied to World Cup team performance analysis. The methodologies include K-Means, DBSCAN, K-Nearest Neighbors, Gaussian Mixture Models (GMM), and Agglomerative Clustering.

clustering data-analysis dbscan gmm kmeans supervised-learning unsupervised-learning world-cup

Last synced: 18 Jul 2025

https://github.com/theveryhim/web-scraping-and-statistical-tests

Crawling the web for data and performing statistical tests to verify judgments

data-analysis hypothesis-testing web-scraping

Last synced: 18 Jul 2025

https://github.com/mh0386/motorcycle_data_analysis

Data analysis applied to motorcycle dataset.

data-analysis

Last synced: 19 Jul 2025

https://github.com/malexandersalazar/tools-python-mssql-statistics-descriptor

A lightweight tool based on sweetviz that generates high-density visualizations to kickstart Exploratory Data Analysis within Microsoft Azure SQL Database using ODBC with just one line of code

azure-sql-database data-analysis data-visualization eda python

Last synced: 16 May 2026

https://github.com/preciousclement/maternal-experiences-in-nigeria

This repository contains a Python-based project that generates realistic synthetic data simulating the maternal health journey of 5,000 women in Nigeria.

data-analysis data-generation maternal-health nigeria public-health python

Last synced: 08 May 2025

https://github.com/jm199504/data-analysis-practice

Data analysis practice (Titanic / BankCustomers)

data-analysis python

Last synced: 02 May 2026

https://github.com/sharoonjoseph321/social_media_eda

Data analysis of social media apps using pandas, Python, and Matplotlib.

data data-analysis data-science data-visualization matplotlib programming-language project python pythonprojects

Last synced: 03 Mar 2025

https://github.com/josafary-ds/curso_dnc

Repository for storing study files and projects from DNC - Data Scientist

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 13 Mar 2025

https://github.com/iamber12/stack-overflow-analysis-using-stack-exchange-api

This Python-based project utilizes the Stack Exchange API to analyze StackOverflow data, focusing on the 'R' and 'Dot Net' programming tags.

data-analysis data-visualization python stack-exchange-api

Last synced: 20 Jul 2025

https://github.com/leoz0214/foodhygieneanalysis

Data analysis regarding Food Hygiene Ratings in England, Wales and Northern Ireland.

data-analysis food-hygiene-ratings pandas python

Last synced: 17 May 2026

https://github.com/dhruvil-26/sql-projects

This repository contains SQL projects focusing on data analysis and insights. Currently, it includes: 1. RSVP Movies Analysis - SQL queries to analyze movie trends, ratings, and genres. 2. Pizza Sales Analysis - SQL queries to explore sales patterns, customer behavior, and profitability in a pizza business.

analysis data-analysis database mysql pizza-sales-analysis rdbms rsvp sql

Last synced: 17 May 2026

https://github.com/aelmah/ibm-applied-ds

A collection of projects I completed throughout the Applied DS Specialization.

applied-data-science-capstone beautifulsoup data-analysis data-visualization machine-learning python-for-ai-and-data-science web-scraping

Last synced: 11 Sep 2025

https://github.com/wesleych3n/my-work-log

A personal project to record and analyze work check-in/check-out times in Google Sheets with a Telegram bot.

data-analysis telegram-bot worklog

Last synced: 20 Jul 2025

https://github.com/sangampaudel530/bhutan-rainfall-explorer

Interactive dashboard to explore, analyze, and forecast rainfall trends in Bhutan (2021–2025) using Streamlit, Plotly, and Prophet.

bhutan climate-change data-analysis prophet-facebook rainfall-prediction streamlit visualization

Last synced: 17 May 2026

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of passengers on the Titanic using the Kaggle Titanic Disaster Dataset. The dataset contains information related to passengers, such as age, gender, and class. Different machine learning algorithms have been applied for this predictive model to accomplish an accurate prediction that will define the survival chances

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 07 Feb 2026

https://github.com/anandanraju/sql_data_analysis_projects

These two projects analyze pizza sales data and Walmart sales data using SQL to identify insights and trends. The aim is to use data-driven approaches to understand sales performance, identify key factors influencing sales, and provide actionable recommendations for business improvement.

csv data-analysis data-management mysql pizza sql sql-schema walmart

Last synced: 24 Jun 2025

https://github.com/monish-nallagondalla/cement_strength_prediction

The Cement Strength Prediction project uses machine learning to predict the compressive strength of cement based on its components, such as Cement, Fly Ash, Water, Superplasticizer, Coarse Aggregate, Fine Aggregate, and Age. The goal is to forecast compressive strength (MPa) for optimized cement production and quality control.

cement-strength-prediction construction-industry data-analysis data-preprocessing data-science data-visualization feature-engineering machine-learning predictive-modeling python regression-analysis scikit-learn

Last synced: 11 May 2026

https://github.com/edumoraes1/journey_active_users

Customer base segmentation via SQL for the active sellers journey

bq data-analysis salesforce sql

Last synced: 02 Feb 2026

https://github.com/collins-kimotho/communicate-data-findings

Data Analysis Project: Investigating Factors Contributing to No-Show Appointments in Medical Records

data-analysis data-science data-visualization dataset pandas python

Last synced: 17 May 2026

https://github.com/gemaquejr/restaurant-orders

Project aimed at applying OOP concepts and working with Set, Hashmap, and Dict. This project was created for the final assessment of section 06 of the computer science module of the Web Development Course at Trybe.

data-analysis dict hashmap poo python set

Last synced: 30 Oct 2025

https://github.com/tinaland101/python-api-challenge

This project involves analyzing weather data from cities around the world using the OpenWeatherMap API and creating visualizations to explore the relationship between weather variables and latitude.

api-integration-and-data-retrieval data-analysis data-collection-and-geospatial-analysis problem-solving-and-decision-making statistical-analysis

Last synced: 03 Mar 2025

https://github.com/arv-anshul/pw-api

Perform data analysis on PW Skills APIs. Made a web app using streamlit. See any course syllabus, analytics, quizzes and assignments.

api course data-analysis ineuron-ai physics-wallah project pw-skills python3 streamlit

Last synced: 18 Apr 2026

https://github.com/lauratrigo/codigo_roti

Análise de ROTI (ROTI Analysis) is a MATLAB tool for processing and visualizing ionospheric data (ROTI) from multiple GNSS stations. Developed for space geophysics research, the script generates comparative time-series plots with quality filters and missing-data handling. 📡

data-analysis geophysics image-processing matlab roti scientific-initiation

Last synced: 24 Jun 2025

https://github.com/mahdikh03/custumers_clustering_rmf

A data analysis project to implement RFM (Recency, Frequency, Monetary) analysis for customer segmentation and behavior analysis using the K-Means algorithm.

customer-segmentation data-analysis k-means-clustering unsupervised-learning

Last synced: 09 May 2025

https://github.com/niaid/genetic-linkage-analysis

Materials for ACE course on Genetic Linkage Analysis.

ace ace-uganda2020 analysis bcbb-training clinical data-analysis genetics ngs ngs-analysis

Last synced: 24 Jun 2025

https://github.com/muneeb1030/webscrapper_altnews

The project utilizes a combination of Python, Scrapy, and Selenium to navigate through the dynamic content of AltNews.in and collect valuable information for analysis and verification.

data-analysis data-collection python3 scrapy scrapy-spider selenium selenium-python

Last synced: 17 May 2026

https://github.com/macorisd/instagram-fake-account-analysis

A project in R focused on detecting fake Instagram accounts. It includes exploratory data analysis, data visualization, and analysis using three techniques: association rules, formal concept analysis, and regression. The results are presented in an interactive Quarto book.

data-analysis data-science data-visualization r

Last synced: 10 Jun 2025

https://github.com/shaikh-raj/data-science-portfolio

Data Science Portfolio of Raj Shaikh, including case studies and articles I have completed that solve various business problems.

articles case-study data-analysis deep-learning machine-learning nlp statistics

Last synced: 20 Jul 2025

https://github.com/arction/lcjs-example-0507-dashboardfiberanalysis

A demo application showcasing the use of LightningChart JS to visualize fiber analysis data.

area-plot area-series chart charts dashboard data-analysis demo heatmap javascript lcjs lightningchart-js performance visualization webgl

Last synced: 12 Mar 2025

https://github.com/soumasish2005/ai-chatbot-using-snowflake

This project is a Streamlit application that allows users to upload a CSV file and ask questions about their data in natural language.

cloud data-analysis data-science data-visualization python snowflake streamlit

Last synced: 17 May 2026

https://github.com/srinibas-masanta/deloitte-forage-virtual-internship

This repository contains my work from the Deloitte Forage Virtual Internship, where I analyzed factory telemetry data in Tableau to identify machine breakdown patterns and assessed gender pay equality using Excel. From interactive dashboards to insightful classifications, this project showcases hands-on data analysis and visualization skills. 🚀📊

data-analysis data-visualization deloitte excel forage tableau

Last synced: 15 Jan 2026

https://github.com/rohitdusane/interactive-ibd-analysis-dashboard-with-dash-plotly

This repository showcases a project that combines data analysis and visualization through Dash and Plotly. The goal of this project is to offer an efficient and user-friendly way to integrate robust data analysis with an interactive web-based interface.

clinical-research data-analysis exploratory-data-analysis pyhton statistical-reports

Last synced: 24 Jun 2025

https://github.com/sotirismos/pattern-recognition-labs

Lab exercises and quizzes for Pattern Recognition course, Auth winter semester 20-21

classification clustering data-analysis machine-learning pattern-recognition

Last synced: 17 Jun 2025

https://github.com/rachelresende/regressaolinear

This repository is for the linear regression classes I took in a Udemy course on the subject in 2025. It was a refresher course, as I also studied this topic in 2020 in a statistics course from Alura.

data-analysis data-science linear-regression

Last synced: 11 Sep 2025

https://github.com/mainak-97/pizza-sales-analysis-project

Pizza Sales Analysis Project: This project optimizes a pizza restaurant's operations by analyzing demand patterns, revenue, and efficiency, providing insights to enhance profitability, streamline production, and improve customer satisfaction.

business-analytics business-intelligence dashboards data-analysis operations-optimization peak-hours power-bi restaurant-analysis revenue-analysis

Last synced: 06 Jan 2026

https://github.com/parth-jatav/ipl-data-analysis-mentorness

This project uses Power BI to analyze IPL cricket data, featuring dashboards with insights on batting averages, strike rates, and player roles. It identifies the top 11 players and includes navigable pages focused on specific roles like Anchors, Finishers, and All-Rounders.

dashboard data-analysis ipl ipl-dashboard powerbi

Last synced: 07 Mar 2026

https://github.com/rahulsm20/trackbyte

A full-stack web application that helps users keep track of their playlist and provides analytics based on their music taste. Built using React, Node.js, Express.js, MySQL and Bootstrap.

bootstrap data-analysis expressjs mysql nodejs reactjs sql

Last synced: 07 Apr 2026

https://github.com/pylena/movies-prediction

This project focuses on clustering movies based on their genres using machine learning techniques. By analyzing genre data, the model groups similar movies together, facilitating recommendations and insights into genre-based patterns.

data-analysis machine-learning render streamlit unsupervised-learning

Last synced: 18 May 2026

https://github.com/judyway2/de-data

A brief analysis on schools ARR data

data-analysis jupyter-notebook

Last synced: 11 May 2025

https://github.com/natanel567/university_machine_learning_project

Machine Learning final project Tel Aviv University

data-analysis jupyter-notebook machine-learning

Last synced: 11 May 2025

https://github.com/prakhar-code/british_airways_review_analysis

Analysis of the British Airways Reviews by Customers, filtered by several different factors such as food, entertainment, services, etc.

data-analysis data-cleaning excel tableau-dashboards tableau-public tableau-visualization

Last synced: 15 Jan 2026

https://github.com/ziaeemehr/neuro_toolbox

Single-header C++ library for analysis of neurophysiological and simulated data.

data-analysis data-science signal-processing synchronization

Last synced: 21 Jul 2025

https://github.com/mfakhriazhar/stock-price-prediction

Stock prices are highly volatile and influenced by various factors, making accurate prediction a major challenge in investment decisions.

data-analysis data-science deep-learning python recurrent-neural-networks

Last synced: 18 May 2026

https://github.com/jasonsu131/cps188-term-project

A data analysis program developed in C to extract information about diabetic patients across Canada from a governmental spreadsheet available online. The program showcases summaries and averages based on the extracted data.

c data-analysis data-statictics file-reading

Last synced: 28 Mar 2025

https://github.com/velut/thesis-sw

Software and datasets used in the "Cost-effective and Scalable Activity Matching using Crowdsourcing" thesis

bpmn cost crowdflower crowdsourcing data-analysis dataset performance-analysis plotting-algorithms r thesis

Last synced: 19 Jun 2025

https://github.com/mfakhriazhar/ecom-qtt-prediction

In e-commerce, understanding seasonal sales trends and best-selling products is critical to business strategy. However, companies often struggle with predicting sales, determining factors that influence sales (discounts, product categories, locations), and optimizing stock and marketing.

data-analysis data-science data-visualization e-commerce-project eda machine-learning python

Last synced: 19 May 2026

https://github.com/c17an/data-analysis-exercise

Data analysis training ground

data-analysis python3

Last synced: 05 Apr 2025

https://github.com/sabdikay/telco-customer-churn-analysis-ibm-dataset

This project explores customer churn trends for a company in California using an IBM dataset. Built in a Jupyter Notebook, it employs pandas, NumPy, matplotlib, seaborn, plotly, and scipy to clean, analyze, and visualize data. Through statistical tests and interactive maps, it uncovers key drivers behind customer cancellations

business-intelligence customer-churn data-analysis data-analysis-python data-visualization exploratory-data-analysis jupyter-noteboook matplotlib numpy pandas plotly predictive-modeling python scipy seaborn statistical-analysis

Last synced: 07 Apr 2026

https://github.com/annaanastasy/classification-project-student-grades

A machine learning project to predict students' academic performance using features like demographics, study habits, and parental involvement, achieving 74% accuracy with the CatBoost model.

catboost-classifier classification data-analysis data-visualization machine-learning-algorithms predictive-modeling

Last synced: 29 Mar 2025

https://github.com/sparkerdata/hockeyshotmap

Interactive Streamlit app for NHL shot maps & player analysis. Pulls live (or demo) play-by-play data, normalizes rink coordinates, and visualizes shots with context filters (strength, period, player).

data-analysis data-visualization duckdb hockey hockey-analytics ice-hockey nhl nhl-data python sports sports-analytics

Last synced: 18 May 2026

https://github.com/dacosmicgiant/marketing-sms-analyser

Mini project for R language, Semester V

data-analysis r shiny

Last synced: 21 Mar 2025

https://github.com/kammarah/studentdata

I created & deployed a Streamlit app to store, manage & analyze student data. 📊🎓

connection data data-analysis data-visualization deploy deployments libraries python streamlit streamlit-webapp webapp

Last synced: 18 May 2026

https://github.com/stefagnone/data_storyboarding_visualization

Data Storyboarding and Visualization Techniques for Effective Communication

data-analysis data-visualization ggplot2-analysis r tableau-dashboards

Last synced: 05 Apr 2025