An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/projects-developer/full-stack-network-intrusion-detection-system-using-machine-learning

The project aims to design and develop a full-stack network intrusion detection system using machine learning techniques. It includes source code, PPT, synopsis, report, documents, base research paper, and video tutorials.

algorithms computerscienceproject cybersecurity data-analysis full-stack-development intrusion-detection-system machine-learning network-intrusion-detection network-security web-development

Last synced: 14 Feb 2026

https://github.com/fbarffmann/car_price_prediction

Predicted used car prices with a Random Forest model (R² = 0.96) using Python. Analyzed 2,000+ listings and visualized trends with Tableau.

car-price-prediction data-analysis machine-learning pandas python random-forest regression sklearn tableau

Last synced: 13 Apr 2026

https://github.com/hlexnc/project-arepo

Data-driven stroke risk assessment & personalized recommendations, powered by machine-learning and an NLU-driven chatbot.

chatbot data-analysis docker docker-compose machine-learning nlu-chatbot python rasa scikit-learn sklearn streamlit

Last synced: 15 Feb 2026

https://github.com/l1ght14/customer-churn-prediction

Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.

churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom

Last synced: 09 May 2026

https://github.com/risdorn/restaurant-delivery-platforms-analysis-bdm-project

This project analyzes restaurant delivery platforms to understand customer preferences, industry competition, and expansion opportunities. Conducted as part of the BDM project from IITM, it includes descriptive stats, distribution, correlation, regression, and geospatial analysis using multiple datasets.

data-analysis data-visualization jupyter-notebook kaggle

Last synced: 15 Feb 2026

https://github.com/alinababer/covid19-timeseries-cases-and-deaths-forecasting-

This study is based on confirmed COVID-19 cases and deaths collected from Pakistan. We apply AI-based forecasting models such as ARIMA, LSTM, Prophet, and VAR. Results demonstrate the promising potential of time series models in forecasting COVID-19 cases and highlight their superior performance compared to the LSTM.

arima covid-19 data-analysis data-science data-visualization fbprophet forecasting lstm rnn time-series var vectorautoregression

Last synced: 19 Jun 2026

https://github.com/abhisek-13/whatsapp-chat-analyzer

The WhatsApp Chat Analyzer is a data analysis project that provides insights into WhatsApp chats. It analyzes chat data to show metrics like the number of lines, most used letter, chatting duration, media files shared, most used emojis, and group member activity. The results are displayed on a user-friendly dashboard built with Streamlit.

data-analysis data-mining data-visualization eda machine-learning machine-learning-algorithms matplotlib numpy pandas python seaborn sklearn

Last synced: 13 Apr 2026

https://github.com/faisal-khann/ipl-analysis

The IPL Analysis project is a comprehensive data-driven exploration of the Indian Premier League (IPL), analyzing historical match data to uncover patterns in team performance, player statistics, and match outcomes.

data-analysis exploratory-data-analysis jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 08 May 2026

https://github.com/sarvesh2304/stellarator_simulation

A comprehensive Julia package for stellarator fusion reactor physics analysis featuring 3D magnetic field calculations, neoclassical transport modelling, quasi-isodynamic optimisation algorithms, and interactive 3D visualisations. Includes tokamak comparison framework and high-resolution plotting capabilities for fusion research.

3d-visualisation data-analysis field-line-tracing fusion-physics fusion-research interactive-3d julia magnetic-confinement magnetic-field-calculations magnetic-surfaces matplotlib neoclassical-transport numerical-methods optimisations physics-simulation plasma-physics plotly quasi-isodynamic stellarator stellarator-optimization

Last synced: 09 Oct 2025

https://github.com/drod75/burger_king_analysis

A simple analysis of a Burger King dataset.

data-analysis data-visualization jupyter-notebook pandas python seaborn

Last synced: 09 May 2026

https://github.com/sorebit/pdrpy-pd-2

Data analysis of various stackexchange.com archives.

data-analysis stackexchange time-travel university-project

Last synced: 08 Oct 2025

https://github.com/rizkipragustono/data_analysis_spark

Exploration: Data Analysis using Spark

apache-spark data-analysis pyspark python spark-sql sql

Last synced: 09 May 2026

https://github.com/k-bloch/car-theft-analysis

A dashboard created to inform the public about car theft, providing insights extracted from real-world police stats.

data-analysis maven-analytics tableau

Last synced: 19 Mar 2026

https://github.com/devexpress-examples/aspxpivotgrid-group-date-time-values

This example shows how to group date-time values in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 01 Mar 2026

https://github.com/kmranrg/bikeshare

A project based on data analysis.

data-analysis python

Last synced: 08 Oct 2025

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 09 May 2026

https://github.com/marielachirinosr/pandas-weather-project

Pandas Weather Data. Explore straightforward Python scripts for weather information analysis.

data-analysis pandas python

Last synced: 29 Apr 2026

https://github.com/mahapeth/invest-track

Implementation of a tool for monitoring user activity in the "Invest" information system, for a final qualifying thesis (VKR) in the field 01.03.02 Applied Mathematics and Computer Science.

analitycs app data-analysis data-visualization jupyter-notebook python sites

Last synced: 20 Jun 2026

https://github.com/grishmahat/discord-data-cli

A terminal UI tool to analyze your Discord data export, built in Rust.

cli data-analysis discord discord-data ratatui rust terminal tui

Last synced: 01 Mar 2026

https://github.com/an4pdm/relatorio-de-vendas

This project was built with the tools offered by Power BI in order to improve my knowledge of ETL. The data used came from the Kaggle website.

data-analysis data-visualization database etl powerbi

Last synced: 20 Jun 2026

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 09 May 2026

https://github.com/isaacmaffeis/imad-2023

Model Identification and Data Analysis (IMAD) | University course

data data-analysis data-science model model-identification

Last synced: 09 May 2026

https://github.com/pranavsp108/market_basket_analysis-instacart

Customer segmentation and market basket analysis using the Instacart dataset with Python, Pandas, and K-Means clustering.

customer-segmentation-and-buying-behavior data-analysis data-visualization instacart jupyter-notebook kmeans-clustering market-basket-analysis pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/yash22222/pwc-power-bi-virtual-case-experience

The Power BI PwC Virtual Case Experience is an exciting and educational program designed to provide participants with hands-on exposure to Power BI, a prominent business intelligence and data visualization tool, within the context of consulting at PwC.

business-analyst business-analytics business-intelligence dashboard data-analysis data-analyst data-analytics dax microsoft-power-bi powerbi powerbi-dashboards powerbi-visuals pwc

Last synced: 02 Mar 2026

https://github.com/badranalyst/covid-deaths-dashboard-with-tableau

This project showcases an interactive dashboard developed in Tableau to visualize COVID-19 deaths data. It provides insights into trends, geographical distributions, and key metrics related to mortality during the pandemic. The dashboard aims to enhance understanding of the data, supporting public health analysis and decision-making.

covid-19 dashboard data data-analysis data-visualization dataset tableau tableau-dashboards visualization

Last synced: 02 Mar 2026

https://github.com/zxjahid/matplotlib

A comprehensive guide to mastering data visualization with Matplotlib through hands-on examples and advanced techniques. 🚀📊

candlestick candlestick-chart cheatsheet data-analysis data-visualization gtk jupyter-notebook maps matplotlib-python pandas thesis-template tk tutorial wx

Last synced: 09 May 2026

https://github.com/anuppm9917/data-processing-and-csv-to-json-using-python-project

This project guides you through processing data from CSV to JSON format using Python. You'll learn to cleanse, validate, and transform data with pandas, numpy, csv, and json libraries, ensuring it's ready for POS system integration. This will help improve data integrity and streamline integration.

csv-files data data-analysis data-cleaning data-collection data-transformation data-validation python3 transformation

Last synced: 16 Apr 2026

https://github.com/dcs-training/r-visualisation-and-stats

This repository contains material from an eight-class course on data visualisation and statistics with R.

data-analysis data-visualisation data-wrangling intro-to-programming r statistics

Last synced: 20 Jun 2026

https://github.com/jjfiv/csc212spellchecking

Data Structure Analysis for Spell Checking

data-analysis smith-csc212

Last synced: 03 Mar 2026

https://github.com/jooapa/bytebrother

Byte Brother is watching YOU

data data-analysis security

Last synced: 26 Jan 2026

https://github.com/tatilimongi/first_python_project

This repository contains a case study on spreadsheet automation in Python for analyzing car sales by manufacturer over the years.

data-analysis email-sending file-manipulation graphical-visualization spreadsheet-automation

Last synced: 26 Mar 2025

https://github.com/aymanmomin/excel-coffee-data-analytics-exploring-coffee-orders-dataset

This project utilizes a coffee orders dataset to perform comprehensive data analytics and gain insights into customer preferences, popular items, and sales trends. The analysis aims to provide valuable information for coffee shop owners and enthusiasts, facilitating data-driven decision-making and improved customer satisfaction.

data-analysis data-visualization excel project

Last synced: 18 Jan 2026

https://github.com/1401dev/customer-lifetime-value-prediction

A data science project leveraging Python and Scikit-Learn to build predictive models that estimate customer lifetime value (CLV). Includes data cleaning, feature engineering, and model selection to identify key drivers of CLV, supporting strategic decision-making in customer retention and marketing.

clv clv-analysis customer-retention data-analysis dataprocessing feature-engineering machine-learning marketing-analytics predictive-modeling python regression-analysis scikit-learn

Last synced: 06 May 2026

https://github.com/omarsolieman/socialgiveawaydataanalysis

This project involved cleaning, analyzing, and processing data from an Instagram giveaway to ensure a fair and data-driven winner selection process. The primary goal was to automate the process of identifying valid entries, weighting them based on engagement (likes and multiple entries), and performing a post-giveaway analysis.

data-analysis data-science data-visualization instagram scraping threejs

Last synced: 14 May 2026

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 07 Oct 2025

https://github.com/gabboraron/biostatisztika_es_alkalmazasai

"A statisztika a matematika azon ága, melynek feladata, hogy eszközt adjon a politikusok kezébe, mellyel tetszőleges állítás és annak ellentéte is tudományos alapon igazolható"

biostatistics data-analysis data-visualization r statistics statistics-course

Last synced: 24 Oct 2025

https://github.com/grindelfp/logistic-regression-study

An example of logistic regression data analysis, with an exercise on it.

data-analysis ipynb logistic-regression python

Last synced: 03 Mar 2026

https://github.com/roydevashish/algo8.ai-data-manipulation-assignment

This assignment performs transaction-level sales data analysis and generates reports using Pandas / SQL / Spark inside a containerized environment. The dataset contains sales transaction records and is used to analyze SKUs, customers, and sales representative performance.

data-analysis duckdb python3 sql uv

Last synced: 15 May 2026

https://github.com/lintangwisesa/ujian_analyticsvisualization_jcds07

Exam question guide for Data Analytics & Visualization, Job Connector Data Science batch 7.

data-analysis data-science data-visualisation exam

Last synced: 04 Mar 2026

https://github.com/adrianlardies/feelms_predict_by_emotion

Feelms is a mood-based movie recommendation app that uses collaborative filtering and machine learning to suggest films based on your emotions. Built with Streamlit and powered by AWS, Feelms personalizes each user's experience through simulated interactions and tailored predictions.

aws-ec2 aws-rds data-analysis data-science machine-learning python streamlit

Last synced: 16 Apr 2026

https://github.com/edanur-y/bank-customer-churn-prediction-with-classification-models

Comparing the performance of multi-layer perceptron, decision tree, random forest, gradient boosting, and extreme gradient boosting classifiers on customer data to predict whether customers will exit the bank.

data-analysis data-transformation hyperparameter-tuning python

Last synced: 16 Apr 2026

https://github.com/kosuri-indu/allaboutolympics

All About Olympics is an interactive dashboard presenting comprehensive data and insights on Olympic Games from 1896 to 2020.

data-analysis pandas plotly python streamlit

Last synced: 16 Apr 2026

https://github.com/ronaessi-28/sales-data-analysis-visualization-project

A comprehensive data analysis and visualization project using Python, Pandas, Matplotlib, Seaborn, and Streamlit. The project explores Superstore sales data to uncover trends, region-wise performance, product category insights, and builds an interactive dashboard.

data-analysis data-visualization eda matplotlib pandas plotly python-project sales-dashboard seaborn streamlit

Last synced: 16 Apr 2026

https://github.com/akash-srm/user-engagement-analysis

Analyzed user engagement and feedback data to derive actionable insights for an online learning platform.

analytics-projects data-analysis data-cleaning eda jupyter-notebook pandas python seaborn student-engagement

Last synced: 16 Apr 2026

https://github.com/marben06/rent-in-germany

Interactive visualizations and maps depicting topics around rent prices and income in Germany built with Svelte.

charts d3 d3-visualization d3js data-analysis data-visualization gis gis-data infographic infographics map mapbox mapbox-gl mapbox-gl-js mapboxgl svelte

Last synced: 27 Apr 2026

https://github.com/prarthana-singh/bangalore-house-price-predictor

🏡 Bangalore House Price Prediction – A Machine Learning model to predict house prices in Bangalore using real estate data. Built with Linear Regression, Python, Pandas, NumPy, and Scikit-Learn.

data-analysis eda house-price-prediction linear-regression machine-learning numpy pandas python real-estate regression scikit-learn

Last synced: 19 Apr 2026

https://github.com/yasumorishima/yasumorishima

Manufacturing Engineer & Data Analyst. 17 years exp in MFG. Python, VBA, Automation Specialist. (盛島康徳 / Yasunori Morishima)

automation data-analysis manufacturing portfolio python vba

Last synced: 05 Mar 2026

https://github.com/myles/notebooks

Some of my random Jupyter Notebooks.

data-analysis data-science jupyter-notebooks

Last synced: 18 Jan 2026

https://github.com/ilaxi/lomicontadores

A data management tool for tracking the number of actions per day over a year.

data-analysis gdscript godot godot4 python

Last synced: 19 Apr 2026

https://github.com/marvinmarnold/oipm_stop_search

OIPM's analysis on Stop & Search (frisk) activity by the New Orleans Police Department.

data-analysis frisk new-orleans oipm police search stop

Last synced: 22 Jul 2025

https://github.com/shashwat9kumar/us-accidents-data-analysis

Analysis of US accidents using the US-Accidents dataset (4.2 million entries) from Kaggle.

accidents accidents-analysis data-analysis data-analytics data-visualisation data-visualization matplotlib numpy pandas python

Last synced: 17 Apr 2026

https://github.com/saro0307/exploratory-data-analysis-terrorism

Phase 1 of a data science project (program) performing exploratory data analysis on terrorism using Python on Google Colab, for the Coderscave internship, September 2023.

colaboratory data-analysis datascience machine-learning numpy pandas python seaborn skit-learn visualization

Last synced: 13 Apr 2026

https://github.com/vaishnavis03/finlatics_ml_program

This repository contains the .ipynb files for 3 datasets, along with a PPT for each. The datasets included are Facebook Marketplace Data, Sales Prediction Data, and Wine Quality data.

correlation data-analysis data-science data-visualization knn linear-regression machine-learning matplotlib numpy pandas random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/rishitabansal9/adult-census-income-prediction

A project for data analysis and income prediction using a random forest classifier, with 91% accuracy.

data data-analysis data-science feature-engineering random-forest-classifier

Last synced: 25 Mar 2025

https://github.com/swatisinghit/treadmill-customer-profiling-for-aerofit

Create comprehensive customer profiles for each AeroFit treadmill product through descriptive analytics. Develop two-way contingency tables and analyze conditional and marginal probabilities to discern customer characteristics, facilitating improved product recommendations and informed business decisions.

analytics conditional-probability data-analysis data-science data-visualization eda numpy pandas probability statistics

Last synced: 08 May 2026

https://github.com/nathadriele/ifood-data-governance-pipeline

This project demonstrates a complete data governance solution focused on quality, traceability, security, and LGPD compliance. It uses modern technologies such as Streamlit, Airflow, dbt, and Pydantic to implement a functional, interactive ecosystem with a data governance dashboard.

airflow dashboard data-analysis data-catalog data-engineering data-governance data-quality data-visualization dbt ifood lgpd matplotlib numpy observability-data pandas pipeline pyspark redis seaborn streamlit

Last synced: 02 Apr 2026

https://github.com/ruajean/netflixmoviescraper

🎬 A powerful tool for gathering movie data and user reviews from FilmAffinity's Netflix category. This script scrapes movie details and iterates through user reviews, saving structured information to a CSV file for analysis. Ideal for insights into user sentiments and movie popularity on FilmAffinity.

data-analysis data-visualization dataset jupyter-notebook python scraping

Last synced: 17 Apr 2026

https://github.com/davifeliciano/modern_physics_experiments

Collection of data analysis and visualization scripts developed in Python around some modern physics experiments

data-analysis data-visualization modern-physics physics physics-experiments

Last synced: 18 Jan 2026

https://github.com/jabercrombia/video-game-data

This project integrates FastAPI as the backend and Next.js as the frontend to create a full-stack web application. It processes and displays video game sales data, enabling seamless API communication while maintaining a scalable and efficient architecture.

data-analysis nextjs nintendo playstation python typescript video-game

Last synced: 02 Apr 2026

https://github.com/ngangawairimu/linear-regression-

This project builds a linear regression model in Python to predict outcomes and derive insights from feature data. It covers data cleaning, feature analysis, and model evaluation, showcasing predictive modeling techniques using scikit-learn, pandas, and visualization libraries.

data-analysis linear-regression machine-learning predictive-modeling python scikit-learn

Last synced: 17 Apr 2026

https://github.com/farhad-here/height-distribution-analysis

Statistical comparison of height distributions in two groups using mean, standard deviation, and boxplots.

coefficient-of-variation data-analysis interquartile-ranges matplotlib mean numpy python scipy standard-deviation variance

Last synced: 13 Apr 2026

https://github.com/salma-mamdoh/exploring-the-evolution-of-linux-project

My project for learning the basics of analysis on DataCamp.

data-analysis datacamp pandas python time-series-analysis

Last synced: 09 May 2026

https://github.com/eliasdehondt/learn-r

Welcome to the Learn-R repository! This is your go-to resource for learning the R programming language, whether you're a beginner or looking to enhance your skills.

data-analysis data-visualization education machine-learning programming r statistics tutorials

Last synced: 03 Apr 2026

https://github.com/jhrcook/checkplease

Analysis of an immune checkpoint-blockade screen.

bayesian-statistics data-analysis pymc3 python python3 r

Last synced: 17 Apr 2026

https://github.com/tyriek-cloud/nyc-mobility-survey-analysis

An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.

aws aws-athena aws-glue aws-glue-crawler aws-quicksight aws-s3 data-analysis data-engineering etl-pipeline json python

Last synced: 09 May 2026

https://github.com/shimazadeh/ft_linear_regression

Implementing a modular linear regression from scratch to predict the price of cars using a gradient descent algorithm.

data-analysis data-science hyperparameter-tuning linear-regression predictive-modeling

Last synced: 03 Jun 2026

https://github.com/sharmas1ddharth/data-analysis-with-python

Code for freeCodeCamp's Data Analysis with Python projects.

data-analysis data-analysis-with-python freecodecamp-project

Last synced: 03 Jun 2026

https://github.com/nathaliacosim/migration-patrim

Automation for extracting, converting, and migrating asset data to Betha Sistemas' Patrimônio Cloud system. The project ensures a structured and secure information transfer flow, using C# (.NET Framework), PostgreSQL, and API integration.

conversion-tool data-analysis data-conversion data-transformation dotnet dotnet-code dotnet-console-app migration-tool

Last synced: 17 Apr 2026

https://github.com/victoorv/criminalite_us

An analysis of crime as a function of socio-economic variables, including the selection and comparison of multiple regression models as well as hypothesis tests on the coefficients and on model significance.

data-analysis data-science r regression regression-analysis regression-models statistical-analysis statistical-tests statistics

Last synced: 04 Apr 2026

https://github.com/manishkaa/google_data_analytics_capstone_case_study

This case study is part of the Google Data Analytics Capstone Project.

bigquery data-analysis sql tableau

Last synced: 05 Oct 2025

https://github.com/sdley/cas_pratiques_a_rendre

Practical data processing exercises with Python.

data-analysis pandas python

Last synced: 09 May 2026

https://github.com/royungar/automotive_sales_insights_dashboard

Data visualization project analyzing automotive sales, recalls, and customer sentiment using IBM Cognos Analytics. Features KPIs, treemaps, heatmaps, and advanced visual storytelling techniques.

automotive-industry business-intelligence cognos-analytics csv customer-sentiment dashboard data-analysis data-engineering data-visualization eda excel heatmap ibm kpi recall-analysis sales-data treemap

Last synced: 04 Jun 2026

https://github.com/josepablodmg/python--linear-regression---housing-exercise

A predictive analysis exploring the relationship between household characteristics and median income in California. Using linear regression, the project investigates whether blocks with fewer households correspond to higher median incomes.

california data-analysis data-science exploratory-data-analysis housing-data linear-regression machine-learning python regression scikit-learn statistics visualization

Last synced: 05 Oct 2025

https://github.com/davidmalko87/steam-library-exporter

Python script to export your Steam game library to CSV — playtime, genres, reviews, metacritic scores, prices, tags & estimated owners via Steam Web API + Store API + SteamSpy

csv-export data-analysis game-data metacritic playtime-tracker python steam steam-api steam-games steam-library steamspy

Last synced: 04 Apr 2026

https://github.com/datalopes1/manufacturing_defects

EDA project using the Manufacturing Defects dataset, which can be found on Kaggle.

data-analysis data-visualization eda exploratory-data-analysis python

Last synced: 09 May 2026

https://github.com/sevilaymuni/project-no.3-seaborn-plots

A comprehensive analysis of differentiated thyroid cancer using pandas and seaborn.

data-analysis data-structures data-visualization mathplotlib pandas python seaborn

Last synced: 18 Apr 2026