An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/johannaschmidle/amazon-cat-couch

Customer product reviews + ratings analysis and visualization [Python, Excel, Tableau, R]

data-analysis data-visualization jupyter-notebook python-notebook r-markdown sentiment-analysis text-analysis web-scraping

Last synced: 11 Jun 2026

https://github.com/agailloty/preprocess

preprocess is a fast data analysis preprocessing tool.

cli data-analysis preprocessing-data

Last synced: 12 May 2026

https://github.com/ani717/pneumonia_detection_effecientnet_b7

Pneumonia detection in chest X-ray images with EfficientNet-B7. Accuracy = 87.98%, Precision = 100%, Recall = 83.87%, F1 Score = 91.23%.

cnn computer-vision data-analysis data-augmentation efficientnet image-classification image-processing machine-learning

Last synced: 13 May 2026

https://github.com/mituskillologies/dkte-da-mar25

Programs from the Python Data Analytics training conducted at DKTE's Engineering Institute, Ichalkaranji, in March 2025.

data-analysis matplotlib numpy pandas python-programming tkinter-python

Last synced: 13 May 2026

https://github.com/alexgenovese/react-charts-covid-19-data

Examples of COVID-19 data visualized with different charting libraries: G2, G2Plot, Plotly, ApexCharts

data-analysis data-science data-visualization react reactjs

Last synced: 13 May 2026

https://github.com/deliprofesor/joblocationmapper

JobLocationMapper is a Python tool that visualizes job listings on an interactive map. It uses city and state data to place job markers accurately and color-codes them by occupation (Software, Marketing, Design). The map clusters markers for better organization, and users can click on them to view job details.

clustrered-markers data-analysis data-visualization folium geocoding geographical-visualization interactive-map job-listings map-visualization pandas python

Last synced: 14 May 2026

https://github.com/satvikpraveen/matplotlibmasterpro

📷 MatplotlibMasterPro is a complete, portfolio-ready project to master data visualization using matplotlib. Includes 16 notebooks, real datasets, exportable plots, custom themes, Streamlit dashboard, and Docker support. Ideal for learners and data professionals.

charts custom-plots dashboarding data-analysis data-science data-visualization educational-project interactive-visualizations jupyter-notebook matplotlib notebooks open-source plotting portfolio-project python python-utilities reproducible-research subplots time-series-analysis visualization-tools

Last synced: 14 May 2026

https://github.com/prgermux/data-plotter

This Python application provides a graphical user interface (GUI) for analyzing and visualizing data from various sources. It uses the PyQt5 framework for the GUI and Matplotlib for plotting data. The application supports multiple file formats, allows users to select any columns for the X and Y axes, and provides dynamic plots.

automation data-analysis plott python

Last synced: 12 Jun 2026

https://github.com/edanur-y/agricultural-yield-prediction-with-multiple-linear-regression

Performing multiple linear regression analysis on agricultural data to predict the yield.

data-analysis missing-data-imputation multiple-linear-regression outlier-analysis r

Last synced: 13 Jun 2026

https://github.com/gmalbert/immigration

Immigration Data Analysis

data-analysis immigration

Last synced: 14 Jun 2026

https://github.com/jkazari/rollercoaster-eda

Repository of a small data-analysis project in R for the Mathematical Software class in the 3rd semester of Mathematics studies at Gdańsk University of Technology

data-analysis r

Last synced: 14 Jun 2026

https://github.com/brunomontezano/sleep-quality-cognition

💤 Analysis of the paper "Associations between general sleep quality and measures of functioning and cognition in subjects recently diagnosed with bipolar disorder".

bipolar-disorder cognition data-analysis sleep-analysis sleep-research

Last synced: 15 Jun 2026

https://github.com/prathmesh2507/global-stock-intelligence-dashboard

Interactive Global Stock Market Analytics Dashboard built using Python, YFinance, Pandas, Streamlit, and Plotly. Analyze 20+ countries and 400+ top stocks with advanced visualizations and financial insights.

dashboard data-analysis data-visualization python stock-analysis streamlit

Last synced: 15 Jun 2026

https://github.com/fahadnasir13/financial_data-analyzer_tool

A Python-based framework for analyzing, cleaning, and reconciling financial data stored in Excel workbooks.

data-analysis excel financial python store

Last synced: 17 Jun 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/httpsnooow/graphs-analysis-neo4j

Challenges from the "Neo4J - Data Analysis with Graphs" course by Digital Innovation One (DIO).

challenge data-analysis data-engineering data-science graph neo4j neo4j-database neo4j-graph

Last synced: 18 Jun 2026

https://github.com/shahaf-f-s/feature-space

A modular framework for combining pandas series features

data-analysis data-science feature-engineering

Last synced: 19 Jun 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/evanmathew/northwind-traders

SQL-powered analysis of sales, employee performance, and customer behavior using PostgreSQL window functions. This project uncovers key business insights to optimize decision-making.

case-study data-analysis jupyter-notebook northwind-traders postgresql python-postgresql sql

Last synced: 20 Jun 2026

https://github.com/emaleckova/emaleckova.github.io

My personal website created with Quarto

biology data-analysis data-viz quarto r

Last synced: 23 Jun 2026

https://github.com/vbhvsingh0/coulombic_dyn_formaltetra

The Python code simulates a formaldehyde tetra-cation molecule using Coulombic forces

data-analysis physics-simulation python shell-scripting

Last synced: 24 Jun 2026

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/imgabreuw/minicurso-python-para-financas

Mini course on Python for finance, provided by Varos.

data-analysis financial-analysis python

Last synced: 29 Jun 2026

https://github.com/mikkelrask/henryrollins-scraper

FANATIC! A dataset of Henry Rollins' listens on his KCRW radio show, with data dating back to 2017: 496 episodes of weird and rare finds, fast-paced punk, and frog sounds. Includes a scraper that keeps the data up to date with henryrollins.com

archive data-analysis data-visualization music

Last synced: 29 Jun 2026

https://github.com/shriansh8619/sql_eda

Explored relational databases using SQL to perform comprehensive Exploratory Data Analysis (EDA), covering database exploration, segmentation, trend analysis, and performance ranking. Developed reusable SQL scripts to analyze dimensions, measures, and time-based metrics, helping uncover key business insights.

data-analysis exploratory-data-analysis mysql

Last synced: 20 Aug 2025

https://github.com/myriamba/neuraview

AI-Powered Data Insights and Visualization Generator

data-analysis data-engineering data-insights data-visualization generative-ai llm

Last synced: 21 Aug 2025

https://github.com/beyzabasarir/northwind-traders-analysis

Northwind dataset analysis using PostgreSQL, Python, and Power BI. Focused on sales, customers, shipping, and performance insights.

dashboard data-analysis data-visualization jupyter-notebook matplotlib numpy pandas postgresql powerbi python seaborn

Last synced: 10 Apr 2026

https://github.com/oyebamiji-micheal/data-analysis-with-python-zero-to-pandas

This repository contains all the assignments and the project I completed for the course "Data Analysis with Python: Zero to Pandas" on Jovian.

data-analysis numpy pandas python

Last synced: 10 Apr 2026

https://github.com/aidan-zamfir/advt-analysis

Web scraping project. Will eventually use character/episode data for NLP and networking/data analysis.

data-analysis nlp python selen webscraping

Last synced: 23 Aug 2025

https://github.com/vaishnavipaithane/cyclistic-bike-share-analysis-case-study

This capstone project was completed as part of the Google Data Analytics Professional Certificate course.

data-analysis r-programming-language rstudio

Last synced: 24 Aug 2025

https://github.com/0xnu/data-analyst-training

The repository contains training materials for data analysts.

data data-analysis data-analyst

Last synced: 25 Aug 2025

https://github.com/harshnevse/performance_analysis_of_solar_plants_in_india

A Data Analysis project using Tableau

data-analysis tabluea

Last synced: 03 Jan 2026

https://github.com/debjyotisaha/tableau-projects-phase-2

Published interactive dashboards on Tableau Public, highlighting expertise in data visualization and storytelling through analyses of transportation patterns, sales trends, and demographic studies. These projects showcase the ability to transform complex datasets into actionable, intuitive visuals for decision-making.

dashboards data data-analysis data-visualisation tableau

Last synced: 26 Aug 2025

https://github.com/sarathchandranpm/walmart-sales-analysis

Analysis of Walmart Myanmar's Q1 2019 sales data covering customer behavior, product performance, general operations, and sales patterns.

data-analysis mysql sql

Last synced: 29 Aug 2025

https://github.com/lauratrigo/fft_matlab

📡 Fourier Analysis for Ionospheric Data is a MATLAB script that applies the FFT to generate one-sided and two-sided spectra of ionospheric parameters (hF, f0F2, hmF2), identifying periodicities and comparing spectral signatures at 15-minute resolution. Useful for studying ionospheric variations and disturbances.

data-analysis fast-fourier-transform fft fourier ionosphere matlab scientific scientific-initiation

Last synced: 29 Aug 2025

https://github.com/roggersanguzu/weather-medical-expense-prediction-ml-models

This repo contains one model for determining rainfall patterns and another for predicting medical expenses.

data data-analysis data-science datasets joblib machine-learning machine-learning-algorithms scikitlearn-machine-learning

Last synced: 30 Aug 2025

https://github.com/karlyndiary/adidas-sales-analysis

Analyzed Adidas' product sales performance, top retailers, monthly trends, yearly growth, regional distribution, and pricing insights. Performed ETL from Python (Pandas) to SQL Server, extracted data with SQL, and visualized key insights in Excel.

adidas-sales-analysis adidas-sales-dashboard dashboard data-analysis data-cleaning data-pipeline data-visualization etl excel-dashboard microsoft-excel microsoft-sql-server python

Last synced: 10 Feb 2026

https://github.com/obirikan/u.s.-county-commute-data-analysis

This project extracts and analyzes U.S. county-level commuting data from the 2020 American Community Survey (ACS 5-Year Estimates) via the U.S. Census Bureau API.

data-analysis

Last synced: 28 Jun 2025

https://github.com/agdturner/ccg-data

A modularised Java library for processing data sets with classes for: data records; collections of data records; and identifiers.

data data-analysis

Last synced: 12 Jan 2026

https://github.com/ragedunicorn/mantisx-notebook

A repository for Jupyter notebooks analysing mantisx data

data-analysis data-visualization mantis mantisx shooting training

Last synced: 24 Jul 2025

https://github.com/poglolopez/prueba_tecnica_inlaze

This repository showcases my data analysis skills through a technical test for Inlaze. It includes workflows with Python, SQLite, and Power BI to analyze player behavior, deposits, and traffic-source performance, highlighting operational efficiency and strategic insights.

data-analysis data-v etl jupyter powerbi python sqlite

Last synced: 26 Feb 2025

https://github.com/devanshsahu47/talentscape-glassdoor-analysis

TalentScape is an end-to-end Python project that cleans and analyzes a comprehensive Glassdoor Jobs dataset. It features robust data wrangling and 20 insightful visualizations to uncover trends in job titles, salary ranges, company ratings, and more—providing actionable recommendations to optimize recruitment and compensation strategies.

business-intelligence data-analysis data-vizualisation jupyter-notebook python3

Last synced: 15 May 2026

https://github.com/soypete/example-go-dataframes-parser

example of https://godoc.org/github.com/kniren/gota/dataframe

data-analysis data-science datastructures golang-examples ml

Last synced: 12 Sep 2025

https://github.com/luminati-io/target-dataset-samples

A sample dataset of over 1000 Target products, extracted using the Bright Data API, ideal for monitoring brand reputation, tracking inventory, and optimizing prices.

api data-analysis data-mining datasets target web-scraper web-scraping

Last synced: 04 Jan 2026

https://github.com/mysftz/numerical-methods-in-matlab

MATLAB scripts from several data analysis assignments.

data-analysis data-science matlab university university-assignment

Last synced: 14 May 2025

https://github.com/leandrocollares/home-team-advantage-in-epl

Home team advantage in the English Premier League: an exploratory data analysis

data-analysis matplotlib pandas plotly

Last synced: 11 Jun 2026

https://github.com/mysftz/statistical-analysis

An in-depth review of statistical analysis in Python using datasets.

data-analysis python python3 statistics university university-project

Last synced: 14 May 2025

https://github.com/nmelgar/birthday_sports_dataviz

We will analyze how the Matthew effect has influenced professional sports players.

analysis csv data data-analysis data-science data-visualization datavisualization dataviz probability research tableau

Last synced: 08 Jan 2026

https://github.com/scailfin/benchmark-templates

Workflow Templates are parameterized workflow specifications for the Reproducible Open Benchmarks for Data Analysis Platform (ROB)

benchmarks data-analysis reproducibility

Last synced: 16 Jan 2026

https://github.com/iness000/online-retail-customer-segmentation

This project performs comprehensive customer segmentation analysis on an online retail dataset using machine learning clustering techniques and RFM (Recency, Frequency, Monetary) analysis. The goal is to identify distinct customer segments to drive better customer relationship management strategies and business insights.

customer-segmentation data-analysis k-means

Last synced: 31 Aug 2025

https://github.com/jayqi/data-analysis-tools

Presentation on Data Analysis Tools

data-analysis presentation-slides

Last synced: 06 Jan 2026

https://github.com/pranjalya/hand-washing-data-visualisation

A small data visualization project in which we analyze the effect of hand washing after Dr. Semmelweis introduced it to the nurses and midwives assisting with births.

data-analysis data-visualization jupyter-notebook pandas python3

Last synced: 06 May 2026

https://github.com/evanwporter/sloth

Faster Pandas Dataframe

cython data-analysis dataframe pandas

Last synced: 14 Mar 2025

https://github.com/virajbhutada/hr-analytics-excel-sql-tableau-powerbi

Explore a comprehensive HR Analytics portfolio showcasing data analysis and visualization skills. Featuring dashboards in Power BI, Excel, and Tableau, along with SQL queries for deeper insights. A holistic view of expertise in HR analytics, data visualization, and database management. Let's dive into the game of data insights!

data-analysis data-management data-visualization excel hr-analytics interactive-dashboards portfolio-project postgresql powerbi powerbi-visuals sql sql-queries tableau tableau-public

Last synced: 02 Aug 2025

https://github.com/satvikpraveen/pandasplayground

📊 A comprehensive pandas mastery project with 10 modular Jupyter notebooks covering data loading, cleaning, grouping, merging, time series, visualization, and performance profiling. Includes real-world workflows, Docker, Streamlit, and reusable utils. Ideal for data scientists and analysts to learn, practice, and refer. Practice-ready and modular.

analytics cheatsheet data-analysis data-cleaning data-pipeline data-science data-visualization docker etl exploratory-data-analysis jupyter-notebook jupyterlab learning-resource memory-profiling open-source pandas performance-tuning python streamlit time-series

Last synced: 10 Apr 2026

https://github.com/hi-jin2/data-analysis-basics

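A collection of source code written during the Data Analysis Basics (R) class. I learned the R language using the textbook 『모두를 위한 R 데이터 분석 입문』 (Introduction to R Data Analysis for Everyone).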

data-analysis r r-studio

Last synced: 19 Jul 2025

https://github.com/suwa-sh/tbm-template

A template for starting small with TBM (Technology Business Management)

cost-control data-analysis data-visualization dbt dlt example grafana postgresql sample tbm

Last synced: 19 Jun 2025

https://github.com/moenessgannouni/englandweather

A mini-project that analyzes weather data in England using Linear Regression and Multiple Linear Regression. Ideal for learning and applying statistical analysis and predictive modeling.

data-analysis data-visualization linear-regression multiple-linear-regression rprogramming

Last synced: 22 Mar 2025

https://github.com/idb-devs/dataanalyticsairbnb

Build a price prediction model that lets an ordinary property owner know how much to charge per night for their property.

data-analysis data-science jupyter python

Last synced: 18 Apr 2026

https://github.com/ronylpatil/whatsapp-group-chat-analysis

This project is based entirely on data analysis: chats from our college's official WhatsApp group are used to extract useful information. Extracted features include the most active members of the group, the most active day of the week, the top 10 media contributors in the group, and many more.

data-analysis data-preprocessing data-wrangling feature-engineering

Last synced: 14 Jun 2025

https://github.com/mnkanout/patients_medication_prediction

The aim of the project is to create a model that can help medical professionals select the proper medication for patients based on their symptoms. The model uses historical data of other patients to predict what could be the most suitable medication based on the patient's symptoms.

data data-analysis data-science data-visualization decision-tree-classifier machine-learning python3

Last synced: 29 Jun 2025

https://github.com/serlo/data-pipeline-interactive-exercises

processing pipeline for exercise dashboards

data-analysis serlo

Last synced: 26 Feb 2025

https://github.com/azaz9026/data_cleaning

Welcome to the Data Cleaning repository! This collection is dedicated to showcasing techniques and methods for cleaning and preparing datasets for analysis.

data-analysis data-engineering data-structures data-visualization eda feature-engineering machine-learning numpy outliers pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/laudebugs/fec-data-analysis-2020

The project aimed to determine the total sum of contributions to the candidate committees as well as the number of contributions made by individuals.

data-analysis fec presidential-candidates

Last synced: 16 May 2026

https://github.com/farhad-here/data-visualization-analysis-dva

This is my data analysis project. Users can use it to clean and preprocess data or to visualize it. They can also impute or encode their dataset.

altair bokeh data-analysis data-analysis-python io matplotlib numpy pandas plotly python sklearn streamlit

Last synced: 11 Apr 2026

https://github.com/dsrodrigovieira/favoritasales

This repository contains the project developed for the Kaggle challenge "Store Sales - Time Series Forecasting. Use machine learning to predict grocery sales"

data-analysis data-science kaggle-competition machine-learning python telegram-bot xgboost-regression

Last synced: 05 May 2026

https://github.com/lotfiferaga/google-play-store-sentiment-analysis

Perform sentiment analysis on Google Play Store reviews using Python. Analyze user feedback to determine the overall sentiment (positive, negative, or neutral) towards various apps. Gain insights to aid developers and businesses in understanding user satisfaction levels and improving their products.

data-analysis data-visualization googleplayservices python reviewsanalysis-nlp

Last synced: 26 Feb 2025

https://github.com/grlyntng/rpims

Django Code and documentation for the Retail Pharmacy Inventory Management System (best final year project award)

data-analysis django erp forecasting-models lstm-neural-networks reporting

Last synced: 26 May 2026

https://github.com/tolumie/loan-approval-prediction

Loan Approval Prediction using Machine Learning | EDA + Decision Tree, Random Forest & Logistic Regression | Automating loan eligibility for Dream Housing Finance by analyzing customer data and predicting loan approvals.

classification credit-risk-analysis data-analysis decision-tree-classifier finance-analytics loan-approval logistic-regression-algorithm machine-learning predictive-modeling-techniques random-forest

Last synced: 30 Jun 2025

https://github.com/ved-coder-king/wheat_ai_project

This project, Smart Wheat Farming AI System, was developed as part of the coursework for the Artificial Intelligence program at Esprit School of Engineering.

agriculture data-analysis data-visualization deep-learning image-classification machine-learning object-detection python wheat

Last synced: 15 Apr 2025

https://github.com/danpoynor/python-number-guessing-game-with-stats

A number guessing game written in Python 3 that presents median, mode, and mean statistics

console-game data-analysis number-guessing-game python3 statistics

Last synced: 26 May 2026

https://github.com/amanyadav-07/customer-churn-prediction

Machine Learning project to predict customer churn using Logistic Regression, Random Forest, and XGBoost. Includes data preprocessing, feature engineering, SMOTE balancing, model training, evaluation, and business insights.

accuracy-metrics data-analysis data-visualization logistic-regression machine-learning matplotlib numpy pandas python3 random-forest-classifier seaborn sklearn xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/zulfachafidz/titanic_explorer_predicting_survival_with_classification_using_knn_algorithm

Tracking Life Safety with the KNN Predictive Analysis Approach. Leveraging the Titanic Dataset, we apply classification analysis to predict the fate of passengers based on a variety of features.

algorithm algorithms data data-analysis data-mining data-science datamodeling datapreprocessing dataset knn-algorithm knn-classification machine-learning machine-learning-algorithms prediction-model

Last synced: 01 Sep 2025

https://github.com/malucor/livros

Python program for analyzing data about books from an Excel file.

analise-de-dados book books bookshelf data-analysis ipynb jupyter-notebook livro livros python

Last synced: 16 May 2026

https://github.com/jaseel342/pizza_sales_report

These pizza sales dashboards provide valuable insights, including sales trends, pizza category breakdown, size distribution, and top-selling and least-selling pizzas, enabling data-driven decisions to boost sales and business performance.

data-analysis dax-query power-query powerbi sql sql-server-management-studio visualization

Last synced: 05 Jan 2026

https://github.com/jedrzej-wydra/data-analysis-associate

Associate Data Analyst Exam by DataCamp

data-analysis datacamp r

Last synced: 23 Mar 2025