An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/alexquilis1/spanish-fuel-stations-analysis

Real-time analysis of Spanish fuel prices using government API data with interactive maps and regional comparisons

data-analysis data-visualization fuel-prices geospatial-analysis ggplot2 government-data leaflet open-data r shiny spain tidyverse

Last synced: 08 Oct 2025

https://github.com/inddrsingh/e-commerce_orders

ETL project using Python for data cleaning and MySQL for data analysis

data-analysis etl-pipeline mysql python

Last synced: 18 Apr 2026

https://github.com/abhash-rai/analyzing-credit-card-eligibility

This work was performed as part of a BCU undergraduate course.

data-analysis data-visualization ggplot ggplot2 latex r

Last synced: 20 Jan 2026

https://github.com/zafir100100/cancer-stage-prediction

This code predicts cancer data using various regression models, calculates their average R-squared scores, and prints the best model.

cross-validation data-analysis data-preprocessing decision-trees gradient-boosting linear-regression machine-learning-algorithms numpy pandas random-forest regression scikit-learn

Last synced: 05 May 2026

https://github.com/faisal-khann/ipl-analysis

The IPL Analysis project is a comprehensive data-driven exploration of the Indian Premier League (IPL), analyzing historical match data to uncover patterns in team performance, player statistics, and match outcomes.

data-analysis exploratory-data-analysis jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 08 May 2026

https://github.com/jkazari/rollercoaster-eda

Repository for a small data-analysis project in R for a Mathematical Software class in the 3rd semester of Mathematics studies at Gdańsk University of Technology

data-analysis r

Last synced: 14 Jun 2026

https://github.com/izzyl3333/mosquito_analysis

An exercise using Python and statistical analysis on mosquito data to understand the relationship between the different variables and the number of mosquitoes.

chicago data-analysis data-science exploratory-data-analysis mosquitoes python statistical-analysis west-nile-virus

Last synced: 19 Jan 2026

https://github.com/marianamartiyns/api-logisticregression

Data analysis, modeling, and deployment of a logistic regression model for churn prediction, integrating a FastAPI backend and a Streamlit frontend.

data-analysis data-science fastapi logistic-regression pyhton streamlit

Last synced: 29 Apr 2026

https://github.com/mohitsai/boston-housing-data-analysis

Data analysis project for the City of Boston government, providing insights into the effect of property renovations and remodelling on housing availability in the city

data-analysis data-science matplotlib numpy pandas python

Last synced: 05 May 2026

https://github.com/loaiwalid07/automation_data_overviwe

This is a Streamlit app that gives an overview of a dataset you upload

automation data data-analysis data-exploration data-science data-transformation data-visualization

Last synced: 19 May 2026

https://github.com/brooks-code/toulouse-biblio-chronicle

Snapshot of Toulouse public library customer habits — cleaning raw, messy datasets of musical, cinematic, and literary checkouts; includes data-cleaning steps, analysis notebook revealing cultural tastes in the Pink City.

data-analysis data-cleaning data-cleaning-and-preprocessing data-quality exploratory-data-analysis jupyter-notebook library-data misaligned-data mojibake tutorial

Last synced: 10 Oct 2025

https://github.com/debjyotisaha/power-bi-projects-phase-1

Portfolio projects related to data visualisation in Power BI

data-analysis data-visualization dax-expression powerbi powerquery

Last synced: 18 Jan 2026

https://github.com/frankelavsky/security-dash-challenge

I had two 8-hour days to create a visualization dashboard for three datasets. Tab one: Voronoi overlay on a line graph. Tab two: a data partitioning method keeps in-memory usage low. Tab three: deals with "Failed" vs "Successful" attempts as positive/negative bar charts over time. I used d3.js, require, the MVC pattern, and vanilla JS.

client-side complexity css3 d3 d3js dashboard data-analysis data-structures-algorithms data-visualization frontend-app html5 interactive-visualizations javascript modular network-analysis network-monitoring network-security security single-page-app visualization

Last synced: 14 Apr 2026

https://github.com/pngo1997/life-expectancy-logistic-regression

Life expectancy analysis project using logistic regression.

data-analysis logistic-regression r rmarkdown

Last synced: 10 Jun 2026

https://github.com/scarlet-enlight/ml_project

Comparison of different classifiers (KNN, Naive Bayes, Decision Tree) on Sleep Health and Lifestyle Dataset

data-analysis machine-learning

Last synced: 13 Mar 2026

https://github.com/pranav016/exploratory-data-analysis-of-google-app-store-dataset

This is a data analysis done on the Google app store dataset to answer a few questions related to the data through data visualization techniques.

data-analysis

Last synced: 11 Oct 2025

https://github.com/anushkundu/crime-pattern-analysis

Analyzing Crime Patterns in Montgomery County, USA: An Inclusive Study Based on NIBRS Data (2016-2022)

data-analysis data-visualization descriptive-statistics matplotlib numpy pandas python seaborn

Last synced: 05 May 2026

https://github.com/montanaz0r/testing-if-mma-math-deduction-works-using-ufc-fighters-data

Probabilistic reasoning about the phenomenon called MMA math, using UFC fighters data and Python.

bayesian-inference data-analysis data-science graphviz jupyter-notebook pandas python scipy statistics

Last synced: 14 Apr 2026

https://github.com/kianaasd93/sensors-

Data analysis of wearable-technology and autonomous-systems sensors in physiotherapy. Conducted a comprehensive data analysis on Xsens MTx sensor data.

classification data-analysis data-science jupyter jupyter-notebook knn machine-learning physiotherapy python sensor svm wearable-devices wearable-technology

Last synced: 19 Feb 2026

https://github.com/dhruvil-26/tableau-projects

This repository contains Tableau visualization projects focused on data analysis across different domains. Projects include: 1. IPL Visualization - Insights into IPL match, team, and player statistics. 2. EV Analysis - Visualizations exploring the adoption of electric vehicles. 3. Road Accident Analysis - Analysis of road accident patterns.

analysis data data-analysis data-analytics electric-vehicles ipl road-accident-analysis tableau tableau-public

Last synced: 19 Jan 2026

https://github.com/bkataru/physics-e.e

Project repository for IB physics extended essay. Topic: Predictive data modeling of a variable binary star’s brightness over a period of time using astrostatistics.

astrometry astronomical-algorithms astronomical-images astronomy astrophotography astrostatistics data-analysis data-science data-visualization modeling physics polynomial-regression regression-analysis

Last synced: 09 Apr 2025

https://github.com/nkamilla/titanic-eda

Exploratory Data Analysis of the Titanic dataset using Python (Pandas, NumPy, Matplotlib). Includes data cleaning, visualizations, correlations, and key business insights.

data-analysis eda jupyter-notebook matplotlib numpy pandas python titanic-dataset

Last synced: 05 May 2026

https://github.com/navp7/pizzasales_powerbi

This project involves creating a comprehensive sales performance dashboard using Power BI to visualize and analyze the sales data of an Italian pizza company.

data-analysis ms-sql-server ms-word powerbi visualization

Last synced: 13 Mar 2026

https://github.com/jbalooshie/election_analysis

A Python script built to analyze a specific election's results, which can be repurposed to analyze the results of other elections. The script provides different breakdowns of the vote by candidate and county.

data-analysis data-science elections python

Last synced: 09 Apr 2025

https://github.com/caesaredia/ymusic-project

Exploratory data analysis (EDA) of music streaming behavior in two fictional cities using Python, Pandas, and Jupyter Notebook. It explores user behavior, genre preferences, and listening patterns throughout the week.

data-analysis eda pandas python

Last synced: 05 May 2026

https://github.com/treasarose/us_candy_distribution_analysis_project

This project focuses on advanced data analysis and optimization using SQL. It includes queries for analyzing sales, product margins, and shipping efficiency for a US candy distributor.

data-analysis entity-relationship mssql optimization query sql-server sqlproject us-candy-distributor

Last synced: 12 Oct 2025

https://github.com/tzerk/esr

R package 'ESR' for plotting and analysing ESR spectra in dating applications

data-analysis data-visualization electron-spin-resonance geochronology r

Last synced: 13 Mar 2026

https://github.com/alexondata/daan_eda-exploratory-data-analysis_ecommerce

This project presents an Exploratory Data Analysis (EDA) pipeline for an eCommerce dataset, integrating Python, SQL Server, and Power BI to transform raw transactional data into meaningful business insights. The project was developed as part of an academic assignment at Transilvania University of Brașov, Faculty of Mathematics and Computer Science.

data-analysis data-visualization ecommerce microsoft-sql-server powerbi python

Last synced: 18 May 2026

https://github.com/zulhaditya/web-scraping-python

A repository that stores various source code and web scraping methods using Python.

data-analysis python3 webscraping

Last synced: 12 Oct 2025

https://github.com/chirlmin-joo-lab/papylio

Single-molecule fluorescence trace extraction and analysis

biophysics data-analysis fluorescence fret single-molecule sparxs

Last synced: 12 Oct 2025

https://github.com/agb2k/twitter-analyzer

Project to extract tweets based on searches, analyze their data, and autocorrect potentially incorrect words

data-analysis python tweepy twitter

Last synced: 13 Oct 2025

https://github.com/monish-nallagondalla/universal-bank

Credit Card Ownership Prediction: a machine learning project that predicts credit card ownership using features like age and income, balancing class distributions for improved accuracy.

classification-models credit-card-prediction data-analysis data-classification decision-tree-classifier imbalanced-datasets machine-learning model-evaluation python scikit-learn

Last synced: 05 May 2026

https://github.com/akash-47-tank/personalized-e-commerce-review-summarizer

Personalized E-commerce Product Review Summarizer: A Streamlit app that summarizes product reviews (e.g., from a CSV) using T5-small and tailors summaries to user preferences (price, durability, etc.) with NLP and lightweight ML.

data-analysis e-commerce machine-learning nlp personalization portfolio python scikit-learn sentiment-analysis streamlit t5 transformers web-app

Last synced: 05 May 2026

https://github.com/ayaatmohammed/amazon-sales-analysis-pyspark

In-depth analysis of the Olist E-commerce dataset from Kaggle using PySpark for customer segmentation (RFM) and market basket analysis.

big-data big-data-analytics customer-segmentation data-analysis data-science ecommerce jupyter-notebook kaggle pyspark python rfm-analysis

Last synced: 05 May 2026
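
The RFM (recency, frequency, monetary) segmentation named in this entry reduces each customer's order history to three numbers. The sketch below shows that reduction in plain Python rather than PySpark, purely for illustration. The function name `rfm_table`, the tuple layout of an order, and the sample orders are assumptions and are not taken from the repository or the Olist dataset.

```python
from collections import defaultdict
from datetime import date


def rfm_table(orders, snapshot):
    """Summarise orders as {customer: (recency_days, frequency, monetary)}.

    orders: iterable of (customer_id, order_date, amount) tuples (assumed layout).
    snapshot: reference date from which recency is measured.
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, order_date, amount in orders:
        # Recency depends only on the most recent order per customer.
        if customer not in last_order or order_date > last_order[customer]:
            last_order[customer] = order_date
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        customer: ((snapshot - last_order[customer]).days,
                   frequency[customer],
                   monetary[customer])
        for customer in last_order
    }


# Hypothetical sample orders.
orders = [
    ("a", date(2024, 1, 1), 10.0),
    ("a", date(2024, 1, 10), 5.0),
    ("b", date(2023, 12, 1), 100.0),
]
print(rfm_table(orders, snapshot=date(2024, 1, 31)))
```

Segments are usually derived afterwards by ranking or binning each of the three values; that step is omitted here.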

https://github.com/aryar-06/linear-regression

A Python project demonstrating basic linear regression with gradient descent and matrix operations, alongside scikit-learn comparison.

data-analysis data-preprocessing educational-project gradient-descent linear-regression machine-learning python regression-algorithms scikit-learn

Last synced: 05 May 2026
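
As a rough illustration of the technique this entry names, the sketch below fits a one-variable linear regression by batch gradient descent on the mean squared error and checks it against the closed-form least-squares solution. The closed-form check stands in for the scikit-learn comparison mentioned in the description. The function names, learning rate, epoch count, and data are assumptions, not the repository's code.

```python
def fit_gradient_descent(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_closed_form(xs, ys):
    """Ordinary least squares for one feature, used as a reference answer."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
print(fit_gradient_descent(xs, ys))
print(fit_closed_form(xs, ys))
```

The learning rate has to stay small enough for the updates to converge; with unscaled features of larger magnitude a smaller value, or feature standardization, would be needed.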

https://github.com/meinhere/dicoding-analisis-data

Data Analysis course submission on an e-commerce theme, delivered as a Streamlit app

data-analysis data-mining e-commerce python streamlit

Last synced: 05 May 2026

https://github.com/hms75/movie_rating_analysis

A movie rating analysis which identifies trends amongst a dataset of 5000 movies.

data-analysis data-visualization matplotlib-pyplot numpy pandas python

Last synced: 05 May 2026

https://github.com/iamrajmani/sentimental-analysis

Sentimental Analysis - Final Year College Project

data-analysis data-visualization machine-learning python pytorch

Last synced: 06 May 2026

https://github.com/ryuzen6/bangalore-real-estate-price-prediction

This is a data science project that predicts the cost of real estate in Bangalore. Requirements: Jupyter Notebook (for data cleaning and building the linear regression model with various Python libraries), PyCharm (Python IDE for creating the Python Flask server), and Visual Studio Code (to create the UI with HTML, CSS, and JavaScript).

css3 data-analysis data-science html5 javascript jupyter-notebook machine-learning python3

Last synced: 06 May 2026

https://github.com/ibrahimceyisakar/hotel-finder

Hotel finder system with Python includes data gathering, analyzing, and visualization.

data-analysis data-gathering data-visualization pandas plotly python selenium streamlit

Last synced: 06 May 2026

https://github.com/yashpaneliya/bank-loan-default-analysis

Analyze and understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default.

data-analysis loan-default-analysis matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/erick957/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

🏠 Predict house prices using advanced regression techniques with this comprehensive analysis and cleaning project, from data loading to model deployment.

data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 06 May 2026

https://github.com/ankitwalimbe/sentiment-analysis

Sentiment analysis of Amazon Fashion reviews using VADER and a baseline ML model (TF-IDF + SGDClassifier). Includes visualizations, reproducible notebook, and recruiter-ready documentation.

data-analysis machine-learning matplotlib nlp pandas python seaborn sentiment-analysis sklearn

Last synced: 06 May 2026

https://github.com/kishorep26/school-recommendation-system

Intelligent school recommendation system that matches students with suitable educational institutions based on preferences and performance metrics

bootstrap data-analysis decision-support edtech education education-technology flask matching-algorithm python recommendation-system school-finder school-search student-portal web-application

Last synced: 06 May 2026

https://github.com/josepablodmg/python--linear-regression-advertising

A linear regression analysis to predict sales based on advertising spending across TV, radio, and newspaper channels. The project includes exploratory data analysis, model training, coefficient visualization, and residual analysis.

advertising data-analysis exploratory-data-analysis linear-regression machine-learning python regression scikit-learn visualization

Last synced: 06 May 2026

https://github.com/edanur-y/variable-analysis-of-banks-ratio-data

Testing variables for multicollinearity, multivariate normality and analyzing outliers and missing values. ⭕SPSS 🔵R

data-analysis log-transformation missing-values-analysis multicollinearity normality-test r spss

Last synced: 10 Jun 2026

https://github.com/fbarffmann/home_sales

Analyzed 25,000+ home sales using PySpark and SparkSQL. Identified pricing trends by year built, home features, and view rating. Optimized query run-time by 70% using caching.

aws big-data data-analysis home-sales parquet pyspark python spark spark-sql sql

Last synced: 06 May 2026

https://github.com/suhas-005/jovian-data-analysis-course-assignment

These are my assignments for the Data Analysis: Zero to Pandas course by Jovian.ai

data-analysis data-analytics numpy pandas python

Last synced: 07 May 2026

https://github.com/badranalyst/exploratory-data-analysis-on-salaries-dataset

Performing EDA on a dataset related to salaries, exploring relationships between factors like job titles, industries, and locations. Insights are visualized with plots to identify trends and disparities in salary data.

data-analysis dataset eda exploratory-data-analysis pandas python

Last synced: 07 May 2026

https://github.com/warazkhan/airplane-crashes-and-fatalities-since-1908-

This project analyzes airplane crash data (1908 - 2008)✈️📊 to uncover trends in aviation accidents, fatalities, and safety improvements. Using exploratory data analysis (EDA) and data visualization, we examine key factors influencing crashes, identify high-risk regions, and explore advancements in aviation safety.

data-analysis data-visualization exploratory-data-analysis

Last synced: 10 Jun 2026

https://github.com/ddihora1604/advanced_business_analytics_on_world_bank_global_financial_inclusion_data_2021

Bridging the Gaps in Financial Inclusion: Understanding the Cash-Credit Paradox, Divide between Cash and Digital Payments, and Financial Resilience.

advanced-excel business-analytics data-analysis data-engineering data-mining data-visualization database exploratory-data-analysis machine-learning preprocessing-data python

Last synced: 07 May 2026

https://github.com/jpgiant/gujaratrainfallanalysis_2021

Analysis of the rainfall that occurred in the districts of Gujarat state in 2021

data-analysis exploratory-data-analysis exploratory-data-visualizations matplotlib numpy pandas-python python

Last synced: 07 May 2026

https://github.com/pedrosfaria2/fugascomhelicoptero

My first use of Jupyter Notebook in a project

analise-de-dados data-analysis jupyter-notebook matplotlib pandas python

Last synced: 07 May 2026

https://github.com/biginformatics/git-basics

Hands-on Git and GitHub lessons for analysts and statisticians

data-analysis git github public-health training

Last synced: 10 Jun 2026

https://github.com/riborings/python_projects

Python projects and other programming experiences

data-analysis machine-learning project python regression-analysis

Last synced: 08 May 2026

https://github.com/aidan-zamfir/the-iliad

Data analysis & relationship network for the characters of Homer's Iliad

data data-analysis dataframes networks networkx python selenium spacy webscraping

Last synced: 08 May 2026

https://github.com/blladerunner/customer-churn-dashboard

Customer Churn Dashboard — SQL + Python analytics project exploring customer retention patterns, churn rate by demographics and services, and key insights for telecom business strategy.

business-intelligence churn-analysis customer-retention dashboard data-analysis data-analytics data-science pandas powerbi python sql sqlite telecom

Last synced: 08 May 2026

https://github.com/devexpress-examples/wpf-pivot-grid-group-date-time-values

This example shows how to group date-time values in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 08 May 2026

https://github.com/danmadeira/algoritmos-estatistica-python

Demonstration of statistics algorithms in Python

algorithms data-analysis data-science python statistics

Last synced: 08 May 2026

https://github.com/0290192029/apartment-price-predictor

A Python project for predicting apartment rental prices using linear regression. Practical work on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn

Last synced: 08 May 2026

https://github.com/shelton-beep/trading-algorithm

A simple trading algorithm for SPY ETF using a moving average crossover strategy. This project analyzes SPY weekly price data, implements a buy/sell algorithm, and tracks performance metrics to evaluate profitability and risk. Ideal for learning algorithmic trading basics and financial data analysis.

data-analysis financial-analysis investment-strategy jupyter-notebook pandas python quantitative-finance technical-analysis time-series-analysis trading-strategies

Last synced: 08 May 2026
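
A moving average crossover strategy, as named in this entry, signals a buy when a short-window average rises above a long-window average and a sell when it falls back below. The sketch below is a minimal version under stated assumptions: the window lengths, function names, and price series are hypothetical, and it is not the repository's algorithm and uses no real SPY data.

```python
def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    return [
        None if i + 1 < window else sum(prices[i + 1 - window:i + 1]) / window
        for i in range(len(prices))
    ]


def crossover_signals(prices, short=3, long=5):
    """Return (index, 'buy' | 'sell') wherever the short SMA crosses the long SMA."""
    short_ma = sma(prices, short)
    long_ma = sma(prices, long)
    signals = []
    prev_diff = None
    for i, (s, l) in enumerate(zip(short_ma, long_ma)):
        if s is None or l is None:
            continue  # not enough history for both averages yet
        diff = s - l
        if prev_diff is not None:
            if prev_diff <= 0 < diff:
                signals.append((i, "buy"))   # short average moved above long
            elif prev_diff >= 0 > diff:
                signals.append((i, "sell"))  # short average moved below long
        prev_diff = diff
    return signals


# Hypothetical weekly closing prices: a dip, a rally, then a decline.
prices = [10, 9, 8, 7, 6, 7, 8, 9, 10, 11, 10, 9, 8, 7, 6]
print(crossover_signals(prices))  # [(7, 'buy'), (12, 'sell')]
```

Because both averages lag the price, signals arrive a few periods after each turning point, which is the usual trade-off of this kind of strategy.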

https://github.com/sabaasif2501/netflix-data-analysis

Exploratory data analysis of Netflix content using Python and pandas, covering content types, genres, countries, and release years.

data-analysis netflix pandas portfolio-project python

Last synced: 08 May 2026

https://github.com/satvikpraveen/numpymasterpro

A hands-on, production-ready toolkit to master NumPy — from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.

broadcasting data-analysis data-science data-source data-visualization jupyter-notebook kmeans-clustering linear-algebra machine-learning matrix-algebra numerical-computation numpy numpy-broadcasting numpy-examples numpy-tutorial open-source python scientific-computing standardization vectorization

Last synced: 08 May 2026

https://github.com/alejandrolara11/data-preprocessing

Data preprocessing through the use of the libraries NumPy and pandas.

data-analysis data-cleaning data-preprocessing numpy pandas python

Last synced: 09 May 2026

https://github.com/l1ght14/customer-churn-prediction

Predict customer churn using machine learning models like Logistic Regression and Random Forest. Includes data preprocessing, model evaluation, feature importance, and insights to drive retention strategies.

churn-prediction classification customer-churn customer-churn-prediction data-analysis logistic-regression machine-learning python random-forest scikit-learn telecom

Last synced: 09 May 2026

https://github.com/mrunmayee3108/financial-chatbot

A Python chatbot for analyzing financial data of companies with revenue, income, assets, cash flow, and debt ratio queries

chatbot data-analysis jupyter-notebook pandas python python3

Last synced: 09 May 2026

https://github.com/rizkipragustono/data_analysis_spark

Exploration: Data Analysis using Spark

apache-spark data-analysis pyspark python spark-sql sql

Last synced: 09 May 2026

https://github.com/tsbarr/toronto-open-data

Analysis of Toronto's open data initiatives. 🌆 Exploring Toronto's urban systems through data science 📊 Python-based analyses of public datasets 🔍 Focus on community impact and urban patterns 🎓 Academic rigour meets practical insights 🔄 Regularly updated with new analyses

api-integration civic-tech ckan-api data-analysis data-cleaning data-science data-visualization exploratory-data-analysis jupyter-notebook open-data pandas public-data python tableau toronto urban-analytics

Last synced: 09 May 2026

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 09 May 2026

https://github.com/magnus0969/black-friday-sales-analysis

An in-depth analysis of Black Friday sales data to uncover trends, customer behavior, and product insights. Utilizing Python, data visualization, and machine learning techniques, this project provides key business intelligence to optimize sales strategies.

analysis data-analysis data-science python sales-analysis

Last synced: 09 May 2026

https://github.com/master-helix/ibm-data-analyst-certification-stock-analysis-project

This is a mini project repository of my IBM Certification involving stock analysis and plotting of Tesla and GameStop

analytics data data-analysis data-visualization ibm matplotlib pandas python web-scraping

Last synced: 09 May 2026

https://github.com/mmfava/qualesuapergunta-scripts-base-2015-2018

This repository contains R scripts used during my biostatistics consulting work. The scripts cover various statistical analyses and served as a basis for the analyses that were carried out. They are not the scripts from the consulting or advisory engagements themselves.

analytics data-analysis r

Last synced: 20 May 2026