An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/smohanta23/ev-trendanalytics-24

This Tableau project analyzes EV adoption trends using data up to May 2024. Visualizations cover growth, geography, market share, CAFV eligibility, and consumer preferences, supporting data-driven decisions with detailed drill-downs. Data is meticulously cleaned, offering stakeholders valuable insights into EV market dynamics and trends for future.

business-intelligence data-analysis data-engineering electric-vehicles feature-engineering kpianalysis predictive-analytics tableau trendanalysis

Last synced: 27 Mar 2026

https://github.com/aleskandro/r-hadoop-madreduce-examples

A lot of examples about using R with hadoop for MapReduce with and without libraries as rhadoop/rhipe - DIEEI@unict.it - Advanced Programming Languages

data-analysis hadoop mapreduce r

Last synced: 04 Nov 2025

https://github.com/amanyadav-07/customer-churn-prediction

Machine Learning project to predict customer churn using Logistic Regression, Random Forest, and XGBoost. Includes data preprocessing, feature engineering, SMOTE balancing, model training, evaluation, and business insights.

accuracy-metrics data-analysis data-visualization logistic-regression machine-learning matplotlib numpy pandas python3 random-forest-classifier seaborn sklearn xgboost-classifier

Last synced: 11 Apr 2026

https://github.com/bala-1409/loan-clustering-datascience-projects

This project uses Machine Learning to Cluster loan together based on their similarities. The project uses a dataeset of loan application which includes information about the Loan amount and Balance. The project then use the clustering algorithm to group the loan together based on the similarities.

clustering clustering-algorithm data-analysis data-science data-visualization kmeans-clustering machine-learning machine-learning-algorithms sql unsupervised-learning unsupervised-machine-learning

Last synced: 22 Mar 2025

https://github.com/bala-1409/sales-forecasting-datascience-project

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

data data-analysis data-science data-visualization datacleaning exploratory-data-analysis machine-learning-algorithms modelfitting prediction predictive-analytics predictive-modeling python3 regression-models salesforecast supervised-learning

Last synced: 26 Apr 2026

https://github.com/sayamalt/fake-news-classification-using-fine-tuned-bert

Successfully developed a text classification model to predict whether a given news text is fake or not by fine-tuning a pretrained BERT transformed model imported from Hugging Face.

bert-embeddings bert-model data-analysis data-visualization deep-learning fine-tuning-bert model-evaluation model-training-and-evaluation text-classification text-preprocessing text-tokenization tokenizer-nlp wordcloud-visualization

Last synced: 05 Apr 2025

https://github.com/badranalyst/student-tests-data-analysis-application

Python-based analysis of student test scores in math, reading, and writing, examining correlations with parental education, lunch type, and test preparation. Includes data cleaning, visualization, and statistical insights into factors influencing academic performance.

data-analysis data-visualization dataset matplotlib numpy pandas python sklearn

Last synced: 05 May 2026

https://github.com/oscarmtr/metrov

Interactive viewer for tropospheric meteorological soundings

climate data-analysis meteorology skew-t soundings temperature tropospheric web

Last synced: 01 Mar 2026

https://github.com/yandexdataschool/ml-sweights-experiments

Experiments for the "Machine Learning on data with sPlot background subtraction" paper

data-analysis high-energy-physics machine-learning statistics

Last synced: 15 May 2025

https://github.com/lawwrites/uncovering_fintech_user_insights

Linear Regression and K-Means to obtain user behavior insights for a fintech company

data-analysis data-science kmeans-clustering linear-regression python unsupervised-machine-learning

Last synced: 22 May 2026

https://github.com/dimits-ts/college_analysis

A statistical study about US college admissions, featuring a full report in LaTeX.

anova data-analysis exploratory-data-analysis linear-regression statistics

Last synced: 25 Jan 2026

https://github.com/malucor/livros

Programa em Python para fazer uma análise de dados sobre livros, a partir de um arquivo Excel.

analise-de-dados book books bookshelf data-analysis ipynb jupyter-notebook livro livros python

Last synced: 16 May 2026

https://github.com/namratagulati/fraud_detection

This fulfills all the requirements of a fraud detection model developed on linear regression using feature scaling, engineering and testing model with the help of auc-roc curve and others.

data-analysis data-visualization machine-learning machine-learning-algorithms machinelearning-python

Last synced: 04 Jun 2026

https://github.com/gaaniruddha/mphil

This repository contains a copy of my final MPhil presentation and panel report.

data-analysis gpu-imager radio-astronomy

Last synced: 03 Mar 2026

https://github.com/l337x911/simulations

data analysis via in silico simulations

data-analysis machine-learning python3

Last synced: 06 Apr 2025

https://github.com/omnipotence-eth/manufacturing-quality-analytics

SQL + Python pipeline for semiconductor NCR analysis — supplier performance, defect Pareto, yield trends

analytics data-analysis etl manufacturing matplotlib pandas postgresql python quality sql

Last synced: 11 Apr 2026

https://github.com/johannaschmidle/road-collisions-project

Analyzed road accident data in the UK from 2019 to 2022 to identify patterns and trends in road accidents, for Effective Road Management [Excel]

data-analysis data-visualization excel pivot-tables traffic-analysis

Last synced: 01 Mar 2026

https://github.com/juliargubolin/sql-for-data-analysis

This repository was created in order to insert all the documents, files and notes I took while learning SQL and data analysis through "SQL for Data Analysis: Advanced Techniques for Transforming Data Into Insights" by Cathy Tanimura (O'Reilly).

advanced data-analysis data-science sql

Last synced: 11 Jan 2026

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 27 Apr 2026

https://github.com/csoren66/customer-personality-analysis

Predict how different customer segments will respond for a particular product or service.

data-analysis data-visualization python

Last synced: 03 Mar 2025

https://github.com/yash22222/pwc-power-bi-virtual-case-experience

The Power BI PwC Virtual Case Experience is an exciting and educational program designed to provide participants with hands-on exposure to Power BI, a prominent business intelligence and data visualization tool, within the context of consulting at PwC.

business-analyst business-analytics business-intelligence dashboard data-analysis data-analyst data-analytics dax microsoft-power-bi powerbi powerbi-dashboards powerbi-visuals pwc

Last synced: 02 Mar 2026

https://github.com/arianarmw/da01-bike-sharing-analysis

🚴‍♀️ Data analysis project on bike-sharing systems. Includes data wrangling, exploratory data analysis (EDA), visualization, and interactive dashboards built with Streamlit. Explore patterns in bike usage and rental data!

bike-sharing-analysis data-analysis exploratory-data-analysis python streamlit visualization

Last synced: 11 Apr 2026

https://github.com/szuzick/us-immigration-presidential-analysis

Power BI dashboard analyzing 40 years of U.S. immigration data across presidential administrations (1981-2020)

dashboard data-analysis data-visualization government-data immigration powerbi powerbi-dashboards powerbi-visuals presidential-analysis

Last synced: 10 Jun 2026

https://github.com/pipe199x/end-to-end-prediction-california

End-to-end prediction project using various technologies to predict housing prices in California.

california-housing data-analysis machine-learning python

Last synced: 11 May 2026

https://github.com/monarch1108/customerinsights-kmeans

understanding customers using KMeans and RFM(recency, frequency & monetary) analysis

data-analysis data-visualization kmeans-clustering machine-learning matplotlib numpy pandas scikit-learn

Last synced: 11 May 2026

https://github.com/hrosicka/czechpopulationestimation

This GitHub repository contains Python code for data analysis and population prediction in the Czech Republic up to the year 2050. The code is written in Python and utilizes the Pandas and Matplotlib libraries.

data-analysis data-visualization matplotlib matplotlib-figures matplotlib-pyplot pandas pandas-dataframe pandas-library pandas-python python python3

Last synced: 11 May 2026

https://github.com/ceia-prefeitura/urban-lit-tracker-etl

UrbanLitTracker coleta artigos acadêmicos sobre mudanças urbanas via OpenAlex API, processa e armazena em MongoDB. Oferece dashboard interativo com Dash, exibindo dados como trabalhos mais relevantes, autores e palavras-chave frequentes, facilitando a análise e visualização da literatura urbana.

academic-research bibliometrics data-analysis data-pipeline data-visualization etl openalex-api urban-studies

Last synced: 11 May 2026

https://github.com/OdessaZ/Portfolio-Projects

This is a repository I have created to showcase skills, share projects and track my progress in Data Analytics and Data Science

applied-mathematics data-analysis data-science excel jupyter-notebook matplotlib-pyplot pandas portfolio python r r-studio seaborn sql statistics

Last synced: 12 May 2026

https://github.com/jayita11/customer-engagement-insights-for-yelp-restaurant-business-success

This project analyzes Yelp restaurant data using SQLite, Python, and Tableau to explore user engagement, reviews, and ratings. It provides insights into restaurant success across cities, regions, and user behavior.

customer-engagement data-analysis interactive-visualizations json python ratings review sqlite3 tableau-dashboards-for-data-visualization yelp-restaurants

Last synced: 12 May 2026

https://github.com/ggarciajavier/udacity-dalf-project2-wrangle-openstreetmap-data

Work performed for the 2nd project of Udacity Data Analyst Nanodegree: OpenStreetMap data wrangling and analysis.

data-analysis openstreetmap python sql

Last synced: 12 May 2026

https://github.com/krypten/playingcardsstatisticalanalysis

Statistical Analysis of Playing Cards (Descriptive Statistics: Final Project)

data-analysis machine-learning machinelearning python statistics udacity

Last synced: 12 May 2026

https://github.com/priyanshu7639/data_visualization_dashboard

An Interactive data visualization tool that combines traditional plotting capabilities with modern AI assistance. It allows users to create and modify visualizations through natural language commands, making data exploration accessible to users of all skill levels.

business-analytics data-analysis data-engineering data-exploration data-science data-visualization datapreprocessing datascience interactive-visualizations matplotlib plotly plotting python research-tool streamlit

Last synced: 12 May 2026

https://github.com/johannaschmidle/amazon-cat-couch

Customer product reviews + ratings analysis and visualization [Python, Excel, Tableau, R]

data-analysis data-visualization jupyter-notebook python-notebook r-markdown sentiment-analysis text-analysis web-scraping

Last synced: 11 Jun 2026

https://github.com/agailloty/preprocess

preprocess is a fast data analysis preprocessing tool.

cli data-analysis preprocessing-data

Last synced: 12 May 2026

https://github.com/fisseha-estifanos/telecom

A showcase repository for a specific telecommunication company. Used to analyze several telecommunication data set features and generate useful insights accordingly. Insights generated could be seen at https://github.com/Fisseha-Estifanos/telecom-visualizer or at https://fisseha-estifanos-telecom-visualizer-home-huxgy0.streamlitapp.com/

data-analysis notebooks-jupyter python visual-studio-code visualization

Last synced: 12 May 2026

https://github.com/sricasea/fundraising-insights-mwpccc

Data storytelling meets impact strategy — a nonprofit fundraising analysis project combining SQL, Python, and Deepnote to uncover donor trends and guide smarter decisions.

data-analysis data-storytelling data-visualization deepnote fundraising nonprofit portfolio-project python sql

Last synced: 12 May 2026

https://github.com/roland045/smart_fluid_sedimentation_tester

Control program for custom developed smart fluid sedimentation tester system

arduino data-analysis instrumentation measurement sensor

Last synced: 13 May 2026

https://github.com/manukot/sturdy-engine-python-

I've leant not only various Theoretical Concepts but also practical projects in my Masters Coursework

data-analysis data-visualization python3

Last synced: 13 May 2026

https://github.com/mituskillologies/dkte-da-mar25

Programs conducted at DKTE's Engineering Institute, Ichalkaranji in training on Python Data Analytics March 2025.

data-analysis matplotlib numpy pandas python-programming tkinter-python

Last synced: 13 May 2026

https://github.com/nlink-jp/shell-agent-v2

macOS local-first chat & agent tool with interactive data analysis (Wails v2 + React)

data-analysis duckdb golang llm macos react wails

Last synced: 13 May 2026

https://github.com/madhurragarwal/advertising-data-set---eda-and-ml

Logistic Regression and EDA done on Advertising Data set

data-analysis machine-learning

Last synced: 13 May 2026

https://github.com/yeonjaee/data-analytics

converts raw data into actionable insights

data-analysis text-mining

Last synced: 11 Jun 2026

https://github.com/balajimohan18/foreign-exchange-rate-time-series-datascience-project

This project will use time series analysis to forecast the exchange rate between the euro and the US dollar. The project will use a variety of statistical techniques, such as ARIMA to model the data and forecast the exchange rate.

data-analysis data-analytics data-preprocessing data-science data-transformation data-visualization eda exploratory-data-analysis foreign-exchange-rates machine-learning model-fitting predictive-modeling python3 time-series time-series-analysis

Last synced: 14 May 2026

https://github.com/iamsainikhil/web-data-scraping

Data scraping from a webpage using Python

beautiful-soup data-analysis data-scraping python

Last synced: 11 Jun 2026

https://github.com/skuschel/postexperiment

postprocessor for experimental (event based) data.

data-analysis eventstore hacktoberfest postprocessing

Last synced: 12 Jun 2026

https://github.com/luizassimoes/q5ga-latency-and-throughput

Quick 5G Analyser: PyQT5 software developed to help with simple graphical analysis and chart generating for ping and iperf3 tests.

data-analysis data-visualization pyqt5 python

Last synced: 13 Jun 2026

https://github.com/gmalbert/immigration

Immigration Data Analysis

data-analysis immigration

Last synced: 14 Jun 2026

https://github.com/dipeshgoyal013/crypto-currency-dashboard

This project analyzes historical cryptocurrency data and builds an interactive Power BI dashboard. It includes time-series forecasting of Bitcoin and Ethereum using ARIMA and Power BI’s forecasting model.

data-analysis excel power-bi python

Last synced: 15 Jun 2026

https://github.com/juanse0330/registro-pacientes-terapia-python

Proyecto en Python para automatizar el registro y análisis de pacientes en terapia ocupacional domiciliaria. Herramienta orientada al sector salud.

automatizacion data-analysis python salud terapia-ocupacional

Last synced: 17 Jun 2026

https://github.com/kheriberto/bedu_dc

Ejercicios del curso de "python desde 0" de la plataforma BEDU

data-analysis python

Last synced: 18 Jun 2026

https://github.com/duoan/ds-nbs

Data analysis and machine learning notebook.

data-analysis data-scientists deep-learning kaggle-competition machine-learning

Last synced: 18 Jun 2026

https://github.com/ilhanseyhanx/car-price-prediction-with-machine-learning

🚗 ML-powered car price prediction model with 95.88% accuracy using Random Forest and comprehensive data preprocessing

car-price-prediction data-analysis data-science machine-learning pandas python random-forest regression sklearn

Last synced: 19 Jun 2026

https://github.com/dcs-training/r-visualisation-and-stats

This repository contains material from a 8 classes course on Data Visualisation and statistics with R

data-analysis data-visualisation data-wrangling intro-to-programming r statistics

Last synced: 20 Jun 2026

https://github.com/jayavarshini-jayakumaran/nba-exploratory-data-analysis

A data analytics project that explores NBA game and player data using Python and Power BI. Features data preprocessing, EDA, feature engineering, and an interactive dashboard for visualizing team and player performance trends.

data-analysis data-visualization exploratory-data-analysis powerbi python3

Last synced: 20 Jun 2026

https://github.com/haseebn19/urban-housing-demand

A full-stack web application for visualizing housing and labour market data

data-analysis data-visualization docker full-stack gradle statistics web webapp

Last synced: 22 Jun 2026

https://github.com/engusseus/warframe-market-set-profit-analyzer

Python tool that analyzes Warframe Market data to find profitable item sets to trade

api data-analysis python trading waframe

Last synced: 23 Jun 2026

https://github.com/anburocky3/cbse-schools-data

Fetch CBSE Schools in seconds and use it for your data projects

cbse data data-analysis data-science grabber nextjs

Last synced: 24 Jun 2026

https://github.com/parsabordbar/ctx3docs

The Documentation for context Tree Project.

ai-tools context ctx3 ctx3-docs data-analysis documentation tree workflow

Last synced: 25 Jun 2026

https://github.com/darkdk123/house-valuation-model

A Challenge Project in a Boot-Camp to create a ML Model to predict the prices of houses in Boston Massachusetts from multiple parameters Using Multivariable Regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 07 Jul 2025

https://github.com/borjamome/explorando-madrid

Exploring Madrid: A Data-driven Analysis with R 🐻🌳

data-analysis data-visualization madrid r

Last synced: 26 Mar 2025

https://github.com/manish506/loan-approval-prediction

Explore predictive modeling in this project by applying classification techniques to a loan approval dataset. Analyze and preprocess the data, then use models like K-Nearest Neighbors, Random Forest, SVC, and Logistic Regression to predict loan outcomes. Gain insights into approval factors and enhance prediction accuracy.

classification classification-models data-analysis data-science jupyter-notebook loan-approval-prediction machine-learning predictive-analytics predictive-modeling project python

Last synced: 19 Jan 2026

https://github.com/valyaevgeorgiy/r_basic

Работа с основами среды R и тем самым изучения нового языка программирования, связанного непосредственно с анализом данных и построением графиков и диаграмм.

coding data data-analysis r rstudio

Last synced: 12 Dec 2025

https://github.com/aakk23/netflix_sql_project

This SQL project provides an analytical overview of Netflix's movies and TV shows dataset, uncovering key insights related to content types, ratings, release trends, and geographic distribution. It helps explore patterns in content availability, audience targeting, and regional preferences to support data-driven decisions.

data-analysis netflix-data-analysis postgresql sql

Last synced: 10 Apr 2025

https://github.com/ritap03/neuralnetwork-shapeclassifier

Feedforward neural network system in MATLAB for geometric shape classification. Includes data preprocessing, network training and evaluation, confusion matrix analysis, and a graphical interface for user interaction and model testing.

ai data-analysis deep-learning feedforward-network gui image-classification machine-learning matlab neural-network pattern-recognition

Last synced: 14 May 2026

https://github.com/jackmnob/python-tableau-eda-stockdash

Data cleaning, preparation, and manipulation (EDA) for an interactive stock market dashboard with Tableau - using pandas (Python) via JupyterLab

cleaning-data dashboard data-analysis data-preparation eda jupyter-notebook jupyterlab python tableau-public

Last synced: 14 May 2026

https://github.com/sidsin0809/hmdb-endo-flagger

A Python toolkit to identify and score endogenous human metabolites from HMDB XML metadata

data-analysis hmdb metabolomics ontology pipeline python-3 streaming-parser xml-parsing

Last synced: 06 Jul 2025

https://github.com/hemangsharma/bookingdataanalysisreport

The report helps understand key trends and insights around customer bookings, pricing, and other related attributes.

analysis data data-analysis data-analytics data-visualization streamlit streamlit-dashboard

Last synced: 14 May 2026

https://github.com/sandergi/ekichabi

A digital phonebook to connect sustenance farmers in Tanzania. Works via USSD so farmers without an internet connection can use it (via their Telecom). Build with Django in Python and a MySQL database. This is a public copy of the private repo with user information stripped.

android data-analysis ict4d research ussd

Last synced: 14 May 2026

https://github.com/codeonthespectrum/web-scrap

Este projeto realiza o web scraping da Wikipédia para obter dados sobre os municípios mais populosos do estado do Rio de Janeiro.

data-analysis data-visualization webscraping

Last synced: 16 Feb 2026

https://github.com/muneeb706/r-programming

R-Programming examples for data analysis.

data-analysis r-programming

Last synced: 26 Mar 2025

https://github.com/wwgolay/hr1099-timelapse-vlbi

The repository for HR1099 timelapse VLBI.

astronomy astrophysics data-analysis website

Last synced: 03 Apr 2025

https://github.com/hevalhazalkurt/word_analyser

A web app developed in Python and Django that analyzes given text mathematically and sentimentally.

analyzer analyzes content data-analysis django emotion python python3 sentiment sentiment-analyser sentiment-analysis text text-analysis

Last synced: 19 May 2026

https://github.com/prakashjha1/whatsapp-chat-analyzer

WhatsApp Analyzer means we are analyzing our WhatsApp group activities. It tracks our conversation and analyses how much time we are spending or saying it as “wasting” on WhatsApp.

data-analysis data-science natural-language-processing pandas pyhton regular-expression

Last synced: 15 May 2026