An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/saksham-jain177/cryptodataanalysis

A Python powered project that fetches live cryptocurrency data from the CoinMarketCap API, analyzes it, and updates a live Excel sheet every 5 minutes.

api-integration coinmarketcap cryptocurrency data-analysis excel live-data python

Last synced: 12 Jun 2026

https://github.com/marialuizaleitao/walmartsalesanalysis

This project explored data collection and preprocessing, advanced application of SQL queries, and feature engineering. Key calculations, such as COGS (Cost of Goods Sold) and VAT (Value Added Tax), were performed to assess the profitability and financial efficiency of the branches.

business-analytics data-analysis mysql-database sql

Last synced: 13 Jun 2026

https://github.com/gmalbert/immigration

Immigration Data Analysis

data-analysis immigration

Last synced: 14 Jun 2026

https://github.com/jkazari/rollercoaster-eda

Repository of a small data-analysis project in R for Mathematical Software class on the 3rd semester of studying Mathematics at Gdańsk University of Technology

data-analysis r

Last synced: 14 Jun 2026

https://github.com/brunomontezano/sleep-quality-cognition

💤 Analysis of the paper "Associations between general sleep quality and measures of functioning and cognition in subjects recently diagnosed with bipolar disorder".

bipolar-disorder cognition data-analysis sleep-analysis sleep-research

Last synced: 15 Jun 2026

https://github.com/prathmesh2507/global-stock-intelligence-dashboard

Interactive Global Stock Market Analytics Dashboard built using Python, YFinance, Pandas, Streamlit, and Plotly. Analyze 20+ countries and 400+ top stocks with advanced visualizations and financial insights.

dashboard data-analysis data-visualization python stock-analysis streamlit

Last synced: 15 Jun 2026

https://github.com/anderson-andre-p/uber-data-analysis

This repository contains a comprehensive data analysis project focused on Uber rides. The dataset used in this project is a spreadsheet obtained from Uber, containing data related to ride details, such as pick-up and drop-off locations, date and time of the ride, and the fare amount.

data-analysis data-science data-visualization python

Last synced: 15 Jun 2026

https://github.com/victoryfanfare/car-price-prediction

ML модель для определения рыночной стоимости автомобилей с пробегом. Проект включает анализ данных, feature engineering и сравнение различных алгоритмов машинного обучения.

catboost data-analysis jupyter-notebook lightgbm machine-learning pandas python regression

Last synced: 15 Jun 2026

https://github.com/dcs-training/data-wrangling-and-vis-pandas

Introduction to analyzing structured data with the Python libraries pandas, for CSV and TSV data, and ElementTree, for XML data. Go to the readme file

data-analysis data-visualisation data-wrangling python

Last synced: 16 Jun 2026

https://github.com/llnl/cap

HPC workflow that automates the tedious actions of compiling, analyzing, and parsing with bincfg

data-analysis hpc python workflows

Last synced: 17 Jun 2026

https://github.com/juanse0330/registro-pacientes-terapia-python

Proyecto en Python para automatizar el registro y análisis de pacientes en terapia ocupacional domiciliaria. Herramienta orientada al sector salud.

automatizacion data-analysis python salud terapia-ocupacional

Last synced: 17 Jun 2026

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/lotfiferaga/amazon-alexa-reviews-sentiment-analysis

Amazon Alexa, developed by Amazon, allows users to interact with technology through voice commands. Analyzing user sentiments about Alexa, with over 40 million users worldwide, is an intriguing data project.

classification data-analysis python sentiment-analysis

Last synced: 18 Jun 2026

https://github.com/ilhanseyhanx/car-price-prediction-with-machine-learning

🚗 ML-powered car price prediction model with 95.88% accuracy using Random Forest and comprehensive data preprocessing

car-price-prediction data-analysis data-science machine-learning pandas python random-forest regression sklearn

Last synced: 19 Jun 2026

https://github.com/shahaf-f-s/feature-space

A modular framework for combining pandas series features

data-analysis data-science feature-engineering

Last synced: 19 Jun 2026

https://github.com/alinababer/covid19-timeseries-cases-and-deaths-forecasting-

This study is based on confirmed cases and deaths collected from Pakistan. Results demonstrate the promising potential of TIME SERIES model in forecasting COVID-19 cases and highlight the superior performance of the time series compared to the LSTM.we apply AI-based forecasting models such time series ARIMA, LSTM, prophet and VAR.

arima covid-19 data-analysis data-science data-visualization fbprophet forecasting lstm rnn time-series var vectorautoregression

Last synced: 19 Jun 2026

https://github.com/angelmtenor/idafc

Udacity's Intro to Data Analysis

data-analysis

Last synced: 20 Jun 2026

https://github.com/mahapeth/invest-track

Реализация инструмента для мониторинга активности пользователей ИС "Инвест" для ВКР по направлению 01.03.02 Прикладная математика и информатика

analitycs app data-analysis data-visualization jupyter-notebook python sites

Last synced: 20 Jun 2026

https://github.com/sakan811/stress-pattern-occurrence-in-english-words

This project is intended to provide English learners with data that allows them to make a data-driven guess when encountering words that they aren't sure where to stress

data-analysis data-visualization english english-language english-learning language powerbi powerbi-report powerbi-visuals

Last synced: 20 Jun 2026

https://github.com/aonurakman/data-analysis-and-ml-algorithms

An exploration of data analysis techniques and standard ML algorithms on QSAR oral toxicity dataset. - 2021 - Yıldız Technical University

classification clustering data-analysis data-mining isolation-forest python regression

Last synced: 20 Jun 2026

https://github.com/evanmathew/northwind-traders

SQL-powered analysis of sales, employee performance, and customer behavior using PostgreSQL window functions. This project uncovers key business insights to optimize decision-making.

case-study data-analysis jupyter-notebook northwind-traders postgresql python-postgresql sql

Last synced: 20 Jun 2026

https://github.com/dcs-training/datavisualisationwithr2021

Data Visualisation with R Course (delivered by the Centre in October/November 2021). This workshop is focusing on good practice of creating graphs with R and R Studio. Go to the readme file

data-analysis data-visualisation data-wrangling r

Last synced: 23 Jun 2026

https://github.com/anburocky3/cbse-schools-data

Fetch CBSE Schools in seconds and use it for your data projects

cbse data data-analysis data-science grabber nextjs

Last synced: 24 Jun 2026

https://github.com/imosudi/unsupervised-ml-kmeans-analysis

K-Means clustering analysis using synthetic datasets generated with scikit-learn, including meshgrid visualisation, silhouette score evaluation, and investigation of cluster count and random seed effects.

clustering data-analysis jupyter-notebook kmeans kmeans-clustering machine-learning matplotlib python3 scikit-learn silhouette-score unsupervised-learning

Last synced: 25 Jun 2026

https://github.com/manganite/vibespin

VibeSpin is a Python framework for simulating and analyzing 2D lattice spin systems (Ising, XY, and q-state Clock models) with Numba-accelerated Monte Carlo dynamics, correlation/structure diagnostics, and reproducible benchmarking workflows.

clock-model critical-phenomena data-analysis ising-model lattice-models monte-carlo-simulation phase-transitions physics-simulation python scientific-computing spin-models spin-systems statistical-mechanics xy-model

Last synced: 29 Jun 2026

https://github.com/victoorv/criminalite_us

Une analyse de la criminalité en fonction de variables socio-économiques a été menée, incluant la sélection et la comparaison de modèles de régression multiple ainsi que des tests d'hypothèses sur les coefficients et la significativité des modèles.

data-analysis data-science r regression regression-analysis regression-models statistical-analysis statistical-tests statistics

Last synced: 04 Apr 2026

https://github.com/fatihilhan42/eda-spacex-launches-falcon9-and-falcon-heavy

In this project, we analyze the space flight data of Spacex space research company Falcon 9 rocket.

data-analysis data-science data-visualization eda elonmusk spacex

Last synced: 23 Mar 2025

https://github.com/mohammad-malik/covid-visualizations-d3

This project provides a dashboard with five different perspectives on the pandemic, from patient-infection relationships to regional trends and hierarchical distributions. This was developed as part of a project for the course Data Analysis and Visualization (DS3001).

covid-19 d3 d3-visualization d3js data data-analysis data-analytics data-science visualization

Last synced: 28 May 2026

https://github.com/deller23/hotel_booking_data_cleaning

Efficiently transforming raw hotel booking data into actionable insights! This project leverages Python and Pandas for advanced data cleaning—handling missing values, detecting outliers, and optimizing features—ensuring a high-quality dataset ready for analysis and modeling.

data-analysis data-cleaning data-preprocessing data-visualization data-wrangling pandas python

Last synced: 31 Mar 2025

https://github.com/kailenroa/dashboad-excel-huisprijzen

This project focuses on developing a dashboard powered by Funda to visualize house pricing in the Netherlands. The dashboard simplifies the home-buying process by allowing users to compare prices, energy labels, number of rooms, and square meters across different provinces, all in one interactive platform..

dashboard data-analysis excel house-prices

Last synced: 05 Jan 2026

https://github.com/abelarduu/power_bi_analyst

Projeto Power BI para relatório de dados financeiros, com navegação intuitiva e recursos interativos. Oferece uma experiência completa ao usuário, combinando apresentação sofisticada e funcionalidade eficaz para análise de dados.

dashboard data-analysis data-analytics modelagem-de-dados powerbi tratamento-de-dados

Last synced: 08 Sep 2025

https://github.com/adilshamim8/eda-on-health-and-sleep-data

Exploratory Data Analysis (EDA) on health and sleep data, uncovering patterns and insights using Python and visualization tools.

data-analysis data-visualization eda health healthcare sleep sleep-analysis

Last synced: 15 Mar 2025

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 11 Apr 2026

https://github.com/syed-amjad-ali/-bank-churn-ml

Predicting bank customer churn using machine learning. This project includes exploratory data analysis (EDA), feature engineering, classification models (Logistic Regression, Random Forest), and customer segmentation using K-Means clustering.

classification data-analysis data-science eda jupyter-notebook k-means-clustering machine-learning ml python segmentation

Last synced: 09 Mar 2025

https://github.com/bablukumarjha/startup-funding-revenue-analysis-by-sql-and-pandas

SQL project analyzing startup funding, revenue, and founder data to extract business insights using Python and MySQL.

data data-analysis data-platform data-science dataanalysisusingpython dataanalytics pandas-dataframe pandas-library python sql sql-server sqlalchemy sqldatabase

Last synced: 18 May 2026

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/hemangsharma/job-tracker

A comprehensive Streamlit application for tracking and analyzing job applications.

data-analysis python streamlit-dashboard streamlit-webapp

Last synced: 15 Mar 2025

https://github.com/atharvapathak/rsvp_movies_case_study

SQL queries performed on IMDb database to provide recommendations to RSVP Movies based on insights.

data-analysis data-cleaning data-science imdb-dataset rsvp-movies sql

Last synced: 28 Jan 2026

https://github.com/shivakumarhl/digital-music-store-analysis

Digital Music Store Data Analysis using SQL

data-analysis sql

Last synced: 10 Mar 2026

https://github.com/bocchio01/skyward_recruitment_assignment

Assignment to join the PoliMi SkyWard software team

data-analysis kalman-filter model-rocket

Last synced: 15 Mar 2025

https://github.com/zachbateman/easy_plot

Easy Statistical Visualization in Python

data-analysis data-visualization graphics matplotlib python seaborn

Last synced: 18 Jan 2026

https://github.com/thenazar9/user-behavior-email-campaign-analysis-sql

Analysis of user behavior and email campaign performance using BigQuery and Looker Studio, focusing on account creation trends, email engagement, and user segmentation.

analytics bigquery data-analysis data-visualization etl looker-studio sql structured-query-language

Last synced: 16 Oct 2025

https://github.com/darksoulnelson/json-to-excel-converter

This repository provides a tool to convert JSON data to Excel format (.xlsx). It allows you to easily transform structured JSON data into a well-organized spreadsheet for better analysis and visualization.

automation-script automation-tools data-analysis data-converter data-export data-formatting data-tools data-visualization excel excel-automation excel-converter excel-tools json json-exporter json-parser json-processing json-to-csv json-to-excel programming-tools spreadsheet-tools

Last synced: 05 Jul 2025

https://github.com/neerajcodes888/data-science

This repository is a hub for data science enthusiasts, offering a diverse collection of projects, notebooks, and resources covering topics such as data analysis, machine learning, deep learning, and generative AI. Explore innovative ideas, contribute to cutting-edge research, and enhance your skills in the dynamic field of data science

data-analysis data-science data-visualization deep-learning deep-learning-algorithms eda genai jupyter-notebook machine-learning machine-learning-algorithms openai-api pandas plotting python3 sklearn-library streamlit

Last synced: 01 May 2026

https://github.com/eea/eea.reveal

Reveal hidden knowledge by visualizing network structure in your data.

data-analysis data-visualization graphviz network-visualization

Last synced: 18 Mar 2025

https://github.com/aroramrinaal/spotistats

Spotistats is a data analysis and visualization project based on your Spotify streaming history.

data-analysis numbers spotify spotify-history visualization

Last synced: 15 Mar 2025

https://github.com/ginga1402/youtube_analysis

Exploratory Data Analysis on YouTube data

college-project data-analysis pandas-python

Last synced: 30 Mar 2025

https://github.com/balajimohan18/loan-classification-datascience-project

This project uses machine learning algorithms to predict the classification of loan status. The dataset is loaded and some transformation is done using SQL for getting a proper dataset with some valid informations.

classification data-analysis data-cleaning data-science data-visualization loan-prediction loan-status machine-learning sql supervised-learning

Last synced: 03 Sep 2025

https://github.com/alchemine/analysis-tools

Analysis tools for machine learning projects

data-analysis explanatory-data-analysis machine-learning python

Last synced: 06 Aug 2025

https://github.com/misaghmomenib/shop-revenue-analysis

A Data Analysis Project Aimed at Analyzing and Forecasting Shop Revenue Based on Sales and Other Business Metrics. It Helps to Identify Trends, Patterns, and Key Factors Influencing Revenue to Make Data-driven Decisions for Business Growth.

data-analysis data-visualization python

Last synced: 24 Mar 2025

https://github.com/noturlee/iris-dataanalyis

This project aims to classify Iris flowers into three species—setosa, versicolor, and virginica—based on their sepal and petal measurements using machine learning techniques. The dataset comprises 150 samples evenly distributed among these species

data-analysis data-modeling data-science data-structures-and-algorithms data-visualization

Last synced: 08 Apr 2025

https://github.com/adrianlardies/from-data-to-insight

This project creates and manages a MySQL database to analyze the performance of Bitcoin, Gold, and the S&P 500 in response to economic factors. It integrates historical data, executes advanced SQL queries, and visualizes key insights, showcasing the power of SQL and Python in financial analysis.

data-analysis data-science matplotlib pandas python seaborn sql

Last synced: 12 Apr 2026

https://github.com/adarshpheonix2810/fake-job-post-detection

This project focuses on detecting fake job posts using machine learning. Fake job advertisements are often created to scam individuals by stealing personal information or money.

data-analysis deep-learning joblib machine-learning nlp-machine-learning numpy pandas python scikit-learn tkinter

Last synced: 12 Apr 2026

https://github.com/mindlessmuse666/train-test-splitter

Анализ данных о пассажирах Титаника и разбиение на обучающую и тестовую выборки. Практическое задание по дисциплине "Основы применения методов искусственного интеллекта в программировании".

data-analysis data-preprocessing data-visualization machine-learning pandas python scikit-learn seaborn titanic train-test-split

Last synced: 12 Apr 2026

https://github.com/krzysikd/apartment-prices-in-poland-analysis-and-visualization

Data Analyst portfolio project that involves cleaning, transforming, and visualizing data to create an insightful dashboard. The project uses SSIS for ETL processes, SSMS for database management and queries, and Power BI for data visualization, focusing on the analysis of rental and sales apartment prices in Poland.

data-analysis data-cleaning data-visualizations powerbi sql sqlserver ssis

Last synced: 04 Feb 2026

https://github.com/rauhanahmed/auto-data-analyzer

AutoDataAnalyzer: Automate data ingestion, analysis, and visualization with AI/ML-powered pipelines. Features natural language query processing, interactive Plotly visualizations, and seamless deployment via Docker.

ai-powered-analysis automated-pipeline cicd data-analysis data-visualization docker end-to-end-project flask generative-ai langchain llama3-1 machine-learning natural-language-processing plotly python3 pywebio

Last synced: 12 Apr 2026

https://github.com/prgermux/defect-finder

Defect Finder is an interactive Python-based GUI application for detecting and analyzing mechanical and non-mechanical defects in data. It provides defect visualization, periodicity analysis, and statistical insights, making it ideal for research and quality control workflows.

data-analysis defect-detection gui pyqt5 python quality-control statistics visualization

Last synced: 24 Mar 2025

https://github.com/leosimoes/digitalinnovationone-analise-datasets

Projeto prático "Análise de dados com Python e Pandas" do Bootcamp "Banco Carrefour Data Engineer" da Digital Innovation One.

data-analysis data-science python

Last synced: 24 Mar 2025

https://github.com/hadeel-13/new_home

New Home is a Website for Buying and Selling Real Estate with user preferences, it is my Graduation project with a grade of 93%.

bootstrap5 chartjs css css3 data-analysis data-mining google-maps html html5 javascript jquery

Last synced: 12 Apr 2026

https://github.com/vikpires/ds_tips-dataset

Projeto individual do bootcamp de ciência de dados avanti 2024.2, com o objetivo de analisar e observar padrões no conjunto de dados "Tips".

data-analysis data-science data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy pandas python seaborn tips

Last synced: 17 Sep 2025

https://github.com/hari7261/data-visualization

Python-based application built using CustomTkinter for the graphical user interface (GUI) and Matplotlib for data visualization. It allows users to import datasets, perform real-time data visualization, and analyze data using various chart types and machine learning techniques.

data-analysis data-visualization export hari7261 import python realtime-visualization

Last synced: 17 Jun 2025

https://github.com/shreeparab1890/chat-analyzer

This project is a Data Analysis project to analyze the WhatsApp chats.

data-analysis numpy pandas python

Last synced: 12 Apr 2026

https://github.com/m4tice/qm_project

Bicycle project crowd evaluation.

data-analysis data-engineering data-visualization

Last synced: 16 Mar 2025

https://github.com/dhruvil-26/powerbi-projects

This repository contains Power BI projects showcasing data analysis and interactive dashboards. Each project includes detailed visualizations and insights on diverse topics such as loan analysis, sales performance, and customer behavior.

customer-behavior-analysis data data-analysis interactive-dashboards loan-analysis powerbi sales-performance visualization

Last synced: 04 Feb 2026

https://github.com/danielafishwickinacap/coderhouse_da

Data analyst Final Project files

data-analysis

Last synced: 18 Jan 2026

https://github.com/ajay1214/credit-card-transaction-dashboard

Credit Card weekly dashboard that provides real-time insights into key performance metrics and trends

data-analysis powerbi sql

Last synced: 04 Feb 2026

https://github.com/tomijuarez/lemmatisation

Lemmatisation fully implemented in Java.

algorithms data-analysis data-science java-8 lemmatization oop

Last synced: 08 Apr 2025

https://github.com/rahulsm20/insurance-data

A data analytics project dealing with risk assessment and it's effects in health insurance.

data-analysis data-analytics machine-learning matplotlib numpy pandas python scikit-learn

Last synced: 12 Apr 2026

https://github.com/pranav016/exploratory-data-analysis-of-sp500-dataset

This a data-analysis that I performed on the S&P 500 dataset and answered a few questions through data visualization techniques.

data-analysis

Last synced: 30 Oct 2025

https://github.com/soumya-thoutam/covid-19-impact-on-u.s.-states-and-colleges

Covid-19 analysis and impact on United States Colleges and States using SQL and Tableau.

covid-19 dashboard data-analysis data-visualization dataset sql sql-server tableau

Last synced: 04 Sep 2025

https://github.com/ernanej/data-science-dca0131

Files, developed throughout the 2024.1 semester of the Data Science discipline taught at the Federal University of Rio Grande do Norte by the Department of Computer Engineering and Automation (DCA). 📚

big-data data-analysis data-science ia

Last synced: 30 Mar 2025

https://github.com/hemangsharma/breast-cancer-patient-dashboard

This interactive Streamlit dashboard visualizes insights from the SEER Breast Cancer Dataset (2006-2010)

data-analysis streamlit streamlit-dashboard streamlit-webapp

Last synced: 05 May 2026

https://github.com/theveryhim/frequent-item-sets-and-lsh

A practice on finding frequent item sets and similar items in pysaprk framework

big-data data-analysis frequent-itemset-mining locality-sensitive-hashing pyspark text-processing

Last synced: 03 Jul 2025

https://github.com/quantumudit/groceries-basket-analysis

This project performs market basket analysis using Power BI and Python to reveal associations between grocery items. It involves transforming raw transaction data into a processed dataset, creating interactive Power BI reports, and generating key insights through Python, enabling data-driven decision-making.

data-analysis data-visualization pandas powerbi python

Last synced: 12 Apr 2026

https://github.com/abdelrahmanbayoumi/titanic-machine-learning-from-disasters

Knowing from a training set of samples listing passengers who survived or did not survive the Titanic disaster, can our model determine based on a given test dataset not containing the survival information, if these passengers in the test dataset survived or not.

data-analysis data-science data-visualization machine-learning pandas

Last synced: 09 Apr 2025

https://github.com/grooviter/tablesaw

Java dataframe and visualization library

data-analysis dataframe java visualization

Last synced: 28 Mar 2025

https://github.com/yanny-alt/competitor-sales-analysis-in-power-bi

This project aims to analyze competitor sales for a fictional manufacturing company, Sintec, using Power BI. The focus is on integrating, cleaning, and modeling data from multiple sources to generate insightful reports on company and competitor performance.

data-analysis powerbi sales-analysis

Last synced: 07 Jan 2026

https://github.com/jnyambok/epl_dashboard

English Premier League Dashboard summarizing match data from 2009-2024

data-analysis data-science gcp powerbi

Last synced: 04 Sep 2025

https://github.com/jbalooshie/election_analysis

A Python script built to analyze specific election's results, and be re-purposed to analyze the results of other elections. The script provides you with different breakdowns of the vote based on candidate and county,

data-analysis data-science elections python

Last synced: 09 Apr 2025