An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/muthukumar0908/youtube-data-harvesting-and-warehousing-using-sql-mongodb-and-streamlit

Create a simple and intuitive user interface using Streamlit, From the youtube getting and extracting the data by using API key. That data stored in database.

data-analysis mongodb-atlas python sqldatabase streamlit-webapp youtube-api

Last synced: 24 Apr 2026

https://github.com/manisharora96/data-analysis-of-smartwatch

The project is structured with sample data, step-by-step Jupyter notebooks, and modular Python scripts for automated analysis

data-analysis data-visualization jupyter-notebook python smartwatch-analysis

Last synced: 24 Apr 2026

https://github.com/cyberoctane29/python-for-data-analysis

A repository dedicated to learning Python for data analysis, data science, and data analytics. This collection of Jupyter notebooks covers practical exercises and concepts from the Google Advanced Data Analytics Professional Certificate program.

data data-analysis data-analytics data-science python

Last synced: 24 Apr 2026

https://github.com/edwinrlambert/emomap-sentiment-analysis

To analyze public sentiment related to specific locations in a city (e.g., parks, transit stations, restaurants, neighborhoods) using geo-tagged social media posts, reviews, and comments. The goal is to visualize how people feel across different areas and times.

data-analysis jupyter-notebook python sentiment-analysis

Last synced: 24 Apr 2026

https://github.com/amlanmohanty1/zepto-sql-data-analysis-project

Complete Data Analysis on Zepto Inventory data using SQL

data-analysis database inventory-management postgresql sql zepto

Last synced: 24 Apr 2026

https://github.com/gnodux/adb-link

An MCP server that connects to multiple databases. Supports access control and dynamic SQL query tool registration and invocation.

agent ai-tools data-analysis database-gateway go mcp mcp-server

Last synced: 06 Jun 2026

https://github.com/pedrohdosanjos/economic-data-analysis

This project aims to analyze the export data from various states in the United States to Brazil over time. The data is sourced from the FRED (Federal Reserve Economic Data) API and processed to identify the top 5 exporting states for each year, as well as the states with the highest total export value across all years.

api data-analysis data-visualization jupyter-notebook python

Last synced: 24 Apr 2026

https://github.com/ismielabir/pycsvsummarizer

A lightweight tool to summarize CSV files using various features.

csv data-analysis data-summary python

Last synced: 25 Apr 2026

https://github.com/mariann95/sql_data_warehouse_and_analytics_project

Building a modern data warehouse with SQL Server, including ETL processes, data modeling, and analytics. This repository also contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis.

data-analysis data-analytics data-cleaning data-engineering data-lakehouse data-science data-science-portfolio data-warehouse data-warehousing datalake datawarehouse datawarehousing etl etl-job etl-pipeline medallion-architecture sql sql-query sql-server sqlserver

Last synced: 06 Jun 2026

https://github.com/mehmetkahya0/gallstone_dataset_analysis_project

Safra Taşı Hastalığı (Gallstone-1) Veri Seti Analizi (https://archive.ics.uci.edu/dataset/1150/gallstone-1)

analysis analytics data data-analysis data-science data-visualization database graph matplotlib python

Last synced: 25 Apr 2026

https://github.com/fbarffmann/belly-button-challenge

Built an interactive JavaScript dashboard to visualize bacterial biodiversity from belly button samples. Analyzed data from 153 participants and identified OTU 1167 as the most common bacteria.

biodiversity dashboard data-analysis data-visualization interactive-charts javascript json plotly

Last synced: 25 Apr 2026

https://github.com/rubix982/product-quality-classification

This is an implementation for the CIKM AnalytiCup 2017, around the topic of "Product Title Quality". The goal is to take SKUs and rank its title's clarity and conciseness. Referenced papers are attached to this repository. And as such, the aim is to craft ensemble models that either try to replicate results or find new methods for classification.

data data-analysis information-retrieval jupyter-notebook machine-learning nlp python spacy-nlp

Last synced: 25 Apr 2026

https://github.com/tmoulik/bikeshare-python

Analysis of Bikeshare data from three major cities

data-analysis data-visualization python udacity-nanodegree

Last synced: 25 Apr 2026

https://github.com/xjwllmsx/hacker-news-engagement

Analyze Hacker News data to reveal which post types and posting hours spark the most discussion, using Python and a reproducible Jupyter notebook.

data data-analysis jupyter python

Last synced: 25 Apr 2026

https://github.com/ddihora1604/iit_patna

A multifaceted project involving applying ML models like Ridge Classifier, RNN, RIDOR, Rotation Forest and RUSBoost, integrating SMOTE for class balancing, and handling diverse datasets including those for seating arrangement tasks.

data-analysis data-visualization datamodelling machine-learning-algorithms python

Last synced: 25 Apr 2026

https://github.com/marielachirinosr/bellabeat-wellness-data-trends

Analyzing smart device data for insights on user activity patterns to optimize interventions for better health outcomes.

data data-analysis data-visualization pandas python python3 tableau tableau-public

Last synced: 25 Apr 2026

https://github.com/m-biriulova/python-job-market-analysis

Web scraping, data analysis, and visualization of Python developer vacancies in Czech Republic.

automation beautifulsoup data-analysis data-visualization portfolio-project python selenium web-scraping

Last synced: 25 Apr 2026

https://github.com/sarangs1621/weather-prediction

Weather Prediction Using Machine Learning is a project that leverages machine learning algorithms to predict weather conditions based on historical data. It evaluates three popular ML models (Decision Tree, KNN, and Logistic Regression) and provides performance insights through metrics and visualizations.

data-analysis decision-tree jupyter-notebook knn logistic-regression machine-learning predictive-modeling python scikit-learn weather-prediction

Last synced: 25 Apr 2026

https://github.com/aastopher/mma_outcome

Simple exploratory analysis of UFC Fights and Vegas fight odds from 1993 to 2021

data-analysis data-visualization

Last synced: 06 Jun 2026

https://github.com/edwinrlambert/investigating-netflix-movies

Demonstrates data analysis and visualization techniques for Netflix movies using Python in a Jupyter notebook. This is a DataCamp project.

data-analysis data-analysis-python netflix python

Last synced: 25 Apr 2026

https://github.com/marielachirinosr/hotel-data-analysis

Pandas & Matplotlib Learning Analysis. Repository featuring data analysis projects using Pandas and Matplotlib libraries

data data-analysis matplotlib pandas python

Last synced: 25 Apr 2026

https://github.com/devexpress-examples/winforms-create-a-custom-exporter-for-pivotgridcontrol-with-xtrareport

This example illustrates how to dynamically create a custom report based on PivotGridControl content in WinForms.

data-analysis dotnet pivot-grid pivot-grid-for-winforms winforms

Last synced: 26 Apr 2026

https://github.com/chandansoren/customer-personality-analysis

Predict how different customer segments will respond for a particular product or service.

data-analysis data-visualization python

Last synced: 26 Apr 2026

https://github.com/devexpress-examples/wpf-pivotgrid-customize-the-cell-template

This example demonstrates how to customize the cell appearance in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 26 Apr 2026

https://github.com/dcs-training/2023-10-22-carpentry-social-science

Go to https://dcs-training.github.io/2023-10-22-Carpentry-Social-Science/ to follow along the material

data-analysis data-visualisation data-wrangling intro-to-programming r

Last synced: 06 Jun 2026

https://github.com/rociobenitez/happiness-index-data-processing

Repository for Big Data Processing - Contains Jupyter Notebooks and Datasets for data analysis and processing tasks related to Big Data.

big-data big-data-processing data-analysis data-processing happiness-index happiness-report jupyter-notebook matplotlib pandas seaborn

Last synced: 15 May 2026

https://github.com/deliprofesor/cinematic-data-analytics-and-recommendation-platform

This project analyzes a movie dataset using machine learning algorithms to predict success, explore revenue-popularity relationships, and develop recommendation systems. It employs techniques like K-Means, DBSCAN, GMM, decision trees, PCA, and NLP for insights and personalized suggestions.

clustering content-based-recommendation data-analysis data-visualization decision-tree gmm k-means machine-learning natural-language-processing nlp pca predictive-modeling python recommendation-system scikit-learn user-based-recommendation

Last synced: 26 Apr 2026

https://github.com/moshora99/sql-data-warehouse-project

Build modern data warehouse with mysql, Including ETL processes, data modeling and analytics

data-analysis data-engineering data-science database datawarehouse datawarehousing etl scheme sql sql-query sql-server

Last synced: 27 Apr 2026

https://github.com/pararang/nams-thesis-fuzzy

A specialized data processing tool designed to help with Fuzzy Delphi Method calculations for thesis research data analysis. Then extended with some new features for data processing with different method.

data-analysis dematel hacktoberfest hacktoberfest-accepted house-of-quality python sustainability vibecoding

Last synced: 27 Apr 2026

https://github.com/akashvarma26/data-analysis-on-imbd-using-sqlite3

Data Analysis on IMDb dataset using sqlite3 and Pandas in Jupyter notebook.

data-analysis jupyter-notebook pandas-dataframe sqlite

Last synced: 27 Apr 2026

https://github.com/arush-codes/paris-olympic-de

data engineering project on paris olympics 2024

azure data-analysis data-engineering microsoft-azure olympics2024 pipeline

Last synced: 27 Apr 2026

https://github.com/odinleepro/airbnbnewyorkcityanalysis

AirbnbNewYorkCityAnalysis is a comprehensive data analysis and visualization project exploring short-term Airbnb rental trends across New York City (2008–2022). Using open source Airbnb data, the project combines data cleaning, statistical summaries, and Tableau dashboards to uncover pricing patterns, borough level distribution, and insights.

airbnb analytics-project data-analysis data-cleaning data-science data-visualization new-york-city real-estate-analytics tableau urban-analysis

Last synced: 27 Apr 2026

https://github.com/mumtaz4118/amazon-iphone-12-data-scrapped

Beautiful Soup is a Python library for getting data out of HTML, XML, and other markup languages.

data-analysis data-extraction data-science data-scraping html mark-up python

Last synced: 27 Apr 2026

https://github.com/malexandersalazar/covid-19-peru-estimacion-oxigeno-requerido

Análisis técnico de casos confirmados por COVID-19 en Perú para la estimación de oxígeno medicinal requerido.

covid-19 data-analysis data-science peru python

Last synced: 27 Apr 2026

https://github.com/sohamb21/analysis-of-superstore-dataset

I completed the IBM SkillsBuild Data Analytics Internship Program to develop my Data Analytics skills and apply them to a real-world problem by working on this project.

data-analysis python

Last synced: 27 Apr 2026

https://github.com/busesimsek/sql-projects

A collection of my SQL projects with insights into real-world datasets.

data-analysis data-analytics mysql sql

Last synced: 07 Jun 2026

https://github.com/garcane/exodus_analysis

This project analyses cryptocurrency transaction data exported from the Exodus wallet. The goal is to explore and visualize the inflows and outflows of assets, the types of transactions, and other key metrics over time.

bitcoin btc crypto cryptocurrencies cryptocurrency data-analysis data-visualization eth ethereum pandas seaborn

Last synced: 27 Apr 2026

https://github.com/edanur-y/laptop-price-prediction-with-regression-models

Comparing the performances of multi-layer perceptron, k-nearest neighbors, random forest, gradient boosting and extreme gradient boosting regression and on laptop data to predict the price.

data-analysis data-transformation feature-importance hyperparameter-tuning python

Last synced: 27 Apr 2026

https://github.com/hfzdzakii/dicoding-airqualityanalysisdata

This repo is a master submission for my Dicoding Final Project. Air Quality Dataset is being used to fulfill the submission. Feel free to explore and I hope my work give you some insight!

data-analysis data-visualization streamlit

Last synced: 27 Apr 2026

https://github.com/caesaredia/food-app-user-behavior-analysis

Analyze user behavior and optimize app experience in a food-tech startup through funnel analysis and A/A/B testing. Includes data prep, visualization, and statistical testing in Python.

a-b-testing chi-square data-analysis data-visualization funnel-analysis python statistical-testing user-behavior

Last synced: 27 Apr 2026

https://github.com/banyc/dfplot

Summarize a data frame by plotting. `cargo install --git https://github.com/Banyc/dfplot.git`.

csv data-analysis plotly plotting statistics

Last synced: 27 Apr 2026

https://github.com/airdac/ml-palmerpenguins

Classification and analysis of the palmerpenguins dataset in Python. Team project from UPC's Master's Degree in Data Science

classification data-analysis data-science machine-learning palmer-penguin python upc

Last synced: 07 Jun 2026

https://github.com/mango606/da__

2021.9 데이터분석프로그래밍 과제

data-analysis python task

Last synced: 27 Apr 2026

https://github.com/lotfiferaga/hotel-reviews-sentiment-analysis

Efficient Python-driven sentiment analysis for hotel reviews, providing insightful evaluations.

data-analysis data-visualization nlp python

Last synced: 07 Jun 2026

https://github.com/josedanielchg/1990s-netflix-movie-insight

Small exploratory analysis of Netflix movie data from the 1990s. This project is part of the DataCamp Associate Data Scientist in Python program and focuses on filtering, visualizing, and extracting insights from a dataset using Python. Analyze trends in movie durations and count short action films to practice key data science skills!

beginner data-analysis python

Last synced: 27 Apr 2026

https://github.com/tillscode/personal-finance-ml-analysis

Machine learning analysis of personal financial data with predictive modeling and interactive dashboard

dashboard data-analysis finance machine-learning python scikit-learn

Last synced: 28 Apr 2026

https://github.com/l2nce/datamining-study

Introduction to data mining

data-analysis data-mining matplotlib numpy panda

Last synced: 28 Apr 2026

https://github.com/sweta2501/netflix_dataanalysis

With the help of Netflix Data, I have done some Data Analysis.

data-analysis data-science jupyter-notebook python

Last synced: 28 Apr 2026

https://github.com/sujata-adhikari/data-analysis

Data analysis of Market sales data using PowerBi, created dashboard to show analysis.

data-analysis excel pandas powerbi

Last synced: 12 Jun 2026

https://github.com/simranshaikh20/diwali-sales-analysis-for-business-insights

A data analyst project on diwali sales . In this state according state , gender, age we are able to know how much sale it done.

data-analysis data-visualization python

Last synced: 28 Apr 2026

https://github.com/stefagnone/movies-dataset-analysis-project

Comprehensive analysis of the Movies dataset, exploring genre trends, comparisons, and qualitative insights using Python, Pandas, and visualizations. Designed to uncover actionable findings for stakeholders.

data-analysis data-visualization exploratory-data-analysis matplotlib movies-analysis pandas python seaborn storytelling-with-data

Last synced: 28 Apr 2026

https://github.com/datalopes1/warehouse_rfv

Neste projeto será realizada uma análise do tipo RFV (Recência, Frequência e Valor) com dados que encontrei neste video no Youtube do canal Jie Jenn.

analise-rfv data-analysis data-science kmeans python rfm-analysis

Last synced: 28 Apr 2026

https://github.com/hadson0/chess-live-ratings-data

A study project focused on web scraping the live chess ratings from chess.com, with data analysis and visualization on nearly 5000 players in the classical world ranking.

beautifulsoup chess data-analysis data-visualization numpy pandas python seaborn web-scraping

Last synced: 28 Apr 2026

https://github.com/emmanuelletocs/steam-game-recommender

A powerful recommendation system for Steam games, combining Content-Based and Collaborative Filtering techniques. Built with Python, Scikit-learn, and Streamlit to deliver accurate, real-time game recommendations. Perfect for gamers and data scientists interested in building intelligent recommendation engines.

als-algorithm data-analysis gaming-industry knn machine-learning mds mysql ncf neural-network pyspark recommendation-engine recommendation-system scikit-learn spark

Last synced: 28 Apr 2026

https://github.com/rajivaleaakash/customer-churn-prediction

A machine learning project focused on predicting customer churn using various data analysis and modeling techniques. The repository includes data preprocessing, feature engineering, exploratory data analysis (EDA), model training, evaluation, and visualization to help businesses identify customers at risk of leaving.

churn-prediction classification customer-churn data-analysis data-science gridsearchcv imblearn machine-learning numpy pandas pyhton randomsearchcv scikit-learn

Last synced: 28 Apr 2026

https://github.com/elmezianech/autoinventory

This project is an end-to-end, fully automated warehouse management solution designed to tackle real-world inventory challenges in the FMCG sector. From real-time data ingestion and predictive analytics to interactive dashboards, this project combines cutting-edge technologies and an event-driven architecture to simulate a business-ready system.

automation dashboard data-analysis data-engineering-pipeline docker etl glue-job inventory-management kafka kpis lambda-functions lstm ml-pipeline mlflow power-bi pytorch redshift s3 streamlit warehouse-management

Last synced: 28 Apr 2026

https://github.com/jovicdev97/financial-loan-datascience-notebook

using numpy and pandas to analyze a synthetic loan dataset with python

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 28 Apr 2026

https://github.com/ovuiproduction/tabletalk

TableTalk - Your Data, Your Language Query tabular data using natural language—no SQL required! Upload your data, ask questions, and get instant insights. 🔹 Convert Natural Language to SQL 🔹 Handle Complex Queries & Aggregations 🔹 Upload CSVs for Easy Analysis 🔹 React + Flask + SQLite3 Backend 🔹 Powered by LLMs for Accuracy

ai data-analysis flask llm machine-learning natural-language-processing prompt-engineering react sql sqlite

Last synced: 28 Apr 2026

https://github.com/abdeldjalilchafai/us-flight-delay-eda

Structured EDA on 2015 US flight delay data. Clean, reproducible notebook using a 6-step data analysis framework for real-world datasets.

data-analysis data-cleaning eda exploratory-data-analysis flight-delays kaggle matplotlib numpy pandas python seaborn

Last synced: 28 Apr 2026

https://github.com/sufyan14/weather-data-analysis

A Streamlit dashboard that forecasts 30-day weather trends using uploaded CSV data and Facebook Prophet.

data-analysis python streamlit

Last synced: 28 Apr 2026

https://github.com/george-njuguna/spotify-etl-pipeline

This is an ETL pipeline that uses Spotify API , Docker and Airflow

apache-airflow data-analysis docker pipelines python

Last synced: 28 Apr 2026

https://github.com/shreeparab1890/indian-elections-2019-analysis-eda

This ipython notebook is the Exploratory data analysis (EDA) of the Indian Lok Sabha Elections 2019.

data data-analysis data-science data-visualization eda exploratory-data-analysis matplotlib numpy pandas plotly python python3 visualization

Last synced: 28 Apr 2026

https://github.com/dcs-training/decode-winterschool

In here you can find material on cluster analysis, data wrangling, and network analysis. Go to the readme file for more info

data-analysis data-visualisation data-wrangling gephi network-analysis python r statistics

Last synced: 28 Apr 2026

https://github.com/matheusafonseca/python-data-visualization-matplotlib-seaborn-masterclass-udemy

This repository is dedicated to storing the code developed during the "Python Data Visualization: Matplotlib & Seaborn Masterclass" course on Udemy.

charts data-analysis data-analysis-python data-science data-visualization database graphics graphics-programming jupyter-notebook matplotlib matplotlib-plots python python3 seaborn seaborn-plots

Last synced: 28 Apr 2026

https://github.com/manalisbhavsar/stock-price-prediction

Stock Price Prediction model using Machine Learning and LSTM to forecast future stock prices based on historical data. Achieved a low error rate of 3.2% by leveraging moving averages and deep learning techniques, ensuring accurate predictions.

data-analysis deep-learning lstm machine-learning matplotlib numpy pandas python

Last synced: 28 Apr 2026

https://github.com/rorrell/coviddeaths

A Jupyter Notebook where I create several visualizations based on data about COVID-19 deaths from 2020 to 2024

data-analysis data-visualization jupyter-notebook python3

Last synced: 28 Apr 2026

https://github.com/akash-srm/user-engagement-analysis

Analyzed user engagement and feedback data to derive actionable insights for an online learning platform.

analytics-projects data-analysis data-cleaning eda jupyter-notebook pandas python seaborn student-engagement

Last synced: 16 Apr 2026

https://github.com/lit26/data_jobs_analyzing

Data analysis for data jobs

data-analysis topic-modeling

Last synced: 26 Mar 2025

https://github.com/sgb31/covid-19-data-analysis

"In this project, I analyzed COVID-19 data to explore trends, case growth, and key patterns. I worked on cleaning the data, performing exploratory analysis, and visualizing infection rates, recoveries, and fatalities. The goal was to gain insights into how the pandemic evolved and its overall impact.

data-analysis data-visualization matplotlib pandas python seaborn

Last synced: 13 May 2026

https://github.com/who-else-but-arjun/isro_xrf_sr

Source Codes for super resolution of the lunar elemental abundance map using a semi-supervised deep spatial interpolation model. This hybrid approach combined ResNet50 for spatial feature extraction with Graph Neural Network (GATv2Conv) layers and Convolutional Neural Networks (CNNs), followed by fusion layers.

cnn data-analysis graph-neural-networks pytorch semi-supervised-learning spatial-interpolation super-resolution

Last synced: 30 Apr 2026

https://github.com/kevingastelum/mydataanalysis

My DataAnalyst Projects | Python, SQL, Excel, PowerBI & Tableau

data-analysis python sql visualization

Last synced: 20 May 2026

https://github.com/jofaval/iris-flowers

Multilabel Classification of the famous Iris Flowers Dataset from Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/cyberoctane29/noaa-lightning-analysis

This project explores lightning strike data from the National Oceanic and Atmospheric Administration (NOAA) to identify seasonal trends and analyze strike frequency across months. It demonstrates data manipulation, aggregation, and visualization using Python, providing insights into lightning activity patterns.

data-analysis data-science data-visualization eda python

Last synced: 20 Apr 2026

https://github.com/zen204/accenture-tech-news-summarization-engine

A tool developed to analyze knowledge graphs from technology news articles, uncovering insights and trends about technology products, platforms, services, and their industry impact. Built during an internship at Accenture to inform decision-making in the tech landscape.

data-analysis decision-making graph-visualization industry-insights jupyter-notebook knowledge-graph machine-learning python tech-news tech-trends

Last synced: 29 Apr 2026

https://github.com/clchinkc/zombie

Personal project, Python, NumPy, Matplotlib, Pygame, Scikit-learn, TensorFlow, Docker

algorithms data-analysis docker machine-learning matplotlib numpy pygame python sklearn tensorflow zombie-simulation

Last synced: 05 Apr 2026

https://github.com/lulloooo/bizdata-nexus

Collection of my Business & Data Analysis projects, from professional/academic endeavors to passion-driven explorations 📊

business-analysis data-analysis economics etl excel finance mysql python r risk-analysis

Last synced: 05 Apr 2026

https://github.com/darkdk123/house-valuation-model

A Challenge Project in a Boot-Camp to create a ML Model to predict the prices of houses in Boston Massachusetts from multiple parameters Using Multivariable Regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 07 Jul 2025

https://github.com/jovicdev97/Financial-Loan-DataScience-Notebook

using numpy and pandas to analyze a synthetic loan dataset with python

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 12 Mar 2025