An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/fbarffmann/mycitibike

Built an interactive Leaflet.js map visualizing over 750 Citi Bike station locations in NYC. Analyzed usage patterns, station density, and user navigation across the network.

citibike data-analysis data-visualization geojson geospatial interactive-map javascript leaflet nyc web-mapping

Last synced: 07 Jul 2025

https://github.com/mainak-97/weather-data-analysis-using-python

A comprehensive analysis of time-series weather data using Python and Pandas, focusing on data exploration, cleaning, and uncovering insights.

data-analysis jupyter-notebook pandas pandas-dataframe python python3 time-series-analysis

Last synced: 08 May 2026

https://github.com/fbarffmann/python-challenge

Automated financial and election data analysis using Python. Cleaned and transformed large CSV datasets, calculated key business metrics, and generated automated reports for stakeholders.

automation csv data-analysis data-cleaning election-analysis financial-analysis python reporting

Last synced: 24 Apr 2025

https://github.com/anudeepkaddala/bankds

This repository contains a Python-based solution for cleaning, matching, and formatting bank data. The primary goal is to match banks from two datasets based on their names and associate each bank with its respective asset size. The final output is a cleaned dataset with asset sizes in Indian-style currency format.

data-analysis data-science fuzzy-matching pandas python

Last synced: 12 Apr 2026

https://github.com/subratamondal1/heart-attack-prediction

Heart Attack Prediction of patients based on the required data. Data Ingestion - Data Preparation - Exploratory Data Analysis (EDA) - Modelling - Evaluation.

data-analysis data-science data-visualization kaggle-dataset machine-learning matplotlib-pyplot numpy pandas python3 scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/shrinidhi857/simpledataanalysisonstartups

The Indian startup ecosystem has experienced remarkable growth over the past decade, becoming a hotbed of innovation and entrepreneurship. In this data analysis we are segregating fields ,finding new insights.

data-analysis data-science data-visualization indian-startups

Last synced: 17 Sep 2025

https://github.com/sarthakagg29/sql-share-trading-analysis

Analysis of share trading transactions using SQL. Includes table setup, sample data, and a variety of queries to answer typical business questions about stocks and trading.

data-analysis dbeaver portfolio postgresql share-market sql

Last synced: 04 Jul 2025

https://github.com/angchekar28/valorant-gameplay-analysis

This project analyzes Valorant gameplay data to understand key factors affecting match outcomes. It compares various machine learning models to predict player performance, rank classification, and match success.

data-analysis data-science data-visualization exploratory-data-analysis jupyter-notebook machine-learning python

Last synced: 12 Apr 2026

https://github.com/deliprofesor/behavioral-insights-and-data-exploration

This project analyzes Spanish speech data, focusing on acoustic features and demographics. It includes data cleaning, outlier detection, clustering, and time series modeling (ARIMA, Holt-Winters) to uncover patterns in speech duration and word frequency.

acoustic-features arima clustering data-analysis holt-winters k-means machine-learning speech-analysis time-series-analysis

Last synced: 10 Apr 2025

https://github.com/deliprofesor/k-means-clustering-for-retail-data-analysis

This project uses K-Means clustering to segment wholesale customers based on their spending habits. The data is preprocessed, scaled, and clustered into four groups. The Elbow and Silhouette methods determine the optimal number of clusters, and results are visualized using boxplots and scatter plots to uncover spending patterns.

clustering-visualisation data-analysis elbow-method k-means k-means-clustering r silhouette-score

Last synced: 10 Apr 2025

https://github.com/sabelomkhwanzi/data-alchemist-boot-camp

Built on Covalent's Unified API, Increment has the full historical data set for 40+ chains including every smart contract, event, transaction, address, etc. With access to all this data you can find:

covalent data-analysis increment

Last synced: 11 Mar 2026

https://github.com/jacktheprogrammer/time-series-forecasting-and-analysis

My personal project consisting of my personally created notebooks to work with time series forecasting and analysis. In these projects, I've used deep learning using tensorflow, xgboost, statsmodels and scipy libraries of python. The series were of weather, energy consumption and that of stocks.

data-analysis data-science deep-neural-networks energy-consumption machine-learning portfolio prophet-facebook prophet-model python python3 scipy statsmodels stocks tensorflow time-series time-series-analysis timeseries-forecasting weather xgboost

Last synced: 05 May 2026

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 05 May 2026

https://github.com/noeldevelops/stem-degrees-analysis-cpp

C++ Data Analysis, I/O - takes an external data file for processing, performs some statistical analysis, and displays the results in the console

cpp data-analysis

Last synced: 29 May 2026

https://github.com/shridhar1504/loan-classification-datascience-project

This project uses machine learning algorithms to predict the classification of loan status. The dataset is loaded and some transformation is done using SQL for getting a proper dataset with some valid informations.

classification data-analysis data-cleaning data-science data-visualization eda loan-prediction loan-status machine-learning predictive-modeling sql supervised-learning

Last synced: 09 Apr 2025

https://github.com/prajjwol09/data-cleaning-project

This project is dedicated to cleaning, standardizing a dataset, dealing with null values from a CSV file named "layoffs" using MySQL, with MySQL Workbench as the workspace environment. The goal is to prepare the data for analysis.

cleaning-data columns data-analysis database duplicates mysql rows standard

Last synced: 20 Apr 2026

https://github.com/prakshal0809/power-bi-analytics-dashboard

I have developed a dashboard in Power BI utilizing data from an Excel file. The dashboard effectively visualizes and analyzes the given data.

data-analysis powerbi

Last synced: 22 Feb 2026

https://github.com/nero103/airbnb-destination

This is and end-to-end project to uncover the ideal destination based on listings and hosts. Strategy included: Data workflow-SQL analysis-Data modeling-Data Visualization-Findings

data-analysis data-modeling data-visualization etl etl-pipeline excel microsoft-sql-server powerpoint sql tableau

Last synced: 27 Mar 2026

https://github.com/codesaadumair/pandas_exercises_personal

Personalized enhancements to pandas exercises with comprehensive solutions and practical insights for mastering data analysis in Python.

data-analysis data-science pandas python

Last synced: 09 May 2026

https://github.com/seblehner/feldprakt

Collection of plotting routines for a field exercise work using different measurement tools and Hobo weather stations.

data-analysis data-visualization jupyter-notebook python

Last synced: 05 Oct 2025

https://github.com/paphada1103/data-analysis-with-python

📊 Analyze data efficiently using Python’s top libraries. Learn to explore, clean, and visualize data for meaningful insights in your projects.

carpentries data-analysis data-carpentry data-visualisation dataframe-api dataset english hacktoberfest ibm jovian lsl machine-learning matplotlib programming python realtime social-sciences spark

Last synced: 09 May 2026

https://github.com/mahmoud2abdallah/improvado-marketing-homework

This Looker Studio dashboard provides a comprehensive analysis of marketing performance for August 2024, transforming raw data into actionable insights for data-driven decision making.

bigquery business-intelligence data-analysis looker-studio marketing

Last synced: 05 Oct 2025

https://github.com/faysalalmahmud/bd-med-professional-analysis

Analysis of healthcare professionals in Bangladesh through web scraping, data processing, and interactive visualization.

data-analysis data-visualization jupyter-notebook python scraper selenium selenium-webdriver tableau

Last synced: 04 Sep 2025

https://github.com/vishal786-commits/target-businesscasestudy-sql

This project analyzes Target’s e-commerce transactions in Brazil between 2016 and 2018 using SQL. The goal was to explore customer behavior, order patterns, payments, delivery times, and freight costs to generate actionable business insights.

bigquery data-analysis sql

Last synced: 05 Oct 2025

https://github.com/codewithmayank-py/box-office-analysis-with-seaborn-and-python

This repository contains Python code and datasets for analyzing box office data. Explore trends, patterns, and factors influencing movie performance.

analysis box-office-data-analysis data-analysis data-visualization dataset jupyter-notebook matplotlib pandas python3 seaborn

Last synced: 05 May 2026

https://github.com/affan005-ai/tesla-stock-prediction

This project analyzes Tesla stock data and builds machine learning models to predict and classify stock movements. The analysis includes EDA, feature correlation, moving averages, and two models

data data-analysis data-science data-visualization-project eda machine-learning matplotlib pandas predictive-analytics predictive-modeling python scikit-learn

Last synced: 05 Oct 2025

https://github.com/trismald/eurosoccer1023

Data Analyst - European Soccer 2010 2023

data-analysis data-visualization jupyter-notebook pandas powerbi python

Last synced: 06 May 2026

https://github.com/pcanadas/weather_scraper

Este proyecto automatiza la recopilación y el procesamiento de datos meteorológicos históricos y previsionales. Utiliza Selenium para extraer información de sitios web de clima, procesa los datos con Pandas y los almacena en archivos CSV limpios. Es ideal para análisis climáticos, visualización de datos o integración en otros sistemas.

beautifulsoup data-analysis pandas python selenium

Last synced: 05 May 2026

https://github.com/abhash-rai/analyzing-credit-card-eligibility

This work was performed as part of BCU undergraduate course.

data-analysis data-visualization ggplot ggplot2 latex r

Last synced: 20 Jan 2026

https://github.com/fbarffmann/citibike-covid-analysis

Analyzed NYC CitiBike usage during March 2020 to assess the impact of COVID-19 using Python and Tableau. Includes ridership breakdowns, user type trends, and interactive dashboard.

citibike covid19 data-analysis data-visualization exploratory-data-analysis pandas python tableau transportation

Last synced: 12 Apr 2026

https://github.com/victorlcastro-dsa/pbl-datacamp

This repository features projects from DataCamp's Project-Based Learning (PBL) courses, showcasing practical applications of data analysis, machine learning, and visualization. Explore real-world datasets and interactive results that highlight the skills gained through hands-on learning.

data-analysis data-science data-visualization datacamp-projects hypothesis-testing machine-learning project-based-learning

Last synced: 30 Jun 2026

https://github.com/mohitsai/boston-housing-data-analysis

Data Analysis Project for the City of Boston Government for insights into effect of property rennovations and remodelling on housing availability in the city

data-analysis data-science matplotlib numpy pandas python

Last synced: 05 May 2026

https://github.com/marielachirinosr/analysis-urgencias-hospital-pitalito

This project involves analyzing emergency room admission data from the E.S.E Hospital Departamental de Pitalito using a star schema model.

bigquery data data-analysis etl-pipeline tableau

Last synced: 21 Jan 2026

https://github.com/davifeliciano/modern_physics_experiments

Collection of data analysis and visualization scripts developed in Python around some modern physics experiments

data-analysis data-visualization modern-physics physics physics-experiments

Last synced: 18 Jan 2026

https://github.com/saroshfarhan/dublin_pedestrian_data_analysis

Pedestrian's footfall data analysis for the city of Dublin

data-analysis data-visualization r-programming

Last synced: 07 Jan 2026

https://github.com/bala-1409/power-bi-visualization-project

This repository contains Visualization Projects which is visualized through Power BI Software, by using the visualization we can gain multiple insights and strategies which helps to develop the business for gaining high profit margins and by the insights we can reduce the damages by accidents & calamities.

dashboard data-analysis data-science data-visualization exploratory-data-analysis microsoft-excel microsoft-power-bi microsoft-powerpoint power-bi powerbi powerbi-reports powerbi-visuals visualization

Last synced: 04 Jan 2026

https://github.com/ved-coder-king/wheat_ai_project

This project, Smart Wheat Farming AI System, was developed as part of the coursework for the Artificial Intelligence program at Esprit School of Engineering.

agriculture data-analysis data-visualization deep-learning image-classification machine-learning object-detection python wheat

Last synced: 15 Apr 2025

https://github.com/marielachirinosr/nyc-taxi-trip-exploration-2019-2020

Explores passenger behavior & impact of COVID-19 on NYC taxi industry (Q1 2019-2020).

bigquery data data-analysis data-visualization python sql tableau

Last synced: 15 Jun 2026

https://github.com/dacq-trap/dacq

DacQ Score Server

data-analysis go nextjs

Last synced: 14 Jan 2026

https://github.com/wisdom-osborn/data-analytics-course-online-

🔍 Data Analytics with Python — Hands-on Course Materials Jupyter notebooks, projects, and datasets based on the freeCodeCamp Data Analysis with Python certification. Learn NumPy, Pandas, data cleaning, and visualization through real-world examples

data data-analysis data-science data-visualization freecodecamp numpy pandas pandas-dataframe project python

Last synced: 19 Apr 2026

https://github.com/nuriadevs/informes-powerbi

Este repositorio contiene informes elaborados con Power BI.

data-analysis powerbi

Last synced: 18 Feb 2026

https://github.com/noorulhudaajmal/customer-segmentation-analysis

Customer segmentation and analysis of purchasing behaviour

cluster-analysis customer-segmentation data-analysis

Last synced: 07 Oct 2025

https://github.com/nilayhangarge/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

data-acquisition data-analysis data-analytics data-binning data-cleaning data-engineering data-fundamentals data-insights data-integration data-preprocessing data-science data-wrangling numpy pandas python

Last synced: 12 Apr 2026

https://github.com/prajjwol09/sql_retail_analysis_project

This project demonstrates SQL-based data cleaning, exploration, and business analysis on a retail sales dataset. It involves setting up a database, removing null values, performing EDA, and using SQL queries to extract key insights such as top customers, best-selling categories, and monthly sales trends.

data data-analysis datacleaning dataexploration pgadmin4 sql

Last synced: 15 Feb 2026

https://github.com/jnyambok/epl_dashboard

English Premier League Dashboard summarizing match data from 2009-2024

data-analysis data-science gcp powerbi

Last synced: 04 Sep 2025

https://github.com/gabboraron/biostatisztika_es_alkalmazasai

"A statisztika a matematika azon ága, melynek feladata, hogy eszközt adjon a politikusok kezébe, mellyel tetszőleges állítás és annak ellentéte is tudományos alapon igazolható"

biostatistics data-analysis data-visualization r statistics statistics-course

Last synced: 24 Oct 2025

https://github.com/banyc/csv_logger

Long-term logger for data analysis

csv data-analysis logging

Last synced: 07 Oct 2025

https://github.com/shourya1997/boston_housing

In this project, you will apply basic machine learning concepts on data collected for housing prices in the Boston, Massachusetts area to predict the selling price of a new home.

boston-housing-dataset data-analysis jupyter-notebook machine-learning python unsupervised-machine-learning

Last synced: 18 May 2026

https://github.com/omarsolieman/socialgiveawaydataanalysis

This project involved cleaning, analyzing, and processing data from an Instagram giveaway to ensure a fair and data-driven winner selection process. The primary goal was to automate the process of identifying valid entries, weighting them based on engagement (likes and multiple entries), and performing a post-giveaway analysis

data-analysis data-science data-visualization instagram scraping threejs

Last synced: 14 May 2026

https://github.com/aymanmomin/excel-coffee-data-analytics-exploring-coffee-orders-dataset

This project utilizes a coffee orders dataset to perform comprehensive data analytics and gain insights into customer preferences, popular items, and sales trends. The analysis aims to provide valuable information for coffee shop owners and enthusiasts, facilitating data-driven decision-making and improved customer satisfaction.

data-analysis data-visualization excel project

Last synced: 18 Jan 2026

https://github.com/jasontan22/aefes-time-series-forecasting

Bu proje, Anadolu Efes Biracılık ve Malt Sanayii A.Ş. (AEFES) piyasa verilerini kullanarak kapanış fiyatlarının gelecekteki değerlerini tahmin etmek amacıyla derin öğrenme yöntemleri (LSTM, BiLSTM, CNN+LSTM) kullanmaktadır. Projede, veri ön işleme, model eğitimi ve değerlendirme adımları detaylandırılmıştır.

bilstm cnn-lstm data-analysis deep-learning financial-forecasting lstm machine-learning python stock-price-prediction tensorflow

Last synced: 09 Aug 2025

https://github.com/edprice25/us-states-analysis

Presents a series of visualizations for folks looking to relocate to more affordable areas in the US. Click on my link below to see a full analysis.

data-analysis jupyter-notebook matplotlib pandas python us-states

Last synced: 04 Jul 2025

https://github.com/anushkundu/crime-pattern-analysis

Analyzing Crime Patterns in Montgomery County, USA: An Inclusive Study Based on NIBRS Data (2016-2022)

data-analysis data-visualization descriptive-statistics matplotlib numpy pandas python seaborn

Last synced: 05 May 2026

https://github.com/pranavsp108/market_basket_analysis-instacart

Customer segmentation and market basket analysis using the Instacart dataset with Python, Pandas, and K-Means clustering.

customer-segmentation-and-buying-behavior data-analysis data-visualization instacart jupyter-notebook kmeans-clustering market-basket-analysis pandas python scikit-learn

Last synced: 05 May 2026

https://github.com/ndomah1/learning-probability-and-statistics

This repo is a comprehensive learning resource that covers fundamental to advanced topics in probability and statistics, including probability theory, descriptive and inferential statistics, probability distributions, regression analysis, and data exploration techniques.

correlation-analysis data-analysis descriptive-statistics exploratory-data-analysis hypothesis-testing inferential-statistics probability regression statistics

Last synced: 18 Jan 2026

https://github.com/kmranrg/bikeshare

a project based on Data Analysis

data-analysis python

Last synced: 08 Oct 2025

https://github.com/kernix13/github-readme-seo-analysis

A Jupyter Notebook GitHub README and Repo SEO Analysis to determine what makes a repo rank in the SERPS

accessibility data-analysis readme seo seo-analysis

Last synced: 29 May 2026

https://github.com/alexquilis1/spanish-fuel-stations-analysis

Real-time analysis of Spanish fuel prices using government API data with interactive maps and regional comparisons

data-analysis data-visualization fuel-prices geospatial-analysis ggplot2 government-data leaflet open-data r shiny spain tidyverse

Last synced: 08 Oct 2025

https://github.com/sorebit/pdrpy-pd-2

Data analysis of various stackechange.com archives.

data-analysis stackexchange time-travel university-project

Last synced: 08 Oct 2025

https://github.com/vasulab/knightshock

Shock tube experiment planning and data analysis package.

cantera data-analysis matplotlib numpy shock-tube

Last synced: 18 Jul 2025

https://github.com/tyriek-cloud/power-bi-nyc-housing-financial-report

This report was conducted to provide a comprehensive analysis of various NYC housing and financial data.

dashboard data-analysis data-visualization financial-analysis powerbi statistics

Last synced: 21 Jan 2026

https://github.com/debjyotisaha/power-bi-projects-phase-2

Created interactive dashboards and reports using Power BI to visualize complex datasets. Demonstrated proficiency in data modelling, DAX calculations, and storytelling through data to provide actionable insights.

dashboards data-analysis data-modeling data-visualisation power-query powerbi

Last synced: 18 Jan 2026

https://github.com/aryar-06/linear-regression

A Python project demonstrating basic linear regression with gradient descent and matrix operations, alongside scikit-learn comparison.

data-analysis data-preprocessing educational-project gradient-descent linear-regression machine-learning python regression-algorithms scikit-learn

Last synced: 05 May 2026

https://github.com/faisal-khann/ipl-analysis

The IPL Analysis project is a comprehensive data-driven exploration of the Indian Premier League (IPL), analyzing historical match data to uncover patterns in team performance, player statistics, and match outcomes.

data-analysis exploratory-data-analysis jupyter-notebook matplotlib numpy pandas seaborn

Last synced: 08 May 2026

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 03 Apr 2025

https://github.com/l1ght14/tradersentiment_primetrade

Analyzes Bitcoin market sentiment's impact on Hyperliquid trader PnL & behavior. Uncovers patterns using Python (Pandas, Seaborn) to derive actionable trading insights. Junior Data Scientist assignment for PrimeTrade

bitcoin crypto-trading cryptocurrency data-analysis financial-data-analysis jupyter-notebook market-sentiment pandas python trader-behavior web3

Last synced: 20 Oct 2025

https://github.com/izzyl3333/mosquito_analysis

An exercise using Python and statistical analysis in mosquito data to understand the relationship between the different variables and the mosquito number.

chicago data-analysis data-science exploratory-data-analysis mosquitoes python statistical-analysis west-nile-virus

Last synced: 19 Jan 2026

https://github.com/takshshah-16/pizza_sales_sql

SQL-powered pizza sales analytics project using MySQL Workbench to derive business insights through data exploration and queries.

business-intelligence data-analysis database-management mysql sql

Last synced: 09 Oct 2025

https://github.com/borjamome/soho_cholera

Cholera deaths in the Soho District (London)

data-analysis data-visualization london r

Last synced: 04 Sep 2025

https://github.com/nkamilla/titanic-eda

Exploratory Data Analysis of the Titanic dataset using Python (Pandas, NumPy, Matplotlib). Includes data cleaning, visualizations, correlations, and key business insights.

data-analysis eda jupyter-notebook matplotlib numpy pandas python titanic-dataset

Last synced: 05 May 2026

https://github.com/debjyotisaha/sql-projects

Designed and implemented SQL-based projects to analyse and manage datasets efficiently. Demonstrated expertise in writing complex queries, optimizing database performance, and performing data extraction, transformation, and loading (ETL) processes.

data-analysis database sql

Last synced: 09 Oct 2025

https://github.com/nimbostratos/titanic-survival-prediction

Machine learning project predicting Titanic survival using AdaBoost with feature engineering and hyperparameter optimization

data-analysis data-science data-science-projects kaggle machine-learning machine-learning-models python scikit-learn

Last synced: 05 May 2026

https://github.com/mirwais-farahi/data-visualization-with-tableau-specialization

The Specialization provides Tableau for data visualization and business intelligence. The series covers skills like assessing data quality, designing visualizations and dashboards, and combining data sources to create compelling, data-driven stories.

dashboard data-analysis geospatial map tableau visualization

Last synced: 16 Feb 2026

https://github.com/sillyash/untappd-viz

A data visualisation page using public datasets and HTML/CSS/JS with D3.js.

beer beer-statistics data data-analysis data-visualization kaggle kaggle-dataset public-dataset school-project

Last synced: 18 May 2026

https://github.com/sarveshdhond/top_25_cad_stocks

In this project I have used Python Jupyter lab and Pandas to import data set from Yahoo stocks website. I have imported the top 25 most active Canadian stocks on 12th July 2024. This project shows skills such as Python, Web Scrapping and Pandas.

data-analysis pandas-dataframe python webscraping

Last synced: 01 Apr 2025

https://github.com/ninadpatil09/hospital_emergency_room_analysis

This comprehensive analysis delves into the performance and characteristics of the hospital's emergency room over the past year. By scrutinizing key metrics and patient demographics, this study aims to provide valuable insights for optimizing patient care, resource allocation, and overall operational efficiency.

data-analysis tableau-public visualization

Last synced: 15 Feb 2026

https://github.com/sabdikay/analysis-of-biodiversity

This project analyzes biodiversity data from the National Parks Service, focusing on species in various park locations. Conducted in Jupyter Notebook, it uses pandas, matplotlib, NumPy, seaborn, and chi2_contingency for analysis and visualization.

data-analysis data-analysis-python data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 14 Apr 2026

https://github.com/caesaredia/ymusic-project

Exploratory data analysis (EDA) of music streaming behavior in two fictional cities using Python, Pandas, and Jupyter Notebook. It explores user behavior, genre preferences, and listening patterns throughout the week.

data-analysis eda pandas python

Last synced: 05 May 2026