An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/incubrain/awesome-maharashtra-data

A collection of datasets specific to Maharashtra, India. WIP

ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi

Last synced: 23 May 2026

https://github.com/birkkarlsen/beam_dynamics_tools

Repository filled with functions related to the analysis of longitudinal beam dynamics measurements and simulations

accelerator-physics beam-dynamics data-analysis

Last synced: 19 Sep 2025

https://github.com/tbep-tech/red-tide-twitter

Supplementary materials to accompany Skripnikov et al. red tide Twitter analysis

ccmp-li1 ccmp-wq3 data-analysis open-science tampa-bay tberf water-quality

Last synced: 19 Feb 2026

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 21 Sep 2025

https://github.com/gappeah/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 31 Jul 2025

https://github.com/dannyben/datamix

DSL for manipulating tabular data

csv data data-analysis data-engineering gem ruby tabular-data

Last synced: 31 Jul 2025

https://github.com/banyc/dfsql

SQL REPL/lib for Data Frames

cli csv data-analysis jsonl ndjson repl sql

Last synced: 31 Jul 2025

https://github.com/shubhamgoyal575/diwali-sankranti-promotion-sales

This Power BI dashboard analyzes sales performance during Diwali and Sankranti festivals. It provides insights into revenue trends, top-selling products, regional sales distribution, and customer purchasing behavior to help optimize festive season sales strategies. 🚀

buisness-intelligence dashboard data-analysis data-visualization diwali-sankranti-sales-analysis excel fast-moving-consumers-goods fmcg microsoft-power-bi mysql power-query powerbi revenue-insights sales-dashboard sales-insights sql

Last synced: 02 Mar 2026

https://github.com/sarathchandranpm/vehicle_theft_analysis

This project is a comprehensive data analysis of vehicle theft patterns, utilizing advanced SQL techniques to explore when, which, and where vehicles are most likely to be stolen. The analysis provides deep insights into vehicle theft characteristics through systematic, multi-dimensional exploration.

data-analysis mysql sql

Last synced: 02 Aug 2025

https://github.com/juliusmarkwei/iris-dataset-analysis

Data analysis, data visualization and model training using the popular Iris Dataset

data-analysis data-visualisation linear-regression machine-learning

Last synced: 03 Aug 2025

https://github.com/ganesh2409/cricket-player-performance

This repository contains a comprehensive project focused on analyzing cricket player performance using various datasets, including batting, bowling, and match results. The project involves data preprocessing, feature engineering, and model training to predict and evaluate player performance scores. It includes detailed scripts for data analysis

cricket-performance-analysis data-analysis machine-learning sports-analytics

Last synced: 05 Aug 2025

https://github.com/filiplangiewicz/businessintelligence

🏭 Data warehouses and business intelligence project

airbnb business-intelligence data-analysis data-warehouse

Last synced: 09 Mar 2026

https://github.com/virajbhutada/walmart-retail-analyzer

Gain valuable insights into retail sales with the "Walmart Retail Performance Dashboard" in MS Excel. This user-friendly tool facilitates an in-depth analysis of key sales metrics, providing a comprehensive view of Walmart's performance. Make data-driven decisions for informed and strategic business outcomes.

analytics data-analysis data-science data-visualization excel insights interactive-visualizations performance-analysis retail-sales walmart

Last synced: 04 Mar 2026

https://github.com/Narius2030/Hive-DataWarehouse-Analysis

Implements a Hive data warehouse to store meaningful data and applies machine learning techniques such as clustering or regression to business problems

apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics

Last synced: 12 Aug 2025

https://github.com/floressek/data_analysis_and_visualization

This repository contains a collection of statistical data analysis laboratories using R. Each lab focuses on different aspects of data exploration, visualization, and analysis techniques.

data-analysis data-visualization

Last synced: 05 Oct 2025

https://github.com/ilchen/eu_economic_data_analysis

Jupyter notebooks for analysis of Eurozone GDP, yields on government bonds, inflation expectations, unemployment and participation rates, money supply, personal consumption and savings, stock market. Using APIs from Eurostat, ECB, OECD and Yahoo-Finance.

data-analysis disposable-income finance gdp hicp inflation interest-rates jupyter-notebook money-supply participation-rate risk-free-interest-rate savings stock-market unemployment-rate

Last synced: 10 Oct 2025

https://github.com/airscholar/data_analysis_with_ai

A repository showing how to use AI and ChatGPT for Data Analysis with Pandas and Python

chatgpt data-analysis gpt4 openai pandas pandasai python

Last synced: 10 Apr 2026

https://github.com/viztruth/google-play-store-data-analysis

This repository contains all the materials of my final project 'Google Play store Data Analysis' for the 'Telling Stories with Data' course at PES University.

data-analysis data-visualization

Last synced: 21 Aug 2025

https://github.com/mohamed3nan/udacity

Udacity Data Analysis Nanodegree Program

data-analysis data-visualization numpy pandas python

Last synced: 10 Apr 2026

https://github.com/realorangeone/docker-cyberchef

A containerized deployment of CyberChef, with additional protections

cyberchef data-analysis data-manipulation docker encoding

Last synced: 24 Aug 2025

https://github.com/nafisalawalidris/sales-performance-dashboard

Sales Performance Dashboard: Analyze and visualize sales data using Power BI. Gain insights into trends, customer segments, product performance, and geographic distribution. Make data-driven decisions to optimize sales strategies and maximize revenue.

analytics-revenue dashboard-power-bi data data-analysis intelligence-sales optimization performance sales visualization-business

Last synced: 03 Feb 2026

https://github.com/prajakta1321/kaggle-ai-report-2023

A report describing trends in the emergence of AI over the years

data-analysis data-visualization python3

Last synced: 28 Jun 2025

https://github.com/discdiver/new-belgium-ratings

Find the most popular New Belgium beers of all time!

beautifulsoup data-analysis pandas python seaborn webscraping

Last synced: 10 Apr 2026

https://github.com/prernarohra/heart-disease-prediction

This project develops a machine learning model to predict heart disease risk based on symptoms and medical history. The model achieved the best accuracy with Logistic Regression, as it works well for binary classification problems.

artificial-intelligence data-analysis data-science dataset heartdisease-prediction machine-learning models

Last synced: 06 Nov 2025

https://github.com/ibnaleem/cyberchef-discord

A versatile Discord bot that implements CyberChef's features for encoding, decoding, encrypting, compressing, analysing data directly and more in your Discord server

compression cti cyberchef cybersecurity data-analysis data-manipulation discord-bot discord-js encoding encryption hashing infosec parsing redteam

Last synced: 28 Jan 2026

https://github.com/hifza-khalid/book-management-system-sql

A Book Management System SQL project 📚 featuring tables for Authors ✍️, Books 📖, Customers 👤, and Orders 🛒. Includes sample queries for tracking book sales 💰, pricing by genre 🎭, and customer order history 📅.

book-management data-analysis database-management sql sql-queries

Last synced: 03 Feb 2026

https://github.com/satvikvirmani/engineering-graduate-salary-analysis

A Data Science/Machine Learning project to analyse and study the salary patterns of engineering graduates in India.

data-analysis data-science jupyter-notebook machine-learning prediction python regression

Last synced: 19 Jun 2026

https://github.com/paezha/isdas

Companion package for An Introduction to Spatial Data Analysis and Statistics with R

data-analysis gis rstats spatial-analysis spatial-statistics

Last synced: 04 Jan 2026

https://github.com/5ekastanx/data-analysis

Extracting data by parsing (for example, in the style of hacking) with Python, using a variety of functions and methods

data-analysis html python

Last synced: 14 Mar 2025

https://github.com/kishlayjeet/zomato-data-exploration

In this project, we will be exploring a dataset containing information on various restaurants and their ratings, location, and other attributes.

data-analysis eda matplotlib numpy pandas zomato-data-exploration

Last synced: 10 Apr 2026

https://github.com/walkerdustin/vergleich-von-messmethoden-fuer-punktwolken

Surveying a physical space produces a point cloud. This point cloud describes selected points in the space, for example on the walls and the ceiling. When these points are captured in two separate measurements, perhaps even by different devices, the goal afterwards is to determine how closely the point clouds agree. There are two fundamentally different methods for doing this, and they are compared here.

3d-models accuracy-metrics data-analysis data-visualization kaggle measure-distance numpy point-cloud pointcloudprocessing punkte python science-research simulation statistics

Last synced: 11 Apr 2026

https://github.com/ivanildobarauna-dev/api-to-dataframe

Python library that simplifies obtaining data from API endpoints by converting them directly into Pandas DataFrames. This library offers robust features, including retry strategies for failed requests.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 06 Mar 2025

https://github.com/prime-infinity/type-one

Software to visualize and analyze GitHub repos based on certain statistics such as stars, forks and issues

data-analysis data-visualization

Last synced: 03 Feb 2026

https://github.com/happybono/sonatasmooth

Provides three different noise reduction algorithms for smoothing out data: Rectangular Averaging, Binomial Median Filtering, and Binomial Averaging. It processes data from a list and displays the results in another list.

algorithms average binomial binomial-coefficient binomial-theorem calibration csharp data-analysis data-calibration dynamic-noise-reduction median noise-algorithms noise-reduction noise-reduction-kernel outliers rectangular-averaging windows-desktop windows-desktop-application windows-forms winforms

Last synced: 30 Oct 2025

https://github.com/ryanfranklin237/data-visualization-python

A tool that allows you to visualize data from a CSV or Excel file as a graph or chart

data-analysis data-science data-visualization matplotlib pandas-dataframe python

Last synced: 11 Jun 2026

https://github.com/luochang212/weibo-analysis

Data analysis based on Sina Weibo.

data-analysis weibo

Last synced: 03 Apr 2026

https://github.com/ryannapp12/quant_trading_engine

A modular and scalable quantitative trading engine built in Python. This project demonstrates efficient data caching with SQLite, concurrent backtesting, and advanced risk analytics, showcasing best practices in clean code architecture and performance optimization.

algorithmic-trading backtesting dash data-analysis data-visualization fintech lstm machine-learning numpy pandas plotly python quantitative-finance real-time risk-management sqlite technical-analysis tensorflow time-series-analysis trading-strategies

Last synced: 11 Apr 2026

https://github.com/quantitext/quantitext

Official repository for QuantiText applications in the .NET ecosystem.

api aspnet-core csharp data-analysis dotnet-core mvc-architecture

Last synced: 30 Mar 2025

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, asthma, etc.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 04 Mar 2025

https://github.com/faisal-khann/diwali-sales-analysis

The "Diwali Sales Analysis" project aims to analyze the sales data during the Diwali festival period to uncover insights and trends that can help improve marketing strategies and sales performance in the future

csv data-analysis eda jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 11 Apr 2026

https://github.com/tqhungdev0605/crawl_200_jd_dataanalyst

Automate job data scraping for 200 Data Analyst postings on https://vn.indeed.com using Python

data-analysis jupyter-notebook python3 scraping selenium

Last synced: 11 Apr 2026

https://github.com/shadan100/stroke-prediction-analysis

A web-based application to predict whether a patient is likely to have a stroke based on input parameters such as gender, age, various diseases, and smoking status. Each row in the data provides relevant information about the patient.

artificial-intelligence data-analysis data-science django django-framework jupyter-notebook machine-learning matplotlib pandas predictive-modeling python stroke-prediction web-application

Last synced: 08 Mar 2026

https://github.com/equicirco/cirquant

Code and data delivery for quantifying circularity through open data and digital innovation.

circular-economy data-analysis database julialang official-statistics

Last synced: 13 Jan 2026

https://github.com/rcv911/lyapunov-indicators

Calculating Lyapunov indicators with multiprocessing in Python

data-analysis lyapunov lyapunov-indicators multiprocessing

Last synced: 18 Jan 2026

https://github.com/dcs-training/good-data-visualisation-with-r

Our guide on how we create data visualisations with R. See the README file.

data-analysis data-visualisation r rmarkdown

Last synced: 16 Jun 2026

https://github.com/carterlasalle/sportsarbfinder

Sports Betting Arbitrage Finder: Python tool for identifying profitable arbitrage opportunities across bookmakers. Features multi-region support, customizable profit margins, interactive calculator, and web interface. Uses real-time odds data from The Odds API. Ideal for betting enthusiasts, analysts, and educational purposes.

arbitrage-betting betting-strategy data-analysis finance gambling odds-api python sports-analytics sports-betting

Last synced: 31 Mar 2025

https://github.com/virajbhutada/google-stock-price-forecasting-lstm

Analyzing and predicting Google's stock prices through detailed data exploration and advanced LSTM models. This project involves data preprocessing, creating time-series sequences, constructing and training LSTM networks, and evaluating their performance to forecast future stock prices utilizing Python and Machine Learning libraries.

data-analysis data-science data-visualization future-prediction google-dataset google-stock-price-prediction google-stocks lstm-model lstm-neural-network machine-learning machine-learning-models matplotlib model-building model-training numpy python stock-forecasting

Last synced: 27 Feb 2025

https://github.com/amstuta/cpp-neural-network

Simple implementation of a feedforward neural network in C++

data-analysis deep-learning machine-learning neural-network

Last synced: 08 Apr 2025

https://github.com/priboy313/pandasflow

A set of custom Python modules for a friendlier pandas workflow

catboost data-analysis data-science pandas phik python scikit-learn shap

Last synced: 20 Jan 2026

https://github.com/silvano315/med-physics

This is intended to be a repository about medical physics. It will be based on four paths: medical data to analyse, SOTA programs for medical purposes, computer vision, and eXplainability.

computer-vision data-analysis data-science explainable-ai medical-imaging medical-physics medical-tool

Last synced: 24 Mar 2025

https://github.com/rayyan9477/household-transactions-analysis-and-clustering

This project involves analyzing household transaction data to gain insights into spending patterns and behaviors. The analysis includes data cleaning, exploratory data analysis (EDA), clustering using K-Means, and visualization of customer segments.

customer-segmentation data-analysis data-cleaning data-science exploratory-data-analysis kmeans-clustering machine-learning

Last synced: 27 Feb 2025

https://github.com/saeun-park/lg-aimers-4th

Development of a model for predicting B2B sales opportunity creation based on MQL data

b2b data-analysis data-science machine-learning mql

Last synced: 08 Apr 2025

https://github.com/john-science/data_science_by_example

Examples of Data Science Tools & Libraries

data-analysis data-science ipython pandas

Last synced: 12 May 2025

https://github.com/priyanshubiswas-tech/aws-mwaa-elt-airflow-sql-dbt-superset-project

This project was created as part of an assessment for DigitalXC AI. It demonstrates a cloud-based ELT pipeline using AWS MWAA, Airflow, dbt, PostgreSQL, and Superset. The pipeline automates data ingestion from S3, transformation with dbt, and visualization through Superset, following modern data engineering practices on a scalable AWS architecture.

apache-airflow apache-superset aws-s3 dag data-analysis data-engineering-pipeline data-visualization dbt elt-pipeline python rds-postgres

Last synced: 03 Jul 2025

https://github.com/gher-uliege/bluecloud-plankton

Spatial interpolation of plankton data using a neural network

data data-analysis data-visualization neural-network oceanography

Last synced: 30 Mar 2025

https://github.com/neerajcodes888/whatsapp-chat-analyzer

A Python tool for effortless analysis of WhatsApp conversations. Gain insights with basic statistics, word cloud visualizations, and URL statistics. Powered by pandas, urlextract, wordcloud, seaborn, and Streamlit. 📊📱

analyzer chat data-analysis data-visualization pandas python3 seaborn urlextract whatsapp wordcloud

Last synced: 12 Apr 2026

https://github.com/rajnish93/jpandas

A lightweight JavaScript library for working with tabular data, inspired by Pandas in Python. Built with TypeScript, it provides an intuitive API for data manipulation and analysis.

data-analysis data-analytics data-manipulation data-science dataframe javascript pandas stream-processing table typescript

Last synced: 11 Jun 2025

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist of profanity in pt-BR for data analysis, filters, or text considered "avoidable"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 25 Mar 2025

https://github.com/lmuffato/analise-de-diarias-prefeituas-do-es

This code is part of a project to uncover and combat corruption schemes by processing and cross-referencing open data made available by several municipal governments in Espírito Santo through the transparency portal. It joins and analyzes several tables imported from CSV.

data-analysis personal-project r rstudio

Last synced: 12 Jun 2025

https://github.com/colburncodes/se_pudding_2023

This project is a React app designed to showcase research conducted by a team of data scientists and data analysts. The app uses React and React-Chartjs-2.

chartjs-2 data-analysis data-science data-visualization react-chartjs-2 reactjs

Last synced: 11 May 2026

https://github.com/lmuffato/dados-meteorologicos-inmet-tratamento

Processing and analysis of meteorological data from local stations provided by INMET, using the R language

data-analysis personal-project r rstudio

Last synced: 12 Jun 2025

https://github.com/randomshek/Working-With-Excel

Uses Excel Power Query and PowerPivot to reorganise the data into a star schema and showcases reports that data analysts can create using DAX formulae and PowerPivot

data-analysis excel power-pivot power-query

Last synced: 20 Jul 2025

https://github.com/danhenriquex/data-science-project

The main goal of this project was to apply the concepts of data visualization and analysis.

data-analysis data-science numpy pandas python

Last synced: 12 Apr 2026

https://github.com/rani-sikdar/pwc-virtual-internship-powerbi

Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.

business-analytics business-intelligence data-analysis data-cleaning data-visualization interactive interactive-visualizations powerbi

Last synced: 07 Jan 2026

https://github.com/anurag-kumar-molankala/blinkit-grocery-sales-dashboard

The BlinkIT Grocery Sales Dashboard is an interactive Power BI dashboard that provides insights into grocery sales performance. It includes key KPIs, sales trends, and outlet performance analysis.

business-intelligence dashboards data-analysis data-visualization dax excel kpi-dashboard power-bi powerquery slicers-kpi-card-multirow-card sql-server ssis ssms ssrs-reports

Last synced: 09 Apr 2025

https://github.com/gappeah/nike_web_crawler

This project involves web scraping Nike's product pages to extract product names, prices and links. The project showcases three different implementations of the web crawler using Selenium and BeautifulSoup. It also includes visualisation of the scraped data using Matplotlib and Seaborn.

beautifulsoup data-analysis data-visualization python selenium web-crawler web-scraper webcrawler webscraper webscraping webscraping-beautifulsoup

Last synced: 04 Jul 2025

https://github.com/sanam2405/chatinfo

Analysing the WhatsApp chat with my crush over a 6-month period

data-analysis data-visualization python

Last synced: 27 Apr 2026

https://github.com/muneeb1030/dataannotation

This streamlines the process of annotating data for machine learning tasks, making it easier and more efficient for teams to create labeled datasets by leveraging Label Studio and Bulk

bulk data-analysis data-annotation label-studio python

Last synced: 10 May 2026

https://github.com/aravindnathan02/sales-and-customer-analytics

This is a repository for a sales and customer performance Tableau dashboard.

customer-dashboard dashboard data-analysis data-visualization sales-analysis sales-dashboard tableau

Last synced: 08 Jan 2026

https://github.com/brunomontezano/benzocovid

💊 Data Analysis Project of Benzodiazepines during COVID-19 Pandemic.

benzodiazepines covid-19 data-analysis

Last synced: 28 Feb 2025

https://github.com/whis99/userfunnelanalysis

An analysis of e-commerce user funnel conversion data with Matplotlib and Python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Apr 2026

https://github.com/jakubkorytko/data-graphs

Transform raw data into captivating visual stories with this app. Effortlessly craft stunning data charts that unveil insights and trends.

charts data-analysis mit-license open-source

Last synced: 14 May 2026

https://github.com/cintia0528/data_cleaning_and_analytics-python

Evaluate if aggressive discounting benefits Eniac long-term, considering differing views on customer acquisition and brand positioning. Focus on data cleaning for informed decision-making.

colab-notebook data data-analysis datacleaning dataquality jupyter-notebook matplotlib pandas python seaborn

Last synced: 08 Jan 2026

https://github.com/derrickbaruga7/mapping-median-age-europe

An R project that creates an interactive map of the median age across European regions using Eurostat data and spatial visualization packages.

data-analysis data-science data-visualization datascience european-union mapping r

Last synced: 25 Mar 2025