An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/azmainadel/twitter-data-neo4j

Experimenting with a graph database on a large dataset of Twitter data.

data-analysis data-visualization neo4j-database snap

Last synced: 06 Apr 2025

https://github.com/emso-exe/reclamacoes_de_consumidores_com_empresa_de_telecomunicacoes

Analysis project on consumer complaints against a telecommunications company in the first half of 2021, based on data from the consumidor.gov.br website.

analise-de-dados ciencia-de-dados data-analysis data-science datascience python python-3 python3

Last synced: 02 May 2026

https://github.com/dangerousfish/uk-climate-trends-dashboard-metoffice

A data pipeline and Streamlit dashboard that aggregates, cleans and visualises historical UK Met Office station data - interactive charts, heatmaps and maps for temperature, rainfall and sunshine.

climate climate-analysis climate-change climate-data climate-science data-analysis data-visualization metoffice metofficeweather streamlit temperature weather

Last synced: 02 May 2026

https://github.com/jakobzmrzlikar/fake-news-analysis

An analysis of the FakeNewsNet dataset using NLP techniques.

data-analysis fake-news ipynb-jupyter-notebook nlp-machine-learning

Last synced: 05 Mar 2026

https://github.com/victoriapm/analyze_a-b_test_results

Understand the results of an A/B test run by an e-commerce website.

ab-testing data-analysis ecommerce-website

Last synced: 06 Oct 2025

https://github.com/pngo1997/axa-xl-insurance-bi-dashboard

Provides a comprehensive analysis of insurance submissions, approvals, compliance rates, and profitability for AXA XL Insurance.

bi-analytics bi-dashboard business-analytics data-analysis filtering performance-analysis powerbi segmentation visualization

Last synced: 08 Feb 2026

https://github.com/an1mch1k-theone/project_2_hh_analyze

Analysis of job vacancies from HeadHunter

data-analysis data-analysis-project postgresql python sql

Last synced: 14 Apr 2026

https://github.com/samruddhi3012/customer-behavior-analysis

Hello there! This repo contains a Python project based on e-commerce customer behavior analysis.

customer-segmentation customerbehavior data-analysis ecommerce python

Last synced: 02 May 2026

https://github.com/msthamizh/phonepe-pulse-data-visualization-and-exploration

Developing a Streamlit application that allows users to explore and analyze transaction data from the PhonePe Pulse dataset. The project aims to provide insights into digital payment trends across India.

data-analysis data-visualization dataframe mysql pandas plotly python streamlit

Last synced: 02 May 2026

https://github.com/seankwarren/water-quality-analysis

An examination of water quality in the Atlanta watershed, with a focus on identifying neglected areas and potential strategies for improving water quality monitoring.

analytics data-analysis jupyter-notebook python

Last synced: 03 May 2026

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/chahiriabderrahmane/carpricepredictor

🚗 Cars Exploration & Price Prediction | Analyzing Cars.com Listings

data-analysis data-science data-visualization machine-learning python streamlit web-scraping

Last synced: 08 Feb 2026

https://github.com/madhuresh2011/telco-customer-churn-analysis-using-python

The analysis primarily investigates factors influencing customer churn, particularly focusing on payment methods and contract types.

csv data-analysis matplotlib numpy pandas pyhton seaborn vizualisation

Last synced: 02 May 2026

https://github.com/geetisha/sales_insight_data_analysis_using_sql_and_tableau-etl-

Sales Insights - A Data Analysis Project performed on Tableau & SQL Topics

dashboard data-analysis data-visualization mysql project sales-analysis sql tableau

Last synced: 07 Jan 2026

https://github.com/sadia-khan13/supervised_machine_learning

This repository documents my hands-on experience with supervised learning algorithms and techniques. It includes a variety of exercises and experiments using different types of data and tools. Each file represents a step forward in building my machine learning skills.

data-analysis data-science jupyter-notebook machine-learning machine-learning-algorithms python sciket-learn supervised-machine-learning

Last synced: 06 Mar 2026

https://github.com/snehilk1312/data_science

This repository contains the data science work I have done recently, along with visualization, cleaning, models, statistics, courses, and datasets. :=)

data-analysis data-science glove natural-language-processing nlp nltk statistics word2vec

Last synced: 02 Apr 2026

https://github.com/dcs-training/intromachinelearning

This course provides an introduction to machine learning for those with beginner-level Python skills. Go to the readme file.

data-analysis data-wrangling machine-learning python statistics

Last synced: 06 Mar 2026

https://github.com/gholamrezadar/most-profitable-actors

Finds the actors with the highest box-office profit using the TMDB API.

crawling data-analysis tmdb

Last synced: 16 Jan 2026

https://github.com/siddharthbadal/sql-case-studies-data-analysis

Data analysis case studies on various databases using SQL, demonstrating proficiency in solving diverse business problems. Projects cover sales, orders, products, finance, healthcare, and other sectors, and highlight my ability to analyze complex datasets through SQL queries, data manipulation, and visualization techniques.

data-analysis sql sql-query sql-server sqlserver

Last synced: 08 Jan 2026

https://github.com/ferrangarciarovira/premier-league-betting-analysis

Comprehensive Python analysis of Premier League betting market inefficiencies (2005–2024). Evaluates bookmaker biases, betting strategies, and market efficiency using statistical methods and Monte Carlo simulations.

betting-strategies bias-detection data-analysis market-efficiency monte-carlo-simulation premier-league python sports-analytics

Last synced: 03 May 2026

https://github.com/raccoon-hero/gender-equality-tracker

A web application visualizing gender equality metrics with a focus on Ukraine. Built with Flask, it's powered by live data from global open sources, with dynamic research insights and analysis.

chartjs css dashboard data-analysis data-visualization flask frontend gender-equality global-metrics html linked-data openalex opendata python representation semantic-web ukraine webapp wikidata world-bank-api

Last synced: 07 May 2026

https://github.com/jakubkorytko/data-graphs

Transform raw data into captivating visual stories with this app. Effortlessly craft stunning data charts that unveil insights and trends.

charts data-analysis mit-license open-source

Last synced: 14 May 2026

https://github.com/manikantasanjay/time_series_data_analysis_on_stocks

Time series data analysis project on the daily stock prices of four companies (Apple, Microsoft, Google, Amazon) over a span of 5 years.

data-analysis pandas stock time-series time-series-analysis

Last synced: 03 May 2026

https://github.com/jossimmar/ensa-scripts_py

Repository for handling consumption data of the major customers (Clientes Mayores) of ENSA, part of the Distriluz Group.

data-analysis electrical-engineering python sqlite

Last synced: 10 May 2026

https://github.com/zeynepcol/data-analysis-visualization

Data visualization and interactive analytics - Olympics Dataset

data-analysis data-science data-visualization matplotlib pandas plotly python scipy seaborn streamlit

Last synced: 03 May 2026

https://github.com/aritrakar/statpy

A simple package containing some functions for analysing Gaussian and Binomial distributions. Created for the Udacity AWS MLE Foundations 2021 course.

data-analysis python statistics

Last synced: 24 Oct 2025

https://github.com/wewoc/garmin_local_archive

Secure, local-first archive for Garmin Connect health data (HRV, sleep, activities). Private & offline. Structured for local analysis (Excel, HTML-Dashboard, Ollama, Open WebUI, AnythingLLM). Your data stays on your machine.

backup dashboard data-analysis fitness-tracker garmin garmin-connect ollama open-webui privacy privacy-enhancing-technologies privacy-first privacy-focused python self-hosted

Last synced: 16 Apr 2026

https://github.com/nomadsdev/sys-moninsight

System Monitoring and Analysis Tool is a utility for real-time performance tracking. It logs CPU, memory, and disk usage, provides visual graphs, and offers performance recommendations. Perfect for optimizing system efficiency.

automation cpu-usage data-analysis data-visualization disk-usage matplotlib memory-usage performance-analysis performance-optimization psutil python real-time-monitoring resource-management sys-moninsight system-metrics

Last synced: 19 Jun 2026

https://github.com/ahammadshawki8/playing-with-pandas

🐼 Pandas is one of my favourite libraries in Python. It is well known for "analyzing" data. Learn the basics of Pandas and beyond from this repository. 🤍🖤

beginner-friendly data-analysis favourite-library pandas python

Last synced: 17 Apr 2026

https://github.com/carolinedotxyz/dp_sgd_classification

A hands-on educational walkthrough of training a CelebA (Eyeglasses) image classifier with Differentially Private SGD using PyTorch and Opacus. The focus of this repo is on clarity and reproducibility through balanced subsets, deterministic preprocessing, and side-by-side baseline vs. DP training, while acknowledging real trade-offs.

celeba-dataset classification data-analysis dp-sgd machine-learning opacus python pytorch

Last synced: 16 May 2026

https://github.com/flexmonster/svelte-flexmonster

Svelte wrapper for Flexmonster Pivot Table & Charts

data-analysis data-visualization frontend pivot-tables svelte sveltekit

Last synced: 27 Feb 2026

https://github.com/cworld1/novel-analysis

A simple project for analyzing Chinese novels

data-analysis novel

Last synced: 17 Mar 2025

https://github.com/ankit21111/filmilytics

This repository contains data and analysis on RSVP Movie House Production, focusing on past performance metrics and audience trends. Our goal is to derive actionable insights that can guide future productions for greater success. Explore the data, analysis scripts, and recommendations to understand how RSVP can thrive in the film industry.

data-analysis database database-design database-schema erdiagram sql

Last synced: 13 Jun 2025

https://github.com/gappeah/nike_web_crawler

This project involves web scraping Nike's product pages to extract product names, prices and links. The project showcases three different implementations of the web crawler using Selenium and BeautifulSoup. It also includes visualisation of the scraped data using Matplotlib and Seaborn.

beautifulsoup data-analysis data-visualization python selenium web-crawler web-scraper webcrawler webscraper webscraping webscraping-beautifulsoup

Last synced: 04 Jul 2025

https://github.com/sleeplessglory/big-data

Projects regarding big data analysis, presented within Jupyter Notebook

big-data data-analysis data-visualization jupyter python

Last synced: 16 Apr 2026

https://github.com/shelton-beep/predicting-gpa-using-lifestyle-factors

Predicting student GPA using lifestyle factors like study habits, sleep, and stress levels. A machine learning model built to help students and educators understand the impact of lifestyle choices on academic performance.

data-analysis data-preprocessing data-science feature-engineering gpa-prediction machine-learning model-interpretability predictive-modeling python regression-analysis student-performance xgboost

Last synced: 04 May 2026

https://github.com/equicirco/cirquant

Code and data for quantifying circularity through open data and digital innovation.

circular-economy data-analysis database julialang official-statistics

Last synced: 13 Jan 2026

https://github.com/shadowk29/cusumtools

An eclectic collection of Python scripts I have found useful for processing nanopore data.

data-analysis data-visualization time-series-analysis

Last synced: 16 Mar 2026

https://github.com/sunnybibyan/exploratory-data-analysis-eda

Welcome to the Titanic Dataset - Exploratory Data Analysis (EDA) project repository! This project aims to uncover insights from the Titanic dataset using Python and Jupyter Notebook. By analyzing key variables such as age, gender, and class, we aim to visualize relationships between passenger characteristics and survival rates.

data-analysis data-visualization jupyter-notebook python titanic-dataset

Last synced: 18 Jan 2026

https://github.com/smahala02/calorimtery

A calorimetry lab project involving Python and Excel for computing heat transfer from experimental data.

calorimetry chemistry data-analysis excel jupyter-notebook python thermodynamics

Last synced: 05 Feb 2026

https://github.com/aaryan-agr/canadian-energy

This project analyzes Canada's energy trade, focusing on imports, exports, and market trends in the energy sector.

data-analysis data-cleaning data-manipulation data-processing data-science data-vizualisation energy-sector time-series-analysis

Last synced: 10 Jun 2025

https://github.com/foggy-projects/foggy-data-mcp-bridge

MCP Data Bridge for Java. Enabling safe Text-to-Query via a semantic layer, making enterprise data accessible to AI Agents.

agent data-analysis java llm mcp semantic-layer spring-boot text-to-sql

Last synced: 16 Mar 2026

https://github.com/lobooooooo14/badwords-pt-br

💬 Wordlist of profanity in pt-BR for data analysis, filters, or text considered "avoidable"

badword-filter badwords brasil data-analysis filter filter-lists filterlist portugues portuguese text-analysis wordlist

Last synced: 25 Mar 2025

https://github.com/jabhij/eda_experiments

In this repo I'll use different types of datasets to explore and implement various Exploratory Data Analysis (EDA) approaches.

ames-housing analysis battery-life blackfriday-analysis data-analysis data-science data-visualization eda matplotlib-pyplot numpy pandas python seaborn visualization zomato-data-analysis

Last synced: 14 Apr 2026

https://github.com/kirkalyn13/open-signal-report-generator

Script used to generate results and a summary, including the trends of flagged provinces, from the raw Excel data file.

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 19 Jun 2026

https://github.com/ivanildobarauna-dev/currency-quote

Complete solution for extracting currency pair quotes data with comprehensive testing, parameter validation, flexible configuration management, Hexagonal Architecture, CI/CD pipelines, code quality tools, and detailed documentation.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 27 Oct 2025

https://github.com/chelseammatta/nopd-cad-data-analysis

Analysis of 911 call data from New Orleans' 3rd & 4th police districts (2019-2022) using BigQuery

911-calls 911-data bigquery cad-data crime-analysis data-analysis emergency-response new-orleans public-safety sql

Last synced: 01 Jul 2025

https://github.com/ehtisham-sadiq/building-an-ml-based-heart-disease-diagnosis-system-with-flask

An end-to-end project that uses machine learning to create a user-friendly heart disease diagnosis system, powered by Flask.

data-analysis exploratory-data-analysis feature-engineering flask machine-learning model-building model-evaluation pipelines python3 rest-api

Last synced: 04 May 2026

https://github.com/vatshayan/hospital-discharge-analysis

Analysis of hospitalization discharge rates in Lake County, Illinois, across attributes such as anxiety, alcohol, mood, diabetes, asthma, etc.

data-analysis data-visualization jupyter-notebook machine machine-learning machine-learning-algorithms scikit-learn

Last synced: 04 Mar 2025

https://github.com/kimtth/agent-data-analyst-stream-chainlit

⚡️Chainlit-based Data Analyst Chat Agent (Responses API, Server Sent Events) 📈

agent azure-openai chainlit code-interpreter data-analysis server-sent-events stream-response

Last synced: 09 Jun 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/whis99/userfunnelanalysis

An e-commerce user funnel conversion analysis with Matplotlib and Python.

data-analysis data-analysis-python data-analyst data-visualization google-colab jupyter-notebook matplotlib python

Last synced: 13 Apr 2026

https://github.com/wizardoftrap/football-team-analytics

This Jupyter notebook, created on Kaggle, analyzes football player and team statistics for the 2024-2025 season. It provides insights into player performance, team metrics, and playing styles across major European leagues using data from the dataset players_data-2024_2025.csv.

data-analysis data-visualization jupyter-notebook pandas python

Last synced: 05 May 2026

https://github.com/steno-aarhus/legliv

Substitution of red meat with legumes and risk of primary liver cancer in UK Biobank participants: A prospective cohort study

cancer-research data-analysis epidemiology nutritional-epidemiology nutritional-science open-science reproducibility reproducible-research rstats ukbiobank

Last synced: 03 Mar 2026

https://github.com/sivas-2/food-demand

This project aims to predict food demand based on historical data, leveraging various statistical methods to achieve accurate forecasts.

data-analysis data-science dataanalysis food-demand-forecasting statistics

Last synced: 12 Aug 2025

https://github.com/kiranmayi5/python-projects

A collection of Python projects showcasing skills in data analysis and visualization.

data-analysis data-visualization machine-learning nlp python

Last synced: 05 May 2026

https://github.com/myounus-codes/saleprice-prediction-dataset-analysis-and-cleaning-advance-regression

In this project I have cleaned the data for the model. Project Google Colab Link: https://colab.research.google.com/drive/1vQY-XEFJSdEkW2PQOSf1j13Yk8L-XXNw?usp=sharing

algorithms data-analysis data-science eda google-colab machine-learning numpy pandas python scikit-learn scikit-learn-python

Last synced: 05 May 2026

https://github.com/colburncodes/se_pudding_2023

This project is a React app designed to showcase research conducted by a team of data scientists and data analysts. The app uses React and react-chartjs-2.

chartjs-2 data-analysis data-science data-visualization react-chartjs-2 reactjs

Last synced: 11 May 2026

https://github.com/githubuseraccountamazing/the-amari-project

A project in which I attempted to push some of the limits of Stable Diffusion while collecting some data along the way.

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 05 May 2026

https://github.com/mr-vozhyk/test-tasks

Completed test assignments (those not prohibited from publication)

analysis data-analysis digital-analysis e-commerce excel google-colab google-sheets marketing marketplace power-bi power-query python sql

Last synced: 07 May 2026

https://github.com/lmuffato/dados-meteorologicos-inmet-tratamento

Processing and analysis of meteorological data from local stations provided by INMET, using the R language.

data-analysis personal-project r rstudio

Last synced: 12 Jun 2025

https://github.com/saeun-park/lg-aimers-4th

Development of a model for predicting B2B sales opportunity creation based on MQL data

b2b data-analysis data-science machine-learning mql

Last synced: 08 Apr 2025

https://github.com/PatriLoto/Intro_R_para_reinventarTEC_2021

Material for the workshop "First steps in R for data analysis"

data-analysis rstats

Last synced: 10 Oct 2025

https://github.com/kinshuk-code-1729/data-visualisation-using-python

This repository consists of several Python snippets for creating two-dimensional (2D) graphics.

data-analysis data-science data-visualization matplotlib visualization

Last synced: 02 Jun 2026

https://github.com/brunomontezano/benzocovid

💊 Data analysis project on benzodiazepines during the COVID-19 pandemic.

benzodiazepines covid-19 data-analysis

Last synced: 28 Feb 2025

https://github.com/kamanhang/sqldatawarehousedataengineeringproject

This project delivers a modern data warehouse, focusing on building a clean, organized data pipeline that covers ETL pipeline development, data cleaning, data modelling, and data analytics.

customer-analytics data-analysis data-cleaning data-engineering data-modeling data-pipeline data-visualization datascience etl-pipeline postgresql powerbi powerbidashboard sales-analysis sql

Last synced: 10 Oct 2025

https://github.com/thameran/mmar

Benchmark data and code for MMAR: A Challenging Benchmark for Deep Reasoning in Speech, Audio, Music, and Their Mix

cli data-analysis datascience finance generator github-config go haskell hurst markdown mmar mmark time-series xml2rfc

Last synced: 06 May 2026

https://github.com/cosmoduende/r-twitter

Explore your Twitter activity with R: Sentiment Analysis and Data Visualization. How to analyze your Twitter account (or any account), discover your habits and sentiments with the "rtweet" package and NLP.

data-analysis data-visualization lemmatization nlp nlp-library nlp-resources nltk nltk-library r-package r-programming r-studio rtweet stemming twitter twitter-api twitter-data twitter-data-analysis twitter-data-extraction twitter-sentiment-analysis udpipe

Last synced: 10 Oct 2025

https://github.com/kirkalyn13/opensignal_autogenerate_report

Script used to generate results and a summary, including the trends of flagged provinces, from the raw Excel data file.

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/amirhosseinhonardoust/customer-sentiment-intelligence-platform

An enterprise-grade NLP + Streamlit + SQL platform for analyzing customer feedback. Performs automated sentiment detection, stores labeled reviews in SQLite, and delivers real-time dashboards with probability insights to support business, marketing, and product optimization decisions.

community-project cost-of-living dashboard data-analysis data-visualization economic-analysis inflation-tracking local-data open-data pandas price-tracker public-insight python sqlite streamlit

Last synced: 06 May 2026

https://github.com/silvano315/med-physics

This is intended as a repository about medical physics. It will be based on four paths: medical data to analyse, SOTA programs for medical purposes, computer vision, and explainability.

computer-vision data-analysis data-science explainable-ai medical-imaging medical-physics medical-tool

Last synced: 24 Mar 2025

https://github.com/phillbertnevinemmanuel/automotivesalesdataanalysis

This marks my inaugural venture into personal data analysis, employing SQL and Python for Correlation Analysis. I've sourced the dataset from Kaggle, specifically focusing on automotive sales. You can find the dataset linked on my website below. I'm excited to share that I've independently managed the majority of tasks involved in this project.

data-analysis dataset microsoft-sql-server python python-lambda sql ssms tsql

Last synced: 14 Mar 2026

https://github.com/mayankyadav23/air-bnb-data-analysis

Data analysis and insights from NYC Airbnb listings, focusing on key metrics such as host performance, neighborhood trends, pricing, and customer reviews. Comprehensive documentation of ETL processes and analytical methodologies is provided. Perfect for understanding Airbnb dynamics and decision-making in the NYC market.

advanced-excel business-intelligence data-analysis data-analytics data-visualization power-bi ppt

Last synced: 19 Mar 2026

https://github.com/mohamedhany99/human-voice-identifier-counter

An application developed in Kivy that identifies the users imported into the dataset using a support vector machine model. It has two features: importing a new voice, and detection, which detects human voices and counts them.

android android-app android-application automation automation-framework data data-analysis data-mining data-science data-visualization datascience kivy kivy-framework machine-learning python

Last synced: 27 Mar 2026

https://github.com/as16082023/coffee-bean-sales-analysis

Analyzing coffee bean sales data to optimize consumer targeting, product offerings, and strategic marketing in the coffee industry.

coffee-bean-sales dashboard data-analysis data-visualization ms-excel

Last synced: 22 Jan 2026

https://github.com/anthonybench/datapeek

Peek at a summary of a data file in a succinct, opinionated manner.

cli data data-analysis

Last synced: 02 Mar 2026

https://github.com/priyanshubiswas-tech/pwc-power-bi-task-1-2

Power BI dashboards analyzing Phonenow's call center performance and customer retention. Task 1 focuses on KPIs like satisfaction rating, call count, and agent efficiency. Task 2 analyzes retention trends and customer behavior to enhance loyalty. Built using Power BI, DAX, and Excel.

dashboard data data-analysis dax-measures excel powerbi powerbidashboard

Last synced: 23 Jan 2026

https://github.com/karthikmprakash/911-call-dataanalysis

Data Analysis of Emergency (911) Calls: Fire, Traffic, EMS for Montgomery County, PA

911-call-analysis data-analysis data-visualization python3 united-states-data

Last synced: 10 May 2026

https://github.com/allanotieno254/awsome-chocolate-company-sales-analysis-dashboard

This repository contains an in-depth analysis of chocolate consumption trends, focusing on various factors influencing consumer preferences, production, and market performance.

data-analysis data-science data-transformation measures powerbi sales-analysis visualization

Last synced: 23 Feb 2026

https://github.com/rupav/fifa17-detailed-analysis

⚽ FIFA 17 data analysis using various Machine Learning Algorithms. ⚽

data-analysis data-visualization fifa17 machine-learning-algorithms radar-chart

Last synced: 16 May 2026