An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/chandrashekhar-01/globalterrorism-analysis

A data mining and analytics project on the Global Terrorism Database (Kaggle) that explores worldwide terrorism trends through Python-based data visualization and statistical analysis.

data-analysis data-mining data-visualization exploratory-data-analysis

Last synced: 28 Feb 2026

https://github.com/an4pdm/relatorio-de-vendas

This project was built with the tools offered by Power BI in order to improve my knowledge of ETL. The data used came from the "Kaggle" website.

data-analysis data-visualization database etl powerbi

Last synced: 20 Jun 2026

https://github.com/cassandrajm/reddit-dashboard

INTERACTIVE DASHBOARD: Analyzing Political Discourse on Reddit: A Multi-Faceted NLP Approach to Toxicity, Bias, and Political Stance

capstone data data-analysis data-science politics python reddit

Last synced: 09 Apr 2025

https://github.com/kseniatyschuk/excel-data-matcher

Compare and match Excel files via a simple Python GUI

automation data-analysis etl excel gui pandas python3 tkinter

Last synced: 23 Apr 2025

https://github.com/drod75/burger_king_analysis

A simple analysis of a Burger King dataset.

data-analysis data-visualization jupyter-notebook pandas python seaborn

Last synced: 09 May 2026

https://github.com/rizkipragustono/data_analysis_spark

Exploration: Data Analysis using Spark

apache-spark data-analysis pyspark python spark-sql sql

Last synced: 09 May 2026

https://github.com/r-xue/xlib

Rui's IDL code library for Astrophysics

astrophysics data-analysis idl

Last synced: 16 Feb 2026

https://github.com/kernix13/github-readme-seo-analysis

A Jupyter Notebook analysis of GitHub README and repo SEO to determine what makes a repo rank in the SERPs

accessibility data-analysis readme seo seo-analysis

Last synced: 29 May 2026

https://github.com/prekshivyas/cis-595-big-data-analytics

Comprehensive real estate price prediction project, integrating socioeconomic indicators and property features.

data-analysis data-cleaning data-mining data-preprocessing data-science data-visualization data-wrangling exploratory-data-analysis web-scraping

Last synced: 16 Feb 2026

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 09 May 2026

https://github.com/tsbarr/toronto-open-data

Analysis of Toronto's open data initiatives. 🌆 Exploring Toronto's urban systems through data science 📊 Python-based analyses of public datasets 🔍 Focus on community impact and urban patterns 🎓 Academic rigour meets practical insights 🔄 Regularly updated with new analyses

api-integration civic-tech ckan-api data-analysis data-cleaning data-science data-visualization exploratory-data-analysis jupyter-notebook open-data pandas public-data python tableau toronto urban-analytics

Last synced: 09 May 2026

https://github.com/eea/eea.reveal

Reveal hidden knowledge by visualizing network structure in your data.

data-analysis data-visualization graphviz network-visualization

Last synced: 18 Mar 2025

https://github.com/tejas-130704/dataanalysis-hr-manager

Presence Insights of Employees: This project provides insightful data analysis on employee attendance and presence, including work-from-home (WFH) data, sick leave records, and presence excluding holidays. The analysis spans a three-month period and is visualized using Power BI to help HR managers understand trends and optimize workflow.

dashboard data-analysis data-visualization hr-manager power-bi

Last synced: 01 Mar 2026

https://github.com/arunesh-tiwari/sales-analysis

Tableau Data Analysis Project.

data-analysis data-visualization tableau

Last synced: 01 Mar 2026

https://github.com/grooviter/tablesaw

Java dataframe and visualization library

data-analysis dataframe java visualization

Last synced: 28 Mar 2025

https://github.com/dcs-training/intro-to-statistics

Intro to Statistics workshop. In this repo, you are going to find the code and files we are going to use for the practical part of the workshop, together with the ppt associated with this training. Go to the readme file

data-analysis data-visualisation data-wrangling r statistics

Last synced: 20 Jun 2026

https://github.com/cuadernin/dispositivos_analisis

Brief analysis of a dataset on mobile devices

data-analysis data-science data-visualization descriptive-statistics jupyter-notebook python-3 seaborn

Last synced: 18 Apr 2026

https://github.com/abhroroy365/market_analysis

This project explores customer segmentation and market analysis in the context of online retail using an online retail dataset. By applying advanced analytics, we aim to uncover insights that can drive strategic decisions and enhance business performance.

clustering data data-analysis data-visualization kmeans-clustering machine-learning market-analysis python silhouette-analysis

Last synced: 09 May 2026

https://github.com/oscarmtr/metrov

Interactive viewer for tropospheric meteorological soundings

climate data-analysis meteorology skew-t soundings temperature tropospheric web

Last synced: 01 Mar 2026

https://github.com/chrispsang/customerchurnanalysis

Predicting customer churn using a RandomForestClassifier with detailed EDA, model evaluation, and visualization. Includes a Tableau dashboard for interactive insights.

customerchurn data-analysis data-visualization datapreprocessing machine-learning python scikit-learn tableau

Last synced: 31 Jan 2026

https://github.com/johannaschmidle/road-collisions-project

Analyzed road accident data in the UK from 2019 to 2022 to identify patterns and trends in road accidents, for Effective Road Management [Excel]

data-analysis data-visualization excel pivot-tables traffic-analysis

Last synced: 01 Mar 2026

https://github.com/0-mostafa-rezaee-0/sandwich_structures

Impact test of Sandwich Structures

composite-materials data-analysis r

Last synced: 09 Aug 2025

https://github.com/aleks-andrs/bigdataanalytics

Public repository for CM3111: Big Data Analytics Coursework (Meteorite landings analysis)

data-analysis data-science machine-learning

Last synced: 02 Mar 2026

https://github.com/magnus0969/black-friday-sales-analysis

An in-depth analysis of Black Friday sales data to uncover trends, customer behavior, and product insights. Utilizing Python, data visualization, and machine learning techniques, this project provides key business intelligence to optimize sales strategies.

analysis data-analysis data-science python sales-analysis

Last synced: 09 May 2026

https://github.com/abeltavares/hotel_performance_analysis

A Power BI project that analyzes the performance of a hotel, including revenue, expenses, customer data, hospitality metrics and financial ratios.

business-intelligence data-analysis expenses financial-analysis hospitality-industry power-bi revenue

Last synced: 02 Mar 2026

https://github.com/badranalyst/covid-deaths-dashboard-with-tableau

This project showcases an interactive dashboard developed in Tableau to visualize COVID-19 deaths data. It provides insights into trends, geographical distributions, and key metrics related to mortality during the pandemic. The dashboard aims to enhance understanding of the data, supporting public health analysis and decision-making.

covid-19 dashboard data data-analysis data-visualization dataset tableau tableau-dashboards visualization

Last synced: 02 Mar 2026

https://github.com/charlescro/reddit-classification-nlp

Analyzing subreddit language via Reddit API and NLP techniques.

data-analysis data-science data-visualization nlp-machine-learning reddit-api scikit-learn

Last synced: 03 Apr 2025

https://github.com/zxjahid/matplotlib

A comprehensive guide to mastering data visualization with Matplotlib through hands-on examples and advanced techniques. 🚀📊

candlestick candlestick-chart cheatsheet data-analysis data-visualization gtk jupyter-notebook maps matplotlib-python pandas thesis-template tk tutorial wx

Last synced: 09 May 2026

https://github.com/bpkaur/exploring-the-evolution-of-linux

This project explores the evolution of the Linux kernel by finding the top 10 contributors and visualizing commits over the years.

data-analysis data-science datacamp ipynb-jupyter-notebook python3

Last synced: 21 Feb 2026

https://github.com/chaitanyaprasad60/sql-queries

This is a list of complex SQL Queries I have practiced.

data-analysis sql window-functions

Last synced: 03 Mar 2026

https://github.com/darksoulnelson/json-to-excel-converter

This repository provides a tool to convert JSON data to Excel format (.xlsx). It allows you to easily transform structured JSON data into a well-organized spreadsheet for better analysis and visualization.

automation-script automation-tools data-analysis data-converter data-export data-formatting data-tools data-visualization excel excel-automation excel-converter excel-tools json json-exporter json-parser json-processing json-to-csv json-to-excel programming-tools spreadsheet-tools

Last synced: 05 Jul 2025

https://github.com/quantumudit/groceries-basket-analysis

This project performs market basket analysis using Power BI and Python to reveal associations between grocery items. It involves transforming raw transaction data into a processed dataset, creating interactive Power BI reports, and generating key insights through Python, enabling data-driven decision-making.

data-analysis data-visualization pandas powerbi python

Last synced: 12 Apr 2026

https://github.com/anas436/student-performance-analysis

In this project I have constructed a machine learning system that analyzes students' performance based on their academic records. Note that this project will work with any student records you want to provide.

data-analysis jupyter-notebook matplotlib numpy pandas python3 seaborn

Last synced: 16 Apr 2026

https://github.com/vinitgurjar/r_lang_exp

This is a collection of my college Data Analytics lab work and assignments; the files here contain programs in the R language.

data-analysis data-visualization r

Last synced: 02 Jul 2025

https://github.com/mugambi645/exploring-ebay-car-sales-data

Exploring the eBay car sales dataset

car-sales data-analysis numpy pandas

Last synced: 16 Apr 2026

https://github.com/mpoojithavigneswari/bangalore-house-price-prediction

This project involves creating a website that predicts Bangalore house prices with 94.65% accuracy using a machine learning algorithm.

data-analysis data-science flask-server machine-learning matplotlib numpy pandas python scikit-learn seaborn

Last synced: 12 Apr 2026

https://github.com/grindelfp/logistic-regression-study

Example of logistic regression data analysis, with an exercise on it.

data-analysis ipynb logistic-regression python

Last synced: 03 Mar 2026

https://github.com/nahiyanhkhan/sales-insight-dashboard_powerbi

Build a dashboard to display insights from a company's sales data over a 4-year period. It displays revenue and sales quantity in different regions over the years.

dashboard data-analysis data-analytics data-visualization powerbi salesdashboard

Last synced: 08 Jan 2026

https://github.com/banner-19/extraction-and-analysis-of-text

The objective is to analyze text content from a list of URLs. This involves extracting article titles and text, then performing natural language processing to generate metrics like sentiment, readability, and word usage. Finally, the results are stored for further analysis or visualization.

data-analysis data-analytics data-science nlp nltk python3 text-analysis text-extraction

Last synced: 03 May 2026

https://github.com/lintangwisesa/ujian_analyticsvisualization_jcds07

Exam question guide for Data Analytics & Visualization, Job Connector Data Science batch 7

data-analysis data-science data-visualisation exam

Last synced: 04 Mar 2026

https://github.com/adrianlardies/feelms_predict_by_emotion

Feelms is a mood-based movie recommendation app that uses collaborative filtering and machine learning to suggest films based on your emotions. Built with Streamlit and powered by AWS, Feelms personalizes each user's experience through simulated interactions and tailored predictions.

aws-ec2 aws-rds data-analysis data-science machine-learning python streamlit

Last synced: 16 Apr 2026

https://github.com/abhipatel35/gym-performance-analysis

Analyzing gym performance and user engagement in Arizona using Spark SQL, PySpark, and visualization techniques on the Yelp dataset.

apache-spark asu business-insights data-analysis data-processing-at-scale data-visualization dps gym-analysis rating-patterns sql trend-analysis user-insights yelp-dataset

Last synced: 16 Apr 2026

https://github.com/bishopce16/school_district_analysis

The school board requested an analysis on the various performance metrics for the school district.

data-analysis jupyter-notebook numpy pandas python visual-studio-code

Last synced: 16 Apr 2026

https://github.com/kosuri-indu/allaboutolympics

All About Olympics is an interactive dashboard presenting comprehensive data and insights on Olympic Games from 1896 to 2020.

data-analysis pandas plotly python streamlit

Last synced: 16 Apr 2026

https://github.com/akash-srm/user-engagement-analysis

Analyzed user engagement and feedback data to derive actionable insights for an online learning platform.

analytics-projects data-analysis data-cleaning eda jupyter-notebook pandas python seaborn student-engagement

Last synced: 16 Apr 2026

https://github.com/themihirmathur/qlik-intern-project

Qlik Analysis of Road Safety & Accident Patterns in India 📈 Analyzed & visualized road safety data for 20.85k+ accident cases with 9+ accident data patterns in India using Qlik 📉 Reduced inefficiencies by 25% by designing an avant-garde data tracking dashboard that monitored injuries.

data-analysis data-visualization presentation qlik qlik-cloud qlik-sense qlikview

Last synced: 04 Mar 2026

https://github.com/marben06/rent-in-germany

Interactive visualizations and maps depicting topics around rent prices and income in Germany built with Svelte.

charts d3 d3-visualization d3js data-analysis data-visualization gis gis-data infographic infographics map mapbox mapbox-gl mapbox-gl-js mapboxgl svelte

Last synced: 27 Apr 2026

https://github.com/zachbateman/easy_plot

Easy Statistical Visualization in Python

data-analysis data-visualization graphics matplotlib python seaborn

Last synced: 18 Jan 2026

https://github.com/danpoynor/omdb-api-data-analysis

Gathers data for Oscar-winning movies using their IMDB ids, saves the information to a CSV file, and answers a few data analysis questions about the movies using JupyterLab.

analytics csv data-analysis jupyter-notebook matplotlib omdb-api pandas-dataframe python-dotenv python3 seaborn-plots

Last synced: 16 Apr 2026

https://github.com/yasumorishima/yasumorishima

Manufacturing Engineer & Data Analyst. 17 years exp in MFG. Python, VBA, Automation Specialist. (盛島康徳 / Yasunori Morishima)

automation data-analysis manufacturing portfolio python vba

Last synced: 05 Mar 2026

https://github.com/ribin-baby/the-sparks-foundation-data-science-internship

This repository contains tasks and solutions assigned as part of an internship program, including workbooks on data analysis and model building.

data-analysis eda python3

Last synced: 16 Apr 2026

https://github.com/theveryhim/massive-text-processing-1

Cleaning, processing, and analysis of a papers dataset in the PySpark (RDD) framework

big-data data-analysis frequent-itemsets massive-datasets pyspark text-preprocessing

Last synced: 03 Jul 2025

https://github.com/pizofreude/divvybikes-share-success

Developing a data-driven marketing campaign for Divvy to convert casual riders into annual members. Divvy is a bike-share program of the Chicago Department of Transportation (CDOT).

airflow bi-analytics data-analysis data-engineering data-visualization database dbt docker etl jupyterlab python r redshift s3

Last synced: 17 Apr 2026

https://github.com/junpenglao/spafv

SPAFV - Surface Profile Analysis for Free Viewing eye movement experiment in 2AFC task

data-analysis statistics temporal-logic

Last synced: 31 Mar 2025

https://github.com/dina-hosny/analyze-and-model-airline-system

Analyzing Airline System and Building Data Warehouse Model to Store the Data and Answer Some Business Questions

data-analysis data-modeling data-warehouse datawarehousing dwh plsql sql

Last synced: 05 Mar 2026

https://github.com/shashwat9kumar/us-accidents-data-analysis

Analysis of the US accidents using the US-Accidents dataset (4.2 million entries) from Kaggle

accidents accidents-analysis data-analysis data-analytics data-visualisation data-visualization matplotlib numpy pandas python

Last synced: 17 Apr 2026

https://github.com/marvinmarnold/oipm_stop_search

OIPM's analysis on Stop & Search (frisk) activity by the New Orleans Police Department.

data-analysis frisk new-orleans oipm police search stop

Last synced: 22 Jul 2025

https://github.com/tanaybhadula/twitter-trends-dashboard

An interactive dashboard that visualizes data on current Twitter trends by country and globally. It collects data for over 60 countries using the Python Tweepy library, processes it, and visualizes it as bar charts and pie charts using the Plotly Dash framework.

dash dashboard data-analysis data-visualization plotly python trends twitter

Last synced: 31 May 2026

https://github.com/vaishnavis03/finlatics_ml_program

This repository contains the .ipynb files for 3 datasets, along with a PPT for each. The datasets included are Facebook Marketplace Data, Sales Prediction Data, and Wine Quality data.

correlation data-analysis data-science data-visualization knn linear-regression machine-learning matplotlib numpy pandas random-forest-classifier scikit-learn

Last synced: 17 Apr 2026

https://github.com/bocchio01/skyward_recruitment_assignment

Assignment to join the PoliMi SkyWard software team

data-analysis kalman-filter model-rocket

Last synced: 15 Mar 2025

https://github.com/dulajkavinda/pandas-exploring-data-ml

🐼 Exploring data with pandas library.

data-analysis machine-learning pandas python

Last synced: 09 May 2026

https://github.com/apoorvalal/misc_stata_ados

Misc Utility programs in Stata.

data-analysis stata stata-command

Last synced: 04 Feb 2026

https://github.com/ernanej/data-science-dca0131

Files developed throughout the 2024.1 semester of the Data Science course taught at the Federal University of Rio Grande do Norte by the Department of Computer Engineering and Automation (DCA). 📚

big-data data-analysis data-science ia

Last synced: 30 Mar 2025

https://github.com/jabercrombia/video-game-data

This project integrates FastAPI as the backend and Next.js as the frontend to create a full-stack web application. It processes and displays video game sales data, enabling seamless API communication while maintaining a scalable and efficient architecture.

data-analysis nextjs nintendo playstation python typescript video-game

Last synced: 02 Apr 2026

https://github.com/quocduyenanhnguyen/roi-modeling-and-analysis-of-sports-dataset

In this project, you will find my ROI model for retirement savings with its PowerPoint presentation, as well as my data analysis and visualization of a Sports Ticket Sales dataset, concluded with a group-written PDF report.

data-analysis data-visualization microsoft-excel rate-of-return-modeling sports-ticket-sales-dataset

Last synced: 08 Feb 2026

https://github.com/mansogf/datascience_introduction

Introductory data science practice exercises

data-analysis data-science data-visualization graph

Last synced: 04 Apr 2025

https://github.com/salma-mamdoh/exploring-the-evolution-of-linux-project

My project to learn the basics of analysis on DataCamp

data-analysis datacamp pandas python time-series-analysis

Last synced: 09 May 2026

https://github.com/filip-kustura/python-covid-19-behaviors-analysis

Using Jupyter Notebook, this university project analyzes attitudes and behaviors related to the COVID-19 pandemic using a two-year survey from Imperial College London and YouGov research company. Utilizing Pandas, NumPy and Matplotlib, the data analysis focuses on three countries, exploring trends and insights throughout the pandemic.

covid-19 data-analysis data-visualization jupyter-notebook matplotlib numpy pandas python university-project

Last synced: 12 Apr 2026

https://github.com/eliasdehondt/learn-r

Welcome to the Learn-R repository! This is your go-to resource for learning the R programming language, whether you're a beginner or looking to enhance your skills.

data-analysis data-visualization education machine-learning programming r statistics tutorials

Last synced: 03 Apr 2026

https://github.com/jhrcook/checkplease

Analysis of an immune checkpoint-blockade screen.

bayesian-statistics data-analysis pymc3 python python3 r

Last synced: 17 Apr 2026

https://github.com/tyriek-cloud/nyc-mobility-survey-analysis

An end-to-end data engineering project in which five NYC DOT datasets were modified in an ETL process and analyzed for insights.

aws aws-athena aws-glue aws-glue-crawler aws-quicksight aws-s3 data-analysis data-engineering etl-pipeline json python

Last synced: 09 May 2026

https://github.com/shimazadeh/ft_linear_regression

Implementing a modular linear regression from scratch to predict the price of cars using a gradient descent algorithm.

data-analysis data-science hyperparameter-tuning linear-regression predictive-modeling

Last synced: 03 Jun 2026

https://github.com/mahmoudwal27/manufacturing_downtime

This project focuses on improving manufacturing efficiency by analyzing production data. Using Python, SQL, and Power BI, we built interactive dashboards to uncover patterns, minimize downtime, and optimize operations. The goal is to help stakeholders make data-driven decisions for enhanced productivity.

data-analysis data-analysis-python data-visualization google-colab powerbi python sql

Last synced: 17 Apr 2026

https://github.com/ridemountainpig/education-level-data-analysis

An analysis of the relationship between education levels, unemployment rates, and credit card spending in Taiwan's six major cities.

data-analysis matplotlib pandas-python

Last synced: 17 Apr 2026

https://github.com/datalopes1/manufacturing_defects

EDA project using the Manufacturing Defects dataset, which can be found on Kaggle

data-analysis data-visualization eda exploratory-data-analysis python

Last synced: 09 May 2026