An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/muthukumar0908/imdb_movie_analysis_with_powerbi

This project analyzes an IMDb movies dataset using Power BI.

data-analysis data-visualization powerbi

Last synced: 12 Jun 2025

https://github.com/hari7261/data-visualization

Python-based application built using CustomTkinter for the graphical user interface (GUI) and Matplotlib for data visualization. It allows users to import datasets, perform real-time data visualization, and analyze data using various chart types and machine learning techniques.

data-analysis data-visualization export hari7261 import python realtime-visualization

Last synced: 17 Jun 2025

https://github.com/wardenkenny/data-analyst-portfolio

A repository I have created to show and explore data analytics.

data-analysis excel r spreadsheets sql tableau

Last synced: 02 Apr 2025

https://github.com/victorlcastro-dsa/pbl-datacamp

This repository features projects from DataCamp's Project-Based Learning (PBL) courses, showcasing practical applications of data analysis, machine learning, and visualization. Explore real-world datasets and interactive results that highlight the skills gained through hands-on learning.

data-analysis data-science data-visualization datacamp-projects hypothesis-testing machine-learning project-based-learning

Last synced: 30 Jun 2026

https://github.com/vbhvsingh0/coulombic_dyn_formaltetra

The Python code simulates a formaldehyde tetra-cation molecule using Coulombic forces

data-analysis physics-simulation python shell-scripting

Last synced: 24 Jun 2026

https://github.com/hadeel-13/new_home

New Home is a website for buying and selling real estate based on user preferences. It is my graduation project, which received a grade of 93%.

bootstrap5 chartjs css css3 data-analysis data-mining google-maps html html5 javascript jquery

Last synced: 12 Apr 2026

https://github.com/wo0fle/sfrcp

The program used for a research study I conducted: "Comparison of Star Formation Rate in Spiral versus Elliptical Galaxies."

astronomy astropy data-analysis galaxy jupyter-notebook python research research-project

Last synced: 03 Apr 2025

https://github.com/leosimoes/digitalinnovationone-analise-datasets

Practical project "Data Analysis with Python and Pandas" from the "Banco Carrefour Data Engineer" bootcamp at Digital Innovation One.

data-analysis data-science python

Last synced: 24 Mar 2025

https://github.com/mindlessmuse666/train-test-splitter

Analysis of Titanic passenger data and splitting it into training and test sets. A practical assignment for the course "Fundamentals of Applying Artificial Intelligence Methods in Programming".

data-analysis data-preprocessing data-visualization machine-learning pandas python scikit-learn seaborn titanic train-test-split

Last synced: 12 Apr 2026

https://github.com/josephbarbierdarnal/matoolkit

matoolkit is a Python package containing a toolbox for creating visually appealing graphs and annotations in Matplotlib.

data-analysis data-visualization matplotlib

Last synced: 31 Mar 2025

https://github.com/wtmcgrew/sql-credit-risk-analysis

Credit Risk Analysis using SQL & Excel – Approval trends by FICO, DTI, PTI, LTV, and delinquencies.

case-study credit-risk data-analysis financial-analysis loan-applications portfolio-project sql sqlite underwriting

Last synced: 04 Jul 2025

https://github.com/sarthakagg29/sql-share-trading-analysis

Analysis of share trading transactions using SQL. Includes table setup, sample data, and a variety of queries to answer typical business questions about stocks and trading.

data-analysis dbeaver portfolio postgresql share-market sql

Last synced: 04 Jul 2025

https://github.com/adrianlardies/from-data-to-insight

This project creates and manages a MySQL database to analyze the performance of Bitcoin, Gold, and the S&P 500 in response to economic factors. It integrates historical data, executes advanced SQL queries, and visualizes key insights, showcasing the power of SQL and Python in financial analysis.

data-analysis data-science matplotlib pandas python seaborn sql

Last synced: 12 Apr 2026

https://github.com/odessaz/portfolio-projects

This is a repository I have created to showcase skills, share projects and track my progress in Data Analytics and Data Science

applied-mathematics data-analysis data-science excel jupyter-notebook matplotlib-pyplot pandas portfolio python r r-studio seaborn sql statistics

Last synced: 12 Apr 2026

https://github.com/noturlee/iris-dataanalyis

This project aims to classify Iris flowers into three species—setosa, versicolor, and virginica—based on their sepal and petal measurements using machine learning techniques. The dataset comprises 150 samples evenly distributed among these species

data-analysis data-modeling data-science data-structures-and-algorithms data-visualization

Last synced: 08 Apr 2025

https://github.com/codeslash21/tmdb_data_analysis

We analyzed the TMDB dataset, which contains details of around 11,000 movies, to find interesting facts about it.

data-analysis data-visualization matplotlib nanodegree-project numpy pandas python tmdb-movie

Last synced: 03 May 2026

https://github.com/doughtnerd/pod-old

Read and write Excel data

data data-analysis excel poi-library workbook

Last synced: 21 Jan 2026

https://github.com/busesimsek/dataanalysisportfolio

A compilation of my data analysis projects using SQL, Python, and Tableau.

data-analysis data-visualization python sql tableau

Last synced: 12 Jun 2025

https://github.com/itskshitija/lego-set-explorer

As a part of the Maven Analytics Lego challenge, I developed an interactive Power BI dashboard exploring the evolution of LEGO sets from 1970 to 2022.

data-analysis data-science data-visualization dataanalysis dataset powerbi powerbi-desktop powerbi-report

Last synced: 12 Jun 2025

https://github.com/avratanubiswas/fluorpenplugin

A MATLAB user interface for analysing OJIP curve datasets from the FluorPen instrument. It serves as an additional plug-in for "quick categorical analysis".

data-analysis fluorpen ojip-curve

Last synced: 18 Mar 2026

https://github.com/abhipatel35/diabetes_ml_classification

Predict diabetes using machine learning models. Experiment with logistic regression, decision trees, and random forests to achieve accurate predictions based on health indicators. Complete lifecycle of ML project included.

classification data-analysis data-science data-visualization descision-tree diabetes-prediction jupiter-notebook logistic-regression machine-learning model-evaluation open-source pandas pycharm-ide python random-forest scikit-learn

Last synced: 20 Jan 2026

https://github.com/hazim-hf/data-science

This course covers basic data science principles, Python programming, and the concept of big data and its types. It explores algorithms, methods, and analyses in data science with practical Python examples. Additionally, it highlights current data technologies for storing and archiving.

data-analysis data-wrangling time-series

Last synced: 04 Jul 2025

https://github.com/aroramrinaal/spotistats

Spotistats is a data analysis and visualization project based on your Spotify streaming history.

data-analysis numbers spotify spotify-history visualization

Last synced: 15 Mar 2025

https://github.com/0-mostafa-rezaee-0/sandwich_structures

Impact test of Sandwich Structures

composite-materials data-analysis r

Last synced: 09 Aug 2025

https://github.com/lexiortiz/advanced-data-analytics

Structured learning notes, code snippets, and key takeaways from the Google Advanced Data Analytics Professional Certificate. Serves as a personal reference for reinforcing concepts and as a resource for others on a similar learning journey.

data data-analysis data-engineering google python-3 sql

Last synced: 29 May 2026

https://github.com/syed-m-nofel/python-data-science-fundamentals

Python notebooks for data manipulation (Pandas/NumPy) and API workflows – from basics to practical examples.

api beginner-friendly data-analysis data-science http-requests jupyter-notebook numpy pandas pandas-dataframe python tutorial

Last synced: 03 May 2026

https://github.com/analysisbyvivek/Crime-data

Analyzes crime patterns across different areas, exploring factors such as crime type, weapon usage, demographic influences, and geographic distribution to uncover trends in frequency, correlations, and hotspots.

apache-superset data-analysis eda jupyter-notebook python

Last synced: 29 Jan 2026

https://github.com/thenazar9/user-behavior-email-campaign-analysis-sql

Analysis of user behavior and email campaign performance using BigQuery and Looker Studio, focusing on account creation trends, email engagement, and user segmentation.

analytics bigquery data-analysis data-visualization etl looker-studio sql structured-query-language

Last synced: 16 Oct 2025

https://github.com/amoghkori/working-with-apache-spark-mllib

Implemented Apache Spark MLlib to analyze a large car dataset, predict car selling prices, and gain insights into the car market.

amazon-web-services data-analysis data-visualization exploratory-data-analysis linear-regression machine-learning model-selection pyspark python random-forest sagemaker spark

Last synced: 13 Apr 2026

https://github.com/zachbateman/easy_plot

Easy Statistical Visualization in Python

data-analysis data-visualization graphics matplotlib python seaborn

Last synced: 18 Jan 2026

https://github.com/hassanislam463/data-cleaning-and-modelling-top-5-categories-analysis-forage

This project involves cleaning, merging, and analyzing datasets to identify the top 5 performing categories based on aggregate popularity scores. It includes cleaned datasets, a final merged dataset, visualizations, and a presentation summarizing the tasks and results. Tools used: Microsoft Excel, Python, and PowerPoint.

data-analysis data-visualization microsoft-excel

Last synced: 07 Jan 2026

https://github.com/junpenglao/spafv

SPAFV - Surface Profile Analysis for Free Viewing eye movement experiment in 2AFC task

data-analysis statistics temporal-logic

Last synced: 31 Mar 2025

https://github.com/anuragmudgal96/data-warehouse-project

Designing and implementing a modern data warehouse on SQL Server, covering ETL pipelines, dimensional modeling, and analytical reporting.

data-analysis data-engineering data-warehouse datawarehousing etl etl-job etl-pipeline sql sql-server

Last synced: 09 Oct 2025

https://github.com/mansogf/datascience_introduction

Introductory data science practice exercises

data-analysis data-science data-visualization graph

Last synced: 04 Apr 2025

https://github.com/ljadhav25/data-engineering-poc

This repository contains a beginner-level Data Engineering Proof of Concept (POC) project designed for practice. The objective is to provide hands-on experience with data engineering concepts, including data extraction, transformation, loading (ETL), and basic data analysis. This project is ideal for those looking to build foundational skills in da

data-analysis etl matplotlib numpy pandas python

Last synced: 13 Apr 2026

https://github.com/cyberoctane29/epa-carbon-monoxide-aqi-analysis

This project continues my EPA Air Quality AQI Analysis, focusing on carbon monoxide levels in EPA data. Using Python, I applied statistics, probability analysis, outlier detection, sampling, and hypothesis testing to assess pollution and health impacts. Leveraging Pandas, NumPy, SciPy, and Matplotlib, it supports environmental policy decisions.

data-analysis eda hypothesis-testing probability-distribution sampling sampling-distribution statistical-analysis

Last synced: 24 Mar 2025

https://github.com/wojtekdomino/titanic-eda

Exploratory Data Analysis (EDA) of Titanic dataset using Pandas, Matplotlib, and Seaborn.

data-analysis eda matplotlib pandas python seaborn

Last synced: 10 Jun 2025

https://github.com/extwiii/datascience-jhu

Ask the right questions, manipulate data sets, and create visualizations to communicate results - Coursera

biostatistics data-analysis data-science linear-regression multivariate-regression r r-programming toolbox visualization

Last synced: 05 Jul 2025

https://github.com/jameswrigley/laph

A node-based data analysis program.

cpp data-analysis nodes qml

Last synced: 05 Jun 2026

https://github.com/lunafrost-lab/berry-donut

Exploring berry combinations to produce Donut in Pokémon Legends: Z-A: Mega Dimensions.

data-analysis data-filtering parquet pokemon winforms

Last synced: 13 Jan 2026

https://github.com/manel15279/datamining-project

A university project that aims to explore various data mining techniques like Data Exploration, Association Rule Mining, Supervised and Unsupervised Learning, applied to real-world datasets, focusing on soil fertility analysis and COVID-19 cases evolution over time.

covid-19 data-analysis data-mining data-visualization datascience gradio machine-learning python soil-properties

Last synced: 10 Jun 2025

https://github.com/evan-dg31/data-science

Exploratory Data Analysis (EDA), Predictive Modeling (Supervised and Unsupervised), Regression, Classification, Clustering

classification clustering data-analysis data-science data-visualization machine-learning matplotlib numpy pandas python regression-analysis seaborn

Last synced: 13 Apr 2026

https://github.com/ibrahimhabibeg/national-university-of-singapore-sms-analysis

Analysis of SMS messages collected by the National University of Singapore

analytics data-analysis data-science nlp python

Last synced: 13 May 2026

https://github.com/lucaso21/euro-2021-player-stats-analysis

A short project analyzing stats for players at the Euro 2021 tournament.

data-analysis data-science r rvest tidyverse

Last synced: 16 Mar 2025

https://github.com/lijesh010/covid-19_global_analytics_power_bi_project

This repository is a data visualization project that offers an in-depth analysis of the Covid-19 pandemic using Microsoft Power BI. This interactive dashboard provides valuable insights into key metrics related to Covid-19 cases, deaths, recoveries, and more, helping users understand the global impact of the pandemic.

dashboard data-analysis data-visualization powerbi report

Last synced: 08 Jan 2026

https://github.com/singhrdeep/croppilot

CropPilot is a lightweight, Python-based command-line tool designed to help small-scale farmers, gardeners, and students manage crop data, track profits, and explore sustainable practices. Built for usability and extensibility.

agriculture data-analysis farm-management open-source python

Last synced: 25 Apr 2025

https://github.com/prakashjha1/new-analysis-using-llm-locally

An interactive news analysis tool built with Streamlit and local LLMs. This app allows users to analyze and gain insights from the latest news articles using advanced language models, all running locally. Explore trends, sentiment, and key topics with an intuitive interface.

artificial-intelligence data-analysis data-science llms ollama python streamlit

Last synced: 14 Mar 2025

https://github.com/kittonn/data-analysis-freecodecamp

freeCodeCamp data analysis projects.

data-analysis freecodecamp

Last synced: 05 Apr 2025

https://github.com/mxagar/data_science_udacity

My personal notes, code and projects of the Udacity Data Science Nanodegree.

dashboard data-analysis data-engineering data-science machine-learning-pipelines

Last synced: 09 Apr 2025

https://github.com/matteospanio/speed-analysis

A project to analyze internet speed.

bash-script data-analysis

Last synced: 03 May 2026

https://github.com/hemangsharma/streamingcontentanalyzer

This Streamlit application provides an interactive dashboard for analyzing streaming content data. It allows users to explore movie and TV show ratings, distributions, temporal trends, and genre breakdowns through various visualizations and filters.

dashboard data-analysis data-science data-visualization python streamlit-dashboard streamlit-webapp

Last synced: 02 Apr 2025

https://github.com/syarwinaaa09/analyzing-crime-in-los-angeles

Exploratory data analysis of Los Angeles crime data with insights on temporal patterns, locations, and age demographics.

crime-data data-analysis eda los-angeles pandas public-safety python visualization

Last synced: 03 May 2026

https://github.com/mchirico/go_slicestore

Pull Data from Slice Store

data-analysis go ibm

Last synced: 16 Mar 2025

https://github.com/samruddhi3012/screen-time-analysis

Hi! This repo demonstrates a Python project on screen time analysis.

data-analysis data-visualization python

Last synced: 04 May 2026

https://github.com/marianamartiyns/rfm-cluster-analysis

Customer behavior and sales analysis, including data cleaning, RFM calculation, churn analysis and customer clustering.

cluster-analysis data-analysis data-cleaning data-visualization pyhton

Last synced: 16 Mar 2025

https://github.com/akashvarma26/data-analysis-on-olympics-csv-dataset

Data analysis of an Olympics dataset in CSV format using Python's re module and pandas in a Jupyter notebook.

data-analysis jupyter-notebook pandas regex

Last synced: 02 May 2026

https://github.com/luminati-io/Walmart-dataset-samples

A sample dataset of over 1000 Walmart products, extracted using the Bright Data API, ideal for consumer market insights and competitor analysis.

api data-analysis dataset walmart walmart-scraper web-scraping

Last synced: 09 Apr 2025

https://github.com/reddyprasade/r-program

R is a programming language and free software environment for statistical computing and graphics supported by the R Foundation for Statistical Computing. The R language is widely used among statisticians and data miners for developing statistical software and data analysis.

data-analysis data-science r-programming

Last synced: 11 Apr 2026

https://github.com/shahriarha/sql

Structured query language

data-analysis mysql mysql-database sql

Last synced: 02 Sep 2025

https://github.com/leandrocollares/nyc-film-permits

NYC film permits: an exploratory data analysis

data-analysis data-visualization pandas plotly

Last synced: 05 Jul 2025

https://github.com/fatihilhan42/the-office-eda

Data analysis study of my favorite sitcom, The Office (US).

data-analysis data-science data-visualization fatihilhan office python sitcom

Last synced: 04 May 2026

https://github.com/devexpress-examples/wpf-pivot-grid-define-custom-cell-template-to-performing-data-editing

This example shows how to edit a cell with the cell editing template in Pivot Grid for WPF.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 02 May 2026

https://github.com/shoebjoarder/superstore

A Dash app to analyze Superstore dataset.

dashboard data-analysis data-visualization python-3

Last synced: 02 Apr 2025

https://github.com/shellynagar27/business-insights-360-project

A comprehensive dashboard that provides a better understanding of the business's market standing, key focus areas for optimization, underperforming customers, and year-wise financial insights, aiding inventory planning and performance tracking. It can also be used to answer any number of "why" questions as situations arise.

dashboard data-analysis data-visualization dax-languague dax-studio excel performance-optimization power-bi reporting sql storage-manager

Last synced: 27 Jan 2026

https://github.com/nurulashraf/customer-segmentation-hierarchical-clustering

A customer segmentation project using hierarchical clustering to group customers based on their spending behaviour and demographics. This helps businesses identify patterns and create targeted marketing strategies.

business-analytics clustering-algorithm customer-segmentation data-analysis hierarchical-clustering machine-learning python unsupervised-learning

Last synced: 18 Apr 2025

https://github.com/soham7998/data-analysis-projects

Data analysis projects I have completed, each of which gave me hands-on experience. The projects showcase different concepts, visualizations, and more.

data data-analysis data-science machine-learning nlp python soham visualization

Last synced: 04 May 2026

https://github.com/lopes51789/salaryanalysis

This salary dataset is a good candidate for descriptive analysis, and we can identify which demographics experience reduced or increased salaries. For example, we could explore the salary variations by gender, age, industry, and even years of prior work.

data-analysis json mysql python3 sql tableau

Last synced: 13 Apr 2026

https://github.com/mehedi-hassan81/mastercourse

Data analysis project analysing renewable energy production across 212 countries, visualizing trends with Tableau. Highlights China's dominance (2,894 TWh) and Paraguay's 100% renewable share.

data-analysis pandas python renewable-energy selenium tableau-dashboards tableau-public web-scraping

Last synced: 08 May 2026

https://github.com/kailenroa/dashboad-excel-huisprijzen

This project focuses on developing a dashboard powered by Funda to visualize house pricing in the Netherlands. The dashboard simplifies the home-buying process by allowing users to compare prices, energy labels, number of rooms, and square meters across different provinces, all in one interactive platform.

dashboard data-analysis excel house-prices

Last synced: 05 Jan 2026

https://github.com/annnieglez/nlp-stock-market-and-news

This project focuses on detecting fake news from news headlines using advanced Natural Language Processing (NLP) techniques. It combines sentiment analysis with news headline embeddings, generated from Hugging Face transformer models, to train a binary classification model that distinguishes between real and fake news.

classification-model data-analysis embeddings machine-learning machine-learning-models nlp nlp-deep-learning nlp-machine-learning python scraping-websites sentiment-analysis

Last synced: 25 Apr 2026

https://github.com/mr-chang95/sf_data_visualization

In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.

business data-analysis data-visualization jupyter-notebook pandas python san-francisco

Last synced: 04 May 2026

https://github.com/ravi-prakash1907/covid-19-china

A data science research project to understand the growth rate of the novel coronavirus.

china coronavirus covid-19 data-analysis data-mining data-science mathematical-modelling project r research research-paper

Last synced: 06 Sep 2025

https://github.com/chiragkumargohil/co2-emissions-data-analysis

A Python programme that analyses CO2 emission data from 1997 to 2010. This programme prints data, provides a brief summary of a given year, displays and compares Year vs. Emission graphs for chosen countries, and generates a separate data file for chosen countries. It was a self-paced project that Guru 99 provided.

co2-emission data-analysis matplotlib python

Last synced: 28 Aug 2025

https://github.com/felpzreiz/stockdata_pipeline

This project develops a data pipeline that consumes financial information from a US stock market API (StockData.org) for analysis and processing. Using Python and libraries such as pandas, matplotlib, and pyarrow

api data-analysis data-science jupyter-notebook pandas python

Last synced: 19 Apr 2026