An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/harshals499/ecosecure-visualization

Data visualization project using Qlik to analyze sales performance for EcoSecure Systems.

business-intelligence data-analysis data-visualization qlik-sense sales-analysis

Last synced: 12 Jun 2026

https://github.com/Narius2030/Hive-DataWarehouse-Analysis

Implements a Hive data warehouse to store meaningful data and applies machine learning techniques such as clustering or regression to business problems.

apache-hadoop apache-hive data-analysis etl-pipeline hiveql machine-learning statistics

Last synced: 12 Aug 2025

https://github.com/tqhungdev0605/crawl_200_jd_dataanalyst

Automate job data scraping for 200 Data Analyst postings on https://vn.indeed.com using Python

data-analysis jupyter-notebook python3 scraping selenium

Last synced: 11 Apr 2026

https://github.com/md-emon-hasan/1-simple-stock-price-ml-app

A simple machine learning application for stock prices, demonstrating data preprocessing, model training, and deployment using scikit-learn.

data-analysis data-science eda ml-app streamlit-webapp time-series time-series-analysis webapp

Last synced: 31 May 2026

https://github.com/faris771/investigate_a_dataset

This repository contains a Jupyter Notebook that investigates a dataset using data analysis techniques.

data-analysis

Last synced: 29 Apr 2026

https://github.com/sujitmahapatra/blinkit-analysis-dashboard-powerbi

Power BI dashboard analyzing Blinkit sales data to address challenges in Quick Commerce, examining outlet types, establishment years, and customer demographics to uncover insights and trends.

data-analysis powerbi powerbi-dashboards

Last synced: 28 Jan 2026

https://github.com/varshithdupati/yelp-business-analysis

Big Data analysis on Yelp reviews/businesses for Arizona. Using Hadoop, Spark, PySpark.

arizona-state-university big-data big-data-analytics data-analysis hadoop pyspark spark yelp

Last synced: 04 May 2026

https://github.com/reusjimenez/powerbi-data-analysis

Interactive dashboards built in Power BI, focused on data analysis and effective visualization. 📊

business-intelligence dashboards data-analysis dax power-query powerbi

Last synced: 28 Jan 2026

https://github.com/nirmalvatsyayan/data-analyst-nanodegree

Udacity data analyst nanodegree project submissions and learning

data-analysis numpy pandas python statistics udacity-data-analyst-nanodegree

Last synced: 12 Apr 2026

https://github.com/jku-vds-lab/loops

Loops is a JupyterLab extension to support iterative and exploratory data analysis in computational notebooks.

data-analysis data-science data-visualization jupyter jupyter-notebook notebook provenance

Last synced: 29 Jan 2026

https://github.com/titanscouting/tra-analysis

Titan Robotics 2022 Strategy Team Analysis Repository

data-analysis frc frc-scouting hacktoberfest python

Last synced: 29 Jan 2026

https://github.com/rcv911/lyapunov-indicators

Calculating Lyapunov indicators with multiprocessing in Python

data-analysis lyapunov lyapunov-indicators multiprocessing

Last synced: 18 Jan 2026

https://github.com/robinmillford/analytics_for_fashion_supply_management

This Streamlit dashboard provides a comprehensive analysis of supply chain data, focusing on key metrics such as production volumes, stock levels, order quantities, revenue, manufacturing costs, lead times, shipping costs, transportation routes, risk factors, and sustainability factors

dashboard data-analysis data-visualization streamlit supply-chain-management

Last synced: 07 Sep 2025

https://github.com/rani-sikdar/pwc-virtual-internship-powerbi

Comprehensive Power BI dashboards showcasing insights on Call Centre Trends, Customer Retention, and Diversity & Inclusion to drive business impact.

business-analytics business-intelligence data-analysis data-cleaning data-visualization interactive interactive-visualizations powerbi

Last synced: 07 Jan 2026

https://github.com/bcko/ud-da-stroopeffect

Udacity Data Analyst Nanodegree Project : Test a Perceptual Phenomenon (Stroop Effect)

data-analysis data-analyst-nanodegree stroop-effect udacity udacity-data-analyst-nanodegree

Last synced: 04 Jul 2025

https://github.com/as16082023/restaurant-order-analysis

Analyzing order data to identify the most and least popular menu items and types of cuisine

data-analysis maven-analytics mysql restaurant-order sql

Last synced: 10 Apr 2025

https://github.com/nafisalawalidris/hotel-reservation-analysis

This project analyses hotel reservation data from Resort Hotels and City Hotels to uncover booking trends and insights. Utilising Microsoft Excel for initial data cleaning, PostgreSQL for data analysis and Tableau for creating visualisations, the project aims to deliver a comprehensive dashboard that highlights key metrics such as booking status.

data-analysis data-cleaning data-visualisation hotel-reservations microsoft-excel postgresql sql tableau tableau-dashboards tableau-desktop tableau-public

Last synced: 06 Jul 2025

https://github.com/arielle0222/data_analysis

📊 Data analysis projects for autonomous driving and smart mobility engineering using Python and SQL.

autonomous-driving composite data-analysis electric-vehicles environmental-data python visualizatoin

Last synced: 30 Apr 2026

https://github.com/tbep-tech/verified-wbids

Materials for evaluation of verified WBIDs in the Tampa Bay watershed

data-analysis open-science tampa-bay tbep water-quality

Last synced: 19 Feb 2026

https://github.com/matthewgrosman/messenger-analytics

Project that ingests Facebook Messenger conversations and generates analytics.

analytics data-analysis excel facebook facebook-messenger java mongodb

Last synced: 15 Apr 2025

https://github.com/virajbhutada/walmart-retail-analyzer

Gain valuable insights into retail sales with the "Walmart Retail Performance Dashboard" in MS Excel. This user-friendly tool facilitates an in-depth analysis of key sales metrics, providing a comprehensive view of Walmart's performance. Make data-driven decisions for informed and strategic business outcomes.

analytics data-analysis data-science data-visualization excel insights interactive-visualizations performance-analysis retail-sales walmart

Last synced: 04 Mar 2026

https://github.com/alrza2003/alrza2003.github.io

This repository contains the source files for my personal portfolio website. It highlights my background as a data analyst and radiology student, and showcases real-world projects, tools I use, and ways to connect with me. The site is based on a pre-built template that I customized to reflect my profile and experience.

data data-analysis data-visualization portfolio portfolio-website python

Last synced: 30 Apr 2026

https://github.com/rogernet/desafio-profissional-produto-data-driven

Helps train Product Analysts, PMs, and Business Managers capable of making strategic, data-driven decisions.

data-analysis data-science data-visualization product

Last synced: 23 Jun 2026

https://github.com/avijit-jana/redbus-data_scraping_and_filtering_with_streamlit_app

A Streamlit-based application leveraging Selenium to automate data scraping from Redbus, enabling efficient collection, analysis, and visualization of bus travel data for improved operational efficiency and strategic planning in the transportation industry.

automation dashboard data-analysis data-visualization datadrivendecisions python3 redbus selenium-python streamlit-application webscrapping

Last synced: 15 Mar 2025

https://github.com/swarnim1812/crime_project

AI-Driven Crime Forecasting Across Indian States — A pioneering machine learning project that harnesses time series modeling (SARIMAX, Ridge Regression) to uncover patterns and forecast crime trends using real-world multi-state temporal and socio-economic data.

analytics crime-locator crime-prediction data-analysis deep-learning machine-learning prophet-facebook sarimax-model time-series-forecasting

Last synced: 31 Jan 2026

https://github.com/karthikmprakash/911-call-dataanalysis

Data Analysis of Emergency (911) Calls: Fire, Traffic, EMS for Montgomery County, PA

911-call-analysis data-analysis data-visualization python3 united-states-data

Last synced: 10 May 2026

https://github.com/hetuvpatel/research-chatgpt

Research and data analysis project evaluating the social, ethical, and educational impacts of ChatGPT using survey-driven insights and Python-powered data analysis. 📚🤖

data-analysis matplotlib pandas python seaborn

Last synced: 01 May 2026

https://github.com/kirkalyn13/open-signal-report-generator

Script that generates a results summary, including trends for flagged provinces, from the raw Excel data file.

data-analysis data-science data-visualization matplotlib numpy pandas python

Last synced: 19 Jun 2026

https://github.com/scarblase/salary-comparison

Submission for the DataCamp Salary Competition (1 level). 🏆

data data-analysis data-science data-visualization engineering python sql structured-data

Last synced: 01 May 2026

https://github.com/silvano315/med-physics

This is intended to be a repository about medical physics. It will be based on 4 paths: medical data to analyse, SOTA programs for medical purposes, computer vision, and eXplainability.

computer-vision data-analysis data-science explainable-ai medical-imaging medical-physics medical-tool

Last synced: 24 Mar 2025

https://github.com/bretsw/beds

Bookdown project for an open education resource (OER) book: Becoming Educational Data Scientists

analytics data-analysis data-analytics data-science

Last synced: 31 Mar 2025

https://github.com/lisa-ho/breadit

Repository for scraping and analysing data from the Reddit/Sourdough community to explore lockdown baking trends.

data-analysis data-viz nltk python reddit-api sentiment-analysis web-scraping

Last synced: 01 May 2026

https://github.com/whitehathackerpr/data-visualization-tool

This is a Python-based web application that allows users to upload datasets, analyze data, and create visualizations interactively. The tool is designed for ease of use and provides a simple interface to perform basic data analysis and generate visualizations

data data-analysis data-visualization python python3

Last synced: 05 Sep 2025

https://github.com/rara-ch/data-analysis-portfolio

This repository stores my data analytics projects, showcasing my skills in SQL and Python.

data-analysis mathematics matplotlib numpy pandas portfolio probability python seaborn sql statistics

Last synced: 12 Mar 2025

https://github.com/com-480-data-visualization/project-2023-choo-choo-data-darlings

This repository contains the source code for our data visualization project, an interactive platform designed to explore the intricate Swiss transportation network. Developed by the Choo Choo Data Darlings team at EPFL, the project provides an in-depth view into the vast array of Swiss transportation operations, including trains, buses, and trams.

boats buses data-analysis data-science data-visualisation data-visualization epfl metro public-transport public-transportation switzerland trains trams

Last synced: 01 May 2026

https://github.com/mathpfreitas/top-hits-spotify-2000-to-2019-

🎧 Top Hits Spotify (2000 - 2019): Explore music trends from 2000 to 2019 with this dataset of songs, artists, and genres. Use the insights to understand what makes a hit in today's music landscape. 🐙💻

analysis analytics chart data-analysis data-visualization exploratory-data-analysis hits interactivedashboards jupyter-notebook matplotlib music musical numpy pandas plotly python timeseries track-hits

Last synced: 01 Jul 2025

https://github.com/elissorokin/data-analyst-portfolio-rus

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and Data Science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 25 Feb 2026

https://github.com/nafisalawalidris/data-analysis-with-python

This repo features Jupyter Notebook labs for learning data analysis with Python. Explore data acquisition, wrangling, visualization, modeling, and evaluation. Enhance your skills in Python data analysis.

data-acquisition data-analysis data-science data-wrangling exploratory-data-analysis feature-engineering machine-learning model-development model-evaluation-and-refinement pandas

Last synced: 02 May 2026

https://github.com/emso-exe/reclamacoes_de_consumidores_com_empresa_de_telecomunicacoes

Analysis project on consumer complaints against a telecommunications company in the first half of 2021, based on data from the consumidor.gov.br website.

analise-de-dados ciencia-de-dados data-analysis data-science datascience python python-3 python3

Last synced: 02 May 2026

https://github.com/ac12644/fractz-ai-data-analyst

Analyze data and gain insights instantly with FRACTZ's AI Data Analyst. Flexible, fast analytics tailored to your needs.

ai data-analysis data-visualization

Last synced: 01 Feb 2026

https://github.com/melogabriel/nubank-expenses-analysis

This project consolidates monthly credit card statement data from Nubank into a single CSV file using Python, enabling data visualization through a Google Sheets dashboard in Looker Studio.

data-analysis data-visualization googlesheets lookerstudio pandas python

Last synced: 02 May 2026

https://github.com/allanotieno254/codsoft

This repository showcases a series of data science projects completed during an internship with CODESOFT. Each project utilizes Python and various machine learning techniques to solve specific problems in data analysis, classification, regression, and predictive modeling.

classification data-analysis data-science feature-engineering machine-learning model-evaluation predictive-modeling python-programming regression

Last synced: 15 May 2025

https://github.com/samruddhi3012/customer-behavior-analysis

Hello there! This repo contains a Python project based on E-Commerce Customer Behavior analysis.

customer-segmentation customerbehavior data-analysis ecommerce python

Last synced: 02 May 2026

https://github.com/m-faizan-mahmood/detailed-exploratory-data-analysis-eda-marketing-recomendations.

This project focuses on cleaning, preprocessing, and analyzing data using Pandas and NumPy. Key steps include handling missing values, removing outliers, feature engineering, and exploratory data analysis (EDA). Visualizations with Matplotlib and Seaborn highlight trends in customer spending, campaign performance, and product sales.

big-data data-analysis data-processing data-science eda exploratory-data-analysis numpy pandas python

Last synced: 11 Apr 2026

https://github.com/leosimoes/udacity-starbucks

Project 3 of the Udacity Machine Learning Engineer Nanodegree Program. Data analysis and machine learning application to Starbucks data.

aws-iam aws-s3 aws-sagemaker data-analysis data-science machine-learning python

Last synced: 24 Mar 2025

https://github.com/jpotter80/notebook-examples

This repository demonstrates a systematic approach to cleaning and standardizing e-commerce product data using DuckDB. The notebook serves as a detailed walkthrough of our data cleaning methodology, showcasing how we handle common data quality challenges in e-commerce datasets.

data-analysis data-cleaning jupyter-notebook

Last synced: 12 Jun 2025

https://github.com/pngo1997/axa-xl-insurance-bi-dashboard

Provides a comprehensive analysis of insurance submissions, approvals, compliance rates, and profitability for AXA XL Insurance.

bi-analytics bi-dashboard business-analytics data-analysis filtering performance-analysis powerbi segmentation visualization

Last synced: 08 Feb 2026

https://github.com/rayyan9477/youtube-spam-detection-with-flask-and-machine-learning

This is a web application built using Flask that detects spam comments on YouTube using a Naive Bayes classifier. It leverages techniques such as CountVectorizer for feature extraction and scikit-learn for machine learning. The application reads data from a CSV file and predicts whether a comment is spam or not.

data-analysis data-science machine-learning nlp-machine-learning spam-detection

Last synced: 21 Sep 2025

https://github.com/archived-blueprints/amazonathena-blueprints

Simplified blueprints for building data pipelines with Amazon Athena.

amazon-athena athena cli data-analysis data-engineering data-science elt etl

Last synced: 29 Jul 2025

https://github.com/ryanfranklin237/data-visualization-python

A tool that lets you visualize data from a CSV or Excel file as graphs or charts.

data-analysis data-science data-visualization matplotlib pandas-dataframe python

Last synced: 11 Jun 2026

https://github.com/dwidevelopes/database-input-pelanggran-mahasiswa

Inputs data on students who have committed violations, ready to be recorded and disciplined, and subject to sanctions.

aplikasi aplikasi-sekolah data data-analysis database input-method mahasiswa sekolah siswa siswi website

Last synced: 02 May 2026

https://github.com/anandanraju/youtube-data-api-model

The YouTube Analytics API enables you to generate custom reports containing YouTube Analytics data. The API supports reports for channels and for content owners. Report fields are characterized as either dimensions or metrics

analytics data-analysis data-science metrics model python telemetry youtube youtube-api

Last synced: 03 May 2026

https://github.com/ivanildobarauna-dev/api-to-dataframe

Python library that simplifies obtaining data from API endpoints by converting them directly into Pandas DataFrames. This library offers robust features, including retry strategies for failed requests.

data-analysis data-analytics data-engineering library pypi-packages python

Last synced: 06 Mar 2025

https://github.com/sing-group/bew

Public repository for Biofilms Experiment Workbench (BEW).

aibench data-analysis data-management java jfreechart workbench

Last synced: 03 Jul 2025

https://github.com/shivshah19/movie-recommendation-system

This Movie Recommendation System is designed to provide personalized movie recommendations based on user preferences.

cosine-similarity data-analysis machine-learning pandas python streamlit

Last synced: 03 May 2026

https://github.com/incubrain/awesome-maharashtra-data

A collection of datasets specific to Maharashtra, India. WIP

ai artificial-intelligence data data-analysis data-science datasets maharashtra marathi

Last synced: 23 May 2026

https://github.com/aryansharma5/data-visualization-and-thorough-analysis

Comprehensive guide to data analysis and visualization.

data-analysis data-visualization

Last synced: 18 Mar 2025

https://github.com/chouaib-629/customersegmentation

Hadoop-based Customer Segmentation project using the Online Retail Dataset. Implements MapReduce for processing and Python for preprocessing to uncover customer purchasing patterns for targeted marketing.

big-data customer-segmentation data-analysis data-science distributed-computing hadoop hadoop-mapreduce java mapreduce marketing-analytics python

Last synced: 04 May 2026

https://github.com/kyleprotho/analysistoolbox

Analysis Tool Box (i.e. "analysistoolbox") is a collection of tools in Python for data collection and processing, statistics, analytics, and intelligence analysis.

analytics data-analysis open-source-intelligence python3 r research snippets statistics

Last synced: 22 Aug 2025

https://github.com/sunnybibyan/random_data_generation

A project that generates a dataset using various statistical distributions (Normal, Uniform, Exponential, Random Integers, and Binomial) and performs data analysis. Includes visualizations and an option to export the data as a CSV file.

data-analysis data-visualization python random-data-generation statistics streamlit-webapp

Last synced: 13 Jun 2026

https://github.com/flexmonster/svelte-flexmonster

Svelte wrapper for Flexmonster Pivot Table & Charts

data-analysis data-visualization frontend pivot-tables svelte sveltekit

Last synced: 27 Feb 2026

https://github.com/akash1070/data-science-virtual-internship-by-anz

Exploratory data analysis and prediction of annual salary for customers from the dataset provided by ANZ.

data-analysis data-science predictive-analytics presentation-slides

Last synced: 24 Mar 2025

https://github.com/viztruth/google-play-store-data-analysis

This repository contains all the materials of my final project 'Google Play store Data Analysis' for the 'Telling Stories with Data' course at PES University.

data-analysis data-visualization

Last synced: 21 Aug 2025

https://github.com/ahmad-ali-rafique/handwritten-digit-recognition-mnist

This project demonstrates a complete pipeline for recognizing handwritten digits using the MNIST dataset. The project is implemented in Python using Jupyter Notebook, and it covers data loading, preprocessing, model training, and performance evaluation of a Fully Connected Neural Network (FCNN).

ai artificial-intelligence data data-analysis datascience deep-learning deep-neural-networks fcnn fully-connected-network machine-learning machine-learning-algorithms ml modeling

Last synced: 09 Jun 2026

https://github.com/happybono/sonatasmooth

Provides three different noise reduction algorithms for smoothing out data: Rectangular Averaging, Binomial Median Filtering, and Binomial Averaging. It processes data from a list and displays the results in another list.

algorithms average binomial binomial-coefficient binomial-theorem calibration csharp data-analysis data-calibration dynamic-noise-reduction median noise-algorithms noise-reduction noise-reduction-kernel outliers rectangular-averaging windows-desktop windows-desktop-application windows-forms winforms

Last synced: 30 Oct 2025

https://github.com/chelseammatta/nopd-cad-data-analysis

Analysis of 911 call data from New Orleans' 3rd & 4th police districts (2019-2022) using BigQuery

911-calls 911-data bigquery cad-data crime-analysis data-analysis emergency-response new-orleans public-safety sql

Last synced: 01 Jul 2025

https://github.com/arhcoder/base-hackathon-2022

💸 System that analyzes the purchase and sale invoices of an import-export company and builds a knowledge base from which it generates supply suggestions for Banco BASE's client companies, in order to save them money.

algorithms bank companies data-analysis decision-making exportation hackaton importation javascript mysql python suggestions

Last synced: 16 Apr 2026

https://github.com/cworld1/novel-analysis

A simple project for analyzing Chinese novels

data-analysis novel

Last synced: 17 Mar 2025

https://github.com/garcane/london-housing-price-dashboard

This Excel-based Housing Visual Dashboard provides a comprehensive view of average house prices across various boroughs in London from 1996 to 2013. The dashboard is designed to offer insights into housing market trends and price variations across different areas of London over time.

data data-analysis data-visualization excel visual

Last synced: 13 Feb 2026

https://github.com/shuddha2021/stellar-candidate-selector

A sophisticated candidate selection algorithm leveraging multi-criteria analysis and machine learning to identify top software engineering candidates. This tool features flexible filtering, score adjustment, and detailed visualizations to streamline the recruitment process.

candidate-selection data-analysis data-visualization machine-learning pandas plotting-in-python python python-data-analysis recruitment scikit-learn

Last synced: 05 May 2026

https://github.com/nikhilash45/power-bi-vsualisation-of-joins

In this Power BI report, users can visualize joins themselves, which makes joins easier to understand.

business-analytics business-intelligence data data-analysis data-visualization joins powerbi sql visualization

Last synced: 19 Mar 2026

https://github.com/emredurukn/data-analysis

Example notebooks for analyzing data

data-analysis data-visualization python

Last synced: 12 May 2026

https://github.com/prime-infinity/type-one

Software to visualize and analyze GitHub repos based on certain statistics such as stars, forks and issues

data-analysis data-visualization

Last synced: 03 Feb 2026

https://github.com/virajbhutada/google-stock-price-forecasting-lstm

Analyzing and predicting Google's stock prices through detailed data exploration and advanced LSTM models. This project involves data preprocessing, creating time-series sequences, constructing and training LSTM networks, and evaluating their performance to forecast future stock prices utilizing Python and Machine Learning libraries.

data-analysis data-science data-visualization future-prediction google-dataset google-stock-price-prediction google-stocks lstm-model lstm-neural-network machine-learning machine-learning-models matplotlib model-building model-training numpy python stock-forecasting

Last synced: 27 Feb 2025

https://github.com/githubuseraccountamazing/the-amari-project

a project in which I attempted to push some of the limits of stable-diffusion while taking some data along the way

ai ai-generated-images bash data-analysis machine-learning stable-diffusion textual-inversion

Last synced: 05 May 2026

https://github.com/chiemekaifemegbulem/make.com

A curated portfolio of Make.com automation workflows engineered to streamline operations and ensure precision. Featuring solutions for e-commerce, data integration, marketing, and bespoke business processes, it exemplifies expertise in designing scalable, efficient, and dependable automated systems.

api automate automated automation business data-analysis data-science dataengineering integration integromat make scenario software-engineering upwork workflows

Last synced: 15 Feb 2026

https://github.com/mr-vozhyk/karpov.courses-study

A selection of assignments, mini-projects, and the final project from karpov.courses

airflow data-analysis git python sql statistics

Last synced: 05 May 2026