An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/codeonthespectrum/web-scrap

This project scrapes Wikipedia to collect data on the most populous municipalities in the state of Rio de Janeiro.

data-analysis data-visualization webscraping

Last synced: 16 Feb 2026

https://github.com/maazie-khan/olympics-data-enigeering

Worked with Azure Data Factory, Databricks, Data Lake Storage, and Synapse Analytics to build an ETL pipeline for processing and analyzing Olympic Games data from Kaggle.

azure big-data data-analysis dataengineering devops pipeline

Last synced: 13 May 2026

https://github.com/faint-liebfraumilch101/fraud-detection-sql-unsupervised

🕵️‍♂️ Detect fraud in bank transactions using SQL for feature engineering and Python's Isolation Forest for unsupervised anomaly detection.

anomaly-detection banking-data data-analysis data-science financial-analytics fraud-detection isolation-forest machine-learning portfolio-project python sql sqlite unsupervised-learning

Last synced: 07 May 2026

https://github.com/prasannnnn/real-time-share-price-scraping-and-analysis

The Stock Sentiment Analyzer is a web-based application built with Streamlit, BeautifulSoup, and Pandas to help users analyze the sentiment of a stock (BUY, SELL, or HOLD) based on its financial data. The tool extracts key financial metrics like Market Cap, Stock P/E, Dividend Yield, ROCE, ROE, and the 52-week High/Low from Screener.in.

beautifulsoup4 data-analysis python sentiment-analysis streamlit streamlit-dashboard webscraping

Last synced: 03 Aug 2025

https://github.com/darkdk123/house-valuation-model

A bootcamp challenge project to build an ML model that predicts house prices in Boston, Massachusetts, from multiple parameters using multivariable regression.

data-analysis data-science data-visualization matplotlib-pyplot multivariate-regression predictive-modeling statistics

Last synced: 07 Jul 2025

https://github.com/tashi-2004/apache-flink-spark-data-streaming

This project showcases a real-time data streaming pipeline using Apache Flink, Apache Spark, and Grafana. It streams data, stores it in Parquet format, and performs aggregations for insights, with seamless visualization via Grafana dashboards.

apache-flink apache-spark data-aggregation data-analysis data-science data-streaming data-visualization flink flink-stream-processing flink-streaming grafana-dashboard grafana-plugin pyflink python3

Last synced: 09 Feb 2026

https://github.com/niaid/genetic-linkage-analysis

Materials for ACE course on Genetic Linkage Analysis.

ace ace-uganda2020 analysis bcbb-training clinical data-analysis genetics ngs ngs-analysis

Last synced: 24 Jun 2025

https://github.com/mahdikh03/custumers_clustering_rmf

A data analysis project to implement RFM (Recency, Frequency, Monetary) analysis for customer segmentation and behavior analysis using the K-Means algorithm.

customer-segmentation data-analysis k-means-clustering unsupervised-learning

Last synced: 09 May 2025

https://github.com/muneeb1030/webscrapper_altnews

The project utilizes a combination of Python, Scrapy, and Selenium to navigate through the dynamic content of AltNews.in and collect valuable information for analysis and verification.

data-analysis data-collection python3 scrapy scrapy-spider selenium selenium-python

Last synced: 17 May 2026

https://github.com/macorisd/instagram-fake-account-analysis

A project in R focused on detecting fake Instagram accounts. It includes exploratory data analysis, data visualization, and analysis using three techniques: association rules, formal concept analysis, and regression. The results are presented in an interactive Quarto book.

data-analysis data-science data-visualization r

Last synced: 10 Jun 2025

https://github.com/0xnu/england-house-prices

Predict house prices for the next five years across all English local authorities.

data-analysis england england-house-prices housing-market housing-market-analysis predictive-modeling regression

Last synced: 03 Aug 2025

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data-wrangling scripts. Technologies used: Qt Designer and PyQt5 for the GUI, and pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/sadratehranian/prediction-of-covid-19-diagnosis

Build an algorithm in MATLAB using ML techniques to predict whether a person has COVID-19 based on existing medical conditions. Further research was conducted to identify the most suitable machine learning techniques and increase their prediction accuracy.

covid-19 data-analysis data-science data-visualization machine-learning matlab prediction visualization

Last synced: 11 Sep 2025

https://github.com/souravsuvarna/whatsapp-chat-analyzer-and-visualizer-web-application

The WhatsApp chat analyzer and visualizer uses NLP algorithms to analyze chat data, tracking usage patterns and presenting insights through visually appealing charts and graphs. It helps users understand communication patterns and behaviors on WhatsApp.

data-analysis data-science data-visualization python python3 streamlit

Last synced: 18 Apr 2026

https://github.com/shaikh-raj/data-science-portfolio

Data science portfolio of Raj Shaikh, including case studies and articles I have completed that solve various business problems.

articles case-study data-analysis deep-learning machine-learning nlp statistics

Last synced: 20 Jul 2025

https://github.com/lauratrigo/codigo_roti

Análise de ROTI (ROTI Analysis) is a MATLAB tool for processing and visualizing ionospheric data (ROTI) from multiple GNSS stations. Developed for space geophysics research, the script generates comparative time-series plots with quality filters and missing-data handling. 📡

data-analysis geophysics image-processing matlab roti scientific-initiation

Last synced: 24 Jun 2025

https://github.com/arction/lcjs-example-0507-dashboardfiberanalysis

A demo application showcasing the use of LightningChart JS to visualize fiber analysis data.

area-plot area-series chart charts dashboard data-analysis demo heatmap javascript lcjs lightningchart-js performance visualization webgl

Last synced: 12 Mar 2025

https://github.com/arv-anshul/pw-api

Perform data analysis on PW Skills APIs. Includes a Streamlit web app for viewing any course's syllabus, analytics, quizzes, and assignments.

api course data-analysis ineuron-ai physics-wallah project pw-skills python3 streamlit

Last synced: 18 Apr 2026

https://github.com/nishumehta/house-sales-analysis

House Sales Analysis Dashboard for King County, Washington, built with Tableau. Features interactive charts and maps to explore sales patterns, price distributions, and property conditions.

dashboard data-analysis data-visualization tableau tableau-dashboards tableau-public

Last synced: 11 Jan 2026

https://github.com/soumasish2005/ai-chatbot-using-snowflake

This project is a Streamlit application that allows users to upload a CSV file and ask questions about their data in natural language.

cloud data-analysis data-science data-visualization python snowflake streamlit

Last synced: 17 May 2026

https://github.com/tinaland101/python-api-challenge

This project involves analyzing weather data from cities around the world using the OpenWeatherMap API and creating visualizations to explore the relationship between weather variables and latitude.

api-integration-and-data-retrieval data-analysis data-collection-and-geospatial-analysis problem-solving-and-decision-making statistical-analysis

Last synced: 03 Mar 2025

https://github.com/satvikpraveen/pcc-vizforge

🎨 Personal data visualization toolkit generating synthetic datasets across multiple domains (random walks, dice simulations, weather patterns, earthquakes, GitHub analytics) with beautiful Matplotlib & Plotly visualizations. Includes Jupyter notebooks, interactive dashboards & statistical analysis. Perfect for learning data science! 🚀📊

analytics dashboard data-analysis data-generation data-science data-visualization github-analytics interactive-visualization jupyter-notebook matplotlib plotly probability python random-walk scientific-computing seismology statistical-analysis synthetic-data time-series weather-data

Last synced: 17 May 2026

https://github.com/balajimohan18/rafik-s-kitchen-data-analysis

This project analyzes the sales and expenses data of a famous fast-food restaurant. It focuses on gaining insights that will boost future sales and on business strategies to improve profit margins. Tools used: SQL, Python, Power BI, and MS Office.

business-analytics business-intelligence data-analysis data-analytics data-visualization eda ms-office powerbi-report powerpoint-presentations python sql-server

Last synced: 05 Apr 2026

https://github.com/srinibas-masanta/deloitte-forage-virtual-internship

This repository contains my work from the Deloitte Forage Virtual Internship, where I analyzed factory telemetry data in Tableau to identify machine breakdown patterns and assessed gender pay equality using Excel. From interactive dashboards to insightful classifications, this project showcases hands-on data analysis and visualization skills. 🚀📊

data-analysis data-visualization deloitte excel forage tableau

Last synced: 15 Jan 2026

https://github.com/mxagar/space_exploration

This repository is a collection of mini-projects and tutorials related to space images and geo-spatial data.

data-analysis deep-learning geospatial machine-learning

Last synced: 29 Sep 2025

https://github.com/gemaquejr/restaurant-orders

A project aimed at applying OOP concepts and working with Set, Hashmap, and Dict. It was created as the final assessment for section 06 of the computer science module of the Web Development course at Trybe.

data-analysis dict hashmap poo python set

Last synced: 30 Oct 2025

https://github.com/collins-kimotho/communicate-data-findings

Data Analysis Project: Investigating Factors Contributing to No-Show Appointments in Medical Records

data-analysis data-science data-visualization dataset pandas python

Last synced: 17 May 2026

https://github.com/rohitdusane/interactive-ibd-analysis-dashboard-with-dash-plotly

This repository showcases a project that combines data analysis and visualization through Dash and Plotly. The goal of this project is to offer an efficient and user-friendly way to integrate robust data analysis with an interactive web-based interface.

clinical-research data-analysis exploratory-data-analysis pyhton statistical-reports

Last synced: 24 Jun 2025

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/brownred/python-and-sql

Python and SQL (PostgreSQL & MySQL) for data analysis.

data-analysis databases python3 sql

Last synced: 11 May 2026

https://github.com/sotirismos/pattern-recognition-labs

Lab exercises and quizzes for Pattern Recognition course, Auth winter semester 20-21

classification clustering data-analysis machine-learning pattern-recognition

Last synced: 17 Jun 2025

https://github.com/rachelresende/regressaolinear

This repository is for the linear regression lessons I took in a Udemy course on the subject in 2025. It was a refresher course, since I also studied this topic in 2020 in a statistics course from Alura.

data-analysis data-science linear-regression

Last synced: 11 Sep 2025

https://github.com/haroontrailblazer/user_behavioral_analysis

Social Media User Engagement Analysis Using Power BI

data-analysis data-science data-visualization database powerbi

Last synced: 29 Mar 2025

https://github.com/mainak-97/pizza-sales-analysis-project

Pizza Sales Analysis Project: This project optimizes a pizza restaurant's operations by analyzing demand patterns, revenue, and efficiency, providing insights to enhance profitability, streamline production, and improve customer satisfaction.

business-analytics business-intelligence dashboards data-analysis operations-optimization peak-hours power-bi restaurant-analysis revenue-analysis

Last synced: 06 Jan 2026

https://github.com/edumoraes1/journey_active_users

Customer-base segmentation via SQL for the active-seller journey

bq data-analysis salesforce sql

Last synced: 02 Feb 2026

https://github.com/hari00887/analysis-of-global-terrorism

Analysis of Global Terrorism Using AHP: a quantitative study of GTD data to assess attack severity and evolution across time and space.

data-analysis data-visualization powerbi

Last synced: 02 Mar 2026

https://github.com/nitins17/tableauvisualizations

Visualizations I created while learning to work with Tableau

data-analysis data-science data-visualization tableau visualization

Last synced: 01 Mar 2026

https://github.com/parth-jatav/ipl-data-analysis-mentorness

This project uses Power BI to analyze IPL cricket data, featuring dashboards with insights on batting averages, strike rates, and player roles. It identifies the top 11 players and includes navigable pages focused on specific roles like Anchors, Finishers, and All-Rounders.

dashboard data-analysis ipl ipl-dashboard powerbi

Last synced: 07 Mar 2026

https://github.com/monish-nallagondalla/cement_strength_prediction

The Cement Strength Prediction project uses machine learning to predict the compressive strength of cement based on its components, such as Cement, Fly Ash, Water, Superplasticizer, Coarse Aggregate, Fine Aggregate, and Age. The goal is to forecast compressive strength (MPa) for optimized cement production and quality control.

cement-strength-prediction construction-industry data-analysis data-preprocessing data-science data-visualization feature-engineering machine-learning predictive-modeling python regression-analysis scikit-learn

Last synced: 11 May 2026

https://github.com/anandanraju/sql_data_analysis_projects

These two projects analyze pizza data and Walmart sales data using SQL to identify insights and trends. The aim is to take a data-driven approach to understanding sales performance, identifying key factors influencing sales, and providing actionable recommendations for business improvement.

csv data-analysis data-management mysql pizza sql sql-schema walmart

Last synced: 24 Jun 2025

https://github.com/dionixius7/titanic-disaster-ml-model

This project predicts the survival of passengers on the Titanic using the Kaggle Titanic Disaster Dataset. The dataset contains passenger information such as age, gender, and class. Different machine learning algorithms have been applied for this predictive model to accomplish an accurate prediction that will define the survival chances

data-analysis data-science data-visualization eda knn-classifier machine-learning neural-network python scikit-learn svm tensorflow titanic-kaggle titanic-survival-prediction

Last synced: 07 Feb 2026

https://github.com/surajsanap/employee-resigning-analysis-powerbi-dashboard-data-analytics

Effortlessly analyze employee resignations with our concise Power BI dashboard. Download the XML file, open the dashboard, and gain quick insights into resignation trends and reasons for departure. Streamlined and effective.

dashboard data-analysis data-analytics powerbi python xml-dataset

Last synced: 08 May 2025

https://github.com/arv-anshul/pw-experience-portal

Data Analysis on PW Skills and Ineuron.ai experience/internship portal.

data-analysis experience ineuron-ai internship physics-wallah portal pw-skills python3

Last synced: 16 Apr 2026

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/sangampaudel530/bhutan-rainfall-explorer

Interactive dashboard to explore, analyze, and forecast rainfall trends in Bhutan (2021–2025) using Streamlit, Plotly, and Prophet.

bhutan climate-change data-analysis prophet-facebook rainfall-prediction streamlit visualization

Last synced: 17 May 2026

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 09 Apr 2026

https://github.com/wesleych3n/my-work-log

A personal project to record and analyze work check-in/check-out times in Google Sheets using a Telegram bot.

data-analysis telegram-bot worklog

Last synced: 20 Jul 2025

https://github.com/rahulsm20/trackbyte

A full-stack web application that helps users keep track of their playlist and provides analytics based on their music taste. Built using React, Node.js, Express.js, MySQL and Bootstrap.

bootstrap data-analysis expressjs mysql nodejs reactjs sql

Last synced: 07 Apr 2026

https://github.com/asghar-rizvi/hotel_reservation_data_analysis

This project involves a comprehensive data analysis of a hotel reservation dataset using Excel. The primary focus is on examining reservation cancellations through detailed analysis and visual representation.

dashboard dashboard-templates data-analysis data-analysis-excel data-representation data-science excel

Last synced: 02 Mar 2026

https://github.com/pylena/movies-prediction

This project focuses on clustering movies based on their genres using machine learning techniques. By analyzing genre data, the model groups similar movies together, facilitating recommendations and insights into genre-based patterns.

data-analysis machine-learning render streamlit unsupervised-learning

Last synced: 18 May 2026

https://github.com/judyway2/de-data

A brief analysis of schools' ARR data

data-analysis jupyter-notebook

Last synced: 11 May 2025

https://github.com/natanel567/university_machine_learning_project

Machine Learning final project, Tel Aviv University

data-analysis jupyter-notebook machine-learning

Last synced: 11 May 2025

https://github.com/prakhar-code/british_airways_review_analysis

Analysis of the British Airways Reviews by Customers, filtered by several different factors such as food, entertainment, services, etc.

data-analysis data-cleaning excel tableau-dashboards tableau-public tableau-visualization

Last synced: 15 Jan 2026

https://github.com/deliprofesor/2024-salary-analysis-for-machine-learning-engineers

This project analyzes a salary dataset to explore factors like experience, company size, remote work ratio, and country. It includes data cleaning, group analysis, visualizations, and machine learning models (linear regression and Random Forest) to predict salaries and identify key features.

data-analysis data-cleaning data-visualization ggplot2 linear-regression machine-learning plotly r-programming random-forest salary-prediction salary-trends

Last synced: 07 Mar 2026

https://github.com/lyubov0406/data_analyst_portfolio

This repository contains pet projects that demonstrate my data analytics skills.

data-analysis matplotlib numpy pandas portfolio python scipy seaborn sql tableau visualization

Last synced: 09 Apr 2026

https://github.com/marcosvbras/udacity-nd109-project-titanic

Data analysis project for the Udacity Nanodegree course: Artificial Intelligence Programming with Python.

data-analysis data-analyst-nanodegree data-science jupyter-notebook machine-learning python udacity

Last synced: 19 May 2026

https://github.com/ziaeemehr/neuro_toolbox

Single Header File C++ library for analysis of neurophysiological and simulated data.

data-analysis data-science signal-processing synchronization

Last synced: 21 Jul 2025

https://github.com/rafinha0rafinha/web-analyzer-backend

(Legacy) This is the backend for Mazaoro SARLU's lead magnet "Web Analyzer". This project analyzes websites using Google Lighthouse and returns a detailed report consumed by the frontend.

azure-app-service azure-devops chartjs cicd data-analysis data-science data-visualization express flask hacktoberfest lighthouse numpy sentiment-analysis vader-sentiment-analyzer

Last synced: 10 Apr 2026

https://github.com/mfakhriazhar/stock-price-prediction

Stock prices are highly volatile and influenced by various factors, making accurate prediction a major challenge in investment decisions.

data-analysis data-science deep-learning python recurrent-neural-networks

Last synced: 18 May 2026

https://github.com/aelmah/ibm-applied-ds

A collection of projects I've done throughout the Applied DS Specialization.

applied-data-science-capstone beautifulsoup data-analysis data-visualization machine-learning python-for-ai-and-data-science web-scraping

Last synced: 11 Sep 2025

https://github.com/fatihilhan42/wnba-draft-player-dataanalysis-1997-2022-with-python

In this project, the statistics of the players in the WNBA drafts from 1997 to 2022 were examined. The data in the dataset, which you can find in the repo, was first organized using data cleaning algorithms. The cleaned data was then presented graphically using data visualization algorithms.

data-analysis data-analysis-python data-visualization jupyter-notebook python

Last synced: 17 May 2026

https://github.com/spring-0/netflix-media-data-analysis

Exploring and analyzing Netflix data to uncover trends through data visualization and statistical analysis.

data-analysis netflix

Last synced: 27 Mar 2025

https://github.com/jasonsu131/cps188-term-project

A data analysis program developed in C to extract information about diabetic patients across Canada from a governmental spreadsheet available online. The program showcases summaries and averages based on the extracted data.

c data-analysis data-statictics file-reading

Last synced: 28 Mar 2025

https://github.com/dhruvil-26/sql-projects

This repository contains SQL projects focusing on data analysis and insights. Currently, it includes: 1. RSVP Movies Analysis - SQL queries to analyze movie trends, ratings, and genres. 2. Pizza Sales Analysis - SQL queries to explore sales patterns, customer behavior, and profitability in a pizza business.

analysis data-analysis database mysql pizza-sales-analysis rdbms rsvp sql

Last synced: 17 May 2026

https://github.com/leoz0214/foodhygieneanalysis

Data analysis regarding Food Hygiene Ratings in England, Wales and Northern Ireland.

data-analysis food-hygiene-ratings pandas python

Last synced: 17 May 2026

https://github.com/PanosChatzi/Healthcare_and_Bioinformatics_Analyses

This repo contains the final assignments of the Data Analyst bootcamp by Workearly. Python and SQL were used to complete the assignments.

data-analysis data-cleaning data-visualisation jupyter matplotlib pandas python seaborn

Last synced: 05 Aug 2025

https://github.com/velut/thesis-sw

Software and datasets used in the "Cost-effective and Scalable Activity Matching using Crowdsourcing" thesis

bpmn cost crowdflower crowdsourcing data-analysis dataset performance-analysis plotting-algorithms r thesis

Last synced: 19 Jun 2025

https://github.com/mae776569/weratedogs-wrangling

Wrangling WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations

data-analysis data-science data-visualization tweets twitter-api

Last synced: 25 Jan 2026

https://github.com/dadvaiahpavan/ai-data-scientist-

AI-powered tool for dataset analysis, featuring data preprocessing, classification, regression, anomaly detection, and text analysis. Built with scikit-learn, pandas, and Plotly for visualization. Includes an interactive Streamlit web interface for real-time data analysis.

ai anomaly-detection classification data-analysis data-science machine-learning panda plotu regression scikit-learn sentiment-analysis streamlit

Last synced: 03 May 2026

https://github.com/iamber12/stack-overflow-analysis-using-stack-exchange-api

This Python-based project utilizes the Stack Exchange API to analyze StackOverflow data, focusing on the 'R' and 'Dot Net' programming tags.

data-analysis data-visualization python stack-exchange-api

Last synced: 20 Jul 2025

https://github.com/sandergi/ekichabi

A digital phonebook to connect subsistence farmers in Tanzania. Works via USSD so farmers without an internet connection can use it (via their telecom provider). Built with Django in Python and a MySQL database. This is a public copy of the private repo with user information stripped.

android data-analysis ict4d research ussd

Last synced: 14 May 2026

https://github.com/mfakhriazhar/ecom-qtt-prediction

In e-commerce, understanding seasonal sales trends and best-selling products is critical to business strategy. However, companies often struggle with predicting sales, determining factors that influence sales (discounts, product categories, locations), and optimizing stock and marketing.

data-analysis data-science data-visualization e-commerce-project eda machine-learning python

Last synced: 19 May 2026

https://github.com/c17an/data-analysis-exercise

Data analysis training ground

data-analysis python3

Last synced: 05 Apr 2025

https://github.com/kenwuqianghao/scotiabank-datathon-2023

Code and data analysis done for 2023 Scotiabank Datathon

data-analysis fraud-detection jupyter-notebook python

Last synced: 18 May 2026

https://github.com/uofuepibio/intro-r-ggplot2-quarto

Introduction to R via ggplot2 and quarto

data-analysis ggplot2 quarto r r-programming rmarkdown workshop

Last synced: 29 Jun 2026

https://github.com/dsite42/simple_data_visualizer

This is a simple tool to visualize data for a quick Exploratory Data Analysis (EDA). You can create various plot types as seaborn or plotly plot via a GUI in multiple windows (RelPlot, PairPlot, JointPlot, DisPlot, CatPlot, LmPlot, 3DPlot).

data-analysis data-science data-visualisation data-visualization eda exploratory-data-analysis plotly seaborn

Last synced: 12 May 2026