An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/swat1563/recommendation-system

This repository features a recommendation system and analytics engine using datasets on users, organizations, contents, contacts, events, and recommendations. It includes data preprocessing, building a recommendation system, and creating visual reports with Power BI.

analytics data-analysis data-visualization engine kaggle numpy pandas powerbi powerbi-dashboards powerbi-desktop powerbi-reports python recommendation-engine recommendation-system recommender-systems scikit-learn scipy

Last synced: 07 Jan 2026

https://github.com/anandanraju/sql_data_analysis_projects

These two projects analyze pizza sales and Walmart sales data using SQL to identify insights and trends. The aim is to use data-driven approaches to understand sales performance, identify key factors influencing sales, and provide actionable recommendations for business improvement.

csv data-analysis data-management mysql pizza sql sql-schema walmart

Last synced: 24 Jun 2025

https://github.com/josedanielchg/nyc-schools-test-scores-exploration

DataCamp project analyzing NYC public school test scores to identify top math-performing schools, the best overall SAT scores, and borough-level variability using Python and pandas

data-analysis jupyter-notebook python

Last synced: 19 Mar 2025

https://github.com/monish-nallagondalla/cement_strength_prediction

The Cement Strength Prediction project uses machine learning to predict the compressive strength of cement based on its components, such as Cement, Fly Ash, Water, Superplasticizer, Coarse Aggregate, Fine Aggregate, and Age. The goal is to forecast compressive strength (MPa) for optimized cement production and quality control.

cement-strength-prediction construction-industry data-analysis data-preprocessing data-science data-visualization feature-engineering machine-learning predictive-modeling python regression-analysis scikit-learn

Last synced: 11 May 2026

https://github.com/nitins17/tableauvisualizations

Visualizations I created while learning to work with Tableau

data-analysis data-science data-visualization tableau visualization

Last synced: 01 Mar 2026

https://github.com/edumoraes1/journey_active_users

SQL-based segmentation of the user base for the active-seller journey

bq data-analysis salesforce sql

Last synced: 02 Feb 2026

https://github.com/brownred/python-and-sql

Python and SQL (PostgreSQL & MySQL) for data analysis.

data-analysis databases python3 sql

Last synced: 11 May 2026

https://github.com/collins-kimotho/communicate-data-findings

Data Analysis Project: Investigating Factors Contributing to No-Show Appointments in Medical Records

data-analysis data-science data-visualization dataset pandas python

Last synced: 17 May 2026

https://github.com/badranalyst/startup-expansion-analysis-with-pandas-matplotlib-and-power-bi

Analyzes startup growth and expansion factors using Pandas for data analysis and Matplotlib for visualizations. Complements findings with data visualizations in Power BI, providing actionable insights into funding and market trends.

dashboard data-analysis data-visualization dataset matplotlib matplotlib-pyplot pandas power-bi powerbi

Last synced: 16 May 2026

https://github.com/gemaquejr/restaurant-orders

Project aimed at applying OOP concepts and working with Set, Hashmap, and Dict. It was created as the final assessment for section 06 of the computer science module of the Web Development Course at Trybe.

data-analysis dict hashmap poo python set

Last synced: 30 Oct 2025

https://github.com/estevan-ulian/py-agent-voice

A project for handling voice interactions between a human and an AI agent, allowing data from a CSV file to be read and analyzed.

agent-based-modeling data-analysis python3 whisper-ai

Last synced: 11 Apr 2025

https://github.com/fmind/malpop

Rank the popularity of malware applications by their occurrence on VirusTotal

data-analysis malware popularity ranking virustotal

Last synced: 11 Apr 2025

https://github.com/czesctuklap/sustainable-fashion-database-analysis

This project analyzes a dataset of sustainable fashion trends for 2024. It includes data preprocessing, exploration, visualization, and insights on environmental impact factors such as carbon footprint, water usage, waste production, and sustainability practices.

data-analysis data-visualization database dataset keggle sustainable-fashion

Last synced: 30 Apr 2026

https://github.com/felipe-veas/visor-sueldos-publicos

Interactive tool for visualizing and analyzing public-sector salaries in Chile, built with Streamlit.

audit chile data-analysis python streamlit transparency

Last synced: 16 May 2026

https://github.com/satvikpraveen/pcc-vizforge

🎨 Personal data visualization toolkit generating synthetic datasets across multiple domains (random walks, dice simulations, weather patterns, earthquakes, GitHub analytics) with beautiful Matplotlib & Plotly visualizations. Includes Jupyter notebooks, interactive dashboards & statistical analysis. Perfect for learning data science! 🚀📊

analytics dashboard data-analysis data-generation data-science data-visualization github-analytics interactive-visualization jupyter-notebook matplotlib plotly probability python random-walk scientific-computing seismology statistical-analysis synthetic-data time-series weather-data

Last synced: 17 May 2026

https://github.com/abidshafee/google.colaboratory_projects

This repository contains a collection of interactive Python notebooks (ipynb) from some of my projects on Data Science, Machine Learning (ML), and Natural Language Processing (NLP).

colaboratory data-analysis data-science lstm machine-learning nlp statistics time-series

Last synced: 09 Jul 2025

https://github.com/tinaland101/python-api-challenge

This project involves analyzing weather data from cities around the world using the OpenWeatherMap API and creating visualizations to explore the relationship between weather variables and latitude.

api-integration-and-data-retrieval data-analysis data-collection-and-geospatial-analysis problem-solving-and-decision-making statistical-analysis

Last synced: 03 Mar 2025

https://github.com/betkh/datascieneinpython

Jupyter Notebook files

data-analysis data-visualization

Last synced: 16 Jun 2025

https://github.com/nishumehta/house-sales-analysis

House Sales Analysis Dashboard for King County, Washington, built with Tableau. Features interactive charts and maps to explore sales patterns, price distributions, and property conditions.

dashboard data-analysis data-visualization tableau tableau-dashboards tableau-public

Last synced: 11 Jan 2026

https://github.com/arv-anshul/pw-api

Perform data analysis on PW Skills APIs. Made a web app using streamlit. See any course syllabus, analytics, quizzes and assignments.

api course data-analysis ineuron-ai physics-wallah project pw-skills python3 streamlit

Last synced: 18 Apr 2026

https://github.com/qorah/vic-edu-housing-insights

Analysis of education outcomes and housing affordability in Victoria, Australia.

data-analysis jupyter-notebook

Last synced: 18 Mar 2025

https://github.com/lauratrigo/codigo_roti

ROTI Analysis is a MATLAB tool for processing and visualizing ionospheric data (ROTI) from multiple GNSS stations. Developed for space geophysics research, the script generates comparative time-series plots with quality filters and missing-data handling. 📡

data-analysis geophysics image-processing matlab roti scientific-initiation

Last synced: 24 Jun 2025

https://github.com/sadratehranian/prediction-of-covid-19-diagnosis

Builds an algorithm in MATLAB using ML techniques to predict whether a person has COVID-19 based on existing medical conditions. Further research was conducted to identify the most suitable machine learning techniques and increase their prediction accuracy.

covid-19 data-analysis data-science data-visualization machine-learning matlab prediction visualization

Last synced: 11 Sep 2025

https://github.com/ebrizzzz/data-visualization-project-using-tableau

A data visualization project for the Visual Data Analysis course (Spring Term 2025) at the University of Skövde. This project explores the factors influencing national happiness scores across different global regions from 2005 to 2022.

analytics data data-analysis data-science data-visualization python regression tableau

Last synced: 16 Jun 2025

https://github.com/mahdikh03/custumers_clustering_rmf

A data analysis project to implement RFM (Recency, Frequency, Monetary) analysis for customer segmentation and behavior analysis using the K-Means algorithm.

customer-segmentation data-analysis k-means-clustering unsupervised-learning

Last synced: 09 May 2025
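
For readers unfamiliar with RFM, the sketch below shows how the three features in the entry above (Recency, Frequency, Monetary) can be derived per customer. It is a minimal illustration and not the repository's code: the transaction rows, the `rfm_table` helper, and the `as_of` reference date are all hypothetical. The resulting tuples are the kind of feature vectors a clustering algorithm such as K-Means would then group.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("alice", date(2024, 1, 5), 40.0),
    ("alice", date(2024, 3, 1), 60.0),
    ("bob", date(2023, 11, 20), 15.0),
    ("carol", date(2024, 2, 10), 200.0),
    ("carol", date(2024, 2, 25), 50.0),
    ("carol", date(2024, 3, 9), 75.0),
]


def rfm_table(rows, as_of):
    """Return {customer: (recency_days, frequency, monetary)}.

    recency_days: days between the customer's latest purchase and `as_of`
    frequency:    number of purchases
    monetary:     total amount spent
    """
    table = {}
    for customer, day, amount in rows:
        last, count, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), count + 1, total + amount)
    return {
        customer: ((as_of - last).days, count, total)
        for customer, (last, count, total) in table.items()
    }


# `as_of` is an assumed snapshot date chosen for the example.
rfm = rfm_table(transactions, as_of=date(2024, 3, 10))
```

In practice the three columns are usually scaled before clustering, since monetary values would otherwise dominate the distance computation.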

https://github.com/iamsainikhil/us-births-analysis

Analysis of US-Births during 1994-2003 based on CDC-NCHS data set.

data-analysis python

Last synced: 16 May 2026

https://github.com/niaid/genetic-linkage-analysis

Materials for ACE course on Genetic Linkage Analysis.

ace ace-uganda2020 analysis bcbb-training clinical data-analysis genetics ngs ngs-analysis

Last synced: 24 Jun 2025

https://github.com/tknishh/investing-platform

An investing platform application to help users get information and analyze various foreign currency assets. The investing platform uses an ETL pipeline to insert new batches of Forex data once a day.

data-analysis investing-platform pipeline

Last synced: 18 Mar 2025

https://github.com/muneeb1030/webscrapper_altnews

The project utilizes a combination of Python, Scrapy, and Selenium to navigate through the dynamic content of AltNews.in and collect valuable information for analysis and verification.

data-analysis data-collection python3 scrapy scrapy-spider selenium selenium-python

Last synced: 17 May 2026

https://github.com/macorisd/instagram-fake-account-analysis

A project in R focused on detecting fake Instagram accounts. It includes exploratory data analysis, data visualization, and analysis using three techniques: association rules, formal concept analysis, and regression. The results are presented in an interactive Quarto book.

data-analysis data-science data-visualization r

Last synced: 10 Jun 2025

https://github.com/eslamdyab21/apara-data-gui

Custom application for Apara's data wrangling scripts. Technologies used are Qt Designer and PyQt5 for the GUI, and Pandas and NumPy for the data work.

csv data data-analysis data-wrangling gui pandas pyqt5-desktop-application qt5-gui

Last synced: 17 May 2026

https://github.com/brevex/hotel-booking-demand-data-analysis

Data analysis in Python of demand for urban hotels and resorts showing their causes and relationships

data-analysis data-science hotel-booking-analysis kaggle python

Last synced: 08 May 2026

https://github.com/nehul1149/olympic-data-analysis

This project is an interactive data visualization and analytics platform for exploring historical Olympic Games data. Built with Python and Streamlit, it offers an in-depth analysis of medal tallies, athlete statistics, and country-wise performance trends, providing users with powerful insights into the world's biggest sporting event.

analysis data-analysis data-science data-visualization matplotlib python streamlit

Last synced: 18 May 2026

https://github.com/shaikh-raj/data-science-portfolio

Data Science Portfolio of Raj Shaikh including Case Studies and Articles that I have completed that solve various business problems.

articles case-study data-analysis deep-learning machine-learning nlp statistics

Last synced: 20 Jul 2025

https://github.com/arction/lcjs-example-0507-dashboardfiberanalysis

A demo application showcasing using LightningChart JS to visualize fiber analysis data.

area-plot area-series chart charts dashboard data-analysis demo heatmap javascript lcjs lightningchart-js performance visualization webgl

Last synced: 12 Mar 2025

https://github.com/thecoderpinar/globalwarmingforecast

🌍 Global Warming Forecast Tool: an advanced tool for analyzing and forecasting climate trends using ARIMA and Prophet models, with interactive visualizations and scenario simulations.

arima climate-change data-analysis environmental-science forecasting global-warming machine-learning prophet streamlit time-series-analysis visualization

Last synced: 27 Mar 2025

https://github.com/soumasish2005/ai-chatbot-using-snowflake

This project is a Streamlit application that allows users to upload a CSV file and ask questions about their data in natural language.

cloud data-analysis data-science data-visualization python snowflake streamlit

Last synced: 17 May 2026

https://github.com/balajimohan18/rafik-s-kitchen-data-analysis

This project analyzes the sales and expenses data of a famous fast-food restaurant. It focuses on gaining insights that will boost future sales and on business strategies to improve profit margins. Tools used: SQL, Python, Power BI, and MS Office tools.

business-analytics business-intelligence data-analysis data-analytics data-visualization eda ms-office powerbi-report powerpoint-presentations python sql-server

Last synced: 05 Apr 2026

https://github.com/srinibas-masanta/deloitte-forage-virtual-internship

This repository contains my work from the Deloitte Forage Virtual Internship, where I analyzed factory telemetry data in Tableau to identify machine breakdown patterns and assessed gender pay equality using Excel. From interactive dashboards to insightful classifications, this project showcases hands-on data analysis and visualization skills. 🚀📊

data-analysis data-visualization deloitte excel forage tableau

Last synced: 15 Jan 2026

https://github.com/darshan1924/house-price-pridiction

This repository contains a machine learning project for predicting house prices based on various features, including geographical coordinates. The project includes data preprocessing steps.

data-analysis data-preprocessing house-prices jupyter-notebook machine-learning prediction

Last synced: 27 Mar 2025

https://github.com/mosalem149/pythonutilities

A collection of Python scripts for common utility tasks including file manipulation, word counting, longest word detection, and grade categorization. Perfect for quick and easy solutions to everyday programming problems.

data-analysis educational-tools file-io file-manipulation grade-calculation python text-analysis text-processing utility word-counting

Last synced: 15 May 2026

https://github.com/spshah1701/world-development-indicators

Analysis of World Development Indicators (WDI) using big data technologies, specifically Databricks, Apache Spark, and Scala.

apache-spark big-data data-analysis spark-sql

Last synced: 17 Mar 2025

https://github.com/rohitdusane/interactive-ibd-analysis-dashboard-with-dash-plotly

This repository showcases a project that combines data analysis and visualization through Dash and Plotly. The goal of this project is to offer an efficient and user-friendly way to integrate robust data analysis with an interactive web-based interface.

clinical-research data-analysis exploratory-data-analysis pyhton statistical-reports

Last synced: 24 Jun 2025

https://github.com/sotirismos/pattern-recognition-labs

Lab exercises and quizzes for Pattern Recognition course, Auth winter semester 20-21

classification clustering data-analysis machine-learning pattern-recognition

Last synced: 17 Jun 2025

https://github.com/rachelresende/regressaolinear

This repository is for the linear regression classes I took in a Udemy course on the subject in 2025. It was a refresher course, since I also studied this topic in 2020 in a statistics course at Alura.

data-analysis data-science linear-regression

Last synced: 11 Sep 2025

https://github.com/haroontrailblazer/user_behavioral_analysis

Social Media User Engagement Analysis Using Power BI

data-analysis data-science data-visualization database powerbi

Last synced: 29 Mar 2025

https://github.com/mainak-97/pizza-sales-analysis-project

Pizza Sales Analysis Project: This project optimizes a pizza restaurant's operations by analyzing demand patterns, revenue, and efficiency, providing insights to enhance profitability, streamline production, and improve customer satisfaction.

business-analytics business-intelligence dashboards data-analysis operations-optimization peak-hours power-bi restaurant-analysis revenue-analysis

Last synced: 06 Jan 2026

https://github.com/pramodkondur/dataspark-end-to-end-dataanalytics

Cleaned the data, performed EDA, and stored the results in MySQL. Queried and analyzed the data, uncovering opportunities to drive revenue growth and optimize operations, with a potential revenue growth of $30.03 million. Reported key insights using Power BI.

data-analysis data-visualization eda powerbi python sql

Last synced: 21 May 2026

https://github.com/ljadhav25/knn-algorithm-data-science-

This repository contains a project demonstrating the implementation and application of the K-Nearest Neighbors (K-NN) algorithm in Data Science. The objective is to provide a comprehensive understanding of the K-NN algorithm, including data preprocessing, model training, evaluation, and visualization of results. This project is ideal for beginners

data-analysis data-science knn-classification machine-learning matplotlib-pyplot numpy pandas-library seaborn

Last synced: 16 Apr 2026

https://github.com/parth-jatav/ipl-data-analysis-mentorness

This project uses Power BI to analyze IPL cricket data, featuring dashboards with insights on batting averages, strike rates, and player roles. It identifies the top 11 players and includes navigable pages focused on specific roles like Anchors, Finishers, and All-Rounders.

dashboard data-analysis ipl ipl-dashboard powerbi

Last synced: 07 Mar 2026

https://github.com/mindlessmuse666/iris-knn

This project demonstrates the use of the k-nearest neighbors (KNN) algorithm to classify the Iris dataset. It includes data loading, model training, performance evaluation, and visualization of results using the Pandas, Scikit-learn, Matplotlib, Seaborn, and Plotly libraries.

algorithm classification data-analysis data-visualization iris-dataset knn lazy-learning machine-learning python scikit-learn

Last synced: 17 Aug 2025
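
As a rough illustration of the KNN technique named in the entry above, the following standard-library sketch classifies a point by majority vote among its nearest neighbours. It is not the repository's implementation (which is described as using Scikit-learn); the `knn_predict` helper and the handful of sample measurements are hypothetical and are not taken from the real Iris dataset.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training samples by Euclidean distance to the query, keep the k closest.
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical (petal_length_cm, petal_width_cm) samples for illustration only.
train = [
    ((1.4, 0.2), "setosa"),
    ((1.5, 0.3), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((4.5, 1.5), "versicolor"),
    ((4.7, 1.4), "versicolor"),
    ((4.4, 1.3), "versicolor"),
]

label = knn_predict(train, (1.6, 0.25))
```

KNN is called a lazy learner because there is no training step: all work happens at prediction time, when distances to the stored samples are computed.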

https://github.com/mindlessmuse666/apartment-price-predictor

Python project for predicting apartment rental prices using linear regression. Practical work on the topic "Fundamentals of Machine Learning" for the course "MDK 13.01: Fundamentals of Applying Artificial Intelligence Methods in Programming".

apartment-price-prediction data-analysis data-science linear-regression linear-regression-models machine-learning matplotlib python regression sklearn unit-testing

Last synced: 11 Apr 2026
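
The linear regression approach mentioned in the entry above can be sketched with a closed-form least-squares fit. This is only an assumed, simplified single-feature version: the `fit_line` helper and the area/rent figures are hypothetical, and the repository itself may use different features and a library such as scikit-learn.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: apartment area in square metres and monthly rent.
areas = [30, 45, 60, 75]
rents = [400, 550, 700, 850]

slope, intercept = fit_line(areas, rents)
predicted = slope * 50 + intercept  # estimated rent for a 50 m^2 apartment
```

Because the sample data lie exactly on a line, the fit recovers it without error; real rental data would leave residuals that are worth inspecting before trusting the predictions.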

https://github.com/emmarhoffmann/analysis-of-student-debt-among-first-generation-college-students

Explores the financial landscape of first-generation college students, analyzing patterns in student debt based on factors like median income, net price of attendance, and enrollment size.

data-analysis first-generation-college-students r statistical-models

Last synced: 17 Mar 2025

https://github.com/emmarhoffmann/analysis-of-california-real-estate-market-factors-influencing-home-prices

Investigates how home size, number of bedrooms, and bathrooms influence home prices, with comparisons across California, New York, New Jersey, and Pennsylvania.

data-analysis r real-estate statistical-models

Last synced: 17 Mar 2025

https://github.com/janashanaa/flightanalysis

This Jupyter Notebook presents an exploratory data analysis of data derived from a flight booking website.

data-analysis data-visualization exploratory-data-analysis jupyter-notebook python

Last synced: 15 May 2026

https://github.com/chingu-voyages/v47-tier3-team-30

An easily accessible tool for calculating electricity-related carbon emissions, along with insights for reducing environmental impact. | Voyage-47 | https://chingu.io/ | Twitter: https://twitter.com/ChinguCollabs

carbon-emissions carbon-footprint data-analysis data-engineering data-science

Last synced: 10 May 2026

https://github.com/rahulsm20/trackbyte

A full-stack web application that helps users keep track of their playlist and provides analytics based on their music taste. Built using React, Node.js, Express.js, MySQL and Bootstrap.

bootstrap data-analysis expressjs mysql nodejs reactjs sql

Last synced: 07 Apr 2026

https://github.com/pylena/movies-prediction

This project focuses on clustering movies based on their genres using machine learning techniques. By analyzing genre data, the model groups similar movies together, facilitating recommendations and insights into genre-based patterns.

data-analysis machine-learning render streamlit unsupervised-learning

Last synced: 18 May 2026

https://github.com/judyway2/de-data

A brief analysis on schools ARR data

data-analysis jupyter-notebook

Last synced: 11 May 2025

https://github.com/natanel567/university_machine_learning_project

Machine Learning final project, Tel Aviv University

data-analysis jupyter-notebook machine-learning

Last synced: 11 May 2025

https://github.com/prakhar-code/british_airways_review_analysis

Analysis of the British Airways Reviews by Customers, filtered by several different factors such as food, entertainment, services, etc.

data-analysis data-cleaning excel tableau-dashboards tableau-public tableau-visualization

Last synced: 15 Jan 2026

https://github.com/satyacoder29/comparison-of-region-based-sales-tableau

The region-based sales comparison analyzes sales performance across different regions. It identifies trends, top-performing regions, and areas needing improvement by comparing metrics like revenue, growth rate, and product demand. This analysis helps optimize sales strategies and resource allocation for better performance.

data-analysis data-cleaning data-collection data-visualization powerquerym relationships tableau tableau-desktop unions

Last synced: 02 Feb 2026

https://github.com/ziaeemehr/neuro_toolbox

Single Header File C++ library for analysis of neurophysiological and simulated data.

data-analysis data-science signal-processing synchronization

Last synced: 21 Jul 2025

https://github.com/rafinha0rafinha/web-analyzer-backend

(Legacy) This is the backend for Mazaoro SARLU's lead magnet "Web Analyzer". This project analyzes websites using Google Lighthouse and returns a detailed report consumed by the frontend.

azure-app-service azure-devops chartjs cicd data-analysis data-science data-visualization express flask hacktoberfest lighthouse numpy sentiment-analysis vader-sentiment-analyzer

Last synced: 10 Apr 2026

https://github.com/mfakhriazhar/stock-price-prediction

Stock prices are highly volatile and influenced by various factors, making accurate prediction a major challenge in investment decisions.

data-analysis data-science deep-learning python recurrent-neural-networks

Last synced: 18 May 2026

https://github.com/dina-hosny/sparkify---data-modeling-with-cassandra

Sparkify - Data Modeling with Cassandra - Udacity Data Engineering Expert Track.

cassandra cql data-analysis data-engineering data-modeling data-warehousing etl python

Last synced: 11 Apr 2026

https://github.com/spring-0/netflix-media-data-analysis

Exploring and analyzing Netflix data to uncover trends through data visualization and statistical analysis.

data-analysis netflix

Last synced: 27 Mar 2025

https://github.com/jasonsu131/cps188-term-project

A data analysis program developed in C to extract information about diabetic patients across Canada from a governmental spreadsheet available online. The program showcases summaries and averages based on the extracted data.

c data-analysis data-statictics file-reading

Last synced: 28 Mar 2025

https://github.com/sciencesar-labs/py485-final-project

ROOT-based muon data analysis using Python & Jupyter – final project for PY485E @ CERN

cern computational-physics data-analysis jupyter-notebook muons python root uproot

Last synced: 15 May 2026

https://github.com/velut/thesis-sw

Software and datasets used in the "Cost-effective and Scalable Activity Matching using Crowdsourcing" thesis

bpmn cost crowdflower crowdsourcing data-analysis dataset performance-analysis plotting-algorithms r thesis

Last synced: 19 Jun 2025

https://github.com/mae776569/weratedogs-wrangling

Wrangling WeRateDogs Twitter data to create interesting and trustworthy analyses and visualizations

data-analysis data-science data-visualization tweets twitter-api

Last synced: 25 Jan 2026

https://github.com/jonek/pv-city-mastr

Extract and analyze data about photovoltaic systems in Germany

data-analysis germany jupyter-notebook pandas photovolatic-power photovoltaic

Last synced: 11 May 2026

https://github.com/mfakhriazhar/ecom-qtt-prediction

In e-commerce, understanding seasonal sales trends and best-selling products is critical to business strategy. However, companies often struggle with predicting sales, determining factors that influence sales (discounts, product categories, locations), and optimizing stock and marketing.

data-analysis data-science data-visualization e-commerce-project eda machine-learning python

Last synced: 19 May 2026

https://github.com/c17an/data-analysis-exercise

Data analysis training ground

data-analysis python3

Last synced: 05 Apr 2025

https://github.com/kenwuqianghao/scotiabank-datathon-2023

Code and data analysis done for 2023 Scotiabank Datathon

data-analysis fraud-detection jupyter-notebook python

Last synced: 18 May 2026

https://github.com/lord3008/instances-of-data-analysis

This repository shows my data analysis work across various projects. I consider data analysis the key to investigating a solution. Furthermore, it points the way toward model building.

data data-analysis

Last synced: 03 Mar 2025

https://github.com/yash22222/web-scraping-for-data-analysis-predictive-model-on-customer-data

Utilized web scraping for customer feedback at Air India, conducting robust data analysis, and applying machine learning for predictive modeling. Drove data-driven decisions, enhancing services, and elevating customer satisfaction. Expertise in web scraping, analysis, and predictive modeling for actionable insights.

data-analysis data-preprocessing data-science data-visualization exploratory-data-analysis machine-learning powerbi random-forest-classifier sentiment-analysis tableau web-scraping

Last synced: 30 May 2026

https://github.com/sabdikay/telco-customer-churn-analysis-ibm-dataset

This project explores customer churn trends for a company in California using an IBM dataset. Built in a Jupyter Notebook, it employs pandas, NumPy, matplotlib, seaborn, plotly, and scipy to clean, analyze, and visualize data. Through statistical tests and interactive maps, it uncovers key drivers behind customer cancellations

business-intelligence customer-churn data-analysis data-analysis-python data-visualization exploratory-data-analysis jupyter-noteboook matplotlib numpy pandas plotly predictive-modeling python scipy seaborn statistical-analysis

Last synced: 07 Apr 2026