An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mmzong/gee_lifestyleeffectsonhypertension

Generalized Estimating Equations (GEE), Quasi-likelihood under the Independence Model Criterion (QIC), Longitudinal data, Embedded box plots within violin plots with hypertension risk categories, spaghetti plots, aggregate line plots, histograms, faceted-area plots, box and jitter plots. Investigating the impact of lifestyle on health.

aggregate-line-plot area-faceted-plots box-plots data-analysis data-manipulation data-science data-visualization generalized-estimating-equations histograms jitter-plots longitudinal-data qic quasi-likelihoods r spaghetti-plots violin-plots

Last synced: 29 Jul 2025

https://github.com/ginga1402/car_price_prediction

Predict the price of a car using MS Excel.

college-project data-analysis excel linear-regression

Last synced: 30 Mar 2025

https://github.com/chaedoll/teamproject-foreignerreport

국내 외국인 대상 인프라 개선을 위한 보고서 (Report on improving infrastructure for foreigners)

data-analysis python

Last synced: 25 Feb 2025

https://github.com/mr-chang95/datascience_airbnb

Data Science Project for Udacity's Data Scientist Program. Using Python in Jupyter Notebook.

airbnb data-analysis data-science data-visualization jupyter-notebook numpy pandas python sklearn

Last synced: 08 Apr 2026

https://github.com/myke003/data-analysis-projects

This repository serves as a collection of all my projects.

data-analysis jupyter-notebook powerbi

Last synced: 14 Mar 2025

https://github.com/vishal-038/real_estate_price_prediction

The Real Estate Price Prediction project aims to develop a machine learning model to predict house prices based on various features

data-analysis data-science data-visualization machine-learning python

Last synced: 21 May 2026

https://github.com/easonsyc/kc-house-price-prediction

Prediction for House Price in King County.

data-analysis jupyter-notebook machine-learning python

Last synced: 21 May 2026

https://github.com/garcane/credit-card-transactions-fraud-detection-project

The Credit Card Transactions Fraud Detection Project repository is designed to analyse and detect fraudulent transactions in credit card data.

data-analysis postgresql sql

Last synced: 03 Feb 2026

https://github.com/ishmal793/basic-python-

Beginner-friendly Python code examples and exercises – a strong foundation for aspiring data analysts.

data-analysis data-analytics learning-python-code problem-solving python-basics python-for-beginners

Last synced: 23 Jul 2025

https://github.com/nishumehta/british-airways-reviews-analysis

This project analyzes British Airways reviews using Tableau to create an interactive dashboard. The dashboard visualizes average ratings across multiple metrics and trends over time.

dashboard data-analysis data-visualization tableau tableau-public

Last synced: 12 Jan 2026

https://github.com/achronus/data-exploration

A repository dedicated to interesting data exploration projects I've completed

data-analysis exploratory-data-analysis machine-learning matplotlib pandas python scikit-learn seaborn

Last synced: 02 Jan 2026

https://github.com/saymyname1337/bachelor-s-thesis

Bachelor's thesis of a student of the MPEI of Shevts G. V.

data-analysis ml python

Last synced: 23 Jul 2025

https://github.com/sdley/logiciel-de-deliberation-uam-2022

Del-Annuel est logiciel de deliberation annuelle des ecoles superieures ou universités

data-analysis pandas python tkinter-gui

Last synced: 08 May 2026

https://github.com/maazie-khan/austin-housing-insights-powerbi

Worked with a real estate dataset, we will build a tool to evaluate trends and drivers of house prices around Austin, Texas.

dashboard data-analysis data-science data-visualization database powerbi

Last synced: 02 Jan 2026

https://github.com/akshaypratapsingh09/zomato-blogs-all-links-dataset

Engineering / Culture / Blogs Data gathered for Educational and Learning purposes from Zomato's Blogs and spreading the better problem solving Methodologies adapted by Modern Unicorns

data-analysis dataset regex selenium webdriver zomato-data-analysis

Last synced: 06 Apr 2025

https://github.com/tushar2704/hiring-process-analytics

In this project, I am analyzing hiring process data to gain insights from about records of previous hires within a multinational company. By analyzing this data, I am aiming to uncover valuable trends and information about the company's hiring process, which can contribute to making informed decisions and improvements for the future.

data-analysis data-cleaning data-science data-wrangling excel tushar2704

Last synced: 25 Jan 2026

https://github.com/tushar2704/consumables_sales_dashboard

Welcome to the Consumable Sales Dashboard, a powerful and intuitive data visualization tool built using Power BI. This dashboard offers a comprehensive view of sales data for consumable products, allowing you to quickly and easily analyze performance and identify trends.

dashboard data-analysis data-analytics data-science excel postgresql powerbi streamlit-tushar2704 tushar2704

Last synced: 04 Nov 2025

https://github.com/omari-kd/environmental-impact-on-food-production

The goal of this project is to assess the environmental impact of food production at both macro and micro levels and propose data-driven insights to mitigate the negative effects of food production on the environment.

data data-analysis data-science data-visualization environmental-impact-analysis r

Last synced: 30 Mar 2025

https://github.com/zborovskaanna/grosery_store_sales_analysis

Python data analysis project. Analysis of grocery store sales using visualizations and reporting in Tableau

data-analysis data-visualization matplotlib numpy pandas python seaborn tableau

Last synced: 08 Apr 2026

https://github.com/omari-kd/transborder-freight-data-analysis

This project analyses transportation data from the Bureau of Transportation Statistics (BTS) to uncover insights into cross-border freight's efficiency, safety and environmental impacts across road, rail, air and water modes.

data-analysis data-analysis-in-r data-cleaning-and-preprocessing data-science data-visualization powerbi

Last synced: 30 Mar 2025

https://github.com/sunnyrao07/data-analysis-dashboard-in-excel

I implemented a comprehensive data analysis solution using Excel, developing multiple dashboards and tables to visualize and interpret the data. This involved a rigorous data cleaning and preprocessing pipeline followed by data visualization.

dashboard data-analysis excel visualization

Last synced: 03 Feb 2026

https://github.com/analyticslover/salifort-motors-turnover-project

The Salifort Motors H.R. Project serves as the capstone for the Google Advanced Analytics Program on Coursera. This project presents a business scenario and a problem on the scnario context, employee turnover. In this project, essential techniques as EDA and Data Modeling are used to analyze and predict the employee turnover rates in the company.

data data-analysis datamodeling eda machine-learning pandas python sklearn

Last synced: 10 Apr 2026

https://github.com/yrohitha/titanic-data-analysis

Predict Survival Outcomes from the 1912 Titanic disaster based on each passenger's features, such as sex and age.

data-analysis machine-learning matplotlib pandas scipy-stats statistical-models

Last synced: 13 Mar 2025

https://github.com/fazej99/u.s-climate-and-temperature-analysis

This project analyzes historical temperature trends in the U.S., explores their economic impacts, predicts future changes using machine learning, visualizes regional anomalies with GIS, and presents findings through a secure and interactive Streamlit dashboard.

data-analysis data-science data-visualization gis machine-learning streamlit

Last synced: 22 May 2026

https://github.com/data-edd/california_population_projection

This project demonstrates a population projection analysis for the state of California using MySQL

data-analysis mysql

Last synced: 30 Mar 2025

https://github.com/nymarya/analise-correlacao-sifilis

Código da análise de correlação entre notificações de casos de sífilis e disponibilidade de testes e medicamentos

data-analysis healthcare pandas

Last synced: 03 Jan 2026

https://github.com/macnianios/salifort-motors_retention

Google Advanced Data Analytics Capstone: Analyzing customer retention at Salifort Motors.

data-analysis machine-learning pandas python seaborn sklearn

Last synced: 08 Apr 2026

https://github.com/sanjana-bongale/cancer_survival_data_analysis_and_prediction_using_logistic_regression

This project performs data analysis using Python to predict cancer patient survival outcomes. It involves data cleaning, exploratory analysis, and visualizations to explore factors like cancer type, stage, and treatments. A logistic regression model is built to predict patient survival based on demographic and medical data.

data-analysis data-cleaning data-science data-visualization eda jupyter-notebook kaggle logistic-regression machine-learning matplotlib numpy pandas predictive-modeling python scikit-learn seaborn

Last synced: 08 Apr 2026

https://github.com/onome-joseph/flexisaf

Generative AI & Data Science

data-analysis data-science machine-learning

Last synced: 16 Sep 2025

https://github.com/tushar2704/sql-query

Repository is designed to help you strengthen your SQL query skills by providing a collection of common and interview-based SQL queries for practice.

artificial-intelligence data-analysis data-engineering data-science database database-management database-schema relational-databases sql sql-database sql-query tushar2704

Last synced: 04 Nov 2025

https://github.com/sayamalt/fake-news-classification-using-fine-tuned-bert

Successfully developed a text classification model to predict whether a given news text is fake or not by fine-tuning a pretrained BERT transformed model imported from Hugging Face.

bert-embeddings bert-model data-analysis data-visualization deep-learning fine-tuning-bert model-evaluation model-training-and-evaluation text-classification text-preprocessing text-tokenization tokenizer-nlp wordcloud-visualization

Last synced: 05 Apr 2025

https://github.com/bala-1409/sales-forecasting-datascience-project

Develop a data science project using historical sales data to build a regression model that accurately predicts future sales. Preprocess the dataset, conduct exploratory analysis, select relevant features, and employ regression algorithms for model development. Evaluate model performance, optimize hyperparameters, and provide actionable insights.

data data-analysis data-science data-visualization datacleaning exploratory-data-analysis machine-learning-algorithms modelfitting prediction predictive-analytics predictive-modeling python3 regression-models salesforecast supervised-learning

Last synced: 26 Apr 2026

https://github.com/aleskandro/r-hadoop-madreduce-examples

A lot of examples about using R with hadoop for MapReduce with and without libraries as rhadoop/rhipe - DIEEI@unict.it - Advanced Programming Languages

data-analysis hadoop mapreduce r

Last synced: 04 Nov 2025

https://github.com/esr-style/stylegrid

A free alternative to AG grid built by me for personal use case.

aggrid data-analysis grid pivot-chart pivot-grid table

Last synced: 16 Sep 2025

https://github.com/hatamiarash7/ir-system

IR System for Reuters DB

data-analysis data-mining ir python

Last synced: 29 Mar 2025

https://github.com/mindlessmuse666/eda-explorer

Инструмент на Python для разведочного анализа данных (EDA) и визуализации, поддерживающий загрузку данных CSV и JSON, с модульной архитектурой ООП. Практическая работа по теме: "Обнаружение и визуализация данных для понимания их сущности" дисциплины "МДК 13.01: Основы применения методов искусственного интеллекта в программировании".

csv-visualization data-analysis data-science data-visualization exploratory-data-analysis json-visualization matplotlib oop pandas python seaborn

Last synced: 13 Apr 2026

https://github.com/tejaswirupa/data-analysis-of-departure-delays-at-united-airlines

Explored how weather and time factors influence delays in 58,000+ UA flights. Used permutation testing and visual analytics to show how temperature, visibility, and time of day affect departure punctuality.

data-analysis r statistics

Last synced: 25 Jan 2026

https://github.com/rohitha-tata/bike-sales

This project focuses on data cleaning, transformation, and dashboard creation using a bike buyers dataset. It includes Pivot Tables, slicers, visualizations, and statistical insights to analyze trends based on income, age, occupation, and other key factors. Insights help understand customer behavior, purchasing patterns, and decision-making trends.

data-analysis data-cleaning excel-dashboards interactive-slicers pivot-charts pivot-tables

Last synced: 08 Mar 2026

https://github.com/nimomach/amazon-sales-data

This is a small dataset containing Amazon sales data analysis for few regions.

dashboards data data-analysis data-visualization

Last synced: 08 Mar 2026

https://github.com/mrham17/spotify_streaming_analytics

Project is stable & documentation will be completed soon. Thank you for your understanding and patience.

big-data-analytics data-analysis google-colab music-data r-programming spotify streaming-analytics

Last synced: 24 Jul 2025

https://github.com/dimits-ts/sport-repression-repl-study

A replication Study for the recent paper "International Sports Events and Repression in Autocracies: Evidence from the 1978 FIFA World Cup" paper.

data-analysis jupyter regression-models replication-study statistical-analysis

Last synced: 25 Jul 2025

https://github.com/labex-labs/numpy-for-beginners

This comprehensive course covers the fundamental concepts and practical techniques of NumPy, the essential library for numerical computing in Python. Learn to create, manipulate, and analyze arrays efficiently.

array-manipulation array-slicing beginner-friendly course data-analysis data-science data-structures fast-computation hands-on labex labs linear-algebra matrix-operations numerical-computing numpy programming python python-programming scientific-computing vectorized-operations

Last synced: 20 Jun 2026

https://github.com/karsterr/repeated-measurement

An R-based workflow for conducting repeated measures ANOVA using the ez package, with data wrangling via tidyverse and visualization through ggplot2. Includes data import, transformation to long format, statistical analysis, and graphical summary.

anove data-analysis experimental-design ezanove ggplot2 r repeated-measurements rstats statistics tidyverse

Last synced: 18 Sep 2025

https://github.com/ramapinnimty/udacity-mlfoundation-nanodegree

This is a repository containing solutions to the assignments that are a part of the Udacity Machine Learning Foundation Nanodegree program.

assignments data-analysis python3 statistics udacity-machine-learning-nanodegree

Last synced: 26 Jul 2025

https://github.com/hfzdzakii/dicoding-shipclusteringanalysisdataandmodelling

This repo is a master submission for my Dicoding Final Project. Ship Performance Clustering Dataset was being used to fulfill the submission. Feel free to explore and I hope my work give you some insight!

clustering data-analysis machine-learning

Last synced: 27 Jul 2025

https://github.com/jofaval/ionosphere

Binary Classification of Ionosphere signals at Goose Bay, Labrador in 1988

data-analysis data-science data-visualization deep-learning google-colab keras machine-learning python scikit-learn tensorflow uci xgboost

Last synced: 09 Apr 2026

https://github.com/tbep-tech/tbep-r-training

Repository for miscellaneous R training materials

data-analysis open-science workshop

Last synced: 19 Feb 2026

https://github.com/tbep-tech/tberf-oyster

Materials for evaluating TBERF oyster restoration success

ccmp-bh4 ccmp-bh6 data-analysis tampa-bay tbep tberf

Last synced: 19 Feb 2026

https://github.com/lucas-mazzolim/superstore-bi

Project where I prepared two data sources for querying and created a BI visualization in Data Studio. Used tools as Mysql, Looker Studio, Google Spreadsheet and Python.

business-intelligence data-analysis data-visualization google-looker-studio mysql spreadsheet

Last synced: 27 Jul 2025

https://github.com/tkhoa2711/twitter-hate-speech

Hate speech detection on Twitter

data-analysis python twitter

Last synced: 28 Jul 2025

https://github.com/tbep-tech/fim-seagrass

Materials for analysis of FIM data, seagrass, and other datasets

data-analysis fim seagrass tampa-bay

Last synced: 19 Feb 2026

https://github.com/tbep-tech/rookery-bay-training

Materials for R training at Rookery Bay Monitoring Workshop 2020

data-analysis open-science workshop

Last synced: 19 Feb 2026

https://github.com/cyprianfusi/data-scientist-technical-exercise-10ds

With recommendations to UK Department for Education of 10 Local Authorities where National Tutoring Programme (NTP) should be intensified and a response to UK Secretary of Health regarding a 76% Accident and Emergency (A&E) performance target which seems far-fetched.

data-analysis data-cleaning data-visualization hypothesis-testing pandas-python policy statistics

Last synced: 21 Sep 2025

https://github.com/ozep/genshincharacteranalysis

Uses a spreadsheet with Character Data and organizes it into readable graphs.

data-analysis jypyternotebook python

Last synced: 18 Apr 2026

https://github.com/karencofre/riesgorelativo-lookerstudio

proyecto de análisis de datos y análisis perdicitvo en looker studio y google colab

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 03 Jan 2026

https://github.com/kartikey2807/bike-classification-1rt700

Binary classification problem involving Logistic regression, SMOTE and feature expansion.

data-analysis data-engineering data-visualization logistic-regression

Last synced: 30 Jul 2025

https://github.com/teamtigers/echartify

A web application built with .net core 2.2 that has come with the idea of reading the National Election's Data-set of Bangladesh in a fastest possible time and then representing the data-set with different statistical charts.

bangladesh chartjs code-first-migration cross-platform data-analysis data-structures data-visualization dotnet-core election-analysis election-data entity-framework-core materializecss mvc npoi razor-pages

Last synced: 16 Apr 2026

https://github.com/pauliorandall/airline-passenger-satisfaction-r

Analysing the Airline Passenger Satisfaction dataset from Maven Analytics

data-analysis data-analytics r

Last synced: 01 Aug 2025

https://github.com/0xunkn0wn4m1r/data_engineering_banking_project

🏦 Build a complete data engineering workflow for a banking system, showcasing ETL processes, data transformations, and an interactive financial dashboard.

automation data-analysis data-cleaning data-science feature-engineering fintech-bank flask-api loan-default-prediction machine-learning mlops model-explainability numpy postgresql scikit-learn segmentation shap sql unsupervised-learning

Last synced: 09 Apr 2026

https://github.com/vishal-bhandary/sql-data-analytics

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as changes over time, cumulative, performance, data segmentation, part-to-whole analysis.

analytics business-intelligence customer-segmentation dashboarding data-analysis data-reporting data-visualization data-warehouse etl kpi product-analysis sql sql-server star-schema t-sql

Last synced: 02 Aug 2025

https://github.com/ausaaf-rh/movie-recommendation-system-collaborative-filtering

🎬 A comprehensive movie recommendation system implementing item-based collaborative filtering with cosine similarity. Features real-time recommendations, performance evaluation metrics (Precision@K, Recall@K), and interactive user interface. Built with Python, scikit-learn, and MovieLens dataset for academic research and learning purposes.

agents data-analysis jupyter-notebook python python3

Last synced: 17 Apr 2026

https://github.com/nahiyanhkhan/stock-market-data-analysis_capstone-project

In this course, learned and solved assignments on SQL and Python. Final capstone project was on analyzing "Stock Market Data". Achieved 100% score in every assignment.

data-analysis data-analytics matplotlib mysql mysql-database numpy pandas python sql

Last synced: 09 Apr 2026

https://github.com/thedevreda/jadaerospace

A Real life project showing how to improve selling aircraftparts and helping salers to focus more on effective products at JadAero

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/prasannnnn/real-time-share-price-scraping-and-analysis

The Stock Sentiment Analyzer is a web-based application built with Streamlit, BeautifulSoup, and Pandas to help users analyze the sentiment of a stock (BUY, SELL, or HOLD) based on its financial data. The tool extracts key financial metrics like Market Cap, Stock P/E, Dividend Yield, ROCE, ROE, and the 52-week High/Low from Screener.in.

beautifulsoup4 data-analysis python sentiment-analysis streamlit streamlit-dashboard webscraping

Last synced: 03 Aug 2025

https://github.com/mxagar/space_exploration

This repository is a collection of mini-projects and tutorials related to space images and geo-spatial data.

data-analysis deep-learning geospatial machine-learning

Last synced: 29 Sep 2025

https://github.com/hari00887/analysis-of-global-terrorism

Analysis of Global Terrorism Using AHP A quantitative study of GTD data to assess attack severity and evolution across time and space.

data-analysis data-visualization powerbi

Last synced: 02 Mar 2026

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 09 Apr 2026

https://github.com/lyubov0406/data_analyst_portfolio

В репозитории собраны пет-проекты, демонстрирующие мои навыки в аналитике данных

data-analysis matplotlib numpy pandas portfolio python scipy seaborn sql tableau visualization

Last synced: 09 Apr 2026

https://github.com/theashishmavii/job-trends-analyzer-automation

End-to-end automation: job scraping, data analysis, and trends reporting for job seekers and researchers.

automation beautifulsoup data-analysis open-source pandas python selenium webscraping

Last synced: 07 Aug 2025

https://github.com/jagoda11/elastic-vision

This repository contains a full-stack application designed to explore data from ElasticSearch🧐indices and visualize it using charts and graphs. The backend is built using Node.js and the frontend is powered🚀 by React.

backend chartjs dashboard-development data-analysis data-visualization docker elasticsearch frontend fullstack javascript material-ui monorepo mui-x node pie-chart react restful-api tables

Last synced: 09 Apr 2026

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/thc1006/taiwan-ai-usage-index

台灣 AI 使用指數 (TAUI) - 開源資料分析框架,測量分析台灣各地區 AI 技術採用率 | Taiwan AI Usage Index - Open-source framework for measuring regional AI adoption

ai-adoption anthropic-index bilingual data-analysis human-ai-collaboration onet-classification open-source policy-analysis privacy-protection python research taiwan tdd usage-index visualization

Last synced: 03 Oct 2025

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025