An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/sebastiansauer/hans-hackathon2025

Materials for a course on the evaluation of the AI student learn tool "HaNS"

ai data-analysis evaluaton r

Last synced: 04 Oct 2025

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 03 Apr 2026

https://github.com/gappeah/credit-card-transactions-fraud-detection-project

The Credit Card Transactions Fraud Detection Project repository is designed to analyse and detect fraudulent transactions in credit card data.

data-analysis postgresql sql

Last synced: 12 Jul 2025

https://github.com/arkww/matmap

Making maps from a Database and making the user guess which map is displayed

data-analysis data-science javascript python

Last synced: 24 Apr 2026

https://github.com/myktorijus/retention-cohort

Extracted cohort data using SQL in BigQuery focusing on weekly retention from week 0 to week 6

bigquery data-analysis data-visualization powerbi sql

Last synced: 13 Jul 2025

https://github.com/jhermienpaul/google-data-analytics-program

Hands-on learning materials from the 8-course Google Data Analytics Professional Certificate program, covering foundational data skills, tools, and real-world business problem-solving

bigquery dashboard data-analysis data-analytics data-modeling data-storytelling data-visualization data-wrangling descriptive-analytics diagnostic-analytics etl-pipeline r-programming rstudio sql tableau

Last synced: 13 Jul 2025

https://github.com/guilherme-marcello/r-data-analysis-piechart

Reading RDS files, processing and presentation in pie charts

data-analysis data-visualization pie-chart r

Last synced: 13 Jul 2025

https://github.com/htsandaruvan/attrition-analytics-suite-by-hello-green

I have created a comprehensive data analytics dashboard to identify factors contributing to attrition,

data-analysis data-analytics data-visualization powerbi

Last synced: 20 Jan 2026

https://github.com/simranrayait51/internshala-ds-projects

Projects from the Internshala Data Science course, showcasing my skills in Excel, SQL, Python, and Tableau for data manipulation, analysis, and visualization.

data-analysis data-science data-visualization excel internshala-project pgc postgresql python sql tableau

Last synced: 17 May 2026

https://github.com/percival33/machine-learning-engineering

Uni project about enhancing fictional music streaming service, by developing machine learning models to generate popular playlists

data-analysis data-science machine-learning python

Last synced: 14 Jul 2025

https://github.com/bonelesswater/tradingbot

This project is a web application for a trading bot that displays financial data and indicators. It includes functionality for researching financial data, displaying market indicators, and more.

ai azure css d3 data-analysis django html javascript jquery materializecss python stock-market

Last synced: 30 Dec 2025

https://github.com/arkww/chinesenewspaperwordcount

Analysis the word count of Chinese characters in Simplified and Traditional Chinese characters and comparing the results

chinese-language data-analysis data-science python

Last synced: 16 May 2026

https://github.com/caesaredia/chicago-taxi-data-insights

Exploratory data analysis and hypothesis testing on Chicago taxi trip data to uncover patterns in demand and the effects of rainy weather on travel time.

chicago data-analysis data-visualization exploratory-data-analysis hypothesis-testing python statistical-analysis taxi-trips weather-analysis

Last synced: 17 May 2026

https://github.com/theo-jenkins/fmri-brain-scan-analyser

MATLAB toolkit for reading, analysing and simulating rs-fMRI brain scans in .nii format.

algorithms data-analysis data-visualization fmri-data-analysis matlab neuroimaging

Last synced: 15 Jul 2025

https://github.com/priyanshubiswas-tech/farmlab-report-and-case-study-iot

This project was developed through live interviews and case studies with farmers in the year 2023 to address key agricultural challenges. The device provides real-time farm insights for better decision-making. Future plans include a digital portal, increased range, more sensors, and improved design. Open to collaboration!

arduino-ide c case case-study data data-analysis iot iot-device serialization

Last synced: 15 Jul 2025

https://github.com/miusarname2/proyectos-final-analitica-de-datos

Welcome to the repository where the magic of data analytics comes to life! This is the result of our effort and creativity in the subject of data analysis at the Universidad Cooperativa de Colombia (UCC). Here we keep everything we did to analyse data, draw cool conclusions and solve the workshop we were given. 🎯📊

data-analysis data-science data-visualization pip python

Last synced: 15 Jul 2025

https://github.com/jarrarshahid/nutrition-calculator

Simple python app to calculate nutritions in everyday meals.

data-analysis health json jupyter-notebook logic-programming python

Last synced: 15 Jul 2025

https://github.com/hosseinkarimi128/zed-one

An AI-powered assistant that analyzes CSV data using natural language queries to generate pandas code and visualizations.

ai-data-analysis automated-pandas automated-pandas-queries csv data-analysis fastapi langchain machine-learning matplotlib nlp openai pandas restful-api summarization visualization-tools

Last synced: 07 Apr 2026

https://github.com/vishnu-vamshii/layoffs-data-analysis-in-sql

This project focuses on the cleaning and exploratory analysis of a dataset containing layoff information. It includes data deduplication, standardization of columns, handling null and blank values, and analyzing layoffs by company, industry, country, and date. Various SQL queries are used to explore trends and patterns in layoffs over time.

data-analysis eda mysql

Last synced: 15 Jul 2025

https://github.com/viper373/lol-dataanalytics

腾讯游戏-英雄联盟赛事20/21/22年数据综合分析预测

crawler-python data-analysis jupyter-notebook lol python spider

Last synced: 15 Jul 2025

https://github.com/shrutiijoshi/airbnb-listing-reviews

Airbnb is an online marketplace that connects people who want to rent out their homes with travelers seeking accommodations.

data-analysis matplotlib-pyplot pandas-python python seaborn

Last synced: 17 May 2026

https://github.com/gagan8605/zepto_sql_analysis

This project explores and analyzes the inventory data of Zepto, a rapidly growing 10-minute grocery delivery platform in India. The dataset contains over 3,000+ SKUs across key product categories such as Fruits & Vegetables, Dairy, Beverages, Packaged Foods, and more. The analysis was performed using PostgreSQL, covering both data cleaning and bus

cleaning-data data-analysis database-management postgresql sql

Last synced: 16 Jul 2025

https://github.com/amr-yasser226/interactive-sales-analytics-dashboard

An interactive web-based dashboard for visualizing multinational electronics sales data. This project for the DSAI 203 course integrates a Python/Flask backend with an amCharts frontend to provide dynamic insights into product revenues, sales distribution, and employee statistics across different countries.

am5charts amcharts business-intelligence css dashboard data-analysis data-analytics data-visualization flask html javascript python sqlalchemy sqlite web-application

Last synced: 13 Apr 2026

https://github.com/ifigeneiatsiflidou/popular-items-sales-analysis

Two data tasks in Python: popular items by ZIP & store sales breakdown with plots.

data-analysis matplotlib pandas

Last synced: 16 May 2026

https://github.com/RLAlpha49/AniSearch-Model

AniSearchModel leverages Sentence-BERT (SBERT) models to generate embeddings for synopses, enabling the calculation of semantic similarities between descriptions. This allows users to find the most similar anime or manga based on a given description.

anime api data-analysis data-merging embeddings flask hugging-face-datasets kaggle-datasets machine-learning manga natural-language-processing nlp python sentence-bert similarity-search

Last synced: 06 May 2025

https://github.com/rachkat/random-foresst-analysis-r-studio-plotting-classification-tree

Classification analysis in R using the birthwt dataset. Built and compared Decision Tree and Random Forest models to predict low birth weight. Both achieved 71.05% accuracy, with Random Forest reducing overfitting and confirming maternal weight and age as key predictors.

classification data-analysis decision-trees machine-learning predictive-modeling r random-forest

Last synced: 04 Oct 2025

https://github.com/maheera421/pandas

Implementation of essential Pandas functions.

data-analysis data-manipulation pandas-dataframes pandas-datareader pandas-python

Last synced: 17 Jul 2025

https://github.com/venkat-023/thyroid-cancer-prediction

This project aims to develop a machine learning pipeline to predict thyroid cancer based on patient data. The dataset was sourced from multiple public repositories, cleaned, and merged to create a comprehensive dataset for modeling. Various classification algorithms were implemented, including Random Forest, Logistic Regression, K-Nearest Neighbors

data-analysis data-cleaning deep-learning ensembling-methods hyperparameter-tuning machine-learning-algorithms nueral-networks

Last synced: 17 May 2026

https://github.com/ofir-frd/predict-success-of-a-restaurant

Apply machine learning on a restaurante database. Study and analyse the data for prediction of a successful restaurant.

data-analysis data-science machine-learning visualization

Last synced: 11 Jun 2026

https://github.com/alfioma/ada-xtq

🔗 Simplify data transfer with ada-xtq, a lightweight tool for seamless integration and efficient handling of data between platforms.

ada algorithms api-development artificial-intelligence automation data-analysis data-visualization docker machine-learning neural-networks open-source programming python software-development xtq

Last synced: 01 May 2026

https://github.com/nikbarb810/covid_growth_rate_390.51

Exploring Covid Growth Rate of European Population using genetic data analysis

bioinformatics data-analysis r rcpp

Last synced: 07 Apr 2026

https://github.com/panoschatzi/erythrocyte_study_statistical_analyses

R code for data transformation, analysis and visualization of experimental data, as well as for statistical analyses and quantitative simulations.

afex data-analysis emmeans ggplot2 lme4 purrr r rprogramming rstats rstudio statistics tidyverse visualization

Last synced: 04 Apr 2025

https://github.com/ahmeddhus/exploring-football-data-analysis

Learning and exploring data analysis through real-world datasets using Python and StatsBomb APIs and mlpsoccer library

data-analysis jupyter-notebook mplsoccer python statsbomb

Last synced: 17 May 2026

https://github.com/cyblx/clustering

This project explores clustering techniques and supervised learning applied to World Cup team performance analysis. The methodologies include K-Means, DBSCAN, K-Nearest Neighbors, Gaussian Mixture Models (GMM), and Agglomerative Clustering.

clustering data-analysis dbscan gmm kmeans supervised-learning unsupervised-learning world-cup

Last synced: 18 Jul 2025

https://github.com/theveryhim/basic-data-analysis

Working with basic Python tools frequently used in data science

data-analysis data-processing visualization

Last synced: 18 Jul 2025

https://github.com/theveryhim/web-scraping-and-statistical-tests

Crawling web for data and perform statistical tests to verify judgments

data-analysis hypothesis-testing web-scraping

Last synced: 18 Jul 2025

https://github.com/theveryhim/massive-text-processing

cleaning, processing and analysis of papers' dataset in pyspark(rdd) framework

big-data data-analysis frequent-itemsets massive-datasets pyspark text-preprocessing

Last synced: 18 Jul 2025

https://github.com/abhinavhariyal/diwali-sales-analysis

This project is based on data visualization and analysis using python and jupyter notebook on the data for diwali sales.

data-analysis data-visualization jupyter python

Last synced: 19 May 2026

https://github.com/amyanchen/sf-airbnb

Exploratory Data Analysis of San Francisco Airbnb's

data-analysis data-science data-visualization r rmarkdown statistics

Last synced: 18 Jul 2025

https://github.com/lit26/data_jobs_analyzing

Data analysis for data jobs

data-analysis topic-modeling

Last synced: 26 Mar 2025

https://github.com/jelhamm/internode-hellinger-distance-based-decision-tree

Simulations for the paper "Inter node Hellinger Distance based Decision Tree by Pritom Saha Akash, Md. Eusha Kadir, Amin Ahsan Ali, Mohammad Shoyaib"

articles data-analysis data-mining decision-tree decision-tree-classifier hddt hellinger-distance-criterion machine-learning numpy-library paper-implementations python scipy-library simulation tree-node

Last synced: 04 Apr 2025

https://github.com/ashwin331133/sql-healthcare-data

This repository contains SQL queries designed to analyze health care data. The queries focus on patient demographics, encounter costs, and flu shot statistics, aiming to provide insights into patient behavior and financial impacts. The datasets include information on patient encounters, flu shots, and hospital admissions.

data-analysis mysql sql

Last synced: 29 Oct 2025

https://github.com/mfakhriazhar/housing-price-analysis

Determining the price of a house also depends on various factors such as building area, exterior quality, and amenities. This dataset provides information on properties for sale, and through Exploratory Data Analysis (EDA), patterns and key factors affecting house prices can be identified.

data-analysis data-science data-visualization eda exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/alexjackson1/commons-indicative-votes

A cluster analysis of the House of Commons' Indicative Brexit Voting Process on 27 Match 2019

data-analysis politics r

Last synced: 19 Jul 2025

https://github.com/stkisengese/numpy-data-fundamentals

A comprehensive collection of NumPy exercises covering array manipulation, slicing, broadcasting, random data generation, and real-world data analysis applications.

data data-analysis numpy pre-processing

Last synced: 16 May 2026

https://github.com/mh0386/motorcycle_data_analysis

Data analysis applied to motorcycle dataset.

data-analysis

Last synced: 19 Jul 2025

https://github.com/carlosvinimsouza/jupyter-notebook-basic

Armazenado todos os trabalhos referentes a Ciência de Dados.

data-analysis data-science programas-jupyter-notebook python

Last synced: 11 May 2026

https://github.com/mboula/mboula.github.io

GitHub portfolio + interactive resume | Showcasing data projects in civil rights (housing), cannabis, and analytics

cannabis case-study civil-rights compliance dashboards data-analysis data-cleaning data-vizualization excel google-data-analytics housing open-data pattern-analysis portfolio pro-se public-data r sql tableau

Last synced: 10 Jul 2025

https://github.com/malexandersalazar/tools-python-mssql-statistics-descriptor

A lightweight tool based on sweetviz that generates high-density visualizations to kickstart Exploratory Data Analysis within Microsoft Azure SQL Database using ODBC with just one line of code

azure-sql-database data-analysis data-visualization eda python

Last synced: 16 May 2026

https://github.com/katarinatmb/serbia-protest-analysis

This project analyzes the frequency, regional distribution, and group characteristics of protests that emerged across Serbia following the fatal collapse of the Novi Sad train station roof in November 2024. The analysis explores how different communities responded in the aftermath of the disaster, using data visualization in RStudio

data-analysis data-visualization r r-mark rstudio

Last synced: 10 Jul 2025

https://github.com/preciousclement/maternal-experiences-in-nigeria

This repository contains a Python-based project that generates realistic synthetic data simulating the maternal health journey of 5,000 women in Nigeria.

data-analysis data-generation maternal-health nigeria public-health python

Last synced: 08 May 2025

https://github.com/jm199504/data-analysis-practice

数据分析练习(Titanic / BankCustomers)

data-analysis python

Last synced: 02 May 2026

https://github.com/colindean/allegheny_voter_reg_analysis

Allegheny County Voter Registration Analysis Tools

data-analysis data-science elections pandas polars python voting

Last synced: 16 May 2026

https://github.com/sharoonjoseph321/social_media_eda

Data Analysis on social media apps ,using pandas, python, matplotlib.

data data-analysis data-science data-visualization matplotlib programming-language project python pythonprojects

Last synced: 03 Mar 2025

https://github.com/gaurav-van/data-analysis-projects

Collections of Projects that involves Data Analysis and Informed Decision Making

data-analysis database powerbi sql

Last synced: 06 Sep 2025

https://github.com/josafary-ds/curso_dnc

Repositório para armazenamento dos arquivos de estudo e projetos DNC - Cientista de Dados

data-analysis data-science data-visualization machine-learning powerbi python

Last synced: 13 Mar 2025

https://github.com/dsite42/simple_data_visualizer

This is a simple tool to visualize data for a quick Exploratory Data Analysis (EDA). You can create various plot types as seaborn or plotly plot via a GUI in multiple windows (RelPlot, PairPlot, JointPlot, DisPlot, CatPlot, LmPlot, 3DPlot).

data-analysis data-science data-visualisation data-visualization eda exploratory-data-analysis plotly seaborn

Last synced: 12 May 2026

https://github.com/engraulleite/local-data-warehousing-with-docker

Creating a DW from 0 to hero. Starting with logical and physical modeling to valuable reports.

airbyte data-analysis datawarehouse docker etl-pipeline metabase pgadmin4 postgresql

Last synced: 01 May 2026

https://github.com/iamber12/stack-overflow-analysis-using-stack-exchange-api

This Python-based project utilizes the Stack Exchange API to analyze StackOverflow data, focusing on the 'R' and 'Dot Net' programming tags.

data-analysis data-visualization python stack-exchange-api

Last synced: 20 Jul 2025

https://github.com/datalopes1/fifa21_datacleaning

Neste projeto será feito o processo de limpeza e manipulação a partir do dataset FIFA 21 messy, raw dataset for cleaning/ exploring, que pode ser encontrado no Kaggle, com licensa CC0: Public Domain e enviado por Rachit Toshniwal.

data-analysis data-cleaning python

Last synced: 30 Apr 2026

https://github.com/leoz0214/foodhygieneanalysis

Data analysis regarding Food Hygiene Ratings in England, Wales and Northern Ireland.

data-analysis food-hygiene-ratings pandas python

Last synced: 17 May 2026

https://github.com/dhruvil-26/sql-projects

This repository contains SQL projects focusing on data analysis and insights. Currently, it includes: 1. RSVP Movies Analysis - SQL queries to analyze movie trends, ratings, and genres. 2. Pizza Sales Analysis - SQL queries to explore sales patterns, customer behavior, and profitability in a pizza business.

analysis data-analysis database mysql pizza-sales-analysis rdbms rsvp sql

Last synced: 17 May 2026

https://github.com/fatihilhan42/wnba-draft-player-dataanalysis-1997-2022-with-python

In this project, the statistics of the players in the WNBA drafts from 1997 to 2022 were examined. The data in the dataset, which you can find in the repo, was first organized using data cleaning algorithms. These cleaned data were then graphically extracted using data visualization algorithms.

data-analysis data-analysis-python data-visualization jupyter-notebook python

Last synced: 17 May 2026

https://github.com/martachesnova/sql

Performing data modeling (ERD) and data engineering. Then, writing series of SQL queries to analyze Employee Database of a company.

data-analysis data-engineering data-modeling erd postgresql sql

Last synced: 16 May 2026

https://github.com/aelmah/ibm-applied-ds

Find here : A collection of projects I've done throught Applied DS Specialization !

applied-data-science-capstone beautifulsoup data-analysis data-visualization machine-learning python-for-ai-and-data-science web-scraping

Last synced: 11 Sep 2025

https://github.com/marcosvbras/udacity-nd109-project-titanic

Data Analysis project to Udacity Nanodegree's course: Artificial Intelligence Programming with Python.

data-analysis data-analyst-nanodegree data-science jupyter-notebook machine-learning python udacity

Last synced: 19 May 2026

https://github.com/hassanislam463/british-airways-data-science

Analyze Skytrax reviews to uncover customer sentiments and key themes while predicting booking behavior using machine learning. This repository includes data collection, analysis, and modeling scripts alongside concise, visualized insights to improve customer experience and operational efficiency.

data-analysis data-science data-visualization

Last synced: 28 Mar 2025

https://github.com/deliprofesor/2024-salary-analysis-for-machine-learning-engineers

This project analyzes a salary dataset to explore factors like experience, company size, remote work ratio, and country. It includes data cleaning, group analysis, visualizations, and machine learning models (linear regression and Random Forest) to predict salaries and identify key features.

data-analysis data-cleaning data-visualization ggplot2 linear-regression machine-learning plotly r-programming random-forest salary-prediction salary-trends

Last synced: 07 Mar 2026

https://github.com/wesleych3n/my-work-log

A self project to record and analyze work's check in/out time on google sheet with telegram bot.

data-analysis telegram-bot worklog

Last synced: 20 Jul 2025

https://github.com/hassanislam463/sentiment_analysis_of_financial_news_headlines_and_affect_on_stock_price_prediction

This project analyzes financial news sentiment using a fine-tuned RoBERTa model and integrates it with stock data to predict price movements using LSTM and GRU. It highlights the role of sentiment in enhancing stock market forecasting.

data-analysis data-science data-visualization deep-learning lstm-neural-networks nlp-machine-learning

Last synced: 28 Mar 2025

https://github.com/sangampaudel530/bhutan-rainfall-explorer

Interactive dashboard to explore, analyze, and forecast rainfall trends in Bhutan (2021–2025) using Streamlit, Plotly, and Prophet.

bhutan climate-change data-analysis prophet-facebook rainfall-prediction streamlit visualization

Last synced: 17 May 2026

https://github.com/saksham-jain177/data-analysis

A collection of data analysis and machine learning projects across various datasets. Explore predictive modeling, data visualization, and insights from real-world data. Projects include sales predictions, disease detection, customer segmentation, and more.

api data data-analysis data-cleaning data-science data-visualization datamodeling dataset datasets exploratory-data-analysis python python3 web-scraping youtube-api

Last synced: 01 May 2026

https://github.com/arv-anshul/pw-experience-portal

Data Analysis on PW Skills and Ineuron.ai experience/internship portal.

data-analysis experience ineuron-ai internship physics-wallah portal pw-skills python3

Last synced: 16 Apr 2026

https://github.com/chrisrobertsjr/chrisrobertsjr

Welcome to my Github Profile!

data data-analysis java r sql statistics

Last synced: 03 May 2026

https://github.com/satyacoder29/smartfinance-dynamic-financial-dashboard

SmartFinance: Dynamic Financial Dashboard is an interactive tool designed to visualize key financial metrics like revenue, expenses, and profit. It features real-time data updates, charts, slicers, and navigation for easy analysis. This dashboard helps businesses make data-driven decisions and optimize financial performance.

data-analysis data-cleaning data-modeling data-visualization powerbi powerbi-desktop powerbi-visuals powerquerym

Last synced: 13 Feb 2026

https://github.com/nick-peter-marcus/chocolate-bar-analysis

Analyzing Chocolate Bar Features and Ratings - Data Visualization, Decision Trees, Random Forest

data-analysis data-visualization decision-trees python random-forest seaborn sklearn

Last synced: 10 May 2026

https://github.com/karishmagupta05/e-commerce-sales-dashboard

This project is an interactive E-Commerce Sales Dashboard built using Power BI. It provides key insights into sales, profit, and customer behavior through visually engaging charts and graphs.

data-analysis data-visualization powerbi

Last synced: 09 Feb 2026

https://github.com/surajsanap/employee-resigning-analysis-powerbi-dashboard-data-analytics

Effortlessly analyze employee resignations with our concise Power BI dashboard. Download the XML file, open the dashboard, and gain quick insights into resignation trends and reasons for departure. Streamlined and effective

dashboard data-analysis data-analytics powerbi python xml-dataset

Last synced: 08 May 2025