An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/takshak26/predict_blood_donations-

The project is titled “Predict Blood Donations”. It uses Python as the language, data science and machine learning as the field of operation, the TPOT library for model selection, logistic regression for model building, and Jupyter Notebook as the code editor.

data-analysis data-visualization datascience machine-learning python3

Last synced: 16 May 2026

https://github.com/jotstolu/netflix-sql-data-analysis-project

This project explores the Netflix dataset using SQL queries to uncover trends, patterns, and business insights that could help stakeholders understand content distribution, viewer preferences, and platform optimization

data-analysis sql sql-server tsql

Last synced: 02 Aug 2025

https://github.com/nushratjabenaurnima/cse_477_data_mining

A collection of labs, reports, Jupyter notebooks, and project outputs for the CSE 477 Data Mining course. This repository tracks my learning journey through data preprocessing, association rules, clustering, classification, and real-world data analysis with Python.

data data-analysis data-mining data-science google-colab-notebook jupyter-notebook machine-learning python python-3

Last synced: 09 Apr 2026

https://github.com/jedrzej-wydra/competition-cooperation

Competition, cooperation, and parental effects in larval aggregations formed on carrion by communally breeding beetles Necrodes littoralis (Staphylinidae: Silphinae)

data-analysis non-linear-regression r

Last synced: 20 Aug 2025

https://github.com/nahiyanhkhan/stock-market-data-analysis_capstone-project

Assignments on SQL and Python completed during a course, with a final capstone project analyzing "Stock Market Data". The author reports a 100% score on every assignment.

data-analysis data-analytics matplotlib mysql mysql-database numpy pandas python sql

Last synced: 09 Apr 2026

https://github.com/rajesh9943/web-scraping-analysis-of-top-us-company-revenue-growth-in-2023

Explore the landscape of US business growth in 2023 with our dynamic project, 'Web Scraping for US 2023 Revenue Growth.' Utilizing advanced web scraping techniques, we unveil insights into the top companies driving economic expansion.

cleaning-data data data-analysis data-visualization manipulation numpy pandas pre-fill

Last synced: 16 Aug 2025

https://github.com/syed-amjad-ali/restaurant-sales-sql-project

This was a simple SQL project where I analyzed restaurant sales data, showcasing skills in data creation and querying. The project explores menu performance, order trends, and customer insights.

aggregations business-intelligence data-analysis guided-project joins maven-analytics querying restaurant-sales sales-data sql subqueries

Last synced: 03 Jan 2026

https://github.com/quesocosteno03/data-analysis-projects

This repository serves as a collection of all my projects.

data-analysis jupyter-notebook powerbi

Last synced: 02 Aug 2025

https://github.com/manish506/loan-approval-prediction

Explore predictive modeling in this project by applying classification techniques to a loan approval dataset. Analyze and preprocess the data, then use models like K-Nearest Neighbors, Random Forest, SVC, and Logistic Regression to predict loan outcomes. Gain insights into approval factors and enhance prediction accuracy.

classification classification-models data-analysis data-science jupyter-notebook loan-approval-prediction machine-learning predictive-analytics predictive-modeling project python

Last synced: 19 Jan 2026

https://github.com/yamslam/contentsunderpressure_processing

A repository for data processing and analysis for Contents Under Pressure.

data-analysis data-processing data-visualization game-based-learning judgments process-safety

Last synced: 07 Sep 2025

https://github.com/idaraabasiudoh/credit_card_fraud_detection

This repository contains a machine learning project focused on detecting credit card fraud using Decision Tree and Support Vector Machine (SVM) classifiers.

data-analysis jupyter-notebook machine-learning python3 scikit-learn snapml

Last synced: 19 Feb 2026

https://github.com/thedevreda/jadaerospace

A real-life project showing how to improve sales of aircraft parts and help salespeople focus on the most effective products at JadAero.

data data-analysis data-cleaning data-visualization jupyter-notebook powerbi python

Last synced: 02 Aug 2025

https://github.com/abdullahashfaqvirk/PowerBI-Dashboards

A collection of Microsoft Power BI dashboards and reports designed to address business challenges and support data-driven decision-making.

dashboards data-analysis data-driven data-science microsoft powerbi reports visualization

Last synced: 27 Sep 2025

https://github.com/borjamome/explorando-madrid

Exploring Madrid: A Data-driven Analysis with R 🐻🌳

data-analysis data-visualization madrid r

Last synced: 26 Mar 2025

https://github.com/lc-rezende/eqx_boston_dataset

Exploratory data analysis, clustering, and forecasting on Boston crime data (2011-2015), revealing key crime trends, hotspots, and temporal patterns to support data-driven insights for urban safety and policing strategies.

data-analysis exploratory-data-analysis jupyter-notebook kmeans matplotlib numpy pandas prophet-facebook python scikit-learn seaborn

Last synced: 09 Apr 2026

https://github.com/borjamome/top-goleadores

The best forwards in Europe according to the data

data-analysis data-visualization football-analytics r

Last synced: 26 Mar 2025

https://github.com/faint-liebfraumilch101/fraud-detection-sql-unsupervised

🕵️‍♂️ Detect fraud in bank transactions using SQL for feature engineering and Python's Isolation Forest for unsupervised anomaly detection.

anomaly-detection banking-data data-analysis data-science financial-analytics fraud-detection isolation-forest machine-learning portfolio-project python sql sqlite unsupervised-learning

Last synced: 07 May 2026

https://github.com/tashi-2004/apache-flink-spark-data-streaming

This project showcases a real-time data streaming pipeline using Apache Flink, Apache Spark, and Grafana. It streams data, stores it in Parquet format, and performs aggregations for insights, with seamless visualization via Grafana dashboards.

apache-flink apache-spark data-aggregation data-analysis data-science data-streaming data-visualization flink flink-stream-processing flink-streaming grafana-dashboard grafana-plugin pyflink python3

Last synced: 09 Feb 2026

https://github.com/0xnu/england-house-prices

Predict house prices for the next five years across all English local authorities.

data-analysis england england-house-prices housing-market housing-market-analysis predictive-modeling regression

Last synced: 03 Aug 2025

https://github.com/mxagar/space_exploration

This repository is a collection of mini-projects and tutorials related to space images and geo-spatial data.

data-analysis deep-learning geospatial machine-learning

Last synced: 29 Sep 2025

https://github.com/mhkamel/ecommerce-targeting-system

A Flask-based E-Commerce Targeting System that provides customer segmentation and personalized product recommendations. Users can upload structured interaction data for analysis, receive AI-driven recommendations, and gain insights into user behavior. The application is built with Flask, Pandas, Scikit-Learn, and integrates an interactive web inter

ai bootstrap csv-processing customer-segmentation data-analysis data-science e-commerce flask machine-learning pandas python recommendation-system scikit-learn user-behavior web-application

Last synced: 09 Apr 2026

https://github.com/hari00887/analysis-of-global-terrorism

Analysis of Global Terrorism Using AHP: a quantitative study of GTD data to assess attack severity and evolution across time and space.

data-analysis data-visualization powerbi

Last synced: 02 Mar 2026

https://github.com/rahulsm20/car-data

A data analytics project that involves analyzing a car dataset that includes information on various car brands, years, prices, mileage, and fuel types, in order to gain insights into the car market.

data-analysis data-analytics matplotlib numpy pandas python

Last synced: 09 Apr 2026

https://github.com/asghar-rizvi/hotel_reservation_data_analysis

This project involves a comprehensive data analysis of a hotel reservation dataset using Excel. The primary focus is on examining reservation cancellations through detailed analysis and visual representation.

dashboard dashboard-templates data-analysis data-analysis-excel data-representation data-science excel

Last synced: 02 Mar 2026

https://github.com/lyubov0406/data_analyst_portfolio

This repository collects pet projects that demonstrate my data analytics skills.

data-analysis matplotlib numpy pandas portfolio python scipy seaborn sql tableau visualization

Last synced: 09 Apr 2026

https://github.com/PanosChatzi/Healthcare_and_Bioinformatics_Analyses

This repo contains the final assignments of the Data Analyst bootcamp by Workearly. Python and SQL were used to complete the assignments.

data-analysis data-cleaning data-visualisation jupyter matplotlib pandas python seaborn

Last synced: 05 Aug 2025

https://github.com/mikhaelmounay/salty-med

Salty Mediterranean - Grade 12 Data Analysis & Visualization Capstone Project

data-analysis data-visualization

Last synced: 02 Feb 2026

https://github.com/shrutiijoshi/corporate-campus-hiring-analysis

This project analyzes corporate campus hiring trends for fresh graduates in India.

dashboard data-analysis data-visualization excel powerbi

Last synced: 09 Mar 2026

https://github.com/elissorokin/data-analyst-portfolio

This is a repository where I demonstrate my skills, share projects, and track my progress in data analysis and data science.

ab-testing data data-analysis datalense matplotlib numpy pandas plotly portfolio postgresql python scipy seaborn sql statistical-analysis

Last synced: 09 Apr 2026

https://github.com/acerbilab/svbmc

Stacking Variational Bayesian Monte Carlo (S-VBMC) algorithm for combining Variational Bayesian Monte Carlo (VBMC) posteriors to boost inference performance.

bayesian-inference data-analysis machine-learning model-fitting python stacking variational-inference

Last synced: 20 Jan 2026

https://github.com/theashishmavii/job-trends-analyzer-automation

End-to-end automation: job scraping, data analysis, and trends reporting for job seekers and researchers.

automation beautifulsoup data-analysis open-source pandas python selenium webscraping

Last synced: 07 Aug 2025

https://github.com/fortunewalla/birdstrikes

Birdstrikes database created for PostgreSQL, with simple sample queries.

birdstrikes csv data-analysis data-science database dataset pgsql postgresql practice sample sql sql-query workshop

Last synced: 02 Oct 2025

https://github.com/omdoshi13/pricing-of-laptops-using-ml

Data Analysis, training Machine Learning models, and Model Evaluation and Refinement for Pricing of Laptops dataset.

data-analysis data-analysis-project datascience google-colab jupyter-notebook machine-learning matplotlib model-evaluation model-refinement numpy pandas python scikit-learn

Last synced: 09 Apr 2026

https://github.com/sebastianofazzino/ibm-data-science-professional-certificate

In this repository I've stored exercises and projects I've been working on while attending the IBM Data Science Professional Certificate, using Python and its libraries.

data-analysis data-mining data-science data-structures data-visualization database machine-learning matplotlib numpy pandas python regression seaborn sql

Last synced: 09 Apr 2026

https://github.com/namratagulati/tweets_analysis

This repository focuses on sentiment analysis of Twitter data using Python, Natural Language Processing (NLP), and the Natural Language Toolkit (NLTK). The goal is to extract valuable insights from social media discussions, such as word frequency, hashtag trends, and sentiment patterns.

analysis data-analysis natural-language-processing nlp-machine-learning nltk-corpus nltk-python sentiment-analysis twitter-sentiment-analysis

Last synced: 07 Aug 2025

https://github.com/v41bh4vr4jput/data-analysis-with-python

This repository is a comprehensive collection of data analysis projects and tutorials using Python's most powerful libraries: NumPy, Pandas, Seaborn, and Matplotlib. It is designed to help you explore, clean, visualize, and analyze data efficiently.

api data data-analysis data-visualization matplotlib numpy pandas python sakila-db seaborn

Last synced: 09 Apr 2026

https://github.com/gmasson/datadash

DataDash is a JavaScript and CSS library for creating interactive dashboards to visualize dynamic data on web pages.

dashboard dashboard-application dashboards data-analysis data-science data-visualization javascript

Last synced: 08 Aug 2025

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026

https://github.com/jagoda11/elastic-vision

This repository contains a full-stack application designed to explore data from Elasticsearch 🧐 indices and visualize it using charts and graphs. The backend is built using Node.js and the frontend is powered 🚀 by React.

backend chartjs dashboard-development data-analysis data-visualization docker elasticsearch frontend fullstack javascript material-ui monorepo mui-x node pie-chart react restful-api tables

Last synced: 09 Apr 2026

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/thc1006/taiwan-ai-usage-index

Taiwan AI Usage Index (TAUI): an open-source data analysis framework for measuring and analyzing AI technology adoption across Taiwan's regions.

ai-adoption anthropic-index bilingual data-analysis human-ai-collaboration onet-classification open-source policy-analysis privacy-protection python research taiwan tdd usage-index visualization

Last synced: 03 Oct 2025

https://github.com/devexpress-examples/web-forms-pivot-grid-calculate-running-totals

This example demonstrates how to calculate running totals in Pivot Grid for Web Forms.

asp-net-web-forms data-analysis dotnet pivot-grid pivot-grid-for-web-forms

Last synced: 08 Aug 2025

https://github.com/muneeb706/human_activity_recognition

This project performs data cleaning and data exploration steps for the Human Activity Recognition Using Smartphones Data Set in the R programming language.

data-analysis data-cleaning data-exploration r-programming

Last synced: 08 Aug 2025

https://github.com/akunna1/energy-data-analysis-unc-campus

Link to Report: https://adminliveunc-my.sharepoint.com/:w:/r/personal/tadennis_ad_unc_edu/Documents/Capstone%20Group/Final%20Report%20Draft.docx?d=wba9e7182a9b948898133e4f89def1d90&csf=1&web=1&e=fQGAfy

arcgis-pro data-analysis dplyr excel geospatial-data-analysis ggplot ggplot2 lubricants tidyr tidyverse

Last synced: 08 Aug 2025

https://github.com/jakobzmrzlikar/trg-dela

Data analysis of student job offers.

data-analysis ipython-notebook web-scraping

Last synced: 09 Aug 2025

https://github.com/busradeveci/odev2-branching

This project is prepared for Artificial Intelligence and Technology Academy Git GitHub Assignment 2. Using the “Wine Reviews” dataset from Kaggle, it converts wine ratings into star ratings and analyzes them.

data-analysis kaggle-dataset python wine-reviews-dataset

Last synced: 03 Oct 2025

https://github.com/lashawnfofung/super-heroes-analysis-project

This portfolio project involves a detailed analysis of 732 superhero records from the heroes_information.csv dataset, comprising 11 columns of unique characteristics for each hero. The primary goal is to showcase key insights derived from this rich dataset, demonstrating proficiency in data analysis using SQL.

data-analysis datasets mysql-database mysql-server mysql-workbench sql

Last synced: 07 Jul 2025

https://github.com/yash22222/data-analysis-on-real-time-social-media-comments

EngageInsight analyzes user interactions in comment data. It provides insights through visualizations created using Python libraries like Pandas and Matplotlib. The project aims to uncover patterns and trends in user engagement. The visualizations provide an overview of comment lengths and the frequency of different types of replies.

data-analysis data-cleaning-and-preprocessing data-visualization matplotlib pandas pattern-recognition real-time-social-media-data seaborn trend-analysis

Last synced: 14 May 2026

https://github.com/svetlanam/pycon-workshop

PyCon CZ workshop: Better data analyses and product recommendations with Instagram data

data-analysis data-science martinus matplotlib pandas pycon2016 pyconcz python scikit-learn workshop

Last synced: 09 Apr 2026

https://github.com/abhigyan126/prompt2query

A Python desktop application for streamlined data analysis, enabling users to generate and execute Pandas and SQL queries with ease. It focuses on reducing analysis time through an intuitive interface and efficient workflows.

data-analysis data-science data-visualization database gemini generative-ai ide llm pandas pandas-interface python sql-interface

Last synced: 13 Feb 2026

https://github.com/brunomontezano/digital-interventions-for-depression

📱 "Digital interventions for depressive symptoms: a randomized clinical trial" code

academia clinical-trials cognitive-behavioral-therapy data-analysis digital-health open-science smartphone-app

Last synced: 03 Oct 2025

https://github.com/blackcub3s/msc-finalthesis

The most important programming files, code functions and data processing pipelines for the Machine learning final thesis of my Master's degree. Also, the LaTeX code of the thesis.

data-analysis latex machine-learning numpy python sklearn

Last synced: 09 Apr 2026

https://github.com/dcostachar/telco-customer-churn-dashboard

An interactive Tableau dashboard using the Telco Customer Churn dataset to analyze key drivers of customer churn and develop data-driven retention strategies for the telecommunications industry.

business-intelligence customer-churn-analysis data-analysis data-visualization marketing-analytics tableau

Last synced: 09 Mar 2026

https://github.com/alan-oliveir/state-of-data-2022

In this project I analyze the distribution of salary ranges for junior-level professionals in data analyst, data scientist, and data engineer roles.

data-analysis jupyter-notebook pandas-python seaborn-python

Last synced: 03 Oct 2025

https://github.com/m-coder-umer/sales-dashboard-power-bi-project

An interactive Sales Dashboard built with Power BI using MySQL data, showcasing monthly trends, top-performing products, and key sales KPIs (Key Performance Indicators).

business-intelligence data-analysis data-cleaning data-modeling data-visualization dax interactive-dashboard mysql power-query powerbi sales-dashboard sql time-series-analysis

Last synced: 07 Jul 2025

https://github.com/susshiii/sql-layoffs-data-cleaning-and-eda

Full SQL project using MySQL to clean and analyze a real-world tech layoff dataset from 2020–2023.

data-analysis data-analytics-project data-cleaning eda layoffs mysql sql

Last synced: 07 Jul 2025

https://github.com/mkoeppe/jiawei-computations

Computations supporting Chapters 2 and 3 of Jiawei Wang's dissertation "Subadditivity of Piecewise Linear Functions", UC Davis, Ph.D. program in Mathematics, 2020

benchmark-framework branch-and-bound cluster cutting-planes data-analysis hpc integer-programming reproducible-research sagemath

Last synced: 10 Aug 2025

https://github.com/hemangsharma/hotel-revenue-booking-analysis

This project provides a comprehensive revenue and reservation analysis for Highfield Hotel using historical data exported from booking systems and internal revenue reports. The goal is to derive actionable insights to improve room profitability, understand booking patterns, and support data-driven decision-making.

analysis data-analysis data-visualization hotel

Last synced: 10 Aug 2025

https://github.com/nafisrayan/decentai

A comprehensive platform built using ReactJS and Flask, combining blockchain technology with AI to create a secure and intelligent space for community engagement and policy discussions. Leverages NLP and LLM for meaningful interactions and sentiment analysis while ensuring data security and user privacy.

chatbot data-analysis data-visualization flask gemini gemini-ai gemini-ai-chatbot gemini-api government government-tech llm mongodb nlp polls python react tailwind voting-systems winknlp

Last synced: 12 Apr 2026

https://github.com/ifigeneiatsiflidou/applied-statistics-project

Project for an Applied Statistics course, involving exploratory data analysis and predictive modeling of movie revenue using engineered features and multiple linear regression.

correlation-analysis data-analysis linear-regression python scikit-learn visualization

Last synced: 29 Apr 2026

https://github.com/r8vnhill/hdp

Hentai data processing

data-analysis e-hentai hentai kotlin

Last synced: 02 Apr 2025

https://github.com/antononcube/java-tilestats

Java package for statistics over 2D tilings. (Tile binning, aggregation functions application, etc.)

data-analysis hexagonal-grids

Last synced: 02 Apr 2025

https://github.com/nuraj250/datainsighthub

A Node.js backend application that processes and analyzes personal user data to generate personalized insights and recommendations. It features secure user authentication, data upload and storage, custom algorithms for data analysis, and optional real-time notifications and third-party API integrations. Perfect for showcasing backend development

api-development backend-development bcrypt data-analysis data-analytics data-insights dotenv express jwt-authentication mongodb nodejs passport secure-api user-authentication

Last synced: 09 Apr 2026

https://github.com/srikarveluvali/dataanalysis

The "Dataset - Extraction, Analysis, and Visualization" project is a Python-based data analysis venture that focuses on exploring and interpreting the "Video Game Sales Analysis" dataset.

css data-analysis html javascript matplotlib numpy pandas python seaborn tableau

Last synced: 09 Apr 2026

https://github.com/gui-sitton/prepaid

In this project I work as an analyst for the telecommunications company Megaline. The company offers its customers prepaid plans, Surf and Ultimate. The sales department wants to know which plans bring in the most revenue in order to adjust the advertising budget

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 22 May 2026

https://github.com/antononcube/wl-tilestats-paclet

Wolfram Language (aka Mathematica) paclet for statistics over 2D tilings. (Tile binning, aggregation functions application, etc.)

2d-data data-analysis geospatial-data mathematica wolfram-language

Last synced: 20 Mar 2026

https://github.com/andrii04/andreamonforte-bi-assignment

Automated Data Pipeline that ingests daily GA4-formatted CSV files from a private Google Cloud Storage bucket, validates and loads them into BigQuery, and prepares analysis-ready views. The solution is built for deployment as a Cloud Function triggered by Cloud Scheduler and uses Python with the Google Cloud Storage and BigQuery client libraries.

automation bigquery cloud cloudfunctions data data-analysis data-engineering etl etlpipeline gcp google googlecloudplatform pipeline python sql

Last synced: 09 Nov 2025

https://github.com/gutow/langmuir_trough

Code to run a homebuilt Langmuir trough using Jupyter and Python.

data-acquisition data-analysis jupyter langmuir-trough plotting

Last synced: 11 Aug 2025

https://github.com/rajkumargara/bike_rental_data_analysis

Chicago bike rental data analysis for business insights using R programming

data-analysis data-visualization data-wrangling large-dataset machine-learning-algorithms

Last synced: 11 Aug 2025

https://github.com/jovicdev97/Financial-Loan-DataScience-Notebook

Using NumPy and pandas to analyze a synthetic loan dataset with Python.

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 12 Mar 2025

https://github.com/erayagdogan/simplecharts

Simple Charts is a chart-making Jetpack Compose app with Material 3 design. Charts are created using the lets-plot-compose library.

android android-app charts data-analysis data-visualization jetpack-compose lets-plot-kotlin material-3 viewmodel

Last synced: 11 Aug 2025