An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/mindlessmuse666/missing-data-processing

A project on handling missing values in Titanic passenger data using the Python libraries Matplotlib and Seaborn.

data-analysis data-visualization matplotlib missing-values-analysis missing-values-handling pandas python seaborn titanic

Last synced: 16 May 2026

https://github.com/jhrcook/protein-language-models

Experimenting with protein language model predictions

data-analysis protein-language-model variant-effect-prediction

Last synced: 28 May 2026

https://github.com/amishidesai04/interactive-data-visualisation-tool

A Java-based application leveraging JavaFX to create dynamic and interactive charts, including pie charts, bar charts, and line graphs. Ideal for visualizing various datasets, this tool offers customizable features and a user-friendly interface. Easily input and manage data, customize chart styles, and observe trends and patterns effectively.

charts data-analysis data-visualisation data-visualization-project gui java javafx visualization-tools

Last synced: 17 Apr 2026

https://github.com/sukitsubaki/screen-time-tracker

A minimalist Python tracker that records the usage time of various applications and provides insights into your computer usage habits.

application-usage data-analysis monitoring productivity python python-cli screen-time time-tracking

Last synced: 12 Apr 2025

https://github.com/andrewzgheib/football-database-analysis

Football database utilizing PostgreSQL and Pandas for data management, with PowerBI for intuitive KPI visualization

data-analysis data-visualization database pandas pgsql postgr powerbi sql

Last synced: 04 Apr 2025

https://github.com/nerooc/device-downtime-detection

Repository for a project from the "Artificial Neural Networks" course.

data-analysis detection-model recurrent-neural-networks

Last synced: 22 Mar 2025

https://github.com/timkong21/siemens-mobility-operations-industrial-engineer-simulation

Operations Industrial Engineer job simulation with Siemens Mobility. Includes time study analysis to identify assembly bottlenecks (Task 1) and a proposed layout redesign to improve efficiency without automation (Task 2).

data-analysis forage industrial-engineering job-simulation manufacturing process-improvement production-engineering python siemens time-analysis

Last synced: 19 May 2026

https://github.com/akunna1/energy-data-analysis-unc-campus

Link to Report: https://adminliveunc-my.sharepoint.com/:w:/r/personal/tadennis_ad_unc_edu/Documents/Capstone%20Group/Final%20Report%20Draft.docx?d=wba9e7182a9b948898133e4f89def1d90&csf=1&web=1&e=fQGAfy

arcgis-pro data-analysis dplyr excel geospatial-data-analysis ggplot ggplot2 lubricants tidyr tidyverse

Last synced: 08 Aug 2025

https://github.com/lopez86/datascienceexamples

Examples of various data science & data analysis topics using various sources of data.

data-analysis data-science pandas scikit-learn tutorial visualization

Last synced: 13 Apr 2026

https://github.com/sharduljunagade/human-activity-recognition

This repository contains the code for the Assignment-1 of the course ES 335: Machine Learning 2024 at IIT Gandhinagar taught by Prof. Nipun Batra.

data-analysis data-collection decision-trees groq-api human-activity-recognition jupyter langchain-python machine-learning pandas prompt-engineering python sklearn tsfel

Last synced: 08 Apr 2026

https://github.com/nagar2nd/zomato-bangalore-analysis-tableau

Analysing restaurant data in Bengaluru to enhance customer satisfaction by optimizing the restaurant experience. The focus is on improving the popularity of different cuisines, enhancing delivery times, and boosting restaurant ratings. An interactive Tableau dashboard has been developed to help Zomato identify key areas for improvements.

data-analysis data-visualization tableau

Last synced: 05 Mar 2026

https://github.com/shubhamgoyal575/credit-card-fraud-detection

📌 Credit Card Fraud Detection using Machine Learning. This project focuses on detecting fraudulent credit card transactions using machine learning models like Random Forest, XGBoost, and Deep Learning. The dataset is preprocessed to handle class imbalance, and multiple models are evaluated based on ROC AUC Score and F1 Score.

adaboost-classifier artificial-neural-networks credit-card-fraud data-analysis data-cleaning data-preprocessing data-science data-visualization deep-learning exploratory-data-analysis lightgbm machine-learning machine-learning-algorithms random-forest-classifer scikit-learn tensorflow xgboost

Last synced: 08 Feb 2026

https://github.com/swatisinghit/e-commerce-trend-analysis-for-target

An exploratory and in-depth study of the E-Commerce sales data for a Brazilian store using SQL.

bigquery data-analysis mysql sql

Last synced: 19 May 2026

https://github.com/amarlearning/exploring-the-evolution-of-linux

Data Analysis about the development of the Linux operating system by exploring its Git repository history.

cleaning-data data data-analysis data-wrangling datacamp first-commit git-history linux

Last synced: 12 May 2026

https://github.com/imnotamr/datasets-used

A comprehensive collection of datasets for machine learning and data science projects, covering topics from advertising and sales to health and sports analytics

ai classification data-analysis data-science data-visualization deep-learning jupyter-notebook machine-learning models python regression-models

Last synced: 19 May 2026

https://github.com/mulukensholaye/spark_kafka_streaming_csv

Real-time streaming data analysis pipeline that integrates Apache Spark's streaming library to read records from a Kafka topic.

apache-kafka apache-spark data-analysis python3 realtime-messaging

Last synced: 19 May 2026

https://github.com/syed-amjad-ali/airbnb-listing-analysis

Analyzing AirBnB listings in Paris to determine the impact of recent regulations

business-intelligence data-analysis jupyter-notebook maven-analytics python

Last synced: 19 May 2026

https://github.com/hawmex/aut_data_and_information_analysis_project

This repository contains the files of my project for the "Data & Information Analysis" course at AUT (Tehran Polytechnic).

data-analysis data-science k-means outlier-detection python

Last synced: 19 May 2026

https://github.com/halyusa16/sql-employee-insights

This project dives into employee data to uncover actionable insights using SQL. It mimics real-world HR and business analysis tasks, from salary comparisons to workforce demographics and potential cost-cutting strategies.

data-analysis mysql sql

Last synced: 11 Apr 2025

https://github.com/devexpress-examples/wpf-pivotgrid-how-to-display-underlying-data

This example demonstrates how to obtain the records from the control's underlying data source for a selected cell or multiple selected cells.

data-analysis dotnet dxpivotgrid pivot-grid pivot-grid-for-wpf wpf

Last synced: 19 May 2026

https://github.com/samir-atra/share-lm_dataset_analysis

Analysis, studies and optimizations on the ShareLM extension dataset

data-analysis data-visualization gemma3n huggingface huggingface-transformers pandas

Last synced: 19 May 2026

https://github.com/rita94105/smart_contract_vulnerability_detector

Smart contracts are pivotal in blockchain applications but are prone to vulnerabilities that can lead to significant losses. SmartGuard: Multi-Stage Smart Contract Vulnerability Detection tackles this issue by developing a machine learning framework to identify eight vulnerability types using datasets from Kaggle and Hugging Face.

data-analysis machine-learning smart-contracts streamlit vulnerability-detection

Last synced: 01 Aug 2025

https://github.com/prakshal0809/sql-data-analysis-project

This project involves analyzing pizza sales data using SQL to address various data analysis questions, drawing on foundational to advanced SQL knowledge.

data-analysis sql

Last synced: 26 Jun 2025

https://github.com/borjamome/radiografia-madrid

Analysis of the population, economy, and society of Madrid with R.

data-analysis data-visualization madrid r

Last synced: 17 Jun 2025

https://github.com/singingsandhill/data_analysis

Data analysis: a summary of personal projects.

data-analysis python

Last synced: 19 May 2026

https://github.com/jakobzmrzlikar/trg-dela

Data analysis of student job offers.

data-analysis ipython-notebook web-scraping

Last synced: 09 Aug 2025

https://github.com/rorrell/spotifyhistory

A Jupyter Notebook where I wrangle some data and plot a chart to draw some conclusions about a user's Spotify history

data-analysis data-visualisation data-wrangling jupyter-notebook python3

Last synced: 19 May 2026

https://github.com/ygalvao/uow_ai_final_project

This was my Final Project for the Artificial Intelligence Diploma program of The University of Winnipeg - Professional, Applied and Continuing Education (PACE).

data-analysis data-analytics dbscan elections k-means k-means-clustering machine-learning som som-clustering

Last synced: 10 Jul 2025

https://github.com/riborings/uranouchi42microdiversity

In this repository live the bash, R and Julia scripts used to explore the microdiversity of the prokaryotic community at Uranouchi Inlet (42-sample time-series) by means of metagenomic shotgun sequencing under the supervision of the Ogata Lab.

big-data data-analysis data-visualisation diversity-analysis marine-ecology marine-ecosystem metagenomics microbiome-analysis prokaryotic-genomes

Last synced: 29 Oct 2025

https://github.com/abhishekyadav915/data-analytics-projects

This project focuses on performing comprehensive data analysis to extract valuable insights from a given dataset. By leveraging various data manipulation, cleaning, and visualization techniques, the project aims to uncover patterns, trends, and correlations that can inform decision-making and strategy.

data-analysis data-visualization dataset

Last synced: 05 Apr 2025

https://github.com/sukhitashvili/pca_tutorial

PCA algorithm from scratch, using only matrix-vector multiplications

data-analysis data-science data-visualization machine-learning-algorithms pca

Last synced: 29 Mar 2025

https://github.com/lucashomuniz/project-15

[Dashboard] Enhancing Business Intelligence: Leveraging SQL, Python, and DAX for Strategic Insights in Sales Analysis

business-analytics business-intelligence data-analysis data-science data-visualization dax-languague machine-learning powerbi python

Last synced: 12 Jul 2025

https://github.com/samukiszhsd/alteryx-analytics

You are working with Itaú bank transaction data and need to run some analyses to help the audit team detect unusual patterns and possibly suspicious transactions.

alteryx data-analysis data-structures data-visualization etl workflow

Last synced: 18 Feb 2026

https://github.com/prady2309/stock-analysis

Analysis on the stock prices of Apple, Google, Microsoft and Amazon

data-analysis data-science data-visualization python stock-market

Last synced: 19 May 2026

https://github.com/rohitha-tata/churn-predict

Churn Predict uses Machine Learning to analyze customer behavior and identify those likely to leave. It involves data preprocessing, feature selection, model training (Logistic Regression, Random Forest, XGBoost), and evaluation using accuracy and ROC-AUC. The model provides actionable insights to help businesses reduce churn and improve retention

data-analysis logistic-regression machine-learning python

Last synced: 16 May 2026

https://github.com/eve-ning/ppshift

Analyzes maps and scores from 2015

data-analysis data-mining osu osugame

Last synced: 13 Feb 2026

https://github.com/saroshfarhan/irish_hospital_data_anaysis

Analysis of Irish hospital patient discharge data for four counties

data-analysis data-science data-visualization healthcare irish-data r-programming-language

Last synced: 18 Feb 2026

https://github.com/coditheck/data_analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data in order to discover useful information, draw conclusions, and support decision making.

data-analysis python

Last synced: 17 Jun 2025

https://github.com/sebastianurdaneguibisalaya/colocaciones-de-credito-fondo-mivivienda-peru

I explore the credit placements of Fondo MIVIVIENDA S.A. between 2018 and 2022, using a dataset downloaded from Peru's National Open Data Portal (Portal Nacional de Datos Abiertos). 🏠

data-analysis jupyter-notebook python

Last synced: 24 Feb 2025

https://github.com/busradeveci/odev2-branching

This project is prepared for Artificial Intelligence and Technology Academy Git GitHub Assignment 2. Using the “Wine Reviews” dataset from Kaggle, it converts wine ratings into star ratings and analyzes them.

data-analysis kaggle-dataset python wine-reviews-dataset

Last synced: 03 Oct 2025

https://github.com/arkww/matmap

Making maps from a Database and making the user guess which map is displayed

data-analysis data-science javascript python

Last synced: 24 Apr 2026

https://github.com/parthkumarmpatel/sql-exploratory-data-analysis

SQL EDA scripts for sales data warehouse — metrics, insights, and rankings from my data warehouse project.

data-analysis exploratory-data-analysis sql-server

Last synced: 26 Jun 2025

https://github.com/pooja-manjunatha/nyc_parking_violations_dbt

This project uses dbt to transform NYC parking violations data through a layered architecture: Bronze (raw ingested data), Silver (cleaned and enriched data), and Gold (aggregated tables for analytics). Using DuckDB as the warehouse backend, it ensures data quality with tests and documentation. The project enables reliable analysis of parking violations.

data data-analysis data-engineering dbt duckdb python sql

Last synced: 14 May 2026

https://github.com/adeebkhan25/dataset_suicide_susceptible

The "Student Suicide Risk Factors Dataset" is a comprehensive collection of data aimed at understanding and mitigating the factors contributing to student suicides.

data-analysis dataset machine-learning supervised-learning

Last synced: 24 Dec 2025

https://github.com/htsandaruvan/attrition-analytics-suite-by-hello-green

I have created a comprehensive data analytics dashboard to identify factors contributing to attrition,

data-analysis data-analytics data-visualization powerbi

Last synced: 20 Jan 2026

https://github.com/alimiheb/advwokcube-analysis

A comprehensive SSAS cube project based on AdventureWorksDW2019, featuring data cleaning, multidimensional modeling, and visualizations in Power BI and Excel.

adventureworks data-analysis excel powerbi sql-server ssas-multidimensional visualization

Last synced: 26 Jun 2025

https://github.com/nivasharmaa/friskwatch

A Java program for analyzing stop-and-frisk data from the NYPD. Features data import, organization, and statistical analysis to compare occurrences during and after policy implementation.

data-analysis data-visualization dataprocessing datascience file-io java java-oop nypd-data

Last synced: 19 May 2026

https://github.com/arkww/chinesenewspaperwordcount

Analyzes the word count of Chinese characters in Simplified and Traditional Chinese and compares the results

chinese-language data-analysis data-science python

Last synced: 16 May 2026

https://github.com/shellynagar27/marketing-content-performance-analysis

Analyzed 2024 social media campaign data from TikTok, Instagram, LinkedIn, and X.com using Power BI to uncover performance trends across platforms, content types, and regions. Built an interactive dashboard to drive insights on engagement, optimal posting times, and content strategy.

data-analysis data-modelling data-visualization excel figma marketing-analytics powerbi powerquery wireframing

Last synced: 26 Jun 2025

https://github.com/progati00/marketing-mix-modeling-mmm-for-marketing-budget-optimization

A Marketing Mix Modeling (MMM) project using Python to analyze channel performance, calculate ROI, and simulate marketing budget changes for better business decisions. Includes a trained Linear Regression model, ROI analytics, and a Flask API for revenue prediction.

api budget-optimization data data-analysis data-science ecommerce eda flask jupyter-notebook linear-regression machine-learning marketing-analytics marketing-mix-modeling python roi-analysis vscode

Last synced: 14 Apr 2026

https://github.com/yash22222/data-analysis-on-real-time-social-media-comments

EngageInsight analyzes user interactions in comment data. It provides insights through visualizations created using Python libraries like Pandas and Matplotlib. The project aims to uncover patterns and trends in user engagement. The visualizations provide an overview of comment lengths and the frequency of different types of replies.

data-analysis data-cleaning-and-preprocessing data-visualization matplotlib pandas pattern-recognition real-time-social-media-data seaborn trend-analysis

Last synced: 14 May 2026

https://github.com/kevin-rsj/sectores_economicos_covid-19

Exploratory Data Analysis (EDA): Behavior of economic sectors before, during, and after the COVID-19 pandemic (2019-2022)

data-analysis financial-analysis pandemic-analysis python stock-market time-series visualization yahoo-finance

Last synced: 20 May 2026

https://github.com/ryuzen6/kaggle-series

This is a series of Machine Learning/Deep Learning Models made for practice.

artificial-intelligence data-analysis data-science deep-learning machine-learning python3

Last synced: 20 May 2026

https://github.com/svetlanam/pycon-workshop

PyCon CZ workshop: Better data analyses and product recommendations with Instagram data

data-analysis data-science martinus matplotlib pandas pycon2016 pyconcz python scikit-learn workshop

Last synced: 09 Apr 2026

https://github.com/abhigyan126/prompt2query

A Python desktop application for streamlined data analysis, enabling users to generate and execute Pandas and SQL queries with ease. Focus on reducing analysis time through an intuitive interface and efficient workflows

data-analysis data-science data-visualization database gemini generative-ai ide llm pandas pandas-interface python sql-interface

Last synced: 13 Feb 2026

https://github.com/palakjainanalyst/ecommerce-customer-spending-analysis

An end-to-end Ecommerce analytics project uncovering customer spending trends using Excel, Python, SQL, and Power BI. From raw data to interactive dashboards, this project delivers deep insights on spending patterns and high-value customer segments, showcasing a complete data-to-decisions workflow.

data-analysis data-visualization database ecommerce excel jupyter-notebook powerbi python spending sql

Last synced: 06 May 2026

https://github.com/badranalyst/restaurant-reviews-sentiment-analysis-nlp-case-study

This project analyzes restaurant reviews using Natural Language Processing (NLP) for sentiment analysis. It covers data exploration, pre-processing (NLTK text cleaning), model building, prediction, and deployment. The goal is to predict sentiment from reviews using Python libraries such as Pandas, NumPy, Matplotlib, and Seaborn.

data-analysis data-science eda exploratory-data-analysis matplotlib-pyplot model model-building numpy pandas pre-processing predictive-modeling python seaborn

Last synced: 13 Apr 2026

https://github.com/techshot25/graduateadmissions

Looking at the probability of being accepted in a graduate program using a machine learning model

bayesian-regression correlation-matrices data-analysis data-science linear-regression machie-learning random-forest-regression regression ridge-regression

Last synced: 25 Feb 2025

https://github.com/srinibas-masanta/hotel-revenue-analysis-dashboard

This project focuses on analyzing hotel booking data to uncover key metrics and insights that drive revenue management decisions. By creating an interactive Power BI dashboard, the project aims to improve strategic decision-making, optimize occupancy rates, and enhance overall financial performance within the hospitality industry.

business-analytics data-analysis data-science data-visualization dax-functions hospitality powerbi

Last synced: 12 Jan 2026

https://github.com/ifigeneiatsiflidou/popular-items-sales-analysis

Two data tasks in Python: popular items by ZIP & store sales breakdown with plots.

data-analysis matplotlib pandas

Last synced: 16 May 2026

https://github.com/rugwiroparfait/alx_sql

This repo is where I save my queries and learning materials from the Data Science program at ALX

anaconda data data-analysis jupyter-notebook sql

Last synced: 19 Aug 2025

https://github.com/kiran-kumar-k3/sales-performance-dashboard

The Sales Performance Dashboard is an interactive Python-based web application that visualizes and analyzes sales data, providing actionable insights through dynamic charts and metrics.

data-analysis python streamlit

Last synced: 20 May 2026

https://github.com/RLAlpha49/AniSearch-Model

AniSearchModel leverages Sentence-BERT (SBERT) models to generate embeddings for synopses, enabling the calculation of semantic similarities between descriptions. This allows users to find the most similar anime or manga based on a given description.

anime api data-analysis data-merging embeddings flask hugging-face-datasets kaggle-datasets machine-learning manga natural-language-processing nlp python sentence-bert similarity-search

Last synced: 06 May 2025

https://github.com/archanakokate/bank_term_deposit_prediction

Build a Decision Tree classifier to predict if the client will subscribe to a Term Deposit based on their demographic and behavioral data.

data-analysis data-visualization exploratory-data-analysis machine-learning

Last synced: 14 Sep 2025

https://github.com/alfioma/ada-xtq

🔗 Simplify data transfer with ada-xtq, a lightweight tool for seamless integration and efficient handling of data between platforms.

ada algorithms api-development artificial-intelligence automation data-analysis data-visualization docker machine-learning neural-networks open-source programming python software-development xtq

Last synced: 01 May 2026

https://github.com/panoschatzi/erythrocyte_study_statistical_analyses

R code for data transformation, analysis and visualization of experimental data, as well as for statistical analyses and quantitative simulations.

afex data-analysis emmeans ggplot2 lme4 purrr r rprogramming rstats rstudio statistics tidyverse visualization

Last synced: 04 Apr 2025