An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/ct83/become-a-data-analyst-udacity

This repository contains all of the code, projects and reports that I wrote as I pursued my Udacity - Data Analyst NanoDegree.

data-analysis data-analysis-python data-analyst data-visualisation data-visualization-project datascience python udacity udacity-data-analyst-nanodegree

Last synced: 12 Aug 2025

https://github.com/mindlessmuse666/eda-pandas

Проект по разведочному анализу данных (EDA) о пассажирах Титаника с использованием библиотеки Pandas. Включает в себя загрузку данных, предобработку, статистический анализ, визуализацию и создание сводных таблиц. Цель проекта - демонстрация основных методов и инструментов EDA для анализа и понимания данных.

data-analysis data-processing data-science data-visualization eda exploratory-data-analysis matplotlib pandas python titanic

Last synced: 18 Apr 2026

https://github.com/r12habh/canada-imigration-data-analysis

Dataset: Immigration to Canada from 1980 to 2013 - International migration flows to and from selected countries - The 2015 revision from United Nation's website. (Cognitive Class Data Analysis with Python)

canada data-analysis data-science data-visualization datascience python python3

Last synced: 23 May 2026

https://github.com/nabilalibou/uber_fare_prediction_explained

This repository documents a complete ML workflow to model Uber fares in Paris, from granular EDA and feature engineering to building and fine-tuning a stacking regressor on 10k real-world rides.

data-analysis data-science eda feature-engineering machine-learning predictive-analytics pricing-model python regression-model stacking-ensemble uber

Last synced: 12 Aug 2025

https://github.com/omari-kd/data-analytics

Welcome to my Data Analytics Portfolio, which includes structured projects in both Data Science and Data Analysis, implemented in R and Python.

data-analysis data-analytics data-science machine-learning

Last synced: 12 Aug 2025

https://github.com/jprmaulion/cholera-gedeo-ethiopia-spatial-analysis

Exploratory spatial analysis and visualization of cholera case clusters in Gedeo Zone, Ethiopia that integrates demographic and geographic data to identify environmental risk patterns and inform public health interventions. Includes geospatial mapping of cholera incidence relative to waterways and administrative boundaries.

cholera data-analysis data-analysis-python epidemiology ethiopia openstreetmap python spatial-analysis

Last synced: 12 Apr 2026

https://github.com/lulloooo/bizdata-nexus

Collection of my Business & Data Analysis projects, from professional/academic endeavors to passion-driven explorations 📊

business-analysis data-analysis economics etl excel finance mysql python r risk-analysis

Last synced: 05 Apr 2026

https://github.com/aaisha-nexus/sql_company_insights

A beginner-friendly SQL project for managing employee records, departments, and sales transactions. Includes table creation, optimized queries, stored procedures, and window functions to extract business insights.

business-analytics data data-analysis dataanalysis-projects dataanalytics database-schema mssql-database query relational-databases sql sql-query ssms

Last synced: 12 Aug 2025

https://github.com/farhad-here/adventureworks_interactive_sales_dashboard_powerbi

An interactive Power BI dashboard for Adventure Works sales team to analyze performance, customers, products, and employees. Includes data cleaning, data modeling, DAX measures and advanced visualization features.

business-intelligence chart csv data-analysis data-cleaning data-cleaning-and-preprocessing data-visualization dax powerbi

Last synced: 13 Aug 2025

https://github.com/imgabreuw/minicurso-python-para-financas

Mini curso de Python para finanças, disponibilizado por Varos.

data-analysis financial-analysis python

Last synced: 13 Aug 2025

https://github.com/itsachrafmansari/moroccan-real-estate-analysis

Scrape, process, analyze, and visualize data from Avito.ma to uncover current trends in Morocco's real estate market.

api-scraping data data-analysis data-mining data-science data-scraping data-visualization eda exploratory-data-analysis morocco real-estate web-scraping

Last synced: 13 Aug 2025

https://github.com/Solrikk/PicTrace-Web

PicTraceV2 is a highly efficient image matching platform that leverages computer vision using OpenCV, deep learning with TensorFlow and the ResNet50 model, asynchronous processing with aiohttp, and Selenium for browser automation. PicTraceV2 allows users to upload images directly or provide URLs, quickly scanning a vast database to find image

automation computer-vision data-analysis data-extraction deep-learning image-processing image-search machine-learning natural-language-processing opencv openpyxl pandas python selenium tensorflow web-scraping yandex yandex-api

Last synced: 15 Aug 2025

https://github.com/clchinkc/zombie

Personal project, Python, NumPy, Matplotlib, Pygame, Scikit-learn, TensorFlow, Docker

algorithms data-analysis docker machine-learning matplotlib numpy pygame python sklearn tensorflow zombie-simulation

Last synced: 05 Apr 2026

https://github.com/shubhamgoyal575/tableau-visualization-dashboard

This repository features interactive Tableau dashboards for sales performance and healthcare analysis. It includes insights on revenue trends, regional sales, patient demographics, and hospital occupancy for data-driven decision-making. 🚀

dashborad data-analysis data-cleaning-and-preprocessing healthcare-analysis healthcare-dashboard sales-dashboard sales-data-analysis-project tableau tableau-dashboards tableau-public visualization visualization-tools

Last synced: 20 Feb 2026

https://github.com/ggarciajavier/udacity-dalf-project1-investigate-dataset

Work performed for the 1st project of Udacity Data Analyst Nanodegree: exploratory data analysis of a football dataset.

data-analysis football-analytics python python36 udacity-data-analyst-nanodegree

Last synced: 15 May 2026

https://github.com/zen204/accenture-tech-news-summarization-engine

A tool developed to analyze knowledge graphs from technology news articles, uncovering insights and trends about technology products, platforms, services, and their industry impact. Built during an internship at Accenture to inform decision-making in the tech landscape.

data-analysis decision-making graph-visualization industry-insights jupyter-notebook knowledge-graph machine-learning python tech-news tech-trends

Last synced: 29 Apr 2026

https://github.com/dcs-training/scottishaccounts

This repo contains various examples of analysis that can be performed on the Statistical Accounts of Scotland dataset. Go to the readme file

data-analysis data-visualisation data-wrangling geographical-data r rmarkdown text-analysis

Last synced: 16 Aug 2025

https://github.com/sebastiansauer/hans-hackathon2025

Materials for a course on the evaluation of the AI student learn tool "HaNS"

ai data-analysis evaluaton r

Last synced: 04 Oct 2025

https://github.com/chandkund/loan-eligibility-prediction

This project is designed to predict the eligibility of loan applicants based on various factors such as income, credit history, and marital status. By analyzing historical loan application data, the model helps to determine whether a loan application should be approved or not.

data-analysis data-science data-visualization machine-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/cyberoctane29/noaa-lightning-analysis

This project explores lightning strike data from the National Oceanic and Atmospheric Administration (NOAA) to identify seasonal trends and analyze strike frequency across months. It demonstrates data manipulation, aggregation, and visualization using Python, providing insights into lightning activity patterns.

data-analysis data-science data-visualization eda python

Last synced: 20 Apr 2026

https://github.com/davidzajac1/four-percent-rule-pandas-analysis

Analysis of the 4% Personal Finance Rule of Thumb

data-analysis data-visualization pandas python

Last synced: 20 Apr 2026

https://github.com/jpgiant/nyc_energy_prediction

A comprehensive code for predicting energy usage in NYC using Machine Learning Algorithms.

data-analysis data-science data-visualization folium jupyter-notebook machine-learning matplotlib numpy pandas python seaborn sklearn

Last synced: 10 Apr 2026

https://github.com/progati00/marketing-mix-modeling-mmm-for-marketing-budget-optimization

A Marketing Mix Modeling (MMM) project using Python to analyze channel performance, calculate ROI, and simulate marketing budget changes for better business decisions. Includes a trained Linear Regression model, ROI analytics, and a Flask API for revenue prediction.

api budget-optimization data data-analysis data-science ecommerce eda flask jupyter-notebook linear-regression machine-learning marketing-analytics marketing-mix-modeling python roi-analysis vscode

Last synced: 14 Apr 2026

https://github.com/palakjainanalyst/ecommerce-customer-spending-analysis

An end-to-end Ecommerce analytics project uncovering customer spending trends using Excel, Python, SQL, and Power BI. From raw data to interactive dashboards, this project delivers deep insights on spending patterns, high-value customer segments - showcasing a complete data-to-decisions workflow.

data-analysis data-visualization database ecommerce excel jupyter-notebook powerbi python spending sql

Last synced: 06 May 2026

https://github.com/harmanveer-2546/movie-industry

Investigate the film industry to gain sufficient understanding of what attributes to success and in turn utilize this analysis to create actionable recommendations for companies to enter the industry.

business business-analytics data-analysis datatime film-industry graphs matplotlib movie-database numpy pandas python scraping-websites seaborn visualization web-scraping-python

Last synced: 10 Apr 2026

https://github.com/berkekaragoz/media-investments-data-analysis

Advertisement Investments Distribution of Turkey by Medium

data-analysis r

Last synced: 19 Aug 2025

https://github.com/lucalullo/italian-justice-workload

Multidimensional analysis of the Italian justice system workload (2003–2024). A study of civil and criminal proceedings using judicial pressure and litigation indicators.

data-analysis italy judicial-workload justice-system kaggle legal-analytics pandas python time-series

Last synced: 24 May 2026

https://github.com/chiamakaukwuoma/portfolio

This repository contains various projects I've been privileged to work on outside of work.

aws-rds azure-fabric bigquery data-analysis docker-container elasticsearch excel grafana hadoop looker-studio mssql mysql postgresql powerbi python sql tableau

Last synced: 10 Apr 2026

https://github.com/j-wu1/analyse_ventes_jeuxvideo_python

Analyse Exploratoire de Données (EDA) sur les ventes de jeux vidéo avec Python, Pandas, Matplotlib et Seaborn dans un Jupyter Notebook.

data-analysis eda jupyter-notebook matplotlib pandas python seaborn

Last synced: 19 Aug 2025

https://github.com/jailsonsb2/kit-analise-de-dados

🚀 Um kit de ferramentas Python para acelerar a análise de dados. Carregue arquivos de forma inteligente (CSV, Excel, etc.) e converta notebooks Jupyter para scripts de produção sem esforço.

analise-de-dados analise-exploratoria analise-exploratoria-de-dados automation automations dados data-analysis data-cleaning etl etl-automation jupyter-notebook pandas powerquery python toolkit

Last synced: 29 Apr 2026

https://github.com/apostolis-bloutsos-data/employee-data-eda

Mini EDA project on synthetic employee records using Python, pandas, and matplotlib

data-analysis eda jupyter-notebook matplotlib pandas python seaborn

Last synced: 09 May 2026

https://github.com/mjshubham21/ny_yellow_taxi_python_da_project

A data analysis project of New York Yellow Taxi (Feb of 2025) using Python and its libraries for analytics like : NumPy, MatPlotLib, Pandas and Seaborn.

data-analysis jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 04 May 2026

https://github.com/cyberoctane29/epa-air-quality-aqi-analysis

This project involved analyzing air quality data from the EPA, focusing on the Air Quality Index (AQI). I used Python data structures like dictionaries and sets to manage and process the data, simulating real-world data analysis to assess pollution levels and their health implications.

data-analysis numpy pandas python statistics

Last synced: 10 Apr 2026

https://github.com/jedrzej-wydra/competition-cooperation

Competition, cooperation, and parental effects in larval aggregations formed on carrion by communally breeding beetles Necrodes littoralis (Staphylinidae: Silphinae)

data-analysis non-linear-regression r

Last synced: 20 Aug 2025

https://github.com/jofaval/iris-flowers

Multilabel Classification of the famous Iris Flowers Dataset from Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/kaoutarmi/analyse-des-ventes-pour-optimiser-la-performance

Analyse des données de ventes pour identifier des opportunités d'amélioration des performances commerciales. Utilisation de Pandas pour le traitement des données, et Matplotlib/Seaborn pour la visualisation des tendances et des résultats.

business-intelligence data-analysis data-visualization jupyter-notebook matplotlib pandas sales-optimization seaborn

Last synced: 20 Aug 2025

https://github.com/souravsuvarna/whatsapp-chat-analyzer-and-visualizer-web-application

The WhatsApp chat analyzer and visualizer uses NLP algorithms to analyze chat data, tracking usage patterns and presenting insights through visually appealing charts and graphs. It helps users understand communication patterns and behaviors on WhatsApp.

data-analysis data-science data-visualization python python3 streamlit

Last synced: 18 Apr 2026

https://github.com/myriamba/neuraview

AI-Powered Data Insights and Visualization Generator

data-analysis data-engineering data-insights data-visualization generative-ai llm

Last synced: 21 Aug 2025

https://github.com/kevingastelum/mydataanalysis

My DataAnalyst Projects | Python, SQL, Excel, PowerBI & Tableau

data-analysis python sql visualization

Last synced: 20 May 2026

https://github.com/halyusa16/sql-employee-insights

This project dives into employee data to uncover actionable insights using SQL. It mimics real-world HR and business analysis tasks, from salary comparisons to workforce demographics and potential cost-cutting strategies.

data-analysis mysql sql

Last synced: 11 Apr 2025

https://github.com/rita94105/smart_contract_vulnerability_detector

Smart contracts are pivotal in blockchain applications but are prone to vulnerabilities that can lead to significant losses. SmartGuard: Multi-Stage Smart Contract Vulnerability Detection tackles this issue by developing a machine learning framework to identify eight vulnerability types using datasets from Kaggle and Hugging Face.

data-analysis machine-learning smart-contracts streamlit vulnerability-detection

Last synced: 01 Aug 2025

https://github.com/sukitsubaki/screen-time-tracker

A minimalist Python tracker that records the usage time of various applications and provides insights into your computer usage habits.

application-usage data-analysis monitoring productivity python python-cli screen-time time-tracking

Last synced: 12 Apr 2025

https://github.com/rorrell/spotifyhistory

A Jupyter Notebook where I wrangle some data and plot a chart to draw some conclusions about a user's Spotify history

data-analysis data-visualisation data-wrangling jupyter-notebook python3

Last synced: 19 May 2026

https://github.com/mindlessmuse666/missing-data-processing

Проект по обработке пропущенных значений в данных о пассажирах Титаника с использованием библиотек Python Matplotlib и Seaborn.

data-analysis data-visualization matplotlib missing-values-analysis missing-values-handling pandas python seaborn titanic

Last synced: 16 May 2026

https://github.com/ygalvao/uow_ai_final_project

This was my Final Project for the Artificial Intelligence Diploma program of The University of Winnipeg - Professional, Applied and Continuing Education (PACE).

data-analysis data-analytics dbscan elections k-means k-means-clustering machine-learning som som-clustering

Last synced: 10 Jul 2025

https://github.com/riborings/uranouchi42microdiversity

In this repository live the bash, R and Julia scripts used to explore the microdiversity of the prokaryotic community at Uranouchi Inlet (42-sample time-series) by means of metagenomic shotgun sequencing under the supervision of the Ogata Lab.

big-data data-analysis data-visualisation diversity-analysis marine-ecology marine-ecosystem metagenomics microbiome-analysis prokaryotic-genomes

Last synced: 29 Oct 2025

https://github.com/abhishekyadav915/data-analytics-projects

This project focuses on performing comprehensive data analysis to extract valuable insights from a given dataset. By leveraging various data manipulation, cleaning, and visualization techniques, the project aims to uncover patterns, trends, and correlations that can inform decision-making and strategy.

data-analysis data-visualization dataset

Last synced: 05 Apr 2025

https://github.com/beatrice-b-m/bea-tools

🐝 𝓉𝑜𝑜𝓁𝓈 𝓂𝒶𝒹𝑒 𝒷𝓎, 𝒶𝓃𝒹 𝒻𝑜𝓇, 𝒷𝑒𝒶 🐝 . ݁₊ ⊹ . ݁ ⟡ ݁ . ⊹ ₊ ݁ ⊹ . ݁ ⟡ ݁ . ⊹ ₊ ݁. ⊹ . ݁ ⟡ ݁ .⊹ . ݁ ⟡ A Python package of random functions and tools that I use regularly. Data science / analysis focused since, ya know, I'm a data scientist c:

data-analysis data-science data-visualization

Last synced: 15 Jan 2026

https://github.com/rohitha-tata/churn-predict

Churn Predict uses Machine Learning to analyze customer behavior and identify those likely to leave. It involves data preprocessing, feature selection, model training (Logistic Regression, Random Forest, XGBoost), and evaluation using accuracy and ROC-AUC. The model provides actionable insights to help businesses reduce churn and improve retention

data-analysis logistic-regression machine-learning python

Last synced: 16 May 2026

https://github.com/coditheck/data_analysis

Data analysis is the process of inspecting, cleaning, transforming, and modeling data in order to discover useful information, draw conclusions, and support decision making.

data-analysis python

Last synced: 17 Jun 2025

https://github.com/arkww/matmap

Making maps from a Database and making the user guess which map is displayed

data-analysis data-science javascript python

Last synced: 24 Apr 2026

https://github.com/htsandaruvan/attrition-analytics-suite-by-hello-green

I have created a comprehensive data analytics dashboard to identify factors contributing to attrition,

data-analysis data-analytics data-visualization powerbi

Last synced: 20 Jan 2026

https://github.com/georgiifirsov/educational-research-work

Educational research project on 3rd year (6th semester). Topic: ARMA models in time series analysis

arma data-analysis jupyter-notebook python time-series time-series-analysis tsa

Last synced: 27 Apr 2026

https://github.com/datalopes1/desafio_delivery

Desafio do Clube de Assinaturas da Universidade dos Dados para simular as demandas reais de um analista de dados

data-analysis jupyter python

Last synced: 06 Mar 2026

https://github.com/arkww/chinesenewspaperwordcount

Analysis the word count of Chinese characters in Simplified and Traditional Chinese characters and comparing the results

chinese-language data-analysis data-science python

Last synced: 16 May 2026

https://github.com/prathmesh2507/ctc-hackthon

A data-driven system designed to reduce overcrowding and optimize urban public transport using real-world geospatial data and intelligent simulation.

dashboard data-analysis data-visualization python streamlit

Last synced: 16 May 2026

https://github.com/kakri787/alcoholism-and-grade-analysis

A mini project for university data science module where we analyzed on the relationship between alcohol consumption in students and their academic performance, making use of exploratory data analysis and machine learning techniques to see if we can predict student's grades.

data-analysis data-science data-vizualisation lasso-regression machine-learning neural-network

Last synced: 12 Apr 2025

https://github.com/svetlanam/pt-data-analyse

Data analyse of the czech parcel tracking providers

data-analysis matplotlib pandas parcel-tracking python3 visualisation

Last synced: 21 Aug 2025

https://github.com/jatin-mehra119/sales-analysis

Sales Analysis of super market

data-analysis salesanalysis visualization

Last synced: 29 Oct 2025

https://github.com/ifigeneiatsiflidou/popular-items-sales-analysis

Two data tasks in Python: popular items by ZIP & store sales breakdown with plots.

data-analysis matplotlib pandas

Last synced: 16 May 2026

https://github.com/RLAlpha49/AniSearch-Model

AniSearchModel leverages Sentence-BERT (SBERT) models to generate embeddings for synopses, enabling the calculation of semantic similarities between descriptions. This allows users to find the most similar anime or manga based on a given description.

anime api data-analysis data-merging embeddings flask hugging-face-datasets kaggle-datasets machine-learning manga natural-language-processing nlp python sentence-bert similarity-search

Last synced: 06 May 2025

https://github.com/alfioma/ada-xtq

🔗 Simplify data transfer with ada-xtq, a lightweight tool for seamless integration and efficient handling of data between platforms.

ada algorithms api-development artificial-intelligence automation data-analysis data-visualization docker machine-learning neural-networks open-source programming python software-development xtq

Last synced: 01 May 2026

https://github.com/panoschatzi/erythrocyte_study_statistical_analyses

R code for data transformation, analysis and visualization of experimental data, as well as for statistical analyses and quantitative simulations.

afex data-analysis emmeans ggplot2 lme4 purrr r rprogramming rstats rstudio statistics tidyverse visualization

Last synced: 04 Apr 2025

https://github.com/jelhamm/internode-hellinger-distance-based-decision-tree

Simulations for the paper "Inter node Hellinger Distance based Decision Tree by Pritom Saha Akash, Md. Eusha Kadir, Amin Ahsan Ali, Mohammad Shoyaib"

articles data-analysis data-mining decision-tree decision-tree-classifier hddt hellinger-distance-criterion machine-learning numpy-library paper-implementations python scipy-library simulation tree-node

Last synced: 04 Apr 2025

https://github.com/ashwin331133/sql-healthcare-data

This repository contains SQL queries designed to analyze health care data. The queries focus on patient demographics, encounter costs, and flu shot statistics, aiming to provide insights into patient behavior and financial impacts. The datasets include information on patient encounters, flu shots, and hospital admissions.

data-analysis mysql sql

Last synced: 29 Oct 2025

https://github.com/mfakhriazhar/housing-price-analysis

Determining the price of a house also depends on various factors such as building area, exterior quality, and amenities. This dataset provides information on properties for sale, and through Exploratory Data Analysis (EDA), patterns and key factors affecting house prices can be identified.

data-analysis data-science data-visualization eda exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/stkisengese/numpy-data-fundamentals

A comprehensive collection of NumPy exercises covering array manipulation, slicing, broadcasting, random data generation, and real-world data analysis applications.

data data-analysis numpy pre-processing

Last synced: 16 May 2026

https://github.com/danitilahun/exploratory-data-analysis-projects

This repository contains a collection of my personal Exploratory Data Analysis (EDA) projects. Each project involves exploring various datasets to gain insights, uncover patterns, and visualize trends.

data-analysis data-science data-visualization exploratory-data-analysis python

Last synced: 16 May 2026

https://github.com/carlosvinimsouza/jupyter-notebook-basic

Armazenado todos os trabalhos referentes a Ciência de Dados.

data-analysis data-science programas-jupyter-notebook python

Last synced: 11 May 2026

https://github.com/j-faria/bicerin

Working on the RV challenge in Torino

data-analysis gp radial-velocity rv-challenge

Last synced: 07 Apr 2026

https://github.com/nafisrayan/crypto-trading-platform

This React Crypto Exchange Template is designed to provide a solid foundation for building a comprehensive cryptocurrency exchange platform. With its sleek and modern design, this template is perfect for anyone looking to create a user-friendly and intuitive trading experience.

crypto dashboard data-analysis data-visualization react template

Last synced: 16 May 2026

https://github.com/mboula/mboula.github.io

GitHub portfolio + interactive resume | Showcasing data projects in civil rights (housing), cannabis, and analytics

cannabis case-study civil-rights compliance dashboards data-analysis data-cleaning data-vizualization excel google-data-analytics housing open-data pattern-analysis portfolio pro-se public-data r sql tableau

Last synced: 10 Jul 2025

https://github.com/katarinatmb/serbia-protest-analysis

This project analyzes the frequency, regional distribution, and group characteristics of protests that emerged across Serbia following the fatal collapse of the Novi Sad train station roof in November 2024. The analysis explores how different communities responded in the aftermath of the disaster, using data visualization in RStudio

data-analysis data-visualization r r-mark rstudio

Last synced: 10 Jul 2025

https://github.com/colindean/allegheny_voter_reg_analysis

Allegheny County Voter Registration Analysis Tools

data-analysis data-science elections pandas polars python voting

Last synced: 16 May 2026

https://github.com/athari22/multivariable_regression_and_valuation_model_

Multivariable regression model using Python to analyze and predict Boston housing prices based on various socioeconomic and environmental features.

data-analysis data-analysis-python housing-prices housing-prices-competition machine-learning pandas pandas-python plotly python regression-models seaborn seaborn-python sklearn

Last synced: 17 Jun 2025