An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/maddieemihle/pandas-challenge

Python analysis to create and manipulate school and standardized test data. Scores are calculated, grouped, aggregated, summarized, and organized using pandas.

data-analysis pandas-python

Last synced: 09 Jun 2026

https://github.com/badranalyst/movie-correlation-analysis-in-python

This project analyzes movie data correlations using Python libraries like Pandas, NumPy, Seaborn, and Matplotlib. It examines relationships between attributes such as ratings, genres, and box office performance to uncover trends that inform recommendations and enhance understanding of movie success factors.

data data-analysis dataset jupyter jupyter-notebook matplotlib matplotlib-pyplot numpy pandas python seaborn

Last synced: 03 May 2026

https://github.com/helenaden/data-science-fundamentals

This project delves into fundamental data science concepts using Python libraries like NumPy and Pandas.

data-analysis datascience datasets datavisualization datawrangling heatmap numpy pandas patterns python

Last synced: 03 May 2026

https://github.com/ahmedhosssam/lesser_pandas

Pandas-like Data Analysis library in C++

cpp data-analysis data-science pandas

Last synced: 03 May 2026

https://github.com/stepankuzmin/machine-learning-data-analysis

My homework for the Coursera Machine Learning and Data Analysis specialization

coursera data-analysis jupiter machine-learning python

Last synced: 03 May 2026

https://github.com/zients/tw-lottery-recommandation

Taiwan lottery draw analyzer & number recommender with Transformer ML model. Supports 539, 649, 638, 3D, and 4D lotteries.

cli data-analysis lottery machine-learning python pytorch taiwan transformer

Last synced: 03 May 2026

https://github.com/chaedoll/analysis-python-foreignerinfra

Report on improving infrastructure for foreigners in Korea

data-analysis python team-project

Last synced: 03 May 2026

https://github.com/rohitinu6/tesla-price-prediction

A machine learning project that predicts future stock price movements using Logistic Regression, SVC, and XGBoost with engineered financial features.

data-analysis data-visualization feature-engineering financial-analysis logistic-regression machine-learning matplotlib python scikit-learn seaborn stock-market stock-price-prediction support-vector-machine time-series xgboost

Last synced: 03 May 2026

https://github.com/emredemirbas/movie-ratings-analysis

A data analysis project investigating potential bias in movie ratings from 2015, comparing them with ratings from other platforms using Python, pandas, and visualization libraries.

data-analysis matplotlib pandas python seaborn

Last synced: 03 May 2026

https://github.com/maddieemihle/python-challenge

Creating a Python script that analyzes financial records and election results

data-analysis python

Last synced: 09 Jun 2026

https://github.com/mohnish88/e-commerce-data-analysis

I analyzed sales data to identify trends and patterns, which significantly enhanced decision-making processes. Additionally, I created interactive visualizations to present these insights clearly and effectively, facilitating better understanding and communication of the data's implications.

data-analysis data-cleaning jupyter-notebook pandas plotly python python-library sales sales-analysis visulaization

Last synced: 03 May 2026

https://github.com/devlucho/modelos-predictivos

Predictive models using the Linear Regression, Logistic Regression, and Decision Tree algorithms.

data-analysis jupyter-notebook python3

Last synced: 03 May 2026

https://github.com/nathadriele/diabetes-clinical-etl-pipeline

This Public Health Data Engineering project implements a complete pipeline to collect, clean, standardize, validate, integrate, and visualize public SUS data related to Diabetes Mellitus in Brazil, filtering by ICD-10 codes E10 to E14.

cid data-analysis data-extraction data-pipeline data-science data-structures data-visualization datasus diabetes-detection diabetes-prediction epidemiology-analysis etl-pipeline healthcare-analytics ibge logger pytest sih streamlit sus

Last synced: 09 Jun 2026

https://github.com/salma-mamdoh/project-writing-functions-for-product-analysis

My project for learning the basics of analysis on DataCamp

data-analysis data-camp pandas python

Last synced: 03 May 2026

https://github.com/syed-m-nofel/python-data-science-fundamentals

Python notebooks for data manipulation (Pandas/NumPy) and API workflows – from basics to practical examples.

api beginner-friendly data-analysis data-science http-requests jupyter-notebook numpy pandas pandas-dataframe python tutorial

Last synced: 03 May 2026

https://github.com/ggarciajavier/udacity-dalf-project4-identify-fraud-enron-email

Work performed for the 4th project of the Udacity Data Analyst Nanodegree: machine learning classifier for identifying fraud in Enron email corpus.

data-analysis data-science machine-learning nlp-machine-learning python python27

Last synced: 03 May 2026

https://github.com/nurulashraf/logistic-regression-loan-prediction

Loan approval prediction using logistic regression based on applicant data, including income, credit history, and property details, after data preparation and feature engineering.

data-analysis data-science loan-prediction logistic-regression machine-learning predictive-modeling python sklearn

Last synced: 03 May 2026

https://github.com/ljadhav25/swiggy-restaurant-analysis

This repository contains data and analysis related to restaurants listed on Swiggy, one of India's largest online food ordering and delivery platforms. The objective is to explore restaurant trends, customer reviews, pricing strategies, and delivery metrics to gain insights into the food delivery industry.

data-analysis data-visualization matplotlib-pyplot numpy-library pandas-library python seaborn-plots

Last synced: 03 May 2026

https://github.com/matteospanio/speed-analysis

A project to analyze internet speed

bash-script data-analysis

Last synced: 03 May 2026

https://github.com/syarwinaaa09/analyzing-crime-in-los-angeles

Exploratory data analysis of Los Angeles crime data with insights on temporal patterns, locations, and age demographics.

crime-data data-analysis eda los-angeles pandas public-safety python visualization

Last synced: 03 May 2026

https://github.com/bpkaur/whats-in-a-name

Exploring a dataset of first names of babies born in the US to uncover interesting stories

data-analysis datacamp numpy pandas python3

Last synced: 04 May 2026

https://github.com/r13i/cheapest-phone-call

Small challenge to find the best phone operator to use based on call price

big-data big-data-analytics cheapest data-analysis data-cruncher pandas phone-number pricelist

Last synced: 04 May 2026

https://github.com/xiaohan2012/myunisport

Visualize your Unisport annual training records

data-analysis data-visualization pandas pygal sports-stats tikzposter

Last synced: 04 May 2026

https://github.com/sanchittechnogeek/rental-data-visualization_python

Statistics and visualization of rental data with python

data-analysis data-science data-visualization statistics

Last synced: 04 May 2026

https://github.com/arv-anshul/ipl-api

IPL API built with the Flask framework and an IPL dataset.

api data-analysis fast-api flask flask-api ipl ipl-api python3

Last synced: 04 May 2026

https://github.com/damisparks/become_data_analyst

Are you new to data analysis? Here you will find simple notebooks to help you through your journey. These are personal projects that I am still working on.

data data-analysis data-visualization matplotlib numpy pandas-tutorial

Last synced: 04 May 2026

https://github.com/mchenryspagg/investigate_a_dataset

This is a data analysis project that demonstrates the student's ability to use python data analysis libraries such as pandas, numpy and pyplot in matplotlib to investigate a dataset and answer specific questions from the dataset, thus demonstrating skills in data cleaning, data wrangling, and exploratory data analysis.

data-analysis datetime descriptive-analysis descriptive-statistics exploratory-data-analysis numpy pandas pyplot python visualization

Last synced: 04 May 2026

https://github.com/mr-chang95/sf_data_visualization

In this personal project, I am interested in examining all of the active businesses in the San Francisco Bay Area while performing some simple data visualizations, mainly on categorical variables.

business data-analysis data-visualization jupyter-notebook pandas python san-francisco

Last synced: 04 May 2026

https://github.com/sweta-kaundilya/python_for_data_analysis

Learning Python and the relevant Python libraries for the data field.

cufflinks data-analysis data-science matplotlib numpy pandas plotly python seaborn

Last synced: 04 May 2026

https://github.com/fatihilhan42/book-recommendation-system-with-python

In this project, we are making a book recommendation system that recommends similar books according to the genres or ratings that the user enters, using a large book dataset. The link of the dataset is given below. Happy reading...

books data-analysis data-science data-visualization kaggle python recommendation-engine recommendation-system

Last synced: 04 May 2026

https://github.com/hyperplasma/olympic-visualization-analysis

Multidimensional analysis and visualization of Olympic medals, economy, and happiness index.

data-analysis data-visualization matplotlib numpy pandas python wordcloud

Last synced: 04 May 2026

https://github.com/carmoreno/nobelprizes

Final project of Big Data Module.

data-analysis mongodb

Last synced: 29 Apr 2026

https://github.com/dineshdhamodharan24/singapore_flat_resale_

This project focuses on developing a machine learning model to predict the resale values of apartments in Singapore. The goal is to create a user-friendly online application that enables users to obtain accurate predictions for the resale values of specific properties.

data-analysis flat json numpy pandas pickle project python streamlit

Last synced: 07 Apr 2026

https://github.com/dinamohsin/ai-job-market-analysis-using-sql-excel

This project explores a dataset of AI-related jobs to uncover insights about salary trends, in-demand skills, education levels, and remote work preferences. The analysis was done using SQL for querying and Excel for data cleaning and preparation.

data-analysis data-preprocessing excel functions query sql sql-server

Last synced: 25 Jun 2025

https://github.com/jarrarshahid/nutrition-calculator

Simple Python app to calculate the nutritional content of everyday meals.

data-analysis health json jupyter-notebook logic-programming python

Last synced: 15 Jul 2025

https://github.com/vbhvsingh0/nflteam_corr_population

The goal of this project is to find the correlation between NFL teams' win-loss records and the population of their cities.

data-analysis data-cleaning-and-preprocessing data-manipulation-with-pandas numpy-library pandas-python pearson-correlation python3

Last synced: 29 Jun 2026
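
For readers unfamiliar with the calculation the entry above describes, a Pearson correlation coefficient can be computed roughly as in the minimal sketch below. The function name `pearson_r` and the sample numbers are hypothetical and chosen only for illustration; they are not taken from the repository, which according to its tags relies on pandas and NumPy.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data: city population (millions) and win/loss ratio.
population = [1.0, 2.0, 3.0, 4.0]
win_loss = [0.5, 0.7, 0.6, 0.9]
r = pearson_r(population, win_loss)
print(f"Pearson r = {r:.3f}")
```

A value near 1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship.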

https://github.com/miusarname2/proyectos-final-analitica-de-datos

Welcome to the repository where the magic of data analytics comes to life! This is the result of our effort and creativity in the subject of data analysis at the Universidad Cooperativa de Colombia (UCC). Here we keep everything we did to analyse data, draw cool conclusions and solve the workshop we were given. 🎯📊

data-analysis data-science data-visualization pip python

Last synced: 15 Jul 2025

https://github.com/priyanshubiswas-tech/farmlab-report-and-case-study-iot

This project was developed through live interviews and case studies with farmers in the year 2023 to address key agricultural challenges. The device provides real-time farm insights for better decision-making. Future plans include a digital portal, increased range, more sensors, and improved design. Open to collaboration!

arduino-ide c case case-study data data-analysis iot iot-device serialization

Last synced: 15 Jul 2025

https://github.com/theo-jenkins/fmri-brain-scan-analyser

MATLAB toolkit for reading, analysing and simulating rs-fMRI brain scans in .nii format.

algorithms data-analysis data-visualization fmri-data-analysis matlab neuroimaging

Last synced: 15 Jul 2025

https://github.com/shubhammittal-data/sales-customer_dashboard_tableau

An interactive Tableau project showcasing advanced data visualization techniques for sales performance and customer analytics. This dashboard provides key business insights using KPIs, trend analysis, and customer segmentation. Designed for executives, sales managers, and marketing teams to drive data-driven decision-making.

customer-behavior-analysis customer-segmentation data-analysis data-visualization product-analytics sales-analysis tableau tableau-dashboards tableau-public

Last synced: 07 Mar 2026

https://github.com/jlee9503/defense-risk-prediction

Build a machine learning pipeline that ingests defense procurement data, identifies high-risk contracts, and visualizes the results in an interactive dashboard.

data-analysis data-visualization exploratory-data-analysis python

Last synced: 25 Jan 2026

https://github.com/caesaredia/chicago-taxi-data-insights

Exploratory data analysis and hypothesis testing on Chicago taxi trip data to uncover patterns in demand and the effects of rainy weather on travel time.

chicago data-analysis data-visualization exploratory-data-analysis hypothesis-testing python statistical-analysis taxi-trips weather-analysis

Last synced: 17 May 2026

https://github.com/bonelesswater/tradingbot

This project is a web application for a trading bot that displays financial data and indicators. It includes functionality for researching financial data, displaying market indicators, and more.

ai azure css d3 data-analysis django html javascript jquery materializecss python stock-market

Last synced: 30 Dec 2025

https://github.com/harmanveer-2546/motor-vehicle-accidents-in-india

As per the report, a total of 4,61,312 road accidents have been reported by States and Union Territories (UTs) during the calendar year 2022, which claimed 1,68,491 lives and caused injuries to 4,43,366 persons.

accidents accidents-analysis darkgrid data-analysis eda exploratory-data-analysis indian-roads inline matplotlib motor-vehicles numpy pandas review seaborn visualization

Last synced: 19 Jan 2026

https://github.com/mrendiks/analyst-data-survey-monkey

Learn how to analyze data from a SurveyMonkey dataset using Excel and Python

data-analysis ipynb-jupyter-notebook python

Last synced: 07 Mar 2026

https://github.com/mituskillologies/aiml-dypiemr-sep24

Programs conducted at DYPIEMR, Pune in training on AIML during September 2024.

artificial-intelligence data-analysis data-science machine-learning matplotlib neural-network numpy pandas python3

Last synced: 05 Apr 2025

https://github.com/chahelgupta/fitness-data-analysis-r-project

This project focuses on analyzing fitness data collected from various tracking devices to gain insights into users' activity levels, sleep patterns, calorie expenditure, and heart rate. The dataset used in this project consists of multiple CSV files, each containing different aspects of fitness-related data.

data-analysis data-cleaning data-exploration data-science data-visualization r r-language r-programming r-studio

Last synced: 18 May 2026

https://github.com/percival33/machine-learning-engineering

University project about enhancing a fictional music streaming service by developing machine learning models to generate popular playlists

data-analysis data-science machine-learning python

Last synced: 14 Jul 2025

https://github.com/jonathancaleb/adap

📊🌱 Agricultural Data Analysis Platform 🌍🚜 A personal initiative to analyze coffee growth trends in Uganda using Python, data science, and machine learning. This project supports sustainable farming with predictive models and interactive visualizations. 🍃📈

data-analysis data-science python

Last synced: 18 May 2026

https://github.com/simranrayait51/internshala-ds-projects

Projects from the Internshala Data Science course, showcasing my skills in Excel, SQL, Python, and Tableau for data manipulation, analysis, and visualization.

data-analysis data-science data-visualization excel internshala-project pgc postgresql python sql tableau

Last synced: 17 May 2026

https://github.com/Fisseha-Estifanos/telecom

A showcase repository for a specific telecommunication company. It is used to analyze several telecommunication dataset features and generate useful insights. The generated insights can be seen at https://github.com/Fisseha-Estifanos/telecom-visualizer or at https://fisseha-estifanos-telecom-visualizer-home-huxgy0.streamlitapp.com/

data-analysis notebooks-jupyter python visual-studio-code visualization

Last synced: 11 Mar 2025

https://github.com/majajuri/text-classification-using-string-kernels

Project for the course Introduction to Data Science (Uvod u znanost o podacima)

data-analysis string-kernel

Last synced: 05 Apr 2025

https://github.com/ashvinhandoo/bionic-lab-projects

Computational neurophysiology pipelines for analyzing astrocyte and vascular dynamics. Includes Python- and MATLAB-based analysis frameworks for modeling calcium, vasomotion, and pupil-linked activity, demonstrating advanced signal processing, transfer entropy estimation, and data visualization skills used in biomedical research.

biocomputation bioinformatics biomedical-engineering computational-biology data-analysis matlab neuroscience python signal-processing time-series

Last synced: 18 May 2026

https://github.com/amoghkori/deeplabcut-package-for-animal-pose-estimation

DeepLabCut Mouse Location Prediction: Training a deep neural network to predict the location of a mouse using annotated joint positions.

data-analysis data-annotations data-preprocessing deep-learning machine-learning model-evaluation python-programming research research-project

Last synced: 17 Mar 2025

https://github.com/martachesnova/python-apis

A weather analysis that randomly selects more than 500 cities across the globe and pulls data from the OpenWeatherMap API for each city. The analysis of the weather and of the perfect vacation spot is viewable in my Jupyter Notebook.

api data-analysis python

Last synced: 24 Feb 2025

https://github.com/martachesnova/python

Created a Python script to calculate and analyze financial records of a company. Created another Python script to do calculations and analysis of the voting process in a small town.

data-analysis python

Last synced: 24 Apr 2026

https://github.com/nurulashraf/linear-regression-insurance-premium

This analysis applies simple linear regression to explore the relationship between age and insurance premium. It includes model training, visualisation, and evaluation using MSE and RMSE to assess prediction accuracy.

beginner-project data-analysis insurance-data linear-regression machine-learning matplotlib predictive-modeling python regression-models scikit-learn

Last synced: 05 May 2026
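
The technique named in the entry above (simple linear regression evaluated with MSE and RMSE) can be sketched in plain Python as follows. This is only an illustrative sketch: the helper names `fit_line` and `mean_squared_error` and the age/premium values are made up here, and the repository itself is tagged as using scikit-learn rather than hand-written formulas.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: age in years and insurance premium.
ages = [20, 30, 40, 50]
premiums = [110.0, 190.0, 310.0, 390.0]

slope, intercept = fit_line(ages, premiums)
predictions = [slope * age + intercept for age in ages]

mse_value = mean_squared_error(premiums, predictions)
rmse_value = math.sqrt(mse_value)  # RMSE is in the same units as the premium
print(f"slope={slope:.2f} intercept={intercept:.2f} "
      f"MSE={mse_value:.2f} RMSE={rmse_value:.2f}")
```

RMSE is often easier to interpret than MSE because it is expressed in the same units as the predicted quantity.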

https://github.com/guilherme-marcello/r-data-analysis-piechart

Reading RDS files, processing and presentation in pie charts

data-analysis data-visualization pie-chart r

Last synced: 13 Jul 2025

https://github.com/data-edd/e-commercestore_analysis

This project analyzes e-commerce data to provide insights into sales performance, profitability, and customer behavior using Power BI.

data-analysis powerbi powerbidashboard

Last synced: 02 Feb 2026

https://github.com/lparham2/factors-driving-ev-adoption-charging-station-deployment

This project explores factors driving EV adoption and charging station deployment using Python-based data analysis. It examines sales trends, infrastructure growth, and socioeconomic influences to uncover key insights. The goal is to aid policymakers and businesses in optimizing EV infrastructure and accelerating sustainable transportation.

data-analysis data-visualization electric-vehicle-charging-station electric-vehicles powerpoint-presentations python

Last synced: 18 May 2026

https://github.com/jhermienpaul/google-data-analytics-program

Hands-on learning materials from the 8-course Google Data Analytics Professional Certificate program, covering foundational data skills, tools, and real-world business problem-solving

bigquery dashboard data-analysis data-analytics data-modeling data-storytelling data-visualization data-wrangling descriptive-analytics diagnostic-analytics etl-pipeline r-programming rstudio sql tableau

Last synced: 13 Jul 2025

https://github.com/antononcube/wl-mosaicplot-paclet

Wolfram Language (aka Mathematica) paclet for mosaic plots over datasets or lists of records.

data-analysis machine-learning mosaic mosaic-plots

Last synced: 16 Jan 2026

https://github.com/myktorijus/retention-cohort

Extracted cohort data using SQL in BigQuery focusing on weekly retention from week 0 to week 6

bigquery data-analysis data-visualization powerbi sql

Last synced: 13 Jul 2025

https://github.com/gappeah/credit-card-transactions-fraud-detection-project

The Credit Card Transactions Fraud Detection Project repository is designed to analyse and detect fraudulent transactions in credit card data.

data-analysis postgresql sql

Last synced: 12 Jul 2025

https://github.com/farzeennimran/fashion-mnist-dataset-classification-using-neural-network

Implementation of a Multi-layer Perceptron classifier with hyperparameter tuning and k-fold cross-validation employing GridSearchCV for classifying images on the Fashion MNIST dataset 👗👚👖

artificial-intelligence data-analysis data-mining data-science dataset deep-learning fashion-mnist-dataset gridsearchcv hyperparameter-tuning kfold-cross-validation machine-learning multilayer-perceptron-network neural-network numpy pandas python sklearn

Last synced: 03 Apr 2026

https://github.com/gui-sitton/games

Identify patterns that determine whether a game is successful or not. This will allow you to identify potential big winners and plan advertising campaigns.

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 18 May 2026

https://github.com/wikidata/purdue-data-mine-2024

Program materials for WMDE's 2024 Purdue Data Mine project

analytics data-analysis data-quality data-science etl open-data python wikidata wikimedia

Last synced: 12 May 2025

https://github.com/jagoda11/elastic-vision

This repository contains a full-stack application designed to explore data from ElasticSearch 🧐 indices and visualize it using charts and graphs. The backend is built using Node.js and the frontend is powered 🚀 by React.

backend chartjs dashboard-development data-analysis data-visualization docker elasticsearch frontend fullstack javascript material-ui monorepo mui-x node pie-chart react restful-api tables

Last synced: 09 Apr 2026

https://github.com/xjwllmsx/profitable-app-profiles

Analyzes Google Play & App Store data to recommend profitable profiles for free, ad-supported mobile apps

data data-analysis data-cleaning jupyter pandas python

Last synced: 18 May 2026

https://github.com/pyramidheadshark/ai-mirea-sem1p

Completed set of all MIREA AI and DA practices (1st semester)

beginner-friendly data-analysis data-science jupyter mirea

Last synced: 05 Apr 2025

https://github.com/sgb31/covid-19-data-analysis

In this project, I analyzed COVID-19 data to explore trends, case growth, and key patterns. I worked on cleaning the data, performing exploratory analysis, and visualizing infection rates, recoveries, and fatalities. The goal was to gain insights into how the pandemic evolved and its overall impact.

data-analysis data-visualization matplotlib pandas python seaborn

Last synced: 13 May 2026

https://github.com/shubhamprajapati7748/end-to-end-house-price-prediction

A machine learning model that accurately predicts housing prices using the Boston Housing dataset by analyzing various house features, and it utilizes a CatBoost model to assist potential buyers or sellers in estimating housing prices.

boston-housing-price-prediction data-analysis data-science-projects machine-learning regression regression-models

Last synced: 30 Oct 2025

https://github.com/ddihora1604/social_media_analysis

A powerful, interactive dashboard for analyzing social media conversations, trends, and network dynamics. This tool allows researchers and analysts to explore patterns in social media data, identify key trends, and detect coordinated behavior.

aiml css data-analysis data-visualization html javascript python

Last synced: 30 Oct 2025

https://github.com/jwt218/sinc

MATLAB Standardization and Isotope Normalization for CSIA (with integrated correction and uncertainty quantification)

data-analysis geochemistry isotopes matlab

Last synced: 23 Jun 2025

https://github.com/jofaval/boston-housing

Regression Analysis into the Boston Housing in-demand pricing in 1978

boston-housing data-analysis data-science data-visualization machine-learning python regression

Last synced: 16 May 2026

https://github.com/nikbarb810/motif_detection_in_r

Motif Detection for TFBS in Glycolysis and Glyconeogenesis pathways

bioinformatics data-analysis null-hypothesis pwm r

Last synced: 23 Jun 2025

https://github.com/adriangalvanzamora/ecommerce-analytics-olist

Data analysis project based on the Olist Brazilian E-Commerce dataset. Includes data cleaning, exploratory analysis, delivery performance metrics, customer satisfaction modeling, and geospatial insights. Built entirely in Python (Jupyter Notebook) using real-world data from Kaggle.

brazil customer-satisfaction data-analysis data-visualization ecommerce folium geospatial-analysis machine-learning matplotlib notebook pandas plotly python seaborn

Last synced: 06 May 2026

https://github.com/rociobenitez/airbnb-data-mining

Detailed analysis and predictive modeling of accommodations in Madrid using Big Data techniques and statistics in R, focused on data optimization and prediction of property characteristics.

airbnb data-analysis data-mining estadistica prediction-model predictive-analytics predictive-modeling qmd r rstudio

Last synced: 23 Jun 2025

https://github.com/drisskhattabi6/meteo-data-mining

This repo applies data mining techniques to analyze meteorological (meteo) data. The objective is to extract meaningful insights and patterns from the data that can aid in understanding weather phenomena and predicting future weather conditions.

cart data-analysis data-mining data-visualization decision-making decision-tree extract-data extract-insights insights-analytics insights-data k-means knn machine-learning svm

Last synced: 21 Mar 2025

https://github.com/debjyotisaha/data-analytics-projects-phase-2

Developed and showcased various data analytics projects, including data preprocessing, exploratory data analysis, and visualization. Utilized tools such as Python, Pandas, NumPy, and Matplotlib to derive actionable insights and demonstrate problem-solving capabilities.

data-analysis data-preprocessing eda matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/jayita11/eda-student-exam-performance

This project performs Exploratory Data Analysis (EDA) and hypothesis testing on student performance data. It explores trends based on attributes like gender, race/ethnicity, parental education, lunch type, and test preparation course completion.

data-analysis eda hypothesis-testing matplotlib pandas python seaborn statsmodels student-performance-analysis

Last synced: 11 Jul 2025