An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/rugwiroparfait/alx_sql

This repo is where I save my queries and learning materials from the ALX Data Science program.

anaconda data data-analysis jupyter-notebook sql

Last synced: 19 Aug 2025

https://github.com/rohitblaze10/survey_monkey_analysis--using-ipython

This data analysis project focuses on extracting insights from survey responses. It involves data cleaning, merging, and transformation using IPython (Pandas, OS) and SQL. The goal is to identify trends and patterns in survey data for better decision-making.

data-analysis ipynb ipython-notebook

Last synced: 28 Jul 2025

https://github.com/vetrivel07/flight-price-prediction

Developed a flight price prediction model using Python, analyzing historical data to forecast airfare prices and help travelers make informed booking decisions

data-analysis data-visualization jupyter-notebook numpy pandas python

Last synced: 15 Jun 2025

https://github.com/gui-sitton/prepaid

In this project I work as an analyst for the telecommunications company Megaline. The company offers its customers prepaid plans, Surf and Ultimate. The sales department wants to know which plans bring in the most revenue in order to adjust the advertising budget

data data-analysis data-analysis-python data-science data-visualization python

Last synced: 22 May 2026

https://github.com/buildwithlal/introduction-to-data-science-in-python-coursera

Introduction to Data Science in Python, part of the Applied Data Science with Python Specialization from the University of Michigan, offered on Coursera

data-analysis matplotlib numpy pandas

Last synced: 03 May 2026

https://github.com/ashwin331133/sql-project--sales-data-analysis--walmart

This SQL-based Walmart data analysis project aims to identify top-performing branches and products and to optimize sales strategies, using Kaggle's Walmart Sales Forecasting Competition dataset.

data-analysis eda sql

Last synced: 03 Jan 2026

https://github.com/labex-labs/sqlite-intermediate-to-advanced

In this course, delve into advanced SQLite techniques. Master constraints, indexing, joins, subqueries, transactions, triggers, views, full-text search, JSON, backups, PRAGMA tuning, CTEs, window functions, and more!

advanced-sql course data-analysis data-integrity data-manipulation data-modeling database database-design hands-on labex labs performance-tuning programming query-optimization relational-database schema-management sql sqlite stored-procedures transaction-management

Last synced: 18 May 2026

https://github.com/cassiofb-dev/fide-rating-analysis

The plot speaks for itself

chess data-analysis fide hans rating

Last synced: 15 Jun 2025

https://github.com/antononcube/wl-tilestats-paclet

Wolfram Language (aka Mathematica) paclet for statistics over 2D tilings. (Tile binning, application of aggregation functions, etc.)

2d-data data-analysis geospatial-data mathematica wolfram-language

Last synced: 20 Mar 2026

https://github.com/kineticloom/plydb-fun-nfl-analyst

Analyze NFL data with your AI agent

data-analysis football-analytics nfl

Last synced: 15 May 2026

https://github.com/archanakokate/eda_amazon_products_and_discounts_2023

Exploratory Data Analysis (EDA) on Amazon's 2023 Products and Discounts data

data-analysis data-mining data-visualization exploratory-data-analysis

Last synced: 03 Jan 2026

https://github.com/swethajoseph/statistical-stock-performance-analysis

Conducted a statistical analysis of Microsoft, Tesla, and Apple stock performance compared to the S&P 500, examining price trends, volatility, and correlations to derive investment insights.

advancedexcel comparative-analysis data-analysis data-visualization datapreparation descriptive-statistics moving-average msexcel performance-analysis performance-metrics regression-analysis statistical-analysis

Last synced: 03 Jan 2026

https://github.com/prateek5525/retail-sales-analysis-project

This project involves analyzing retail sales data using SQL to uncover insights into sales patterns, customer behavior, and product performance. It serves as an exercise to develop foundational SQL skills in data exploration, cleaning, and analysis.

data-analysis data-cleaning retail-sales-data sql

Last synced: 03 Jan 2026

https://github.com/hasinii12/-chocolate-analysis-dashboard

This Power BI report provides a comprehensive analysis of chocolate ratings and related attributes.

data-analysis data-visualization powerbi

Last synced: 09 Feb 2026

https://github.com/srummanf/elnino-anomaly-study

Study on El Niño’s impact on Chennai groundwater sustainability

data-analysis machine-learning python satellite-imagery-analysis

Last synced: 15 May 2026

https://github.com/noorulhudaajmal/business-performance-analytics

Python-Streamlit based interactive dashboard to analyze and visualize key business metrics for an online store.

business-analytics dashboard data-analysis python-streamlit

Last synced: 29 Jul 2025

https://github.com/malakasupun/crime-data-analysis-of-lapd

This project aims to explore and analyse crime patterns in Los Angeles using a dataset spanning from 2020 to the present. The primary focus is to extract meaningful insights by integrating structured data analysis and advanced techniques in SQL and Natural Language Processing (NLP).

data-analysis data-visualization llm nlp sql

Last synced: 29 Jul 2025

https://github.com/yash22222/literacy-exploration-analysis

Delve into India's literacy landscape through data analysis. Uncover regional disparities, high/low literacy states & gender imbalances.

csv data-analysis data-visualization government-data india literacy literacy-analysis states

Last synced: 29 Jul 2025

https://github.com/nguyenda18/ppp-data-tool

Command line tool (could later be used as lambda function) to download CSV files from SBA and generate JSON

data-analysis nodejs-server ppp-files ppp-loans

Last synced: 29 Jul 2025

https://github.com/cyprianfusi/data-scientist-technical-exercise-10ds

Includes recommendations to the UK Department for Education on 10 Local Authorities where the National Tutoring Programme (NTP) should be intensified, and a response to the UK Secretary of Health regarding a 76% Accident and Emergency (A&E) performance target that seems far-fetched.

data-analysis data-cleaning data-visualization hypothesis-testing pandas-python policy statistics

Last synced: 21 Sep 2025

https://github.com/naso7y/twitter-sentiment-analysis

Classifies airline-related tweets as positive, negative, or neutral using machine learning and NLP.

data-analysis machine-learning nlp sentiment-analysis

Last synced: 29 Jul 2025

https://github.com/mjshubham21/ny_yellow_taxi_python_da_project

A data analysis project on New York Yellow Taxi data (February 2025) using Python and analytics libraries such as NumPy, Matplotlib, Pandas, and Seaborn.

data-analysis jupyter-notebook matplotlib numpy pandas python seaborn

Last synced: 04 May 2026

https://github.com/dcs-training/scottishaccounts

This repo contains various examples of analysis that can be performed on the Statistical Accounts of Scotland dataset. Go to the readme file

data-analysis data-visualisation data-wrangling geographical-data r rmarkdown text-analysis

Last synced: 16 Aug 2025

https://github.com/12danielll/neurogenomics_project

This project focuses on analyzing sequencing data to understand molecular mechanisms of neurological diseases and predict the effectiveness of immunotherapy in breast cancer patients. It integrates Python and R scripts for data processing, statistical analysis, and visualization, alongside a comprehensive report detailing methods and findings.

bioinformatics biostatistics clustering clustering-algorithms data-analysis data-visualization deseq2 differential-gene-expression functional-analysis immune-therapy machine-learning neurological-disease neuroscience pca-analysis python r seurat single-cell-analysis

Last synced: 06 Apr 2026

https://github.com/pentalpha/eu-car-emissions-analysis-2015

Analysis of CO₂ emissions from passenger cars in E.U. countries, year 2015.

data-analysis data-science dataset jupyter-notebook python python3

Last synced: 15 May 2026

https://github.com/pentalpha/bti-performance-study

A series of analyses of a large amount of data on the grades of students in the Information Technology course at UFRN

analysis big-data clustering data-analysis data-science data-visualization ipynb ipython jupyter-notebook performance-analysis plot python python3

Last synced: 15 May 2026

https://github.com/sebastiansauer/hans-hackathon2025

Materials for a course on the evaluation of the AI student learning tool "HaNS"

ai data-analysis evaluaton r

Last synced: 04 Oct 2025

https://github.com/mtholahan/advanced-mysqlquery-tuning-mini-project

Analyzed EuroCup 2016 data with advanced SQL queries. Imported CSV datasets into MySQL, designed schema with match, player, and referee details, and implemented queries covering match outcomes, penalty shootouts, player stats, bookings, substitutions, and referee activity to explore tournament dynamics.

bootcamp data-analysis data-engineering data-modeling database eurocup football mysql queries soccer sports springboard sql

Last synced: 15 May 2026

https://github.com/eesunmoon/genai_cor-recom

[Project] Outfit Coordination Recommender System using KoAlpaca

data-analysis fine-tuning generative-ai huggingface keyword llm numpy pandas python python3 selenium

Last synced: 06 Apr 2026

https://github.com/danicaalana/sales-review-sentiment-analysis

A sentiment analysis project built on a machine learning model. It analyzes Amazon product reviews to determine whether the sentiment expressed is positive, negative, or neutral, using the Multinomial Naive Bayes method.

amazon data-analysis data-science machine-learning naive-bayes python sales-review sentiment-analysis

Last synced: 15 May 2026

https://github.com/ozep/genshincharacteranalysis

Uses a spreadsheet with Character Data and organizes it into readable graphs.

data-analysis jypyternotebook python

Last synced: 18 Apr 2026

https://github.com/nishumehta/uber-rides-data-analysis

An in-depth analysis of Uber ride data for the year 2016, to uncover patterns in ride behavior, mileage trends, and frequent start locations to generate actionable insights for business decisions.

data-analysis jupyter-notebook matplotlib-pyplot pandas python tableau-dashboards

Last synced: 09 May 2026

https://github.com/syarwinaaa09/exploring-airbnb-market-trends

A data analysis project exploring NYC Airbnb listings, using data visualization and pandas to examine price trends, room types, and reviews.

airbnb data-analysis data-science data-visualization jupyter-notebook new-york-city nyc pandas price-analysis reviews room-types

Last synced: 30 Apr 2026

https://github.com/antrita/stroke_prediction_model

A model that combines Kaggle's Stroke Prediction Dataset with live weather and air quality data to implement an FDA-compliant MLOps pipeline, showing expertise in healthcare regulations and real-time inference.

ai data-analysis deep-learning kaggle-dataset machine-learning prediction-model random-forest real-time scikit-learn streamlit weather-api xgboost

Last synced: 07 May 2026

https://github.com/smsraj2001/sds-datathon

A simple data science project/hackathon done as part of SDS course

data-analysis data-analysis-python data-cleaning data-science statistics statistics-for-data-science

Last synced: 16 Jul 2025

https://github.com/sinsunsan/earth-survival-kit

Global warming data visualisation app to help everyone understand global warming and take actions that matter

angular angular7 d3 data-analysis data-visualization ecology global-warning ngx-charts

Last synced: 05 May 2026

https://github.com/nathadriele/transaction_fraud_prevention_pipeline

A solution for detecting and preventing fraud in financial transactions, combining Machine Learning, business rules, and advanced statistical analysis. The system offers an interactive dashboard for real-time monitoring, data analysis, and fraud alert management.

data-analysis data-visualization docker fraud-prevention machine-learning matplotlib numpy pandas pipeline pytest python scikit-learn scipy seaborn streamlit tensorflow transaction xgboost

Last synced: 10 Apr 2026

https://github.com/rijul007/market-basket-analysis-using-r

Market Basket Analysis using association rules, leveraging R’s powerful tools for data-driven retail strategies.

data-analysis data-science r

Last synced: 02 Apr 2025

https://github.com/annnieglez/computer-vision-parking-lot

This project leverages computer vision techniques to analyze parking lot occupancy. The goal is to detect available parking spaces in real-time using image and video input.

computer-vision data-analysis data-science data-visualization google-colab image-classification image-processing machine-learning python transfer-learning

Last synced: 15 May 2026

https://github.com/karencofre/riesgorelativo-lookerstudio

Data analysis and predictive analysis project in Looker Studio and Google Colab

bigquery data-analysis data-science machine-learning matplotlib python sklearn sql

Last synced: 03 Jan 2026

https://github.com/kathisnehith/realestate-sales-analysis

Investigating real estate sales trends to understand market dynamics and inform investment decisions.

data-analysis excel realestate sales sql stastical-analysis-tools tableau

Last synced: 12 Feb 2026

https://github.com/madhursinghbhadoriya/data_analysis_sales_insights_using_tableau

Performed data cleaning using MySQL. Carried out data analysis and ETL in Tableau. Created an interactive dashboard with key information on sales insights, profit, and revenue analysis.

data-analysis data-visualization dataanalysis etl mysql tableau-dashboards tableau-desktop

Last synced: 09 Apr 2025

https://github.com/kartikey2807/bike-classification-1rt700

Binary classification problem involving Logistic regression, SMOTE and feature expansion.

data-analysis data-engineering data-visualization logistic-regression

Last synced: 30 Jul 2025

https://github.com/nishumehta/retail-sales-analysis

Retail sales performance analysis using Python and Power BI.

data-analysis ipynb-notebook jupyter-notebook powerbi python

Last synced: 15 May 2026

https://github.com/prakashjha1/whatsapp-chat-analyzer

WhatsApp Analyzer analyzes WhatsApp group activity. It tracks conversations and measures how much time we spend (or “waste”) on WhatsApp.

data-analysis data-science natural-language-processing pandas pyhton regular-expression

Last synced: 15 May 2026

https://github.com/sanveed-adnan/supermarket-sales-sql-project

SQL-based data analysis project on supermarket sales performance using SQLite and Power BI.

business-intelligence data-analysis data-science data-science-projects data-visualization power-bi sales-data sql sqlite

Last synced: 08 Nov 2025

https://github.com/rachkat/random-foresst-analysis-r-studio-plotting-classification-tree

Classification analysis in R using the birthwt dataset. Built and compared Decision Tree and Random Forest models to predict low birth weight. Both achieved 71.05% accuracy, with Random Forest reducing overfitting and confirming maternal weight and age as key predictors.

classification data-analysis decision-trees machine-learning predictive-modeling r random-forest

Last synced: 04 Oct 2025

https://github.com/alanmenchaca/getting-and-cleaning-data-course-project

The purpose of this project is to demonstrate how to collect, work with, and clean a data set.

data-analysis getting-and-cleaning-data rstudio tidy-data

Last synced: 31 Jul 2025

https://github.com/teamtigers/echartify

A web application built with .NET Core 2.2 that reads Bangladesh's national election dataset as quickly as possible and then presents it with various statistical charts.

bangladesh chartjs code-first-migration cross-platform data-analysis data-structures data-visualization dotnet-core election-analysis election-data entity-framework-core materializecss mvc npoi razor-pages

Last synced: 16 Apr 2026

https://github.com/vara-co/solar-eclipse-2024

Group Project on the 2024 Solar Eclipse's Path over the US with an interactive map and a couple of visualizations on the data gathered.

data-analysis data-visualizations html-css-javascript interactive-map javascript map solar-eclipse

Last synced: 15 May 2026

https://github.com/k31ner/inmopipeline

End-to-end project for the analysis and predictive modeling of real estate data, covering collection, transformation, visualization, and machine learning using Python and modern data engineering and data science tools.

data-analysis data-engineering data-science fastapi python streamlit

Last synced: 08 May 2026

https://github.com/anas436/data-science-projects

Explore my diverse collection of projects showcasing machine learning, data analysis, and more. Organized by project, each directory contains code, datasets, documentation, and resources. Dive in to discover insights and techniques in data science. Reach out for collaborations and feedback.

data-analysis data-science machine-learning

Last synced: 27 Mar 2025

https://github.com/jovicdev97/Financial-Loan-DataScience-Notebook

Using NumPy and pandas to analyze a synthetic loan dataset with Python

data-analysis matlabplot numpy pandas plotting python seaborn

Last synced: 12 Mar 2025

https://github.com/cyberoctane29/epa-air-quality-aqi-analysis

This project involved analyzing air quality data from the EPA, focusing on the Air Quality Index (AQI). I used Python data structures like dictionaries and sets to manage and process the data, simulating real-world data analysis to assess pollution levels and their health implications.

data-analysis numpy pandas python statistics

Last synced: 10 Apr 2026

https://github.com/alrza2003/google-data-analysis-case-study-cyclistic

This project analyzes Cyclistic’s trip data to identify patterns in bike usage between casual riders and annual members. The findings help optimize marketing strategies and membership conversions.

business-task cyclistic-bike-share-analysis-case-study data-analysis data-science data-visualization google-data-analytics google-data-analytics-capstone-project google-data-analytics-professional jupyter-notebook python rmarkdown tableau

Last synced: 09 May 2026

https://github.com/ayeshathoi/simulation-sessional-412

Simulation of SSQS, Inventory System, Transient State, PERT, Monte Carlo algorithm, etc.

data-analysis excel inventory-system monte-carlo python simulation ssqs triangle-distributions

Last synced: 31 Jul 2025

https://github.com/aalkiyumi/project-3-docker-container-for-data-processing-script

This Dockerized Python application analyzes two text files (IF.txt and AlwaysRememberUsThisWay.txt). It counts total words, identifies the largest file, and finds the top three most frequent words in each. Results are saved to an output file and printed to the console.

cs5165 data-analysis data-engineering data-science docker introduction-to-cloud-computing statistical-analysis text-processing uc uc2026 university-of-cincinnati

Last synced: 17 May 2026

https://github.com/jofaval/iris-flowers

Multiclass classification of the famous Iris flower dataset introduced by Ronald Aylmer Fisher in 1936

classification data-analysis data-science data-visualization google-colab iris-flowers kaggle machine-learning python scikit-learn xgboost

Last synced: 05 Apr 2026

https://github.com/mainak-97/netflix-content-analysis-project

SQL-based analysis of Netflix’s movies and TV shows dataset to uncover content trends, popular genres, geographical insights, and audience preferences. Includes data queries, findings, and a presentation of key insights.

data-analysis mysql mysql-workbench powerpoint presentation-slides sql

Last synced: 23 Sep 2025

https://github.com/remram44/apex-legends-ocr-data

Get data from Apex Legends streams using OCR

apex-legends data-analysis video-games

Last synced: 31 Jul 2025

https://github.com/chandkund/loan-eligibility-prediction

This project is designed to predict the eligibility of loan applicants based on various factors such as income, credit history, and marital status. By analyzing historical loan application data, the model helps to determine whether a loan application should be approved or not.

data-analysis data-science data-visualization machine-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 09 Apr 2026

https://github.com/farrelfaricaf/exploratorydataanalyst---titanic

This project analyzes the Titanic dataset using exploratory data analysis (EDA) and visualization techniques to identify survival patterns. The goal is to understand how demographic factors like gender and age influenced survival rates during the 1912 disaster.

data data-analysis data-science data-visualization eda python titanic-dataset

Last synced: 31 Jul 2025

https://github.com/jpgiant/training_project

Analyzing whether the average age at death differs between left-handers and right-handers, using Bayes' theorem on conditional probabilities.

bayesian-statistics data-analysis data-visualization numpy pandas-dataframe python

Last synced: 30 Apr 2026

https://github.com/pauliorandall/airline-passenger-satisfaction-r

Analysing the Airline Passenger Satisfaction dataset from Maven Analytics

data-analysis data-analytics r

Last synced: 01 Aug 2025

https://github.com/computingvictor/mercadona_agent

Web app to explore supermarket products with advanced filters, search, favorites, and nutritional info. Includes data analysis notebooks for deeper insights.

css data-analysis data-science data-visualization filtering html interactive-ui javascript notebooks nutritional-info pandas product-catalog python supermarket webapp

Last synced: 09 Apr 2026

https://github.com/darkdk123/handwashing-discovery-analysis

A guided bootcamp project analysing the original data used by Dr. Ignaz Semmelweis at Vienna General Hospital in the 1840s in his discovery of the importance of handwashing.

data-analysis data-science data-visualization matplotlib-pyplot numpy pandas plotly-python python seaborn-plots

Last synced: 09 Apr 2026

https://github.com/kailenroa/sleep-efficiency-project

This project focuses on analyzing sleep efficiency using wearable technology data. It explores patterns in sleep behavior and key factors impacting sleep quality. A dashboard was created using Python and data visualization tools to provide actionable insights and recommendations for improving sleep health.

dashboard data-analysis html phyton sleep-efficiency

Last synced: 06 Jan 2026

https://github.com/hevalhazalkurt/word_analyser

A web app developed in Python and Django that analyzes given text mathematically and sentimentally.

analyzer analyzes content data-analysis django emotion python python3 sentiment sentiment-analyser sentiment-analysis text text-analysis

Last synced: 19 May 2026

https://github.com/aygp-dr/claude-log-stream

Advanced analytics engine for Claude Code logs with real-time processing capabilities

claude-api clojure data-analysis monitoring

Last synced: 24 Sep 2025

https://github.com/palwisha-18/time_series_analysis_lex_vs_gdp

Analyzes how a country’s GDP per capita correlates with the life expectancy of its citizens over a period of more than 100 years

data-analysis data-visualization pandas plotl time

Last synced: 19 May 2026