An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/maddieemihle/pandas-challenge

Python analysis to create and manipulate school and standardized test data. Scores are calculated, grouped, aggregated, summarized, and organized using pandas.

data-analysis pandas-python

Last synced: 09 Jun 2026

https://github.com/bhavna-kale/cars-eda-project

Project analyzing used car market data to identify high-impact price drivers and depreciation curves, presented through an interactive web application.

data-analysis excel matplotlib numpy pandas python3 searborn streamlit

Last synced: 03 May 2026

https://github.com/stepankuzmin/machine-learning-data-analysis

My homeworks on Coursera Machine Learning and Data Analysis specialization

coursera data-analysis jupiter machine-learning python

Last synced: 03 May 2026

https://github.com/chaedoll/analysis-python-foreignerinfra

국내 외국인 대상 인프라 개선을 위한 보고서 (Report on improving infrastructure for foreigners)

data-analysis python team-project

Last synced: 03 May 2026

https://github.com/ababic/dumpling

Fast, flexibile, powerful static data anonymisation for SQL dumps

anonymisation cli data-analysis data-science pii pii-redaction postgres privacy rust rust-lang scrubber scrubbing security tooling

Last synced: 03 May 2026

https://github.com/brunomontezano/sleep-quality-cognition

💤 Analysis of the paper "Associations between general sleep quality and measures of functioning and cognition in subjects recently diagnosed with bipolar disorder".

bipolar-disorder cognition data-analysis sleep-analysis sleep-research

Last synced: 15 Jun 2026

https://github.com/muskanmi/data_analysis_python

Data analysis on students result dataset using python libraries.

boxplot countplots data-analysis numpy pandas pie-chart python3 seaborn

Last synced: 03 May 2026

https://github.com/gaboelc/analysis-of-the-employment-situation-in-costa-rica-2018-2022

This is an analysis with data extracted from the INEC in order to identify the changes that occurred in the Costa Rican labor market before, during and after the COVID-19 pandemic.

costa-rica data-analysis empleo employment

Last synced: 24 Mar 2025

https://github.com/rachit1084/sql-practice-ankit-bansal

Personal SQL problem-solving practice based on Ankit Bansal's YouTube series, with logic-driven solutions for analyst prep.

analytics data-analysis data-analyst interview-preparation logical-reasoning postgresql sql sql-practice

Last synced: 04 Jul 2025

https://github.com/matteospanio/speed-analysis

A project to analyze the internet speed

bash-script data-analysis

Last synced: 03 May 2026

https://github.com/syarwinaaa09/analyzing-crime-in-los-angeles

Exploratory data analysis of Los Angeles crime data with insights on temporal patterns, locations, and age demographics.

crime-data data-analysis eda los-angeles pandas public-safety python visualization

Last synced: 03 May 2026

https://github.com/nathadriele/world-marathon-run-majors-analytics-challenge

This project presents a complete data engineering, analytics, machine learning, and Streamlit dashboard pipeline focused on the Abbott World Marathon Majors: Tokyo, Boston, London, Berlin, Chicago, and New York City. Covering the 2018 to 2025 seasons, it analyzes more than 628,000 runner records and 86 verified winner entries.

challenge data-analysis data-pipeline gradient-boosting lasso-regression linear-regression machine-learning models predictive-modeling python random-forest ridge-regression run-analytics world-marathon

Last synced: 09 Jun 2026

https://github.com/prathmesh2507/global-stock-intelligence-dashboard

Interactive Global Stock Market Analytics Dashboard built using Python, YFinance, Pandas, Streamlit, and Plotly. Analyze 20+ countries and 400+ top stocks with advanced visualizations and financial insights.

dashboard data-analysis data-visualization python stock-analysis streamlit

Last synced: 15 Jun 2026

https://github.com/r13i/cheapest-phone-call

Small challenge to find the best phone operator to use based on call price

big-data big-data-analytics cheapest data-analysis data-cruncher pandas phone-number pricelist

Last synced: 04 May 2026

https://github.com/fatihilhan42/the-office-eda

Data analysis study of my favorite sitcom, The Office (US).

data-analysis data-science data-visualization fatihilhan office python sitcom

Last synced: 04 May 2026

https://github.com/soham7998/data-analysis-projects

My Data Analysis Projects which are completed by me and gain a hands on Experience from each project. the project showcase different Concepts , Visualization and many things.

data data-analysis data-science machine-learning nlp python soham visualization

Last synced: 04 May 2026

https://github.com/mchenryspagg/investigate_a_dataset

This is a data analysis project that demonstrates the student's ability to use python data analysis libraries such as pandas, numpy and pyplot in matplotlib to investigate a dataset and answer specific questions from the dataset, thus demonstrating skills in data cleaning, data wrangling, and exploratory data analysis.

data-analysis datetime descriptive-analysis descriptive-statistics exploratory-data-analysis numpy pandas pyplot python visualization

Last synced: 04 May 2026

https://github.com/halyusa16/e-commerce-analysis

This project analyzes a public e-commerce dataset to uncover valuable insights and answer critical business questions. The dataset contains customer, product, order, and transaction details, providing a comprehensive view of the e-commerce platform's operations.

data-analysis data-cleaning data-exploration data-visualization self-project

Last synced: 09 Jun 2026

https://github.com/youssefyaser/scrape-the-imdb-site-for-the-top-250-movies

Web scraping the top 250 movies in IMDB site.

data-analysis numpy pandas python

Last synced: 04 May 2026

https://github.com/matt-ags/jornada-python

Repositório com os projetos realizados durante a semana "Jornada Python" - 01/2025

artificial-intelligence automation data-analysis jupyter-notebook machine-learning python

Last synced: 05 May 2026

https://github.com/rtlich/sap-sustainable-management

Project for the ERP & BI course at Esprit School of Engineering. It optimizes resource and operations management in an agri-food company using SAP MM & PM, focusing on sustainability, CO₂ reduction, and predictive maintenance.

angular business-intelligence data-analysis flask machine-learning ocr powerbi python sql-server talend

Last synced: 05 May 2026

https://github.com/anderson-andre-p/uber-data-analysis

This repository contains a comprehensive data analysis project focused on Uber rides. The dataset used in this project is a spreadsheet obtained from Uber, containing data related to ride details, such as pick-up and drop-off locations, date and time of the ride, and the fare amount.

data-analysis data-science data-visualization python

Last synced: 15 Jun 2026

https://github.com/sajjad425/edaipl

The dataset covers the Indian Premier League (IPL) with details on matches (date, teams, venue, results), player stats (runs, wickets), team stats (wins, losses), season summaries, and umpire info. The EDA reveals patterns and insights, highlighting dominant teams, star players, and trends across seasons.

data-analysis eda exploratory-data-analysis ipl python

Last synced: 05 May 2026

https://github.com/pcanadas/weather_scraper

Este proyecto automatiza la recopilación y el procesamiento de datos meteorológicos históricos y previsionales. Utiliza Selenium para extraer información de sitios web de clima, procesa los datos con Pandas y los almacena en archivos CSV limpios. Es ideal para análisis climáticos, visualización de datos o integración en otros sistemas.

beautifulsoup data-analysis pandas python selenium

Last synced: 05 May 2026

https://github.com/ankitmishralive/machinelearning

Continuously deep diving in understanding & advancing my expertise in Machine Learning through ongoing education and hands on experience with practical learning.

artificial-intelligence data-analysis data-cleaning data-gathering machine-learning machinel-learning-algorithms matplotlib numpy pandas python seaborn

Last synced: 22 Mar 2025

https://github.com/eubrunoo/beer-consumption-predictor

An R project analyzing the impact of environmental factors on beer consumption in São Paulo, with a predictive linear regression model.

data-analysis data-science data-visualization machine-learning r statistical-analysis statistics

Last synced: 02 Apr 2025

https://github.com/astropenguin/optimap

Optimized integrated intensity map method for spectral cubes

astronomy data-analysis data-science python python3 radio-astronomy spectral-cubes

Last synced: 09 Apr 2025

https://github.com/benjaminrose/data-analysis-book

A Jupyter Book for my Spring 2025 PHY 5381 class on Data Analysis

book data-analysis data-science data-visualization jupyter-book open-book python r statistics-course

Last synced: 06 May 2026

https://github.com/yashpaneliya/bank-loan-default-analysis

Analyze and understand the driving factors (or driver variables) behind loan default, i.e. the variables which are strong indicators of default.

data-analysis loan-default-analysis matplotlib numpy pandas python

Last synced: 06 May 2026

https://github.com/abhinav330/customer-behavior-analysis-linear-regression

This repository explores customer behavior data for an NYC clothing company with both a mobile app and website. They want to understand which platform drives higher sales.

data-analysis data-science data-visualization eda exploratory-data-analysis jupyter jupyter-notebook linear-regression machine-learning machine-learning-algorithms machinelearning-python numpy pandas python regression-analysis

Last synced: 06 May 2026

https://github.com/juanse0330/registro-pacientes-terapia-python

Proyecto en Python para automatizar el registro y análisis de pacientes en terapia ocupacional domiciliaria. Herramienta orientada al sector salud.

automatizacion data-analysis python salud terapia-ocupacional

Last synced: 17 Jun 2026

https://github.com/rohitblaze10/netflix_analysis_using_tableau

The Netflix dashboard in Tableau provides a professional and visually captivating interface for users to explore a vast collection of TV shows and series. With seamless navigation and interactive filters, users can easily personalize their recommendations based on release year, genre, duration, and rating.

data data-analysis data-science data-visualization netflix tableau

Last synced: 04 Feb 2026

https://github.com/korniichuk/pydatan-homework

Python Data Analysis course homework

course data-analysis data-analysis-python python python3

Last synced: 06 May 2026

https://github.com/rlalpha49/anisearch-model

AniSearchModel leverages Sentence-BERT (SBERT) models to generate embeddings for synopses, enabling the calculation of semantic similarities between descriptions. This allows users to find the most similar anime or manga based on a given description.

anime api data-analysis data-merging embeddings flask hugging-face-datasets kaggle-datasets machine-learning manga natural-language-processing nlp python sentence-bert similarity-search

Last synced: 06 May 2026

https://github.com/urbanekda/upwork_dashboard

A data analysis project examining trends and patterns in the data science job market on Upwork. This project analyzes job postings, requirements, and market demands to provide insights into the freelance data science ecosystem.

data-analysis data-science data-science-projects data-visualization freelance jupyter-notebook python streamlit

Last synced: 07 May 2026

https://github.com/badranalyst/exploratory-data-analysis-on-salaries-dataset

Performing EDA on a dataset related to salaries, exploring relationships between factors like job titles, industries, and locations. Insights are visualized with plots to identify trends and disparities in salary data.

data-analysis dataset eda exploratory-data-analysis pandas python

Last synced: 07 May 2026

https://github.com/vikpires/ds_tips-dataset

Projeto individual do bootcamp de ciência de dados avanti 2024.2, com o objetivo de analisar e observar padrões no conjunto de dados "Tips".

data-analysis data-science data-visualization exploratory-data-analysis jupyter-notebook matplotlib numpy pandas python seaborn tips

Last synced: 17 Sep 2025

https://github.com/ibttf/bayborhood

Interactive map to find the ideal neighborhood in San Francisco based on data.

data data-analysis data-visualization gis mapbox react

Last synced: 18 Jun 2026

https://github.com/ilhanseyhanx/car-price-prediction-with-machine-learning

🚗 ML-powered car price prediction model with 95.88% accuracy using Random Forest and comprehensive data preprocessing

car-price-prediction data-analysis data-science machine-learning pandas python random-forest regression sklearn

Last synced: 19 Jun 2026

https://github.com/biginformatics/git-basics

Hands-on Git and GitHub lessons for analysts and statisticians

data-analysis git github public-health training

Last synced: 10 Jun 2026

https://github.com/bnvulpe/regression-and-time-series

This work centers on assessing and comparing predictive models for regression and time series prediction using specific datasets, with the goal of selecting the most effective methodology for unseen test data.

colab data-analysis data-analysis-python data-science data-visualization forecasting jupyter-notebook machine-learning model-evaluation predictive-modeling python regression sarima sarimax time-series-analysis time-series-analysis-and-forecasting

Last synced: 08 May 2026

https://github.com/otonomee/against-the-clock-transcript-analysis

This repository contains code and analysis for exploring the transcripts of the various "Against The Clock" videos featured on the FACT Magazine YouTube channel. The goal is to uncover insights, patterns, and trends across the different artists and their creative process under time constraints.

against-the-clock ai-analysis audio-processing creative-ai creative-process data-analysis fact-magazine machine-learning music-production natural-language-processing nlp text-mining yt-dlp

Last synced: 08 May 2026

https://github.com/shahaf-f-s/feature-space

A modular framework for combining pandas series features

data-analysis data-science feature-engineering

Last synced: 19 Jun 2026

https://github.com/0290192029/apartment-price-predictor

Python-проект по прогнозированию стоимости аренды квартир с помощью линейной регрессии. Практическая работа по теме: "Основы машинного обучения" дисциплины "МДК 13.01: Основы применения методов искусственного интеллекта в программировании".

apartment-price-prediction apartments-for-rent api correios-api data-analysis feature-engineering feature-enginering linear-regression linear-regression-models mlops numpy prediction-model r seaborn

Last synced: 08 May 2026

https://github.com/sumit-sinha9/sales-analysis

Analyzing 12 months worth fo Sales data

data-analysis pandas python visualization

Last synced: 08 May 2026

https://github.com/vanshuchaudhary/flightpriceanalysis-

The uploaded file is a Jupyter Notebook titled "Flight Analysis". It likely involves analyzing flight-related data, potentially exploring trends, patterns, or insights using data science techniques. The analysis might include data visualization, statistical analysis, or predictive modeling.

business-analytics data data-analysis data-visualization datainsights datascience matplotlib-pyplot python seaborn seaborn-plots seaborn-python sns statistical-analysis

Last synced: 08 May 2026

https://github.com/satvikpraveen/numpymasterpro

A hands-on, production-ready toolkit to master NumPy — from first principles to real-world applications. Includes modular Jupyter notebooks, reusable utility scripts, cheatsheets, and advanced projects like K-Means clustering from scratch.

broadcasting data-analysis data-science data-source data-visualization jupyter-notebook kmeans-clustering linear-algebra machine-learning matrix-algebra numerical-computation numpy numpy-broadcasting numpy-examples numpy-tutorial open-source python scientific-computing standardization vectorization

Last synced: 08 May 2026

https://github.com/deepanshkhurana/udacityproject-prediciting-boston-housing-prices

This is a Udacity Project for the Machine Learning Nanodegree. Here, we are trying to predict Boston Housing Prices using sklearn.

data-analysis data-science machine-learning python scikit-learn udacity

Last synced: 08 May 2026

https://github.com/chdre/data-analyzer

A small package to analyze and preprocess data.

data-analysis python

Last synced: 28 Jun 2026

https://github.com/mahapeth/invest-track

Реализация инструмента для мониторинга активности пользователей ИС "Инвест" для ВКР по направлению 01.03.02 Прикладная математика и информатика

analitycs app data-analysis data-visualization jupyter-notebook python sites

Last synced: 20 Jun 2026

https://github.com/alejandrolara11/data-preprocessing

Data preprocessing through the use of the libraries NumPy and pandas.

data-analysis data-cleaning data-preprocessing numpy pandas python

Last synced: 09 May 2026

https://github.com/mrunmayee3108/financial-chatbot

A Python chatbot for analyzing financial data of companies with revenue, income, assets, cash flow, and debt ratio queries

chatbot data-analysis jupyter-notebook pandas python python3

Last synced: 09 May 2026

https://github.com/trismald/eurosoccer1023

Data Analyst - European Soccer 2010 2023

data-analysis data-visualization jupyter-notebook pandas powerbi python

Last synced: 06 May 2026

https://github.com/manishkaa/google_data_analytics_capstone_case_study

This case study is a part of Google Data Analytics Capstone Project

bigquery data-analysis sql tableau

Last synced: 05 Oct 2025

https://github.com/vishal786-commits/target-businesscasestudy-sql

This project analyzes Target’s e-commerce transactions in Brazil between 2016 and 2018 using SQL. The goal was to explore customer behavior, order patterns, payments, delivery times, and freight costs to generate actionable business insights.

bigquery data-analysis sql

Last synced: 05 Oct 2025

https://github.com/subhamghimire/dataanavis

Learning Data analysis and visualization

data-analysis data-science data-visualization dataset

Last synced: 06 Oct 2025

https://github.com/data-edd/mastering_sql

This is a repo documenting me mastering sql

data-analysis mysql mysql-database sql

Last synced: 06 Oct 2025

https://github.com/pranavsp108/time-series-forcasting

A time-series forecasting project to predict hourly energy consumption using Python, Pandas, and an XGBoost regression model.

data-analysis data-science energy-consumption forecasting matplotlib numpy pandas python scikit-learn sustainability time-series xgboost

Last synced: 10 Apr 2026

https://github.com/ehsan-behzadi/online-retail-data-analysis-and-preprocessing

This project analyzes and preprocesses the Online Retail dataset to uncover insights into customer purchasing behaviors, sales trends, and product performance. It includes data cleaning, exploration, and visualization, with the goal of enhancing understanding of online retail dynamics.

cohort-analysis data-analysis data-cleaning data-exploration duplicate-detection exploratory-data-analysis-eda feature-encoding feature-engineering handling-missing-values online-retail outlier-detection preprocessing trends-visualization visualization z-score-method

Last synced: 16 Apr 2026

https://github.com/dacq-trap/dacq

DacQ Score Server

data-analysis go nextjs

Last synced: 14 Jan 2026

https://github.com/paphada1103/data-analysis-with-python

📊 Analyze data efficiently using Python’s top libraries. Learn to explore, clean, and visualize data for meaningful insights in your projects.

carpentries data-analysis data-carpentry data-visualisation dataframe-api dataset english hacktoberfest ibm jovian lsl machine-learning matplotlib programming python realtime social-sciences spark

Last synced: 09 May 2026

https://github.com/michaelcurrin/yahoo-finance-reports

Use the Yahoo Finance API to get info on shares of interest and report on them

data-analysis data-science python reporting shares stock-market yahoo-finance yahoo-finance-api

Last synced: 07 Oct 2025

https://github.com/eharshit/end-to-end-vendor-insights

End-to-end analysis of vendor performance for wholesale/retail businesses, featuring data ingestion, cleaning, insights, and interactive Power BI dashboards.

analysis analysis-algorithms analytics dashboard data data-analysis datascience jupyter jupyter-notebook pandas powerbi powerbi-report retail wholesale

Last synced: 07 Oct 2025

https://github.com/edanur-y/airline-customer-satisfaction-prediction-with-multiple-logistic-regression

Performing multiple logistic regression analysis on airline and customer data to predict the satisfaction. 🔵R

data-analysis missing-values-analysis multiple-logistic-regression optimal-cut-off-points r

Last synced: 09 Jun 2026

https://github.com/abhipatel35/svm-hyperparameter-optimization-for-breast-cancer

Utilizing SVM for breast cancer classification, this project compares model performance before and after hyperparameter tuning using GridSearchCV. Evaluation metrics like classification report showcase the effectiveness of the optimized model.

breast-cancer cancer-diagnosis classification data-analysis data-science gridsearchcv healthcare hyperparameter-tuning jupyter-notebook machine-learning medical-imaging pycharm python scikit-learn support-vector-machine svm

Last synced: 05 Feb 2026

https://github.com/npodlozhniy/podlozhnyy-module

One place for the most useful methods for work

data-analysis data-science pypi

Last synced: 21 Jan 2026

https://github.com/aymanmomin/excel-coffee-data-analytics-exploring-coffee-orders-dataset

This project utilizes a coffee orders dataset to perform comprehensive data analytics and gain insights into customer preferences, popular items, and sales trends. The analysis aims to provide valuable information for coffee shop owners and enthusiasts, facilitating data-driven decision-making and improved customer satisfaction.

data-analysis data-visualization excel project

Last synced: 18 Jan 2026

https://github.com/drod75/burger_king_analysis

A simple analysis on a burger king dataset.

data-analysis data-visualization jupyter-notebook pandas python seaborn

Last synced: 09 May 2026

https://github.com/rizkipragustono/data_analysis_spark

Exploration: Data Analysis using Spark

apache-spark data-analysis pyspark python spark-sql sql

Last synced: 09 May 2026

https://github.com/ndomah1/learning-probability-and-statistics

This repo is a comprehensive learning resource that covers fundamental to advanced topics in probability and statistics, including probability theory, descriptive and inferential statistics, probability distributions, regression analysis, and data exploration techniques.

correlation-analysis data-analysis descriptive-statistics exploratory-data-analysis hypothesis-testing inferential-statistics probability regression statistics

Last synced: 18 Jan 2026

https://github.com/aalekhpatel07/statcan

StatCAN dataset fetcher and cleaner.

census data-analysis data-science statcan

Last synced: 02 Apr 2025

https://github.com/alexquilis1/spanish-fuel-stations-analysis

Real-time analysis of Spanish fuel prices using government API data with interactive maps and regional comparisons

data-analysis data-visualization fuel-prices geospatial-analysis ggplot2 government-data leaflet open-data r shiny spain tidyverse

Last synced: 08 Oct 2025

https://github.com/emanoelcampos/python-onemonth

This repository contains educational materials and projects developed during a Python course offered by OneMonth. It covers Python basics, intermediate concepts, web development with Flask, and data analysis with pandas. The course is structured into weeks, each focusing on a different aspect of Python programming and its applications.

data-analysis flask jupyter-notebook onemonth python python3

Last synced: 09 May 2026