An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/zrkhadija/data-analysis-for-financial-time-series

In this notebook, we performed data analysis on financial time series data from Yahoo Finance for the US market. We examined seasonality, trends, stationarity, and other aspects such as outliers and correlations.

autocorrelation correlation-analysis data-analysis financial-analysis time-series-analysis timeseries-forecasting visualization

Last synced: 09 Feb 2026

https://github.com/frikishaan/browsing-history-analysis

This is a data analysis of my browsing history for the last 7 months.

browsing-history data-analysis jupyter-notebook python

Last synced: 18 May 2026

https://github.com/pangeo-data/foss4g-2022

Pangeo tutorial at FOSS4G 2022

data-analysis hvplot pangeo time-series xarray

Last synced: 12 Apr 2025

https://github.com/lachlanharrisdev/project-eidolon

// A modular OSINT pipeline framework that makes information gathering feel like cheating — because it almost is.

cybersecurity data-analysis docker enterprise infosec modular osint python

Last synced: 01 Mar 2026

https://github.com/marios-mamalis/mca-visualisation

A script for automatic 3D visualisation of Multiple Correspondence Analysis (MCA) results from FactoMineR using Plotly (exported as HTML).

3d-scatterplots correspondence-analysis data-analysis factominer html mca multiple-correspondence-analysis plotly visualisation

Last synced: 29 Jun 2026

https://github.com/1994nikunj/nlp-toolkit-desktop-app

The code is a collection of NLP analyses, including text cleaning, most common words, n-grams generation, co-occurrence matrix generation, wordcloud generation, topic modeling (using Latent Dirichlet Allocation), and general text statistics.

data-analysis n-grams network-visualization nlp python text-cleaning topic-modeling wordcloud-generator

Last synced: 18 Jul 2025

https://github.com/beckversync/probability-and-statis_computer-parts-cpus-and-gpus-ics_

Probability and statistical analysis techniques are employed to explore data related to computer components, such as CPUs, GPUs, and Integrated Circuits (ICs). The objective is to uncover trends, identify patterns, and extract meaningful insights from real-world hardware data.

data-analysis r

Last synced: 18 Feb 2026

https://github.com/yukito0209/predict-podcast-listening-time

Kaggle · Playground Prediction Competition, Playground Series - Season 5, Episode 4

data-analysis ensemble-learning jupyter-notebook kaggle-competition machine-learning prediction

Last synced: 10 Apr 2025

https://github.com/hariprashad-ravikumar/ai-datascience-lab

AI‑DataScience‑Lab is a web app for uploading CSV datasets, cleaning with Pandas, and running quick exploratory analyses and regression models using scikit‑learn. Its modular design supports future AI extensions, like deep learning with TensorFlow or insight generation via the OpenAI API.

ai api azure cloudcomputing data data-analysis data-science data-visualization mathplotlib numpy openai pandas python scikit-learn

Last synced: 02 Aug 2025

https://github.com/felixcharotte/ibm_datascience_capstone

In this project, we predicted if the SpaceX Falcon 9 first stage will land successfully by following the data science methodology. We also summarized the results for the business stakeholders.

analysis data-analysis data-science data-visualization databases folium jupyter-notebook machine-learning machine-learning-alrgorithms matplotlib pandas plotly plotly-dash python scikit-learn scipy seaborn sql

Last synced: 26 Jul 2025

https://github.com/justin-pyne/dota-liquipedia-web-scraper

Scraping information off Liquipedia from DOTA leagues with BeautifulSoup/Pandas for statistical analysis/EDA.

bs4 csv data-analysis pandas python scraper

Last synced: 13 Jul 2025

https://github.com/mwoss/mlflow-stock-market-example

Stock market prediction - machine learning pipeline using MLFlow.

anaconda data-analysis databricks example lstm mlflow python stock-market stock-price-prediction tutorial

Last synced: 21 Jul 2025

https://github.com/tesfamichael12/solar-farm-analysis

This repository contains code and analysis for exploring solar farm data from Benin, Sierra Leone, and Togo. It includes EDA, strategic recommendations for optimal solar farm locations, and an interactive Streamlit dashboard.

data-analysis eda ml solar-farm-analysis

Last synced: 07 Aug 2025

https://github.com/kevinyang372/san-francisco-crime-data-analysis

An ARIMA prediction model for forecasting potential crimes based on users' time and location

data-analysis machine-learning

Last synced: 29 Oct 2025

https://github.com/bocaletto-luca/world-bank-explorer

World Bank Explorer is an interactive and responsive web application that retrieves, visualizes, and compares global development indicators sourced from the World Bank Open Data API. The application allows users to explore data on multiple scales ... By Bocaletto Luca

api bocaletto-luca chartjs css3 data-analysis data-visualization economic-indicatos economic-trends free-data global-development html5 interactive-dashboard javascript open-data open-source publicdata responsive world-bank

Last synced: 12 Mar 2026

https://github.com/CAIDA/submarine-cable-impact-analysis-public

This repository contains tools implemented for the PAM 2020 paper "Unintended consequences: Effects of submarine cable deployment on Internet routing" to collect and analyze data depicting the impact of the South-Atlantic Cable System (SACS) launch on Internet routing. This codebase can be extended to other use-cases of cable launches, failures, etc.

africa-americas africa-south-america bgp-data-analysis caida-ark-measurement-platform data-analysis historical-traceroutes impact internet-routing ripe-atlas-measurement-platform sacs-cable sail-cable submarine-cables

Last synced: 06 Apr 2025

https://github.com/nikoshet/exploratory-data-analysis-using-r

Exploratory Data Analysis using R: course project for the M.Sc. 'Data Science and Machine Learning' at NTUA

data data-analysis data-science eda exploratory-data-analysis ggplot2 r

Last synced: 14 May 2026

https://github.com/techytushar/india-odi-analysis

Analysis of ODI cricket matches played by the Indian team

cricket data-analysis data-science pandas plotting python3

Last synced: 05 May 2026

https://github.com/poga/dat-ipynb-demo

Use an IPython notebook to analyze data in a Dat archive

dat data-analysis distributed jupyter-notebook

Last synced: 17 Aug 2025

https://github.com/aloth/power-bi-book-resources

Official resources for "Teach Yourself VISUALLY Power BI" by Alexander Loth (Wiley). Get all Power BI project files (.pbix) and datasets to follow along with the visual, step-by-step exercises in the book.

analytics bi business-analytics business-intelligence dashboards data-analysis data-cleaning data-modeling datavisualization dax etl microsoft microsoft-power-bi power-bi-desktop power-platform powerbi powerquery reporting sql visualization

Last synced: 19 Feb 2026

https://github.com/quantumudit/uk-student-accommodation-analysis

This project scrapes student property data from the UK Student Accommodation website, applies the necessary transformations to the scraped data, and then analyzes and visualizes it using Jupyter Notebook and Power BI.

data-analysis data-science data-transformation data-visualization etl jupyter-notebook power-bi python webscraping

Last synced: 27 Apr 2026

https://github.com/yash22222/data-analysis-with-python

This repository provides a practical introduction to data acquisition and analysis using Pandas. It covers loading datasets, exploring data, manipulating data, and gaining insights through statistical summaries. Ideal for beginners, it offers code examples and explanations to enhance your data manipulation skills using Pandas for Python.

binning data data-acquisition data-analysis data-binning data-cleaning data-formatting data-integration data-normalization data-preprocessing data-science data-transformation data-wrangling dataframe description numpy pandas pandas-dataframe python python3

Last synced: 09 Apr 2026

https://github.com/emaasit/pydata-book

Learning data analysis with Python

data-analysis jupyter pandas python

Last synced: 12 Jul 2025

https://github.com/umbrellaleaf5/drugdesign_data_analysis

Module of the DrugDesign project responsible for loading and pre-processing data from ChEMBL and PubChem, necessary for further modeling and analysis in drug development

chembl chemistry dafe data-analysis doxygen-documentation mipt pubchem python requests

Last synced: 15 Aug 2025

https://github.com/joanacmbarros/ardm-website

Website to support the R in Pharma 2023 workshop on the ARDM

analysis-results automation clinical-data data-analysis data-model r-in-pharma

Last synced: 03 Apr 2025

https://github.com/praju-1/pandas

The library is widely used in data science and machine learning for data cleaning, preparation, and analysis.

data-analysis pandas python

Last synced: 17 Feb 2026

https://github.com/emptymalei/mini-lab

Some code snippets used to explain stuff to myself in my personal data science wiki

data-analysis data-mining data-science data-visualization datascience

Last synced: 07 Apr 2025

https://github.com/anonympins/data-primals-engine

Manage and automate your data at scale 🚀 With data-primals-engine you get workflows, dashboards, alerts, i18n, client integration & AI assistant — all open-source, all MongoDB powered.

api automation data data-analysis data-engineer data-visualization database expressjs low-code mongodb nodejs rest-api

Last synced: 07 Mar 2026

https://github.com/wfamous/fiv_update-data

This project automates the retrieval, processing, and publishing of digital product data for our Shopify store. It integrates Google Cloud Platform (GCP), Amazon Web Services (AWS), Terraform (Tofu), Python, Bash, Ansible, and GitHub Actions to manage data pipelines efficiently.

ansible aws bash data data-analysis data-science devops gcp python pythonpackage shopify terraform tofu

Last synced: 17 Feb 2026

https://github.com/atxtechbro/flightradar24

Advanced Python application leveraging the power of APIs and the pandas library to retrieve and perform in-depth analysis of flight data from Flightradar24. It uncovers insights such as the most common departure and arrival cities, contributing to the field of aviation data science.

api-integration aviation-data data-analysis data-science data-visualization flightradar24-api pandas-library python requests-library web-scraping

Last synced: 21 Mar 2025

https://github.com/rgalyeon/machine_learning_and_data_analysis

Machine Learning and Data Analysis specialization by Yandex and MIPT

coursera data-analysis data-science machine-learning mipt python yandex

Last synced: 03 Mar 2025

https://github.com/archived-blueprints/postgresql-blueprints

Simplified blueprints for building data pipelines with PostgreSQL.

cli data-analysis data-engineering data-pipeline data-science database elt etl postgres postgresql

Last synced: 29 Jul 2025

https://github.com/deep-diver/data-analysis-on-titanic

Applying data analysis to the Titanic data sheet

data-analysis titanic-data

Last synced: 30 Mar 2025

https://github.com/rapidsurveys/oldr

An Implementation of the Rapid Assessment Method for Older People (RAM-OP)

assessment data-analysis odk r ram-op rapid-assessment

Last synced: 12 Apr 2025

https://github.com/kaguya163/ankara_coffee_sales_analysis

"Coffee shop sales analysis in Ankara. SQL, Tableau, Python, Data Analytics"

data-analysis mysql python sql tableau

Last synced: 17 Feb 2026

https://github.com/prathmesh2507/india-superstore-powerbi-dashboard

Interactive India Superstore Sales Dashboard built using Power BI

business-intelligence dashboard data-analysis data-visualization powerbi

Last synced: 16 May 2026

https://github.com/cego669/datathonengopevi

Team: Embrapeiros. Proposed solution for the Datathon of the VI ENGOPE (Encontro Goiano de Probabilidade e Estatística, the Goiás Meeting on Probability and Statistics). Note: we were the champions!

data-analysis data-science datathon python r streamlit xgboost-classifier

Last synced: 18 Feb 2026

https://github.com/deep-diver/enron-data-analysis

Data Analysis and Machine Learning on Enron Data

data-analysis enron-data exploratory-data-analysis machine-learning

Last synced: 08 Jan 2026

https://github.com/prathmesh2507/coffee-sales-powerbi-dashboard

Interactive Coffee Shop Sales Dashboard built using Power BI

business-intelligence dashboard data-analysis data-visualization dax powerbi

Last synced: 16 May 2026

https://github.com/prithivsakthiur/data-board

Data Boards - Visualization of various plots (Analysis)

data-analysis gradio huggingface keras mathplotlib pandas plots pyplot scikit-learn seaborn spaces

Last synced: 25 Feb 2026

https://github.com/michaelcurrin/water-crisis-scraper

Scrape and explore data related to Cape Town's water crisis (Python3 application)

cape-town cron csv dam-levels data-analysis html open-data python3 schedule scraping south-africa water-crisis water-level webscraping

Last synced: 28 Jul 2025

https://github.com/nabilalibou/uber_fare_prediction_explained

This repository documents a complete ML workflow to model Uber fares in Paris, from granular EDA and feature engineering to building and fine-tuning a stacking regressor on 10k real-world rides.

data-analysis data-science eda feature-engineering machine-learning predictive-analytics pricing-model python regression-model stacking-ensemble uber

Last synced: 30 Jun 2026

https://github.com/elysian01/ml-eda-and-modelling-using-streamlit

Beautiful Web interface made using Streamlit for quick Exploratory Data Analysis and building classification models which are implemented from scratch.

data-analysis data-visualization eda exploratory-data-analysis knn-classification logistic-regression matplotlib ml-model-on-web ml-models naive-bayes-classifier pandas seaborn streamlit streamlit-webapp

Last synced: 12 Apr 2025

https://github.com/leandronasx/agro-data

Final project for the data analyst and dashboard training program at SoulCode Academy.

bigquery data-analysis gcp looker pandas powerbi python

Last synced: 18 Jul 2025

https://github.com/BigBangData/TimesheetAnalysis

R shiny app to help analyze a bookkeeper's business - or anyone with a timesheet and some time.

bookkeeping data-analysis data-viz r-programming shiny-apps shiny-r timesheet-management

Last synced: 29 Jul 2025

https://github.com/thecoderpinar/reta

🍃 Explore the world of renewable energy production, analyze historical data, and predict sustainable energy trends. Join us on the journey to a greener future!

arima clean-energy data-analysis data-science data-visualization energy-future forecasting-models innovation renewable-energy sustainability time-series

Last synced: 03 Apr 2025

https://github.com/rawsashimi1604/jobextract

Scrapes LinkedIn data. Conducts sentiment analysis on what traits and qualifications employers are looking for.

data data-analysis data-analytics data-cleaning linkedin mvc python webscraper

Last synced: 06 Nov 2025

https://github.com/thecoderpinar/worldpopulationanalysis2024

World Population Analysis 2024: An In-Depth Exploration of Urban and Rural Populations and Infrastructure Accessibility

data-analysis data-science economic-indicators machine-learning population-growth prophet-forecasting

Last synced: 03 Apr 2025

https://github.com/quantumudit/demographic-data-analysis

This project analyzes and finds correlations among three important metrics across 195 countries: birth rate, internet users, and income group.

data-analysis jupyter-notebook power-bi python

Last synced: 15 May 2026

https://github.com/himanshu231204/featurementor-ai

🧠 AI-powered Feature Engineering Mentor for ML students. Upload any CSV → get smart preprocessing recommendations with Google Gemini explanations. Learn WHY, not just HOW. Built with Streamlit + Python. ⭐ Star if useful!

data-analysis data-preprocessing data-science feature-engineering generative-ai pdf-report streamlit

Last synced: 04 Apr 2026

https://github.com/zachlagden/spotify-listening-analyzer

A comprehensive Python tool for analyzing your Spotify listening history data.

analytics data-analysis pandas python spotify-web-api spotipy

Last synced: 31 Jul 2025

https://github.com/sarincr/data-analytics-with-knime

Data Analytics with KNIME (Konstanz Information Miner), a free and open-source data analytics, reporting and integration platform. KNIME integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and the use of JDBC allow assembly of nodes blending different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization with little or no programming.

ai artificial-intelligence artificial-intelligence-algorithms artificial-neural-networks data-analysis data-mining data-science data-structures data-visualization database datascience deep-learning machine-intelligence machine-learning machine-learning-algorithms machinelearning mining mining-software

Last synced: 14 Mar 2025

https://github.com/nafisalawalidris/analyzing-nobel-prize-dataset-demographics-and-trends

This project analyses a Nobel Prize dataset using Python and data analysis libraries. It explores the distribution of winners by category and country, examines the proportion of female winners over time, investigates the age of winners when they received the prize and identifies the oldest and youngest recipients.

age-at-award country-distribution data-analysis data-manipulation dataset demographics filtering gender-balance grouping nobel-prize notable-laureates python trends visualisation winners

Last synced: 19 May 2026

https://github.com/i4ds/ecallisto_ng

Ecallisto NG is a Python package tailored for interacting with Ecallisto data.

data-analysis data-visualization e-callisto ecallisto-international-network numpy pandas python spectrometer

Last synced: 13 Oct 2025

https://github.com/souvik09-tech/walmart_sales_dataanalysis

This end-to-end data analysis project leverages Python for processing and SQL for advanced querying to extract key business insights from Walmart sales data. It's designed for data analysts to enhance skills in data manipulation, querying, and pipeline creation.

data-analysis end-to-end etl-pipeline jupyter-notebook mysql mysql-database pandas python

Last synced: 17 Feb 2026

https://github.com/chaitanyac22/customer-segmentation-using-rfm-analysis-for-online-retail

This project uses RFM (Recency, Frequency, and Monetary) segmentation to analyze customer behavior and provide insights for targeted marketing campaigns. By classifying customers based on their purchasing patterns, strategies can be tailored to improve customer retention, drive growth, and maximize the lifetime value of each customer.

customer-segmentation data-analysis data-science data-visualization exploratory-data-analysis marketing numpy pandas python python3 rfm rfm-analysis rfm-segmentation

Last synced: 11 Jun 2025

https://github.com/phollemans/cwutils

CoastWatch Utilities software for working with satellite data files from NOAA CoastWatch and elsewhere

cdat coastwatch-utilities data-analysis data-visualization install4j java noaa-coastwatch remote-sensing satellite-imagery

Last synced: 02 May 2025

https://github.com/simoneas02/data-science

🐍 A planning study to become a data scientist and to improve my current skills. 🤘🏼🌻

data data-analysis data-science data-visualization deep-learning machine-learning pandas python3 r sql

Last synced: 12 Apr 2026

https://github.com/shibam120302/black-friday-sales-data-analysis

This repository contains a data analysis of Black Friday sales data using various regression ML algorithms.

data-analysis eda machine-learning python random-forest regression

Last synced: 20 May 2026

https://github.com/qytz/finchan

An event processing framework for Python 3.

data-analysis data-science dispatch-events event-driven python3

Last synced: 27 Mar 2026

https://github.com/arjo129/image-sorter

Sort through folders of videos and images. Root out blurred and overexposed images.

computational-photography data-analysis photo-browser photo-gallery photography uwp uwp-apps

Last synced: 25 Jul 2025

https://github.com/casualcomputer/sql.mechanic

Functions that generate SQL queries that summarize high-dimensional tables stored in various databases (e.g. Microsoft SQL Servers, Netezza, DB2, Postgres, Oracle, MySQL, etc.).

data-analysis data-quality-checks data-science database mysql netezza oracle postgres quality-control r sql sql-server

Last synced: 30 Jul 2025

https://github.com/fbecerra/fbecerra.github.io

Source code for my website www.fernandobecerra.com

data-analysis data-science data-visualization dataviz interactive-visualizations

Last synced: 20 Mar 2025

https://github.com/cosmoduende/r-marvel-vs-dc

DC Comics vs Marvel Comics - Exploratory Data Analysis and Data Visualization with R. Who has the smartest, strongest, fastest, or most powerful hero or villain? How to answer this and more questions with R

comics data-analysis data-analysis-r data-analytics data-visualization dataviz dc-characters dc-comics eda exploratory-analysis exploratory-data-analysis exploratory-data-visualizations marvel-characters marvel-comics marvel-vs-dc shdb superherodb superheroes superheros

Last synced: 11 Apr 2025

https://github.com/thisisashukla/survival-analysis

Hands-On Survival Analysis in Python

data-analysis data-science survival-analysis

Last synced: 28 Jul 2025

https://github.com/tameronline/tameronline

Showcasing Projects on Data Analysis, Programming, and AI — Developed Using Python and Modern Frameworks

data-analysis deep-learning flask machine-learning numpy pandas python3 sql web-development

Last synced: 11 Jun 2025

https://github.com/leonism/customer-predictive-analysis

Explore this repository, a comprehensive resource offering an in-depth guide to conducting customer predictive analysis using cutting-edge machine learning techniques, all within the intuitive framework of Dataiku.

data-analysis data-model data-science data-visualization dataiku machine-learning predictive-modeling

Last synced: 28 Mar 2025

https://github.com/bkataru/physics-ia

Programs and files written for Astrostatistics for IB Physics IA. Topic: Visualizing and analyzing the habitable zones for 150,000 stars from the Hipparcos catalogue.

astronomical-algorithms astronomy astrophysics astrostatistics data-analysis data-science data-visualization matplotlib plotting

Last synced: 07 Jul 2025

https://github.com/muzammil-13/data_analysis-inmakes

A data-driven project that leverages machine learning to predict Bitcoin price trends. Using historical Bitcoin data, this analysis provides 30-day price forecasts through advanced statistical modeling.

data-analysis data-science learning-by-doing machine-learning numpy pandas python python-library task

Last synced: 19 Feb 2026

https://github.com/patex1987/temperature-calibration

Notebook for sensor calibration evaluation

calibration data-analysis jupyter-notebook sensor

Last synced: 20 Jun 2025

https://github.com/johntocci/nullaxe

Nullaxe is a powerful and user-friendly Python library designed for cleaning and preprocessing data. It works seamlessly with both pandas and polars DataFrames, making it a versatile tool for data scientists and developers.

data data-analysis data-science datacleaning pandas polars python

Last synced: 06 Apr 2026

https://github.com/bala-ceg/digital-payment-index

This project aims to develop an index for the digital transactions of India

collaborate data-analysis fintech hacktoberfest machine-learning statistics

Last synced: 20 Jun 2025

https://github.com/aalkiyumi/senior-design-project

Web scraper for collecting product and review data from e-commerce sites using Scraping Bee, AWS, Selenium, and Pandas. Focuses on cost-effective solutions, user-friendly interfaces, and efficient data extraction and analysis.

aws cs5001 data-analysis data-extraction data-processing data-storage e-commerce-analytics e-commerce-data pandas product-reviews review-sentiment-analysis scraping-bee selenium senior-design-project uc uc2026 university-of-cincinnati web-crawlers web-scraping

Last synced: 18 Feb 2026

https://github.com/arzan101/ev--car-data-analysis

This Power BI dashboard provides an interactive and data-driven overview of the electric vehicle (EV) landscape. It visualizes key insights across various dimensions including sales trends, model performance, manufacturer comparisons, and market growth. The purpose of the dashboard is to enable stakeholders to explore and analyze development

data-analysis data-science data-visualization database datacleaning excel powerbi

Last synced: 17 Jun 2025

https://github.com/milind220/hk-air-quality-analysis

My final project for a statistics and data analysis course. Whew that was a lot of graphs!

data-analysis jupyter-notebook numpy pandas python python3 scipy seaborn statistics

Last synced: 12 Apr 2026

https://github.com/gallillio/data_science-data_visualizer_tool

Supervised ML Helper is a Python application that streamlines exploratory data analysis (EDA) and preprocessing for supervised machine learning. Featuring a user-friendly Tkinter interface, it enables users to load CSV files, visualize data, and perform essential transformations, making data preparation accessible for all skill levels.

data-analysis data-science data-visualization matplotlib numpy pandas seaborn sklearn

Last synced: 17 Feb 2026