An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
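
The four steps named above can be sketched on a toy dataset. This is a minimal illustration using only the Python standard library; the records and the function names (inspect, cleanse, transform, model) are hypothetical and are not taken from any project listed here.

```python
from statistics import mean

# Hypothetical raw records; the values are made up for illustration.
raw = [
    {"city": "Oslo", "temp_c": "4.5"},
    {"city": "Oslo", "temp_c": ""},  # missing value
    {"city": "Lima", "temp_c": "19.0"},
    {"city": "Lima", "temp_c": "21.0"},
]


def inspect(rows):
    """Inspecting: count rows with a missing temperature."""
    return sum(1 for r in rows if not r["temp_c"])


def cleanse(rows):
    """Cleansing: drop rows whose temperature is missing."""
    return [r for r in rows if r["temp_c"]]


def transform(rows):
    """Transforming: parse temperatures as floats and group them by city."""
    grouped = {}
    for r in rows:
        grouped.setdefault(r["city"], []).append(float(r["temp_c"]))
    return grouped


def model(grouped):
    """Modeling: summarize each city by its mean temperature."""
    return {city: mean(values) for city, values in grouped.items()}


missing = inspect(raw)
summary = model(transform(cleanse(raw)))
print(missing, summary)
```

Real projects usually replace each step with a library call (for example, a DataFrame library for cleansing and transforming), but the order of the steps tends to stay the same.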

https://github.com/cosmoduende/r-spotify-history-analysis

Explore your activity on Spotify with R and "spotifyr": How to analyze and visualize your streaming history and music tastes. Find out how and how much you consume from Spotify, using a copy of your personal data and the "spotifyr" package

analisis-de-data analytics data-analysis data-analysis-r data-analytics data-visualization r-language r-programming sentiment-analysis spotify-analysis spotify-api spotify-connect spotify-data spotify-playlist spotify-streaming-history spotify-web-api spotifyr streaming-history visualizacion-de-datos visualizaciones

Last synced: 11 Apr 2025

https://github.com/selva221724/edasql

edaSQL is a Python library that bridges SQL and exploratory data analysis: you connect to a database and run queries, and the query results can be passed to the EDA tool to give the user greater insight.

correlation data-analysis data-science data-visualization dataprofiling eda missing-values outlier-detection pandas python sql

Last synced: 10 Jun 2025

https://github.com/data-forge/data-forge-fs

This library contains the file system extensions to Data-Forge that allow it to directly read and write CSV and JSON files in Node.js

csv data data-analysis data-cleaning data-cleansing data-forge data-management data-manipulation data-munging data-visualization data-wrangling javascript json linq nodejs pandas visualization

Last synced: 04 Sep 2025

https://github.com/cool-japan/pandrs

DataFrame library for data analysis implemented in Rust. It has features and design inspired by Python's pandas library, combining fast data processing with type safety.

data-analysis data-science datafrane pandas rust rust-lang

Last synced: 04 Apr 2026

https://github.com/alluxio/k8s-operator

An operator for managing an Alluxio system on a Kubernetes cluster

alluxio data-analysis data-orchestration kubernetes kubernetes-operator machine-learning

Last synced: 15 Aug 2025

https://github.com/29dch/ai_ml_dataanalysis_datavisualization_classic-examples

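Several classic examples in AI, ML, DA, DV, and related areas, including a traffic jam simulation (NagelSchreckenberg), the Monte Carlo Queuing Problem, face recognition (RecognitionFace), and image inference with a genetic algorithm (IconGenetic)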

ai data-analysis data-visualization ml

Last synced: 29 Apr 2025

https://github.com/zjunlp/datamind

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

agent artificial-intelligence data-analysis data-science language-model natural-language-processing

Last synced: 04 Oct 2025

https://github.com/jaumebonet/rosettasilenttoolbox

Python Toolbox For Rosetta Silent Files Processing

data-analysis data-visualization protein-design protein-sequence science

Last synced: 06 Sep 2025

https://github.com/noduslabs/infranodus

A Node.js / Neo4j tool that translates words and relations into network graphs and shows you how it all connects.

data-analysis datavisualization dataviz graph-visualization javascript neo4j network-analysis nodejs text-mining visualization

Last synced: 21 Jan 2026

https://github.com/lcsrodriguez/hft-intradayvol-estimation

Intraday volatility estimation using High-Frequency Financial Data

data-analysis hft hft-data microstructure-noise python q realized-volatility

Last synced: 17 May 2026

https://github.com/lotfiferaga/eda-python

Exploratory Data Analysis with Python

data-analysis pandas python visualization

Last synced: 27 Apr 2025

https://github.com/pnnl-comp-mass-spec/proteomics-data-analysis-tutorial

A comprehensive tutorial for proteomics data analysis in R that utilizes packages developed by researchers at PNNL and from Bioconductor.

data-analysis proteomics

Last synced: 18 Jan 2026

https://github.com/contextlab/data-wrangler

Wrangle messy numerical, image, and text data into consistent well-organized formats

data data-analysis data-science data-wrangling hugging-face image-data machine-learning nlp numpy pandas python scikit-learn

Last synced: 10 Apr 2025

https://github.com/vuthanhhai2302/apply-machine-learning-on-data-analytics

My project of applied machine learning on data analytics, using pandas, numpy and scikit-learn to analyze data

data-analysis numpy pandas scikit-learn

Last synced: 28 Apr 2025

https://github.com/pottekkat/go-corona

Live viz and updates of COVID 19. Let us fight this together!

corona covid-19 covid-19-india data-analysis data-visualization health infographics spread tableau

Last synced: 30 Jul 2025

https://github.com/vishnu-t-r/sql_solved_questions

In this repository you will find SQL queries at multiple difficulty levels that can be applied to large datasets for various business requirements.

business-solutions complex-sql data-analysis sql-data-analysis sql-queries

Last synced: 02 Mar 2026

https://github.com/ilyasmoutawwakil/fcc-data-analysis-with-python

My 5 Data Analysis projects that I've built as part of my freeCodeCamp assignment.

certification course-project data-analysis fcc fcc-assignment fcc-certification fcc-data python

Last synced: 27 Jul 2025

https://github.com/nicejade/play-with-python

Learn Python with hands-on practice, so you can make better use of Python-related skills and tools.

ai automation data-analysis excel python python3 sendmail spider visualization

Last synced: 08 Apr 2025

https://github.com/tbep-tech/tbeptools

R package for Tampa Bay Estuary Program functions

data-analysis package tampa-bay tbep water-quality

Last synced: 19 Feb 2026

https://github.com/AurelienAubry/Spotlight

Spotlight is a Spotify dashboard that lets users visualize their listening habits.

backend bootstrap chartjs data data-analysis data-science data-visualization flask frontend javascript js pandas python python3 react react-bootstrap spotify

Last synced: 15 Apr 2025

https://github.com/imartinezl/formula-one-viewer

Formula 1 standings and point scoring systems 🏎️🏁

data-analysis data-visualization formula1 interactive-visualizations js

Last synced: 28 Feb 2026

https://github.com/otherwa/csvdash

Analyze and visualize CSV data with ease using this Streamlit-powered data analytics tool

analytics data-analysis huggingface llama-index pandas python statistics streamlit

Last synced: 15 Apr 2025

https://github.com/cosmoduende/r-ggsoccer

StrangeR things: Visualizing Soccer Data with R… on a Soccer Pitch? How to analyze, visualize and report soccer data and strategies on a soccer pitch with the "ggsoccer" package

data-analysis data-analysis-in-r data-analytics ggsoccer package packages r-language r-package r-programming r-programming-projects r-studio soccer soccer-analytics soccer-data soccer-game soccer-matches soccer-simulation

Last synced: 11 Apr 2025

https://github.com/masasron/shraga

Private, powerful data analysis—right in your browser.

data-analysis frontend llm

Last synced: 14 Jul 2025

https://github.com/ztjhz/sc1015-project

Predict the success of an anime using data science and machine learning (regression + classification)

anime data-analysis data-science machine-learning

Last synced: 22 Mar 2025

https://github.com/saisurajmatta/bike-sales-excel-dashboard-project

Bike Sales Excel Dashboard Project: Analyzed and visualized sales data, cleaned datasets, and created interactive dashboards in Excel.

data-analysis data-analytics data-cleaning data-visualization excel excel-dashboard excel-data-analytics pivot-tables

Last synced: 11 Feb 2026

https://github.com/vidhi1290/machine-learning-pipeline

Explore a collection of Jupyter notebooks that guide you through various stages of the machine learning pipeline. From data analysis and feature engineering to model training and deployment, these notebooks provide practical insights for both beginners and experienced data enthusiasts. Let's dive into the world of data-driven decision-making! 📊🚀

data-analysis feature-engineering feature-selection jupyter jupyter-notebook machine-learning machine-learning-algorithms machine-learning-pipeline model-training new-dataset opensource python

Last synced: 10 Apr 2025

https://github.com/jonocarroll/runkeepr

Extract, plot, and analyse Runkeeper(TM) data.

data-analysis data-mining gis gpx rstats runkeeper

Last synced: 29 Oct 2025

https://github.com/amazingcoderpro/flight-delay-prediction

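An entry in the 2018 global programmers' competition. Building on the provided data, plus self-collected influencing factors such as aircraft and weather, it uses the SVM algorithm to predict flight delay rates.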

airline data-analysis python svm svm-classifier svm-model

Last synced: 07 May 2025

https://github.com/reycn/data-analytics-in-julia

Notebooks for data analysis in social science using Julia, replicating frequent analytical steps in Python & R.

data data-analysis data-science data-visualization julia

Last synced: 07 May 2025

https://github.com/port-zero/lens

Data structure traversal in the command line

data-analysis lens traversal

Last synced: 18 Jan 2026

https://github.com/abhay557/fakedata

The fakedata package generates realistic synthetic user profiles for machine learning, deep learning, data analysis, and data science workflows.

abhay557 anime data data-analysis data-science deep-learning fake fake-data generator joke machine-learning mock mock-data

Last synced: 30 May 2026

https://github.com/antoniosbarotsis/coronabot

My attempt at data mining and analysis on Covid-19

chartjs coronavirus covid-19 data-analysis discord-bot graph hacktoberfest

Last synced: 28 Oct 2025

https://github.com/joanacmbarros/analysis-results-data-model

Data analysis produces data in the form of results. Although integrating and putting findings into context is a cornerstone of scientific work, data of this type is often neglected. The analysis results data model is a proposed solution to this issue that combines analysis standards with a common data model.

analysis-results automation clinical-data clinical-data-management data-analysis data-management data-model data-standards data-stewardship

Last synced: 16 Aug 2025

https://github.com/wahyudesu/predicting-hotel-booking-cancellations

This project will help hotel managers optimize their booking policies, reduce cancellations, and improve revenue.

data data-analysis data-science python

Last synced: 07 Jul 2025

https://github.com/tillbiskup/aspecd

Python framework for handling spectroscopic data focussing on reproducibility

data-analysis good-practices reproducible-research reproducible-science spectroscopy

Last synced: 06 Sep 2025

https://github.com/vandermeerlab/nept

Neuroelectrophysiology tools used for the analysis of neural recording data and related behaviors

data-analysis neuroscience python

Last synced: 11 Mar 2026

https://github.com/meetpateltech/convelyze

Visualize your ChatGPT usage with interactive charts and insights.

ai chatgpt data-analysis openai

Last synced: 31 Mar 2025

https://github.com/mchenryspagg/sql-portfolio-project

A portfolio project involving a detailed analysis of 37,997 high school/college student records. It showcases key insights through effective visualizations, aimed at evaluating the factors affecting students' academic performance in high schools and colleges in the US.

data-analysis data-visualization database dataset datasets joins spreadsheet sql sqlite visualization

Last synced: 05 Sep 2025

https://github.com/pmgraham/datagrunt

Datagrunt is a Python library designed to simplify the way you work with CSV files. It provides a streamlined approach to reading, processing, and transforming your data into various formats, making data manipulation efficient and intuitive.

csv csv-parser data-analysis data-engineering data-science data-wrangling dataframe duckdb open-source polars python python3

Last synced: 26 Aug 2025

https://github.com/lonelyhentxi/minellius

An experimental GitHub data analysis solution. Group project of comp-3002, 2018 fall, hitsz.

angular assignment data-analysis electron github nestjs ng-zorro-antd typescript

Last synced: 11 Apr 2025

https://github.com/shanisoni/tata-data-visualization-empowering-business-with-effective-insights

Within the confines of this repository, you will find an extensive compilation of the assignments that were integral to my participation in the TATA Data Visualization Empowering Business with Effective Insights Virtual Experience Program. 📊 📈 📉

analysis-and-reporting analytics-and-decision-science chart communications dashboards data-analysis data-cleanup data-interpretation data-storytelling data-visualizations graph insights power-bi tableau visual-basic visualizations

Last synced: 05 Mar 2026

https://github.com/tawounfouet/mlops-specialiazation-duke

MLOps | Machine Learning Operations Specialization from Duke University : acquiring critical MLOps skills, including the use of Python and Rust, utilizing GitHub Copilot to enhance productivity, and leveraging platforms like Amazon SageMaker, Azure ML, and MLflow.

aws azure big-data cloud-computing data-analysis data-management devops mlops python rust

Last synced: 23 Apr 2025

https://github.com/thecoderpinar/big-tech-financial-insights

🚀 A comprehensive project analyzing Big Tech stock prices using time series analysis, volatility modeling, and macroeconomic indicators. Featuring interactive dashboards and automated reporting! 📈💼

data-analysis data-science finance machine-learning macroeconomics stock-analysis time-series-analysis volatility-modeling

Last synced: 03 Apr 2025

https://github.com/aditeyabaral/lok-sabha-election-twitter-analysis

Twitter feeds were analysed during the Lok Sabha Elections 2019 to gauge the overall popularity of each party and predict the winner based solely on the tweets made by the population. This was made as a part of our Data Science course (UE18CS203) at PES University.

data-analysis data-science data-visualization elections loksabha nlp prediction probabilistic-graphical-models probability python python3 sentiment-analysis sentiment-classification sentiment-polarity sentiment-scores social-media socialmediaanalytics statistical-analysis statistical-models twitter

Last synced: 16 Apr 2025

https://github.com/kaos599/vit-gpa-calculator

VIT-GPA-Calculator is a Python application that extracts course grade data from a PDF file to calculate your current CGPA and allows you to simulate grade improvements. It leverages Camelot for PDF table extraction and Pandas for data manipulation, making it a handy tool for students and educators alike.

calculator camelot cgpa cgpa-calculator cgpa-simulator data-analysis education pandas pdf python- vellore-institute-of-technology vit-bhopal vit-university

Last synced: 18 Nov 2025

https://github.com/antoineaugusti/google-search-gdpr

Transform your Google Search GDPR export in CSV

data-analysis gdpr google-search python self-data

Last synced: 28 Oct 2025

https://github.com/mattcieslak/meap

Software for cardiovascular data analysis and visualization

data-analysis electrophysiology python statistics

Last synced: 18 Sep 2025

https://github.com/super-lou/makaho

🥤 MAKAHO (for MAnn-Kendall Analysis of Hydrological Observations) is an interactive cartographic visualization system that calculates trends in data from hydrometric stations whose flows are little influenced by human activity

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae leaflet r shiny shiny-apps stationarity-test statistics

Last synced: 13 Apr 2025

https://github.com/ritvik19/vizard

Intuitive, Interactive, Easy and Quick Visualizations for Data Science Projects

data-analysis data-science data-visualization

Last synced: 10 Apr 2025

https://github.com/srohit0/ml-misc

Miscellaneous Machine Learning and Data Analysis Projects

colaboratory data-analysis data-science data-visualization google-colab machine-learning-algorithms

Last synced: 15 Apr 2025

https://github.com/shubham18024/census_analysis

This repository contains code and resources for a summer research project focused on statistical analysis of census data. The project aims to analyze demographic trends, population distributions, and other relevant metrics derived from census datasets.

census-data csv-files data-analysis data-visualization jupyter-notebook mitosheet python report statistics

Last synced: 15 Apr 2025

https://github.com/graphcore-research/kg-topology-toolbox

A Python toolbox to compute topological metrics and statistics for Knowledge Graphs

data-analysis graph-topology knowledge-graph metrics-gathering

Last synced: 17 Jan 2026

https://github.com/daisybio/prone

R Package for preprocessing, normalizing, and analyzing proteomics data

data-analysis evaluation normalization proteomics

Last synced: 25 May 2026

https://github.com/tushar2704/pyverse-exploring-python-frameworks

This repository is the ultimate guide to exploring and mastering Python libraries and frameworks: a collection of code and guides by me, Tushar!

artificial-intelligence data-analysis data-engineering data-science data-visualization machine-learning python streamlit-tushar2704 tushar2704 web-application

Last synced: 30 Oct 2025

https://github.com/jhrcook/spotify-data-analysis

The analysis of my Spotify streaming data.

data-analysis data-analysis-r r rlang rstudio spotify-data

Last synced: 30 Oct 2025

https://github.com/ayushanand18/climate-change-vs-agri

Visualizing trends and patterns in how climate change affects agriculture (Python, Jupyter Notebook)

climate-change data-analysis data-visualization matplotlib python visualization

Last synced: 28 Apr 2026

https://github.com/kaustubhgupta/blogathon-analysis

Analytics Vidhya Blogathon Data Analysis: Python Data Extraction with PowerBI dashboard

data-analysis data-mining data-scraping excel pandas powerbi powerbi-report project python tqdm

Last synced: 12 Apr 2025

https://github.com/psyteachr/quant-fun-v2

Fundamentals of Quantitative Analysis

data-analysis data-visualization datawrangling r statistics

Last synced: 11 Oct 2025

https://github.com/stats-tests/statstests

Statstests: a Python package that provides a complement of process and statistical tests for statsmodels statistical models.

count-models data-analysis hyphotesis-tests python regression-models statistics statsmodels

Last synced: 17 Mar 2026

https://github.com/ahmed-maher77/wind-turbine-power-prediction-app-using-machine-learning

"Wind Power Predictor" is a machine learning project that forecasts turbine output using real-time data from Turkish wind farms. Its web app interface offers convenient access to predictions, enabling informed decisions for maximizing energy production and advancing renewable energy usage.

ai catboost data-analysis data-science flask html-css-javascript javascript machine-learning matplotlib numpy pandas predictive-modeling pwa python sklearn web web-development wind wind-turbine wind-turbine-operational-optimization

Last synced: 10 Apr 2025

https://github.com/thecoderpinar/house-price-prediction-project

🏠 This project focuses on predicting house prices using advanced regression techniques. It involves comprehensive data preprocessing, feature engineering, and model selection. The aim is to develop an accurate predictive model for real estate prices.

data-analysis data-preprocessing data-visualization deep-learning jupyter-notebook machine-learning neural-networks python regression regression-models

Last synced: 30 Apr 2025

https://github.com/alrza2003/binancetrader

AI-powered trading bot using Python, pandas, scikit-learn, NumPy, and TensorFlow. Interacts with Binance API for cryptocurrency trading based on ZigZag indicator and AI predictions.

binance ccxt cryptocurrency data-analysis numpy pandas python statistics tensorflow trading trading-algorithms trading-strategies tradingbot

Last synced: 12 Apr 2026

https://github.com/akbaritabar/dask-duckdb-dbeaver

Parallelised and out-of-memory data analysis using Dask in Python and DuckDB and DBeaver in SQL, using the example of publicly accessible ORCID 2019 XML files

data-analysis data-science pandas parallel-computing python

Last synced: 08 Aug 2025

https://github.com/adrienc21/vulpes

Vulpes: Test many classification and regression models and clustering algorithms to see which is most suitable for your dataset

automl data-analysis data-science machine-learning models package python scikit-learn statistics

Last synced: 25 Oct 2025

https://github.com/louis-heraut/makaho

🥤 MAKAHO (for MAnn-Kendall Analysis of Hydrological Observations) is an interactive cartographic visualization system that calculates trends in data from hydrometric stations whose flows are little influenced by human activity

climate climate-change climate-data data-analysis data-visualization environment hydrology inrae leaflet r shiny shiny-apps stationarity-test statistics

Last synced: 08 Jul 2025

https://github.com/rasrea/dataviz

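This is a visualization project based on Java and Spring Boot, intended to help (novice) developers master the development model in which the front end and back end are separated. In the project, users can upload data files, preprocess the data, and display it through the front end's chart module. The project covers Spring Boot API design, data exchange between front end and back end, and data processing and visualization, and it uses Echarts charts to display the data.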

data-analysis data-visualization spring-boot

Last synced: 05 Aug 2025

https://github.com/syamkakarla98/datascience_head_start

This repository focuses on building a path into data science.

data-analysis data-science data-visualization machine-learning machinelearning-python python3

Last synced: 03 May 2025

https://github.com/dr-montasir/mnjs

MATH NODE JS (MNJS): A tiny math library for Node.js and JavaScript in the browser

data-analysis data-science javascript js jsdelivr library math nextjs npm react svelte sveltekit ts typescript yarn

Last synced: 26 Apr 2025

https://github.com/thecoderpinar/samsung_stock_analysis_forecasting_and_volatility_analysis

A comprehensive analysis and forecasting project for Samsung stock data, utilizing historical data to build predictive models and analyze volatility.

data-analysis deep-learning financial-analysis forecasting machine-learning python stock-analysis volatility-forecasting

Last synced: 31 Jul 2025

https://github.com/leosouliotis/odsc_python_da

My tutorial on "Introduction to Python for Data Analysis" at ODSCA 2022

data-analysis data-visualization pandas python

Last synced: 07 Sep 2025

https://github.com/fluhus/biostuff

Computational biology packages for Go.

bioinformatics biology computational-biology data-analysis go golang

Last synced: 12 Jan 2026

https://github.com/abubakkar32/machine-learning-practice

This GitHub repository is a valuable resource for machine learning and Python enthusiasts. It includes a wide range of projects and tools, covering topics like Data Visualization, Data Analysis, ML, DL, Automation, NLP, Web Scraping, and more. Contributors are welcome to join and learn together in this supportive community. Happy coding!

automation big-data chatbot data-analysis data-visualization datastractures machine-learning mathplotlib nlp-machine-learning numpy pandas plotly pypdf2-lib pytest python3 seaborn selenium-webdriver skit-learn webscraping

Last synced: 23 Mar 2025

https://github.com/banitalebi/data-visualization-dashboard

The Data Visualization Dashboard serves as an excellent starting point for projects focused on data analysis and representation.

data-analysis data-visualization interactive-charts nextjs shadcn-ui starter talwindcss typescript

Last synced: 10 Apr 2025