An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.

https://github.com/scikit-learn/scikit-learn

scikit-learn: machine learning in Python

data-analysis data-science machine-learning python statistics

Last synced: 12 Nov 2025

https://github.com/pandas-dev/pandas

Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more

alignment data-analysis data-science flexible pandas python

Last synced: 16 Dec 2025

https://github.com/metabase/metabase

The easy-to-use open source Business Intelligence and Embedded Analytics tool that lets everyone work with data 📊

analytics bi business-intelligence businessintelligence clojure dashboard data data-analysis data-visualization database metabase mysql postgres postgresql reporting slack sql-editor visualization

Last synced: 29 Jan 2026

https://github.com/gradio-app/gradio

Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!

data-analysis data-science data-visualization deep-learning deploy gradio gradio-interface hacktoberfest interface machine-learning models python python-notebook ui ui-components

Last synced: 28 Jan 2026

https://github.com/gchq/cyberchef

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

compression data-analysis data-manipulation encoding encryption hashing parsing

Last synced: 12 May 2025

https://gchq.github.io/CyberChef/

The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis

compression data-analysis data-manipulation encoding encryption hashing parsing

Last synced: 18 Mar 2025

https://github.com/dataease/dataease

🔥 An open-source BI tool that anyone can use and a powerful data visualization tool. An open-source BI tool alternative to Tableau.

apache-doris business-intelligence data-analysis data-visualization echarts kettle superset tableau

Last synced: 22 Jan 2026

https://github.com/sinaptik-ai/pandas-ai

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

ai csv data data-analysis data-science data-visualization database datalake gpt-4 llm pandas sql text-to-sql

Last synced: 15 Jan 2026

https://github.com/airbytehq/airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.

bigquery change-data-capture data data-analysis data-collection data-engineering data-integration data-pipeline elt etl java mssql mysql pipeline postgresql python redshift s3 self-hosted snowflake

Last synced: 09 Sep 2025

https://github.com/allinurl/goaccess

GoAccess is a real-time web log analyzer and interactive viewer that runs in a terminal in *nix systems or through your browser.

analytics apache c caddy cli command-line dashboard data-analysis gdpr goaccess google-analytics monitoring ncurses nginx privacy real-time terminal tui web-analytics webserver

Last synced: 13 May 2025

https://github.com/kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 09 Sep 2025

https://github.com/akfamily/akshare

AKShare is an elegant and simple financial data interface library for Python, built for human beings! An open-source financial data interface library.

academic akshare asset-pricing bond currency data data-analysis data-science datasets economic-data economics finance finance-api financial-data fundamental futures option quant stock

Last synced: 20 Jan 2026

https://github.com/openrefine/openrefine

OpenRefine is a free, open source power tool for working with messy data and improving it

data-analysis data-science data-wrangling datacleaning datacleansing datajournalism datamining java journalism opendata reconciliation wikidata

Last synced: 11 Jan 2026

https://github.com/guipsamora/pandas_exercises

Practice your pandas skills!

data-analysis exercise pandas practice tutorial

Last synced: 23 Apr 2025

https://github.com/tangyudi/ai-learn

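An artificial intelligence learning roadmap collecting nearly 200 hands-on cases and projects, with free accompanying teaching materials, suitable for complete beginners and oriented toward job-ready practice. Covers popular areas including Python, mathematics, machine learning, data analysis, deep learning, computer vision, natural language processing, PyTorch, TensorFlow, data mining, data science, Caffe, Keras, algorithms, NumPy, pandas, Matplotlib, seaborn, NLP, and CV.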

algorithm artificial-intelligence caffe cv data-analysis data-mining data-science deep-learning keras machine-learning mathematics matplotlib nlp numpy pandas python pytorch seaborn tensorflow tensorflow2

Last synced: 14 May 2025

https://github.com/gonum/gonum

Gonum is a set of numeric libraries for the Go programming language. It contains libraries for matrices, statistics, optimization, and more

data-analysis go golang graph matrix scientific-computing statistics

Last synced: 10 May 2025

https://github.com/alluxio/alluxio

Alluxio, data orchestration for analytics and machine learning in the cloud

alluxio data-analysis data-orchestration hadoop memory-speed presto spark tensorflow virtual-distributed-filesystem

Last synced: 08 Jan 2026

https://github.com/scikit-learn-contrib/imbalanced-learn

A Python Package to Tackle the Curse of Imbalanced Datasets in Machine Learning

data-analysis data-science machine-learning python statistics

Last synced: 12 May 2025

https://github.com/jeecgboot/JimuReport

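"Data visualization tool: reports, big screens, dashboards." JimuReport is a reporting tool and data visualization product with an Excel-like operating style and online drag-and-drop design. Its features cover report design, big-screen design, print design, chart reports, dashboard portal design, and more, and it is completely free. Following a product philosophy of "simple, easy to use, professional", it greatly reduces the difficulty of report development, shortens development cycles, and solves all kinds of reporting problems.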

bi bigscreen birt data-analysis data-visualization dataease datav echart echarts finereport highcharts ireport jasperreport jfreechart metabase print redash report superset tableau

Last synced: 27 Mar 2025

https://github.com/rhiever/data-analysis-and-machine-learning-projects

Repository of teaching materials, code, and data for my data analysis and machine learning projects.

data-analysis data-science evolutionary-algorithm ipython-notebook machine-learning python

Last synced: 14 May 2025

https://github.com/flyteorg/flyte

Scalable and flexible workflow orchestration platform that seamlessly unifies data, ML and analytics stacks.

data data-analysis data-science dataops declarative fine-tuning flyte golang grpc hacktoberfest kubernetes kubernetes-operator llm machine-learning mlops orchestration-engine production python scale workflow

Last synced: 12 May 2025

https://github.com/microsoft/taskweaver

A code-first agent framework for seamlessly planning and executing data analytics tasks.

agent ai-agents code-interpreter copilot data-analysis llm openai

Last synced: 13 May 2025

https://github.com/airbnb/knowledge-repo

A next-generation curated knowledge sharing platform for data scientists and other technical professions.

data data-analysis data-science knowledge

Last synced: 14 May 2025

https://github.com/cube2222/octosql

OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

cli csv data-analysis go golang json mysql nosql postgresql query query-engine redis sql

Last synced: 14 May 2025

https://github.com/SpiderClub/weibospider

⚡ A distributed crawler for Weibo, built with Celery and Requests.

data-analysis distributed-crawler python3 sina weibo weibospider

Last synced: 26 Mar 2025

https://github.com/javascriptdata/danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 14 May 2025

https://github.com/nyandwi/machine_learning_complete

A comprehensive machine learning repository containing 30+ notebooks on different concepts, algorithms and techniques.

computer-vision data-analysis data-science data-visualization datascience deep-learning keras machine-learning matplotlib neural-networks nlp numpy open-source pandas python scikit-learn seaborn tensorflow

Last synced: 12 Apr 2025

https://github.com/bramblexu/pydata-notebook

Chinese translation notes for Python for Data Analysis, 2nd Edition (2017)

chinese-translation data-analysis jupyter-notebook pandas python-for-data-analysis

Last synced: 14 May 2025

https://github.com/whoiskatrin/sql-translator

SQL Translator is a tool for converting natural language queries into SQL code using artificial intelligence. This project is 100% free and open source.

data-analysis data-engineering dataquery datascience dataset openai postgresql query sql

Last synced: 14 May 2025

https://github.com/has2k1/plotnine

A Grammar of Graphics for Python

data-analysis grammar graphics plotting python

Last synced: 29 Jan 2026

https://github.com/residentmario/missingno

Missing data visualization module for Python.

data-analysis data-visualization missing-data pandas python

Last synced: 13 May 2025

https://github.com/briefercloud/briefer

Dashboards and notebooks in a single place. Create powerful and flexible dashboards using code, or build beautiful Notion-like notebooks and share them with your team.

analytics bi bigquery briefer business-intelligence businessintelligence dashboard data-analysis data-visualization jupyter notebook postgres postgresql reporting visualization

Last synced: 13 May 2025

https://github.com/clidey/whodb

A lightweight next-gen data explorer for Postgres, MySQL, SQLite, MongoDB, Redis, MariaDB, Elasticsearch, and ClickHouse, with a chat interface

anthropic chatgpt clickhouse data-analysis database elasticsearch explorer golang lightweight mariadb mongodb mysql ollama openai parcel postgresql reactjs sqlite3 typescript

Last synced: 29 Dec 2025

https://github.com/antonycourtney/tad

A desktop application for viewing and analyzing tabular data

csv data-analysis data-science database desktop-application duckdb parquet-viewer pivot-tables pivots tabular-data

Last synced: 22 Mar 2025

https://github.com/aksnzhy/xlearn

High-performance, easy-to-use, and scalable machine learning (ML) package, including linear models (LR), factorization machines (FM), and field-aware factorization machines (FFM), with Python and command-line interfaces.

data-analysis data-science factorization-machines ffm fm machine-learning statistics

Last synced: 14 May 2025

https://github.com/pydata/pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.

data data-analysis dataset econdb economic-data fama-french finance financial-data fred html pandas pydata python stock-data

Last synced: 14 May 2025

https://github.com/apache/incubator-devlake

Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.

dashboard-friendly data data-analysis data-engineering data-integration data-transfers devops domain-layer dora etl golang hacktoberfest integration jira open-source user-friendly

Last synced: 16 Jan 2026

https://github.com/jm199504/financial-knowledge-graphs

Workflow for building a small financial knowledge graph (neo4j / python / cypher / KG)

cypher data-analysis graph-database neo4j python

Last synced: 15 May 2025

https://github.com/root-project/root

The official repository for ROOT: analyzing, storing and visualizing big data, scientifically

c-plus-plus cling data-analysis geometry graphics hacktoberfest interpreter machine-learning mathematics parallel physics python root root-cern statistics visualization

Last synced: 12 May 2025

https://github.com/hicccc77/WeFlow

WeFlow - a local application for exporting WeChat chats and generating annual reports

annual-report data-analysis data-visualization message wechat

Last synced: 09 Feb 2026

https://github.com/ajcr/100-pandas-puzzles

100 data puzzles for pandas, ranging from short and simple to super tricky (60% complete)

data-analysis numpy pandas python

Last synced: 14 May 2025

https://github.com/mito-ds/mito

Jupyter extensions that help you write code faster: Context aware AI Chat, Autocomplete, and Spreadsheet

ai data data-analysis data-science data-visualization jupyter jupyterhub jupyterlab jupyterlab-extension pandas python streamlit-component

Last synced: 16 Dec 2025

https://github.com/justinzm/gopup

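Data interfaces: Baidu, Google, Toutiao, and Weibo indices, macroeconomic data, interest rate data, currency exchange rates, "Qianlima" (rising-star) and unicorn companies, Xinwen Lianbo news transcripts, film and TV box office data, university lists, epidemic data…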

covid19-data data data-analysis data-science datasets economic-data gopup index-data python

Last synced: 14 May 2025