An open API service indexing awesome lists of open source software.

Data analysis

Data analysis is a process of inspecting, cleansing, transforming, and modeling data with the goal of discovering useful information, informing conclusions, and supporting decision-making.
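As a rough illustration of the four steps named above (inspecting, cleansing, transforming, modeling), here is a minimal sketch using only the Python standard library. The sample readings and the helper names `clean` and `fit_line` are hypothetical, invented for this example; they are not taken from any project listed on this page.

```python
import statistics

# Hypothetical raw readings; None and "n/a" stand in for missing values.
RAW = [("mon", "12.5"), ("tue", None), ("wed", "14.0"), ("thu", "n/a"), ("fri", "15.5")]


def clean(rows):
    """Cleansing: keep only rows whose value parses as a float."""
    kept = []
    for label, value in rows:
        try:
            kept.append((label, float(value)))
        except (TypeError, ValueError):
            # float(None) raises TypeError, float("n/a") raises ValueError.
            continue
    return kept


def fit_line(ys):
    """Modeling: ordinary least-squares line through the points (index, y)."""
    xs = list(range(len(ys)))
    x_mean = statistics.mean(xs)
    y_mean = statistics.mean(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


# Inspecting: count how many rows are unusable before doing anything else.
cleaned = clean(RAW)
dropped = len(RAW) - len(cleaned)

# Transforming: center the values around their mean.
values = [v for _, v in cleaned]
centered = [v - statistics.mean(values) for v in values]

slope, intercept = fit_line(values)
print(f"dropped={dropped} slope={slope:.2f} intercept={intercept:.2f}")
```

Real projects would typically reach for a DataFrame library for these steps; the standard library is used here only to keep the sketch self-contained.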

https://github.com/visactor/vtable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables react-table sparklines spreadsheet tree-table visualization vue-table

Last synced: 28 Dec 2025

https://github.com/weijie-chen/linear-algebra-with-python

Lecture Notes for Linear Algebra Featuring Python. This series of lecture notes walks you through the must-know concepts that form the foundation of data science and advanced quantitative skill sets. Suitable for statisticians, econometricians, quantitative analysts, data scientists, and others who want to quickly refresh their linear algebra with the help of Python computation and visualization.

computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space

Last synced: 15 May 2025

https://github.com/weijie-chen/Linear-Algebra-With-Python

Lecture Notes for Linear Algebra Featuring Python. This series of lecture notes walks you through the must-know concepts that form the foundation of data science and advanced quantitative skill sets. Suitable for statisticians, econometricians, quantitative analysts, data scientists, and others who want to quickly refresh their linear algebra with the help of Python computation and visualization.

computational-science data-analysis data-science data-visualization diagonalization eigenvalues eigenvectors gram-schmidt jupyter linear-algebra linear-transformations mathematics matrix matrix-calculations multivariate-normal-distribution null-space python singular-value-decomposition symmetric-matrices vector-space

Last synced: 27 Mar 2025

https://github.com/lana-k/sqliteviz

Instant offline SQL-powered data visualisation in your browser

charting csv data-analysis pivot pivot-table plotly plotting sql sqlite visualization

Last synced: 28 Dec 2025

https://github.com/justmarkham/pandas-videos

Jupyter notebook and datasets from the pandas video series

data-analysis data-cleaning data-science jupyter-notebook pandas python tutorial

Last synced: 15 May 2025

https://github.com/elementary-data/elementary

The dbt-native data observability solution for data & analytics engineers. Monitor your data pipelines in minutes. Available as self-hosted or cloud service with premium features.

analytics-engineer bigquery data-analysis data-governance data-lineage data-observability data-pipeline data-pipelines data-reliability data-warehouse dataops dbt dbt-artifacts dbt-packages lineage redshift snowflake

Last synced: 12 May 2025

https://github.com/rilldata/rill

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 13 May 2025

https://github.com/chris1610/pbpython

Code, Notebooks and Examples from Practical Business Python

data-analysis data-visualization datascience pandas python scikit-learn

Last synced: 15 May 2025

https://github.com/rilldata/rill-developer

Rill is a tool for effortlessly transforming data sets into powerful, opinionated dashboards using SQL. BI-as-code.

bi business-analytics csv data data-analysis data-visualization dataviz duckdb gcs golang parquet parquet-tools parquet-viewer s3 sql sql-editor svelte sveltejs sveltekit

Last synced: 08 Mar 2025

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering hacktoberfest lineage llmops machine-learning mlops orchestration pandas python rag software-engineering

Last synced: 26 Mar 2025

https://github.com/man-group/arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 13 May 2025

https://github.com/h2oai/datatable

A Python package for manipulating 2-dimensional tabular data structures

data-analysis data-structure ftrl performance python

Last synced: 13 May 2025

https://github.com/bellingcat/octosuite

GitHub Data Analysis Framework.

data-analysis github

Last synced: 14 May 2025

https://github.com/apachecn/python_data_analysis_and_mining_action

Code notes for the book "Python Data Analysis and Mining in Action" (《python数据分析与挖掘实战》)

data-analysis data-science python3 readingnotes

Last synced: 08 Apr 2025

https://github.com/deepnote/deepnote

Deepnote is a drop-in replacement for Jupyter with an AI-first design, sleek UI, new blocks, and native data integrations. Use Python, R, and SQL locally in your favorite IDE, then scale to Deepnote cloud for real-time collaboration, Deepnote agent, and deployable data apps. https://deepnote.com/

artificial-intelligence data data-analysis data-science data-visualization deepnote eda jupyter jupyterhub jupyterlab machine-learning notebooks python r sql

Last synced: 18 Nov 2025

https://github.com/404notf0und/AI-for-Security-Learning

Security scenarios, AI-based security algorithms, and industry practice in security data analysis

data-analysis data-mining machine-learning security

Last synced: 27 Apr 2025

https://github.com/404notf0und/ai-for-security-learning

Security scenarios, AI-based security algorithms, and industry practice in security data analysis

data-analysis data-mining machine-learning security

Last synced: 25 Mar 2025

https://github.com/jadianes/spark-py-notebooks

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

big-data bigdata data-analysis data-science ipython ipython-notebook machine-learning mllib notebook pyspark python spark

Last synced: 15 May 2025

https://github.com/datageartech/datagear

DataGear data visualization and analysis platform: freely build any data dashboard you want

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 14 May 2025

https://github.com/microsoft/responsible-ai-toolbox

Responsible AI Toolbox is a suite of tools providing model and data exploration and assessment user interfaces and libraries that enable a better understanding of AI systems. These interfaces and libraries empower developers and stakeholders of AI systems to develop and monitor AI more responsibly, and take better data-driven actions.

data-analysis data-science data-visualization error-analysis explainability explainable-ai explainable-ml fairness fairness-ai fairness-ml interpretability jupyter machine-learning machinelearning ml responsible-ai ui visualization widget widgets

Last synced: 13 May 2025

https://github.com/ecmadao/hacknical

Hacknical, hacker & technical. A website for GitHub users to make a better resume.

contribute-languages contributions data-analysis github github-analysis github-commits github-contributions reac react resume resume-template

Last synced: 08 Apr 2025

https://github.com/nubank/fklearn

fklearn: Functional Machine Learning

data-analysis data-science machine-learning ml python

Last synced: 13 May 2025

https://github.com/DataBrewery/cubes

[NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis

cube data data-analysis data-warehouse multidimensional-analysis olap sql

Last synced: 26 Mar 2025

https://github.com/man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 12 Mar 2025

https://github.com/Litlyx/litlyx

Powerful analytics solution. Set up in 30 seconds. Display all your data on a simple, AI-powered dashboard. Fully self-hostable and GDPR compliant. Alternative to Google Analytics, MixPanel, Plausible, Umami & Matomo.

ai analytics angular charts data data-analysis data-visualization javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 25 Aug 2025

https://github.com/googlecloudplatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 14 Apr 2025

https://github.com/patmartin/dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 16 May 2025

https://github.com/PatMartin/Dex

Dex : The Data Explorer -- A data visualization tool written in Java/Groovy/JavaFX capable of powerful ETL and publishing web visualizations.

d3 d3js data-analysis data-mining data-science data-visualization datavis datavisualization dataviz groovy java javafx visualization

Last synced: 04 May 2025

https://github.com/GoogleCloudPlatform/data-science-on-gcp

Source code accompanying book: Data Science on the Google Cloud Platform, Valliappa Lakshmanan, O'Reilly 2017

cloud-computing data-analysis data-engineering data-pipeline data-processing data-science data-visualization machine-learning

Last synced: 19 Jul 2025

https://github.com/singer-io/getting-started

This repository is a getting started guide to Singer.

data-analysis etl etl-framework python singer

Last synced: 14 May 2025

https://github.com/starpig1129/datagen

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 14 May 2025

https://github.com/alan-turing-institute/clevercsv

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 13 May 2025

https://github.com/starpig1129/AI-Data-Analysis-MultiAgent

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 02 May 2025

https://github.com/alan-turing-institute/CleverCSV

CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement for the builtin CSV module with improved dialect detection, and comes with a handy command line application for working with CSV files.

csv csv-converter csv-export csv-files csv-format csv-import csv-parser csv-parsing csv-reader csv-reading data-analysis data-mining data-science datascience machine-learning python python-library python3

Last synced: 26 Mar 2025

https://github.com/VisActor/VTable

VTable is not just a high-performance multidimensional data analysis table, but also a grid artist that creates art between rows and columns.

canvas-table data-analysis data-visualization database datagrid grid javascript-table javescript list-table list-tree online-excel pivot-chart pivot-grid pivot-tables sparklines spreadsheet table tree-chart tree-table visualization

Last synced: 06 Aug 2025

https://github.com/starpig1129/DATAGEN

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 17 Nov 2025

https://github.com/machow/siuba

Python library for using dplyr like syntax with pandas and SQL

data-analysis dplyr pandas python sql

Last synced: 15 May 2025

https://github.com/litlyx/litlyx

Powerful analytics solution. Set up in 30 seconds. Display all your data on a simple, AI-powered dashboard. Fully self-hostable and GDPR compliant. Alternative to Google Analytics, MixPanel, Plausible, Umami & Matomo.

ai analytics angular charts data data-analysis data-visualization javascript metrics nextjs nodejs nuxt open-source react statistics typescript vue website

Last synced: 14 May 2025

https://github.com/apachecn/pyda-2e-zh

:book: [Translation] Python for Data Analysis, 2nd Edition (利用 Python 进行数据分析 · 第 2 版)

book data-analysis numpy pandas pyda python

Last synced: 12 Apr 2025

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 14 May 2025

https://github.com/xinglie/report-designer

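⚡ Print design, visualization, label printing, editor, designer, data analysis, report design, componentization, form design, H5 pages, questionnaires, PDF generation, flowcharts, exam papers, SVG, graphic elements, IoT, label paper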

cloud-print data-analysis data-visualization editor h5-creator h5-editor h5-maker iot-demo layouts-and-renderings online-design online-printing printer snapshot visiual-editor xinglie

Last synced: 16 Mar 2025

https://github.com/d4t4x/data-selfie

Data Selfie - a browser extension to track yourself on Facebook and analyze your data.

chrome-extension data-analysis data-dashboard firefox-addon privacy

Last synced: 12 Apr 2025

https://github.com/bruin-data/bruin

Build data pipelines with SQL and Python, ingest data from different sources, add quality checks, and build end-to-end flows.

analytics bigquery data-analysis data-ingestion data-modeling data-pipelines data-platform data-transformation python snowflake sql

Last synced: 02 Jan 2026

https://github.com/kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 04 Jul 2025

https://github.com/apache/cloudberry

An advanced and mature open-source MPP (Massively Parallel Processing) database. Open-source alternative to Greenplum Database.

ai big-data c cloudberry data-analysis data-warehouse database distributed-database greenplum mpp olap postgres postgresql sql

Last synced: 14 May 2025

https://github.com/visualpython/visualpython

GUI-based Python code generator for data science, extension to Jupyter Lab, Jupyter Notebook and Google Colab.

bigdata chrome-extension code-generator data-analysis jupyter-lab-extension jupyter-notebook-extension jupyterlab-extension pandas python visual-coding

Last synced: 15 May 2025

https://github.com/chawlaavi/daily-dose-of-data-science

A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/

data-analysis data-science data-science-tips data-visualization jupyter jupyter-notebook jupyter-tips matplotlib matplotlib-tips numpy pandas pandas-tips python python-tips sklearn

Last synced: 04 Apr 2025

https://github.com/empathy87/the-elements-of-statistical-learning-python-notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 13 Apr 2025

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 11 Apr 2025

https://github.com/GoogleCloudPlatform/DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 01 May 2025

https://github.com/googlecloudplatform/dataflowjavasdk

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.

big-data data-analysis data-mining data-processing data-science google-cloud-dataflow

Last synced: 03 Oct 2025

https://github.com/empathy87/The-Elements-of-Statistical-Learning-Python-Notebooks

A series of Python Jupyter notebooks that help you better understand "The Elements of Statistical Learning" book

data-analysis data-science machine-learning python sklearn statistical-learning tensorflow tutorials

Last synced: 01 May 2025

https://github.com/androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 16 May 2025

https://github.com/yoshoku/rumale

Rumale is a machine learning library in Ruby

artificial-intelligence data-analysis data-science machine-learning ml ruby rubyml

Last synced: 29 Apr 2025

https://github.com/Androz2091/discord-data-package-explorer

🌀 What's really in your Discord Data package?

data-analysis discord discord-data-package statistics

Last synced: 31 Mar 2025

https://github.com/ChawlaAvi/Daily-Dose-of-Data-Science

A collection of code snippets from the publication Daily Dose of Data Science on Substack: http://www.dailydoseofds.com/

data-analysis data-science data-science-tips data-visualization jupyter jupyter-notebook jupyter-tips matplotlib matplotlib-tips numpy pandas pandas-tips python python-tips sklearn

Last synced: 04 Oct 2025

https://github.com/mrankitgupta/Data-Analyst-Roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 07 Sep 2025

https://github.com/xiaopujun/light-chaser

light chaser is a lightweight data visualization designer tool

blueprints data-analysis data-visualization draggable javascript typescript web-editor

Last synced: 16 May 2025

https://github.com/mrankitgupta/data-analyst-roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 13 Apr 2025