An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-analytics

A curated list of projects in awesome lists tagged with data-analytics.

https://github.com/lightdash/lightdash

Self-serve BI to 10x your data team ⚡️

business-intelligence data-analytics data-visualization dbt

Last synced: 09 Feb 2026

https://github.com/javascriptdata/danfojs

Danfo.js is an open source, JavaScript library providing high performance, intuitive, and easy to use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 14 May 2025

https://github.com/datachain-ai/datachain

Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images

ai cv data-analytics data-wrangling embeddings llm llm-eval machine-learning mlops multimodal

Last synced: 25 Jan 2026

https://github.com/iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

ai cv data-analytics data-wrangling embeddings llm llm-eval machine-learning mlops multimodal

Last synced: 18 Jun 2025

https://github.com/diffgram/diffgram

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

annotation annotation-tool annotations data data-analytics data-annotation data-science datasets datastore deep-learning image-annotation kubernetes labeling machine-learning training-data video-annotation

Last synced: 14 Mar 2025

https://github.com/brimdata/zui

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

csv data data-analytics data-viz data-wrangling electron-app json-inspector keyword-search super-structured-data table-view type-system zed zng zq zui

Last synced: 12 Jun 2025

https://github.com/brimsec/brim

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

csv data data-analytics data-viz data-wrangling electron-app json-inspector keyword-search super-structured-data table-view type-system zed zng zq zui

Last synced: 25 Feb 2025

https://github.com/datageartech/datagear

DataGear data visualization and analysis platform: freely build any data dashboard you want

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 14 May 2025

https://github.com/dremio/dremio-oss

Dremio - the missing link in modern data

analytics big-data data-analytics ui

Last synced: 14 May 2025

https://github.com/starpig1129/datagen

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 14 May 2025

https://github.com/mining/mining

Business Intelligence (BI) in Python, OLAP

bi business-intelligence data-analytics olap olap-cube python

Last synced: 27 Sep 2025

https://github.com/starpig1129/AI-Data-Analysis-MultiAgent

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 02 May 2025

https://github.com/starpig1129/DATAGEN

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 17 Nov 2025

https://github.com/mariusandra/insights

Open Source Self-Hosted Business Intelligence Platform

business-intelligence dashboard data-analytics feathersjs insights kea react visualization

Last synced: 15 May 2025

https://github.com/traildb/traildb

TrailDB is an efficient tool for storing and querying series of events

big-data c data-analytics database event-data time-series traildb

Last synced: 20 Mar 2025

https://github.com/mrankitgupta/Data-Analyst-Roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 07 Sep 2025

https://github.com/mrankitgupta/data-analyst-roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 13 Apr 2025

https://github.com/program-spiritual/dataanalysisinaction

(Finished) Geek Time Data Analysis Practical 45 Lectures: detailed notes with Markdown, images, mind maps, code, and data; the notes can be read directly and the code has been tested

data-analysis data-analytics in-action notebook-jupyter pipenv pyenv python python-data-analysis python-data-science python3

Last synced: 23 Sep 2025

https://github.com/abixen/abixen-platform

Abixen Platform is a microservices based software platform for building enterprise applications delivering functionalities through creating particular microservices and integrating by provided CMS.

analytics angularjs architecture aws business-intelligence businessintelligence charts cloud dashboard data-analysis data-analytics data-visualization low-code microservices netflixoss reporting spring-boot spring-cloud sql-editor visualization

Last synced: 23 Aug 2025

https://github.com/squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 16 May 2025

https://github.com/Squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 15 Mar 2025

https://github.com/essandess/isp-data-pollution

ISP Data Pollution to Protect Private Browsing History with Obfuscation

crawling data data-analytics obfuscation privacy privacy-enhancing-technologies web

Last synced: 29 Dec 2025

https://github.com/canner/wren-engine

🤖 The Semantic Engine for Model Context Protocol (MCP) Clients and AI Agents 🔥

agent agentic-ai ai business-intelligence data data-analysis data-analytics data-lake data-warehouse hacktoberfest llm mcp mcp-server semantic semantic-layer sql

Last synced: 22 Jan 2026

https://github.com/gchq/stroom

Stroom is a highly scalable data storage, processing and analysis platform.

big-data dashboards data-analytics enrichment lucene pipeline-processor visualisation xml xslt

Last synced: 09 Feb 2026

https://github.com/dbt-labs/dbt-mcp

An MCP (Model Context Protocol) server for interacting with dbt.

data-analytics data-engineering dbt llm mcp mcp-server model-context-protocol

Last synced: 27 Jan 2026

https://github.com/randomfractals/geo-data-viewer

Geo Data Analytics tool for VSCode IDE with kepler.gl support to generate and view maps 🗺️ without any Python 🐍, IPyWidgets ⚙️, pandas 🐼, Jupyter notebooks 📚, or ReactJS ⚛️ app code.

data data-analytics geo keplergl map spatial tool viewer vscode

Last synced: 26 Jan 2026

https://github.com/girder/girder

A data management platform for the web, developed by Kitware

data-analytics data-management data-science javascript kitware python resonant

Last synced: 03 Apr 2025

https://github.com/desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also lets you run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 22 Nov 2025

https://github.com/blockchain-etl/bitcoin-etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ

apache-beam bitcoin bitcoincash blockchain-analytics crypto cryptocurrency dash data-analytics data-engineering dogecoin etl gcp google-dataflow google-pubsub litecoin on-chain-analysis web3 zcash

Last synced: 10 Apr 2025

https://github.com/blockchain-etl/ethereum-etl-airflow

Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee

apache-airflow blockchain-analytics crypto cryptocurrency data-analytics data-engineering ethereum etl gcp google-cloud google-cloud-platform on-chain-analysis web3

Last synced: 24 Jun 2025

https://github.com/xoolive/traffic

A toolbox for processing and analysing air traffic data

adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory

Last synced: 14 May 2025

https://github.com/aiguofer/gspread-pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets

Last synced: 15 May 2025

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler capable of discovering many different patterns in data using various algorithms. It also lets you run data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/tellery/tellery

Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.

analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql

Last synced: 16 May 2025

https://github.com/empower-ai/dsensei

AI-powered key driver analysis tool that pinpoints root cause behind metrics fluctuation in one minute.

analytics business-analytics business-intelligence data data-analytics data-insights data-science

Last synced: 01 Aug 2025

https://github.com/flashxio/FlashX

FlashX is a collection of big data analytics tools that perform data analytics in the form of graphs and matrices.

c-plus-plus data-analytics graph matrices ssd

Last synced: 15 Mar 2025

https://github.com/flashxio/flashx

FlashX is a collection of big data analytics tools that perform data analytics in the form of graphs and matrices.

c-plus-plus data-analytics graph matrices ssd

Last synced: 18 Dec 2025

https://github.com/koolphp/koolreport

This is an Open Source PHP Reporting Framework which you can use to write perfect data reports or to construct awesome dashboards using PHP

data-analytics data-visualization datasource extended-packages framework mongodb mysql mysql-reporting-tools oracle php php-reporting-tools postgresql report-generator reporting reporting-engine reporting-pipeline reporting-tool sqlserver visualization

Last synced: 04 Apr 2025

https://github.com/build-on-aws/cloud-clubs-learner-library

A library for learners! Whether or not you're a part of AWS Cloud Clubs, take a look in this library for free, open, leveled content for students 18+ worldwide

ai aws containers data-analytics data-science databases iot kubernetes ml mobile-development security serverless web web-development

Last synced: 09 Apr 2025

https://github.com/minusxai/minusx

MinusX is an AI Data Scientist for Analytics Apps you already use and love. Currently it supports Jupyter, Metabase, Google Sheets & Posthog.

artificial-intelligence data-analytics data-science jupyter metabase

Last synced: 04 Apr 2025

https://github.com/terraform-google-modules/terraform-google-bigquery

Creates opinionated BigQuery datasets and tables

cft-terraform data-analytics

Last synced: 09 Nov 2025

https://github.com/apache/texera

Collaborative Machine-Learning-Centric Data Analytics Using Workflows

artificial-intelligence data data-analytics data-science machine-learning texera workflow

Last synced: 15 Dec 2025

https://github.com/cda-group/arcon

State-first Streaming Applications in Rust

arrow data-analytics dataflow distributed-computing rust stream-processor

Last synced: 27 Apr 2025

https://github.com/xiangpenghao/liquid-cache

10x lower latency for cloud-native DataFusion

arrow cache data-analytics datafusion object-store parquet query-engine

Last synced: 16 May 2025

https://github.com/bps-statistics/stadata

STADATA is a Python package that simplifies access to statistical data provided by BPS - Statistics Indonesia

data data-analytics data-science national-statistics nso official-statistics python python-package statistics

Last synced: 14 Oct 2025

https://github.com/hemansnation/Data-Analyst-Roadmap

Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.

analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau

Last synced: 07 Sep 2025

https://github.com/hemansnation/data-analyst-roadmap

Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.

analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau

Last synced: 15 Apr 2025

https://github.com/trainingbypackt/data-wrangling-with-python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

analytics beautifulsoup data-analytics data-munging data-science data-wrangling database numpy pandas python regular-expression web-scraping

Last synced: 06 Apr 2025

https://github.com/iam-mhaseeb/skytrax-data-warehouse

A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for cloud data warehouse and Metabase to serve the needs of data visualizations such as analytical dashboards.

airflow data-analysis data-analytics data-cleaning data-engineering data-orchestration data-processing data-visualization data-warehouse data-warehousing database docker metabase python python3 redshift s3 s3-bucket sql

Last synced: 12 Aug 2025

https://github.com/Canner/wren-engine

🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥

business-intelligence data data-analysis data-analytics data-lake data-warehouse hacktoberfest llm semantic semantic-layer sql

Last synced: 01 Apr 2025

https://github.com/kitware/candela

Visualization components for the web

components data-analytics kitware resonant visualization web

Last synced: 10 Apr 2025

https://github.com/Kitware/candela

Visualization components for the web

components data-analytics kitware resonant visualization web

Last synced: 03 Apr 2025

https://github.com/virajbhutada/bi-projects-collection

Discover a curated collection of dynamic Power BI dashboards covering financial analytics, HR metrics, streaming service trends, real estate dynamics, and more. Meticulously designed for comprehensive data exploration, this repository continues to expand with new and impactful visualizations.

analytical-insights data-analytics data-exploration data-visualization dynamic-dashboards healthcare-analysis hr-management powerbi trends-visualization visual-reporting

Last synced: 21 Nov 2025

https://github.com/DataWithBaraa/sql-data-analytics-project

This repository contains a collection of SQL scripts demonstrating various analytical techniques, such as change-over-time, cumulative, performance, data segmentation, and part-to-whole analysis.

analytics business-analytics business-intelligence data data-analysis data-analyst data-analytics data-engineering data-science data-scientist database datascience query reporting sql sql-queries sql-query sql-server window-functions window-functions-in-sql

Last synced: 14 Oct 2025

https://github.com/hatchet/hatchet

Analyze graph/hierarchical performance data using pandas dataframes

comparative-analysis data-analytics graphs hierarchical-data hpc pandas performance performance-analysis python trees

Last synced: 15 Jul 2025

https://github.com/terraform-google-modules/terraform-google-pubsub

Creates Pub/Sub topic and subscriptions associated with the topic

cft-terraform data-analytics

Last synced: 09 Nov 2025

https://github.com/7eggs/node-druid-query

Simple querying library for Druid (http://druid.io)

client data-analytics druid nodejs

Last synced: 12 May 2025

https://github.com/tiagoantao/python-performance

Repository for the book Fast Python - published by Manning

concurrency cython data-analytics gpu numpy pandas parallel-computing performance-python python

Last synced: 06 Apr 2025

https://github.com/empower-ai/sql-agent

AI agent that helps you do data analytics with natural language.

analytics bigquery chatgpt chatgpt-bot data data-analytics data-science mysql postgresql slack slack-bot slackbot

Last synced: 11 Apr 2025

https://github.com/ds2-lab/wukong

Wukong: A scalable and locality-enhanced serverless parallel framework (ACM SoCC'20)

analytics aws aws-lambda cloud-computing dask data-analytics faas linear-algebra machine-learning parallel-computing python serverless serverless-computing

Last synced: 09 Jul 2025

https://github.com/dbiir/rainbow

A data layout optimization framework for wide tables stored on HDFS. See rainbow's webpage

column-store data-analytics data-layout hdfs sql wide-table

Last synced: 30 Jun 2025