An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with data-analytics

A curated list of projects in awesome lists tagged with data-analytics.

https://github.com/javascriptdata/danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 14 May 2025

https://github.com/opensource9ja/danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 26 Dec 2024

https://github.com/lightdash/lightdash

Self-serve BI to 10x your data team ⚡️

business-intelligence data-analytics data-visualization dbt

Last synced: 13 May 2025

https://github.com/lancedb/lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch, with more integrations coming.

apache-arrow computer-vision data-analysis data-analytics data-centric data-format data-science dataops deep-learning duckdb embeddings llms machine-learning mlops python rust

Last synced: 05 May 2025

https://github.com/iterative/datachain

ETL, Analytics, Versioning for Unstructured Data

ai cv data-analytics data-wrangling embeddings llm llm-eval machine-learning mlops multimodal

Last synced: 20 Apr 2025

https://github.com/diffgram/diffgram

The AI Datastore for Schemas, BLOBs, and Predictions. Use with your apps or integrate built-in Human Supervision, Data Workflow, and UI Catalog to get the most value out of your AI Data.

annotation annotation-tool annotations data data-analytics data-annotation data-science datasets datastore deep-learning image-annotation kubernetes labeling machine-learning training-data video-annotation

Last synced: 14 Mar 2025

https://github.com/brimsec/brim

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

csv data data-analytics data-viz data-wrangling electron-app json-inspector keyword-search super-structured-data table-view type-system zed zng zq zui

Last synced: 25 Feb 2025

https://github.com/brimdata/zui

Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.

csv data data-analytics data-viz data-wrangling electron-app json-inspector keyword-search super-structured-data table-view type-system zed zng zq zui

Last synced: 14 Mar 2025

https://github.com/datageartech/datagear

DataGear is a data visualization and analysis platform for freely building any data dashboard you want.

bi business-intelligence chart data-analysis data-analytics data-visualization echarts

Last synced: 14 May 2025

https://github.com/dremio/dremio-oss

Dremio - the missing link in modern data

analytics big-data data-analytics ui

Last synced: 14 May 2025

https://github.com/starpig1129/datagen

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 14 May 2025

https://github.com/mining/mining

Business Intelligence (BI) in Python, OLAP

bi business-intelligence data-analytics olap olap-cube python

Last synced: 17 Jan 2025

https://github.com/starpig1129/AI-Data-Analysis-MultiAgent

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 02 May 2025

https://github.com/starpig1129/DATAGEN

DATAGEN: AI-driven multi-agent research assistant automating hypothesis generation, data analysis, and report writing. Now expanding into crypto market intelligence. Learn more: https://datagen.digital/.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 19 Feb 2025

https://github.com/starpig1129/ai-data-analysis-multiagent

AI-Driven Research Assistant: An advanced multi-agent system for automating complex research processes. Leveraging LangChain, OpenAI GPT, and LangGraph, this tool streamlines hypothesis generation, data analysis, visualization, and report writing. Perfect for researchers and data scientists seeking to enhance their workflow and productivity.

agent ai ai-data-analysis artificial-intelligence code-generation data-analysis data-analytics data-science langchain langgraph large-language-model large-language-models llm multiagent-systems python

Last synced: 15 Feb 2025

https://github.com/mariusandra/insights

Open Source Self-Hosted Business Intelligence Platform

business-intelligence dashboard data-analytics feathersjs insights kea react visualization

Last synced: 15 May 2025

https://github.com/traildb/traildb

TrailDB is an efficient tool for storing and querying series of events

big-data c data-analytics database event-data time-series traildb

Last synced: 20 Mar 2025

https://github.com/mrankitgupta/data-analyst-roadmap

I am sharing my Journey of 66DaysofData into Data Analytics by participating in Ken Jee's #66daysofdata challenge

ankit ankit-gupta ankitgupta data-analysis data-analytics data-science data-structures data-visualization excel mongodb mysql pandas powerbi python sql sql-server tableau

Last synced: 13 Apr 2025

https://github.com/program-spiritual/dataanalysisinaction

(Finished) Geek Time Data Analysis Practical, 45 lectures: detailed notes with Markdown, images, mind maps, code, and data. The notes can be read directly and the code can be tested.

data-analysis data-analytics in-action notebook-jupyter pipenv pyenv python python-data-analysis python-data-science python3

Last synced: 15 Jan 2025

https://github.com/abixen/abixen-platform

Abixen Platform is a microservices based software platform for building enterprise applications delivering functionalities through creating particular microservices and integrating by provided CMS.

analytics angularjs architecture aws business-intelligence businessintelligence charts cloud dashboard data-analysis data-analytics data-visualization low-code microservices netflixoss reporting spring-boot spring-cloud sql-editor visualization

Last synced: 12 Apr 2025

https://github.com/squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 16 May 2025

https://github.com/Squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 15 Mar 2025

https://github.com/essandess/isp-data-pollution

ISP Data Pollution to Protect Private Browsing History with Obfuscation

crawling data data-analytics obfuscation privacy privacy-enhancing-technologies web

Last synced: 02 Apr 2025

https://github.com/girder/girder

A data management platform for the web, developed by Kitware

data-analytics data-management data-science javascript kitware python resonant

Last synced: 03 Apr 2025

https://github.com/randomfractals/geo-data-viewer

Geo Data Analytics tool for VSCode IDE with kepler.gl support to generate and view maps 🗺️ without any Python 🐍, IPyWidgets ⚙️, pandas 🐼, Jupyter notebooks 📚, or ReactJS ⚛️ app code.

data data-analytics geo keplergl map spatial tool viewer vscode

Last synced: 21 Mar 2025

https://github.com/blockchain-etl/bitcoin-etl

ETL scripts for Bitcoin, Litecoin, Dash, Zcash, Doge, Bitcoin Cash. Available in Google BigQuery https://goo.gl/oY5BCQ

apache-beam bitcoin bitcoincash blockchain-analytics crypto cryptocurrency dash data-analytics data-engineering dogecoin etl gcp google-dataflow google-pubsub litecoin on-chain-analysis web3 zcash

Last synced: 10 Apr 2025

https://github.com/xoolive/traffic

A toolbox for processing and analysing air traffic data

adsb air-traffic-data data-analytics data-science data-visualisation declarative-pipeline mode-s trajectory

Last synced: 14 May 2025

https://github.com/aiguofer/gspread-pandas

A package to easily open an instance of a Google spreadsheet and interact with worksheets through Pandas DataFrames.

data data-analytics data-engineering data-science dataframes google google-sheets google-spreadsheets gspread pandas python sheets

Last synced: 15 May 2025

https://github.com/Desbordante/desbordante-core

Desbordante is a high-performance data profiler that is capable of discovering many different patterns in data using various algorithms. It also allows running data cleaning scenarios using these algorithms. Desbordante has a console version and an easy-to-use web application.

anomaly-detection correlations data-analytics data-cleaning data-cleansing data-engineering data-exploration data-mining data-mining-algorithms data-preprocessing data-profiling data-science data-wrangling exploratory-data-analysis feature-engineering feature-extraction feature-selection knowledge-discovery spreadsheets tabular-data

Last synced: 03 Apr 2025

https://github.com/tellery/tellery

Tellery lets you build metrics using SQL and bring them to your team. As easy as using a document. As powerful as a data modeling tool.

analytics bigquery business-intelligence collaboration dashboard data-analytics data-modeling data-science data-visualization database dbt notebook self-hosted sql

Last synced: 16 May 2025

https://github.com/empower-ai/dsensei

AI-powered key driver analysis tool that pinpoints the root cause behind metric fluctuations in one minute.

analytics business-analytics business-intelligence data data-analytics data-insights data-science

Last synced: 08 Apr 2025

https://github.com/flashxio/flashx

FlashX is a collection of big data analytics tools that perform data analytics in the form of graphs and matrices.

c-plus-plus data-analytics graph matrices ssd

Last synced: 06 Apr 2025

https://github.com/flashxio/FlashX

FlashX is a collection of big data analytics tools that perform data analytics in the form of graphs and matrices.

c-plus-plus data-analytics graph matrices ssd

Last synced: 15 Mar 2025

https://github.com/koolphp/koolreport

This is an Open Source PHP Reporting Framework which you can use to write perfect data reports or to construct awesome dashboards using PHP

data-analytics data-visualization datasource extended-packages framework mongodb mysql mysql-reporting-tools oracle php php-reporting-tools postgresql report-generator reporting reporting-engine reporting-pipeline reporting-tool sqlserver visualization

Last synced: 04 Apr 2025

https://github.com/build-on-aws/cloud-clubs-learner-library

A library for learners! Whether or not you're a part of AWS Cloud Clubs, take a look in this library for free, open, leveled content for students 18+ worldwide

ai aws containers data-analytics data-science databases iot kubernetes ml mobile-development security serverless web web-development

Last synced: 09 Apr 2025

https://github.com/minusxai/minusx

MinusX is an AI Data Scientist for Analytics Apps you already use and love. Currently it supports Jupyter, Metabase, Google Sheets & Posthog.

artificial-intelligence data-analytics data-science jupyter metabase

Last synced: 04 Apr 2025

https://github.com/terraform-google-modules/terraform-google-bigquery

Creates opinionated BigQuery datasets and tables

cft-terraform data-analytics

Last synced: 18 Feb 2025

https://github.com/cda-group/arcon

State-first Streaming Applications in Rust

arrow data-analytics dataflow distributed-computing rust stream-processor

Last synced: 27 Apr 2025

https://github.com/Texera/texera

Collaborative Machine-Learning-Centric Data Analytics Using Workflows

data-analytics declarative-ui machine-learning nlp texera workflow

Last synced: 06 Jan 2025

https://github.com/xiangpenghao/liquid-cache

10x lower latency for cloud-native DataFusion

arrow cache data-analytics datafusion object-store parquet query-engine

Last synced: 16 May 2025

https://github.com/hemansnation/data-analyst-roadmap

Data-Analyst-Roadmap for Professionals. This roadmap contains 8 Chapters that can be completed in 8 weeks, whether you are a fresher in the field or an experienced professional who wants to transition into Data Analysis.

analytics data-analysis data-analysis-python data-analytics data-science numpy predictive-analytics project-based-learning python statistics tableau

Last synced: 15 Apr 2025

https://github.com/trainingbypackt/data-wrangling-with-python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

analytics beautifulsoup data-analytics data-munging data-science data-wrangling database numpy pandas python regular-expression web-scraping

Last synced: 06 Apr 2025

https://github.com/iam-mhaseeb/skytrax-data-warehouse

A full data warehouse infrastructure with ETL pipelines running inside docker on Apache Airflow for data orchestration, AWS Redshift for cloud data warehouse and Metabase to serve the needs of data visualizations such as analytical dashboards.

airflow data-analysis data-analytics data-cleaning data-engineering data-orchestration data-processing data-visualization data-warehouse data-warehousing database docker metabase python python3 redshift s3 s3-bucket sql

Last synced: 14 Dec 2024

https://github.com/Canner/wren-engine

🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥

business-intelligence data data-analysis data-analytics data-lake data-warehouse hacktoberfest llm semantic semantic-layer sql

Last synced: 01 Apr 2025

https://github.com/canner/wren-engine

🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥

business-intelligence data data-analysis data-analytics data-lake data-warehouse hacktoberfest llm semantic semantic-layer sql

Last synced: 04 Apr 2025

https://github.com/kitware/candela

Visualization components for the web

components data-analytics kitware resonant visualization web

Last synced: 10 Apr 2025

https://github.com/Kitware/candela

Visualization components for the web

components data-analytics kitware resonant visualization web

Last synced: 03 Apr 2025

https://github.com/hatchet/hatchet

Analyze graph/hierarchical performance data using pandas dataframes

comparative-analysis data-analytics graphs hierarchical-data hpc pandas performance performance-analysis python trees

Last synced: 23 Nov 2024

https://github.com/terraform-google-modules/terraform-google-pubsub

Creates a Pub/Sub topic and subscriptions associated with the topic

cft-terraform data-analytics

Last synced: 18 Feb 2025

https://github.com/7eggs/node-druid-query

Simple querying library for Druid (http://druid.io)

client data-analytics druid nodejs

Last synced: 12 May 2025

https://github.com/tiagoantao/python-performance

Repository for the book Fast Python - published by Manning

concurrency cython data-analytics gpu numpy pandas parallel-computing performance-python python

Last synced: 06 Apr 2025

https://github.com/empower-ai/sql-agent

AI agent that helps you do data analytics with natural language.

analytics bigquery chatgpt chatgpt-bot data data-analytics data-science mysql postgresql slack slack-bot slackbot

Last synced: 11 Apr 2025

https://github.com/dbiir/rainbow

A data layout optimization framework for wide tables stored on HDFS. See rainbow's webpage

column-store data-analytics data-layout hdfs sql wide-table

Last synced: 21 Nov 2024

https://github.com/urbanos-public/smartcitiesdata

The core micro services of UrbanOS as an umbrella project with component documentation

data-analytics data-processing data-visualization elixir elixir-phoenix

Last synced: 06 Apr 2025

https://github.com/spratiher9/sparkora

Powerful rapid automatic EDA and feature engineering library with a very easy to use API 🌟

apache apache-spark data data-analysis data-analysis-python data-analytics easy-to-use eda exploratory-data-analysis open-source opensource pyspark python python3 toolkit

Last synced: 17 Mar 2025

https://github.com/terraform-google-modules/terraform-google-composer

Manages Cloud Composer v1 and v2 along with an option to manage networking

cft-terraform data-analytics operations

Last synced: 18 Feb 2025

https://github.com/ahmed-mohamed-sn/olliePy

OlliePy is a python package which can help data scientists in exploring their data and evaluating and analysing their machine learning experiments by utilising the power and structure of modern web applications. The data scientist only needs to provide the data and any required information and OlliePy will generate the rest.

ai analytics charts dashboard data data-analytics data-science data-scientist eda error-analysis exploratory-data-analysis machine-learning python visualization

Last synced: 08 May 2025

https://github.com/cosmoduende/r-youtube-personal-history-analysis

Explore your activity on YouTube with R: How to analyze and visualize your personal data history. Find out how you consume YouTube using a copy of your personal data from Google Takeout.

data-analysis data-analytics data-visualization data-viz google-takeout r-data r-language r-programming youtube youtube-accounts youtube-analytics youtube-api youtube-data youtube-data-analysis youtube-data-api youtube-data-api-v3 youtube-data-scraping youtube-dataset youtube-scrape youtube-scraper

Last synced: 11 Apr 2025

https://github.com/googlecloudplatform/notebooks-blueprint-security

Opinionated setup for securely using AI Platform Notebooks.

cft-terraform data-analytics security-identity

Last synced: 08 Apr 2025

https://github.com/spidy20/data-scince-ml-project

In this repository I created many data science and machine learning projects (such as Deep Dream, weather prediction, and a movie recommender system) with code and datasets.

data-analysis data-analytics deep-dream machine-learning matplotlib number-recognition python recommender-system songs songs-data-analysis stock-market-prediction

Last synced: 12 Apr 2025

https://github.com/hemansnation/python-for-data-professionals

This course is designed to get a good grip on python programming, logic building, solving algorithm-based questions, data structures, understanding of data analytics, working with pandas, professional practices, and API building.

data-analytics data-professionals data-science exploratory-data-analysis logic-programming machine-learning pandas python

Last synced: 15 Apr 2025