An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with dataframe

A curated list of projects in awesome lists tagged with dataframe.

https://github.com/pola-rs/polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

arrow dataframe dataframe-library dataframes out-of-core polars python rust

Last synced: 15 Dec 2025

https://github.com/kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 09 Sep 2025

https://github.com/Kanaries/pygwalker

PyGWalker: Turn your pandas dataframe into an interactive UI for visual analysis

data-analysis data-exploration dataframe matplotlib pandas plotly tableau tableau-alternative visualization

Last synced: 26 Mar 2025

https://github.com/modin-project/modin

Modin: Scale your Pandas workflows by changing a single line of code

analytics data-science dataframe datascience distributed modin pandas python sql

Last synced: 11 May 2025

https://github.com/vaexio/vaex

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

bigdata data-science dataframe hdf5 machine-learning machinelearning memory-mapped-file pyarrow python tabular-data visualization

Last synced: 12 Dec 2025

https://github.com/apache/datafusion

Apache DataFusion SQL Query Engine

arrow big-data dataframe datafusion olap python query-engine rust sql

Last synced: 12 Dec 2025

https://github.com/javascriptdata/danfojs

Danfo.js is an open-source JavaScript library providing high-performance, intuitive, and easy-to-use data structures for manipulating and processing structured data.

danfojs data-analysis data-analytics data-manipulation data-science dataframe javascript pandas plotting-charts stream-data stream-processing table tensorflow tensors

Last synced: 14 May 2025

https://github.com/lk-geimfari/mimesis

Mimesis is a robust data generator for Python that can produce a wide range of fake data in multiple languages.

data dataframe datascience dummy factory factory-boy fake fixtures generator json-generator mimesis mock pandas polars pytest-plugin python schema syntetic synthetic-data testing

Last synced: 28 Dec 2025

https://github.com/databricks/koalas

Koalas: pandas API on Apache Spark

big-data data-science dataframe mlflow pandas pydata spark

Last synced: 13 May 2025

https://github.com/adamerose/PandasGUI

A GUI for Pandas DataFrames

dataframe gui pandas viewer

Last synced: 27 Mar 2025

https://github.com/adamerose/pandasgui

A GUI for Pandas DataFrames

dataframe gui pandas viewer

Last synced: 14 May 2025

https://github.com/eventual-inc/daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust

big-data data-engineering data-science dataframe distributed-computing machine-learning python rust

Last synced: 08 May 2025

https://github.com/mars-project/mars

Mars is a tensor-based unified framework for large-scale data computation which scales numpy, pandas, scikit-learn and Python functions.

dask dataframe joblib lightgbm machine-learning numpy pandas python pytorch ray scikit-learn statsmodels tensor tensorflow xgboost

Last synced: 25 Apr 2025

https://github.com/Eventual-Inc/Daft

Distributed data engine for Python/SQL designed for the cloud, powered by Rust

big-data data-engineering data-science dataframe distributed-computing machine-learning python rust

Last synced: 09 Apr 2025

https://github.com/sngyai/sequoia

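Automatic stock-picking program for A-shares, implementing the Turtle Trading rules, the Chan theory (缠中说禅) bull-market buy point, and several other technical chart patterns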

a-shares akshare dataframe pandas python ta-lib turtle-trade tushare

Last synced: 11 Apr 2025

https://github.com/sngyai/Sequoia

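Automatic stock-picking program for A-shares, implementing the Turtle Trading rules, the Chan theory (缠中说禅) bull-market buy point, and several other technical chart patterns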

a-shares akshare dataframe pandas python ta-lib turtle-trade tushare

Last synced: 01 Apr 2025

https://github.com/sfu-db/connector-x

Fastest library to load data from DB to DataFrames in Rust and Python

cpp database dataframe python rust sql

Last synced: 18 Jan 2026

https://github.com/alexhallam/tv

📺(tv) Tidy Viewer is a cross-platform CLI csv pretty printer that uses column styling to maximize viewer enjoyment.

cli column command-line command-line-tool csv csv-cat csv-column csv-pretty-print csv-viewer csv-visualization data-science dataframe datatable pretty-print pretty-printer rust tabular-data terminal tibble

Last synced: 13 May 2025

https://sfu-db.github.io/connector-x/

Fastest library to load data from DB to DataFrames in Rust and Python

cpp database dataframe python rust sql

Last synced: 20 Sep 2025

https://github.com/DAGWorks-Inc/hamilton

Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

dag data-analysis data-engineering data-science dataframe etl etl-framework etl-pipeline feature-engineering hacktoberfest lineage llmops machine-learning mlops orchestration pandas python rag software-engineering

Last synced: 26 Mar 2025

https://github.com/man-group/arcticdb

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 16 Feb 2026

https://github.com/apache/datafusion-ballista

Apache DataFusion Ballista Distributed Query Engine

arrow big-data dataframe distributed olap python query-engine rust sql

Last synced: 12 Dec 2025

https://github.com/man-group/ArcticDB

ArcticDB is a high performance, serverless DataFrame database built for the Python Data Science ecosystem.

big-data data data-analysis data-science database dataframe pandas quantitative-analysis quantitative-finance quantitative-trading

Last synced: 12 Mar 2025

https://github.com/pyjanitor-devs/pyjanitor

Clean APIs for data cleaning. Python implementation of R package Janitor

cleaning-data data data-engineering dataframe hacktoberfest pandas pydata

Last synced: 18 Feb 2026

https://github.com/uwdata/arquero

Query processing and transformation of array-backed data tables.

arrays data database dataframe query table transform

Last synced: 13 May 2025

https://github.com/rocketlaunchr/dataframe-go

DataFrames for Go: For statistics, machine-learning, and data manipulation/exploration

data-science dataframe dataframes go golang machine-learning pandas pandas-dataframe python statistics

Last synced: 15 May 2025

https://github.com/comet-ml/kangas

🦘 Explore multimedia datasets at scale

data-analysis data-exploration dataframe datagrid machine-learning

Last synced: 14 May 2025

https://github.com/graphframes/graphframes

GraphFrames is a package for Apache Spark which provides DataFrame-based Graphs

apache-spark big-data connected-components dataframe dataframes graphs network-motif network-motifs networks spark

Last synced: 14 May 2025

https://github.com/redislabs/spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster

dataframe java redis spark

Last synced: 14 May 2025

https://github.com/RedisLabs/spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster

dataframe java redis spark

Last synced: 28 Mar 2025

https://github.com/kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 04 Jul 2025

https://github.com/freqtrade/technical

Various indicators developed or collected for Freqtrade

dataframe freqtrade technical-analysis trading

Last synced: 14 May 2025

https://github.com/Kotlin/dataframe

Structured data processing in Kotlin

data-analysis data-science dataframe kotlin

Last synced: 11 Apr 2025

https://github.com/stitchfix/hamilton

A scalable general purpose micro-framework for defining dataflows. THIS REPOSITORY HAS BEEN MOVED TO www.github.com/dagworks-inc/hamilton

dag data-engineering data-platform data-science dataframe etl etl-framework etl-pipeline feature-engineering featurization hamilton hamiltonian machine-learning numpy pandas python software-engineering stitch-fix

Last synced: 29 Sep 2025

https://github.com/mrpowers-io/spark-daria

Essential Spark extensions and helper methods ✨😲

dataframe spark

Last synced: 21 Feb 2026

https://github.com/pdpipe/pdpipe

Easy pipelines for pandas DataFrames.

data data-science dataframe dataframes pandas pandas-dataframe pipeline

Last synced: 06 Mar 2026

https://github.com/techascent/tech.ml.dataset

A Clojure high performance data processing system

clojure csv dataframe datascience dataset etl-pipeline java machine-learning xlsx

Last synced: 15 May 2025

https://github.com/elastic/eland

Python Client and Toolkit for DataFrames, Big Data, Machine Learning and ETL in Elasticsearch

big-data data-analysis dataframe dataframes eland elasticsearch etl lightgbm machine-learning pandas python scikit-learn time-series-forecasting

Last synced: 14 Apr 2025

https://github.com/dmnfarrell/pandastable

Table analysis in Tkinter using pandas DataFrames.

data-analysis dataframe pandas plotting scientific tkinter

Last synced: 14 May 2025

https://github.com/squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 16 May 2025

https://github.com/Squarespace/datasheets

Read data from, write data to, and modify the formatting of Google Sheets

data data-analytics data-science dataframe google pandas python

Last synced: 15 Mar 2025

https://github.com/ranaroussi/pystore

Fast data store for Pandas time-series data

dask database dataframe datastore pandas parquet timeseries

Last synced: 01 Apr 2025

https://github.com/Gmousse/dataframe-js

A JavaScript library providing a new data structure for data scientists and developers

data data-frame dataframe datascience datastructures functional groupby javascript manipulation matrix sql sql-syntax

Last synced: 15 Mar 2025

https://github.com/xorq-labs/xorq

multi-engine batch transformation framework

arrow dataframe elt machine-learning multi-engine python sklearn sql

Last synced: 06 Mar 2026

https://github.com/firmai/pandasvault

Advanced Pandas Vault — Utilities, Functions and Snippets (by @firmai).

data-science data-structures dataframe functions pandas python snippets table tips

Last synced: 06 May 2025

https://github.com/tobgu/qframe

Immutable data frame for Go

data-frame data-science dataframe go golang immutable

Last synced: 04 Apr 2025

https://github.com/deepspace2/styleframe

A library that wraps pandas and openpyxl and allows easy styling of dataframes in Excel

data-frame dataframe excel openpyxl pandas

Last synced: 20 Jan 2026

https://github.com/DeepSpace2/StyleFrame

A library that wraps pandas and openpyxl and allows easy styling of dataframes in Excel

data-frame dataframe excel openpyxl pandas

Last synced: 19 Jul 2025

https://github.com/manzt/quak

a scalable data profiler

database dataframe jupyter python visualization

Last synced: 16 May 2025

https://github.com/bluenote10/NimData

DataFrame API written in Nim, enabling fast out-of-core data processing

dataframe nim

Last synced: 13 Apr 2025

https://github.com/bluenote10/nimdata

DataFrame API written in Nim, enabling fast out-of-core data processing

dataframe nim

Last synced: 02 Nov 2025

https://github.com/tidyverse/duckplyr

A drop-in replacement for dplyr, powered by DuckDB for speed.

analytics dataframe dplyr duckdb performance r

Last synced: 05 Jul 2025

https://github.com/lifeomic/sparkflow

Easy-to-use library to bring TensorFlow to Apache Spark

apache-spark dataframe deep-learning lifeomic pipeline spark-ml tensorflow

Last synced: 04 Apr 2025

https://github.com/scicloj/tablecloth

Dataset manipulation library built on top of tech.ml.dataset

clojure dataframe dataset machinelearning

Last synced: 12 Apr 2025

https://github.com/tirthajyoti/design-of-experiment-python

Design-of-experiment (DOE) generator for science, engineering, and statistics

analytics dataframe design-of-experiments experiment factors latin-hypercube matrix random-generation science statistics

Last synced: 06 Apr 2025

https://github.com/alastairrushworth/inspectdf

🛠️ 📊 Tools for Exploring and Comparing Data Frames

comparison dataframe eda exploratory-data-analysis r rstats visualization

Last synced: 13 Apr 2025

https://github.com/kszucs/pandahouse

Pandas interface for the ClickHouse database

clickhouse dataframe pandas

Last synced: 10 Sep 2025

https://github.com/alanmarazzi/panthera

Data-frames & arrays on Clojure

array clojure dataframe numpy pandas python

Last synced: 10 Apr 2025

https://github.com/alteryx/woodwork

Woodwork is a Python library that provides robust methods for managing and communicating data typing information.

data-science dataframe dataframes evalml featuretools inference machine-learning nlp-primitives python semantic-tags typing woodwork

Last synced: 15 May 2025

https://github.com/abdenlab/oxbow

Oxbow makes genomic data ready for high-performance analytics.

apache-arrow bioinformatics data-science dataframe fair-data genomics multiomics ngs pandas polars python r rust-lang

Last synced: 07 Mar 2026

https://github.com/scinim/datamancer

A dataframe library with a dplyr like API

dataframe dplyr hacktoberfest nim nim-lang

Last synced: 06 Apr 2025

https://github.com/SciNim/Datamancer

A dataframe library with a dplyr like API

dataframe dplyr hacktoberfest nim nim-lang

Last synced: 08 May 2025

https://github.com/noahgift/rust-mlops-template

A work in progress to build out solutions in Rust for MLOps

cli dataframe hugging huggingface mlops polars pytorch rust web

Last synced: 15 Jul 2025

https://github.com/scipp/scipp

Multi-dimensional data arrays with labeled dimensions

dataframe dataset python science

Last synced: 17 Jan 2026

https://github.com/Quantco/dataframely

A declarative, 🐻‍❄️-native data frame validation library.

dataframe polars validation

Last synced: 28 Apr 2025

https://github.com/bertrandmartel/tableau-scraping

Tableau scraper python library. R and Python scripts to scrape data from Tableau viz

dataframe pandas python r tableau web-scraping

Last synced: 29 Aug 2025

https://github.com/archivesunleashed/aut

The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.

analysis apache-spark big-data big-data-analytics dataframe digital-humanities hadoop network-graphing pyspark python3 scala spark text-extraction webarchives

Last synced: 13 Apr 2025

https://github.com/dmnfarrell/tablexplore

Table analysis and plotting application written in PySide2/PyQt5

data-analysis data-science dataframe pandas plotting pyqt5 pyside2 python qt

Last synced: 01 Aug 2025

https://github.com/clojure-finance/clojask

Clojask is a Clojure data processing framework with parallel computing on larger-than-memory datasets

big-data clojure dataframe parallel-computing

Last synced: 07 May 2025

https://github.com/tidypyverse/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 13 Mar 2026

https://github.com/jgperrin/net.jgp.labs.spark

Apache Spark examples exclusively in Java

data-ingestion dataframe ingestion java spark udf

Last synced: 16 Apr 2025

https://github.com/facultyai/lens

Summarise and explore Pandas DataFrames

dask data-exploration data-science data-visualisation dataframe pandas

Last synced: 14 Apr 2025

https://github.com/talegari/tidypandas

A grammar of data manipulation for pandas inspired by tidyverse

data-analysis data-science dataframe dataframe-library dplyr pandas python tidyverse

Last synced: 03 Mar 2025

https://github.com/ashvardanian/StringWars

Comparing performance-oriented string-processing libraries for substring search, multi-pattern matching, hashing, edit-distances, sketching, and sorting across CPUs and GPUs in Rust 🦀 and Python 🐍

benchmark bioinformatics database dataframe levenshtein-distance libc memchr polars rapids string string-search strstr substring-search

Last synced: 08 Oct 2025

https://github.com/ashvardanian/stringwars

Comparing performance-oriented string-processing libraries for substring search, multi-pattern matching, hashing, edit-distances, sketching, and sorting across CPUs and GPUs in Rust 🦀 and Python 🐍

benchmark bioinformatics database dataframe levenshtein-distance libc memchr polars rapids string string-search strstr substring-search

Last synced: 18 Jan 2026

https://github.com/CybercentreCanada/jupyterlab-sql-editor

A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino

auto-completion dataframe datagrid extension formatter ipython-magic json jupyterlab lsp nested-structures notebook schema sparksql sql syntax-highlighting trino vscode-extension

Last synced: 06 Mar 2025

https://github.com/cybercentrecanada/jupyterlab-sql-editor

A JupyterLab extension providing a SQL formatter, auto-completion, and syntax highlighting for Spark SQL and Trino

auto-completion dataframe datagrid extension formatter ipython-magic json jupyterlab lsp nested-structures notebook schema sparksql sql syntax-highlighting trino vscode-extension

Last synced: 04 Apr 2025

https://github.com/nmandery/h3ron

Rust crates for the H3 geospatial indexing system

dataframe geospatial h3 ndarray rust

Last synced: 27 Apr 2025