

Query processing and transformation of array-backed data tables.





# Arquero

**Arquero** is a JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames. Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations.

* Fast: process data tables with million+ rows.
* Flexible: query over arrays, typed arrays, array-like objects, or Apache Arrow columns.
* Full-Featured: perform a variety of wrangling and analysis tasks.
* Extensible: add new column types or functions, including aggregate & window operations.
* Lightweight: small size, minimal dependencies.

To get up and running, start with the Introducing Arquero tutorial, part of the Arquero notebook collection.

Have a question or need help? Post to the Arquero GitHub Discussions board.

Arquero is Spanish for "archer": if datasets are arrows, Arquero helps their aim stay true. 🏹 Arquero also refers to a goalkeeper: safeguard your data from analytic "own goals"! 🥅 ✋ ⚽

## API Documentation

* Top-Level API - All methods in the top-level Arquero namespace.
* Table - Table access and output methods.
* Verbs - Table transformation verbs.
* Op Functions - All functions, including aggregate and window functions.
* Expressions - Parsing and generation of table expressions.
* Extensibility - Extend Arquero with new expression functions or table verbs.

## Example

The core abstractions in Arquero are *data tables*, which model each column as an array of values, and *verbs* that transform data and return new tables. Verbs are table methods, allowing method chaining for multi-step transformations. Though each table is unique, many verbs reuse the underlying columns to limit duplication.
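
To make the column-oriented model concrete, here is a minimal sketch in plain JavaScript. The `ColumnTable` class and its `derive` and `select` methods are hypothetical stand-ins written for illustration, not Arquero's implementation; they only show how a verb can return a new table while sharing the untouched column arrays with the original.

```js
// Hypothetical minimal column-oriented table; NOT Arquero's actual implementation.
class ColumnTable {
  constructor(columns) {
    // columns: object mapping column name -> array of values (equal lengths assumed)
    this.columns = columns;
    const names = Object.keys(columns);
    this.numRows = names.length ? columns[names[0]].length : 0;
  }

  // Build a row object for index i, for use in row-wise expressions.
  row(i) {
    const r = {};
    for (const name of Object.keys(this.columns)) r[name] = this.columns[name][i];
    return r;
  }

  // Verb: add columns computed per row; returns a new table.
  derive(exprs) {
    const next = { ...this.columns }; // shallow copy: existing arrays are shared
    for (const [name, fn] of Object.entries(exprs)) {
      next[name] = Array.from({ length: this.numRows }, (_, i) => fn(this.row(i)));
    }
    return new ColumnTable(next);
  }

  // Verb: keep only the named columns; returns a new table.
  select(...names) {
    const next = {};
    for (const name of names) next[name] = this.columns[name];
    return new ColumnTable(next);
  }
}

const base = new ColumnTable({ a: [1, 2, 3], b: [10, 20, 30] });
const result = base
  .derive({ sum: d => d.a + d.b })
  .select('a', 'sum');

console.log(result.columns.sum);                  // [11, 22, 33]
console.log(result.columns.a === base.columns.a); // true: the column array is reused
console.log(Object.keys(base.columns));           // ['a', 'b']: original table unchanged
```

Because each verb returns a fresh table object, calls can be chained freely, and only newly computed columns allocate new arrays.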

    import { all, desc, op, table } from 'arquero';

    // Average hours of sunshine per month, from
    const dt = table({
      'Seattle': [69,108,178,207,253,268,312,281,221,142,72,52],
      'Chicago': [135,136,187,215,281,311,318,283,226,193,113,106],
      'San Francisco': [165,182,251,281,314,330,300,272,267,243,189,156]
    });

    // Sorted differences between Seattle and Chicago.
    // Table expressions use arrow function syntax.
    dt.derive({
        month: d => op.row_number(),
        diff: d => d.Seattle - d.Chicago
      })
      .select('month', 'diff')
      .orderby(desc('diff'))
      .print();

    // Is Seattle more correlated with San Francisco or Chicago?
    // Operations accept column name strings outside a function context.
    dt.rollup({
        corr_sf: op.corr('Seattle', 'San Francisco'),
        corr_chi: op.corr('Seattle', 'Chicago')
      })
      .print();

    // Aggregate statistics per city, as output objects.
    // Reshape (fold) the data to a two column layout: city, sun.
    dt.fold(all(), { as: ['city', 'sun'] })
      .groupby('city')
      .rollup({
        min: d => op.min(d.sun), // functional form of op.min('sun')
        max: d => op.max(d.sun),
        avg: d => op.average(d.sun),
        med: d => op.median(d.sun),
        // functional forms permit flexible table expressions
        skew: ({sun: s}) => (op.mean(s) - op.median(s)) / op.stdev(s) || 0
      })
      .objects();
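
As a rough illustration of what the per-city aggregate example above does conceptually (reshaping with fold, then summarizing each city), the following plain JavaScript sketch performs the same steps by hand. The helper names `foldColumns` and `summarize` are hypothetical and exist only for this example, and the row order produced here is an assumption that may differ from Arquero's output.

```js
// Plain-JavaScript sketch of fold + per-city aggregation; helper names are illustrative only.
const sunshine = {
  'Seattle': [69,108,178,207,253,268,312,281,221,142,72,52],
  'Chicago': [135,136,187,215,281,311,318,283,226,193,113,106],
  'San Francisco': [165,182,251,281,314,330,300,272,267,243,189,156]
};

// Fold: turn each (column name, value) pair into a { city, sun } row.
function foldColumns(columns) {
  const names = Object.keys(columns);
  const numRows = columns[names[0]].length;
  const rows = [];
  for (let i = 0; i < numRows; ++i) {
    for (const name of names) rows.push({ city: name, sun: columns[name][i] });
  }
  return rows;
}

// Group rows by city, then reduce each group to summary statistics.
function summarize(rows) {
  const groups = new Map();
  for (const { city, sun } of rows) {
    if (!groups.has(city)) groups.set(city, []);
    groups.get(city).push(sun);
  }
  return Array.from(groups, ([city, values]) => ({
    city,
    min: Math.min(...values),
    max: Math.max(...values),
    avg: values.reduce((a, b) => a + b, 0) / values.length
  }));
}

const folded = foldColumns(sunshine);
const stats = summarize(folded);
console.log(folded.length); // 36 rows: 12 months x 3 cities
console.log(stats);         // one summary object per city
```

The two-column layout is what makes the grouping step simple: once every value carries its city as a key, a single pass can bucket and aggregate any number of cities.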

## Usage

### In Browser

To use in the browser, you can load Arquero from a content delivery network using a `<script>` tag.



Arquero will be imported into the `aq` global object. The default browser bundle does not include the Apache Arrow library. To perform Arrow encoding using `toArrow()` or binary file loading using `loadArrow()`, load Apache Arrow before Arquero.



Alternatively, you can build and import `arquero.min.js` from the `dist` directory, or build your own application bundle. When building custom application bundles for the browser, the module bundler should draw from the `browser` property of Arquero's `package.json` file. For example, if using rollup, pass the `browser: true` option to the node-resolve plugin.
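
The rollup setup described above might look like the following configuration sketch. It assumes the `@rollup/plugin-node-resolve` package; the entry path, output path, and global name are hypothetical placeholders to adapt to your own project.

```js
// rollup.config.js: a minimal sketch; input/output paths and the global name are hypothetical.
import { nodeResolve } from '@rollup/plugin-node-resolve';

export default {
  input: 'src/index.js',       // hypothetical application entry point
  output: {
    file: 'build/bundle.js',   // hypothetical output location
    format: 'iife',
    name: 'app'                // hypothetical global name for the bundle
  },
  plugins: [
    // browser: true makes the plugin honor the "browser" field in package.json
    nodeResolve({ browser: true })
  ]
};
```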

Arquero uses modern JavaScript features, and so will not work with some outdated browsers. To use Arquero with older browsers including Internet Explorer, set up your project with a transpiler such as Babel.

### In Node.js or Application Bundles

First install `arquero` as a dependency, for example via `npm install arquero --save`. Arquero assumes Node version 12 or higher.

Import using CommonJS module syntax:

    const aq = require('arquero');

Import using ES module syntax, collecting all exports into a single object:

    import * as aq from 'arquero';

Import using ES module syntax, with targeted imports:

    import { op, table } from 'arquero';

## Build Instructions

To build and develop Arquero locally:

- Clone the Arquero repository.
- Run `npm i` to install dependencies.
- Run `npm test` to run test cases, `npm run perf` to run performance benchmarks, and `npm run build` to build output files.