https://github.com/mhyfritz/bin-data
Partition data into given number of chunks and pick a representative value for each chunk.
array binning bucketing chunks compression iterable quantization
- Host: GitHub
- URL: https://github.com/mhyfritz/bin-data
- Owner: mhyfritz
- Created: 2019-09-10T06:09:37.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2023-01-04T22:00:27.000Z (over 3 years ago)
- Last Synced: 2025-09-15T06:50:08.351Z (9 months ago)
- Topics: array, binning, bucketing, chunks, compression, iterable, quantization
- Language: JavaScript
- Homepage:
- Size: 396 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 14
Metadata Files:
- Readme: README.md
# Data binning
> Partition data into given number of chunks and pick a representative value for each chunk.
## Installation
```bash
npm install @mhyfritz/bin-data
```
## Usage
[Try `bin-data` in your browser.](https://npm.runkit.com/@mhyfritz/bin-data)
Node / module bundlers:
```javascript
const binData = require("@mhyfritz/bin-data");
// or
import * as binData from "@mhyfritz/bin-data";
```
Browser:
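```html
<!-- load the bin-data browser bundle first; it exposes the global `mhyfritzBinData` -->
<script>
  const { bin } = mhyfritzBinData;
  bin([1, 2, 3, 4, 5], 2);
</script>
```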
API:
```javascript
const { bin, chunk, pick } = require("@mhyfritz/bin-data");
// data = [-10, -9, ..., -1, 0, 1, ..., 9, 10]
const data = Array.from({ length: 21 }, (_, i) => i - 10);
// we can get the raw chunks if we want
chunk(data, 4);
// ==>
// [
//   { start: 0, end: 4, data: [ -10, -9, -8, -7, -6 ] },
//   { start: 5, end: 10, data: [ -5, -4, -3, -2, -1, 0 ] },
//   { start: 11, end: 15, data: [ 1, 2, 3, 4, 5 ] },
//   { start: 16, end: 20, data: [ 6, 7, 8, 9, 10 ] }
// ]
// bin the data; by default, the max. value is picked
bin(data, 4);
// ==> [ -6, 0, 5, 10 ]
bin(data, 4, pick.min);
// ==> [ -10, -5, 1, 6 ]
bin(data, 4, pick.mean);
// ==> [ -8, -2.5, 3, 8 ]
bin(data, 4, chunk => pick.quantile(chunk, 0.75));
// ==> [ -7, -1.25, 4, 9 ]
// for complex, non-numeric data, specify an accessor function
// objects = [{x: -10}, ..., {x: 0}, ..., {x: 10}]
const objects = data.map(value => ({ x: value }));
bin(objects, 4, chunk => pick.max(chunk, d => d.x));
// ==> [ -6, 0, 5, 10 ]
// to get back the actual objects, use `pick.greatest()`
bin(objects, 4, chunk => pick.greatest(chunk, d => d.x));
// ==> [ { x: -6 }, { x: 0 }, { x: 5 }, { x: 10 } ]
// one can of course also provide a custom function
// example: pick maximum of the absolute values
function absMax(chunk) {
  let ret = chunk[0];
  for (const x of chunk) {
    if (Math.abs(x) > Math.abs(ret)) {
      ret = x;
    }
  }
  return ret;
}
bin(data, 4, absMax);
// ==> [ -10, -5, 5, 10 ]
```
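The `-1.25` in the quantile example is not an element of the data; it comes from interpolating between two neighbouring values. The sketch below assumes that `pick.quantile` follows the linear-interpolation rule of `d3-array`'s `quantile`, which the `pick` section points to. `quantileSketch` is a hypothetical name, not part of the package, and it expects numbers that are already sorted.

```javascript
// Hypothetical helper for illustration; not part of @mhyfritz/bin-data.
// Assumes `sorted` is a non-empty, ascending array of numbers and 0 <= p <= 1.
function quantileSketch(sorted, p) {
  const position = (sorted.length - 1) * p; // fractional index into the array
  const lower = Math.floor(position);
  const lowerValue = sorted[lower];
  if (lower + 1 >= sorted.length) return lowerValue; // p === 1: last element
  const upperValue = sorted[lower + 1];
  // interpolate linearly between the two neighbouring values
  return lowerValue + (upperValue - lowerValue) * (position - lower);
}

// second chunk of the example above: position = 5 * 0.75 = 3.75,
// so the result lies three quarters of the way from -2 to -1
const q = quantileSketch([-5, -4, -3, -2, -1, 0], 0.75);
console.log(q); // -1.25
```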
### `bin(data, numChunks, pickRepresentative)`
- `data`: an array or other iterable
- `numChunks`: the number of chunks to generate
- `pickRepresentative` (`chunk => value`): function that picks the representative value of a chunk; defaults to `pick.max` (see below)
Example
```javascript
bin([1, 2, 3, 4, 5], 2);
// ==> [ 3, 5 ]
```
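Conceptually, `bin` appears to behave like chunking followed by one pick per chunk. The sketch below illustrates that reading with hand-written chunk objects in the shape returned by `chunk`; `pickMaxSketch` is a hypothetical stand-in for `pick.max`, and the composition is an assumption about the behaviour, not the package's source.

```javascript
// Chunks in the shape documented for `chunk([1, 2, 3, 4, 5], 2)`.
const chunks = [
  { start: 0, end: 2, data: [1, 2, 3] },
  { start: 3, end: 4, data: [4, 5] }
];

// Hypothetical stand-in for `pick.max`: largest value of a non-empty chunk.
const pickMaxSketch = values => Math.max(...values);

// One representative per chunk, computed from the chunk's `data` array.
const binned = chunks.map(c => pickMaxSketch(c.data));
console.log(binned); // [ 3, 5 ]
```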
### `pick`
Object holding predefined functions for picking a chunk's representative value,
e.g. `pick.min` picks the minimum value of a chunk.
See [`d3-array`](https://github.com/d3/d3-array/blob/09b8ff21742b365a7090e728120640fb45464637/README.md)
for all options.
Example
```javascript
pick.mean([1, 2, 3]);
// ==> 2
pick.mean([4, 5]);
// ==> 4.5
bin([1, 2, 3, 4, 5], 2, pick.mean);
// ==> [ 2, 4.5 ]
```
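When an accessor is involved, `pick.max` yields the accessed value while `pick.greatest` yields the element itself, as the usage examples show. The following sketch mimics that difference with two hypothetical helpers (`maxSketch`, `greatestSketch`); they are illustrations written for this README, not the package's implementation, and they assume a non-empty array.

```javascript
// Hypothetical helpers for illustration; not part of @mhyfritz/bin-data.

// Returns the largest *accessed value* in the chunk.
function maxSketch(chunk, accessor = d => d) {
  let best = accessor(chunk[0]);
  for (const item of chunk) {
    const value = accessor(item);
    if (value > best) best = value;
  }
  return best;
}

// Returns the *element* whose accessed value is largest.
function greatestSketch(chunk, accessor = d => d) {
  let bestItem = chunk[0];
  for (const item of chunk) {
    if (accessor(item) > accessor(bestItem)) bestItem = item;
  }
  return bestItem;
}

const points = [{ x: 1 }, { x: 5 }, { x: 3 }];
console.log(maxSketch(points, d => d.x)); // 5
console.log(greatestSketch(points, d => d.x)); // { x: 5 }
```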
### `chunk(data, numChunks)`
Partition `data` into `numChunks` chunks. Returns an array of objects, one per chunk, each holding
the chunk's values (`data`) plus its inclusive `start` and `end` indices in the input.
Example
```javascript
chunk([1, 2, 3, 4, 5], 2);
// ==>
// [
//   { start: 0, end: 2, data: [ 1, 2, 3 ] },
//   { start: 3, end: 4, data: [ 4, 5 ] }
// ]
```
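When the input length is not divisible by `numChunks`, the chunks differ in size. One boundary rule that reproduces the outputs documented in this README is to round the fractional chunk borders. The sketch below uses that rule as an assumption; `chunkSketch` is a hypothetical name, and the package's actual algorithm may differ.

```javascript
// Hypothetical re-implementation for illustration; not the package's source.
// Assumes numChunks is a positive integer no larger than the data length.
function chunkSketch(data, numChunks) {
  const values = Array.from(data); // accept arrays and other iterables
  const width = values.length / numChunks; // fractional chunk width
  const chunks = [];
  for (let i = 0; i < numChunks; i++) {
    const start = Math.round(i * width);
    const end = Math.round((i + 1) * width) - 1; // inclusive end index
    chunks.push({ start, end, data: values.slice(start, end + 1) });
  }
  return chunks;
}

const result = chunkSketch([1, 2, 3, 4, 5], 2);
console.log(result);
// [
//   { start: 0, end: 2, data: [ 1, 2, 3 ] },
//   { start: 3, end: 4, data: [ 4, 5 ] }
// ]
```

Applied to the 21-element array from the usage section with four chunks, the same rule yields chunk sizes of 5, 6, 5 and 5.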