https://github.com/glebec/batching-toposort

Efficiently sort interdependent tasks into a sequence of concurrently-executable batches
https://github.com/glebec/batching-toposort

algorithm concurrency dag digraph directed-acyclic-graph graph sort toposort

Last synced: 3 months ago
JSON representation

Efficiently sort interdependent tasks into a sequence of concurrently-executable batches

Host: GitHub
URL: https://github.com/glebec/batching-toposort
Owner: glebec
License: mit
Created: 2019-02-23T05:26:21.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2021-11-12T03:15:56.000Z (over 3 years ago)
Last Synced: 2024-10-12T17:44:26.424Z (8 months ago)
Topics: algorithm, concurrency, dag, digraph, directed-acyclic-graph, graph, sort, toposort
Language: JavaScript
Homepage:
Size: 335 KB
Stars: 27
Watchers: 0
Forks: 3
Open Issues: 1
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

Awesome Lists containing this project

README

        # Batching-Toposort

[![npm version](https://img.shields.io/npm/v/batching-toposort.svg?maxAge=3600)](https://www.npmjs.com/package/batching-toposort)

[![Build Status](https://travis-ci.org/glebec/batching-toposort.svg?branch=master)](https://travis-ci.org/glebec/batching-toposort)

[![code style: prettier](https://img.shields.io/badge/code_style-prettier-ff69b4.svg)](https://github.com/prettier/prettier)

Efficiently sort interdependent tasks into a sequence of concurrently-executable batches.

```hs

batchingToposort :: { DependencyId : [DependentId] } -> [[TaskId]]

```

*   `O(t + d)` time complexity (for `t` tasks and `d` dependency relationships)

*   `O(t)` space complexity

*   Zero package dependencies

*   Thoroughly tested (including [invariant tests](http://jsverify.github.io/#property-based-testing))

*   Errors on cyclic graphs

## Motivation

Often one needs to schedule interdependent tasks. In order to determine task order, the classic solution is to use [topological sort](https://en.wikipedia.org/wiki/Topological_sorting). However, toposort typically outputs a list of individual tasks, without grouping those that can be executed concurrently. Batching-Toposort takes this additional consideration into account, producing a list of lists of tasks. The outer list is ordered; each inner list is unordered.

## Usage

```sh

npm install batching-toposort

```



Batching-Toposort expects a [directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) (DAG) implemented via [adjacency list](https://en.wikipedia.org/wiki/Adjacency_list). In other words, construct an object whose keys are dependency IDs, and whose values are lists of dependent IDs.

```js

const batchingToposort = require('batching-toposort')

// DAG :: { DependencyId : [DependentId] }

const DAG = {

    a: ['c', 'f'], // `a` is a dependency of `c` and `f`

    b: ['d', 'e'],

    c: ['f'],

    d: ['f', 'g'],

    e: ['h'],

    f: ['i'],

    g: ['j'],

    h: ['j'],

    i: [],

    j: [],

}

// batchingToposort :: DAG -> [[TaskId]]

const taskBatches = batchingToposort(DAG)

// [['a', 'b'], ['c', 'd', 'e'], ['f', 'g', 'h'], ['i', 'j']]

```

(If there is demand, Batching-Toposort may one day include a small DAG API and/or [DOT]() support for convenience, but as of now it is the developer's role to construct the graph.)

## Implementation

In short, Batching-Toposort adapts [Kahn's Algorithm](https://en.wikipedia.org/wiki/Topological_sorting#Kahn's_algorithm) by inserting each round of root tasks into sublists rather than appending tasks directly to the main output list.

The classic DAG toposort keeps track of each task's in-degree (number of dependencies). As root tasks (those with no dependencies) are added to the output list, their dependents' in-degree counts are decremented. For a task to become a root, all of its dependencies must have been accounted for. The core algorithm is illustrated below in pseudocode (the actual implementation is in [`src/index.js`](src/index.js)).

```

given G = adjacency list of tasks and dependents (~O(1) lookup):

let N = map from tasks to in-degree counts (~O(1) lookup / update)

let L = [] (empty output list) (~O(1) append)

let R1 = list of root tasks (~O(1) addition, ~O(n) iteration)

while R1 is nonempty

    append R1 to L

    let R2 = [] (empty list for next batch) (~O(1) append)

    for each task T in R1

        for each dependent D of T (as per G)

            decrement in-degree count for D (in N)

            if D's in-degree (as per N) is 0

                add D to R2

    R1 = R2

return L

```

### Performance

The time complexity is `O(|V| + |E|)` for `V` task vertices and `E` dependency edges.

*   The algorithm loops through rounds of roots, and every task is only a root only once, contributing to `O(|V|)` rounds (worst case is a linked list of tasks).

*   Each round handles a disjoint set of dependency edges (those rooted in that round's tasks), so the `O(|E|)` handling of all edges is effectively distributed across rounds.

*   Other operations, e.g. querying a node's in-degree (average case `O(1)`), are carefully managed to preserve the time complexity.

The space complexity is slightly better at `O(|V|)`.

*   The in-degree map size is proportional to the number of vertices `|V|`, but not edges, as those are folded into an integer count during map construction.

*   The output by definition contains `|V|` tasks (distributed among as many or fewer lists).

*   Again, other operations are controlled to keep space complexity low.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/glebec/batching-toposort

Awesome Lists containing this project

README