https://github.com/kanaka/rbt_cfs

Red-Black Tree and Completely Fair Scheduler Simulation and Visualization
https://github.com/kanaka/rbt_cfs
Last synced: 2 months ago
JSON representation
Red-Black Tree and Completely Fair Scheduler Simulation and Visualization
Host: GitHub
URL: https://github.com/kanaka/rbt_cfs
Owner: kanaka
Created: 2014-04-05T04:54:07.000Z (about 11 years ago)
Default Branch: master
Last Pushed: 2014-04-05T04:56:24.000Z (about 11 years ago)
Last Synced: 2025-03-25T14:51:16.942Z (3 months ago)
Language: JavaScript
Size: 375 KB
Stars: 16
Watchers: 5
Forks: 5
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.MPL-2.0
Awesome Lists containing this project

README

        # Red-Black Tree and Completely Fair Scheduler Simulation and Visualization

## Overview

This project is a Javascript implementation of a CPU scheduler and

four data structures for use as the future task timeline: Binary

Search Tree (unbalanced), Red-Black Tree, Heap Tree, and Heap Array.

When the scheduler is using the RBT data structure for the timeline,

this models the Linux Completely Fair Scheduler (CFS).

See online versions at:

* [Binary Tree Visualization](http://kanaka.github.io/rbt_cfs/trees.html)

* [Complete Fair Scheduler Visualation](http://kanaka.github.io/rbt_cfs/scheduler.html)

## Running the code

* Prerequisites:

    * You need to install [Node.js](http://nodejs.org)

    * Unpacked project directory

    * To run the unit tests you also need to install the nodeunit

      module. Use npm from the project directory:

        ```

        npm install nodeunit

        ```

* Load a simple file with 3 tasks, run the scheduler using a simple

  (unbalanced) Binary Search Tree for the timeline, and generate

  a brief report (elapsed time, throughput, tree operations).

    ```

    node ./scheduler.js --report bst data/simple3.txt

    ```

* Load a more complicated tasks file, run the scheduler using

  a Red-Black Tree for the timeline, and generate a summary of the

  state of the timline and a list of the tasks that ran at each time

  unit:

    ```

    node scheduler.js --summary rbt data/mixed12.txt

    ```

* Load a larger task file (20 tasks), run the scheduler using

  a Red-Black Tree for the timeline, and generate a detailed report

  (starting tasks, running/completed task at each tick, in addition to

  information from --report):

    ```

    node ./scheduler.js --detailed rbt data/IncrementalSTdiffRT.txt

    ```

* Generate 9 different sets of tasks (with 2^2 through 2^10 tasks),

  run the scheduler against each task set using a HeapTree for the

  timeline, and generate CSV formatted output (one line per task set):

    ```

    node ./run_tasks.js --csv heaptree 2 10

    ```

* Load a tasks file with 8 tasks, run the scheduler using a HeapArray

  for the timeline, and generate CSV output with one line for each

  simulation time unit that contains fairness ratios for every task.

    ```

    node ./run_tasks_fairness.js heaparray data/flat8.txt

    ```

* Run all the unit tests using nodeunit:

    ```

    node node_modules/nodeunit/bin/nodeunit test_*

    ```

* In addition to the command line programs, there are two web

  applications that can be loaded in a browser (tested in Google

  Chrome). These webapps and node.js interfaces both share the same

  JavaScript code for the underlying algorithms.

    * **trees.html**: This webapp uses the [d3.js](http://d3js.org)

      library to visually render the state of each of tree

      data-structures. The interface has a dropdown selection for four

      differnent tree structures: Binary Search Tree, Red-Black Tree,

      Min/Max HeapTree, and Min/Max HeapArray. After selecting a tree,

      nodes can be added to the tree one at a time, or as a group of

      randomly chosen nodes from a range. 

    * **scheduler.html**: This webapp presents an interface to the

      parseTasks (in `tasks.j`) and runScheduler (in `scheduler.js`)

      algorithms. The page loads with a very small default task file

      presented. The "Choose File" button can be used to load a new

      tasks file (which can be edited after it is loaded). The "Run

      Scheduler" button will run the scheduler algorith using the

      selected data structure and reporting output format.

## Design

There are several competing goals that must be balanced in scheduler

design. The two that are considered by this project are fairness and

throughput.

### Fairness

Fairness is a measure of how close the scheduler approximates

a perfectly and infinitely subdividable CPU. In other words, if there

are N task that all started at the same time and have the same

duration, then each task should have received exactly 1/Nth of the CPU

runtime at every point in the tasks lifecycle. Real CPUs (even SMP

CPUs) are not infinitely subdividable and as such only approximate

perfectly fairness. Contrary to the name, the Completely Fair

Scheduler is still an approximately of a 100% fair scheduler where the

average fairness (or unfairness) across all tasks approaches

completely fair over time.

### Throughput

Throughput is a measure of the efficiency (low overhead) of the

scheduler. Maximal throughput is achieved if the scheduler keeps

allows the CPU to allocate 100% of its processing power to actual

tasks. In other words, any time that the CPU is running the

scheduler rather than the tasks to be scheduled is overhead that

decreases throughput.

## Algorithm Description

### Scheduler

The scheduler (runScheduler in `scheduler.js`) algorithm performs the

following tasks in a loop (where curTime is the current tick/time

unit):

- Add any tasks to the timeline structure that have a start time equal

  to curTime.

- If there is a running task and it is no longer the most unfair task,

  then put it back to the timline.

- If there is no running task and there are tasks on the timeline,

  remove the most unfair task from the timeline (the left-most in

  a BST and RBT, and the top in a Min Heap) and set it as the running

  task.

- Execute the running task

- If the running task is complete (its runtime has reached its

  duration), then mark it as completed and stop running it.

### Timeline Interface

Each of the data structures that can be used as a scheduler timeline

provide the following interface:

- min() - return the tree node containing the task which is currently

  the most unfair (the smallest vruntime value)

- insert() - insert a task into the tree based on that task's vruntime

  value.

- remove() - remove a given tree node from the tree.

### Node (binarytree.js)

Each of the timeline tree data structures is implemented using the

Node class (Node in `binarytree.js`) to represents nodes in the tree.

All operations on the tree are performed in terms of the Node class.

Some of the Node operations/attributes (swap, p, left, right) behave

differently depending on whether the data structure is constructed

using references (BST, RBT, HeapTree) or using a flat array structure

(HeapArray).

The Node class provides the following attributes:

- **id**    : A unique node ID. Used for distinguishing node which

              have an identical value.

- **isNIL** : Set to true if this is a special NIL sentinel node,

              false otherwise.

- **val**   : The actual value of the node. In the case of the

              scheduler timeline, this holds the task object/map.

- **color** : The color of this node. Only applicable to Red-Black

              Tree.

- **idx**   : The array index of this node if it is being used in

              a data structure that is array backed.

- **p**     : A reference to the parent of this Node.

- **left**  : A reference to the left child of this Node.

- **right** : A reference to the right child of this Node.

In addition the Node class provides two methods:

- **cmp**   : Called with another Node; returns -1 if this Node is

              less than the other Node, 0 if it is equal, and 1 if it

              is greater than the other Node.

- **swap**  : Called with another Node; swap position with the other

              Node.

From the perspective of the Node class, the val attribute is an opaque

object. In order to compare this Node to another Node, when a Node is

instantiated it is given a compare function which allows is to compare

two Node val properties.

Since all the binary tree data structures use the Node class to

construct the tree, the Node class is used to gather statistics on the

operations that are performanced against the tree.

### BinaryTree (binarytree.js)

BinaryTree is a generic class that implements methods that are common

to all of the binary trees data structures:

- **size**   : Returns the number of non-NIL nodes in the tree.

- **reduce** : Returns a reduced value that results from running an

               provided action function against each node of the tree

               and accumulating the resulting value.

- **tuple**  : Returns a simple hierarchical list representation of

               the tree that simplifies validation/testing.

- **walk**   : Returns a sequence of Nodes that result from walking

               the tree in, pre, or post order.

- **links**  : Returns a sequence of pairs that represent all the

               parent to child links in the tree.

- **DOT**    : Returns string in DOT (graphviz) format that can be

               used generate an image of the tree.

When a BinaryTree derived (concrete) class is instantiated, it take

a comparison function which is used to instantiate new Nodes.

The concrete classes that are derived from BinaryType must provide an

insert and remove function. The may also provide concrete

implementations of min, max and search if they support those

operations.

### BST (bst.js)

BST is a Binary Search Tree class derived from the BinaryTree class

that provides concrete implementations of insert, remove, search, min,

max. These methods are basically implemented as described in CLRS

(Algorithms, Cormen et al). One notable difference in the

implementation is that BST uses sentinel NIL nodes rather than null

pointers. This provides consistency of implementation across all the

binary tree data structures.

Unit tests for BST are defined in `test_bst.js`.

### RBT (rbt.js)

RBT is a Red-Black Tree class derived from the BST that overrides the

implementation of the insert and remove methods (and re-uses search,

min, and max from BST). These methods are basically implemented as

described in CLRS.

Unit tests for RBT are defined in `test_rbt.js`.

### HeapTree (heaptree.js)

HeapTree is a Heap derived from the BinaryTree class. The HeapTree

class supports either min-heap or max-heap behavior that can be

selected at instantiation time. HeapTree provides concrete

implementation for min (for min-heap), max (for max-heap), insert and

remove.

HeapTree uses normal references to connect Nodes in the tree together.

Because HeapTree does not use an array layout in memory (see

HeapArray), the algorithm for finding the last position of the tree

(for inserts and deletes) is more complicated and uses an O(log N)

walk from the root (implemented in heapGetLast).

Unit tests for HeapTree are defined in `test_heap.js`.

### HeapArray (heaparray.js)

HeapArray is similar to (and derived from) HeapTree but rather than

linking Nodes together using standard references, the Nodes are

positioned in an array in memory such that the left child is 2i+1

(where i is the index of the current node). HeapArray overrides the

insert and remove methods but still uses the heapBubbleUp and

heapBubbleDown functions from HeapTree.

Unit tests for HeapArray are defined in `test_array.js`.

### Tasks (tasks.js)

Tasks for the scheduler to run can be generated dynamically or loaded

from a task description file. A task description file has the

following format:

```

NUM_OF_TASKS TOTAL_TIME

TASK1_ID TASK1_START_TIME TASK1_DURATION

TASK2_ID TASK2_START_TIME TASK2_DURATION

...

TASKn_ID TASKn_START_TIME TASKn_DURATION

```

The parseTasks function (in `tasks.js`) can be used to parse the data

from a task description file into a tasks descriptor object/map that

can be passed to the scheduler function.

The generateTasks function (in `tasks.js`) can be used to generate

task descriptor object/map. The function parameters allow for fixed or

random ranges for the start and durations of the tasks being

generated.

The tasksToString function (in `tasks.js`) takes a task descriptor

object/map and generates a string in the task description file format

which can then be written to disk as a task desctiption file.

### License

This project is licensed under the MPL-2.0 license (see

LICENSE.MPL-2.0).
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/kanaka/rbt_cfs

Awesome Lists containing this project

README