https://github.com/strainer/barsort

An unwieldy exercise in fast numeric sorting

Barsort
=======

A speed-optimised, stable, general-purpose sort specialised for numeric input only (integers and reals). It returns either a sorted clone of the input or a sort index.

Barsort uses a specialised algorithm similar to 'counting sort', originally written to place array elements into groups of equal size with similar magnitudes. It is combined here with insertion and merge sorts, and with edge-case processing, to create a fast numeric sort.

Testing across a good range of possible input distributions and sizes shows Barsort is many times faster than Node's native sort (as of 2016) and competitive with a proficient JavaScript implementation of Python's optimised 'Timsort'.

Alas, the source code is a beastly private exertion beyond redemption.

Usage
-----

```javascript

//return a sorted clone of array
sorted_arr = Barsort.sort( array [,"descend"] )

//return a sorted index to array ([optional params])
index_arr = Barsort.sortorder( array [,index_arr][,"descend"] )

```
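
To make the idea of a "sort index" concrete, here is a minimal sketch in plain JavaScript that builds one with the native sort. `sortOrderNaive` is a hypothetical helper and not part of Barsort; it only illustrates the kind of result `sortorder` is described as returning, assuming the index lists input positions in sorted order.

```javascript
// Hypothetical helper (not Barsort's API): returns the positions of `array`
// in sorted order of value, leaving `array` itself untouched.
function sortOrderNaive(array, descend) {
  var index = new Array(array.length)
  for (var i = 0; i < array.length; i++) index[i] = i
  // Array.prototype.sort is stable in ES2019+ engines,
  // so equal values keep their input order
  index.sort(function (a, b) {
    return descend ? array[b] - array[a] : array[a] - array[b]
  })
  return index
}

var data = [3.5, -1, 2, -1]
var order = sortOrderNaive(data)                         // positions, ascending by value
var sorted = order.map(function (p) { return data[p] })  // values read through the index
```

Reading the input through the index yields the sorted values, while the input array itself is never reordered.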

### Summary of speedtests:

The following tables are for generating a sort index of arrays. Timsort is considerably faster doing light sorting in place.

Pre-sorted input, Lengths: | 100 | 10,000 | 1,000,000
:-------------- | :-------: | :---------: | :----------
Barsort sort | 100 % | 100 % | 100 %
Native sort | 5 % | 2 % | 2 %
Timsort sort | 100 % | 100 % | 100 %

Gaussian distribution | 100 | 10,000 | 1,000,000
:-------------- | :-------: | :---------: | :----------
Barsort sort | 100 % | 100 % | 100 %
Native sort | 2 % | 2 % | 2 %
Timsort sort | 100 % | 60 % | 60 %

Tough distribution | 100 | 10,000 | 1,000,000
:-------------- | :-------: | :---------: | :----------
Barsort sort | 100 % | 100 % | 100 %
Native sort | 10 % | 10 % | 10 %
Timsort sort | 100 % | 65 % | 65 %

### See also

[Timsort](https://github.com/mziccard/node-timsort) is a popular multipurpose in-place sort.

[LSD Radix Sort](https://duvanenko.tech.blog/2017/06/15/faster-sorting-in-javascript/) is 3x as fast as Barsort but is limited to unsigned integers and cannot produce a sort index.

### Barsort algorithm basics - a "counting sort"

The input numbers are first tallied into bins as though calculating a histogram (by dividing by a suitable factor and casting to integer to get a bin number). Like this:
```
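//len, bincount, arr, minv and binwidth are assumed names
for(var i=0; i<len; i++){ bincount[ (arr[i]-minv)/binwidth >>0 ]++ }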
```
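
A self-contained sketch of this tallying step, assuming the factor is chosen so that the minimum value maps to bin 0 and the maximum to the last bin. The names `tallyBins`, `nbins` and `scale` are illustrative and not taken from the Barsort source.

```javascript
// Illustrative only: count how many inputs fall into each of `nbins`
// equal-width bins spanning the input's range.
function tallyBins(arr, nbins) {
  var min = Infinity, max = -Infinity
  for (var i = 0; i < arr.length; i++) {
    if (arr[i] < min) min = arr[i]
    if (arr[i] > max) max = arr[i]
  }
  // scale maps [min, max] onto [0, nbins]; guard against a zero range
  var scale = max > min ? nbins / (max - min) : 0
  var counts = new Array(nbins).fill(0)
  for (var j = 0; j < arr.length; j++) {
    var bin = (arr[j] - min) * scale >>> 0  // truncate to an integer bin number
    if (bin >= nbins) bin = nbins - 1       // the maximum value lands in the last bin
    counts[bin]++
  }
  return counts
}

var counts = tallyBins([0, 1, 2, 3, 4, 5, 6, 7, 8], 4)
```

Multiplying by a precomputed `scale` is equivalent to dividing by the bin width, and keeps the division out of the inner loop.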
These "counting bins" are subsequently indexed by a smaller number of "placement bins". The core algorithm was developed to sort data roughly into histogram bars (without sorting *within* the bars); the counting bins are subdivisions of the bars that reduce spillage between them. The cumulative sum of the placement-bin populations is then calculated, so that each placement bin has a known anchor position in the sorting index (the output) for values in that bin's range.

Like this:
```
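//nbin and bincount are assumed names; barsfill is assumed zero-initialised
for(var bin=0; bin<nbin; bin++){
  barsfill[fillbar]+=bincount[bin] //add this counting bin to the current bar
  if(barsfill[fillbar]>=fcap){ //when bar is full...
    barsfill[fillbar+1]+=barsfill[fillbar]-fcap //overflow spills into next bar
    barsfill[fillbar]=fcap
    fillbar++ //...fill next bar
    nxtcap+=kysperbar-fcap //nxtcap and kysperbar are floats
    fcap=nxtcap >>>0 //fcap is integer (it differs for each bar)
  }
}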
```
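
The role of the cumulative sum is easier to see in a simplified, single-level variant. The sketch below is a textbook counting-sort-style placement followed by an insertion sort, not Barsort's two-level scheme of counting bins and placement bars; `binSortOrder` and all of its variable names are hypothetical.

```javascript
// Simplified illustration: one level of bins, then an insertion sort to finish.
function binSortOrder(arr, nbins) {
  var n = arr.length, min = Infinity, max = -Infinity
  for (var i = 0; i < n; i++) {
    if (arr[i] < min) min = arr[i]
    if (arr[i] > max) max = arr[i]
  }
  var scale = max > min ? nbins / (max - min) : 0
  var binOf = new Array(n), anchor = new Array(nbins + 1).fill(0)
  for (i = 0; i < n; i++) {
    var b = (arr[i] - min) * scale >>> 0
    if (b >= nbins) b = nbins - 1
    binOf[i] = b
    anchor[b + 1]++                        // tally, shifted up by one slot
  }
  // cumulative sum: anchor[b] becomes the first output slot of bin b
  for (b = 0; b < nbins; b++) anchor[b + 1] += anchor[b]
  var order = new Array(n)
  // place inputs in input order, so equal values stay stable
  for (i = 0; i < n; i++) order[anchor[binOf[i]]++] = i
  // bins are now in the right order; insertion sort fixes order inside each bin
  for (i = 1; i < n; i++) {
    var p = order[i], j = i - 1
    while (j >= 0 && arr[order[j]] > arr[p]) { order[j + 1] = order[j]; j-- }
    order[j + 1] = p
  }
  return order
}

var input = [5, 1, 4, 1, 9, 3]
var order = binSortOrder(input, 3)
var sorted = order.map(function (p) { return input[p] })
```

After the placement pass each element sits within its own bin's span of the output, so the final insertion sort only ever moves elements short distances.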

Some multiply-indirected lookups and updates are done for each input, using the base placement info to assign each one its position in the sorting index. Here is that final 'curious' code:
```
var bapos=new Array(nbar); bapos[0]=0 //( before_anchor_pos )
for(var i=0;i