Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/tinyield/ts-sequences-benchmarks

This project provides a way to benchmark various Sequence operations in Typescript and Javascript, such as find, every and zip.
https://github.com/tinyield/ts-sequences-benchmarks

benchmark every javascript lazyjs lodash nodejs npm operations sequences sequences-comparison tinyield typescript underscore zip

Last synced: 12 days ago
JSON representation

This project provides a way to benchmark various Sequence operations in Typescript and Javascript, such as find, every and zip.

Awesome Lists containing this project

README

        

# Glossary
1. [Overview](#overview)
2. [Usage](#usage)
3. [Benchmarks Overview](#benchmarks-overview)
1. [All Match](#all-match)
2. [Every](#every)
3. [Find](#find)
4. [First](#find-first)
5. [Zip](#zip)
6. [Distinct Top Artist and Top Track by Country Benchmark](#distinct-top-artist-and-top-track-by-country-benchmark)
7. [Artists who are in a Country's top ten who also have Tracks in the same Country's top ten Benchmark](#artists-who-are-in-a-countrys-top-ten-who-also-have-tracks-in-the-same-countrys-top-ten-benchmark)
8. [Flatmap and Reduce](#flatmap-and-reduce)
9. [Temperature Transitions](#temperature-transitions)
4. [Performance Comparison](#performance-comparison)
6. [More](#more)

# Overview
In this document you'll find information on how to use the project
in terms of how to run it in your local machine, a brief description
of each benchmark's pipeline.

# Usage
To run these benchmarks on you local machine just run:
```
npm run build
```
And then:
```
node ./build/lib/index.js
```

For more information run:
```ignorelang
node ./build/lib/index.js -h
```
# Benchmarks Overview
### All Match - `allMatch(sequence, predicate)`
Benchmarks the `allMatch()` operation in the different sequence types.

Collection Sizes: [1000, 100 000]

Pipeline:
```ignorelang
sequenceOfEvenNumbers -> allMatch(isEven)
```
### Every - `every(sequence1, sequence2, predicate)`
Every is an operation that, based on a user defined predicate, tests if all the
elements of a sequence match between corresponding positions. So for instance, if we have:
```javascript
const seq1 = ["505", "Amsterdam", "Mural"];
const seq2 = ["505", "Amsterdam", "Mural"];
const seq3 = ["Mural", "Amsterdam", "505"];
const pred = (s1, s2) => s1.equals(s2);
```
Then:
```javascript
every(seq1, seq2, pred); // returns true
every(seq1, seq3, pred); // returns false
```
For `every` to return true, every element of each sequence must match in the same index.
The `every` feature can be achieved through a pipeline of `zip`-`allMatch` operations,
such as:
```javascript
seq1.zip(seq2, pred).allMatch(isTrue);
```

Benchmarked for: [Class, Number, String]
Collection Sizes: [1000, 100 000]

Pipeline:
```ignorelang
sourceSequence + copyOfSourceSequence -> zip(equals) -> allMatch(isTrue)
```
### Find - `find(sequence1, sequence2, predicate)`
The `find` between two sequences is an operation that, based on a user defined
predicate, finds two elements that match, returning one of them in the process.
So if we have:
```javascript
const seq1 = ["505", "Amsterdam", "Mural"];
const seq2 = ["400", "Amsterdam", "Stricken"];
const seq3 = ["Amsterdam", "505", "Stricken"];
const predicate = (s1, s2) => s1.equals(s2);
```
Then:
```java
find(seq1, seq2, predicate); // returns "Amsterdam"
find(seq1, seq3, predicate); // returns null
find(seq2, seq3, predicate); // returns "Stricken"
```
For `find` to return an element, two elements of each sequence must match in the
same index. Here's what an implementation of `find` would look like with the support
of a `zip`:
```javascript
zip(seq1, seq2, (t1, t2) => predicate(t1, t2) ? t1 : null)
.filter(isNotNull)
.first();
}
```

Benchmarked for: [Class, Number, String]
Collection Sizes: [1000, 100 000]
This pipeline has two ways of determining which element will be the matching element:
1. For String sequences, the matching element will be on a fixed index, using the
following criteria:
* For collections with more than 100 elements, the matching index will correspond
to COLLECTION_SIZE / 100.
(e.g: for a COLLECTION_SIZE of 100K the pipeline will match with the 1000th element)

* Otherwise, the matching index will correspond to COLLECTION_SIZE / 10.
(e.g: for a COLLECTION_SIZE of 100 elements the pipeline will match with the 10th element)
2. For all sequences (including String), the match index increments with each iteration
to make sure there are matches in every index of the sequence.
(e.g: On the 1st iteration the match index will be 0, on the 2nd it will be 1, etc... )

Pipeline:
```javascript
sourceSequence1 + sourceSequence2 ->
-> zip((e1, e2) => predicate(e1, e2) ? t1 : null) ->
-> filter(isNotNull) ->
-> first()
```
### First - `first(sequence, predicate)`
Benchmarks the usage of the `findFirst()` operator. This benchmark was run
with three types of Sequences:
1. One where the match would be found in the first element.
1. One where the match would be found in the middle.
1. One where the match would be found in the last element.

Collection Sizes: [1000, 100 000]

Pipeline:
```ignorelang
sequenceOfAllButOneEvenNumbers -> filter(isOdd) -> findFirst()
```
### Zip Primes with Values - `zip(primes, values)`
Benchmarks _zipping_ a sequence of prime Numbers with instances of the class Value.

Collection Sizes: [1000, 100 000]

Pipeline:
```ignorelang
(sequenceOfNumbers -> filter(isPrime)) + sequenceOfValues -> zip(Pair.with) -> forEach(blackhole)
```
### Distinct Top Artist and Top Track by Country Benchmark - `zip(artists, tracks)`
Benchmarks creating two different sequences, one consisting of the top 50 Artists
(provided by [Last.fm](https://www.last.fm/api/)) of each non english speaking
country (provided by [REST Countries](https://restcountries.eu/)) and the other
the exact same thing but for the top 50 Tracks.
Then _zipping_ both sequences into a Trio of Country, First Artist and First Track and
retrieving the distinct elements by Artist.

Pipelines:
* Sequence of Artists:
```ignorelang
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasArtists) -> map(Pair.of(country, artists));
```

* Sequence of Tracks:
```ignorelang
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasTracks) -> map(Pair.of(country, tracks));
```
* Pipeline
```ignorelang
sequenceOfArtists + sequenceOfTracks ->
-> zip(Trio.of(country, topArtist, topTrack)) ->
-> distinctBy(artist) ->
-> forEach(blackhole)
```
### Artists who are in a Country's top ten who also have Tracks in the same Country's top ten Benchmark - `zip(artists, tracks)`
Benchmarks creating two different sequences, one consisting of the top 50 Artists
(provided by [Last.fm](https://www.last.fm/api/)) of each non english speaking
country (provided by [REST Countries](https://restcountries.eu/)) and the other
the exact same thing but for the top 50 Tracks.
Then, for each Country, we take the top ten Artists and top ten Track artist's
names and zip them into a Trio. After, the top ten artists are filtered by their
presence in the top ten Track artist's name, returning a Pair with the Country
and the resulting Sequence.

Pipelines:
* Sequence of Artists:
```ignorelang
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasArtists) -> map(Pair.of(country, artists));
```

* Sequence of Tracks:
```ignorelang
sequenceOfCountries -> filter(isNonEnglishSpeaking) -> filter(hasTracks) -> map(Pair.of(country, tracks));
```
* Pipeline
```ignorelang
sequenceOfArtists + sequenceOfTracks ->
-> zip(Trio.of(country, artists, tracks)) ->
-> map(Pair.of(country, artists)) ->
-> forEach(blackhole)
```

### Flatmap and Reduce - `flatmapAndReduce(nestedSequence)`
Benchmarks the usage of the flatmap operation terminating with a reduce operation. In this case in
particular we use a nested sequence of numbers flatmap it to a regular sequence of numbers and then sum
them.

Pipeline:
```ignorelang
nestedSequence -> flatmap() -> reduce()
```

### Temperature Transitions - `collapse(sequence)`
This benchmarks the usage of a user defined operation, for this benchmark we used real data gathered from
the [World Weather Online API](https://developer.worldweatheronline.com/api/). The data is in .csv which
means it needs to be prepared, in order to do so we have two user defined operations "oddLines" which
filters any line in an odd index, and "collapse" that filters equal elements in sequence to each other
(e.g. input: 3, 5, 5, 3, 5 -> output: 3,5,3,5 )

Pipeline:
```ignorelang
lineSequence ->
-> filter("#") ->
-> skip(No Data Lines) ->
-> oddLines ->
-> map(toTemperature) ->
-> collapse() ->
-> count()
```

# Performance Comparison
The results presented here were based on the results attained from Github Actions,
they are presented in relation to IxJs's IterableX performance, due to it being the closest
to Javascript's out of the box lazy sequence which is the Iterable. For the actual results
check the [Github Actions Section](https://github.com/tinyield/ts-sequences-benchmarks/actions).

**Notes:**
* IxJs's IterableX performance is equivalent to 1, all results are presented in relation to it
* For Pipelines where collection sizes vary, only 1k results and 100k results will be
displayed, separated in a pipe format, like so:
* 1K results | 100K results

### Benchmarks with one Sequence



Benchmark
Time Complexity

Underscore
Tinyield
Sequency
Lodash
Lazy.js




All match
Linear

6.51 | 3.57
4.11 | 2.70
2.42 | 2.23
6.23 | 3.60
6.96 | 3.57


First in Beginning
Constant

0.41 | 1.51
1.71 | 1.98
1.43 | 1.98
0.31 | 1.36
1.54 | 1.96


First in Middle
Linear

2.19 | 2.40
4.41 | 2.43
2.98 | 2.25
1.51 | 2.11
6.95 | 3.05


First in End
Linear

3.39 | 2.93
4.42 | 2.72
2.55 | 2.27
2.54 | 2.60
8.35 | 3.40


Flatmap and Reduce
Linear

1.52 | 0.66
2.47 | 3.62
2.98 | 2.92
2.22 | 0.60
5.03 | 6.17


Temperature Transitions
Linear

5.24
5.74
2.61
4.48
5.86
1.00

### Benchmarks with two Sequences



Benchmark
Time Complexity

Underscore
Tinyield
Sequency
Lodash
Lazy.js




Every String
Linear

1.60 | 0.49
4.47 | 2.05
3.17 | 1.83
3.17 | 0.70
3.44 | 2.09


Every Number
Linear

1.48 | 0.40
4.40 | 3.24
3.21 | 2.64
3.53 | 0.67
3.87 | 3.17


Every Class
Linear

1.44 | 0.41
4.15 | 3.07
3.25 | 2.55
3.47 | 0.69
3.65 | 3.05


Zip Primes with Values
Linear

0.64 | 0.27
3.27 | 1.89
2.68 | 1.75
0.95 | 0.29
3.00 | 2.06


Zip Top Artist & Track(1)
Linear

0.69
1.59
2.13
0.42
2.13


Zip Artists in Top10(2)
Linear

1.12
1.90
1.55
0.67
2.70

(1) Corresponds to "Distinct Top Artist and Top Track by Country Benchmark"

(2) Corresponds to Artists who are in a Country's top ten who also have Tracks"
in the same Country's top ten Benchmark

# More
If you're interested in a similar comparison but in the Java world head over to our other
[Benchmarks Github Repository](https://github.com/tinyield/sequences-benchmarks). You can
also check a more in depth analysis in an [Article we wrote over at
DZONE](https://dzone.com/articles/bridge-the-gap-of-zip-operation).