{"id":19016186,"url":"https://github.com/fosskers/scala-benchmarks","last_synced_at":"2025-04-23T02:40:45.698Z","repository":{"id":66325041,"uuid":"108156233","full_name":"fosskers/scala-benchmarks","owner":"fosskers","description":"An independent set of benchmarks for testing common Scala idioms.","archived":false,"fork":false,"pushed_at":"2019-04-30T03:40:16.000Z","size":63,"stargazers_count":65,"open_issues_count":3,"forks_count":11,"subscribers_count":8,"default_branch":"master","last_synced_at":"2025-03-29T21:21:00.226Z","etag":null,"topics":["benchmarks","jmh","scala"],"latest_commit_sha":null,"homepage":null,"language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fosskers.png","metadata":{"files":{"readme":"README.org","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-10-24T16:48:10.000Z","updated_at":"2022-06-12T10:53:27.000Z","dependencies_parsed_at":"2023-03-05T00:15:34.405Z","dependency_job_id":null,"html_url":"https://github.com/fosskers/scala-benchmarks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fosskers%2Fscala-benchmarks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fosskers%2Fscala-benchmarks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fosskers%2Fscala-benchmarks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fosskers%2Fscala-benchmarks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fos
skers","download_url":"https://codeload.github.com/fosskers/scala-benchmarks/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249318969,"owners_count":21250444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmarks","jmh","scala"],"created_at":"2024-11-08T19:41:34.358Z","updated_at":"2025-04-17T05:31:39.903Z","avatar_url":"https://github.com/fosskers.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"#+TITLE: Scala Benchmarks\n#+AUTHOR: Colin\n\nAn independent set of benchmarks for testing common Scala idioms.\n\n* Table of Contents                     :TOC_4_gh:noexport:\n- [[#functional-programming][Functional Programming]]\n  - [[#folding][Folding]]\n  - [[#chained-higher-order-functions][Chained Higher-Order Functions]]\n  - [[#concatenation][Concatenation]]\n- [[#mutable-data][Mutable Data]]\n  - [[#list-ilist-and-array][List, IList and Array]]\n  - [[#builder-classes][Builder Classes]]\n  - [[#mutable-set-and-javas-concurrenthashmap][Mutable Set and Java's ConcurrentHashMap]]\n- [[#pattern-matching][Pattern Matching]]\n  - [[#deconstructing-containers][Deconstructing Containers]]\n  - [[#guard-patterns][Guard Patterns]]\n\n* Functional Programming\n\n** Folding\n\nWe often want to collapse a collection into some summary of its elements.\nThis is known as a /fold/, a /reduce/, or a /catamorphism/:\n\n#+BEGIN_SRC scala\n  List(1,2,3).foldLeft(0)(_ + _)  // 6\n#+END_SRC\n\nHow fast is this operation in the face of the JVM's ~while~ and mutable\nvariables? 
For instance the familiar, manual, and error-prone:\n\n#+BEGIN_SRC scala\n  var n: Int = 0\n  var i: Int = coll.length - 1\n\n  while (i \u003e= 0) {\n    n += coll(i)\n    i -= 1\n  }\n#+END_SRC\n\n~FoldBench~ compares ~List~, ~scalaz.IList~, ~Vector~, ~Array~, ~Stream~, and\n~Iterator~ for their speeds in various fold operations over Ints.\n~FoldClassBench~ tries these same operations over a simple wrapping class to see\nhow boxing/references affect things.\n\n~Int~ Results:\n\n/All times are in microseconds. Entries marked with an asterisk are sped up by\noptimization flags. Entries marked with two are slowed down by them./\n\n| Benchmark      |   List | IList | Vector | Array |         Stream | EStream        | Iterator |\n|----------------+--------+-------+--------+-------+----------------+----------------+----------|\n| ~foldLeft~     | 44.1** |  31.3 |   63.5 | 34.0* |           56.9 | 180.3**        |     55.4 |\n| ~foldRight~    |   69.2 |  81.9 | 137.9* | 36.3* | Stack Overflow | Stack Overflow |    147.6 |\n| Tail Recursion |   45.9 |  24.1 |        |       |           69.8 |                |          |\n| ~sum~          |   76.9 |       |   71.0 | 79.0  |           74.7 |                |          |\n| ~while~        |   44.0 |       |   38.4 | 3.0   |           52.9 |                |     45.4 |\n\n~Pair~ Class Results:\n\n/All times are in microseconds./\n\n| Benchmark      | List | IList | Vector | Array | Stream         | Iterator |\n|----------------+------+-------+--------+-------+----------------+----------|\n| ~foldLeft~     | 39.5 |  37.5 |   70.2 |  39.9 | 68.2           |     65.8 |\n| ~foldRight~    | 83.6 |  98.1 |  242.1 |  38.8 | Stack Overflow |    157.3 |\n| Tail Recursion | 39.2 |  37.9 |        |       | 118.6**        |          |\n| ~while~        | 39.3 |       |   57.8 |  36.2 | 70.1           |     39.2 |\n\nObservations:\n\n- ~foldLeft~ is always better than both ~foldRight~ and manual tail recursion for\n  catamorphisms 
(reduction to a single value).\n- ~sum~ should be avoided.\n- ~Iterator~ benefits from ~while~, but not enough to beat ~List~.\n- Collections with random access (especially ~Array~) benefit from ~while~\n  loops.\n- *Array has no advantage over List when holding non-primitive types!*\n\nRecommendation:\n\n#+BEGIN_QUOTE\n~List.foldLeft~ is concise and performant for both primitive and boxed types.\nIf you were already dealing with an ~Array[Int]~ or likewise, then a ~while~\nloop will be faster.\n#+END_QUOTE\n\n** Chained Higher-Order Functions\n\nIt's common to string together multiple operations over a collection, say:\n\n#+BEGIN_SRC scala\n  List(1,2,3,4).map(foo).filter(pred).map(bar)\n#+END_SRC\n\nwhich is certainly shorter and cleaner in its intent than manually manipulating\na mutable collection in a ~while~ loop. Are higher-order operations like these\nstill fast? People used to Haskell's list fusion might point out that these\noperations typically don't fuse in Scala, meaning that each chained operation\nfully iterates over the entire collection and allocates a new copy. ~Stream~ and\n~Iterator~ are supposed to be the exceptions, however.\n\n~Stream~ in particular is what people wanting Haskell's lazy lists may reach for\nfirst, on the claim that the elements memoize, chained operations fuse, and they\nsupport infinite streams of values. Let's see how everything performs.\n\n~StreamBench~ performs the following operations on ~List~, ~scalaz.IList~,\n~Vector~, ~Array~, ~Stream~, ~scalaz.EphemeralStream~ and ~Iterator~. We test:\n\n- /Head/: map-filter-map-head. Which collections \"short-circuit\", only\n  fully processing the head and nothing else?\n- /Max/: map-filter-map-max. How quickly can each collection fully process itself?\n  Does fusion occur (esp. with ~Stream~)?\n- /Reverse/: reverse-head. Can any of the collections \"cheat\" and grab the last\n  element quickly?\n- /Sort/: map-filter-map-sorted-head. 
Does ~Stream~ still leverage laziness with\n  a \"mangling\" operation like sort?\n\nResults:\n\n/All times are in microseconds./\n\n| Benchmark |  List | IList | Vector | Array | Stream | EStream | Iterator |\n|-----------+-------+-------+--------+-------+--------+---------+----------|\n| Head      | 182.3 | 273.2 |  133.2 | 206.3 |  0.065 |    0.17 |    0.023 |\n| Max       | 198.9 | 401.7 |  263.5 | 192.7 |  863.7 |  1714.4 |    139.7 |\n| Reverse   |  37.8 |  49.2 |  146.7 |  45.6 |  371.6 |   448.5 |          |\n| Sort      | 327.5 | 607.6 |  277.8 | 289.4 | 1482.8 |         |          |\n\nObservations:\n\n- ~Stream~ won't do work it doesn't have to, as advertised (re: /Head/).\n- ~Stream~ is very slow to fully evaluate, implying no operation fusion.\n  Nothing clever happens with sorting.\n- ~Iterator~ overall is the fastest collection to chain higher-order\n  functions.\n- ~List~ has the fastest ~reverse~.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nIf you want to chain higher-order operations in Scala, use an ~Iterator~.\nIf you have something like a ~List~ instead, create an ~Iterator~ first\nwith ~.iterator~ before you chain.\n#+END_QUOTE\n\n** Concatenation\n\nSometimes we need to merge two instances of a container together, end-to-end.\nThis is embodied by the classic operator ~++~, available for all the major\ncollection types.\n\nWe know that the collection types are implemented differently. Are some better\nthan others when it comes to ~++~? For instance, we might imagine that the\nsingly-linked ~List~ type would be quite bad at this. The lazy ~Stream~ types\nshould be instantaneous.\n\n~ConcatBench~ tests ~List~, ~scalaz.IList~, ~Array~, ~Vector~, ~Stream~, and\n~scalaz.EphemeralStream~ for their performance with the ~++~ operator. Two\nresults are offered for ~Array~: one with ~Int~ and one for a simple ~Pair~\nclass, to see if primitive Arrays can somehow be optimized here by the JVM, as\nthey usually are. 
Otherwise, the results are all for collections of ~Int~.\n\n/All times are in microseconds./\n\n| Item Count | ~List~ | ~IList~ | ~Vector~ | ~Array[Int]~ | ~Array[Pair]~ | ~Stream~ | ~EStream~ |\n|------------+--------+---------+----------+--------------+---------------+----------+-----------|\n| 1,000      |     14 |      10 |       17 |          0.6 |           0.7 |     0.02 |      0.02 |\n| 10,000     |    117 |      78 |      147 |            7 |             7 |     0.02 |      0.02 |\n| 100,000    |    931 |     993 |     1209 |           75 |            77 |     0.02 |      0.02 |\n| 1,000,000  |   8506 |   10101 |    10958 |         1777 |          1314 |     0.02 |      0.02 |\n\nObservations:\n\n- The ~Stream~ types were instantaneous, as expected.\n- ~Array~ is quick! Somehow quicker for classes, though.\n- The drastic slowdown for ~Array~ at the millions-of-elements scale is strange.\n- ~IList~ beats ~List~ until millions-of-elements scale.\n- ~Vector~ has no advantage here, despite rumours to the contrary.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nIf your algorithm requires concatenation of large collections, use ~Array~.\nIf you're worried about passing a mutable collection around your API, consider\n~scalaz.ImmutableArray~, a simple wrapper that prevents careless misuse.\n#+END_QUOTE\n\n* Mutable Data\n\n** List, IList and Array\n\nAbove we saw that ~List~ performs strongly against ~Array~ when it comes to\nchaining multiple higher-order functions together. What happens when we just\nneed to make a single transformation pass over our collection - in other words,\na ~.map~? ~Array~ with a ~while~ loop is supposed to be the fastest iterating\noperation on the JVM. 
Can ~List~ and ~IList~ still stand up to it?\n\n~MapBench~ compares these operations over increasingly larger collection sizes of\nboth ~Int~ and a simple wrapper class.\n\nResults:\n\n/All times are in microseconds./\n\n| Benchmark     | ~List.map~ | ~IList.map~ | ~Array~ + ~while~ |\n|---------------+------------+-------------+-------------------|\n| 100 Ints      |       0.77 |         1.1 |              0.05 |\n| 1000 Ints     |        7.8 |        10.9 |              0.45 |\n| 10000 Ints    |       71.6 |        99.9 |               3.7 |\n|---------------+------------+-------------+-------------------|\n| 100 Classes   |       0.83 |        1.3  |               0.4 |\n| 1000 Classes  |        8.6 |        12.9 |               4.3 |\n| 10000 Classes |       81.3 |       111.2 |              43.1 |\n\nObservations:\n\n- For ~List~, there isn't too much difference between Ints and classes.\n- ~Array~ is fast to do a single-pass iteration.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nIf your code involves ~Array~, primitives, and simple single-pass\ntransformations, then ~while~ loops will be fast for you. Otherwise, your code\nwill be cleaner and comparatively performant if you stick to immutable\ncollections and chained higher-order functions.\n#+END_QUOTE\n\n** Builder Classes\n\nYou want to build up a new collection, perhaps iterating over an existing one,\nperhaps from some live, dynamic process. For whatever reason ~.map~ and\n~.foldLeft~ are not an option. Which collection is best for this? ~VectorBench~\ntests how fast each of ~List~, ~scalaz.IList~, ~ListBuffer~, ~Vector~,\n~VectorBuilder~, ~Array~, ~ArrayBuilder~, and ~IndexedSeq~ can create themselves\nand accumulate values. For ~List~, this is done with tail recursion. For\n~IndexedSeq~, this is done via a naive for-comprehension. For all others, this\nis done with ~while~ loops. 
The ~Buffer~ and ~Builder~ classes perform a\n~.result~ call at the end of iterating to take their non-builder forms (i.e.\n~VectorBuilder =\u003e Vector~). ~ArrayBuilder~ is given an overshot size hint (with\n~.sizeHint~) in order to realistically minimize inner ~Array~ copying.\n\nResults:\n\n/All times are in microseconds./\n\n| Benchmark      | ~List~ | ~IList~ | ~ListBuffer~ | ~Vector~ | ~VectorBuilder~ | ~Array~ | ~ArrayBuilder~ | ~IndexedSeq~ |\n|----------------+--------+---------+--------------+----------+-----------------+---------+----------------+--------------|\n| 1000 Ints      |    5.7 |     5.5 |          5.5 |     20.8 |             6.6 |     0.6 |            1.1 |          5.9 |\n| 10000 Ints     |   60.2 |    57.1 |         57.9 |    206.1 |            39.0 |     5.3 |           11.4 |         61.4 |\n| 100000 Ints    |  545.1 |   529.1 |        551.6 |   2091.2 |           384.3 |    53.3 |          121.3 |        615.3 |\n| 1000 Classes   |    6.2 |     6.2 |          7.2 |     21.5 |             6.3 |     3.8 |            4.9 |          6.4 |\n| 10000 Classes  |   64.4 |    62.4 |         68.5 |    214.3 |            44.7 |    41.4 |           53.1 |         65.4 |\n| 100000 Classes |  592.0 |   600.3 |        611.6 |   2164.7 |           429.4 |   357.0 |          523.5 |        653.3 |\n\nObservations:\n\n- For primitives, ~Array~ is king.\n- *Avoid appending to immutable Vectors.*\n- *Avoid repeated use of ListBuffer.prepend!* Your runtime will slow by an order of magnitude vs ~+=:~.\n- For classes, at small scales (~1000 elements) there is mostly no difference between\n  the various approaches.\n- ~ArrayBuilder~ can be useful if you're able to ballpark what the final result size will be.\n- ~VectorBuilder~ fulfills the promise of Builders, but can only append to the right.\n  You'd have to deal with the fact that your elements are reversed.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nThe best choice here depends on what your next step is.\n\nIf 
you plan to perform ~while~ -based numeric calculations over primitives only,\nstick to ~Array~. If using ~ArrayBuilder~ with primitives, avoid the ~.make~\nmethod. Use something like ~.ofInt~ instead. Also make sure that you use\n~.sizeHint~ to avoid redundant inner ~Array~ copying as your collection grows.\nFailing to do so can introduce a 5x slowdown.\n\nOtherwise, consider whether your algorithm can't be reexpressed entirely in\nterms of ~Iterator~. This will always give the best performance for subsequent\nchained, higher-order functions.\n\nIf the algorithm can't be expressed in terms of ~Iterator~ from the get-go, try\nbuilding your collection with ~VectorBuilder~, call ~.iterator~ once filled,\nthen continue.\n#+END_QUOTE\n\n** Mutable Set and Java's ConcurrentHashMap\n\nYou'd like to build up a unique set of values and for some reason calling\n~.toSet~ on your original collection isn't enough. Perhaps you don't have an\noriginal collection. Scala's collections have been criticized for their\nperformance, with one famous complaint describing how a team had to fall back to\nusing Java collection types entirely because the Scala ones couldn't compare\n(that was for Scala 2.8, mind you).\n\nIs this true? 
~UniquesBench~ compares Scala's mutable and immutable\n~Set~ types with Java's ~ConcurrentHashMap~ to see which can accumulate unique\nvalues faster.\n\nResults:\n\n/All values are in microseconds./\n\n| Benchmark    | ~mutable.Set~ | ~immutable.Set~ | Java ~ConcurrentHashMap~ |\n|--------------+---------------+-----------------+--------------------------|\n| 100 values   |           4.6 |             7.7 |                      6.1 |\n| 1000 values  |          62.2 |           107.4 |                     71.3 |\n| 10000 values |        811.1* |          1290.4 |                    777.1 |\n\n*Note*: About half the time the 10000-value benchmark for ~mutable.Set~\noptimizes down to ~600us instead of the ~800us shown in the chart.\n\nObservations:\n\n- ~mutable.Set~ is fastest at least for small amounts of data, and /might/ be\n  fastest at scale.\n- ~immutable.Set~ is slower and has worse growth, as expected.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nFirst consider whether your algorithm can't be rewritten in terms of the usual\nFP idioms, followed by a ~.toSet~ call to make the collection unique.\n\nIf that isn't possible, then trust in the performance of native Scala\ncollections and use ~mutable.Set~.\n#+END_QUOTE\n\n* Pattern Matching\n\n** Deconstructing Containers\n\nIt's common to deconstruct containers like this in recursive algorithms:\n\n#+BEGIN_SRC scala\n  def safeHead[A](s: Seq[A]): Option[A] = s match {\n    case Seq() =\u003e None\n    case h +: _ =\u003e Some(h)\n  }\n#+END_SRC\n\nBut ~List~ and ~Stream~ have special \"cons\" operators, namely ~::~ and ~#::~\nrespectively. The ~List~ version of the above looks like:\n\n#+BEGIN_SRC scala\n  def safeHead[A](l: List[A]): Option[A] = l match {\n    case Nil =\u003e None\n    case h :: _ =\u003e Some(h)\n  }\n#+END_SRC\n\nHow do these operators compare? 
Also, is it any slower to do it this way than a\nmore Java-like:\n\n#+BEGIN_SRC scala\n  def safeHead[A](l: List[A]): Option[A] =\n    if (l.isEmpty) None else Some(l.head)\n#+END_SRC\n\nThe ~MatchContainersBench~ benchmarks use a tail-recursive algorithm to find the\nlast element of each of ~List~, ~scalaz.IList~, ~Vector~, ~Array~, ~Seq~, and\n~Stream~.\n\nResults:\n\n/All times are in microseconds./\n\n| Benchmark       | List | IList | Vector |   Seq |   Array | Stream |\n|-----------------+------+-------+--------+-------+---------+--------|\n| ~::~ Matching   | 42.8 | 23.6  |        |       |         |  168.4 |\n| ~+:~ Matching   | 79.0 |       | 1647.5 | 707.4 |         |  170.2 |\n| ~if~ Statements | 39.9 |       |  816.9 |  39.4 | 16020.6 |   55.8 |\n\nObservations:\n\n- Canonical ~List~ and ~IList~ matching is /fast/.\n- ~Seq~ matching with ~+:~, its canonical operator, is ironically slow.\n- Pattern matching with ~+:~ should be avoided in general.\n- ~if~ is generally faster than pattern matching, but the code isn't as nice.\n- Avoid recursion with ~Vector~ and ~Array~!\n- ~Array.tail~ is pure evil. Each call incurs ~ArrayOps~ wrapping and\n  seems to reallocate the entire ~Array~. ~Vector.tail~ incurs a similar\n  slowdown, but not as drastically.\n\nRecommendation:\n\n#+BEGIN_QUOTE\nRecursion involving containers should be done with ~List~ and pattern matching\nfor the best balance of speed and simplicity. If you can take ~scalaz~ as a\ndependency, its ~IList~ will be even faster.\n#+END_QUOTE\n\n** Guard Patterns\n\nIt can sometimes be cleaner to check multiple ~Boolean~ conditions using a ~match~:\n\n#+BEGIN_SRC scala\n  def foo(i: Int): Whatever = i match {\n    case _ if bar(i) =\u003e ???\n    case _ if baz(i) =\u003e ???\n    case _ if zoo(i) =\u003e ???\n    case _ =\u003e someDefault\n  }\n#+END_SRC\n\nwhere we don't really care about the pattern match, just the guard. 
This is in\ncontrast to ~if~ branches:\n\n#+BEGIN_SRC scala\n  def foo(i: Int): Whatever = {\n    if (bar(i)) ???\n    else if (baz(i)) ???\n    else if (zoo(i)) ???\n    else someDefault\n  }\n#+END_SRC\n\nwhich of course would often be made more verbose by many ~{}~ pairs. Are we\npunished for the empty pattern matches? ~MatchBench~ tests this, with various\nnumbers of branches.\n\nResults:\n\n/All times are in nanoseconds./\n\n| Benchmark    | Guards | Ifs |\n|--------------+--------+-----|\n| 1 Condition  |    3.3 | 3.3 |\n| 2 Conditions |    3.6 | 3.6 |\n| 3 Conditions |    3.9 | 3.9 |\n\nIdentical! Feel free to use whichever you think is cleaner.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffosskers%2Fscala-benchmarks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffosskers%2Fscala-benchmarks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffosskers%2Fscala-benchmarks/lists"}