https://github.com/biojulia/kwaymerges.jl
Implementation of k-way merge algorithms in Julia
https://github.com/biojulia/kwaymerges.jl
Last synced: 9 months ago
JSON representation
Implementation of k-way merge algorithms in Julia
- Host: GitHub
- URL: https://github.com/biojulia/kwaymerges.jl
- Owner: BioJulia
- License: mit
- Created: 2025-06-12T09:23:01.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2025-07-11T14:53:58.000Z (11 months ago)
- Last Synced: 2025-09-09T16:35:00.461Z (9 months ago)
- Language: Julia
- Size: 22.5 KB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
#
KWayMerges.jl
[](https://github.com/BioJulia/KWayMerges.jl/releases/latest)
[](https://github.com/BioJulia/KWayMerges.jl/blob/master/LICENSE)
[](https://codecov.io/gh/BioJulia/KWayMerges.jl)
Implementation of k-way merge.
This package exports the function `kway_merge`.
It constructs a `KWayMerger` - a stateful, lazy iterator of the elements in an iterator of iterators.
The elements of the inner iterators will be yielded in order, as specified by the optional ordering (default: `Forward`).
Therefore, if the inner iterators are sorted by the order, the yielded elements of the `KWayMerger` is also guaranteed to be sorted.
The primary purpose of `kway_merge` is to efficiently merge N sorted iterables into one sorted stream.
The iterator yields `@NamedTuple{from_iter::Int, value::T}`, where the value field has the next element of one of the iterators, and the from_iter field contains the 1-based index of the iterator that yielded the value:
```julia
julia> it = kway_merge([[2, 3], [1, 4]]);
julia> first(it)
(from_iter = 2, value = 1)
julia> println(map(Tuple, it))
[(1, 2), (1, 3), (2, 4)]
```
The function `peek` can be used to check the next element without advancing the iterator:
```julia
julia> it = kway_merge([1]);
julia> peek(it)
(from_iter = 1, value = 1)
julia> first(it)
(from_iter = 1, value = 1)
julia> peek(it) === nothing
true
```
## Documentation
This package's public functionality are the `kway_merge` function, the (unexported) `KWayMerger` type, and its `Base.peek` method.
See their docstrings for more details.
## Performance
When merging I iterables:
* A `KWayMerger` allocates O(I) space upon construction
* Producing each element takes O(log(I)) time
Therefore, merging I sorted iterables with N total elements using `kway_merge` takes O(N * log(I)) time.
This is similar to the O(N * log(N)) time taken for comparison-based sorts.
That's no co-incidence: One can take a list with N elements, separate it into N 1-element lists, then merge them with a kway-merge. That is a variant of merge sort.
However, compared to a comparison-based sort like quicksort, using a kway merge has the following differences:
* Usually, we have I << N, and therefore, kway merge is usually faster.
* For large I, quicksort is faster in practice because its overhead per element is smaller.
Note that Julia uses radix sort for integers, which sorts in O(N), and therefore usually beats a k-way merge.
## Contributing
We appreciate contributions from users including reporting bugs, fixing
issues, improving performance and adding new fea oftentures.
Take a look at the [contributing files](https://github.com/BioJulia/Contributing)
detailed contributor and maintainer guidelines, and code of conduct.
## Questions?
If you have a question about contributing or using BioJulia software,
come on over and chat to us on [the Julia Slack workspace](https://julialang.org/slack/),
or you can try the [Bio category of the Julia discourse site](https://discourse.julialang.org/c/domain/bio).