https://github.com/cihga39871/bibufferedstreams.jl
A fast double-buffered stream to wrap Cmd Pipe, ie. open(::Cmd)
https://github.com/cihga39871/bibufferedstreams.jl
Last synced: 2 months ago
JSON representation
A fast double-buffered stream to wrap Cmd Pipe, ie. open(::Cmd)
- Host: GitHub
- URL: https://github.com/cihga39871/bibufferedstreams.jl
- Owner: cihga39871
- License: mit
- Created: 2024-10-26T16:41:41.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-11-04T13:54:43.000Z (7 months ago)
- Last Synced: 2025-03-10T03:11:29.292Z (3 months ago)
- Language: Julia
- Homepage:
- Size: 12.7 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# BiBufferedStreams.jl
A fast double-buffered stream to wrap Cmd Pipe (eg: `open(x::Cmd)`). It fills a buffer using Cmd asyncly, and another buffer is ready to take from the main process.
In Julia v1.11.1, calling `open(x::Cmd)` is extremely slower than v1.10.4, which is why I developed this package.
## Usage
```julia
using BiBufferedStreamsbam = "/path/to/a/bam/file"
io = BiBufferedStream(open(`samtools view -h $bam`))while !eof(io)
line = readline(io)
# do something
endclose(io)
```## Benchmark
It was tested by reading a bam file (114 MB, 1_000_002 lines). After precompilation, the benchmark function runs 3 times. Scripts are followed by results.
### Results
In Julia v1.11.1, `BiBufferedStream` is ***6.73X*** faster than `open`: 1.41s vs. 9.50s.
In Julia v1.10.4, `BiBufferedStream` is ***1.26X*** faster than `open`: 1.74s vs. 2.19s.
#### Detail `@time` outputs:
Julia v1.11.1:
- `BiBufferedStream(open(cmd))`
- 1.476861 seconds (3.06 M allocations: 815.326 MiB, 7.63% gc time)
- 1.397820 seconds (3.06 M allocations: 815.330 MiB, 7.38% gc time)
- 1.379766 seconds (3.06 M allocations: 815.340 MiB, 7.47% gc time)- `open(cmd)`
- 9.370884 seconds (9.84 M allocations: 1.516 GiB, 0.64% gc time)
- 9.574880 seconds (9.84 M allocations: 1.516 GiB, 0.74% gc time)
- 9.752019 seconds (9.84 M allocations: 1.516 GiB, 0.86% gc time)Julia v1.10.4:
- `BiBufferedStream(open(cmd))`
- 1.741036 seconds (3.06 M allocations: 847.324 MiB, 9.60% gc time)
- 1.782255 seconds (3.06 M allocations: 847.315 MiB, 9.59% gc time)
- 1.681973 seconds (3.06 M allocations: 847.310 MiB, 10.87% gc time)- `open(cmd)`
- 2.187896 seconds (5.00 M allocations: 1.400 GiB, 15.49% gc time)
- 2.186295 seconds (5.00 M allocations: 1.400 GiB, 15.39% gc time)
- 2.194213 seconds (5.00 M allocations: 1.400 GiB, 15.76% gc time)### Script
```julia
using BiBufferedStreamsbam = "/path/to/a/bam/file"
function bench_readline(cmd; use_bi_buffered_stream=true)
if use_bi_buffered_stream
io = BiBufferedStream(open(cmd))
else
io = open(cmd)
end
c = 0
@time while !eof(io)
line = readline(io)
c += length(line)
end
close(io)
c
endcmd = `samtools view -@ 10 -h $bam`
bench_readline(cmd; use_bi_buffered_stream=true)
bench_readline(cmd; use_bi_buffered_stream=true)
bench_readline(cmd; use_bi_buffered_stream=true)
bench_readline(cmd; use_bi_buffered_stream=true)bench_readline(cmd; use_bi_buffered_stream=false)
bench_readline(cmd; use_bi_buffered_stream=false)
bench_readline(cmd; use_bi_buffered_stream=false)
bench_readline(cmd; use_bi_buffered_stream=false)
```## API
Currently, the following API is available.
```julia
BiBufferedStream(io::IO)readline(io::BiBufferedStream; keep::Bool = false)
readuntil(io::BiBufferedStream, delim::Union{Char,UInt8}; keep::Bool = false)
eof(io::BiBufferedStream)
close(io::BiBufferedStream)
```