{"id":22912023,"url":"https://github.com/opencoff/fastdd","last_synced_at":"2025-05-09T01:33:40.424Z","repository":{"id":75756955,"uuid":"190689312","full_name":"opencoff/fastdd","owner":"opencoff","description":"Fast \"dd\" implementation that leverages Linux's splice(2)","archived":false,"fork":false,"pushed_at":"2022-03-20T21:45:44.000Z","size":73,"stargazers_count":18,"open_issues_count":2,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-31T20:39:14.430Z","etag":null,"topics":["dd","disk-dump","multithreaded-copy","splice"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/opencoff.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-06-07T04:46:35.000Z","updated_at":"2024-09-13T23:05:55.000Z","dependencies_parsed_at":"2023-06-07T14:45:46.983Z","dependency_job_id":null,"html_url":"https://github.com/opencoff/fastdd","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencoff%2Ffastdd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencoff%2Ffastdd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencoff%2Ffastdd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/opencoff%2Ffastdd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/opencoff","download_url":"https://codeload.github.com/opencoff/
fastdd/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253174390,"owners_count":21865856,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dd","disk-dump","multithreaded-copy","splice"],"created_at":"2024-12-14T04:19:35.630Z","updated_at":"2025-05-09T01:33:40.409Z","avatar_url":"https://github.com/opencoff.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# What is `fastdd`\n`fastdd` is a performance-enhanced, simpler implementation of `dd`.\nOn Linux, it uses `splice(2)` to avoid copying data to/from user space.\nOn other platforms, it uses multiple threads and larger block sizes\nto speed up I/O.\n\n`fastdd` doesn't try to emulate all the behavior of `dd`; thus, the\nnotion of \"blocks\" (and \"block size\") is used only as a multiplier\nfor the number of bytes to move. 
The I/O size is picked based on the\nsource and destination (default is 64k).\n\n# What part of `dd` does it implement?\nIt supports only a few of the `dd` options:\n\n * bs=N\n * count=N\n * skip=N     -- skip N input blocks\n * seek=N     -- skip N output blocks before first write\n * if=FILE\n * of=FILE\n * iflag=nonblock\n * oflag=nonblock,excl,sync\n\n Each of the integer arguments `N` can have an optional suffix of\n `k`, `M`, `G`, `T` or `P` for kilo-, mega-, giga-, tera- or petabytes\n respectively (**multiples of 1024**).\n\nThe source and destination can be a file, pipe, character-device or\nblock-device.\n\n# Building and Installing `fastdd`\n`fastdd` is currently designed for Linux, OpenBSD and macOS;\nthe makefile therefore supports only those three OSes.\n\nYou need `GNU make` and `gcc`. There is no `configure` mess; the\ncode is organized to be generic enough to build on modern POSIX\nsystems (depends on pthreads):\n\nOn Darwin and OpenBSD:\n\n    $ gmake # or gnumake\n\nOn Linux:\n\n    $ make\n\nBy default, the makefile generates a \"release\" binary (with full\noptimization). The build places the `fastdd` binary\nin an OS-specific directory:\n\n* Linux: `Linux-rel`\n* macOS: `Darwin-rel`\n* OpenBSD: `OpenBSD-rel`\n\n## Installing `fastdd`\nFastdd can be installed in any location:\n\n1. System default: `sudo make install DESTDIR=/usr`\n2. Local install: `sudo make install DESTDIR=/usr/local`\n3. Home dir: `make install DESTDIR=$HOME`\n\nIn each case, the `fastdd` binary goes in `$DESTDIR/bin`.\n\n## Using `fastdd`\nFastdd is written to be usable without having to remember\ncomplicated flags. 
You only need to know \"if=\" and \"of=\".\nFor the most common use case of copying from a file to a block\ndevice, you don't need to specify `bs=` or `count=` arguments;\n`fastdd` can infer the input size and use a platform-optimal block\nsize for copying.\n\ne.g., if your USB device on Linux is `/dev/sde`, then you can\nwrite a bootable ISO image to it like so:\n\n    fastdd if=my.iso of=/dev/sde\n\nThat's it.  If you do forget what flags to use, try:\n\n    fastdd --help\n\nUnless the `--quiet` option is used, `fastdd` prints a progress bar\nto show its incremental progress. When the input size is known, the\nprogress bar is \"rich\" (it shows progress \u0026 completion %). When the\ninput size is unknown (e.g., from a pipe), the progress bar is simple,\nshowing only the number of bytes written. In either case, the sizes are\nhuman friendly (kB, MB, etc.).\n\n# Performance Numbers\nAnecdotally, on OpenBSD and Darwin, the multi-threaded version seems\nto be faster than the native `dd`. On Linux, the version with\n`splice(2)` seems to be faster than `dd`.\n\n**TODO**: One of these days, I will sit and write a repeatable benchmark\nscript.\n\n\n# Developer Notes\nOn Linux, `fastdd` is single-threaded and uses `splice(2)` to\nmove data within the kernel, avoiding all user-space reads. If\nneither source nor destination is a pipe (socket), `fastdd`\ncreates an intermediate pipe to splice the data.\n\nOn other platforms, `fastdd` is multi-threaded and uses a separate\nread thread to gather I/O blocks. The reader and writer communicate\nvia a producer-consumer queue.\n\nIn both cases, I/O (`splice(2)` or `read(2)`) is done in units of\n`iosize` (command line parameter).\n\n## Testing \u0026 Test Framework\nThere are two test harnesses:\n\n1. `basic-tests.sh`: This is a simple set of test cases to cover basic\n   functionality, command-line flags, `oflag` combinations etc.\n\n2. `tests.sh`: This is a data-file test driver that reads from\n   `tests.in` and runs each test in turn. 
Each test consists of\n   generating some input, using `fastdd` to move the data from source\n   to destination, repeating the same data movement using `dd` and\n   comparing checksums of the resulting output.\n\n### Using tests.sh\nNew tests can be added to `tests.in` or to a separate input file. The\nformat of the input file is documented in `tests.in`.\n\nRunning the data-driven tests is simple:\n\n    ./tests.sh tests.inp\n\n## Debug builds\nYou can build a debug version of the program:\n\n    make DEBUG=1 -j5\n\nThe build artifacts will be in the directory `$OS-dbg`.\n\n## Guide to Source Code\n\n* fastdd.c - `main()` for `fastdd`.\n\n* args.c - `fastdd` command line parsing (key=value). It uses a\n  table-driven approach to parse the values directly into a struct\n  instance (uses `offsetof()`).\n\n* blksize_darwin.c - Implementation of `Blksize()` for Mac OS\n  (tested on 10.11 -- 10.14)\n\n* blksize_linux.c - Implementation of `Blksize()` for Linux (tested\n  on Linux 4.18)\n\n* blksize_openbsd.c - Implementation of `Blksize()` for OpenBSD\n  (tested on 6.5).\n\n* copy_linux.c - Implementation of `Copy()` for Linux using\n  `splice(2)`.\n\n* copy_posix.c - Implementation of `Copy()` using pthreads for\n  non-Linux platforms (tested only on Darwin and OpenBSD).\n\n* disksize.c - Small test program to call `Blksize()` and print the\n  resulting disk size.\n\n* utils.c - I/O utility functions.\n\n* opts.c - Auto-generated file for parsing long and short options;\n  the command line options are in opts.in. The code uses standard\n  `getopt_long()` but removes the tedium of having to write the\n  option processing by hand.\n\nThere is a separate directory `portable/` that is a partial copy of\nan [external repository](https://github.com/opencoff/portable-lib). This\nsubdirectory contains only the files needed to build `fastdd`. 
Some\nfunctionality from that lib:\n\n    * Generic (type-safe) lists, producer-consumer queue\n    * Implementation of POSIX semaphores for Darwin\n    * functions to convert \"sizes\" to strings and vice versa (a size\n      string is a numeric string with a suffix of `[kKMGTPE]`).\n\n## Adding support for other OSes\nThe easiest way to add support for other POSIX OSes (FreeBSD,\nDragonFlyBSD etc.) is:\n\n* Implement `Blksize()` for that OS, following the existing implementations\n  in *blksize_darwin.c*, *blksize_openbsd.c* etc.\n\n* Make changes to GNUmakefile:\n\n    *  Add the OS specific object files to its var\n    *  Add any necessary LD libs to its var\n\ne.g., for some POSIX OS \"foo\":\n\n    * write code for *blksize_foo.c*\n    * *GNUmakefile* changes:\n       1. `foo_objs = blksize_foo.o copy_posix.o`\n       2. `foo_LIBS =`\n\n\nThis will give you a working version that uses pthreads for I/O. If\nyour OS supports `splice(2)`-like functionality, you have to write\nthe fast-path code for `Copy()` and put it in *copy_$OS.c*.\n\n## TODO\n* Benchmark suite to measure performance on supported platforms\n* For non-Linux platforms, is `mmap(2)` for source and/or\n  destination worth it?\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencoff%2Ffastdd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopencoff%2Ffastdd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopencoff%2Ffastdd/lists"}