{"id":13768060,"url":"https://github.com/kristapsdz/openrsync","last_synced_at":"2025-05-16T09:02:35.820Z","repository":{"id":34122035,"uuid":"166116085","full_name":"kristapsdz/openrsync","owner":"kristapsdz","description":"BSD-licensed implementation of rsync","archived":false,"fork":false,"pushed_at":"2025-01-27T04:36:46.000Z","size":682,"stargazers_count":464,"open_issues_count":4,"forks_count":28,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-04-14T00:52:03.105Z","etag":null,"topics":["rsync"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"isc","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kristapsdz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2019-01-16T21:38:50.000Z","updated_at":"2025-04-13T02:11:03.000Z","dependencies_parsed_at":"2025-04-14T00:48:32.871Z","dependency_job_id":"9523144e-d26b-4b45-b916-c6b4e83ae7ff","html_url":"https://github.com/kristapsdz/openrsync","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristapsdz%2Fopenrsync","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristapsdz%2Fopenrsync/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristapsdz%2Fopenrsync/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kristapsdz%2Fopenrsync/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kristapsdz","download_url":"https://c
odeload.github.com/kristapsdz/openrsync/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254501548,"owners_count":22081526,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["rsync"],"created_at":"2024-08-03T16:01:16.079Z","updated_at":"2025-05-16T09:02:35.784Z","avatar_url":"https://github.com/kristapsdz.png","language":"C","readme":"# Introduction\n\n**This system has been merged into OpenBSD base.  If you'd like to\ncontribute to openrsync, please mail your patches to tech@openbsd.org.\nThis repository is simply the OpenBSD version plus some glue for\nportability.**\n\nThis is an implementation of [rsync](https://rsync.samba.org/) with a\nBSD (ISC) license.  It's compatible with a modern rsync (3.1.3 is used\nfor testing, but any supporting protocol 27 will do), but accepts only a\nsubset of rsync's command-line arguments.\n\nIts officially-supported operating system is OpenBSD, but it will\ncompile and run on other UNIX systems.  See [Portability](#Portability)\nfor details.\n\nThe canonical documentation for openrsync is its manual pages.  
See\n[rsync(5)](https://github.com/kristapsdz/openrsync/blob/master/rsync.5)\nand\n[rsyncd(5)](https://github.com/kristapsdz/openrsync/blob/master/rsyncd.5)\nfor protocol details, or the utility documentation in\n[openrsync(1)](https://github.com/kristapsdz/openrsync/blob/master/openrsync.1).\nIf you'd like to write your own rsync implementation, the protocol\nmanpages should have all the information required.\n\nThe [Architecture](#Architecture) and [Algorithm](#Algorithm) sections\non this page serve to introduce developers to the source code.  They are\nnon-canonical.\n\n## Project background\n\nopenrsync was written as part of the\n[rpki-client(1)](https://medium.com/@jobsnijders/a-proposal-for-a-new-rpki-validator-openbsd-rpki-client-1-15b74e7a3f65)\nproject, an\n[RPKI](https://en.wikipedia.org/wiki/Resource_Public_Key_Infrastructure)\nvalidator for OpenBSD.  openrsync was funded by\n[NetNod](https://www.netnod.se), [IIS.SE](https://www.iis.se),\n[SUNET](https://www.sunet.se) and [6connect](https://www.6connect.com).\n\n# Installation\n\nOn an up-to-date UNIX system, simply download and run:\n\n```\n% ./configure\n% make\n# make install\n```\n\nThis will install the openrsync utility and manual pages.\nIt's ok to have an installation of rsync at the same time: the two will\nnot collide in any way.\n\nIf you upgrade your sources and want to re-install, just run the same\ncommands again.\nIf you'd like to uninstall openrsync:\n\n```\n# make uninstall\n```\n\nIf you'd like to interact with openrsync as a server, you can run\nthe following:\n\n```\n% rsync --rsync-path=openrsync src/* dst\n% openrsync --rsync-path=openrsync src/* dst\n```\n\nIf you'd like openrsync and rsync to interact, it's important to use\ncommand-line flags available on both.\nSee\n[openrsync(1)](https://github.com/kristapsdz/openrsync/blob/master/openrsync.1)\nfor a listing.\n\n# Algorithm\n\nFor a robust description of the rsync algorithm, see \"[The 
rsync\nalgorithm](https://rsync.samba.org/tech_report/)\", by Andrew Tridgell\nand Paul Mackerras.\nAndrew Tridgell's PhD thesis, \"[Efficient Algorithms for Sorting and\nSynchronization](https://www.samba.org/~tridge/phd_thesis.pdf)\", covers the\ntopics in more detail.\nThis section gives a description suitable for delving into the source code.\n\nThe rsync algorithm has two components: the *sender* and the *receiver*.\nThe sender manages source files; the receiver manages the destination.\nIn the first of the following invocations, the sender is host *remote* and the\nreceiver is the localhost; in the second, the roles are reversed.\n\n```\n% openrsync -lrtp remote:foo/bar ~/baz/xyzzy\n% openrsync -lrtp ~/foo/bar remote:baz/xyzzy\n```\n\nThe algorithm hinges upon a file list of names and metadata (e.g., mode\nand mtime) shared between components.\nThe file list describes all source files of the update and is generated\nby the sender.\nThe sharing is implemented in\n[flist.c](https://github.com/kristapsdz/openrsync/blob/master/flist.c).\n\nAfter sharing this list, both the receiver and sender independently sort\nthe entries by the filenames' lexicographical order.\nThis allows the file list to be sent and received out of order.\nThe ordering preserves a directory-first order, so directories are\nprocessed before their contained files.\nMoreover, once sorted, both sender and receiver may refer to file\nentries by their position in the sorted array.\n\nAfter the receiver reads the list, it iterates through each file in\nthe list, passing information to the sender so that the sender may send\nback instructions to update the file.\nThis is called the \"block exchange\" and is the mainstay of the rsync\nalgorithm.\nDuring the block exchange, the sender waits to receive a request for\nupdate or an end-of-sequence message; once a request is received, it scans\nfor new blocks to send to the receiver.\n\nOnce the block exchange is complete, the files are all up to date.\n\nThe receiver is implemented 
in\n[receiver.c](https://github.com/kristapsdz/openrsync/blob/master/receiver.c);\nthe sender, in\n[sender.c](https://github.com/kristapsdz/openrsync/blob/master/sender.c).\nA great deal of the block exchange happens in\n[blocks.c](https://github.com/kristapsdz/openrsync/blob/master/blocks.c).\n\n## Block exchange\n\nThe block exchange sequence differs depending on whether the file is a\ndirectory, symbolic link, or regular file.\n\nFor symbolic links, the information required by the receiver is already\nencoded in the file list metadata.\nThe symbolic link is updated to point to the correct target.\nNo update is requested from the sender.\n\nFor directories, the directory is created if it does not already exist.\nNo update is requested from the sender.\n\nRegular files are handled as follows.\nFirst, the file is checked to see if it's up to date.\nThe file is considered up to date if its size and last modification time\nare the same as those in the file list.\nIf so, no update is requested from the sender.\n\nOtherwise, the receiver examines the file in blocks of a fixed size.\nSee [Block sizes](#block-sizes) for details.\n(The terminal block may be smaller if the file size is not divisible by\nthe block size.)\nIf the file is empty or does not exist, it will have zero blocks.\nEach block is hashed twice: first, with a fast Adler-32 type 4-byte\nhash; second, with a slower MD4 16-byte hash.\nThese hashes are implemented in\n[hash.c](https://github.com/kristapsdz/openrsync/blob/master/hash.c).\nThe receiver sends the file's block hashes to the sender.\n\nOnce accepted, the sender examines the corresponding file with the given\nblocks.\nFor each byte in the source file, the sender computes a fast hash given\nthe block size.\nIt then looks for matching fast hashes in the sent block information.\nIf it finds a match, it then computes and checks the slow hash.\nIf no match is found, it continues to the next byte.\nThe matching (and indeed all block operations) is implemented 
in\n[blocks.c](https://github.com/kristapsdz/openrsync/blob/master/blocks.c).\n\nWhen a match is found, the data prior to the match is first sent as a\nstream of bytes to the receiver.\nThis is followed by an identifier for the found block, or zero if no\nmore data is forthcoming.\n\nThe receiver writes the stream of bytes first, then copies the data in\nthe identified block if one has been specified.\nThis continues until the end of file, at which point the file has been\nfully reconstituted.\n\nIf the file does not exist on the receiver side---the basis case---the\nentire file is sent as a stream of bytes.\n\nFollowing this, the whole file is hashed using an MD4 hash.\nThese hashes are then compared; on success, the algorithm continues\nto the next file.\n\n## Block sizes\n\nThe block size algorithm plays a crucial role in the protocol's\nefficiency.\nIn general, the block size is the rounded square root of the total file\nsize.\nThe minimum block size, however, is 700 B.\nOtherwise, the square root computation is simply\n[sqrt(3)](https://man.openbsd.org/sqrt.3) followed by\n[ceil(3)](https://man.openbsd.org/ceil.3).\n\nFor reasons unknown, the square root result is rounded up to the nearest\nmultiple of eight.\n\n# Architecture\n\nEach openrsync session is divided into a running *server* and *client*\nprocess.\nThe client openrsync process is executed by the user.\n\n```\n% openrsync -rlpt host:path/to/source dest\n```\n\nThe server openrsync is executed on a remote host either on-demand over\n[ssh(1)](https://man.openbsd.org/ssh.1) or as a persistent network\ndaemon.\nIf executed over [ssh(1)](https://man.openbsd.org/ssh.1), the server\nopenrsync is distinguished from a client (user-started) openrsync by the\n**--server** flag.\n\nOnce the client or server openrsync process starts, it examines the\ncommand-line arguments to determine whether it's in *receiver* or\n*sender* mode.\n(The daemon is sent the command-line arguments in a protocol-specific\nway 
described in\n[rsyncd(5)](https://github.com/kristapsdz/openrsync/blob/master/rsyncd.5),\nbut otherwise does the same thing.)\nThe receiver is the destination for files; the sender is the origin.\nThere is always one receiver and one sender.\n\nThe server process is explicitly instructed that it is a sender with the\n**--sender** command-line flag, otherwise it is a receiver.\nThe client process implicitly determines its status by looking at the\nfiles passed on the command line for whether they are local or remote.\n\n```\nopenrsync path/to/source host:destination\nopenrsync host:source path/to/destination\n```\n\nIn the first example, the client is the sender: it *sends* data from\nitself to the server.\nIn the second, the opposite is true in that it *receives* data.\n\nThe client's command-line files may have any of the following host\nspecifications that determine locality.\n\n- local: *../path/to/source ../another*\n- remote server: *host:path/to/source :path/to/another*\n- remote daemon: *rsync://host/module/path ::another*\n\nHost specifications must be consistent: sources must all be local or all\nbe remote on the same host.  Both may not be remote.  (**Aside**: it's\ntechnically possible to do this.  
I'm not sure why the GPL rsync is\nlimited to one or the other.)\n\nIf the source or destination is on a remote server, the client then\n[fork(2)](https://man.openbsd.org/fork.2)s and starts the server\nopenrsync on the remote host over\n[ssh(1)](https://man.openbsd.org/ssh.1).\nThe client and the server subsequently communicate over\n[socketpair(2)](https://man.openbsd.org/socketpair.2) pipes.\nIf on a remote daemon, the client does *not* fork, but instead connects\nto the standalone server with a network\n[socket(2)](https://man.openbsd.org/socket.2).\n\nThe server's command-line, whether passed to an openrsync spawned on-demand\nover an [ssh(1)](https://man.openbsd.org/ssh.1) session or passed to the daemon, \ndiffers from the client's.\n\n```\nopenrsync --server [--sender] . files...\n```\n\nThe files given are either the single destination directory when in receiver\nmode, or the list of sources when in sender mode.\nThe standalone full-stop is a mystery to me.\n\nLocality detection and routing to client and server run-times are\nhandled in\n[main.c](https://github.com/kristapsdz/openrsync/blob/master/main.c).\nThe client for a server is implemented in\n[client.c](https://github.com/kristapsdz/openrsync/blob/master/client.c)\nand the server in\n[server.c](https://github.com/kristapsdz/openrsync/blob/master/server.c).\nThe client for a network daemon is in\n[socket.c](https://github.com/kristapsdz/openrsync/blob/master/socket.c).\nInvocation of the remote server openrsync is managed in\n[child.c](https://github.com/kristapsdz/openrsync/blob/master/child.c).\n\nOnce the client and server begin, they start to negotiate the transfer\nof files over the connected socket.\nThe protocol used is specified in\n[rsync(5)](https://github.com/kristapsdz/openrsync/blob/master/rsync.5).\nFor daemon connections, the\n[rsyncd(5)](https://github.com/kristapsdz/openrsync/blob/master/rsyncd.5)\nprotocol is also used for handshaking.\n\nThe receiver side is managed 
in\n[receiver.c](https://github.com/kristapsdz/openrsync/blob/master/receiver.c)\nand the sender in\n[sender.c](https://github.com/kristapsdz/openrsync/blob/master/sender.c).\n\nThe receiver side technically has two functions: not only must it upload\nblock metadata to the sender, it must also handle data writes as they\nare sent by the sender.\nThe rsync protocol is designed so that the sender receives block\nrequests and continuously sends data to the receiver.\n\nTo accomplish this, the receiver multitasks as the *uploader* and\n*downloader*.  These roles are implemented in\n[uploader.c](https://github.com/kristapsdz/openrsync/blob/master/uploader.c)\nand\n[downloader.c](https://github.com/kristapsdz/openrsync/blob/master/downloader.c),\nrespectively.\nThe multitasking is implemented as a finite state machine driven by data\ncoming from the sender and by files on disc as they are ready to be\nchecksummed and uploaded.\n\nThe uploader scans through the list of files and asynchronously opens\nfiles to process blocks.\nWhile it waits for the files to open, it relinquishes control to the\nevent loop.\nWhen files are available, it hashes and checksums blocks and uploads them to\nthe sender.\n\nThe downloader waits on data from the sender.\nWhen data is ready (and prefixed by the file it will update), the\ndownloader asynchronously opens the existing file to perform any block\ncopying.\nWhen the file is available for reading, it then continues to read data\nfrom the sender and copy from the existing file.\n\n## Differences from rsync\n\nThe design of rsync involves another mode running alongside the\nreceiver: the generator.\nThis is implemented as another process\n[fork(2)](https://man.openbsd.org/fork.2)ed from the receiver, and\ncommunicating with the receiver and sender.\n\nIn openrsync, the generator and receiver are one process, and an event\nloop is used for speedy responses to read and write requests.\n\n# Security\n\nBesides the usual defensive programming, openrsync 
makes significant use\nof native security features.\n\nThe system operations available to executing code are foremost limited\nby OpenBSD's [pledge(2)](https://man.openbsd.org/pledge.2).  The pledges\ngiven depend upon the operating mode.  For example, the receiver needs\nwrite access to the disc---but only when not in dry-run mode (**-n**).\nThe daemon client needs DNS and network access, but only to a point.\n[pledge(2)](https://man.openbsd.org/pledge.2) allows available resources\nto be limited over the course of operation.\n\nThe second tool is OpenBSD's\n[unveil(2)](https://man.openbsd.org/unveil.2), which limits access to\nthe file-system.  This protects against rogue attempts to \"break out\" of\nthe destination.  It's an attractive alternative to\n[chroot(2)](https://man.openbsd.org/chroot.2) because it doesn't require\nroot permissions to execute.\n\nOn the receiver side, the file-system is\n[unveil(2)](https://man.openbsd.org/unveil.2)ed at and beneath the\ndestination directory.\nAfter the creation of the destination directory, only targets within\nthat directory may be accessed or modified.\n\nLastly, the MD4 hashes are seeded with\n[arc4random(3)](https://man.openbsd.org/arc4random.3) instead of with\n[time(3)](https://man.openbsd.org/time.3).  This is only applicable when\nrunning openrsync in server mode, as the server generates the seed.\n\n# Portability\n\nMany have asked about portability.\n\nThe only officially-supported operating system is OpenBSD, as this has\nconsiderable security features.  openrsync does, however, use\n[oconfigure](https://github.com/kristapsdz/oconfigure) for compilation\non non-OpenBSD systems.  This is to encourage porting.\n\nIt is currently portable across Linux (glibc and musl), FreeBSD, NetBSD,\nMac OS X, and OmniOS.  This is enforced by the GitHub CI mechanism,\nwhich tests on these systems.  
Architectures tested for include x86\\_64,\naarch64, and s390x.\n\nThe actual work of porting is matching the security features provided by\nOpenBSD's [pledge(2)](https://man.openbsd.org/pledge.2) and\n[unveil(2)](https://man.openbsd.org/unveil.2).  These are critical\nelements to the functionality of the system.  Without them, your system\naccepts arbitrary data from the public network.\n\nThis is possible (I think?) with FreeBSD's\n[Capsicum](https://man.freebsd.org/capsicum(4)), but Linux's security\nfacilities are a mess, and will take an expert hand to properly secure.\n\n**rsync has specific running modes for the super-user**.\nIt also pumps arbitrary data from the network onto your file-system.\nopenrsync is about 10 000 lines of C code: do you trust me not to make\nmistakes?\n","funding_links":[],"categories":["C","Network"],"sub_categories":["Benchmarks"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkristapsdz%2Fopenrsync","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkristapsdz%2Fopenrsync","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkristapsdz%2Fopenrsync/lists"}