{"id":13508771,"url":"https://github.com/facebook/wdt","last_synced_at":"2025-04-29T14:37:52.922Z","repository":{"id":19009190,"uuid":"22231878","full_name":"facebook/wdt","owner":"facebook","description":"Warp speed Data Transfer (WDT) is an embeddable library (and command line tool) aiming to transfer data between 2 systems as fast as possible over multiple TCP paths.","archived":false,"fork":false,"pushed_at":"2025-04-21T16:59:39.000Z","size":2153,"stargazers_count":2900,"open_issues_count":79,"forks_count":388,"subscribers_count":170,"default_branch":"main","last_synced_at":"2025-04-21T17:45:39.940Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://www.facebook.com/WdtOpenSource","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebook.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2014-07-24T21:23:34.000Z","updated_at":"2025-04-21T16:59:44.000Z","dependencies_parsed_at":"2024-01-16T04:47:41.849Z","dependency_job_id":"3bb21607-8a04-48f6-a6ab-2e49c0c330e7","html_url":"https://github.com/facebook/wdt","commit_stats":{"total_commits":685,"total_committers":82,"mean_commits":8.353658536585366,"dds":0.6992700729927007,"last_synced_commit":"4cc8a21cfa29e55aa803365ab69248d0bf8fbb82"},"previous_names":["facebookarchive/wdt"],"tags_count":15,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebook%2Fwdt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebook%2Fwdt/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebook%2Fwdt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebook%2Fwdt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebook","download_url":"https://codeload.github.com/facebook/wdt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251520201,"owners_count":21602444,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T02:00:58.169Z","updated_at":"2025-04-29T14:37:52.898Z","avatar_url":"https://github.com/facebook.png","language":"C++","readme":"![](build/wdt_logo.png)\n`WDT` Warp speed Data Transfer\n------------------------------\n\n[![Join the chat at https://gitter.im/facebook/wdt](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/facebook/wdt?utm_source=badge\u0026utm_medium=badge\u0026utm_campaign=pr-badge\u0026utm_content=badge)\n\n[![Build Status](https://travis-ci.org/facebook/wdt.svg?branch=master)](https://travis-ci.org/facebook/wdt)\n\n## Design philosophy/Overview\n\nGoal:\nLowest possible total transfer time - to be only hardware limited\n(disk or network bandwidth, not latency) and as efficient as possible\n(low CPU/memory/resource utilization)\n\nWe keep dependencies minimal in order to maximize portability\nand ensure a small binary size. 
As a bonus, this also minimizes compile time.\n\nWe aren't using exceptions for performance reasons and because using exceptions\nwould make it harder to reason about the control flow of the library.\nWe also believe the WDT library is easier to integrate as a result.\nOur philosophy is to write moderately structured and encapsulated C code\nas opposed to using every feature of C++.\n\nWe try to minimize the number of system calls, which is one of the reasons\nwe are using blocking thread IOs. We can maximize system throughput because\nat any given point some threads are reading while others are writing, and data\nis buffered on both paths - keeping each subsystem busy while minimizing\nkernel to userspace switches.\n\n## Terminology\nWDT uses \"Mbytes\" everywhere in its output as 1024*1024 bytes = 1048576 bytes\n(technically this should be the new mebibyte (MiB) standard, but it felt\nMbytes is more in line with what other tools are using, clearer, easier\nto read, and matching what a traditional \"megabyte\" used to mean in historical\nmemory units, where the address lines are binary and thus powers of two and not\nof ten)\n\n## Example\n\nWhile WDT is primarily a library, we also have a small command line tool\nwhich we use for tests and which is useful by itself. Here is a quick example:\n\n```\nReceiver side: (starts the server indicating destination directory)\n\n[ldemailly@devbig074]$ wdt -directory /data/users/ldemailly/transfer1\n\nSender side: (discovers and sends all files in a directory tree to destination)\n\n[root@dev443]$ wdt -directory /usr/bin -destination devbig074.prn2\n\n[=================================================] 100% 588.8 Mbytes/s\nI0720 21:48:08.446014 3245296 Sender.cpp:314] Total sender time = 2.68699\nseconds (0.00640992 dirTime). Transfer summary : Transfer status = OK. Number\nof files transferred = 1887. Data Mbytes = 1582.08. Header Kbytes = 62.083\n(0.00383215% overhead). Total bytes = 1658999858. 
Wasted bytes due to\nfailure = 0 (0% overhead). Total sender throughput = 588.816 Mbytes/sec\n(590.224 Mbytes/sec pure transf rate)\n```\n\nNote that in this simple example with lots of small files (/usr/bin from\na Linux distribution) but not much data (~1.5 Gbytes), the maximum\nspeed isn't as good as it would be with more data (there is still a TCP ramp-up\ntime, even though it is shorter thanks to parallelization), as in our\nproduction use cases.\n\n## Performance/Results\n\nIn internal use at Facebook to transfer RocksDB snapshots between hosts,\nwe are able to transfer data at a throttled 600 Mbytes/sec even across\nlong-distance, high-latency links (e.g. Sweden to Oregon). That's 3x the speed\nof the previous highly optimized HTTP-based solution, with less strain on the\nsystem. When not throttling, we are able to easily saturate a 40 Gbit/s NIC and\nget near theoretical link speed (above 4 Gbytes/sec).\n\nWe have so far optimized WDT for servers with fast IOs - in particular flash\ncards or in-memory read/writes. 
If you use disks, throughput won't be as good,\nbut we do plan on optimizing for disks as well in the future.\n\n## Dependencies\n\nCMake for building WDT - see [build/BUILD.md](build/BUILD.md)\n\ngflags (Google flags library), but only for the command line; the library\ndoesn't depend on it\n\ngtest (Google testing framework), but only for tests\n\nglog (Google logging library) - we use W*LOG macros so everything logged by WDT\nis always prefixed by \"wdt\u003e\", which helps when embedded in another service\n\nParts of Facebook's Folly open source library (as set in the CMake file):\nmostly conv, threadlocal and checksum support.\n\nFor encryption, the crypto library part of openssl-1.x\n\nYou can build and embed wdt as a library with as little as a C++11 compiler\nand glog - and you could macro away glog or replace it by printing to stderr if\nneeded.\n\n## Code layout\n\n### Directories\n\n* top level\nMain WDT classes and Wdt command line source, CMakeLists.txt\n\n* util/\nUtilities used for implementing the main objects\n\n* test/\nTest files and scripts\n\n* build/\nBuild-related scripts, files and utils\n\n\n* fbonly/\nStuff specific to Facebook (not in the open source version)\n\n* bench/\nBenchmark generation tools\n\n\n### Main files\n\n* CMakeLists.txt, .travis.yml, build/BUILD.md, travis_linux.sh, travis_osx.sh\nBuild definition files - use CMake to generate a Makefile or a project file for\nyour favorite IDE - details in [build/BUILD.md](build/BUILD.md)\n\n* wdtCmdline.cpp\n\nMain program which runs a server or client process to exercise\nthe library (for end-to-end tests as well as a standalone utility)\n\n* wcp.sh\n\nA script to use wdt like scp for single big files - pending splitting support\ninside wdt proper, the script does the splitting for you. Install as \"wcp\".\n\n* WdtOptions.{h|cpp}\n\nSpecifies the behavior of wdt. 
If wdt is used as a library, then the\ncaller gets the mutable options object and sets options accordingly.\nWhen wdt is run in standalone mode, behavior is changed through gflags in\nwdtCmdLine.cpp\n\n* WdtThread.{h|cpp}\nCommon functionality and settings between SenderThread and ReceiverThread.\nBoth of these kinds of threads inherit from this base class.\n\n* WdtBase.{h|cpp}\n\nCommon functionality and settings between Sender and Receiver\n\n* WdtResourceController.{h|cpp}\n\nOptional factory for Sender/Receiver with a limit on the number being created.\n\n### Producing/Sending\n\n* ByteSource.h\n\nInterface for a data element to be sent/transferred\n\n* FileByteSource.{h|cpp}\n\nImplementation/concrete subclass of ByteSource for a file identified as a\nrelative path from a root dir. The identifier (path) sent remotely is\nthe relative path\n\n* SourceQueue.h\n\nInterface for producing the next ByteSource to be sent\n\n* DirectorySourceQueue.{h|cpp}\n\nConcrete implementation of SourceQueue producing all the files in a given\ndirectory, sorted by decreasing size (as files are discovered, you can start\npulling from the queue even before all of them are found; it will return\nthe current largest file)\n\n* ThreadTransferHistory.{h|cpp}\n\nEvery thread maintains a transfer history so that when a connection breaks\nit can talk to the receiver to find out how much of the history has been\nsent. 
This class encapsulates all the logic for that bookkeeping.\n\n* SenderThread.{h|cpp}\n\nImplements the functionality of one sender thread, which binds to a certain port\nand sends files over.\n\n* Sender.{h|cpp}\n\nSpawns multiple SenderThread threads and sends the data across to the receiver\n\n### Consuming / Receiving\n\n* FileCreator.{h|cpp}\n\nCreates files and the directories necessary for them (mkdir -p like)\n\n* ReceiverThread.{h|cpp}\n\nImplements the functionality of the receiver threads, responsible for listening on\na port and receiving files over the network.\n\n* Receiver.{h|cpp}\n\nParent receiver class that spawns multiple ReceiverThread threads and receives\ndata from a remote host\n\n### Low level building blocks\n\n* ServerSocket.{h|cpp}\n\nEncapsulates a server socket listening on a port and giving a file descriptor\nto be used to communicate with the client\n\n* ClientSocket.{h|cpp}\n\nClient socket wrapper - connection to a server port -\u003e fd\n\n* Protocol.{h|cpp}\n\nEncodes/decodes the meta information needed to interpret the data stream:\nthe id (file path) and size (byte length of the data)\n\n* SocketUtils.{h|cpp}\n\nCommon socket related utilities (used on both client/server, sender/receiver sides)\n\n* Throttler.{h|cpp}\n\nThrottling code\n\n* ErrorCodes.h\n\nHeader file for error codes\n\n* Reporting.{h|cpp}\n\nClass representing transfer stats and reports\n\n## Future development/extensibility\n\nThe current implementation works well and has high efficiency.\nIt is also extensible by implementing different byte sources both in and\nout. 
But inserting processing units isn't as easy.\n\nFor that we plan on restructuring the code to use a zero-copy stream/buffer\npipeline: to maintain efficiency, the best overall total transfer time and\nthe best time to first byte, we can see WDT's internal architecture as\nchainable units:\n\n[Disk/flash/Storage IO] -\u003e [Compression] -\u003e [Protocol handling]\n-\u003e [Encryption] -\u003e [Network IO]\n\nAnd the reverse chain on the receiving/writing end.\nThe trick is that the data is variable-length input, some units can change the\nlength, and we need to process things by blocks.\nConstraints/Design:\n- No locking / contention when possible\n- (Hard) limits on memory used\n- Minimal number of copies/moving memory around\n- Still works the same for the simple\n   read file fd -\u003e control -\u003e write socket fd current basic implementation\n\nPossible Solution(?) API:\n- Doubly linked list of Units\n- read/pull from the left (pull() ?)\n- push to the right (push() ?)\n- end of stream from the left\n- propagate last bytes to the right\n\nCan still be fully synchronous / blocking; this works thanks to EOF handling\n(synchronous gives us lock-free/single-thread operation - internally a unit is\nfree to use parallelization, as the compression stage is likely to want/need)\n\nAnother thing we touched on is processing chunks out of order - by changing the\nheader to be ( fileid, offset, size ) instead of ( filename, size )\nand assuming everything follows in 1 continuous block (this will also help\nthe use case of a small number of large files/chunks): mmap'ing\nthe target/destination file.\nThe issue then is who creates it, and in what order - similar to the directory\ncreation problem - we could use a meta info channel to avoid locking/contention,\nbut that requires synchronization\n\nWe want things to work even with up to 1 second of latency without incurring\na 1 second delay before we send the first payload byte\n\n\n## Submitting diffs/making changes\n\nSee CONTRIBUTING.md\n\nPlease run the 
tests\n```\nCTEST_OUTPUT_ON_FAILURE=1 make test\n```\nAnd ideally also the manual tests (integration/porting upcoming)\n\nwdt_e2e_test.sh\nwdt_download_resumption_test.sh\nwdt_network_test.sh\nwdt_max_send_test.sh\n\n(facebook only:)\nMake sure to do the following, before \"arc diff\":\n```\n (cd wdt ; ./build/clangformat.sh )\n # if you changed the minor version of the protocol (in CMakeLists.txt)\n # run (cd wdt ; ./build/version_update.tcl ) to sync with fbcode's WdtConfig.h\n\n fbconfig  --clang --sanitize=address -r  wdt\n\n fbmake runtests --run-disabled --extended-tests\n # Optionally: opt build\n fbmake runtests_opt\n fbmake opt\n # Sender max speed test\n wdt/test/wdt_max_send_test.sh\n # Check buck build\n buck build wdt/...\n # Debug a specific test with full output even on success:\n buck test wdt:xxx -- --run-disabled --extended-tests --print-passing-details\\\n   --print-long-results\n```\n\nand check the output of the last step to make sure one of the 3 runs is\nstill above 20,000 Mbytes/sec (you may need to make sure your\n/dev/shm is mostly empty to get the best memory throughput, as well\nas not having a ton of random processes running during the test)\n\nAlso:\n\n* Update this file\n* Make sure your diff has a task\n* Put (relevant) log output of sender/receiver in the diff test plan or comment\n* Depending on the changes\n  * Perf: wdt/wdt_e2e_test.sh has a mix of ~ \u003e 700 files, \u003e 8 Gbytes/sec\n  * do run remote network tests (wdt/wdt_remote_test.sh)\n  * do run profiler and check profile results (wdt/fbonly/wdt_prof.sh)\n    80k small files at \u003e 1.6 Gbyte/sec\n","funding_links":[],"categories":["Networking","C++","others","Memory Allocation"],"sub_categories":["Network"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebook%2Fwdt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebook%2Fwdt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebook%2Fwdt/lists"}