{"id":20225267,"url":"https://github.com/oliverzh2000/audio-networking","last_synced_at":"2026-05-29T20:31:39.811Z","repository":{"id":137939316,"uuid":"127451589","full_name":"oliverzh2000/audio-networking","owner":"oliverzh2000","description":"Reliable, Connection-oriented, and point-to-point digital communication library using analog audio cables.","archived":false,"fork":false,"pushed_at":"2019-07-28T01:16:14.000Z","size":263,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-01-13T23:26:56.029Z","etag":null,"topics":["computer-networking","digital-communication","ethernet","tcp-ip"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oliverzh2000.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-03-30T16:51:40.000Z","updated_at":"2024-02-07T18:43:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"3ca3481f-e7b8-400f-a16d-706b9fe2fe49","html_url":"https://github.com/oliverzh2000/audio-networking","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oliverzh2000%2Faudio-networking","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oliverzh2000%2Faudio-networking/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oliverzh2000%2Faudio-networking/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/oliverzh2000%2Faudio-networking/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oliverzh2000","download_url":"https://codeload.github.com/oliverzh2000/audio-networking/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241670090,"owners_count":20000325,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-networking","digital-communication","ethernet","tcp-ip"],"created_at":"2024-11-14T07:12:01.916Z","updated_at":"2026-05-29T20:31:39.800Z","avatar_url":"https://github.com/oliverzh2000.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# audio-networking\nReliable, Connection-Oriented point-to-point digital communication library in Java via analog audio (cabled).\n\nInspired by TCP/IP \u0026 Ethernet. \n\nTransmit directories, files, and arbitrary digital data between two computers at ~`3Kbit/s`. \n\n#### Physical Setup\n* You need 2 computers with analog audio input/output jacks, as well as 2 male-male audio cables.\n* Connect the analog audio input of machine 1 to the output of machine 2, and vice versa.\n\n**Note:** This project is designed to be used with audio cables. In-air transmission will result in very high packet loss rates (if\neven usable at all).\n\n## audio-networking Architecture (Bottom up)\n### Audio I/O\nReal-time audio I/O is done using Java's `javax.sound.sampled` package. 
Audio format is 44.1KHz, 8-bit, mono, signed PCM, little-endian.\nThe `RealTimeAudioIO` and `WavFileIO` classes implement the `AudioIO` interface. For testing, the class `WavFileAudioIO` can be \nused to read and write audio to `.wav` files.\n## Line Encoding\nLine encoding is the process by which logical bits (`1`, `0`) are converted into a pattern of analog levels for transmission. In our\ncase, logical bits need to be converted into a pattern of audio levels.\n\nManchester encoding is used as the line encoding scheme in `audio-networking`. \n\n#### Encoding\nEncoding works by transmitting logical `1`s as upward transitions in analog level and logical `0`s as downward transitions.\nYou can think of this as `XOR`ing logical bits with a clock signal that alternates high-low within the duration of one logical bit.\n\n```\n   +----------------------------------------------------------------------------------------------+\n   |                            ---------               ---------       -----------------         |\n   |                                    |               |       |       |                         |\n   |            Logical bits:       1   |   0       0   |   1   |   0   |   1       1             |\n   |                                    |               |       |       |                         |\n   |                                    -----------------       ---------                         |\n   |                    --------+-------+-------+-------+-------+-------+-------+-------+-------\u003e |\n   |                                -----   -----   -----   -----   -----   -----   -----         |\n   |                                |   |   |   |   |   |   |   |   |   |   |   |   |             |\n   |                   Clock:       |   |   |   |   |   |   |   |   |   |   |   |   |             |\n   |                                |   |   |   |   |   |   |   |   |   |   |   |   |             |\n   |                            -----   -----   -----   
-----   -----   -----   -----             |\n   |                    --------+-------+-------+-------+-------+-------+-------+-------+-------\u003e |\n   |                                ---------   -----       ---------       -----   -----         |\n   |                                |       |   |   |       |       |       |   |   |             |\n   |          Encoded signal:       |       |   |   |       |       |       |   |   |             |\n   | (convention: IEEE 802.3)       |       |   |   |       |       |       |   |   |             |\n   |                            -----       -----   ---------       ---------   -----             |\n   +----------------------------------------------------------------------------------------------+                                                                          \n```\n\nManchester encoding has 2 important properties:\n1. Continuous runs of either `1` or `0` do not produce a steady-state analog signal, due to the guaranteed mid-bit transition. Since the sound cards of most modern computers are designed to\n   filter out any direct current (DC) bias, this property is hugely important for `audio-networking`.\n2. It removes the need for an auxiliary channel to transmit the clock signal for synchronization between sender and receiver. \n   This dramatically simplifies design, but it does come at the cost of effectively doubling the bandwidth requirement (can you see why?). \n\nAlthough the diagram above depicts the Manchester encoded signal as square waves, in reality they are written to `AudioIO` as rounded\nsquare waves in order to reduce their high-frequency components, since there is significant distortion that comes with writing and reading\nhigh-frequency waveforms that are only a few samples wide. \n\nThe default bit duration is 8 samples. On computers with a `44.1KHz` sample rate, this translates to a `44100/8 = 5512.5 bit/s` \nmaximum bitrate (still orders of magnitude less than the theoretical maximum bitrate, though). 
\nAfter accounting for the overhead of frames and the inter-frame gaps, you can get ~`3Kbit/s`.\n\nShorter bit durations give higher bitrates but unfortunately also increase audio distortion and the likelihood of a bit error in transmission. \nComputers with higher-quality digital-to-analog and analog-to-digital converters may be able to handle shorter bit durations, while \nstill keeping the error rate acceptably low. \n\n#### Decoding\nBecause Manchester encoding ensures that each logical bit has a transition in it, the receiver can easily synchronize its clock and decode.\nThe receiver can correctly decode bits if their duration has shifted by less than `3/4` times their original length.\nDistortion of the audio signal will affect received bit length.\n\n## Framing\nAt this point we are able to send a stream of logical bits by writing them to audio. \nWe can also receive them, but with no guarantee of accuracy. \n\nPacking the raw binary stream into frames allows this binary data to be sent with important metadata, \nand creates a convenient way to add error-checking functionality. 
The table below lists the frame sections and their respective sizes:\n```\n+--------------+----------------+--------------------------------------------------------------------------------------+--------------------+-----------------+\n| Section      | Preamble + SOF | Header                                                                               | Payload (optional) | Inter-Frame gap |\n+--------------+----------------+--------------------------------------------------------------------------------------+--------------------+-----------------+\n| Subsection   | Preamble | SoF | source | dest | seq | syn | ack | fin | beg | pad | protocol | pay_length | head_chk | data   | pay_chk   |                 |\n+--------------+----------+-----+--------+------+-----+-----+-----+-----+-----+-----+----------+------------+----------+--------+-----------+-----------------+\n| Size (bytes) |        8       | 2      | 2    | 1   | 1   |     |     |     |     | 1        | 2          | 4        | N      | 4         | min=2           |\n+--------------+----------------+--------+------+-----+-----+-----+-----+-----+-----+----------+------------+----------+--------+-----------+-----------------+\n| Size (bits)  | 62       | 2   |        |      | 8   | 1   | 1   | 1   | 1   |     |          |            |          |        |           |                 |\n+--------------+----------+-----+--------+------+-----+-----+-----+-----+-----+-----+----------+------------+----------+--------+-----------+-----------------+\n```\n#### Preamble + Start of frame delimiter\nBefore line encoding, the preamble is a string of 62 alternating `1`s and `0`s. The start-of-frame (SoF) delimiter is `11`.\nTransmitted from left to right, Preamble + SoF looks like this before line encoding:\n```\n10101010 10101010 10101010 10101010 10101010 10101010 10101010 10101011\n```\nThis long preamble sequence serves an important function in Ethernet: clock synchronization. 
In `audio-networking`, clock \nsynchronization is not nearly as large a problem as it is in Ethernet, and synchronization can be done as soon as the SoF delimiter is read.\nHowever, due to the quirks of the sound card on certain computers, there tends to be a large amount of distortion in the first \n~10-20 samples written after a period of silence. Therefore the 'unnecessarily' long preamble serves as a disposable safety net.\n\nNote that when the bit duration is 8 samples and the sample rate is `44.1KHz`, the preamble will be a `44100/8/2 = 2756.25 Hz` tone after\nbeing Manchester encoded. If you record `audio-networking` during a period of transmission, on playback you will hear the preamble as a \nvery short chirp. \n\n#### Header\nThe header holds information important to higher-level components in `audio-networking`, and will be discussed in detail later.\nThe `pay_length` field of the header represents the number of bytes in the data payload. \n\n#### Payload\nThe payload section is where the 'real' binary data is stored. Payload is optional because all frames necessary for the set-up, maintenance, and tear-down\nof connections are header-only. \n\n#### Checksums\nBoth the header and the payload have 32-bit checksum fields (`head_chk` and `pay_chk`). \nIf either of these checksums is incorrect, the entire frame is discarded. \nBecause the entire frame is discarded, it makes more sense to use smaller frames when the probability of bit errors occurring in \ntransit is high.\n\nThe `FrameIO` interface exposes blocking methods for sending and receiving frames. \n\n## Connections (high-level overview)\nEach machine is able to set up as many instances of `Connection` as it wants to via `ConnectionHost`. \n`ConnectionHost` manages all the inbound and outbound frames and distributes the inbound frames to the correct 'owner' Connection based\non source and destination `Address`. \n\nConnections expose a public interface to send and receive messages. 
When a `message` is sent, it is automatically\nbroken down and sent as multiple frames if necessary. \nMessages are then pieced together from the frames that a `Connection` receives.\nThis entire process is invisible to the end user of the `Connection`.\n\n#### Set Up\nConnections are set up with a three-way handshake in a way very similar to how it's done in TCP. \n\n#### Tear Down\nConnections are terminated by sending a header-only frame with the `fin` bit set. Once the `ConnectionHost` receives an `ack` frame, \nthe Connection is terminated for good.\n\n#### Acknowledgements and Retransmission\nAfter a `frame` is sent, the `Connection` that sent the frame expects to receive a `frame` with its `ack` bit set. This signifies receipt-of-message\nand allows the `Connection` to proceed to send the next `frame` in its outbound queue. In the case that no acknowledgement is received\nafter a predefined `timeout` has elapsed, the last `frame` will be retransmitted. \n\n#### Frame Order Guarantee\nWhenever a `frame` is sent, it has its `seq` field set to the current frame sequence number. `seq` is incremented each time an outgoing\nframe is sent and also whenever an incoming `ack` frame matching the expected (current) sequence number is received. \nFrames received with the wrong `seq` field are ignored, which guarantees `frame` order. It is also possible to cache the last few\nframes instead of discarding them in the hope that future frames received will restore order - but this is not implemented yet. \n\n#### Addresses\nThe source and destination `Address` fields in each frame are 16 bits long, composed of a `host` byte and a `port` byte. `host` should \nbe unique to the machine.\n\n#### Ping utility\n`ConnectionHost` exposes a simple ping method that tests for the reachability of an arbitrary host. It prints information about\nround-trip-time (RTT) and percentage packet loss. 
\n\n## File Transfer\nThe `FileTransferProtocol` class runs on top of a `Connection` and can send and request files and directories with another host.\nOf course, reliable delivery of messages is already guaranteed by `Connection`, so the job of `FileTransferProtocol` is relatively easy.\n\n## Future Improvements\n1. Make the switch to stereo audio to take advantage of 2 channels of transmission to double bitrate. Can be done:\n   * Asynchronously: run independent connections in each channel.\n   * Synchronously: alternate bits read/written from lineEncoder between 2 channels - effectively halves the length of each frame.\n2. Encrypted Connections\n3. Flow and congestion control to connect more than two machines together (will require additional hardware - i.e. n-channel audio mixer)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foliverzh2000%2Faudio-networking","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foliverzh2000%2Faudio-networking","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foliverzh2000%2Faudio-networking/lists"}