{"id":27928450,"url":"https://github.com/nikouu/mgba-lua-socket-loadtest","last_synced_at":"2026-05-11T05:34:33.739Z","repository":{"id":291493953,"uuid":"973441447","full_name":"nikouu/mGBA-lua-socket-loadtest","owner":"nikouu","description":"Finding the limits of mGBA socket communication.","archived":false,"fork":false,"pushed_at":"2025-05-04T08:36:21.000Z","size":471,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-07T02:07:12.856Z","etag":null,"topics":["aspnetcore","csharp","dotnet","lua","mgba","mgba-api","performance"],"latest_commit_sha":null,"homepage":"","language":"C#","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nikouu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-27T02:01:58.000Z","updated_at":"2025-05-04T08:40:04.000Z","dependencies_parsed_at":"2025-05-05T01:44:22.598Z","dependency_job_id":"6beac32f-75b2-4c9f-8348-af8cd2ea3af1","html_url":"https://github.com/nikouu/mGBA-lua-socket-loadtest","commit_stats":null,"previous_names":["nikouu/mgba-lua-socket-loadtest"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikouu%2FmGBA-lua-socket-loadtest","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikouu%2FmGBA-lua-socket-loadtest/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nikouu%2FmGBA-lua-socket-loadtest/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/repositories/nikouu%2FmGBA-lua-socket-loadtest/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nikouu","download_url":"https://codeload.github.com/nikouu/mGBA-lua-socket-loadtest/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252798854,"owners_count":21805888,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aspnetcore","csharp","dotnet","lua","mgba","mgba-api","performance"],"created_at":"2025-05-07T02:07:16.809Z","updated_at":"2026-05-11T05:34:33.727Z","avatar_url":"https://github.com/nikouu.png","language":"C#","funding_links":[],"categories":[],"sub_categories":[],"readme":"# mGBA-lua-socket-loadtest\n\nThis work continues from [mGBA-lua-Socket](https://github.com/nikouu/mGBA-lua-Socket) in order to hone the socket functionality used in [mGBA-http](https://github.com/nikouu/mGBA-http).\n\nThis project has three performance goals: \n1. Find the cleanest way to connect and disconnect sockets\n2. Find the throughput limits\n3. ~~Explore multiplexing~~ ended up not doing this\n\n[These changes have been folded into mGBA-http version 0.6.0 ⭐](https://github.com/nikouu/mGBA-http/releases/tag/0.6.0)\n\n## Preface\n\nThis document will go through each Git tag milestone to explain in depth what's being explored and why. 
\n\nThe structure of the work loosely matches mGBA-http with an HTTP endpoint that calls an injected `SocketService` object.\n\nWhen the .NET server is running, a message can be sent via:\n```\nhttps://localhost:7185/mgbaendpoint?message=a\n```\n\n**This work ran over a couple of weeks and it's the written ramblings of my learning as I go. There will be incorrect statements/theories, me going down the wrong path in my learning, and I'm sure some buried problems. But this repo is really fun!**\n\n[heroldev/AGB-buttontest](https://github.com/heroldev/AGB-buttontest) is the ROM used when running mGBA.\n\n# Version 1 - Base state\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version1)\n\nVersion 1 starts this project with:\n1. Singleton `SocketService`\n2. Lua script from [mGBA-lua-Socket](https://github.com/nikouu/mGBA-lua-Socket) modified to reflect back the given message\n\nWhile there will be issues when sending many messages at once due to the singleton socket, this version will only deal with simple messages now and again.\n\n# Version 2 - Automatic socket cleanup on app close\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version2)\n\n## Handling socket cleanup on close\n\nThis version begins to look at how to deal with the different ways mGBA-http can be closed.\n\nWhen closing the server, the message from mGBA is inconsistent. I'm unsure if it's because the `IDisposable` implementation of `SocketService` isn't correctly run on shutdown or if there is something else. This is a problem in mGBA-http. The message on closing the server will be either:\n1. [ERROR] Socket 1 Error: disconnected\n1. [ERROR] Socket 1 Error: unknown error\n\nEither one happens after a connect-request-disconnect cycle. 
However, closing the debugged application via the stop button in Visual Studio (or via Ctrl + C in the console) more often produces the \"disconnected\" message, whereas closing the console window with the \"x\" mostly produces the \"unknown error\" message.\n\n![Version 1 disconnected](images/version1_disconnected.jpg)\n![Version 1 unknown error](images/version1_unknownError.jpg)\n\nI assume it's something to do with how the OS/.NET sends kill/close messages to the process when using different close methods. Looking into it:\n\n- [Host shutdown](https://learn.microsoft.com/en-us/dotnet/core/extensions/generic-host?tabs=appbuilder#host-shutdown)\n- [Host shutdown in web server scenarios](https://learn.microsoft.com/en-us/dotnet/core/extensions/generic-host?tabs=appbuilder#host-shutdown-in-web-server-scenarios)\n- [Web Host shutdown](https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/web-host?view=aspnetcore-9.0#shutdown-timeout)\n- [WebApplication and WebApplicationBuilder in Minimal API apps](https://learn.microsoft.com/en-us/aspnet/core/fundamentals/minimal-apis/webapplication?view=aspnetcore-9.0)\n- [Detecting console closing in .NET](https://www.meziantou.net/detecting-console-closing-in-dotnet.htm)\n- [Closing of TCP socket - What is different when connection is closed by debugger](https://stackoverflow.com/questions/24281037/closing-of-tcp-socket-what-is-different-when-connection-is-closed-by-debugger)\n- [Extending the shutdown timeout setting to ensure graceful IHostedService shutdown](https://andrewlock.net/extending-the-shutdown-timeout-setting-to-ensure-graceful-ihostedservice-shutdown/)\n- [What is the difference between the SIGINT and SIGTERM signals in Linux? 
What’s the difference between the SIGKILL and SIGSTOP signals?](https://www.quora.com/What-is-the-difference-between-the-SIGINT-and-SIGTERM-signals-in-Linux-What%E2%80%99s-the-difference-between-the-SIGKILL-and-SIGSTOP-signals?share=1)\n- [Can I handle the killing of my windows process through the Task Manager?](https://stackoverflow.com/questions/1527450/can-i-handle-the-killing-of-my-windows-process-through-the-task-manager)\n- [Closing the Window](https://learn.microsoft.com/en-us/windows/win32/learnwin32/closing-the-window)\n- [How to distinguish 'Window close button clicked (X)' vs. window.Close() in closing handler](https://stackoverflow.com/questions/13361260/how-to-distinguish-window-close-button-clicked-x-vs-window-close-in-closi/20006210#20006210)\n\nThe `Dispose()` method is called when Ctrl + C is used, but it doesn't seem to get called with the other closure methods. That seems odd because we end up in this state:\n\n| Closure method    | `Dispose()` runs? | `Lifetime.ApplicationStopping` runs? | mGBA socket close                                |\n| ----------------- | ----------------- | ------------------------------------ | ------------------------------------------------ |\n| Ctrl + C          | Yes               | Yes                                  | \"disconnected\"                                   |\n| Close button      | No                | No                                   | \"unknown error\"                                  |\n| VS stop debugging | No                | No                                   | Mostly \"disconnected\", sometimes \"unknown error\" |\n\n_Technically closure methods can also be application close or application kill from the Task Manager or command prompt._\n\nI anticipate that most people will close mGBA-http with the close \"x\" button, meaning it will be worth better understanding how the shutdown works in that case. 
Though it seems that even with these exit hooks, the socket can't be closed cleanly:\n\n```csharp\n// 1. Relying on the usual hooks to dispose\n\n// 2. Using Lifetime.ApplicationStopping\napp.Lifetime.ApplicationStopping.Register(() =\u003e\n{\n\n});\n\n// 3. Using the ProcessExit callback\nAppDomain.CurrentDomain.ProcessExit += (s, e) =\u003e {\n\n};\n```\n\nUltimately then this probably has to be handled in the Lua script to deal with abrupt socket closures that lead to \"unknown error\". This can be updated by adding:\n```lua\nif err ~= socket.ERRORS.AGAIN then\n\tconsole:log(\"ST_received 5\")\n\tif err == \"disconnected\" then\n\t\tconsole:log(\"ST_received 6\")\n\t\tconsole:log(ST_format(id, err, true))\n\telseif err == socket.ERRORS.UNKNOWN_ERROR then\n\t\tconsole:log(\"ST_received 7\")\n\t\tconsole:log(ST_format(id, err, true))\n\telse\n\t\tconsole:log(\"ST_received 8\")\n\t\tconsole:error(formatMessage(id, err, true))\n\tend\n\tST_stop(id)\nend\n```\n\nWhere we can have the option of swallowing the error.\n\n## socket.ERRORS.AGAIN\n\nIt seems every request will have the \"socket.ERRORS.AGAIN\" error. Even the two example lua scripts [[1](https://github.com/mgba-emu/mgba/blob/c33a0d65344984294ed8666e98d1735a29f0a2d8/res/scripts/socketserver.lua#L37)][[2](https://github.com/mgba-emu/mgba/blob/c33a0d65344984294ed8666e98d1735a29f0a2d8/res/scripts/sockettest.lua#L39)] from the mGBA repo ignore this error. Meaning this can ignore it too.\n\n# Version 3 - Manual socket closing\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version3)\n\nVersion 3 is around manually cleaning up a socket. Ensuring that future work can be correctly cleaned up outside of the `Dispose()` call in `SocketService.cs`. \n\nFor this, a new socket will be created inside the `SocketService` singleton for each request then cleaned up afterwards. 
This can automatically be done with `using`:\n\n```csharp\npublic async Task\u003cstring\u003e SendMessageAsync(string message)\n{\n    var ipAddress = IPAddress.Parse(\"127.0.0.1\");\n    var ipEndpoint = new IPEndPoint(ipAddress, 8888);\n    using var socket = new Socket(ipEndpoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp); // using declaration for automatic cleanup\n\n    await socket.ConnectAsync(ipEndpoint);\n\n    var messageBytes = Encoding.UTF8.GetBytes(message);\n    await socket.SendAsync(messageBytes, SocketFlags.None);\n\n    var buffer = new byte[1_024];\n    var received = await socket.ReceiveAsync(buffer, SocketFlags.None);\n    var response = Encoding.UTF8.GetString(buffer, 0, received);\n\n    return response;\n}\n```\n\nWhich correctly triggers the \"disconnected\" status in mGBA.\n\nBut to better understand the situation we can clean up the socket ourselves:\n\n```csharp\npublic async Task\u003cstring\u003e SendMessageAsync(string message)\n{\n    var ipAddress = IPAddress.Parse(\"127.0.0.1\");\n    var ipEndpoint = new IPEndPoint(ipAddress, 8888);\n    var socket = new Socket(ipEndpoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);\n\n    await socket.ConnectAsync(ipEndpoint);\n\n    var messageBytes = Encoding.UTF8.GetBytes(message);\n    await socket.SendAsync(messageBytes, SocketFlags.None);\n\n    var buffer = new byte[1_024];\n    var received = await socket.ReceiveAsync(buffer, SocketFlags.None);\n    var response = Encoding.UTF8.GetString(buffer, 0, received);\n\n    socket.Shutdown(SocketShutdown.Both);\n    socket.Close();\n    socket.Dispose();            \n\n    return response;\n}\n```\n\nThat seems to be a pattern people use a lot even though `.Close()` doesn't do much and also calls `.Dispose()` [under the hood](https://github.com/dotnet/runtime/blob/1d1bf92fcf43aa6981804dc53c5174445069c9e4/src/libraries/System.Net.Sockets/src/System/Net/Sockets/Socket.cs#L942).\n\n![Version 3 
disconnected](images/version3_disconnected.jpg)\nNote: The error states aren't errored because in version 2 they were changed to regular logging calls.\n\nWith that knowledge we can continue with using declaration in the first code example above. This also means that every socket will be properly cleaned up (assuming there's no message in-flight) with whatever way a user closes mGBA-http.\n\n# Version 4 - Load testing\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version4)\n\nNow that we understand the connections to mGBA, it's time to start benchmarking. **All benchmarks on this page are run for 30 seconds.** Version 4 brings in the load testing client. Here is the baseline of the socket code from version 3:\n\n| Requests per second | Actual requests per second | Average latency (ms) | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ---------------: | -----------: |\n|                   1 |                       1.00 |                   32 |              264 |         100% |\n|                   2 |                       1.98 |                   28 |              245 |         100% |\n|                   3 |                       2.95 |                   32 |              279 |         100% |\n|                   4 |                       3.79 |                   37 |              282 |          96% |\n|                   5 |                       4.83 |                   31 |              218 |          98% |\n|                  10 |                       9.50 |                   70 |            1,465 |          97% |\n|                  15 |                      14.23 |                  908 |            8,476 |          97% |\n|                  20 |                       9.78 |               10,758 |           32,102 |          79% |\n\nI ran these a few times but there was always a lot of variability. 
That was also my experience while working on mGBA-http and with the errors back in version 2. Often the 1-5 requests per second runs will hover around 99% success too.\n\nFor all scenarios, it looks like:\n1. The requests back up, see max latency\n2. The average latency for small RPS is around 30ms\n3. The actual requests per second really drop off above 15\n\nOur limiting factor is on the mGBA side - at least for how everything has been written for version 4. \n\nI'd like to set a goal of a consistent 100% success rate at under 60ms max latency for five requests per second for this project. I suspect a lot of the latency comes from establishing and closing sockets, as each request opens and disposes of a socket. \n\nMeaning next up, we can try socket pooling.\n\n# Version 5 - Socket pooling\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version5)\n\nI think(?) the idea of socket pooling is a little difficult because TCP sockets are stateful. However, we're always going to be sending to the same endpoint so I think that makes it okay(?). I also have no idea how mGBA will handle having concurrent requests on the same socket. Both thoughts will be addressed by:\n\n1. Creating a socket pool in C#\n2. During the load testing, passing a new GUID for every message and checking that the correct GUID gets returned\n\n.NET comes with a generic object pool in the form of [Microsoft.Extensions.ObjectPool](Microsoft.Extensions.ObjectPool) and I anticipate that a few sockets can be created and reused. 
The crux of this work is: \n\n```csharp\npublic class ReusableSocket : IResettable, IDisposable\n{\n    private readonly Socket _socket;\n    private readonly IPEndPoint _ipEndpoint;\n\n    public ReusableSocket(string ipAddress, int port)\n    {\n        // Use the address passed in rather than a hard-coded loopback address\n        var address = IPAddress.Parse(ipAddress);\n        _ipEndpoint = new IPEndPoint(address, port);\n        _socket = new Socket(_ipEndpoint.AddressFamily, SocketType.Stream, ProtocolType.Tcp);\n    }\n\n    public async Task\u003cstring\u003e SendMessageAsync(string message)\n    {\n        // Connect lazily on first use; later calls reuse the open connection\n        if (!_socket.Connected)\n        {\n            await _socket.ConnectAsync(_ipEndpoint);\n        }\n\n        var messageBytes = Encoding.UTF8.GetBytes(message);\n        await _socket.SendAsync(messageBytes, SocketFlags.None);\n\n        var buffer = new byte[1_024];\n        var received = await _socket.ReceiveAsync(buffer, SocketFlags.None);\n        var response = Encoding.UTF8.GetString(buffer, 0, received);\n\n        return response;\n    }\n\n    public bool TryReset()\n    {\n        // Nothing to reset: the socket goes back to the pool still connected\n        return true;\n    }\n\n    public void Dispose()\n    {\n        _socket?.Dispose();\n        GC.SuppressFinalize(this);\n    }\n}\n```\n\nSince we'll be reusing the socket in the same state, there's no need to reset. Now on each call, we inject the pool and get a socket from it:\n\n```csharp\napp.MapGet(\"/mgbaendpoint\", async (ObjectPool\u003cReusableSocket\u003e socketPool, string message) =\u003e\n{\n    var socket = socketPool.Get();\n\n    try\n    {\n        return await socket.SendMessageAsync(message);\n    }\n    finally\n    {\n        socketPool.Return(socket);\n    }\n});\n```\n\nVia the IoC setup, this will give us a socket dedicated for that request. 
If there are idle sockets in the pool it will use one of those, and if there are no sockets left, it will use the `ReusableSocketPooledObjectPolicy` class to create one (see the source for more details).\n\nAt the same time, a slight modification sends a new GUID with every request so we can check that the right response is returned to the right request. \n\nRunning the benchmarks again:\n\n| Requests per second | Actual requests per second | Average latency (ms) | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ---------------: | -----------: |\n|                   1 |                          1 |                   43 |              240 |         100% |\n|                   2 |                       1.97 |                   36 |              314 |         100% |\n|                   3 |                       2.95 |                   33 |              246 |         100% |\n|                   4 |                       3.92 |                   29 |              297 |         100% |\n|                   5 |                       4.90 |                   29 |              274 |         100% |\n|                  10 |                       9.76 |                   29 |              361 |         100% |\n|                  15 |                      10.43 |                5,122 |           12,935 |          81% |\n|                  20 |                       4.28 |               38,406 |           87,501 |          74% |\n\nRemember before how I said there was variability? It seems that if I let everything cool off (I assume for the [Time Wait] period), we can get back to near 100% success even at 20 RPS, though still with high-ish latency. Something to think about later.\n\nWhen hitting around 15 RPS it starts to fall apart, and mysteriously this error begins to fail requests:\n\u003e System.Net.Sockets.SocketException (10056): A connect request was made on an already connected socket.\n\nBut how? 
Due to the object pool, each inbound HTTP request gets its own instance of `ReusableSocket`, i.e. there's no non-threadsafe action (as far as I know). And how could the connection check be false, triggering a reconnect, but then throw an exception saying the connection already exists?\n\n```csharp\npublic async Task\u003cstring\u003e SendMessageAsync(string message)\n{\n    if (!_socket.Connected)\n    {\n        await _socket.ConnectAsync(_ipEndpoint);\n    }\n\n    // --\n}\n```\n\nPart of it is that the `Connected` property on `Socket` indicates the [connection state as of the _last operation_](https://learn.microsoft.com/en-us/dotnet/api/system.net.sockets.socket.connected?view=net-9.0#remarks). Meaning in that time the connection may have been dropped from the other side? Though if that's the case, and on a previous call the socket failed, how does the `ConnectAsync()` call fail if the socket is apparently in a disconnected state? This I have no idea about.\n\nIt could be something to do with the first exception that gets thrown during a load test:\n\u003eSystem.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.\n\nWhere I assume the socket is in a bad state. This would cause the `Connected` property to be false, but I wonder if the other side isn't resilient enough to have properly closed the connection? This would mean this socket should be properly closed, and a new socket used. \n\nWhile the success rate is now 100% and average latency is low for a \"regular\" number of calls, my goal hasn't been reached (yet) and higher RPS still fail. 
It's time to introduce retries.\n\n# Version 6 - Retries\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version6)\n\nTaking the work of [Oleg Kyrylchuk](https://okyrylchuk.dev/) from [Understanding the Retry Pattern](https://okyrylchuk.dev/blog/understanding-the-retry-pattern/) for a tidy bit of retry:\n\n```csharp\npublic async Task\u003cstring\u003e SendMessageAsync(string message)\n{\n    var attempts = 0;\n    var delay = _initialDelay;\n\n    while (attempts \u003c _maxRetries)\n    {\n        try\n        {\n            attempts++;\n            return await SendAsync(message);\n        }\n        catch\n        {\n            if (attempts \u003e= _maxRetries)\n            {\n                throw;\n            }\n\n            await Task.Delay(delay);\n            delay = Math.Min(delay * 3, _maxDelay);\n        }\n    }\n\n    throw new Exception(\"How did we get here?\");\n}\n```\n\nThen tweaking the delay values a little, gives us the following table of results:\n\n| Requests per second | Actual requests per second | Average latency (ms) | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ---------------: | -----------: |\n|                   1 |                          1 |                   38 |              260 |         100% |\n|                   2 |                       1.98 |                   34 |              263 |         100% |\n|                   3 |                       2.95 |                   29 |              268 |         100% |\n|                   4 |                       3.92 |                   28 |              242 |         100% |\n|                   5 |                       4.90 |                   27 |              304 |         100% |\n|                  10 |                       9.76 |                   30 |              438 |         100% |\n|                  15 |                      14.61 |                   27 |              427 |      
   100% |\n|                  20 |                       4.28 |               38,406 |           87,501 |          74% |\n\n## The continued variability\n\nPlaguing these benchmarks has been the variability of the tests. At this point in the versions, the errors look like this for each of the retry attempts:\n\n1. An existing connection was forcibly closed by the remote host.\n2. A connect request was made on an already connected socket.\n3. A connect request was made on an already connected socket.\n\nBut in this version, combined with the knowledge from Version 2, I think this happens because of the lack of proper socket cleanup when using the restart in Visual Studio:\n\n![VS Restart](images/VSRestart.jpg)\n\nSometimes there are clean tests that are fast and 100% successful, and looking in TCPView, really only one socket gets used. The following is a screenshot of a finished run of 20 requests per second for 30 seconds:\n\n![TCPViewSuccessfulTest](images/TCPViewSuccessfulTest.jpg)\nNote: I assume the number is lower than 600 because the [socket uses the Nagle algorithm by default](https://learn.microsoft.com/en-us/dotnet/api/system.net.sockets.socket.nodelay?view=net-9.0#system-net-sockets-socket-nodelay).\n\nMeaning a single socket can pretty much handle lots of requests, but sometimes it all backs up and throws errors. I wonder if it is because there's some weird socket reuse when the process starts again. \n\nRunning the load test without restarting the server doesn't cause issues. It's something to do with the socket cleanup when debugging in Visual Studio (again). While benchmarking should be run in release mode, I figured the gap with how sockets interacted wouldn't be this large. Tests from here onwards will only use release binaries. 
So let's try making the benchmark table again, and go further:\n\n\n| Requests per second | Actual requests per second | Average latency (ms) | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ---------------: | -----------: |\n|                   1 |                          1 |                   19 |              145 |         100% |\n|                   2 |                       1.98 |                   17 |              129 |         100% |\n|                   3 |                       2.95 |                   19 |              108 |         100% |\n|                   4 |                       3.92 |                   18 |              145 |         100% |\n|                   5 |                       4.90 |                   18 |              132 |         100% |\n|                  10 |                       9.75 |                   19 |              209 |         100% |\n|                  15 |                      14.65 |                   17 |              101 |         100% |\n|                  20 |                      19.40 |                   18 |              118 |         100% |\n|                  40 |                      32.06 |                   17 |              123 |         100% |\n|                  80 |                      63.31 |                   19 |            1,043 |         100% |\n|                 160 |                      63.67 |                 17.5 |              126 |         100% |\n\nNote: It seems running straight to 160 RPS does give errors, but after a short warmup period, it's fine. That seems par for the course with load testing.\n\nSo with everything in release mode, it's easy to send more than one request per frame to mGBA (GB and GBA run at 60fps). After 80 requests per second, the single threaded client doing the load test can't keep up with sending that many requests per second. 
This is probably a great state to be in where requests can be handled faster than the framerate and thus inputs of the emulated hardware. While this project is simply reflecting values back, even with heavier calls, it _probably_ is enough performance.\n\nLooking back at my goal from Version 4:\n\u003eI'd like to set a goal of a consistent 100% success rate at under 60ms max latency for five requests per second for this project\n\nWhile I didn't achieve my goal about max latency being under 60ms for five requests per second, considering the average and max latency numbers are about the same even into way higher RPS, I think it's fine.\n\n## Digging into max latency\n\nBut why does it seem so consistent across all the recent benchmarks? I suspect it's because the first connection of a socket takes time. And when looking at it across different RPS benchmarks, it's very often the first one of the socket pool making the initial connection. Otherwise the max, funnily enough, is around 50-60ms! It might be worth looking at the 95th or 99th percentile as well in the future.\n\n# Version 7 - Tidy up\n\n[Tag link](https://github.com/nikouu/mGBA-lua-socket-loadtest/tree/Version7)\n\nThis version is just to clean up `ReusableSocket` so it's in a good state to migrate over into mGBA-http. \n\n## More pools\n\nThe method that is doing the actual socket sending and receiving is unnecessarily allocating a new byte array for every request. It is also the crux of [mGBA-http issue #4](https://github.com/nikouu/mGBA-http/issues/4) where messages longer than 1024 bytes are truncated and data is left in the socket to pollute the next request. 
This is solved by using both [ArrayPool](https://learn.microsoft.com/en-us/dotnet/api/system.buffers.arraypool-1?view=net-9.0) and [Microsoft.IO.RecyclableMemoryStream](https://github.com/microsoft/Microsoft.IO.RecyclableMemoryStream).\n\nBefore:\n```csharp\nprivate async Task\u003cstring\u003e SendAsync(string message)\n{\n    if (!_socket.Connected)\n    {\n        await _socket.ConnectAsync(_ipEndpoint);\n    }\n\n    var messageBytes = Encoding.UTF8.GetBytes(message);\n    await _socket.SendAsync(messageBytes, SocketFlags.None);\n\n    var buffer = new byte[1_024];\n    var received = await _socket.ReceiveAsync(buffer, SocketFlags.None);\n    var response = Encoding.UTF8.GetString(buffer, 0, received);\n\n    return response;\n}\n```\n\nAfter:\n```csharp\nprivate async Task\u003cstring\u003e ReadAsync()\n{\n    var buffer = ArrayPool\u003cbyte\u003e.Shared.Rent(1024);\n    try\n    {\n        using var memoryStream = _recyclableMemoryStreamManager.GetStream();\n        do\n        {\n            var bytesRead = await _socket.ReceiveAsync(buffer, SocketFlags.None);\n            await memoryStream.WriteAsync(buffer.AsMemory(0, bytesRead));\n        } while (_socket.Available \u003e 0);\n\n        return Encoding.UTF8.GetString(memoryStream.GetReadOnlySequence());\n    }\n    finally\n    {\n        // Hand the rented buffer back so the pool can reuse it\n        ArrayPool\u003cbyte\u003e.Shared.Return(buffer);\n    }\n}\n```\n\nTo ensure I don't regress into having truncated messages, I did the load test with 5000 byte messages, larger than the initial buffer. However, I learned about a new problem, TCP packet fragmentation! (I think...). What was happening was the message entering the Lua script was larger than the 1024 byte buffer. This hasn't come up in mGBA-http because either the large payload endpoints aren't fully supported yet, or it just isn't common since the arbitrarily sized payloads are usually small (such as sending a log message). 
The previous truncation problem happened with the buffer on the _C# side_ due to a large response.\n\nThe case of sending 5000 bytes of data ended up causing the Lua script to fail to reflect it. I think because TCP is a streaming protocol, the do while loop check of `_socket.Available \u003e 0` isn't enough as the socket might be empty, but the rest of the data hasn't arrived yet. The solution I opted for is a termination character (well, string) of `\u003c|END|\u003e`. This gets appended to messages sent both by the .NET side and the Lua side in order for both sides to correctly buffer entire messages.\n\nHere is the C# code after:\n```csharp\nprivate async Task\u003cstring\u003e ReadAsync()\n{\n    var buffer = ArrayPool\u003cbyte\u003e.Shared.Rent(1024);\n    try\n    {\n        using var memoryStream = _recyclableMemoryStreamManager.GetStream();\n        int totalBytes = 0;\n\n        while (true)\n        {\n            var bytesRead = await _socket.ReceiveAsync(buffer, SocketFlags.None);\n            if (bytesRead == 0)\n                break; // Socket closed\n\n            await memoryStream.WriteAsync(buffer.AsMemory(0, bytesRead));\n            totalBytes += bytesRead;\n\n            // Check for termination marker in the accumulated buffer\n            var mem = memoryStream.GetBuffer().AsSpan(0, totalBytes);\n            int markerIndex = mem.IndexOf(_terminationBytes);\n            if (markerIndex \u003e= 0)\n            {\n                // Found marker, extract message up to marker\n                var messageBytes = mem.Slice(0, markerIndex);\n                return Encoding.UTF8.GetString(messageBytes);\n            }\n        }\n\n        // Socket closed before the marker arrived, return what was received\n        return Encoding.UTF8.GetString(memoryStream.GetReadOnlySequence());\n    }\n    finally\n    {\n        // Return the rented buffer on every exit path, including the early return\n        ArrayPool\u003cbyte\u003e.Shared.Return(buffer);\n    }\n}\n```\n\nAnd while no \"before\" is shown, here is the updated Lua code:\n```Lua\nfunction ST_received(id)\n    log(\"ST_received 1\")\n    local sock = ST_sockets[id]\n    if not sock then return end\n    sock._buffer = sock._buffer or \"\"\n    
while true do\n        local chunk, err = sock:receive(1024)\n        log(\"ST_received 2\")\n        if chunk then\n            sock._buffer = sock._buffer .. chunk\n            while true do\n                local marker_start, marker_end = sock._buffer:find(TERMINATION_MARKER, 1, true)\n                if not marker_start then break end\n                local message = sock._buffer:sub(1, marker_start - 1)\n                sock._buffer = sock._buffer:sub(marker_end + 1)\n                log(\"ST_received 3\")\n                log(ST_format(id, message:match(\"^(.-)%s*$\")))\n                -- Echo back the message with the marker\n                sock:send(message .. TERMINATION_MARKER)\n            end\n        else\n            log(\"ST_received 4\")\n            if err ~= socket.ERRORS.AGAIN then\n                log(\"ST_received 5\")\n                if err == \"disconnected\" then\n                    log(\"ST_received 6\")\n                    log(ST_format(id, err, true))\n                elseif err == socket.ERRORS.UNKNOWN_ERROR then\n                    log(\"ST_received 7\")\n                    log(ST_format(id, err, true))\n                else\n                    log(\"ST_received 8\")\n                    console:error(ST_format(id, err, true))\n                end\n                ST_stop(id)\n            end\n            return\n        end\n    end\nend\n```\n\nWith this, there'll be two benchmarks:\n1. With GUIDs like most of the previous benchmarks have been\n2. 
With a 5000 byte message\n\nWith GUIDs (36 bytes):\n\n| Requests per second | Actual requests per second | Average latency (ms) | 99th percentile latency | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ----------------------: | ---------------: | -----------: |\n|                   1 |                       1.00 |                   21 |                      40 |              121 |         100% |\n|                   2 |                       1.97 |                   19 |                      39 |              129 |         100% |\n|                   3 |                       2.95 |                   17 |                      40 |              115 |         100% |\n|                   4 |                       3.92 |                   17 |                      39 |              122 |         100% |\n|                   5 |                       4.90 |                   18 |                      39 |              109 |         100% |\n|                  10 |                       9.77 |                   18 |                      40 |              117 |         100% |\n|                  15 |                      14.65 |                   17 |                      40 |              192 |         100% |\n|                  20 |                      19.43 |                   18 |                      40 |              124 |         100% |\n|                  40 |                      32.24 |                   18 |                      40 |              126 |         100% |\n\nWith 5000 byte messages:\n\n| Requests per second | Actual requests per second | Average latency (ms) | 99th percentile latency | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ----------------------: | ---------------: | -----------: |\n|                   1 |                       1.00 |                   30 |                      40 |              235 |         100% 
|\n|                   2 |                       1.98 |                   19 |                      40 |              108 |         100% |\n|                   3 |                       2.95 |                   21 |                      39 |              152 |         100% |\n|                   4 |                       3.92 |                   20 |                      38 |              133 |         100% |\n|                   5 |                       4.90 |                   18 |                      38 |              115 |         100% |\n|                  10 |                       9.76 |                   18 |                      39 |              113 |         100% |\n|                  15 |                      14.65 |                   18 |                      39 |              143 |         100% |\n|                  20 |                      19.41 |                   18 |                      39 |              128 |         100% |\n|                  40 |                      32.25 |                   18 |                      41 |              118 |         100% |\n\nThe results are clear: at 40 RPS or lower, a 5KB payload makes little to no difference to performance compared with a 36 byte GUID.\n\n## Allocation checking\n\nThis checks, by running the Visual Studio performance profiler, whether there are any extra allocations in the server that can be dealt with. The following image is after a load test with 10 requests per second for 30 seconds:\n\n![Analysis Performance](images/AllocationAnalysisReport.jpg)\n\nBut at a glance, there doesn't seem to be much to gain without going wild with the code. \n\n## Pushing it to 11\n\nWhat about 200 RPS? Or 1000? Let's try high numbers for fun when sending GUIDs. Note that as the values are simply echoed back, this doesn't mirror real work. But it is fun in terms of focusing on the network layer. Some adjustments have been made to the load tester to get it to go faster. 
\n\n| Requests per second | Actual requests per second | Average latency (ms) | 99th percentile latency | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ----------------------: | ---------------: | -----------: |\n|                 100 |                         95 |                   19 |                      36 |            1,075 |         100% |\n|                 200 |                        181 |                   17 |                      35 |              637 |         100% |\n|                 300 |                        260 |                   16 |                      35 |              679 |         100% |\n|                 400 |                        121 |               18,081 |                  63,338 |           69,769 |          96% |\n\nNote: Like all tests, these are from cold. No ramp up in requests to warm.\n\nTurns out mGBA starts dropping connections between 300 and 400 RPS. Which as primarily an emulator with a socket server on the side, is pretty good! 
Let's see where the limit is:\n\n| Requests per second | Actual requests per second | Average latency (ms) | 99th percentile latency | Max latency (ms) | Success rate |\n| ------------------: | -------------------------: | -------------------: | ----------------------: | ---------------: | -----------: |\n|                 300 |                        260 |                   16 |                      35 |              679 |         100% |\n|                 325 |                        278 |                   16 |                      34 |            1,046 |         100% |\n|                 330 |                        282 |                   16 |                      34 |            1,042 |         100% |\n|                 335 |                        285 |                   16 |                      34 |              554 |         100% |\n|                 340 |                        283 |                   26 |                      34 |            2,136 |          99% |\n|                 350 |                        283 |                   21 |                      35 |            1,651 |          99% |\n\nIt looks like at 330 RPS we reach max requests per second at just over 280 real RPS. For clarity: the load tester sends 330 RPS, but after the test is done there's a wait for all the requests to finish, which can last longer than the test itself, leading to a lower actual RPS received. Beyond this point the errors come in as:\n\u003e An existing connection was forcibly closed by the remote host\n\nAnd the script in mGBA returns error logs:\n\n![Exceeding calls](images/mGBAExceedingLimit.jpg)\n\nSo that's a fantastic limit of 20.1k requests a minute - for now. I'm sure there's more to be done. But this scenario has:\n1. A sketchy load tester\n1. No warm up\n1. Vastly not the use case\n\nSo it won't matter too much. 
It's just fun to do 😁\n\n## Checking for a long timeout\n\nWhat if the user sends a request, which opens a socket and adds it to the socket pool, then doesn't send another request for a long time? Is a long-open, long-unused socket going to stay alive? Will it automatically close after some idle or linger timeout?\n\nSeems the answer is ambiguous:\n- [It will always be open if they're on the same machine as the connection probably won't drop](https://stackoverflow.com/questions/11712425/do-tcp-sockets-automatically-close-after-some-time-if-no-data-is-sent)\n- [The socket is open for two hours](https://stackoverflow.com/questions/1480236/does-a-tcp-socket-connection-have-a-keep-alive)\n- [It's up to the parties to work it out](https://serverfault.com/questions/338388/is-it-possible-for-a-tcp-connection-to-remain-open-when-the-client-has-disconnec)\n\nSince I'm using Windows, I'll send a request then wait a little over two hours, then send another one to see if the first (and only) socket in the socket pool will work. Side note: it might also be useful to have a full re-creation of the socket in some circumstances anyway.\n\nSo after 3 hours and 20 minutes, the second request was happily sent. 🤷‍♀️ Regardless, I added some new socket exception catches which simply recreate the socket instance as part of the retries.\n\n# Wrapping up\n\nSo what was learned?\n\n1. Exiting a process in different ways leads to different ways sockets are cleaned up\n    1. This includes the different ways to close with Visual Studio during debugging\n    1. Ctrl + C is the cleanest in all cases\n1. There's significant overhead in setting up a new socket connection compared with sending on an already established connection\n1. Even flimsy load testers are hard to get remotely correct\n1. Pooling is great\n    1. Socket pooling via Microsoft.Extensions.ObjectPool\n    1. Buffer pooling via ArrayPool\n    1. MemoryStream pooling via Microsoft.IO.RecyclableMemoryStream\n1. 
There can be a stampede problem when creating new instances of a socket, something to do with the same ports being taken across different instances it seems?\n1. `Socket.Connected` isn't a live representation; it's whatever the last status was of the socket\n1. The Nagle algorithm exists\n1. Buffer responses on both ends and use something so the other side knows how big the message is. This is because while you can check the socket for available data, ultimately TCP is a streaming protocol, meaning that while it may seem like you have received all the data, there could be more data on the way. This project solved it by having a terminating string to look for.\n1. An idle socket connection might not get disconnected\n1. Most of the work from Version 4 onwards probably represents only corner cases, but they were the most fun\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikouu%2Fmgba-lua-socket-loadtest","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnikouu%2Fmgba-lua-socket-loadtest","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnikouu%2Fmgba-lua-socket-loadtest/lists"}