https://github.com/yougov/concurrency-tests
Concurrency tests with different stacks
- Host: GitHub
- URL: https://github.com/yougov/concurrency-tests
- Owner: yougov
- License: mit
- Created: 2022-07-02T12:31:42.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2023-01-03T03:14:16.000Z (about 2 years ago)
- Last Synced: 2024-04-16T07:16:25.437Z (9 months ago)
- Language: Python
- Size: 34.1 MB
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# concurrency-tests
Concurrency tests with different stacks.

It's basically a comparison between running different stacks for gathering data
from an external service in a concurrent fashion.

Currently, the comparison is between:

1. uWSGI + gevent + Flask
2. Uvicorn + uvloop + FastAPI

There might be more stacks added in the future (like Rocket and NodeJS) just
for experimentation purposes.

# Rationale
The idea is: there's an external service that provides data, so how well does
each stack perform when doing a number of concurrent fetches of data from it?

To simulate this external service I run an Nginx image serving static JSON files
(which were previously randomly generated and are versioned under `fixtures/`),
and then we're able to make requests to Nginx to fetch the JSON data. Nginx is
very fast and consistent, so this allows us to focus more on variations between
application stacks fetching data from it, while still providing
decently-sized JSON content.

# Dependencies
To use this repo you need to have docker-compose available.
# How to test it
1. `$ make up`
2. `$ ./scripts/check-performance.py`

Take a look at the logs from docker-compose, and at the outputs from the script.
The script runs a fixed number of requests for JSON data, the same for
each stack being checked, thus exercising them under very
similar conditions.

# Findings
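For context on the timings in this section: the script's output looks like what `timeit.repeat` returns (one float per repetition), so the measurement can be sketched roughly as below. This is an assumption about the approach, not the contents of `scripts/check-performance.py`; `fetch` and `time_calls` are hypothetical names, and the `number` and `repeat` defaults are only illustrative.

```python
import timeit
import urllib.request


def fetch(url: str) -> bytes:
    """Fetch the URL once and return the raw response body."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def time_calls(func, number: int = 10, repeat: int = 3) -> list[float]:
    """Return `repeat` timings, each covering `number` consecutive calls of `func`."""
    return timeit.repeat(func, number=number, repeat=repeat)


# Hypothetical usage against one of the stacks (needs `make up` to be running):
#   print(time_calls(lambda: fetch("http://localhost:8101/data")))
```

Because every stack is hit with the same `number` and `repeat`, the resulting lists are directly comparable between stacks.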
Spoiler alert: here are some results I get when running on my computer:
```
$ ./scripts/check-performance.py
Checking http://localhost:8101/data
[0.08104367200212437, 0.09249140500105568, 0.07991218000097433]
Checking http://localhost:8102/data
[0.876144165002188, 0.8645361960006994, 0.8990603880010894]
```

The first results are from requests done to the uWSGI stack, and the second
results are from requests done to the FastAPI stack.

Now here are the timings I get from inside each handler function when gathering
data from the "external service" (Nginx):

```
concurrency-tests-uwsgi-1 | Took 0.05811190605163574 seconds to gather data
concurrency-tests-uwsgi-1 | Took 0.06990480422973633 seconds to gather data
concurrency-tests-uwsgi-1 | Took 0.05684971809387207 seconds to gather data
concurrency-tests-fastapi-1 | Took 0.6699528694152832 seconds to gather data
concurrency-tests-fastapi-1 | Took 0.6578493118286133 seconds to gather data
concurrency-tests-fastapi-1 | Took 0.6970744132995605 seconds to gather data
```

This is also relevant because it exposes the fact that the way each stack
fetches the data from the external service plays a big role in how quickly
it can respond with said data to a client.

However, subtracting the time for gathering the data from the response times, we
end up with these timings for the stack to handle the request and then respond:
* uWSGI:
  - 0.022931765950488625
  - 0.022586600771319354
  - 0.02306246190710226
* FastAPI:
  - 0.20619129558690474
  - 0.20668688417208614
  - 0.2019859747015289

So, overall, uWSGI + gevent + Flask ends up being about 9x faster than a similar
Uvicorn + uvloop + FastAPI stack.

# Updates / edits
## Edit 1: AsyncClient as a global variable
After making the AsyncClient a global object started together with the app, I
managed to get better timings from FastAPI because it avoids the latency of
setting up the client pool:

```
$ ./scripts/check-performance.py
Checking http://localhost:8101/data
[0.09677495100186206, 0.0804886539990548, 0.08643015100096818]
Checking http://localhost:8102/data
[0.27411253199898056, 0.2451330690018949, 0.2572915169985208]
```

But it's still significantly slower than uWSGI (although noticeably faster than
the previous FastAPI implementation).

## Edit 2: aiohttp as client

After adding aiohttp and making `aiohttp.ClientSession` a global session I also
got a bit more improvement:

```
$ ./scripts/check-performance.py
Checking http://localhost:8101/data
[0.10453539900117903, 0.07663203999982215, 0.08214562500143074]
Checking http://localhost:8102/data
[0.2804249090004305, 0.21994025399908423, 0.2271890380034165]
```

But it's still significantly slower than uWSGI (although a bit faster than
the previous FastAPI tests).

## Edit 3: Increasing everything

After increasing the size of the files, the number of timeit calls and the
repetitions, I got an even more dramatic difference:

```
$ ./scripts/check-performance.py
Checking http://localhost:8101/data
[2.631773353990866, 2.5597720590012614, 2.6319224660110194, 2.5842257650074316]
Checking http://localhost:8102/data
[16.15418745999341, 16.162098834989592, 16.20327517199621, 16.071328858990455]
```

This makes it clearer that the delivery of the response payload is the
big difference between those stacks. Still trying to find out why and what's
causing that.

## Edit 4: Optimizing Uvicorn as much as I can

I now made sure I was running barely anything other than the stacks on my
computer, and forced Uvicorn to run with httptools and uvloop (just in case it
wasn't using either of them), and here are my results now:

```
$ ./scripts/check-performance.py
Checking http://localhost:8101/data
[2.527190361986868, 2.5403314139985014, 2.5657907629938563, 2.5391202510072617]
Checking http://localhost:8102/data
[16.13299038200057, 16.14415543198993, 16.0113767899893, 15.997932392012444]
```

All cases are slightly better than before, for all stacks, but there's still a
huge performance difference between the stacks.

## Edit 5: Adding aiohttp server

I added aiohttp as a server to the mix, to get a grasp of how it performs,
expecting it to perform similarly to FastAPI + Uvicorn. Much to my surprise, it
not only performed way better than FastAPI + Uvicorn, but also a bit better than
uWSGI + Flask, which was my performance reference. For the results below:

* http://localhost:8103/data is aiohttp with normal asyncio loop
* http://localhost:8104/data is aiohttp with uvloop

```
$ make check-performance
python3 scripts/check-performance.py
*** Checking performance ***
Checking http://localhost:8101/data
Average: 2.6458381063324246
Timings: [2.6972674790013116, 2.6148150009976234, 2.625431838998338]
Checking http://localhost:8102/data
Average: 16.227512442007235
Timings: [16.273041088003083, 16.196980773005635, 16.212515465012984]
Checking http://localhost:8103/data
Average: 1.9147164596652146
Timings: [1.927170394003042, 1.9134009429981234, 1.9035780419944786]
Checking http://localhost:8104/data
Average: 1.843575427332932
Timings: [1.8504752360022394, 1.8394017569953576, 1.8408492890011985]
*** Checking correctness of data ***
Checking http://localhost:8101/data
http://localhost:8101/data is correct
Checking http://localhost:8102/data
http://localhost:8102/data is correct
Checking http://localhost:8103/data
http://localhost:8103/data is correct
Checking http://localhost:8104/data
http://localhost:8104/data is correct
```

Same stuff as above, but with fewer items in each response payload:

```
$ make check-performance
python3 scripts/check-performance.py
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5782370129988218
Timings: [0.5935518369951751, 0.5747218600008637, 0.5664373420004267]
Checking http://localhost:8102/data
Average: 1.7061704700026894
Timings: [1.776460016000783, 1.6743788240128197, 1.6676725699944654]
Checking http://localhost:8103/data
Average: 0.3183802133329057
Timings: [0.33034264300658833, 0.3127249029930681, 0.3120730939990608]
Checking http://localhost:8104/data
Average: 0.28490452833163243
Timings: [0.2977750389982248, 0.2825904119963525, 0.27434813400032]
*** Checking correctness of data ***
Checking http://localhost:8101/data
http://localhost:8101/data is correct
Checking http://localhost:8102/data
http://localhost:8102/data is correct
Checking http://localhost:8103/data
http://localhost:8103/data is correct
Checking http://localhost:8104/data
http://localhost:8104/data is correct
```

This gives an even more pronounced performance advantage for aiohttp.

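To put a number on that, the ratios can be worked out directly from the averages in the two runs above. This is just a small sketch: `speedup` is a helper name made up for this illustration, and the averages are copied from the outputs above.

```python
# Averages (seconds) copied from the two Edit 5 runs above.
LARGE_PAYLOAD = {
    "uwsgi_flask": 2.6458381063324246,     # port 8101
    "aiohttp": 1.9147164596652146,         # port 8103
    "aiohttp_uvloop": 1.843575427332932,   # port 8104
}
SMALL_PAYLOAD = {
    "uwsgi_flask": 0.5782370129988218,
    "aiohttp": 0.3183802133329057,
    "aiohttp_uvloop": 0.28490452833163243,
}


def speedup(averages: dict[str, float], baseline: str = "uwsgi_flask") -> dict[str, float]:
    """How many times faster each stack is than the baseline (baseline / stack)."""
    base = averages[baseline]
    return {name: base / value for name, value in averages.items() if name != baseline}


large = speedup(LARGE_PAYLOAD)
small = speedup(SMALL_PAYLOAD)
for name in large:
    print(f"{name}: {large[name]:.2f}x (large payload) vs {small[name]:.2f}x (small payload)")
```

With these numbers, aiohttp with uvloop goes from roughly 1.4x faster than uWSGI + Flask on the large payload to roughly 2x faster on the smaller one.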
## Edit 6: FastAPI + Hypercorn
Just to check if the performance issue could be with Uvicorn, I also added
Hypercorn to the mix. For the results below:

* http://localhost:8101/data is uWSGI + Flask + gevent
* http://localhost:8102/data is uvloop + FastAPI + Uvicorn
* http://localhost:8103/data is uvloop + aiohttp
* http://localhost:8104/data is uvloop + FastAPI + Hypercorn

Here are the results:

```
$ make check-performance
python3 scripts/check-performance.py
*** Checking correctness of data ***
Checking http://localhost:8101/data
http://localhost:8101/data is correct
Checking http://localhost:8102/data
http://localhost:8102/data is correct
Checking http://localhost:8103/data
http://localhost:8103/data is correct
Checking http://localhost:8104/data
http://localhost:8104/data is correct
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5557943843353618
Timings: [0.5642758260073606, 0.5538909169990802, 0.5492164099996444]
Checking http://localhost:8102/data
Average: 1.678230069000468
Timings: [1.7020821720070671, 1.6667501029878622, 1.6658579320064746]
Checking http://localhost:8103/data
Average: 0.2670309113357992
Timings: [0.26443595500313677, 0.26453293100348674, 0.27212384800077416]
Checking http://localhost:8104/data
Average: 1.7436538100009784
Timings: [1.7575230410002405, 1.7376779810001608, 1.735760408002534]
```

So Hypercorn performed slightly worse than Uvicorn. And there seems to be
something being done in FastAPI that makes it slower than the other stacks,
maybe something that I could simplify in the test app.

## Edit 7: Running on a VM in a remote server

I ran all that stuff, with the latest changes, on a VM that's running on a
remote server, just to check if there would be any surprises when comparing to
them running on my computer. The VM is listed as having only 1 logical core, and
it has 4GB of RAM in total. The tests were run with barely anything else
running, with the same codebase state as the previous test:

```
$ make check-performance
python3 scripts/check-performance.py
*** Checking correctness of data ***
Checking http://localhost:8101/data
http://localhost:8101/data is correct
Checking http://localhost:8102/data
http://localhost:8102/data is correct
Checking http://localhost:8103/data
http://localhost:8103/data is correct
Checking http://localhost:8104/data
http://localhost:8104/data is correct
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.613248768573006
Timings: [0.6303591337054968, 0.6183538045734167, 0.5910333674401045]
Checking http://localhost:8102/data
Average: 1.4408962487553556
Timings: [1.454364343546331, 1.4309169836342335, 1.4374074190855026]
Checking http://localhost:8103/data
Average: 0.2993548788751165
Timings: [0.2810085276141763, 0.2922225706279278, 0.32483353838324547]
Checking http://localhost:8104/data
Average: 1.4772176807746291
Timings: [1.490590337663889, 1.4459769548848271, 1.4950857497751713]
```

In that VM FastAPI did run a bit faster than on my computer, and uWSGI and
aiohttp ran a bit slower, but there's still a significant difference in
performance between FastAPI and the other options. uWSGI is at least twice as
fast, and aiohttp is even faster.

## Edit 8: Adding Sanic

Just to test yet another asyncio-based framework, I added Sanic to the mix.
Here are the results, with port 8105 being Sanic served directly and 8106 served
behind Uvicorn:

```
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5595273966667568
Timings: [0.595685119999871, 0.5261793409999882, 0.556717729000411]
Checking http://localhost:8102/data
Average: 1.7249857080000766
Timings: [1.763092103999952, 1.712813942000139, 1.6990510780001387]
Checking http://localhost:8103/data
Average: 0.24711097800006124
Timings: [0.2507987750000211, 0.25464237300002424, 0.23589178600013838]
Checking http://localhost:8104/data
Average: 1.7103904253334197
Timings: [1.7158350830000018, 1.7082126370000879, 1.7071235560001696]
Checking http://localhost:8105/data
Average: 0.25828098299992536
Timings: [0.2605192519999946, 0.2685947129998567, 0.24572898399992482]
Checking http://localhost:8106/data
Average: 0.2449119503333653
Timings: [0.24571987999979683, 0.24528555300003063, 0.2437304180002684]
```

So here we have a very interesting situation: Sanic is on par with aiohttp as
the fastest of them all, and even a tiny bit faster when running behind Uvicorn.
This makes me conclude that it's not Uvicorn that makes FastAPI so slow to
respond in our scenario here.

## Edit 9: Pure Starlette

I've just tested pure Starlette. Here are the results for it running behind
Uvicorn:

```
Checking http://localhost:8107/data
Average: 0.24250185233358934
Timings: [0.2636999240003206, 0.23142165999979625, 0.23238397300065117]
```

So indeed there's something extra being done by FastAPI that makes responses
much slower, because it's based on Starlette, which on its own has
much better performance.

## Edit 10: Returning raw Response instance

As I suspected, the performance problem in FastAPI lies in the way it handles
the data being returned in the response. It does a fair amount of inspection of
the values, probably to get the OpenAPI stuff right, and this imposes a
considerable performance hit.

Here are the same results as before, but this time returning from the request
handler with a bare Response instance, thus manually encoding the response
payload (FastAPI is ports 8102 and 8104 below):

```
python3 scripts/check-performance.py
*** Checking correctness of data ***
Checking http://localhost:8101/data
http://localhost:8101/data is correct
Checking http://localhost:8102/data
http://localhost:8102/data is correct
Checking http://localhost:8103/data
http://localhost:8103/data is correct
Checking http://localhost:8104/data
http://localhost:8104/data is correct
Checking http://localhost:8105/data
http://localhost:8105/data is correct
Checking http://localhost:8106/data
http://localhost:8106/data is correct
Checking http://localhost:8107/data
http://localhost:8107/data is correct
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5492803453331968
Timings: [0.5468588979993001, 0.5586922540005617, 0.5422898839997288]
Checking http://localhost:8102/data
Average: 0.24980172600013853
Timings: [0.29054267300034553, 0.22729272800006584, 0.23156977700000425]
Checking http://localhost:8103/data
Average: 0.25109697366648714
Timings: [0.25375061399972765, 0.24550750499929563, 0.2540328020004381]
Checking http://localhost:8104/data
Average: 0.2413024363331715
Timings: [0.26002944899937575, 0.2304424169997219, 0.23343544300041685]
Checking http://localhost:8105/data
Average: 0.2484863879999466
Timings: [0.23881834100029664, 0.25914031399952364, 0.24750050900001952]
Checking http://localhost:8106/data
Average: 0.2281213440001011
Timings: [0.23330932399949234, 0.21634259300026315, 0.23471211500054778]
Checking http://localhost:8107/data
Average: 0.2612281996665236
Timings: [0.2642462539997723, 0.2647721109997292, 0.2546662340000694]
```

Now FastAPI is on par with the other asyncio-based frameworks!

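The essence of that change is that the payload is serialized once, by hand, into bytes and handed over without any further inspection. Below is a framework-free sketch of the same idea as a bare ASGI application (the interface that Uvicorn and Hypercorn serve). It is not the repo's FastAPI handler: `PAYLOAD` is made-up stand-in data, and `call_app` is a hypothetical helper that drives the app once without a server.

```python
import asyncio
import json

PAYLOAD = [{"id": 1, "name": "example"}]  # hypothetical stand-in for the gathered data


async def app(scope, receive, send):
    """Bare ASGI app: encode the payload by hand and send the raw bytes."""
    if scope["type"] != "http":
        return  # this sketch only handles plain HTTP requests
    body = json.dumps(PAYLOAD).encode("utf-8")  # the only serialization step
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [
            (b"content-type", b"application/json"),
            (b"content-length", str(len(body)).encode("ascii")),
        ],
    })
    await send({"type": "http.response.body", "body": body})


async def call_app():
    """Call the app once in-process and collect the messages it sends."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/data"}, receive, send)
    return sent


messages = asyncio.run(call_app())
print(messages[0]["status"], len(messages[1]["body"]), "bytes")
```

In FastAPI itself, the equivalent is to return a `Response` built with the already-encoded `content` and `media_type="application/json"` from the handler, instead of returning the Python objects.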
## Edit 11: aiohttp under Gunicorn and under uWSGI
Just for the sake of science (not really, but...), I also added two new takes on
aiohttp: one running under Gunicorn (port 8108 below) and one running under
uWSGI (port 8109 below).

```
Checking http://localhost:8101/data
Average: 0.5857285166663738
Timings: [0.6075965429918142, 0.5707378310034983, 0.5788511760038091]
Checking http://localhost:8102/data
Average: 0.2693211586689965
Timings: [0.26962231600191444, 0.2656774749921169, 0.272663685012958]
Checking http://localhost:8103/data
Average: 0.27253545600008994
Timings: [0.27471838300698437, 0.270809723995626, 0.2720782609976595]
Checking http://localhost:8104/data
Average: 0.2662247866683174
Timings: [0.2537858319992665, 0.27199866699811537, 0.2728898610075703]
Checking http://localhost:8105/data
Average: 0.2634680923365522
Timings: [0.2701641309977276, 0.2661690460081445, 0.2540711000037845]
Checking http://localhost:8106/data
Average: 0.2536975643306505
Timings: [0.26195553899742663, 0.24524874400231056, 0.2538884099922143]
Checking http://localhost:8107/data
Average: 0.24046570600088066
Timings: [0.22387974899902474, 0.2442926579969935, 0.2532247110066237]
Checking http://localhost:8108/data
Average: 0.29463179100033204
Timings: [0.33611946800374426, 0.26753793899843004, 0.2802379659988219]
Checking http://localhost:8109/data
Average: 0.29934803033150575
Timings: [0.30769793200306594, 0.2901989489910193, 0.3001472100004321]
```

Both Gunicorn and uWSGI add a bit of latency to aiohttp, but the benefit is
being able to run multiple processes for the app.

## Edit 12: Added actix-web (Rust) to the mix
Now running Rust at port 8110:
```
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5860108939911394
Timings: [0.5971241169900168, 0.5796863669966115, 0.58122219798679]
Checking http://localhost:8102/data
Average: 0.2737202873346784
Timings: [0.29462976500508375, 0.2492477229970973, 0.2772833740018541]
Checking http://localhost:8103/data
Average: 0.27776014067057986
Timings: [0.2852721760136774, 0.2723720880021574, 0.2756361579959048]
Checking http://localhost:8104/data
Average: 0.2716589259992664
Timings: [0.285364251001738, 0.2671622249908978, 0.2624503020051634]
Checking http://localhost:8105/data
Average: 0.26566417899933487
Timings: [0.2814665879996028, 0.2480908779980382, 0.2674350710003637]
Checking http://localhost:8106/data
Average: 0.25691093132869963
Timings: [0.2616907409974374, 0.24337856299825944, 0.265663489990402]
Checking http://localhost:8107/data
Average: 0.25629132133326493
Timings: [0.2749351180100348, 0.2427282979915617, 0.2512105479981983]
Checking http://localhost:8108/data
Average: 0.3147027926655331
Timings: [0.3657093520014314, 0.3020647559897043, 0.2763342700054636]
Checking http://localhost:8109/data
Average: 0.36034615100167383
Timings: [0.4017361610021908, 0.31601028400473297, 0.3632920079980977]
Checking http://localhost:8110/data
Average: 0.17194515833398327
Timings: [0.1882213880016934, 0.16158169400296174, 0.16603239299729466]
```

As expected, Rust is much faster than the alternatives.

## Edit 13: Added Go(lang) with Gin
Now trying with a Go-based stack using Gin running at port 8111:
```
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5495626856666908
Timings: [0.5694074610000825, 0.5435743589998765, 0.5357062370001131]
Checking http://localhost:8102/data
Average: 0.24419921433339672
Timings: [0.25756709500001307, 0.23248478499999692, 0.24254576300018016]
Checking http://localhost:8103/data
Average: 0.2477082216666986
Timings: [0.2562589789999947, 0.23727569700008644, 0.2495899890000146]
Checking http://localhost:8104/data
Average: 0.23632258333335207
Timings: [0.24638180900001316, 0.24029556599998614, 0.22229037500005688]
Checking http://localhost:8105/data
Average: 0.24613552099989042
Timings: [0.25480718599987995, 0.24335959399991225, 0.24023978299987903]
Checking http://localhost:8106/data
Average: 0.23579079066659384
Timings: [0.2582009159998506, 0.21576110699993478, 0.23341034899999613]
Checking http://localhost:8107/data
Average: 0.23948402699988947
Timings: [0.24210006199996315, 0.22951583099984418, 0.24683618799986107]
Checking http://localhost:8108/data
Average: 0.3048614176667191
Timings: [0.32362793099991904, 0.3169696790000671, 0.27398664300017117]
Checking http://localhost:8109/data
Average: 0.2988595006666704
Timings: [0.3288502520001657, 0.30709647799994855, 0.26063177199989696]
Checking http://localhost:8110/data
Average: 0.14635076766671773
Timings: [0.15750972499995441, 0.13875549300018974, 0.14278708500000903]
Checking http://localhost:8111/data
Average: 0.2981659416666389
Timings: [0.2914139739998518, 0.2995065280001654, 0.3035773229998995]
```

I spent a considerable amount of time trying to extract the most I could from
the stack, even using Fiber and 3 different JSON codecs, and it still doesn't
perform nearly as well as I expected.

This is probably due to my ignorance of Go (I'm still studying it), but it
disappoints me a bit that I can't get it to perform at least as well as the
majority of the Python stacks I tried. Hopefully once I know the language
better I can optimize this test to perform better.

## Edit 14: Revamped Go code for performance

I found out that unmarshalling JSON content into a loose map structure was
super slow (possibly because of the amount of type guessing work it has to do).
So moving to a more predictable structure brought a significant performance
boost, and moving from the standard JSON library to go-json brought yet another
significant boost. We're now pretty close to Rust, which is good enough for me!

```
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.553617372000493
Timings: [0.5674350379995303, 0.5613493280034163, 0.5320677499985322]
Checking http://localhost:8102/data
Average: 0.23834095499963345
Timings: [0.23988015700160759, 0.2288304079993395, 0.24631229999795323]
Checking http://localhost:8103/data
Average: 0.23678846999731226
Timings: [0.24551372599671595, 0.2263299409969477, 0.2385217429982731]
Checking http://localhost:8104/data
Average: 0.24732621599832783
Timings: [0.25520781699742656, 0.2529781319972244, 0.2337926990003325]
Checking http://localhost:8105/data
Average: 0.24024959233065601
Timings: [0.2405069949963945, 0.23926370199478697, 0.24097808000078658]
Checking http://localhost:8106/data
Average: 0.23625101533495277
Timings: [0.2629590620053932, 0.21419178799988003, 0.2316021959995851]
Checking http://localhost:8107/data
Average: 0.24613324599825623
Timings: [0.2750477640001918, 0.22352952899382217, 0.23982244500075467]
Checking http://localhost:8108/data
Average: 0.29476479533332167
Timings: [0.29121846300404286, 0.3508308919990668, 0.24224503099685535]
Checking http://localhost:8109/data
Average: 0.32395537200015195
Timings: [0.3427479570018477, 0.34962768200057326, 0.27949047699803486]
Checking http://localhost:8110/data
Average: 0.15523099167088125
Timings: [0.16314853800577112, 0.15223667600366753, 0.1503077610032051]
Checking http://localhost:8111/data
Average: 0.18434059433153985
Timings: [0.1830926440015901, 0.18587364899576642, 0.18405548999726307]
```

## Edit 15: Added Robyn to the mix
Now adding Robyn (a framework that mixes Rust and Python) at port 8112:
```
*** Checking performance ***
Checking http://localhost:8101/data
Average: 0.5350300939996183
Timings: [0.5582880990004924, 0.5265497099971981, 0.5202524730011646]
Checking http://localhost:8102/data
Average: 0.24271579499933674
Timings: [0.24495556100009708, 0.2526484519985388, 0.23054337199937436]
Checking http://localhost:8103/data
Average: 0.25248815933446167
Timings: [0.25919191900175065, 0.2413476450019516, 0.25692491399968276]
Checking http://localhost:8104/data
Average: 0.24734792766685132
Timings: [0.2693825160022243, 0.24649172099816496, 0.2261695460001647]
Checking http://localhost:8105/data
Average: 0.2456523129985726
Timings: [0.26370377499915776, 0.2273315120000916, 0.24592165199646843]
Checking http://localhost:8106/data
Average: 0.24210689533235077
Timings: [0.26894589699804783, 0.21857765299864695, 0.23879713600035757]
Checking http://localhost:8107/data
Average: 0.2351113026670646
Timings: [0.2527544670010684, 0.2214626050008519, 0.23111683599927346]
Checking http://localhost:8108/data
Average: 0.34097923933222773
Timings: [0.39432509600010235, 0.3378925779979909, 0.29072004399858997]
Checking http://localhost:8109/data
Average: 0.3210810686662929
Timings: [0.30726524599958793, 0.37055543399765156, 0.2854225260016392]
Checking http://localhost:8110/data
Average: 0.1509053493330915
Timings: [0.16917408899826114, 0.1430862680026621, 0.1404556909983512]
Checking http://localhost:8111/data
Average: 0.20303179466645815
Timings: [0.1912486399996851, 0.22394528099903255, 0.1939014630006568]
Checking http://localhost:8112/data
Average: 0.21061507800065252
Timings: [0.21572374300012598, 0.20388376499977312, 0.21223772600205848]
```

About 10% faster than the fastest of the other Python stacks, which is pretty
good, and it got very close to Go's performance. Very promising stuff!

But it's still not even close to Rust, and the framework is still very new, with
missing features and no support for list serialization (I had to force the
handler function to return a dict).
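
As a sanity check on those comparisons, the percentages can be recomputed from the averages in the run above. `percent_less_time` is a name made up for this sketch; the averages are copied from the Edit 15 output, using Starlette behind Uvicorn (port 8107) as the fastest of the other Python stacks in that run.

```python
# Averages (seconds) copied from the Edit 15 run above.
AVERAGES = {
    "starlette_uvicorn": 0.2351113026670646,  # port 8107
    "rust_actix_web": 0.1509053493330915,     # port 8110
    "go_gin": 0.20303179466645815,            # port 8111
    "robyn": 0.21061507800065252,             # port 8112
}


def percent_less_time(candidate: float, reference: float) -> float:
    """Percentage by which `candidate` takes less time than `reference`.

    A negative result means the candidate is slower than the reference.
    """
    return (reference - candidate) / reference * 100


robyn = AVERAGES["robyn"]
vs_python = percent_less_time(robyn, AVERAGES["starlette_uvicorn"])
vs_go = percent_less_time(robyn, AVERAGES["go_gin"])
vs_rust = percent_less_time(robyn, AVERAGES["rust_actix_web"])

print(f"Robyn vs Starlette + Uvicorn: {vs_python:+.1f}%")
print(f"Robyn vs Go (Gin):            {vs_go:+.1f}%")
print(f"Robyn vs Rust (actix-web):    {vs_rust:+.1f}%")
```

Robyn takes roughly 10% less time than Starlette here, a few percent more than Go, and close to 40% more than Rust.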