{"id":15036092,"url":"https://github.com/projectphysx/fluidx3d","last_synced_at":"2025-05-13T23:09:24.368Z","repository":{"id":53210109,"uuid":"521191759","full_name":"ProjectPhysX/FluidX3D","owner":"ProjectPhysX","description":"The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.","archived":false,"fork":false,"pushed_at":"2025-05-13T04:49:17.000Z","size":22019,"stargazers_count":4416,"open_issues_count":30,"forks_count":382,"subscribers_count":62,"default_branch":"master","last_synced_at":"2025-05-13T05:29:25.241Z","etag":null,"topics":["benchmark","cfd","computational-fluid-dynamics","fluid-dynamics","fluid-simulation","fluid-solver","gpgpu","gpu","gpu-computing","high-performance-computing","hpc","interactive-visualization","lattice-boltzmann","lbm","opencl","physics","raytracing","scientific-computing","scientific-visualization","simulation"],"latest_commit_sha":null,"homepage":"https://youtube.com/@ProjectPhysX","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ProjectPhysX.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2022-08-04T08:49:44.000Z","updated_at":"2025-05-13T04:49:20.000Z","dependencies_parsed_at":"2023-10-03T12:19:55.714Z","dependency_job_id":"e5989ff1-6bd7-40b8-86bf-2d522c86fcc5","html_url":"https://github.com/ProjectPhysX/FluidX3D","commit_stats":{"total_commits":260,"total_committers":2,"mean_commits":130.0,"dds":"0.46923076923076923","last_synced_commit":"58ca271f2e91256f63fd0b24
204cec874e713cfd"},"previous_names":[],"tags_count":29,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProjectPhysX%2FFluidX3D","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProjectPhysX%2FFluidX3D/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProjectPhysX%2FFluidX3D/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ProjectPhysX%2FFluidX3D/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ProjectPhysX","download_url":"https://codeload.github.com/ProjectPhysX/FluidX3D/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254042080,"owners_count":22004839,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","cfd","computational-fluid-dynamics","fluid-dynamics","fluid-simulation","fluid-solver","gpgpu","gpu","gpu-computing","high-performance-computing","hpc","interactive-visualization","lattice-boltzmann","lbm","opencl","physics","raytracing","scientific-computing","scientific-visualization","simulation"],"created_at":"2024-09-24T20:30:07.002Z","updated_at":"2025-05-13T23:09:19.354Z","avatar_url":"https://github.com/ProjectPhysX.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# FluidX3D\n\nThe fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via [OpenCL](https://github.com/ProjectPhysX/OpenCL-Wrapper \"OpenCL-Wrapper\"). 
Free for non-commercial use.\n\n\u003ca href=\"https://youtu.be/-MkRBeQkLk8\"\u003e\u003cimg src=\"https://img.youtube.com/vi/o3TPN142HxM/maxresdefault.jpg\" width=\"50%\"\u003e\u003c/img\u003e\u003c/a\u003e\u003ca href=\"https://youtu.be/oC6U1M0Fsug\"\u003e\u003cimg src=\"https://img.youtube.com/vi/oC6U1M0Fsug/maxresdefault.jpg\" width=\"50%\"\u003e\u003c/img\u003e\u003c/a\u003e\u003cbr\u003e\n\u003ca href=\"https://youtu.be/XOfXHgP4jnQ\"\u003e\u003cimg src=\"https://img.youtube.com/vi/XOfXHgP4jnQ/maxresdefault.jpg\" width=\"50%\"\u003e\u003c/img\u003e\u003c/a\u003e\u003ca href=\"https://youtu.be/K5eKxzklXDA\"\u003e\u003cimg src=\"https://img.youtube.com/vi/K5eKxzklXDA/maxresdefault.jpg\" width=\"50%\"\u003e\u003c/img\u003e\u003c/a\u003e\n(click on images to show videos on YouTube)\n\n\u003cdetails\u003e\u003csummary\u003eUpdate History\u003c/summary\u003e\n\n- [v1.0](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.0) (04.08.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/commit/768073501af725e392a4b85885009e2fa6400e48) (public release)\n  - public release\n- [v1.1](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.1) (29.09.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.0...v1.1) (GPU voxelization)\n  - added solid voxelization on GPU (slow algorithm)\n  - added tool to print current camera position (key \u003ckbd\u003eG\u003c/kbd\u003e)\n  - minor bug fix (workaround for Intel iGPU driver bug with triangle rendering)\n- [v1.2](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.2) (24.10.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.1...v1.2) (force/torque compuatation)\n  - added functions to compute force/torque on objects\n  - added function to translate Mesh\n  - added Stokes drag validation setup\n- [v1.3](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.3) (10.11.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.2...v1.3) (minor bug fixes)\n  - added unit 
conversion functions for torque\n  - `FORCE_FIELD` and `VOLUME_FORCE` can now be used independently\n  - minor bug fix (workaround for AMD legacy driver bug with binary number literals)\n- [v1.4](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v1.4) (14.12.2022) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.3...v1.4) (Linux graphics)\n  - complete rewrite of C++ graphics library to minimize API dependencies\n  - added interactive graphics mode on Linux with X11\n  - fixed streamline visualization bug in 2D\n- [v2.0](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.0) (09.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v1.4...v2.0) (multi-GPU upgrade)\n  - added (cross-vendor) multi-GPU support on a single node (PC/laptop/server)\n- [v2.1](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.1) (15.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.0...v2.1) (fast voxelization)\n  - made solid voxelization on GPU lightning fast (new algorithm, from minutes to milliseconds)\n- [v2.2](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.0) (20.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.1...v2.2) (velocity voxelization)\n  - added option to voxelize moving/rotating geometry on GPU, with automatic velocity initialization for each grid point based on center of rotation, linear velocity and rotational velocity\n  - cells that are converted from solid-\u003efluid during re-voxelization now have their DDFs properly initialized\n  - added option to not auto-scale mesh during `read_stl(...)`, with negative `size` parameter\n  - added kernel for solid boundary rendering with marching-cubes\n- [v2.3](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.3) (30.01.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.2...v2.3) (particles)\n  - added particles with immersed-boundary method (either passive or 2-way-coupled, only supported with single-GPU)\n  - 
minor optimization to GPU voxelization algorithm (workgroup threads outside mesh bounding-box return after ray-mesh intersections have been found)\n  - displayed GPU memory allocation size is now fully accurate\n  - fixed bug in `write_line()` function in `src/utilities.hpp`\n  - removed `.exe` file extension for Linux/macOS\n- [v2.4](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.4) (11.03.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.3...v2.4) (UI improvements)\n  - added a help menu with key \u003ckbd\u003eH\u003c/kbd\u003e that shows keyboard/mouse controls, visualization settings and simulation stats\n  - improvements to keyboard/mouse control (\u003ckbd\u003e+\u003c/kbd\u003e/\u003ckbd\u003e-\u003c/kbd\u003e for zoom, \u003ckbd\u003emouseclick\u003c/kbd\u003e frees/locks cursor)\n  - added suggestion of largest possible grid resolution if resolution is set larger than memory allows\n  - minor optimizations in multi-GPU communication (insignificant performance difference)\n  - fixed bug in temperature equilibrium function for temperature extension\n  - fixed erroneous double literal for Intel iGPUs in skybox color functions\n  - fixed bug in make.sh where multi-GPU device IDs would not get forwarded to the executable\n  - minor bug fixes in graphics engine (free cursor not centered during rotation, labels in VR mode)\n  - fixed bug in `LBM::voxelize_stl()` size parameter standard initialization\n- [v2.5](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.5) (11.04.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.4...v2.5) (raytracing overhaul)\n  - implemented light absorption in fluid for raytracing graphics (no performance impact)\n  - improved raytracing framerate when camera is inside fluid\n  - fixed skybox pole flickering artifacts\n  - fixed bug where moving objects during re-voxelization would leave an erroneous trail of solid grid cells behind\n- 
[v2.6](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.6) (16.04.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.5...v2.6) (Intel Arc patch)\n  - patched OpenCL issues of Intel Arc GPUs: now VRAM allocations \u003e4GB are possible and correct VRAM capacity is reported\n- [v2.7](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.7) (29.05.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.6...v2.7) (visualization upgrade)\n  - added slice visualization (key \u003ckbd\u003e2\u003c/kbd\u003e / key \u003ckbd\u003e3\u003c/kbd\u003e modes, then switch through slice modes with key \u003ckbd\u003eT\u003c/kbd\u003e, move slice with keys \u003ckbd\u003eQ\u003c/kbd\u003e/\u003ckbd\u003eE\u003c/kbd\u003e)\n  - made flag wireframe / solid surface visualization kernels toggleable with key \u003ckbd\u003e1\u003c/kbd\u003e\n  - added surface pressure visualization (key \u003ckbd\u003e1\u003c/kbd\u003e when `FORCE_FIELD` is enabled and `lbm.calculate_force_on_boundaries();` is called)\n  - added binary `.vtk` export function for meshes with `lbm.write_mesh_to_vtk(Mesh* mesh);`\n  - added `time_step_multiplicator` for `integrate_particles()` function in PARTICLES extension\n  - made correction of wrong memory reporting on Intel Arc more robust\n  - fixed bug in `write_file()` template functions\n  - reverted back to separate `cl::Context` for each OpenCL device, as the shared Context otherwise would allocate extra VRAM on all other unused Nvidia GPUs\n  - removed Debug and x86 configurations from Visual Studio solution file (one less complication for compiling)\n  - fixed bug that particles could get too close to walls and get stuck, or leave the fluid phase (added boundary force)\n- [v2.8](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.8) (24.06.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.7...v2.8) (documentation + polish)\n  - finally added more [documentation](DOCUMENTATION.md)\n  - cleaned up 
all sample setups in `setup.cpp` for more beginner-friendliness, and added required extensions in `defines.hpp` as comments to all setups\n  - improved loading of composite `.stl` geometries, by adding an option to omit automatic mesh repositioning, added more functionality to `Mesh` struct in `utilities.hpp`\n  - added `uint3 resolution(float3 box_aspect_ratio, uint memory)` function to compute simulation box resolution based on box aspect ratio and VRAM occupation in MB\n  - added `bool lbm.graphics.next_frame(...)` function to export images for a specified video length in the `main_setup` compute loop\n  - added `VIS_...` macros to ease setting visualization modes in headless graphics mode in `lbm.graphics.visualization_modes`\n  - simulation box dimensions are now automatically made equally divisible by domains for multi-GPU simulations\n  - fixed Info/Warning/Error message formatting for loading files and made Info/Warning/Error message labels colored\n  - added Ahmed body setup as an example on how body forces and drag coefficient are computed\n  - added Cessna 172 and Bell 222 setups to showcase loading composite .stl geometries and revoxelization of moving parts\n  - added optional semi-transparent rendering mode (`#define GRAPHICS_TRANSPARENCY 0.7f` in `defines.hpp`)\n  - fixed flickering of streamline visualization in interactive graphics\n  - improved smooth positioning of streamlines in slice mode\n  - fixed bug where `mass` and `massex` in `SURFACE` extension were also allocated in CPU RAM (not required)\n  - fixed bug in Q-criterion rendering of halo data in multi-GPU mode, reduced gap width between domains\n  - removed shared memory optimization from mesh voxelization kernel, as it crashes on Nvidia GPUs with new GPU drivers and is incompatible with old OpenCL 1.0 GPUs\n  - fixed raytracing attenuation color when no surface is at the simulation box walls with periodic boundaries\n- [v2.9](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.9) 
(31.07.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.8...v2.9) (multithreading)\n  - added cross-platform `parallel_for` implementation in `utilities.hpp` using `std::thread`\n  - significantly (\u003e4x) faster simulation startup with multithreaded geometry initialization and sanity checks\n  - faster `calculate_force_on_object()` and `calculate_torque_on_object()` functions with multithreading\n  - added total runtime and LBM runtime to `lbm.write_status()`\n  - fixed bug in voxelization ray direction for re-voxelizing rotating objects\n  - fixed bug in `Mesh::get_bounding_box_size()`\n  - fixed bug in `print_message()` function in `utilities.hpp`\n- [v2.10](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.10) (05.11.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.9...v2.10) (frustum culling)\n  - improved rasterization performance via frustum culling when only part of the simulation box is visible\n  - improved switching between centered/free camera mode\n  - refactored OpenCL rendering library\n  - unit conversion factors are now automatically printed in console when `units.set_m_kg_s(...)` is used\n  - faster startup time for FluidX3D benchmark\n  - minor bug fix in `voxelize_mesh(...)` kernel\n  - fixed bug in `shading(...)`\n  - replaced slow (in multithreading) `std::rand()` function with standard C99 LCG\n  - more robust correction of wrong VRAM capacity reporting on Intel Arc GPUs\n  - fixed some minor compiler warnings\n- [v2.11](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.11) (07.12.2023) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.10...v2.11) (improved Linux graphics)\n  - interactive graphics on Linux are now in fullscreen mode too, fully matching Windows\n  - made CPU/GPU buffer initialization significantly faster with `std::fill` and `enqueueFillBuffer` (overall ~8% faster simulation startup)\n  - added operating system info to OpenCL device driver version printout\n 
 - fixed flickering with frustrum culling at very small field of view\n  - fixed bug where rendered/exported frame was not updated when `visualization_modes` changed\n- [v2.12](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.12) (18.01.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.11...v2.12) (faster startup)\n  - ~3x faster source code compiling on Linux using multiple CPU cores if [`make`](https://www.gnu.org/software/make/) is installed\n  - significantly faster simulation initialization (~40% single-GPU, ~15% multi-GPU)\n  - minor bug fix in `Memory_Container::reset()` function\n- [v2.13](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.13) (11.02.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.12...v2.13) (improved .vtk export)\n  - data in exported `.vtk` files is now automatically converted to SI units\n  - ~2x faster `.vtk` export with multithreading\n  - added unit conversion functions for `TEMPERATURE` extension\n  - fixed graphical artifacts with axis-aligned camera in raytracing\n  - fixed `get_exe_path()` for macOS\n  - fixed X11 multi-monitor issues on Linux\n  - workaround for Nvidia driver bug: `enqueueFillBuffer` is broken for large buffers on Nvidia GPUs\n  - fixed slow numeric drift issues caused by `-cl-fast-relaxed-math`\n  - fixed wrong Maximum Allocation Size reporting in `LBM::write_status()`\n  - fixed missing scaling of coordinates to SI units in `LBM::write_mesh_to_vtk()`\n- [v2.14](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.14) (03.03.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.13...v2.14) (visualization upgrade)\n  - coloring can now be switched between velocity/density/temperature with key \u003ckbd\u003eZ\u003c/kbd\u003e\n  - uniform improved color palettes for velocity/density/temperature visualization\n  - color scale with automatic unit conversion can now be shown with key \u003ckbd\u003eH\u003c/kbd\u003e\n  - slice mode for field 
visualization now draws fully filled-in slices instead of only lines for velocity vectors\n  - shading in `VIS_FLAG_SURFACE` and `VIS_PHI_RASTERIZE` modes is smoother now\n  - `make.sh` now automatically detects operating system and X11 support on Linux and only runs FluidX3D if last compilation was successful\n  - fixed compiler warnings on Android\n  - fixed `make.sh` failing on some systems due to nonstandard interpreter path\n  - fixed that `make` would not compile with multiple cores on some systems\n- [v2.15](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.15) (09.04.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.14...v2.15) (framerate boost)\n  - eliminated one frame memory copy and one clear frame operation in rendering chain, for 20-70% higher framerate on both Windows and Linux\n  - enabled `g++` compiler optimizations for faster startup and higher rendering framerate\n  - fixed bug in multithreaded sanity checks\n  - fixed wrong unit conversion for thermal expansion coefficient\n  - fixed density to pressure conversion in LBM units\n  - fixed bug that raytracing kernel could lock up simulation\n  - fixed minor visual artifacts with raytracing\n  - fixed that console sometimes was not cleared before `INTERACTIVE_GRAPHICS_ASCII` rendering starts\n- [v2.16](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.16) (02.05.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.15...v2.16) (bug fixes)\n  - simplified 10% faster marching-cubes implementation with 1D interpolation on edges instead of 3D interpolation, allowing to get rid of edge table\n  - added faster, simplified marching-cubes variant for solid surface rendering where edges are always halfway between grid cells\n  - refactoring in OpenCL rendering kernels\n  - fixed that voxelization failed in Intel OpenCL CPU Runtime due to array out-of-bounds access\n  - fixed that voxelization did not always produce binary identical results in multi-GPU compared 
to single-GPU\n  - fixed that velocity voxelization failed for free surface simulations\n  - fixed terrible performance on ARM GPUs by macro-replacing fused-multiply-add (`fma`) with `a*b+c`\n  - fixed that \u003ckbd\u003eY\u003c/kbd\u003e/\u003ckbd\u003eZ\u003c/kbd\u003e keys were incorrect for `QWERTY` keyboard layout in Linux\n  - fixed that free camera movement speed in help overlay was not updated in stationary image when scrolling\n  - fixed that cursor would sometimes flicker when scrolling on trackpads with Linux-X11 interactive graphics\n  - fixed flickering of interactive rendering with multi-GPU when camera is not moved\n  - fixed missing `XInitThreads()` call that could crash Linux interactive graphics on some systems\n  - fixed z-fighting between `graphics_rasterize_phi()` and `graphics_flags_mc()` kernels\n- [v2.17](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.17) (05.06.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.16...v2.17) (unlimited domain resolution)\n  - domains are no longer limited to 4.29 billion (2³², 1624³) grid cells or 225 GB memory; if more are used, the OpenCL code will automatically compile with 64-bit indexing\n  - new, faster raytracing-based field visualization for single-GPU simulations\n  - added [GPU Driver and OpenCL Runtime installation instructions](DOCUMENTATION.md#0-install-gpu-drivers-and-opencl-runtime) to documentation\n  - refactored `INTERACTIVE_GRAPHICS_ASCII`\n  - fixed memory leak in destructors of `floatN`, `floatNxN`, `doubleN`, `doubleNxN` (all unused)\n  - made camera movement/rotation/zoom behavior independent of framerate\n  - fixed that `smart_device_selection()` would print a wrong warning if device reports 0 MHz clock speed\n- [v2.18](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.18) (21.07.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.17...v2.18) (more bug fixes)\n  - added support for high refresh rate monitors on Linux\n  - more compact 
OpenCL Runtime installation scripts in Documentation\n  - driver/runtime installation instructions will now be printed to console if no OpenCL devices are available\n  - added domain information to `LBM::write_status()`\n  - added `LBM::index` function for `uint3` input parameter\n  - fixed that very large simulations sometimes wouldn't render properly by increasing maximum render distance from 10k to 2.1M\n  - fixed mouse input stuttering at high screen refresh rate on Linux\n  - fixed graphical artifacts in free surface raytracing on Intel CPU Runtime for OpenCL\n  - fixed runtime estimation printed in console for setups with multiple `lbm.run(...)` calls\n  - fixed density oscillations in sample setups (too large `lbm_u`)\n  - fixed minor graphical artifacts in `raytrace_phi()`\n  - fixed minor graphical artifacts in `ray_grid_traverse_sum()`\n  - fixed wrong printed time step count on raindrop sample setup\n- [v2.19](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v2.19) (07.09.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.18...v2.19) (camera splines)\n  - the camera can now fly along a smooth path through a list of provided keyframe camera placements, [using Catmull-Rom splines](https://github.com/ProjectPhysX/FluidX3D/blob/master/DOCUMENTATION.md#video-rendering)\n  - more accurate remaining runtime estimation that includes time spent on rendering\n  - enabled FP16S memory compression by default\n  - printed camera placement using key \u003ckbd\u003eG\u003c/kbd\u003e is now formatted for easier copy/paste\n  - added benchmark chart in Readme using mermaid gantt chart\n  - placed memory allocation info during simulation startup at better location\n  - fixed threading conflict between `INTERACTIVE_GRAPHICS` and `lbm.graphics.write_frame();`\n  - fixed maximum buffer allocation size limit for AMD GPUs and in Intel CPU Runtime for OpenCL\n  - fixed wrong `Re\u003cRe_max` info printout for 2D simulations\n  - minor fix in 
`bandwidth_bytes_per_cell_device()`\n- [v3.0](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.0) (16.11.2024) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v2.19...v3.0) (larger CPU/iGPU simulations)\n  - reduced memory footprint on CPUs and iGPU from 72 to 55 Bytes/cell (fused OpenCL host+device buffers for `rho`/`u`/`flags`), allowing 31% higher resolution in the same RAM capacity\n  - faster hardware-supported and faster fallback emulation atomic floating-point addition for `PARTICLES` extension\n  - hardened `calculate_f_eq()` against bad user input for `D2Q9`\n  - fixed velocity voxelization for overlapping geometry with different velocity\n  - fixed Remaining Time printout during paused simulation\n  - fixed CPU/GPU memory printout for CPU/iGPU simulations\n- [v3.1](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.1) (08.02.2025) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v3.0...v3.1) (more bug fixes)\n  - faster `enqueueReadBuffer()` on modern CPUs with 64-Byte-aligned `host_buffer`\n  - hardened ray intersection functions against planar ray edge case\n  - updated OpenCL headers\n  - better OpenCL device specs detection using vendor ID and Nvidia compute capability\n  - better VRAM capacity reporting correction for Intel dGPUs\n  - improved styling of performance mermaid gantt chart in Readme\n  - added multi-GPU performance mermaid gantt chart in Readme\n  - updated driver install guides\n  - fixed voxelization being broken on some GPUs\n  - added workaround for compiler bug in Intel CPU Runtime for OpenCL that causes Q-criterion isosurface rendering corruption\n  - fixed TFlops estimate for Intel Battlemage GPUs\n  - fixed wrong device name reporting for AMD GPUs\n- [v3.2](https://github.com/ProjectPhysX/FluidX3D/releases/tag/v3.2) (09.03.2025) [changes](https://github.com/ProjectPhysX/FluidX3D/compare/v3.1...v3.2) (fast force/torque summation)\n  - implemented GPU-accelerated force/torque summation (~20x 
faster than CPU-multithreaded implementation before)\n  - simplified calculating object force/torque in setups\n  - improved coloring in `VIS_FIELD`/`ray_grid_traverse_sum()`\n  - updated OpenCL-Wrapper now compiles OpenCL C code with `-cl-std=CL3.0` if available\n  - fixed compiling on macOS with new OpenCL headers\n\n\u003c/details\u003e\n\n\n\n## How to get started?\n\nRead the [FluidX3D Documentation](DOCUMENTATION.md)!\n\n\n\n## Compute Features - Getting the Memory Problem under Control\n\n- \u003cdetails\u003e\u003csummary\u003e\u003ca name=\"cfd-model\"\u003e\u003c/a\u003eCFD model: lattice Boltzmann method (LBM)\u003c/summary\u003e\n\n  - streaming (part 2/2)\u003cp align=\"center\"\u003e\u003ci\u003ef\u003c/i\u003e\u003csub\u003e0\u003c/sub\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ef\u003c/i\u003e\u003csub\u003e0\u003c/sub\u003e(\u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e)\u003cbr\u003e\u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ef\u003c/i\u003e\u003csub\u003e(\u003ci\u003et\u003c/i\u003e%2 ? \u003ci\u003ei\u003c/i\u003e : (\u003ci\u003ei\u003c/i\u003e%2 ? \u003ci\u003ei\u003c/i\u003e+1 : \u003ci\u003ei\u003c/i\u003e-1))\u003c/sub\u003e(\u003ci\u003ei\u003c/i\u003e%2 ? 
\u003ci\u003ex\u003c/i\u003e : \u003ci\u003ex\u003c/i\u003e-\u003ci\u003ee\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e) \u0026nbsp; for \u0026nbsp; \u003ci\u003ei\u003c/i\u003e \u0026isin; [1, \u003ci\u003eq\u003c/i\u003e-1]\u003c/p\u003e\n  - collision\u003cp align=\"center\"\u003e\u003ci\u003e\u0026rho;\u003c/i\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) = (\u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e)) + 1\u003cbr\u003e\u003cbr\u003e\u003ci\u003eu\u003c/i\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) = \u003csup\u003e1\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003e\u0026rho;\u003c/i\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e)\u003c/sub\u003e \u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e)\u003cbr\u003e\u003cbr\u003e\u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003eeq-shifted\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ew\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e \u003ci\u003e\u0026rho;\u003c/i\u003e · (\u003csup\u003e(\u003ci\u003eu\u003c/i\u003e\u003csub\u003e°\u003c/sub\u003e\u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e)\u003csup\u003e2\u003c/sup\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e(2\u003ci\u003ec\u003c/i\u003e\u003csup\u003e4\u003c/sup\u003e)\u003c/sub\u003e - 
\u003csup\u003e(\u003ci\u003eu\u003c/i\u003e\u003csub\u003e°\u003c/sub\u003e\u003ci\u003eu\u003c/i\u003e)\u003c/sup\u003e\u0026#8725;\u003csub\u003e(2c\u003csup\u003e2\u003c/sup\u003e)\u003c/sub\u003e + \u003csup\u003e(\u003ci\u003eu\u003c/i\u003e\u003csub\u003e°\u003c/sub\u003e\u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e)\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003ec\u003c/i\u003e\u003csup\u003e2\u003c/sup\u003e\u003c/sub\u003e) + \u003ci\u003ew\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e (\u003ci\u003e\u0026rho;\u003c/i\u003e-1)\u003cbr\u003e\u003cbr\u003e\u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e+\u0026Delta;\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e) + \u003ci\u003e\u0026Omega;\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e(\u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e), \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003eeq-shifted\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e,\u003ci\u003et\u003c/i\u003e), \u003ci\u003e\u0026tau;\u003c/i\u003e)\u003c/p\u003e\n  - streaming (part 1/2)\u003cp align=\"center\"\u003e\u003ci\u003ef\u003c/i\u003e\u003csub\u003e0\u003c/sub\u003e(\u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e+\u0026Delta;\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ef\u003c/i\u003e\u003csub\u003e0\u003c/sub\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e+\u0026Delta;\u003ci\u003et\u003c/i\u003e)\u003cbr\u003e\u003ci\u003ef\u003c/i\u003e\u003csub\u003e(\u003ci\u003et\u003c/i\u003e%2 ? (\u003ci\u003ei\u003c/i\u003e%2 ? 
\u003ci\u003ei\u003c/i\u003e+1 : \u003ci\u003ei\u003c/i\u003e-1) : \u003ci\u003ei\u003c/i\u003e)\u003c/sub\u003e(\u003ci\u003ei\u003c/i\u003e%2 ? \u003ci\u003ex\u003c/i\u003e+\u003ci\u003ee\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e : \u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e+\u0026Delta;\u003ci\u003et\u003c/i\u003e) = \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003etemp\u003c/sup\u003e(\u003ci\u003ex\u003c/i\u003e, \u003ci\u003et\u003c/i\u003e+\u0026Delta;\u003ci\u003et\u003c/i\u003e) \u0026nbsp; for \u0026nbsp; \u003ci\u003ei\u003c/i\u003e \u0026isin; [1, \u003ci\u003eq\u003c/i\u003e-1]\u003c/p\u003e\n  - \u003cdetails\u003e\u003csummary\u003evariables and \u003ca href=\"https://doi.org/10.15495/EPub_UBT_00005400\"\u003enotation\u003c/a\u003e\u003c/summary\u003e\n\n    | variable             | SI units                            | defining equation                                   | description                                                                     |\n    | :------------------: | :---------------------------------: | :-------------------------------------------------: | :------------------------------------------------------------------------------ |\n    |                      |                                     |                                                     |                                                                                 |\n    | \u003ci\u003ex\u003c/i\u003e             | m                                   | \u003ci\u003ex\u003c/i\u003e = (x,y,z)\u003csup\u003eT\u003c/sup\u003e                      | 3D position in Cartesian coordinates                                            |\n    | \u003ci\u003et\u003c/i\u003e             | s                                   | -                                                   | time                                                                            |\n    | \u003ci\u003e\u0026rho;\u003c/i\u003e         | 
\u003csup\u003ekg\u003c/sup\u003e\u0026#8725;\u003csub\u003em³\u003c/sub\u003e   | \u003ci\u003e\u0026rho;\u003c/i\u003e = (\u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e)+1 | mass density of fluid                                            |\n    | \u003ci\u003ep\u003c/i\u003e             | \u003csup\u003ekg\u003c/sup\u003e\u0026#8725;\u003csub\u003em\u0026nbsp;s²\u003c/sub\u003e | \u003ci\u003ep\u003c/i\u003e = \u003ci\u003ec\u003c/i\u003e² \u003ci\u003e\u0026rho;\u003c/i\u003e              | pressure of fluid                                                               |\n    | \u003ci\u003eu\u003c/i\u003e | \u003csup\u003em\u003c/sup\u003e\u0026#8725;\u003csub\u003es\u003c/sub\u003e | \u003ci\u003eu\u003c/i\u003e = \u003csup\u003e1\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003e\u0026rho;\u003c/i\u003e\u003c/sub\u003e \u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | velocity of fluid        |\n    | \u003ci\u003e\u0026nu;\u003c/i\u003e          | \u003csup\u003em²\u003c/sup\u003e\u0026#8725;\u003csub\u003es\u003c/sub\u003e    | \u003ci\u003e\u0026nu;\u003c/i\u003e = \u003csup\u003e\u003ci\u003e\u0026mu;\u003c/i\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003e\u0026rho;\u003c/i\u003e\u003c/sub\u003e | kinematic shear viscosity of fluid                               |\n    | \u003ci\u003e\u0026mu;\u003c/i\u003e          | \u003csup\u003ekg\u003c/sup\u003e\u0026#8725;\u003csub\u003em\u0026nbsp;s\u003c/sub\u003e | \u003ci\u003e\u0026mu;\u003c/i\u003e = \u003ci\u003e\u0026rho;\u003c/i\u003e \u003ci\u003e\u0026nu;\u003c/i\u003e          | dynamic viscosity of fluid                                                      |\n    |                      |                                     |                                 
                    |                                                                                 |\n    | \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | \u003csup\u003ekg\u003c/sup\u003e\u0026#8725;\u003csub\u003em³\u003c/sub\u003e   | -                                                   | shifted density distribution functions (DDFs)                                   |\n    | \u0026Delta;\u003ci\u003ex\u003c/i\u003e      | m                                   | \u0026Delta;\u003ci\u003ex\u003c/i\u003e = 1                                 | lattice constant (in LBM units)                                                 |\n    | \u0026Delta;\u003ci\u003et\u003c/i\u003e      | s                                   | \u0026Delta;\u003ci\u003et\u003c/i\u003e = 1                                 | simulation time step (in LBM units)                                             |\n    | \u003ci\u003ec\u003c/i\u003e | \u003csup\u003em\u003c/sup\u003e\u0026#8725;\u003csub\u003es\u003c/sub\u003e | \u003ci\u003ec\u003c/i\u003e = \u003csup\u003e1\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u0026radic;3\u003c/sub\u003e \u003csup\u003e\u0026Delta;\u003ci\u003ex\u003c/i\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u0026Delta;\u003ci\u003et\u003c/i\u003e\u003c/sub\u003e | lattice speed of sound (in LBM units) |\n    | \u003ci\u003ei\u003c/i\u003e             | 1                                   | 0 \u0026le; \u003ci\u003ei\u003c/i\u003e \u003c \u003ci\u003eq\u003c/i\u003e                          | LBM streaming direction index                                                   |\n    | \u003ci\u003eq\u003c/i\u003e             | 1                                   | \u003ci\u003eq\u003c/i\u003e \u0026isin; {\u0026nbsp;9,15,19,27\u0026nbsp;}            | number of LBM streaming directions                                              |\n    | \u003ci\u003ee\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | m                                   | D2Q9 / D3Q15/19/27      
                            | LBM streaming directions                                                        |\n    | \u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | \u003csup\u003em\u003c/sup\u003e\u0026#8725;\u003csub\u003es\u003c/sub\u003e     | \u003ci\u003ec\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e = \u003csup\u003e\u003ci\u003ee\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u0026Delta;\u003ci\u003et\u003c/i\u003e\u003c/sub\u003e | LBM streaming velocities                    |\n    | \u003ci\u003ew\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | 1                                   | \u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ew\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e = 1 | LBM velocity set weights                                                        |\n    | \u003ci\u003e\u0026Omega;\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e | \u003csup\u003ekg\u003c/sup\u003e\u0026#8725;\u003csub\u003em³\u003c/sub\u003e | SRT or TRT                                      | LBM collision operator                                                          |\n    | \u003ci\u003e\u0026tau;\u003c/i\u003e         | s                                  | \u003ci\u003e\u0026tau;\u003c/i\u003e = \u003csup\u003e\u003ci\u003e\u0026nu;\u003c/i\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003ec\u003c/i\u003e²\u003c/sub\u003e + \u003csup\u003e\u0026Delta;\u003ci\u003et\u003c/i\u003e\u003c/sup\u003e\u0026#8725;\u003csub\u003e2\u003c/sub\u003e | LBM relaxation time |\n\n    \u003c/details\u003e\n  - velocity sets: D2Q9, D3Q15, D3Q19 (default), D3Q27\n  - collision operators: single-relaxation-time (SRT/BGK) (default), two-relaxation-time (TRT)\n  - [DDF-shifting](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats) and other algebraic optimization to minimize 
round-off error\n\n  \u003c/details\u003e\n\n\u003c!-- markdown equations don't render properly in mobile browser\n  - streaming (part 2/2):\n$$j=0\\\\ \\textrm{for}\\\\ i=0$$\n$$j=t\\\\%2\\\\ ?\\\\ i\\\\ :\\\\ (i\\\\%2\\\\ ?\\\\ i+1\\\\ :\\\\ i-1)\\\\ \\textrm{for}\\\\ i\\in[1,q-1]$$\n$$f_i^\\textrm{temp}(\\vec{x},t)=f_j(i\\\\%2\\\\ ?\\\\ \\vec{x}\\\\ :\\\\ \\vec{x}-\\vec{e}_i,\\\\ t)$$\n  - collision:\n$$\\rho(\\vec{x},t)=\\left(\\sum_i f_i^\\textrm{temp}(\\vec{x},t)\\right)+1$$\n$$\\vec{u}(\\vec{x},t)=\\frac{1}{\\rho(\\vec{x},t)}\\sum_i\\vec{c}_i f_i^\\textrm{temp}(\\vec{x},t)$$\n$$f_i^\\textrm{eq-shifted}(\\vec{x},t)=w_i \\rho \\cdot\\left(\\frac{(\\vec{u} _{^{^\\circ}}\\vec{c}_i)^2}{2 c^4}-\\frac{\\vec{u} _{^{^\\circ}}\\vec{u}}{2 c^2}+\\frac{\\vec{u} _{^{^\\circ}}\\vec{c}_i}{c^2}\\right)+w_i (\\rho-1)$$\n$$f_i^\\textrm{temp}(\\vec{x},\\\\ t+\\Delta t)=f_i^\\textrm{temp}(\\vec{x},t)+\\Omega_i(f_i^\\textrm{temp}(\\vec{x},t),\\\\ f_i^\\textrm{eq-shifted}(\\vec{x},t),\\\\ \\tau)$$\n  - streaming (part 1/2):\n$$j=0\\\\ \\textrm{for}\\\\ i=0$$\n$$j=t\\\\%2\\\\ ?\\\\ (i\\\\%2\\\\ ?\\\\ i+1\\\\ :\\\\ i-1)\\\\ :\\\\ i\\\\ \\textrm{for}\\\\ i\\in[1,q-1]$$\n$$f_j(i\\\\%2\\\\ ?\\\\ \\vec{x}+\\vec{e}_i\\\\ :\\\\ \\vec{x},\\\\ t+\\Delta t)=f_i^\\textrm{temp}(\\vec{x},\\\\ t+\\Delta t)$$\n --\u003e\n\n- \u003cdetails\u003e\u003csummary\u003e\u003ca name=\"vram-footprint\"\u003e\u003c/a\u003eoptimized to minimize VRAM footprint to 1/6 of other LBM codes\u003c/summary\u003e\n\n  - traditional LBM (D3Q19) with FP64 requires ~344 Bytes/cell\u003cbr\u003e\n    - 🟧🟧🟧🟧🟧🟧🟧🟧🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟨🟨🟨🟨🟨🟨🟨🟨🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥🟥\u003cbr\u003e(density 🟧, velocity 🟦, flags 🟨, 2 copies of DDFs 🟩/🟥; each square 
= 1 Byte)\n    - allows for 3 Million cells per 1 GB VRAM\n  - FluidX3D (D3Q19) requires only 55 Bytes/cell with [Esoteric-Pull](https://doi.org/10.3390/computation10060092)+[FP16](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats)\u003cbr\u003e\n    - 🟧🟧🟧🟧🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟦🟨🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩🟩\u003cbr\u003e(density 🟧, velocity 🟦, flags 🟨, DDFs 🟩; each square = 1 Byte)\n    - allows for 19 Million cells per 1 GB VRAM\n    - in-place streaming with [Esoteric-Pull](https://doi.org/10.3390/computation10060092): eliminates redundant copy of density distribution functions (DDFs) in memory; almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries; offers optimal memory access patterns for single-cell in-place streaming\n    - [decoupled arithmetic precision (FP32) and memory precision (FP32 or FP16S or FP16C)](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats): all arithmetic is done in FP32 for compatibility on all hardware, but DDFs in memory can be compressed to FP16S or FP16C: almost cuts memory demand in half again and almost doubles performance, without impacting overall accuracy for most setups\n    - \u003cdetails\u003e\u003csummary\u003eonly 8 flag bits per lattice point (can be used independently / at the same time)\u003c/summary\u003e\n\n      - `TYPE_S` (stationary or moving) solid boundaries\n      - `TYPE_E` equilibrium boundaries (inflow/outflow)\n      - `TYPE_T` temperature boundaries\n      - `TYPE_F` free surface (fluid)\n      - `TYPE_I` free surface (interface)\n      - `TYPE_G` free surface (gas)\n      - `TYPE_X` remaining for custom use or further extensions\n      - `TYPE_Y` remaining for custom use or further extensions\n\n      \u003c/details\u003e\n  - 
large cost saving: comparison of maximum single-GPU grid resolution for D3Q19 LBM\n\n    | GPU\u0026nbsp;VRAM\u0026nbsp;capacity      | 1\u0026nbsp;GB | 2\u0026nbsp;GB | 3\u0026nbsp;GB | 4\u0026nbsp;GB | 6\u0026nbsp;GB | 8\u0026nbsp;GB | 10\u0026nbsp;GB | 11\u0026nbsp;GB | 12\u0026nbsp;GB | 16\u0026nbsp;GB | 20\u0026nbsp;GB | 24\u0026nbsp;GB | 32\u0026nbsp;GB | 40\u0026nbsp;GB | 48\u0026nbsp;GB | 64\u0026nbsp;GB | 80\u0026nbsp;GB | 94\u0026nbsp;GB | 128\u0026nbsp;GB | 192\u0026nbsp;GB | 256\u0026nbsp;GB |\n    | :------------------------------- | --------: | --------: | --------: | --------: | --------: | --------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ---------: | ----------: | ----------: | ----------: |\n    | approximate\u0026nbsp;GPU\u0026nbsp;price  | $25\u003cbr\u003eGT\u0026nbsp;210 | $25\u003cbr\u003eGTX\u0026nbsp;950 | $12\u003cbr\u003eGTX\u0026nbsp;1060 | $50\u003cbr\u003eGT\u0026nbsp;730 | $35\u003cbr\u003eGTX\u0026nbsp;1060 | $70\u003cbr\u003eRX\u0026nbsp;470 | $500\u003cbr\u003eRTX\u0026nbsp;3080 | $240\u003cbr\u003eGTX\u0026nbsp;1080\u0026nbsp;Ti | $75\u003cbr\u003eTesla\u0026nbsp;M40 | $75\u003cbr\u003eInstinct\u0026nbsp;MI25 | $900\u003cbr\u003eRX\u0026nbsp;7900\u0026nbsp;XT | $205\u003cbr\u003eTesla\u0026nbsp;P40 | $600\u003cbr\u003eInstinct\u0026nbsp;MI60 | $5500\u003cbr\u003eA100 | $2400\u003cbr\u003eRTX\u0026nbsp;8000 | $10k\u003cbr\u003eInstinct\u0026nbsp;MI210 | $11k\u003cbr\u003eA100 | \u003e$40k\u003cbr\u003eH100\u0026nbsp;NVL | ?\u003cbr\u003eGPU\u0026nbsp;Max\u0026nbsp;1550 | ~$10k\u003cbr\u003eMI300X | - |\n    | traditional\u0026nbsp;LBM\u0026nbsp;(FP64) |      144³ |      182³ |      208³ |      230³ |      262³ |      288³ |       312³ |       322³ |       330³ |       364³ |       392³ |       418³ |       460³ |       494³ |       526³ |       578³ |       624³ |       658³ |        730³ |        836³ |        920³ |\n    | 
FluidX3D\u0026nbsp;(FP32/FP32)        |      224³ |      282³ |      322³ |      354³ |      406³ |      448³ |       482³ |       498³ |       512³ |       564³ |       608³ |       646³ |       710³ |       766³ |       814³ |       896³ |       966³ |      1018³ |       1130³ |       1292³ |       1422³ |\n    | FluidX3D\u0026nbsp;(FP32/FP16)        |      266³ |      336³ |      384³ |      424³ |      484³ |      534³ |       574³ |       594³ |       610³ |       672³ |       724³ |       770³ |       848³ |       912³ |       970³ |      1068³ |      1150³ |      1214³ |       1346³ |       1540³ |       1624³ |\n\n  \u003c/details\u003e\n- \u003cdetails\u003e\u003csummary\u003e\u003ca name=\"multi-gpu\"\u003e\u003c/a\u003ecross-vendor multi-GPU support on a single computer/server\u003c/summary\u003e\n\n  - domain decomposition allows pooling VRAM from multiple GPUs for much larger grid resolution\n  - GPUs don't have to be identical, not even from the same vendor - \u003ca href=\"https://youtu.be/_8Ed8ET9gBU\"\u003eany combination of AMD+Intel+Nvidia GPUs will work\u003c/a\u003e - but similar VRAM capacity/bandwidth is recommended\n  - domain communication architecture (simplified)\n    ```diff\n    ++   .-----------------------------------------------------------------.   ++\n    ++   |                              GPU 0                              |   ++\n    ++   |                          LBM Domain 0                           |   ++\n    ++   '-----------------------------------------------------------------'   ++\n    ++              |                 selective                /|\\             ++\n    ++             \\|/               in-VRAM copy               |              ++\n    ++        .-------------------------------------------------------.        ++\n    ++        |               GPU 0 - Transfer Buffer 0               |        ++\n    ++        '-------------------------------------------------------'        ++\n    !!                      
      |     PCIe     /|\\                           !!\n    !!                           \\|/    copy      |                            !!\n    @@        .-------------------------.   .-------------------------.        @@\n    @@        | CPU - Transfer Buffer 0 |   | CPU - Transfer Buffer 1 |        @@\n    @@        '-------------------------'\\ /'-------------------------'        @@\n    @@                           pointer  X   swap                             @@\n    @@        .-------------------------./ \\.-------------------------.        @@\n    @@        | CPU - Transfer Buffer 1 |   | CPU - Transfer Buffer 0 |        @@\n    @@        '-------------------------'   '-------------------------'        @@\n    !!                           /|\\    PCIe      |                            !!\n    !!                            |     copy     \\|/                           !!\n    ++        .-------------------------------------------------------.        ++\n    ++        |               GPU 1 - Transfer Buffer 1               |        ++\n    ++        '-------------------------------------------------------'        ++\n    ++             /|\\                selective                 |              ++\n    ++              |                in-VRAM copy              \\|/             ++\n    ++   .-----------------------------------------------------------------.   
++\n    ++   |                              GPU 1                              |   ++\n    ++   |                          LBM Domain 1                           |   ++\n    ++   '-----------------------------------------------------------------'   ++\n    ##                                    |                                    ##\n    ##                      domain synchronization barrier                     ##\n    ##                                    |                                    ##\n    ||   -------------------------------------------------------------\u003e time   ||\n    ```\n  - domain communication architecture (detailed)\n    ```diff\n    ++   .-----------------------------------------------------------------.   ++\n    ++   |                              GPU 0                              |   ++\n    ++   |                          LBM Domain 0                           |   ++\n    ++   '-----------------------------------------------------------------'   ++\n    ++     |  selective in- /|\\  |  selective in- /|\\  |  selective in- /|\\    ++\n    ++    \\|/ VRAM copy (X)  |  \\|/ VRAM copy (Y)  |  \\|/ VRAM copy (Z)  |     ++\n    ++   .---------------------.---------------------.---------------------.   ++\n    ++   |    GPU 0 - TB 0X+   |    GPU 0 - TB 0Y+   |    GPU 0 - TB 0Z+   |   ++\n    ++   |    GPU 0 - TB 0X-   |    GPU 0 - TB 0Y-   |    GPU 0 - TB 0Z-   |   ++\n    ++   '---------------------'---------------------'---------------------'   ++\n    !!          | PCIe /|\\            | PCIe /|\\            | PCIe /|\\         !!\n    !!         \\|/ copy |            \\|/ copy |            \\|/ copy |          !!\n    @@   .---------. .---------.---------. .---------.---------. .---------.   
@@\n    @@   | CPU 0X+ | | CPU 1X- | CPU 0Y+ | | CPU 3Y- | CPU 0Z+ | | CPU 5Z- |   @@\n    @@   | CPU 0X- | | CPU 2X+ | CPU 0Y- | | CPU 4Y+ | CPU 0Z- | | CPU 6Z+ |   @@\n    @@   '---------\\ /---------'---------\\ /---------'---------\\ /---------'   @@\n    @@      pointer X swap (X)    pointer X swap (Y)    pointer X swap (Z)     @@\n    @@   .---------/ \\---------.---------/ \\---------.---------/ \\---------.   @@\n    @@   | CPU 1X- | | CPU 0X+ | CPU 3Y- | | CPU 0Y+ | CPU 5Z- | | CPU 0Z+ |   @@\n    @@   | CPU 2X+ | | CPU 0X- | CPU 4Y+ | | CPU 0Y- | CPU 6Z+ | | CPU 0Z- |   @@\n    @@   '---------' '---------'---------' '---------'---------' '---------'   @@\n    !!         /|\\ PCIe |            /|\\ PCIe |            /|\\ PCIe |          !!\n    !!          | copy \\|/            | copy \\|/            | copy \\|/         !!\n    ++   .--------------------..---------------------..--------------------.   ++\n    ++   |   GPU 1 - TB 1X-   ||    GPU 3 - TB 3Y-   ||   GPU 5 - TB 5Z-   |   ++\n    ++   :====================::=====================::====================:   ++\n    ++   |   GPU 2 - TB 2X+   ||    GPU 4 - TB 4Y+   ||   GPU 6 - TB 6Z+   |   ++\n    ++   '--------------------''---------------------''--------------------'   ++\n    ++    /|\\ selective in-  |  /|\\ selective in-  |  /|\\ selective in-  |     ++\n    ++     |  VRAM copy (X) \\|/  |  VRAM copy (Y) \\|/  |  VRAM copy (Z) \\|/    ++\n    ++   .--------------------..---------------------..--------------------.   
++\n    ++   |        GPU 1       ||        GPU 3        ||        GPU 5       |   ++\n    ++   |    LBM Domain 1    ||    LBM Domain 3     ||    LBM Domain 5    |   ++\n    ++   :====================::=====================::====================:   ++\n    ++   |        GPU 2       ||        GPU 4        ||        GPU 6       |   ++\n    ++   |    LBM Domain 2    ||    LBM Domain 4     ||    LBM Domain 6    |   ++\n    ++   '--------------------''---------------------''--------------------'   ++\n    ##              |                     |                     |              ##\n    ##              |      domain synchronization barriers      |              ##\n    ##              |                     |                     |              ##\n    ||   -------------------------------------------------------------\u003e time   ||\n    ```\n\n  \u003c/details\u003e\n- \u003cdetails\u003e\u003csummary\u003e\u003ca name=\"performance\"\u003e\u003c/a\u003epeak performance on GPUs (datacenter/gaming/professional/laptop)\u003c/summary\u003e\n\n  - [single-GPU/CPU benchmarks](#single-gpucpu-benchmarks)\n  - [multi-GPU benchmarks](#multi-gpu-benchmarks)\n\n  \u003c/details\u003e\n- \u003cdetails\u003e\u003csummary\u003e\u003ca name=\"extensions\"\u003e\u003c/a\u003epowerful model extensions\u003c/summary\u003e\n\n  - [boundary types](https://doi.org/10.15495/EPub_UBT_00005400)\n    - stationary mid-grid bounce-back boundaries (stationary solid boundaries)\n    - moving mid-grid bounce-back boundaries (moving solid boundaries)\n    - equilibrium boundaries (non-reflective inflow/outflow)\n    - temperature boundaries (fixed temperature)\n  - global force per volume (Guo forcing), can be modified on-the-fly\n  - local force per volume (force field)\n    - optional computation of forces from the fluid on solid boundaries\n  - state-of-the-art [free surface LBM](https://doi.org/10.3390/computation10060092) (FSLBM) implementation:\n    - [volume-of-fluid 
model](https://doi.org/10.15495/EPub_UBT_00005400)\n    - [fully analytic PLIC](https://doi.org/10.3390/computation10020021) for efficient curvature calculation\n    - improved mass conservation\n    - ultra-efficient implementation with only [4 kernels](https://doi.org/10.3390/computation10060092) in addition to the `stream_collide()` kernel\n  - thermal LBM to simulate thermal convection\n    - D3Q7 subgrid for thermal DDFs\n    - in-place streaming with [Esoteric-Pull](https://doi.org/10.3390/computation10060092) for thermal DDFs\n    - optional [FP16S or FP16C compression](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats) for thermal DDFs with [DDF-shifting](https://www.researchgate.net/publication/362275548_Accuracy_and_performance_of_the_lattice_Boltzmann_method_with_64-bit_32-bit_and_customized_16-bit_number_formats)\n  - Smagorinsky-Lilly subgrid turbulence LES model to keep simulations with very large Reynolds numbers stable\n    \u003cp align=\"center\"\u003e\u003ci\u003e\u0026Pi;\u003csub\u003e\u0026alpha;\u0026beta;\u003c/sub\u003e\u003c/i\u003e = \u0026Sigma;\u003csub\u003e\u003ci\u003ei\u003c/i\u003e\u003c/sub\u003e \u003ci\u003ee\u003csub\u003ei\u0026alpha;\u003c/sub\u003e\u003c/i\u003e \u003ci\u003ee\u003csub\u003ei\u0026beta;\u003c/sub\u003e\u003c/i\u003e (\u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e   - \u003ci\u003ef\u003csub\u003ei\u003c/sub\u003e\u003c/i\u003e\u003csup\u003eeq-shifted\u003c/sup\u003e)\u003cbr\u003e\u003cbr\u003eQ = \u0026Sigma;\u003csub\u003e\u003ci\u003e\u0026alpha;\u0026beta;\u003c/i\u003e\u003c/sub\u003e   
\u003ci\u003e\u0026Pi;\u003csub\u003e\u0026alpha;\u0026beta;\u003c/sub\u003e\u003c/i\u003e\u003csup\u003e2\u003c/sup\u003e\u003cbr\u003e\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;\u0026nbsp;______________________\u003cbr\u003e\u0026tau; = \u0026frac12; (\u0026tau;\u003csub\u003e0\u003c/sub\u003e + \u0026radic; \u0026tau;\u003csub\u003e0\u003c/sub\u003e\u003csup\u003e2\u003c/sup\u003e + \u003csup\u003e(16\u0026radic;2)\u003c/sup\u003e\u0026#8725;\u003csub\u003e(\u003ci\u003e3\u0026pi;\u003c/i\u003e\u003csup\u003e2\u003c/sup\u003e)\u003c/sub\u003e \u003csup\u003e\u0026radic;Q\u003c/sup\u003e\u0026#8725;\u003csub\u003e\u003ci\u003e\u0026rho;\u003c/i\u003e\u003c/sub\u003e )\u003c/p\u003e\n  - particles with immersed-boundary method (either passive or 2-way-coupled, single-GPU only)\n\n  \u003c/details\u003e\n\n\n\n## Solving the Visualization Problem\n\n- FluidX3D can do simulations so large that storing the volumetric data for later rendering becomes unmanageable (like 120GB for a single frame, hundreds of TeraByte for a video)\n- instead, FluidX3D allows [rendering raw simulation data directly in VRAM](https://www.researchgate.net/publication/360501260_Combined_scientific_CFD_simulation_and_interactive_raytracing_with_OpenCL), so no large volumetric files have to be exported to the hard disk (see my [technical talk](https://youtu.be/pD8JWAZ2f8o))\n- the rendering is so fast that it works interactively in real time for both rasterization and raytracing\n- rasterization and raytracing are done in OpenCL and work on all GPUs, even the ones without RTX/DXR raytracing cores or without any rendering hardware at all (like A100, MI200, ...)\n- if no monitor is available (like on a remote Linux server), there is an [ASCII rendering mode](https://youtu.be/pD8JWAZ2f8o\u0026t=1456) to interactively 
visualize the simulation in the terminal (even in WSL and/or through SSH)\n- rendering is fully multi-GPU-parallelized via seamless domain decomposition rasterization\n- with interactive graphics mode disabled, image resolution can be as large as VRAM allows (4K/8K/16K and above)\n- (interactive) visualization modes:\n  - flag wireframe / solid surface (and force vectors on solid cells or surface pressure if the extension is used)\n  - velocity field (with slice mode)\n  - streamlines (with slice mode)\n  - velocity-colored Q-criterion isosurface\n  - rasterized free surface with [marching-cubes](http://paulbourke.net/geometry/polygonise/)\n  - [raytraced free surface](https://www.researchgate.net/publication/360501260_Combined_scientific_CFD_simulation_and_interactive_raytracing_with_OpenCL) with fast ray-grid traversal and marching-cubes, either 1-4 rays/pixel or 1-10 rays/pixel\n\n\n\n## Solving the Compatibility Problem\n\n- FluidX3D is written in OpenCL 1.2, so it runs on all hardware from all vendors (Nvidia, AMD, Intel, ...):\n  - world's fastest datacenter GPUs: MI300X, H100 (NVL), A100, MI200, MI100, V100(S), GPU Max 1100, ...\n  - gaming GPUs (desktop/laptop): Nvidia GeForce, AMD Radeon, Intel Arc\n  - professional/workstation GPUs: Nvidia Quadro, AMD Radeon Pro / FirePro, Intel Arc Pro\n  - integrated GPUs\n  - CPUs (requires [installation of Intel CPU Runtime for OpenCL](DOCUMENTATION.md#0-install-gpu-drivers-and-opencl-runtime))\n  - Intel Xeon Phi (requires [installation of Intel CPU Runtime for OpenCL](DOCUMENTATION.md#0-install-gpu-drivers-and-opencl-runtime))\n  - smartphone ARM GPUs\n- native cross-vendor multi-GPU implementation\n  - uses PCIe communication, so no SLI/Crossfire/NVLink/InfinityFabric required\n  - single-node parallelization, so no MPI installation required\n  - [GPUs don't even have to be from the same vendor](https://youtu.be/_8Ed8ET9gBU), but similar memory capacity and bandwidth are recommended\n- works on 
[Windows](DOCUMENTATION.md#windows) and [Linux](DOCUMENTATION.md#linux--macos--android) with C++17, with limited support also for [macOS](DOCUMENTATION.md#linux--macos--android) and [Android](DOCUMENTATION.md#linux--macos--android)\n- supports [importing and voxelizing triangle meshes](DOCUMENTATION.md#loading-stl-files) from binary `.stl` files, with fast GPU voxelization\n- supports [exporting volumetric data](DOCUMENTATION.md#data-export) as binary `.vtk` files\n- supports [exporting triangle meshes](DOCUMENTATION.md#data-export) as binary `.vtk` files\n- supports [exporting rendered images](DOCUMENTATION.md#video-rendering) as `.png`/`.qoi`/`.bmp` files; encoding runs in parallel on the CPU while the simulation on the GPU continues without delay\n\n\n\n## Single-GPU/CPU Benchmarks\n\nHere are [performance benchmarks](https://doi.org/10.3390/computation10060092) on various hardware in MLUPs/s, i.e. how many million lattice cells are updated per second. The settings used for the benchmark are D3Q19 SRT with no extensions enabled (only LBM with implicit mid-grid bounce-back boundaries), and the setup consists of an empty cubic box of sufficient size (typically 256³). Without extensions, a single lattice cell requires:\n- a memory capacity of 93 (FP32/FP32) or 55 (FP32/FP16) Bytes\n- a memory bandwidth of 153 (FP32/FP32) or 77 (FP32/FP16) Bytes per time step\n- 363 (FP32/FP32) or 406 (FP32/FP16S) or 1275 (FP32/FP16C) FLOPs per time step (FP32+INT32 operations counted combined)\n\nConsequently, the arithmetic intensity of this implementation is 2.37 (FP32/FP32) or 5.27 (FP32/FP16S) or 16.56 (FP32/FP16C) FLOPs/Byte, so performance is limited only by memory bandwidth. The left 3 columns of the table show the hardware specs as found in the data sheets (theoretical peak FP32 compute performance, memory capacity, theoretical peak memory bandwidth). 
The right 3 columns show the measured FluidX3D performance for the FP32/FP32, FP32/FP16S, and FP32/FP16C floating-point precision settings, with the [roofline model](https://en.wikipedia.org/wiki/Roofline_model) efficiency in round brackets, indicating what percentage of the theoretical peak memory bandwidth is being used.\n\nIf your GPU/CPU is not on the list yet, you can report your benchmarks [here](https://github.com/ProjectPhysX/FluidX3D/issues/8).\n\n```mermaid\ngantt\n\ntitle FluidX3D Performance [MLUPs/s] - FP32 arithmetic, (fastest of FP32/FP16S/FP16C) memory storage\ndateFormat X\naxisFormat %s\n%%{\n\tinit: {\n\t\t\"gantt\": {\n\t\t\t'titleTopMargin': 42,\n\t\t\t'topPadding': 70,\n\t\t\t'leftPadding': 260,\n\t\t\t'rightPadding': 5,\n\t\t\t'sectionFontSize': 20,\n\t\t\t'fontSize': 20,\n\t\t\t'barHeight': 20,\n\t\t\t'barGap': 3,\n\t\t\t'numberSectionStyles': 2\n\t\t},\n\t\t'theme': 'forest',\n\t\t'themeVariables': {\n\t\t\t'sectionBkgColor': '#99999999',\n\t\t\t'altSectionBkgColor': '#00000000',\n\t\t\t'titleColor': '#AFAFAF',\n\t\t\t'textColor': '#AFAFAF',\n\t\t\t'taskTextColor': 'black',\n\t\t\t'taskBorderColor': '#487E3A'\n\t\t}\n\t}\n}%%\n\nsection MI300X\n\t41327 :crit, 0, 41327\nsection MI250 (1 GCD)\n\t9030 :crit, 0, 9030\nsection MI210\n\t9547 :crit, 0, 9547\nsection MI100\n\t8542 :crit, 0, 8542\nsection MI60\n\t5111 :crit, 0, 5111\nsection MI50 32GB\n\t8477 :crit, 0, 8477\nsection Radeon VII\n\t7778 :crit, 0, 7778\nsection GPU Max 1100\n\t6303 :done, 0, 6303\nsection GH200 94GB GPU\n\t34689 : 0, 34689\nsection H100 NVL\n\t32922 : 0, 32922\nsection H100 SXM5 80GB HBM3\n\t29561 : 0, 29561\nsection H100 PCIe 80GB HBM2e\n\t20624 : 0, 20624\nsection A100 SXM4 80GB\n\t18448 : 0, 18448\nsection A100 PCIe 80GB\n\t17896 : 0, 17896\nsection PG506-242/243\n\t15654 : 0, 15654\nsection A100 SXM4 40GB\n\t16013 : 0, 16013\nsection A100 PCIe 40GB\n\t16035 : 0, 16035\nsection CMP 170HX\n\t12392 : 0, 12392\nsection A30\n\t9721 : 0, 9721\nsection V100 SXM2 32GB\n\t8947 : 0, 
8947\nsection V100 PCIe 16GB\n\t10325 : 0, 10325\nsection GV100\n\t6641 : 0, 6641\nsection Titan V\n\t7253 : 0, 7253\nsection P100 PCIe 16GB\n\t5950 : 0, 5950\nsection P100 PCIe 12GB\n\t4141 : 0, 4141\nsection GTX TITAN\n\t2500 : 0, 2500\nsection K40m\n\t1868 : 0, 1868\nsection K80 (1 GPU)\n\t1642 : 0, 1642\nsection K20c\n\t1507 : 0, 1507\n\nsection RX 9070 XT\n\t6688 :crit, 0, 6688\nsection RX 9070\n\t6019 :crit, 0, 6019\nsection RX 7900 XTX\n\t7716 :crit, 0, 7716\nsection PRO W7900\n\t5939 :crit, 0, 5939\nsection RX 7900 XT\n\t5986 :crit, 0, 5986\nsection RX 7800 XT\n\t3105 :crit, 0, 3105\nsection PRO W7800\n\t4426 :crit, 0, 4426\nsection RX 7900 GRE\n\t4570 :crit, 0, 4570\nsection PRO W7700\n\t2943 :crit, 0, 2943\nsection RX 7600\n\t2561 :crit, 0, 2561\nsection PRO W7600\n\t2287 :crit, 0, 2287\nsection PRO W7500\n\t1682 :crit, 0, 1682\nsection RX 6900 XT\n\t4227 :crit, 0, 4227\nsection RX 6800 XT\n\t4241 :crit, 0, 4241\nsection PRO W6800\n\t3361 :crit, 0, 3361\nsection RX 6700 XT\n\t2908 :crit, 0, 2908\nsection RX 6800M\n\t3213 :crit, 0, 3213\nsection RX 6700M\n\t2429 :crit, 0, 2429\nsection RX 6600\n\t1839 :crit, 0, 1839\nsection RX 6500 XT\n\t1030 :crit, 0, 1030\nsection RX 5700 XT\n\t3253 :crit, 0, 3253\nsection RX 5700\n\t3167 :crit, 0, 3167\nsection RX 5600 XT\n\t2214 :crit, 0, 2214\nsection RX Vega 64\n\t3227 :crit, 0, 3227\nsection RX 590\n\t1688 :crit, 0, 1688\nsection RX 580 4GB\n\t1848 :crit, 0, 1848\nsection RX 580 2048SP 8GB\n\t1622 :crit, 0, 1622\nsection R9 390X\n\t2217 :crit, 0, 2217\nsection HD 7850\n\t635 :crit, 0, 635\nsection Arc B580 LE\n\t4979 :done, 0, 4979\nsection Arc A770 LE\n\t4568 :done, 0, 4568\nsection Arc A750 LE\n\t4314 :done, 0, 4314\nsection Arc A580\n\t3889 :done, 0, 3889\nsection Arc Pro A40\n\t985 :done, 0, 985\nsection Arc A380\n\t1115 :done, 0, 1115\nsection RTX 5090\n\t19141 : 0, 19141\nsection RTX 5080\n\t10304 : 0, 10304\nsection RTX 5070\n\t7238 : 0, 7238\nsection RTX 4090\n\t11496 : 0, 11496\nsection RTX 6000 
Ada\n\t10293 : 0, 10293\nsection L40S\n\t7637 : 0, 7637\nsection L40\n\t7945 : 0, 7945\nsection RTX 4080 Super\n\t8218 : 0, 8218\nsection RTX 4080\n\t7933 : 0, 7933\nsection RTX 4070 Ti Super\n\t7295 : 0, 7295\nsection RTX 4090M\n\t6901 : 0, 6901\nsection RTX 4070 Super\n\t5554 : 0, 5554\nsection RTX 4070\n\t5016 : 0, 5016\nsection RTX 4080M\n\t5114 : 0, 5114\nsection RTX 4000 Ada\n\t4221 : 0, 4221\nsection RTX 4060\n\t3124 : 0, 3124\nsection RTX 4070M\n\t3092 : 0, 3092\nsection RTX 2000 Ada\n\t2526 : 0, 2526\nsection RTX 3090 Ti\n\t10956 : 0, 10956\nsection RTX 3090\n\t10732 : 0, 10732\nsection RTX 3080 Ti\n\t9832 : 0, 9832\nsection RTX 3080 12GB\n\t9657 : 0, 9657\nsection RTX A6000\n\t8814 : 0, 8814\nsection RTX 3080 10GB\n\t8118 : 0, 8118\nsection RTX 3070 Ti\n\t6807 : 0, 6807\nsection RTX 3080M Ti\n\t5908 : 0, 5908\nsection RTX 3070\n\t5096 : 0, 5096\nsection RTX 3060 Ti\n\t5129 : 0, 5129\nsection RTX A4000\n\t4945 : 0, 4945\nsection RTX A5000M\n\t4461 : 0, 4461\nsection RTX 3060\n\t4070 : 0, 4070\nsection RTX 3060M\n\t4012 : 0, 4012\nsection A2\n\t2051 : 0, 2051\nsection RTX 3050M Ti\n\t2341 : 0, 2341\nsection RTX 3050M\n\t2339 : 0, 2339\nsection Titan RTX\n\t7554 : 0, 7554\nsection RTX 6000\n\t6879 : 0, 6879\nsection RTX 8000 Passive\n\t5607 : 0, 5607\nsection RTX 2080 Ti\n\t6853 : 0, 6853\nsection RTX 2080 Super\n\t5284 : 0, 5284\nsection RTX 5000\n\t4773 : 0, 4773\nsection RTX 2080\n\t4977 : 0, 4977\nsection RTX 2070 Super\n\t4893 : 0, 4893\nsection RTX 2070\n\t5017 : 0, 5017\nsection RTX 2060 Super\n\t5035 : 0, 5035\nsection RTX 4000\n\t4584 : 0, 4584\nsection RTX 2060 KO\n\t3376 : 0, 3376\nsection RTX 2060\n\t3604 : 0, 3604\nsection GTX 1660 Super\n\t3551 : 0, 3551\nsection T4\n\t2887 : 0, 2887\nsection GTX 1660 Ti\n\t3041 : 0, 3041\nsection GTX 1660\n\t1992 : 0, 1992\nsection GTX 1650M 896C\n\t1858 : 0, 1858\nsection GTX 1650M 1024C\n\t1400 : 0, 1400\nsection T500\n\t665 : 0, 665\nsection Titan Xp\n\t5495 : 0, 5495\nsection GTX 1080 Ti\n\t4877 : 0, 
4877\nsection GTX 1080\n\t3182 : 0, 3182\nsection GTX 1060 6GB\n\t1925 : 0, 1925\nsection GTX 1060M\n\t1882 : 0, 1882\nsection GTX 1050M Ti\n\t1224 : 0, 1224\nsection P1000\n\t839 : 0, 839\nsection GTX 980 Ti\n\t2703 : 0, 2703\nsection GTX 980\n\t1965 : 0, 1965\nsection GTX 970\n\t1721 : 0, 1721\nsection M4000\n\t1519 : 0, 1519\nsection M60 (1 GPU)\n\t1571 : 0, 1571\nsection GTX 960M\n\t872 : 0, 872\nsection GTX 770\n\t1215 : 0, 1215\nsection GTX 680 4GB\n\t1274 : 0, 1274\nsection K2000\n\t444 : 0, 444\nsection GT 630 (OEM)\n\t185 : 0, 185\nsection NVS 290\n\t9 : 0, 9\nsection Arise 1020\n\t6 :active, 0, 6\n\nsection M2 Ultra (76-CU, 192GB)\n\t8769 :active, 0, 8769\nsection M2 Max (38-CU, 32GB)\n\t4641 :active, 0, 4641\nsection M1 Ultra (64-CU, 128GB)\n\t8418 :active, 0, 8418\nsection M1 Max (24-CU, 32GB)\n\t4496 :active, 0, 4496\nsection M1 Pro (16-CU, 16GB)\n\t2329 :active, 0, 2329\nsection M1 (8-CU, 16GB)\n\t759 :active, 0, 759\nsection Radeon 8060S (Max+ 395)\n\t2563 :crit, 0, 2563\nsection Radeon 780M (Z1 Extreme)\n\t860 :crit, 0, 860\nsection Radeon Graphics (7800X3D)\n\t498 :crit, 0, 498\nsection Vega 8 (4750G)\n\t511 :crit, 0, 511\nsection Vega 8 (3500U)\n\t288 :crit, 0, 288\nsection Arc 140V GPU (16GB)\n\t1282 :done, 0, 1282\nsection Arc Graphics (Ultra 9 185H)\n\t724 :done, 0, 724\nsection Iris Xe Graphics (i7-1265U)\n\t621 :done, 0, 621\nsection UHD Xe 32EUs\n\t245 :done, 0, 245\nsection UHD 770\n\t475 :done, 0, 475\nsection UHD 630\n\t301 :done, 0, 301\nsection UHD P630\n\t288 :done, 0, 288\nsection HD 5500\n\t192 :done, 0, 192\nsection HD 4600\n\t115 :done, 0, 115\nsection Orange Pi 5 Mali-G610 MP4\n\t232 :active, 0, 232\nsection Samsung Mali-G72 MP18\n\t230 :active, 0, 230\n\nsection 2x EPYC 9754\n\t5179 :crit, 0, 5179\nsection 2x EPYC 9654\n\t1814 :crit, 0, 1814\nsection 2x EPYC 9554\n\t2552 :crit, 0, 2552\nsection 1x EPYC 9124\n\t772 :crit, 0, 772\nsection 2x EPYC 7713\n\t1418 :crit, 0, 1418\nsection 2x EPYC 7352\n\t739 :crit, 0, 739\nsection 2x 
EPYC 7313\n\t498 :crit, 0, 498\nsection 2x EPYC 7302\n\t784 :crit, 0, 784\nsection 2x 6980P\n\t7875 :done, 0, 7875\nsection 2x 6979P\n\t8135 :done, 0, 8135\nsection 2x Platinum 8592+\n\t3135 :done, 0, 3135\nsection 2x Gold 6548N\n\t1811 :done, 0, 1811\nsection 2x CPU Max 9480\n\t2037 :done, 0, 2037\nsection 2x Platinum 8480+\n\t2162 :done, 0, 2162\nsection 2x Platinum 8470\n\t2068 :done, 0, 2068\nsection 2x Gold 6438Y+\n\t1945 :done, 0, 1945\nsection 2x Platinum 8380\n\t1410 :done, 0, 1410\nsection 2x Platinum 8358\n\t1285 :done, 0, 1285\nsection 2x Platinum 8256\n\t396 :done, 0, 396\nsection 2x Platinum 8153\n\t691 :done, 0, 691\nsection 2x Gold 6248R\n\t755 :done, 0, 755\nsection 2x Gold 6128\n\t254 :done, 0, 254\nsection Phi 7210\n\t415 :done, 0, 415\nsection 4x E5-4620 v4\n\t460 :done, 0, 460\nsection 2x E5-2630 v4\n\t264 :done, 0, 264\nsection 2x E5-2623 v4\n\t125 :done, 0, 125\nsection 2x E5-2680 v3\n\t304 :done, 0, 304\nsection GH200 Neoverse-V2\n\t1323 : 0, 1323\nsection TR PRO 7995WX\n\t1715 :crit, 0, 1715\nsection TR 3970X\n\t463 :crit, 0, 463\nsection TR 1950X\n\t273 :crit, 0, 273\nsection Ryzen 7900X3D\n\t521 :crit, 0, 521\nsection Ryzen 7800X3D\n\t363 :crit, 0, 363\nsection Ryzen 5700X3D\n\t229 :crit, 0, 229\nsection FX-6100\n\t22 :crit, 0, 22\nsection Athlon X2 QL-65\n\t3 :crit, 0, 3\nsection Ultra 7 258V\n\t287 :done, 0, 287\nsection Ultra 9 185H\n\t317 :done, 0, 317\nsection i9-14900K\n\t490 :done, 0, 490\nsection i7-13700K\n\t504 :done, 0, 504\nsection i7-1265U\n\t128 :done, 0, 128\nsection i9-11900KB\n\t208 :done, 0, 208\nsection i9-10980XE\n\t286 :done, 0, 286\nsection E-2288G\n\t198 :done, 0, 198\nsection i7-9700\n\t103 :done, 0, 103\nsection i5-9600\n\t147 :done, 0, 147\nsection i7-8700K\n\t152 :done, 0, 152\nsection E-2176G\n\t201 :done, 0, 201\nsection i7-7700HQ\n\t108 :done, 0, 108\nsection E3-1240 v5\n\t141 :done, 0, 141\nsection i5-5300U\n\t37 :done, 0, 37\nsection i7-4770\n\t104 :done, 0, 104\nsection i7-4720HQ\n\t80 :done, 0, 80\nsection 
N2807\n\t7 :done, 0, 7\n```\n\n\u003cdetails\u003e\u003csummary\u003eSingle-GPU/CPU Benchmark Table\u003c/summary\u003e\n\nColors: 🔴 AMD, 🔵 Intel, 🟢 Nvidia, ⚪ Apple, 🟡 ARM, 🟤 Glenfly\n\n| Device                                           | FP32\u003cbr\u003e[TFlops/s] | Mem\u003cbr\u003e[GB] | BW\u003cbr\u003e[GB/s] | FP32/FP32\u003cbr\u003e[MLUPs/s] | FP32/FP16S\u003cbr\u003e[MLUPs/s] | FP32/FP16C\u003cbr\u003e[MLUPs/s] |\n| :----------------------------------------------- | -----------------: | ----------: | -----------: | ---------------------: | ----------------------: | ----------------------: |\n|                                                  |                    |             |              |                        |                         |                         |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI300X                     |             163.40 |         192 |         5300 |       22867\u0026nbsp;(66%) |        41327\u0026nbsp;(60%) |        31670\u0026nbsp;(46%) |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI250\u0026nbsp;(1\u0026nbsp;GCD)    |              45.26 |          64 |         1638 |             5638 (53%) |              9030 (42%) |              8506 (40%) |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI210                      |              45.26 |          64 |         1638 |             6517 (61%) |              9547 (45%) |              8829 (41%) |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI100                      |              46.14 |          32 |         1228 |             5093 (63%) |              8133 (51%) |              8542 (54%) |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI60                       |              14.75 |          32 |         1024 |             3570 (53%) |              5047 (38%) |              5111 (38%) |\n| 🔴\u0026nbsp;Instinct\u0026nbsp;MI50\u0026nbsp;32GB             |              13.25 |          32 |         1024 |             4446 (66%) |              8477 (64%) |              4406 (33%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;VII       
                   |              13.83 |          16 |         1024 |             4898 (73%) |              7778 (58%) |              5256 (40%) |\n| 🔵\u0026nbsp;Data\u0026nbsp;Center\u0026nbsp;GPU\u0026nbsp;Max\u0026nbsp;1100 |          22.22 |          48 |         1229 |             3769 (47%) |              6303 (39%) |              3520 (22%) |\n| 🟢\u0026nbsp;GH200\u0026nbsp;94GB\u0026nbsp;GPU                 |              66.91 |          94 |         4000 |       20595\u0026nbsp;(79%) |        34689\u0026nbsp;(67%) |        19407\u0026nbsp;(37%) |\n| 🟢\u0026nbsp;H100\u0026nbsp;NVL                            |              60.32 |          94 |         3938 |       20303\u0026nbsp;(79%) |        32922\u0026nbsp;(64%) |        18424\u0026nbsp;(36%) |\n| 🟢\u0026nbsp;H100\u0026nbsp;SXM5\u0026nbsp;80GB\u0026nbsp;HBM3       |              66.91 |          80 |         3350 |       17602\u0026nbsp;(80%) |        29561\u0026nbsp;(68%) |        20227\u0026nbsp;(46%) |\n| 🟢\u0026nbsp;H100\u0026nbsp;PCIe\u0026nbsp;80GB\u0026nbsp;HBM2e      |              51.01 |          80 |         2000 |       11128\u0026nbsp;(85%) |        20624\u0026nbsp;(79%) |        13862\u0026nbsp;(53%) |\n| 🟢\u0026nbsp;A100\u0026nbsp;SXM4\u0026nbsp;80GB                 |              19.49 |          80 |         2039 |       10228\u0026nbsp;(77%) |        18448\u0026nbsp;(70%) |        11197\u0026nbsp;(42%) |\n| 🟢\u0026nbsp;A100\u0026nbsp;PCIe\u0026nbsp;80GB                 |              19.49 |          80 |         1935 |             9657 (76%) |        17896\u0026nbsp;(71%) |        10817\u0026nbsp;(43%) |\n| 🟢\u0026nbsp;PG506-243\u0026nbsp;/\u0026nbsp;PG506-242          |              22.14 |          64 |         1638 |             8195 (77%) |        15654\u0026nbsp;(74%) |        12271\u0026nbsp;(58%) |\n| 🟢\u0026nbsp;A100\u0026nbsp;SXM4\u0026nbsp;40GB                 |              19.49 |          40 |         1555 |             8522 (84%) |        16013\u0026nbsp;(79%) |        
11251\u0026nbsp;(56%) |\n| 🟢\u0026nbsp;A100\u0026nbsp;PCIe\u0026nbsp;40GB                 |              19.49 |          40 |         1555 |             8526 (84%) |        16035\u0026nbsp;(79%) |        11088\u0026nbsp;(55%) |\n| 🟢\u0026nbsp;CMP\u0026nbsp;170HX                           |               6.32 |           8 |         1493 |             7684 (79%) |        12392\u0026nbsp;(64%) |              6859 (35%) |\n| 🟢\u0026nbsp;A30                                      |              10.32 |          24 |          933 |             5004 (82%) |              9721 (80%) |              5726 (47%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;V100\u0026nbsp;SXM2\u0026nbsp;32GB      |              15.67 |          32 |          900 |             4471 (76%) |              8947 (77%) |              7217 (62%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;V100\u0026nbsp;PCIe\u0026nbsp;16GB      |              14.13 |          16 |          900 |             5128 (87%) |        10325\u0026nbsp;(88%) |              7683 (66%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;GV100                        |              16.66 |          32 |          870 |             3442 (61%) |              6641 (59%) |              5863 (52%) |\n| 🟢\u0026nbsp;Titan\u0026nbsp;V                             |              14.90 |          12 |          653 |             3601 (84%) |              7253 (86%) |              6957 (82%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;P100\u0026nbsp;16GB                |               9.52 |          16 |          732 |             3295 (69%) |              5950 (63%) |              4176 (44%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;P100\u0026nbsp;12GB                |               9.52 |          12 |          549 |             2427 (68%) |              4141 (58%) |              3999 (56%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;TITAN              |               4.71 |           6 |          288 |             1460 (77%) |              2500 (67%) |              1113 (30%) |\n| 
🟢\u0026nbsp;Tesla\u0026nbsp;K40m                          |               4.29 |          12 |          288 |             1131 (60%) |              1868 (50%) |               912 (24%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;K80\u0026nbsp;(1\u0026nbsp;GPU)         |               4.11 |          12 |          240 |              916 (58%) |              1642 (53%) |               943 (30%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;K20c                          |               3.52 |           5 |          208 |              861 (63%) |              1507 (56%) |               720 (27%) |\n|                                                  |                    |             |              |                        |                         |                         |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;9070\u0026nbsp;XT         |              48.66 |          16 |          640 |             3089 (74%) |              6688 (80%) |              6090 (73%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;9070                 |              36.13 |          16 |          640 |             3007 (72%) |              5746 (69%) |              6019 (72%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;7900\u0026nbsp;XTX        |              61.44 |          24 |          960 |             3665 (58%) |              7644 (61%) |              7716 (62%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W7900               |              61.30 |          48 |          864 |             3107 (55%) |              5939 (53%) |              5780 (52%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;7900\u0026nbsp;XT         |              51.61 |          20 |          800 |             3013 (58%) |              5856 (56%) |              5986 (58%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;7800\u0026nbsp;XT         |              37.32 |          16 |          624 |             1704 (42%) |              3105 (38%) |              3061 (38%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W7800 
              |              45.20 |          32 |          576 |             1872 (50%) |              4426 (59%) |              4145 (55%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;7900\u0026nbsp;GRE        |              42.03 |          16 |          576 |             1996 (53%) |              4570 (61%) |              4463 (60%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W7700               |              28.30 |          16 |          576 |             1547 (41%) |              2943 (39%) |              2899 (39%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;7600                 |              21.75 |           8 |          288 |             1250 (66%) |              2561 (68%) |              2512 (67%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W7600               |              20.00 |           8 |          288 |             1179 (63%) |              2263 (61%) |              2287 (61%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W7500               |              12.20 |           8 |          172 |              856 (76%) |              1630 (73%) |              1682 (75%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6900\u0026nbsp;XT         |              23.04 |          16 |          512 |             1968 (59%) |              4227 (64%) |              4207 (63%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6800\u0026nbsp;XT         |              20.74 |          16 |          512 |             2008 (60%) |              4241 (64%) |              4224 (64%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;PRO\u0026nbsp;W6800               |              17.83 |          32 |          512 |             1620 (48%) |              3361 (51%) |              3180 (48%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6700\u0026nbsp;XT         |              13.21 |          12 |          384 |             1408 (56%) |              2883 (58%) |              2908 (58%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6800M                |              
11.78 |          12 |          384 |             1439 (57%) |              3190 (64%) |              3213 (64%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6700M                |              10.60 |          10 |          320 |             1194 (57%) |              2388 (57%) |              2429 (58%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6600                 |               8.93 |           8 |          224 |              963 (66%) |              1817 (62%) |              1839 (63%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;6500\u0026nbsp;XT         |               5.77 |           4 |          144 |              459 (49%) |              1011 (54%) |              1030 (55%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;5700\u0026nbsp;XT         |               9.75 |           8 |          448 |             1368 (47%) |              3253 (56%) |              3049 (52%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;5700                 |               7.72 |           8 |          448 |             1521 (52%) |              3167 (54%) |              2758 (47%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;5600\u0026nbsp;XT         |               6.73 |           6 |          288 |             1136 (60%) |              2214 (59%) |              2148 (57%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;Vega\u0026nbsp;64         |              13.35 |           8 |          484 |             1875 (59%) |              2878 (46%) |              3227 (51%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;590                  |               5.53 |           8 |          256 |             1257 (75%) |              1573 (47%) |              1688 (51%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;580\u0026nbsp;4GB         |               6.50 |           4 |          256 |              946 (57%) |              1848 (56%) |              1577 (47%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;RX\u0026nbsp;580\u0026nbsp;2048SP\u0026nbsp;8GB |           4.94 |        
   8 |          224 |              868 (59%) |              1622 (56%) |              1240 (43%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;R9\u0026nbsp;390X                 |               5.91 |           8 |          384 |             1733 (69%) |              2217 (44%) |              1722 (35%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;HD\u0026nbsp;7850                 |               1.84 |           2 |          154 |              112 (11%) |               120 ( 6%) |               635 (32%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;B580\u0026nbsp;LE                    |              14.59 |          12 |          456 |             2598 (87%) |              4443 (75%) |              4979 (84%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;A770\u0026nbsp;LE                    |              19.66 |          16 |          560 |             2663 (73%) |              4568 (63%) |              4519 (62%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;A750\u0026nbsp;LE                    |              17.20 |           8 |          512 |             2555 (76%) |              4314 (65%) |              4047 (61%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;A580                            |              12.29 |           8 |          512 |             2534 (76%) |              3889 (58%) |              3488 (52%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;Pro\u0026nbsp;A40                    |               5.02 |           6 |          192 |              594 (47%) |               985 (40%) |               927 (37%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;A380                            |               4.20 |           6 |          186 |              622 (51%) |              1097 (45%) |              1115 (46%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;5090               |             104.88 |          32 |         1792 |             9522 (81%) |             18459 (79%) |             19141 (82%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;5080               |              56.34 |          16 |          960 |             5174 (82%) |             
10252 (82%) |             10304 (83%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;5070               |              30.84 |          12 |          672 |             3658 (83%) |              7238 (83%) |              7107 (81%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4090               |              82.58 |          24 |         1008 |             5624 (85%) |             11091 (85%) |             11496 (88%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;6000\u0026nbsp;Ada                   |              91.10 |          48 |          960 |             4997 (80%) |             10249 (82%) |             10293 (83%) |\n| 🟢\u0026nbsp;L40S                                     |              91.61 |          48 |          864 |             3788 (67%) |              7637 (68%) |              7617 (68%) |\n| 🟢\u0026nbsp;L40                                      |              90.52 |          48 |          864 |             3870 (69%) |              7778 (69%) |              7945 (71%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4080\u0026nbsp;Super    |              52.22 |          16 |          736 |             4089 (85%) |              7660 (80%) |              8218 (86%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4080               |              55.45 |          16 |          717 |             3914 (84%) |              7626 (82%) |              7933 (85%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4070\u0026nbsp;Ti\u0026nbsp;Super |         44.10 |          16 |          672 |             3694 (84%) |              6435 (74%) |              7295 (84%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4090M              |              28.31 |          16 |          576 |             3367 (89%) |              6545 (87%) |              6901 (92%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4070\u0026nbsp;Super    |              35.55 |          12 |          504 |             2751 (83%) |              5149 (79%) |              5554 (85%) |\n| 
🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4070               |              29.15 |          12 |          504 |             2646 (80%) |              4548 (69%) |              5016 (77%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4080M              |              33.85 |          12 |          432 |             2577 (91%) |              5086 (91%) |              5114 (91%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;4000\u0026nbsp;Ada                   |              26.73 |          20 |          360 |             2130 (91%) |              3964 (85%) |              4221 (90%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4060               |              15.11 |           8 |          272 |             1614 (91%) |              3052 (86%) |              3124 (88%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;4070M              |              18.25 |           8 |          256 |             1553 (93%) |              2945 (89%) |              3092 (93%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;2000\u0026nbsp;Ada                   |              12.00 |          16 |          224 |             1351 (92%) |              2452 (84%) |              2526 (87%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3090\u0026nbsp;Ti       |              40.00 |          24 |         1008 |             5717 (87%) |             10956 (84%) |             10400 (79%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3090               |              39.05 |          24 |          936 |             5418 (89%) |             10732 (88%) |             10215 (84%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3080\u0026nbsp;Ti       |              37.17 |          12 |          912 |             5202 (87%) |              9832 (87%) |              9347 (79%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3080\u0026nbsp;12GB     |              32.26 |          12 |          912 |             5071 (85%) |              9657 (81%) |              8615 (73%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;A6000    
                       |              40.00 |          48 |          768 |             4421 (88%) |              8814 (88%) |              8533 (86%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3080\u0026nbsp;10GB     |              29.77 |          10 |          760 |             4230 (85%) |              8118 (82%) |              7714 (78%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3070\u0026nbsp;Ti       |              21.75 |           8 |          608 |             3490 (88%) |              6807 (86%) |              5926 (75%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3080M\u0026nbsp;Ti      |              23.61 |          16 |          512 |             2985 (89%) |              5908 (89%) |              5780 (87%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3070               |              20.31 |           8 |          448 |             2578 (88%) |              5096 (88%) |              5060 (87%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3060\u0026nbsp;Ti       |              16.49 |           8 |          448 |             2644 (90%) |              5129 (88%) |              4718 (81%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;A4000                           |              19.17 |          16 |          448 |             2500 (85%) |              4945 (85%) |              4664 (80%) |\n| 🟢\u0026nbsp;RTX\u0026nbsp;A5000M                          |              16.59 |          16 |          448 |             2228 (76%) |              4461 (77%) |              3662 (63%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3060               |              13.17 |          12 |          360 |             2108 (90%) |              4070 (87%) |              3566 (76%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3060M              |              10.94 |           6 |          336 |             2019 (92%) |              4012 (92%) |              3572 (82%) |\n| 🟢\u0026nbsp;A2                                       |               4.53 |     
     15 |          200 |             1031 (79%) |              2051 (79%) |              1199 (46%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3050M\u0026nbsp;Ti      |               7.60 |           4 |          192 |             1181 (94%) |              2341 (94%) |              2253 (90%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;3050M              |               7.13 |           4 |          192 |             1180 (94%) |              2339 (94%) |              2016 (81%) |\n| 🟢\u0026nbsp;Titan\u0026nbsp;RTX                           |              16.31 |          24 |          672 |             3471 (79%) |              7456 (85%) |              7554 (87%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;RTX\u0026nbsp;6000                |              16.31 |          24 |          672 |             3307 (75%) |              6836 (78%) |              6879 (79%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;RTX\u0026nbsp;8000\u0026nbsp;Passive   |              14.93 |          48 |          624 |             2591 (64%) |              5408 (67%) |              5607 (69%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2080\u0026nbsp;Ti       |              13.45 |          11 |          616 |             3194 (79%) |              6700 (84%) |              6853 (86%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2080\u0026nbsp;Super    |              11.34 |           8 |          496 |             2434 (75%) |              5284 (82%) |              5087 (79%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;RTX\u0026nbsp;5000                |              11.15 |          16 |          448 |             2341 (80%) |              4766 (82%) |              4773 (82%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2080               |              10.07 |           8 |          448 |             2318 (79%) |              4977 (86%) |              4963 (85%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2070\u0026nbsp;Super    |               9.22 |           8 |          448 |      
       2255 (77%) |              4866 (84%) |              4893 (84%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2070               |               7.47 |           8 |          448 |             2444 (83%) |              4387 (75%) |              5017 (86%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2060\u0026nbsp;Super    |               7.18 |           8 |          448 |             2503 (85%) |              5035 (87%) |              4463 (77%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;RTX\u0026nbsp;4000                |               7.12 |           8 |          416 |             2284 (84%) |              4584 (85%) |              4062 (75%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2060\u0026nbsp;KO       |               6.74 |           6 |          336 |             1643 (75%) |              3376 (77%) |              3266 (75%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;RTX\u0026nbsp;2060               |               6.74 |           6 |          336 |             1681 (77%) |              3604 (83%) |              3571 (82%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1660\u0026nbsp;Super    |               5.03 |           6 |          336 |             1696 (77%) |              3551 (81%) |              3040 (70%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;T4                            |               8.14 |          15 |          300 |             1356 (69%) |              2869 (74%) |              2887 (74%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1660\u0026nbsp;Ti       |               5.48 |           6 |          288 |             1467 (78%) |              3041 (81%) |              3019 (81%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1660               |               5.07 |           6 |          192 |             1016 (81%) |              1924 (77%) |              1992 (80%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1650M\u0026nbsp;896C    |               2.72 |           4 |          192 |              963 (77%) |           
   1836 (74%) |              1858 (75%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1650M\u0026nbsp;1024C   |               3.20 |           4 |          128 |              706 (84%) |              1214 (73%) |              1400 (84%) |\n| 🟢\u0026nbsp;T500                                     |               3.04 |           4 |           80 |              339 (65%) |               578 (56%) |               665 (64%) |\n| 🟢\u0026nbsp;Titan\u0026nbsp;Xp                            |              12.15 |          12 |          548 |             2919 (82%) |              5495 (77%) |              5375 (76%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1080\u0026nbsp;Ti       |              12.06 |          11 |          484 |             2631 (83%) |              4837 (77%) |              4877 (78%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1080               |               9.78 |           8 |          320 |             1623 (78%) |              3100 (75%) |              3182 (77%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1060\u0026nbsp;6GB      |               4.57 |           6 |          192 |              997 (79%) |              1925 (77%) |              1785 (72%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1060M              |               4.44 |           6 |          192 |              983 (78%) |              1882 (75%) |              1803 (72%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;1050M Ti           |               2.49 |           4 |          112 |              631 (86%) |              1224 (84%) |              1115 (77%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;P1000                        |               1.89 |           4 |           82 |              426 (79%) |               839 (79%) |               778 (73%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;980\u0026nbsp;Ti        |               6.05 |           6 |          336 |             1509 (69%) |              2703 (62%) |              2381 (55%) |\n| 
🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;980                |               4.98 |           4 |          224 |             1018 (70%) |              1965 (68%) |              1872 (64%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;970                |               4.17 |           4 |          224 |              980 (67%) |              1721 (59%) |              1623 (56%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;M4000                        |               2.57 |           8 |          192 |              899 (72%) |              1519 (61%) |              1050 (42%) |\n| 🟢\u0026nbsp;Tesla\u0026nbsp;M60\u0026nbsp;(1\u0026nbsp;GPU)         |               4.82 |           8 |          160 |              853 (82%) |              1571 (76%) |              1557 (75%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;960M               |               1.51 |           4 |           80 |              442 (84%) |               872 (84%) |               627 (60%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;770                |               3.33 |           2 |          224 |              800 (55%) |              1215 (42%) |               876 (30%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GTX\u0026nbsp;680\u0026nbsp;4GB       |               3.33 |           4 |          192 |              783 (62%) |              1274 (51%) |               814 (33%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;K2000                        |               0.73 |           2 |           64 |              312 (75%) |               444 (53%) |               171 (21%) |\n| 🟢\u0026nbsp;GeForce\u0026nbsp;GT\u0026nbsp;630\u0026nbsp;(OEM)      |               0.46 |           2 |           29 |              151 (81%) |               185 (50%) |                78 (21%) |\n| 🟢\u0026nbsp;Quadro\u0026nbsp;NVS\u0026nbsp;290                 |               0.03 |         1/4 |            6 |                9 (22%) |                 4 ( 5%) |                 4 ( 5%) |\n| 🟤\u0026nbsp;Arise\u0026nbsp;1020             
             |               1.50 |           2 |           19 |                6 ( 5%) |                 6 ( 2%) |                 6 ( 2%) |\n|                                                  |                    |             |              |                        |                         |                         |\n| ⚪\u0026nbsp;M2\u0026nbsp;Ultra\u0026nbsp;GPU\u0026nbsp;76CU\u0026nbsp;192GB |           19.46 |         147 |          800 |             4629 (89%) |              8769 (84%) |              7972 (77%) |\n| ⚪\u0026nbsp;M2\u0026nbsp;Max\u0026nbsp;GPU\u0026nbsp;38CU\u0026nbsp;32GB |               9.73 |          22 |          400 |             2405 (92%) |              4641 (89%) |              2444 (47%) |\n| ⚪\u0026nbsp;M1\u0026nbsp;Ultra\u0026nbsp;GPU\u0026nbsp;64CU\u0026nbsp;128GB |           16.38 |          98 |          800 |             4519 (86%) |              8418 (81%) |              6915 (67%) |\n| ⚪\u0026nbsp;M1\u0026nbsp;Max\u0026nbsp;GPU\u0026nbsp;24CU\u0026nbsp;32GB |               6.14 |          22 |          400 |             2369 (91%) |              4496 (87%) |              2777 (53%) |\n| ⚪\u0026nbsp;M1\u0026nbsp;Pro\u0026nbsp;GPU\u0026nbsp;16CU\u0026nbsp;16GB |               4.10 |          11 |          200 |             1204 (92%) |              2329 (90%) |              1855 (71%) |\n| ⚪\u0026nbsp;M1\u0026nbsp;GPU\u0026nbsp;8CU\u0026nbsp;16GB           |               2.05 |          11 |           68 |              384 (86%) |               758 (85%) |               759 (86%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;8060S\u0026nbsp;Graphics\u0026nbsp;(Max+\u0026nbsp;395) |  29.70 |          15 |          256 |             1231 (74%) |              2541 (76%) |              2563 (77%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;780M\u0026nbsp;(Z1\u0026nbsp;Extreme)  |               8.29 |           8 |          102 |              443 (66%) |               860 (65%) |               820 (62%) |\n| 
🔴\u0026nbsp;Radeon\u0026nbsp;Graphics\u0026nbsp;(7800X3D)      |               0.56 |          12 |          102 |              338 (51%) |               498 (37%) |               283 (21%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;Vega\u0026nbsp;8\u0026nbsp;(4750G)     |               2.15 |          27 |           57 |              263 (71%) |               511 (70%) |               501 (68%) |\n| 🔴\u0026nbsp;Radeon\u0026nbsp;Vega\u0026nbsp;8\u0026nbsp;(3500U)     |               1.23 |           7 |           38 |              157 (63%) |               282 (57%) |               288 (58%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;140V\u0026nbsp;GPU\u0026nbsp;(16GB)       |               3.99 |          16 |          137 |              636 (71%) |              1282 (72%) |               773 (44%) |\n| 🔵\u0026nbsp;Arc\u0026nbsp;Graphics\u0026nbsp;(Ultra\u0026nbsp;9\u0026nbsp;185H) |        4.81 |          14 |           90 |              271 (46%) |               710 (61%) |               724 (62%) |\n| 🔵\u0026nbsp;Iris\u0026nbsp;Xe\u0026nbsp;Graphics\u0026nbsp;(i7-1265U) |             1.92 |          13 |           77 |              342 (68%) |               621 (62%) |               574 (58%) |\n| 🔵\u0026nbsp;UHD\u0026nbsp;Graphics\u0026nbsp;Xe\u0026nbsp;32EUs     |               0.74 |          25 |           51 |              128 (38%) |               245 (37%) |               216 (32%) |\n| 🔵\u0026nbsp;UHD\u0026nbsp;Graphics\u0026nbsp;770               |               0.82 |          30 |           90 |              342 (58%) |               475 (41%) |               278 (24%) |\n| 🔵\u0026nbsp;UHD\u0026nbsp;Graphics\u0026nbsp;630               |               0.46 |           7 |           51 |              151 (45%) |               301 (45%) |               187 (28%) |\n| 🔵\u0026nbsp;UHD\u0026nbsp;Graphics\u0026nbsp;P630              |               0.46 |          51 |           42 |              177 (65%) |               288 (53%) |               137 (25%) |\n| 
🔵\u0026nbsp;HD\u0026nbsp;Graphics\u0026nbsp;5500               |               0.35 |           3 |           26 |               75 (45%) |               192 (58%) |               108 (32%) |\n| 🔵\u0026nbsp;HD\u0026nbsp;Graphics\u0026nbsp;4600               |               0.38 |           2 |           26 |              105 (63%) |               115 (35%) |                34 (10%) |\n| 🟡\u0026nbsp;Mali-G610\u0026nbsp;MP4 (Orange\u0026nbsp;Pi\u0026nbsp;5) |             0.06 |          16 |           34 |              130 (58%) |               232 (52%) |                93 (21%) |\n| 🟡\u0026nbsp;Mali-G72\u0026nbsp;MP18 (Samsung\u0026nbsp;S9+)    |               0.24 |           4 |           29 |              110 (59%) |               230 (62%) |                21 ( 6%) |\n|                                                  |                    |             |              |                        |                         |                         |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;9754                   |              50.79 |        3072 |          922 |             3276 (54%) |              5077 (42%) |              5179 (43%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;9654                   |              43.62 |        1536 |          922 |             1381 (23%) |              1814 (15%) |              1801 (15%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;9554                   |              30.72 |         384 |          922 |             2552 (42%) |              2127 (18%) |              2144 (18%) |\n| 🔴\u0026nbsp;1x\u0026nbsp;EPYC\u0026nbsp;9124                   |               3.69 |         128 |          307 |              772 (38%) |               579 (15%) |               586 (15%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;7713                   |               8.19 |         512 |          410 |             1298 (48%) |               492 ( 9%) |              1418 (27%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;7352               
    |               3.53 |         512 |          410 |              739 (28%) |               106 ( 2%) |               412 ( 8%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;7313                   |               3.07 |         128 |          410 |              498 (19%) |               367 ( 7%) |               418 ( 8%) |\n| 🔴\u0026nbsp;2x\u0026nbsp;EPYC\u0026nbsp;7302                   |               3.07 |         128 |          410 |              784 (29%) |               336 ( 6%) |               411 ( 8%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;6980P                  |              98.30 |        6144 |         1690 |             7875 (71%) |              5112 (23%) |              5610 (26%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;6979P                  |              92.16 |        3072 |         1690 |             8135 (74%) |              4175 (19%) |              4622 (21%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8592+    |              31.13 |        1024 |          717 |             3135 (67%) |              2359 (25%) |              2466 (26%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Gold\u0026nbsp;6548N        |              22.94 |        2048 |          666 |             1811 (42%) |              1388 (16%) |              1425 (16%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;CPU\u0026nbsp;Max\u0026nbsp;9480 |              27.24 |         256 |          614 |             2037 (51%) |              1520 (19%) |              1464 (18%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8480+    |              28.67 |         512 |          614 |             2162 (54%) |              1845 (23%) |              1884 (24%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8470     |              25.29 |        2048 |          614 |             1865 (46%) |              1909 (24%) |              2068 (26%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Gold\u0026nbsp;6438Y+       |          
    16.38 |        1024 |          614 |             1945 (48%) |              1219 (15%) |              1257 (16%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8380     |              23.55 |        2048 |          410 |             1410 (53%) |              1159 (22%) |              1298 (24%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8358     |              21.30 |         256 |          410 |             1285 (48%) |              1007 (19%) |              1120 (21%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8256     |               3.89 |        1536 |          282 |              396 (22%) |               158 ( 4%) |               175 ( 5%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Platinum\u0026nbsp;8153     |               8.19 |         384 |          256 |              691 (41%) |               290 ( 9%) |               328 (10%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Gold\u0026nbsp;6248R        |              18.43 |         384 |          282 |              755 (41%) |               566 (15%) |               694 (19%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;Gold\u0026nbsp;6128         |               5.22 |         192 |          256 |              254 (15%) |               185 ( 6%) |               193 ( 6%) |\n| 🔵\u0026nbsp;Xeon\u0026nbsp;Phi\u0026nbsp;7210                  |               5.32 |         192 |          102 |              415 (62%) |               193 (15%) |               223 (17%) |\n| 🔵\u0026nbsp;4x\u0026nbsp;Xeon\u0026nbsp;E5-4620\u0026nbsp;v4        |               2.69 |         512 |          273 |              460 (26%) |               275 ( 8%) |               239 ( 7%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;E5-2630\u0026nbsp;v4        |               1.41 |          64 |          137 |              264 (30%) |               146 ( 8%) |               129 ( 7%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;E5-2623\u0026nbsp;v4        |               
0.67 |          64 |          137 |              125 (14%) |                66 ( 4%) |                59 ( 3%) |\n| 🔵\u0026nbsp;2x\u0026nbsp;Xeon\u0026nbsp;E5-2680\u0026nbsp;v3        |               1.92 |         128 |          137 |              304 (34%) |               234 (13%) |               291 (16%) |\n| 🟢\u0026nbsp;GH200\u0026nbsp;Neoverse-V2\u0026nbsp;CPU          |               7.88 |         480 |          384 |             1323 (53%) |               853 (17%) |               683 (14%) |\n| 🔴\u0026nbsp;Threadripper\u0026nbsp;PRO\u0026nbsp;7995WX        |              15.36 |         256 |          333 |             1134 (52%) |              1697 (39%) |              1715 (40%) |\n| 🔴\u0026nbsp;Threadripper\u0026nbsp;3970X                  |               3.79 |         128 |          102 |              376 (56%) |               103 ( 8%) |               463 (35%) |\n| 🔴\u0026nbsp;Threadripper\u0026nbsp;1950X                  |               0.87 |         128 |           85 |              273 (49%) |                43 ( 4%) |               151 (14%) |\n| 🔴\u0026nbsp;Ryzen\u0026nbsp;9\u0026nbsp;7900X3D                |               1.69 |         128 |           83 |              278 (51%) |               521 (48%) |               462 (43%) |\n| 🔴\u0026nbsp;Ryzen\u0026nbsp;7\u0026nbsp;7800X3D                |               1.08 |          32 |          102 |              296 (44%) |               361 (27%) |               363 (27%) |\n| 🔴\u0026nbsp;Ryzen\u0026nbsp;7\u0026nbsp;5700X3D                |               0.87 |          32 |           51 |              229 (68%) |               135 (20%) |               173 (26%) |\n| 🔴\u0026nbsp;FX-6100                                  |               0.16 |          16 |           26 |               11 ( 7%) |                11 ( 3%) |                22 ( 7%) |\n| 🔴\u0026nbsp;Athlon\u0026nbsp;X2\u0026nbsp;QL-65                |               0.03 |           4 |           11 |                3 ( 4%) |     
            2 ( 2%) |                 3 ( 2%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;Ultra\u0026nbsp;7\u0026nbsp;258V         |               0.56 |          32 |          137 |              287 (32%) |               123 ( 7%) |               167 ( 9%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;Ultra\u0026nbsp;9\u0026nbsp;185H         |               1.79 |          16 |           90 |              317 (54%) |               267 (23%) |               288 (25%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i9-14900K                      |               3.74 |          32 |           96 |              443 (71%) |               453 (36%) |               490 (39%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-13700K                      |               2.51 |          64 |           90 |              504 (86%) |               398 (34%) |               424 (36%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-1265U                       |               1.23 |          32 |           77 |              128 (26%) |                62 ( 6%) |                58 ( 6%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i9-11900KB                     |               0.84 |          32 |           51 |              109 (33%) |               195 (29%) |               208 (31%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i9-10980XE                     |               3.23 |         128 |           94 |              286 (47%) |               251 (21%) |               223 (18%) |\n| 🔵\u0026nbsp;Xeon\u0026nbsp;E-2288G                        |               0.95 |          32 |           43 |              196 (70%) |               182 (33%) |               198 (36%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-9700                        |               0.77 |          64 |           43 |              103 (37%) |                62 (11%) |                95 (17%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i5-9600                        |               0.60 |          16 |           43 |              146 (52%) |               127 (23%) |               147 (27%) |\n| 
🔵\u0026nbsp;Core\u0026nbsp;i7-8700K                       |               0.71 |          16 |           51 |              152 (45%) |               134 (20%) |               116 (17%) |\n| 🔵\u0026nbsp;Xeon\u0026nbsp;E-2176G                        |               0.71 |          64 |           42 |              201 (74%) |               136 (25%) |               148 (27%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-7700HQ                      |               0.36 |          12 |           38 |               81 (32%) |                82 (16%) |               108 (22%) |\n| 🔵\u0026nbsp;Xeon\u0026nbsp;E3-1240\u0026nbsp;v5                |               0.50 |          32 |           34 |              141 (63%) |                75 (17%) |                88 (20%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-4770                        |               0.44 |          16 |           26 |              104 (62%) |                69 (21%) |                59 (18%) |\n| 🔵\u0026nbsp;Core\u0026nbsp;i7-4720HQ                      |               0.33 |          16 |           26 |               80 (48%) |                23 ( 7%) |                60 (18%) |\n| 🔵\u0026nbsp;Celeron\u0026nbsp;N2807                       |               0.01 |           4 |           11 |                7 (10%) |                 3 ( 2%) |                 3 ( 2%) |\n\n\u003c/details\u003e\n\n\n\n## Multi-GPU Benchmarks\n\nMulti-GPU benchmarks are done at the largest possible grid resolution with cubic domains, and either 2x1x1, 2x2x1 or 2x2x2 of these domains together. 
The (percentages in round brackets) give the single-GPU [roofline model](https://en.wikipedia.org/wiki/Roofline_model) efficiency, and the (multipliers in round brackets) give the scaling factor relative to the benchmarked single-GPU performance.\n\n```mermaid\ngantt\n\ntitle FluidX3D Performance [MLUPs/s] - FP32 arithmetic, (fastest of FP32/FP16S/FP16C) memory storage\ndateFormat X\naxisFormat %s\n%%{\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprojectphysx%2Ffluidx3d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprojectphysx%2Ffluidx3d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprojectphysx%2Ffluidx3d/lists"}