{"id":17632750,"url":"https://github.com/finmath/finmath-lib-cuda-extensions","last_synced_at":"2025-05-05T22:37:36.446Z","repository":{"id":52529909,"uuid":"94670518","full_name":"finmath/finmath-lib-cuda-extensions","owner":"finmath","description":"Classes enabling finmath-lib to run its Monte-Carlo models on Cuda GPUs","archived":false,"fork":false,"pushed_at":"2024-10-03T18:50:16.000Z","size":6997,"stargazers_count":9,"open_issues_count":1,"forks_count":5,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-31T00:23:57.651Z","etag":null,"topics":["cuda","finmath-lib","gpu"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finmath.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-18T06:54:23.000Z","updated_at":"2024-05-28T05:52:50.000Z","dependencies_parsed_at":"2024-10-23T07:19:03.112Z","dependency_job_id":"8dbbc592-71a1-4154-ad6e-9e1f04805a1d","html_url":"https://github.com/finmath/finmath-lib-cuda-extensions","commit_stats":null,"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finmath%2Ffinmath-lib-cuda-extensions","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finmath%2Ffinmath-lib-cuda-extensions/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finmath%2Ffinmath-lib-cuda-extensions/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fin
math%2Ffinmath-lib-cuda-extensions/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finmath","download_url":"https://codeload.github.com/finmath/finmath-lib-cuda-extensions/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252588409,"owners_count":21772673,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cuda","finmath-lib","gpu"],"created_at":"2024-10-23T01:45:32.010Z","updated_at":"2025-05-05T22:37:36.426Z","avatar_url":"https://github.com/finmath.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"finmath lib cuda extensions\n==========\n\nProject home page: https://finmath.net/finmath-lib-cuda-extensions\n\n****************************************\n\n**Vector class (RandomVariable) running on GPUs using CUDA.**\n\n**Enabling finmath lib with Cuda via jcuda. 
- Running finmath lib models on a GPU.**\n\n****************************************\n\nThe *finmath lib cuda extensions* provide a **Cuda** implementation of the [finmath lib](http://finmath.net/finmath-lib/) interfaces `RandomVariable` and `BrownianMotion`, compatible with finmath lib 4.0.12 or later\n(tested on GRID GPU Kepler GK105, GeForce GTX 1080, GeForce GT 750M).\n\nFor OpenCL support see finmath-lib-opencl-extensions.\n\nPerformance Characteristics\n-------------------------------------\n\n\u003cimg src=\"images/LIBORMarketModelCalibrationATMTest-Graph.png\" style=\"width: 50%; float: right;\"/\u003e\n\nThe current implementation uses very small CUDA kernels, which affects performance. This may be optimized quite straightforwardly in future versions.\nThis implies a specific performance characteristic: the CUDA communication overhead constitutes a certain amount of \"fixed costs\".\nDepending on GPU and CPU specifics, the performance is on par for Monte-Carlo simulations with 5000 paths.\nHowever, for larger numbers of paths, the CPU time scales linearly, while the GPU time shows almost no change. That is, for a\nMonte-Carlo simulation with 50000 paths, the GPU is 10 times faster than the CPU. For 100000 paths, the GPU is 20 times faster than the CPU.\n\n\n### Limitations\n\nThe main limitation is GPU memory. 
RandomVariable objects are held on the GPU and kept as long as they are referenced.\nSince multiple processes may acquire GPU memory, it may be unclear how much GPU memory is available.\nLarger Monte-Carlo simulations may require 12 GB or more of GPU memory.\n\nAnother aspect that may affect the performance is the CUDA implementation.\n\n\u003cdiv style=\"clear: both;\"/\u003e\n\nInterfaces for which CUDA Implementations are Provided\n-------------------------------------\n\n### RandomVariable\n\nA `RandomVariableCudaFactory` is provided, which can be injected into any finmath lib model/algorithm that uses a random variable factory to construct `RandomVariable` objects.\n\nObjects created from this factory, or derived from such objects, perform their calculations on the CUDA device (GPU).\n\nThe implementation supports type priorities (see http://ssrn.com/abstract=3246127 ) and the default priority of `RandomVariableCuda` is 20. For example: operators involving CPU and GPU vectors will result in GPU vectors.\n\nThe `RandomVariableCudaFactory` can be combined with *algorithmic differentiation* (AAD) wrappers, for example `RandomVariableDifferentiableAAD`, to allow algorithmic differentiation together with calculations performed on the GPU. For the type priority: objects allowing for algorithmic differentiation (AAD) have higher priority, and AAD on GPU has higher priority than AAD on CPU.\n\n\n### BrownianMotion\n\n\nIn addition, objects of type `BrownianMotion` also take the role of a factory for objects of type `RandomVariable`. Thus, injecting the `BrownianMotionCuda` into classes consuming a `BrownianMotion` will result in finmath-lib models performing their calculations on the GPU - seamlessly.\n\nDistribution\n-------------------------------------\n\nfinmath-lib-cuda-extensions is distributed through the central Maven repository. 
Its coordinates are:\n\n    \u003cgroupId\u003enet.finmath\u003c/groupId\u003e\n    \u003cartifactId\u003efinmath-lib-cuda-extensions\u003c/artifactId\u003e\n\n\nThe project is currently built for Cuda 10.2.\nFor other Cuda versions use the Maven command line property `cuda.version` set to one of `8.0`, `9.2`, `10.0`, `10.1`, `10.2` and the Maven classifier `cuda-${cuda.version}`.\n\nExample\n-------------------------------------\n\nCreate a vector of floats on the GPU device\n\n```\nRandomVariable randomVariable = new RandomVariableCuda(new float[] {-4.0f, -2.0f, 0.0f, 2.0f, 4.0f} );\n```\n\nperform some calculations (still on the GPU device)\n\n```\nrandomVariable = randomVariable.add(4.0);\nrandomVariable = randomVariable.div(2.0);\n```\n\nperform a reduction on the GPU device\n\n```\ndouble average = randomVariable.getAverage();\n```\n\nor get the result vector (to the host)\n\n```\ndouble[] result = randomVariable.getRealizations();\n```\n\n(note: the result is always double, since different implementations may support float or double on the device).\n\n### RandomVariableFactory\n\nA better approach is to use a `RandomVariableFactory` instead of a constructor. 
You may then write your program in terms of the `RandomVariableFactory` interface:\n\n\n```\nRandomVariableFactory randomVariableFactory = new RandomVariableCudaFactory();\n```\nand then\n```\nRandomVariable randomVariable = randomVariableFactory.createRandomVariable(new float[] {-4.0f, -2.0f, 0.0f, 2.0f, 4.0f} );\n```\n\nIf you write your code in terms of the `RandomVariableFactory` interface, you may easily switch implementations by using one of the following factories:\n\n- `RandomVariableCudaFactory` - CUDA vectors running on a CUDA device (GPU).\n- `RandomVariableFromArrayFactory` - Java floating point array (single or double precision).\n- `RandomVariableDifferentiableAADFactory` - Endowing any of the above with adjoint algorithmic differentiation.\n\nInstallation / Build\n-------------------------------------\n\nBinary distribution is available via Maven central.\n\nYou have to have NVIDIA Cuda 10.1 installed. The Maven configuration comes with profiles for Cuda 8.0, 9.2, 10.0 and 10.1.\nIf you would like to use a different version, you can try to switch the JCuda version by setting the property `cuda.version`\non the Maven command line.\n\nTo build the project yourself and run the unit tests from the source repository:\n\nObtain the finmath-lib-cuda-extensions source\n\n```\ngit clone https://github.com/finmath/finmath-lib-cuda-extensions.git\ncd finmath-lib-cuda-extensions\n```\n\n...then build the code.\n\n```\nmvn clean package\n```\n\nThis will build the version using Cuda 10.2. For Cuda 10.1 use\n\n```\nmvn -Dcuda.version=10.1 clean package\n```\n\nIf everything goes well, you will see the unit tests run. Note that some of the tests may fail if the device (GPU) does not have enough memory.\n\nTrying more\n-------------------------------------\n\nYou may turn on logging using `-Djava.util.logging.config.file=logging.properties`. 
To try different configurations you may use\n\n  - `-Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceType=GPU -Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceIndex=0`\n  - `-Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceType=CPU -Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceIndex=0`\n  - `-Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceType=GPU -Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceIndex=1`\n\nfor example\n\n```\nmvn clean install test -Djava.util.logging.config.file=logging.properties -Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceType=GPU -Dnet.finmath.montecarlo.opencl.RandomVariableOpenCL.deviceIndex=1\n```\n\nYou may run dedicated tests using\n\n  - `-Dtest=RandomVariableGPUTest`\n  - `-Dtest=MonteCarloBlackScholesModelTest`\n  - `-Dtest=LIBORMarketModelCalibrationATMTest`\n  - `-Dtest=LIBORMarketModelCalibrationTest`\n\nThe last two tests run computationally heavy Monte-Carlo interest rate models. These tests may fail on devices that lack sufficient memory.\n\nTrying on Amazon EC2\n-------------------------------------\n\nIf you do not have a machine with NVIDIA Cuda 10.0 at hand, you may try out the finmath-lib-cuda-extensions on an Amazon EC2 machine. To do so:\n\n* Create an Amazon AWS account (if needed) and go to your AWS console.\n* Select to start an EC2 virtual server.\n* Launch a GPU instance\n  - Filter the list of images (AMI) using `gpu` and select - e.g. - `Deep Learning Base AMI (Ubuntu) Version 19.0`.\n  - Filter the list of servers using the \"GPU instances\" filter and select an instance.\n* Log in to your GPU instance.\n* Check that you have Cuda 10.0 (e.g. 
use `nvcc --version`)\n* Try finmath-lib-cuda-extensions as described in the previous section.\n\nPerformance\n-------------------------------------\n\n\n### Unit test for random number generation:\n\n\n```\nRunning net.finmath.montecarlo.BrownianMotionTest\nTest of performance of BrownianMotionLazyInit                  \t..........test took 49.057 sec.\nTest of performance of BrownianMotionJavaRandom                \t..........test took 65.558 sec.\nTest of performance of BrownianMotionCudaWithHostRandomVariable\t..........test took 4.633 sec.\nTest of performance of BrownianMotionCudaWithRandomVariableCuda\t..........test took 2.325 sec.\n```\n\n\n### Unit test for Monte-Carlo simulation\n\n\n```\nRunning net.finmath.montecarlo.assetderivativevaluation.MonteCarloBlackScholesModelTest\nBrownianMotionLazyInit                    calculation time =  4.00 sec   value Monte-Carlo =  0.1898\t value analytic    =  0.1899.\nBrownianMotionJavaRandom                  calculation time =  5.19 sec   value Monte-Carlo =  0.1901\t value analytic    =  0.1899\t.\nBrownianMotionCudaWithHostRandomVariable  calculation time =  2.50 sec   value Monte-Carlo =  0.1898\t value analytic    =  0.1899.\nBrownianMotionCudaWithRandomVariableCuda  calculation time =  0.09 sec   value Monte-Carlo =  0.1898\t value analytic    =  0.1899\t.\n```\n\nRemark:\n* `BrownianMotionLazyInit`: Calculation on CPU, using Mersenne Twister.\n* `BrownianMotionJavaRandom`: Calculation on CPU, using Java random number generator (LCG).\n* `BrownianMotionCudaWithHostRandomVariable`: Calculation on CPU and GPU: Random number generator on GPU, Simulation on CPU.\n* `BrownianMotionCudaWithRandomVariableCuda`: Calculation on GPU: Random number generator on GPU, Simulation on GPU.\n\n\n### Unit test for LIBOR Market Model calibration\n\n\nThere is also a unit test performing a brute force Monte-Carlo calibration of a LIBOR Market Model with stochastic volatility on the CPU and the GPU. 
Note, however, that the unit test uses too few simulation paths for the GPU code to improve on the CPU code. The unit test does show that CPU and GPU give consistent results.\n\nThe performance of a brute-force Monte-Carlo calibration with 80K and 160K paths is given below. Note: if the number of paths is increased, the GPU time remains almost the same (given that the GPU has sufficient memory), while the CPU time grows linearly. This is because a large part of the GPU run time is fixed management overhead (which will be reduced in future versions).\n\nThe CPU version was run on an Intel i7-7800X 3.5 GHz using multi-threaded calibration.\nThe GPU version was run on an NVIDIA GeForce GTX 1080.\n\n\n#### LMM with 81,920 paths\n\n\n```\nRunning net.finmath.montecarlo.interestrates.LIBORMarketModelCalibrationTest\n\nCalibration to Swaptions using CPU    calculation time = 364.42 sec    RMS Error.....: 0.198%.\nCalibration to Swaptions using GPU    calculation time =  49.46 sec    RMS Error.....: 0.198%.\n```\n(LIBOR Market Model with stochastic volatility, 6 factors, 81920 paths)\n\n\n#### LMM with 163,840 paths\n\n\n```\nRunning net.finmath.montecarlo.interestrates.LIBORMarketModelCalibrationTest\n\nCalibration to Swaptions using CPU    calculation time = 719.33 sec    RMS Error.....: 0.480%.\nCalibration to Swaptions using GPU    calculation time =  51.70 sec    RMS Error.....: 0.480%.\n```\n(LIBOR Market Model with stochastic volatility, 6 factors, 163840 paths)\n\n\nProfiles for Other Cuda Versions\n-------------------------------------\n\nThe default profile will build the version using Cuda 10.2.\n\n\n#### Cuda 11.1\n\n\nFor Cuda 11.1 use\n\n```\nmvn -Pcuda-11.1 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=11.1 clean package\n```\n\n\n#### Cuda 11.0\n\n\nFor Cuda 11.0 use\n\n```\nmvn -Pcuda-11.0 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=11.0 clean package\n```\n\n\n#### Cuda 10.1\n\n\nFor 
Cuda 10.1 use\n\n```\nmvn -Pcuda-10.1 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=10.1 clean package\n```\n\n\n#### Cuda 10.0\n\n\nFor Cuda 10.0 use\n\n```\nmvn -Pcuda-10.0 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=10.0 clean package\n```\n\n\n#### Cuda 9.2\n\n\nFor Cuda 9.2 use\n\n```\nmvn -Pcuda-9.2 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=9.2 clean package\n```\n\n\n#### Cuda 8.0\n\n\nFor Cuda 8.0 use\n\n```\nmvn -Pcuda-8.0 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=8.0 clean package\n```\n\n\n#### Cuda 6.0\n\n\nFor Cuda 6.0 use\n\n```\nmvn -Pcuda-6.0 clean package\n```\n\nor\n\n```\nmvn -Dcuda.version=6.0 clean package\n```\n\n\nFor Cuda 6.0 the jcuda binaries are not unpacked automatically and have to be installed manually. Set `LD_LIBRARY_PATH` (\*nix environment variable) or `java.library.path` (Java system property) to the directory containing the platform-specific jcuda binaries.\n\n\n\nReferences\n-------\n\n* [finmath lib Project documentation](http://finmath.net/finmath-lib/)\nprovides the documentation of the library.\n* [finmath lib API documentation](http://finmath.net/finmath-lib/apidocs/)\nprovides the documentation of the library API.\n* [finmath.net special topics](http://www.finmath.net/topics)\ncover some selected topics with demo spreadsheets and UML diagrams.\nSome topics come with additional documentation (technical papers).\n\nLicense\n-------\n\nThe code of \"finmath lib\", \"finmath lib opencl extensions\" and \"finmath lib cuda extensions\" (packages\n`net.finmath.*`) is distributed under the [Apache License version\n2.0](http://www.apache.org/licenses/LICENSE-2.0.html), unless otherwise explicitly 
stated.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinmath%2Ffinmath-lib-cuda-extensions","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinmath%2Ffinmath-lib-cuda-extensions","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinmath%2Ffinmath-lib-cuda-extensions/lists"}