{"id":46682295,"url":"https://github.com/menooker/kunquant","last_synced_at":"2026-05-24T09:01:51.831Z","repository":{"id":214493993,"uuid":"736659702","full_name":"Menooker/KunQuant","owner":"Menooker","description":"A compiler, optimizer and executor for financial expressions and factors","archived":false,"fork":false,"pushed_at":"2026-05-22T10:02:41.000Z","size":961,"stargazers_count":277,"open_issues_count":0,"forks_count":52,"subscribers_count":6,"default_branch":"main","last_synced_at":"2026-05-22T16:43:46.651Z","etag":null,"topics":["alpha101","avx512","compiler","python","quant","quantitative-finance"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Menooker.png","metadata":{"files":{"readme":"Readme.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"Menooker","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":null,"issuehunt":null,"lfx_crowdfunding":null,"polar":null,"buy_me_a_coffee":null,"thanks_dev":null,"custom":null}},"created_at":"2023-12-28T14:05:48.000Z","updated_at":"2026-05-20T13:21:42.000Z","dependencies_parsed_at":"2024-01-08T13:59:06.744Z","dependency_job_id":"770a7e88-56b8-44d2-bf0f-0e018bcdf04e","html_url":"https://github.com/Menooker/KunQuant","commit_stats":null,"previous_names":["menooker/kunquant"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/Menooker/KunQuant","repository_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Menooker%2FKunQuant","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Menooker%2FKunQuant/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Menooker%2FKunQuant/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Menooker%2FKunQuant/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Menooker","download_url":"https://codeload.github.com/Menooker/KunQuant/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Menooker%2FKunQuant/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33427584,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-23T22:14:44.296Z","status":"online","status_checked_at":"2026-05-24T02:00:06.296Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alpha101","avx512","compiler","python","quant","quantitative-finance"],"created_at":"2026-03-09T00:34:35.453Z","updated_at":"2026-05-24T09:01:51.825Z","avatar_url":"https://github.com/Menooker.png","language":"Python","funding_links":["https://github.com/sponsors/Menooker"],"categories":[],"sub_categories":[],"readme":"# KunQuant\n\n![Kun](https://github.com/Menooker/KunQuant/assets/10137875/cb67b6fb-2bd3-41dd-921f-581c4c8d34d6)\n\nKunQuant is an optimizer, code generator and executor for 
financial expressions and factors, e.g. `(close - open) /((high - low) + 0.001)`. Its initial aim is to generate efficient implementation code for [Alpha101](https://arxiv.org/pdf/1601.00991) of WorldQuant and [Alpha158](https://github.com/microsoft/qlib/blob/main/examples/benchmarks/README.md) of Qlib. Some existing implementations of Alpha101 are straightforward but too simple. Hence we are developing KunQuant to provide optimized code for a batch of general customized factors.\n\nThis project has two main parts: `KunQuant` and `KunRunner`. KunQuant is an optimizer \u0026 code generator written in Python. It takes a batch of financial expressions as the input and generates highly optimized C++ code for computing these expressions. KunRunner is a supporting runtime library and Python wrapper to load and run the generated C++ code from KunQuant. Starting from version `0.1.0`, KunQuant no longer depends on `cmake` to run the generated factor code. Users can use pure Python interfaces to build and run factors.\n\n\nExperiments show that KunQuant-generated code can be more than 170x faster than a naive implementation based on Pandas. We ran Alpha001~Alpha101 with [Pandas-based code](https://github.com/yli188/WorldQuant_alpha101_code/blob/master/101Alpha_code_1.py) and our optimized code. See results below:\n\n| Datatype | Pandas-based  |  KunQuant 1-thread  |  KunQuant  4-threads |\n|---|---|---|---|\n| Single precision (STs layout) | 6.138s |  0.083s  |  0.027s  |\n| Double precision (TS layout) | 6.332s |  0.120s  |  0.031s  |\n\nThe data was collected on a 4-core Intel i7-7700HQ CPU, running synthetic data of 64 stocks with 260 rows of data. 
Environment:\n\n```\nOS=Ubuntu 22.04.3 on WSL2 on Windows 10\npython=3.10.2\npandas=2.1.4\nnumpy=1.26.3\ng++=11.4.0\n```\n\n## Supported features of KunQuant\n\n * Batch mode and stream mode for the input\n * Double and single precision floating-point data types\n * TS or STs memory layout as input/output in batch mode\n * Python/C/C++ interfaces to call the factor computation functions\n * x86 and ARM CPUs are supported. Linux, Windows and macOS are supported.\n\n**Important note**: For better performance compared with Pandas, KunQuant recommends using a multiple of `{blocking_len}` as the number of stocks in the inputs. For single-precision float type and AVX2 instruction set, `blocking_len=8`. That is, you should input 8, 16, 24, etc. stocks in a batch if your code is compiled with AVX2 (without AVX512) and the `float` datatype. Other numbers of stocks **are supported**, with lower execution performance.\n\nThe support matrix of KunQuant\n\n| OS(CPU) | macOS (Apple Silicon)  |  macOS (x86)  |  Windows (x86) | Linux (Ubuntu,x86) | Linux (Ubuntu,aarch64) |\n|---|---|---|---|---|---|\n| Install via `pip install KunQuant` | ✅ |  ✅  |  ✅  | ✅ | ❌ |\n| Tested in CI                       | ✅ |  ❌  |  ✅  | ✅ | ✅ |\n\n## Why KunQuant is fast\n\n * KunQuant parallelizes the computation for factors and uses SIMD (AVX2) to vectorize them.\n * Redundant computations among factors are eliminated: think about what we can do with `sum(x)`, `avg(x)`, `stddev(x)`. The result of `sum(x)` is needed by all of these factors. KunQuant also automatically detects whether an internal result of a factor is used by other factors and tries to reuse it.\n * Temp buffers are minimized by operator fusion. For a factor like `(a+b)/2`, pandas and numpy will first compute the result of `(a+b)` and collect all the results in a buffer. Then, the `/2` operator is applied to each element of the temp buffer of `(a+b)`. This results in large memory usage and bandwidth consumption. 
KunQuant will generate C++ code to compute `(a[i]+b[i])/2` in the same loop, to avoid the need to allocate and access temp memory.\n\n## Sponsor this project!\n\nSponsor the author [@Menooker](https://github.com/sponsors/Menooker)\n\n## Installing KunQuant\n\nInstall a released version:\n\n`pip install KunQuant`\n\nOr install the latest version on the `main` branch:\n\n`pip install -i https://testpypi.python.org/pypi KunQuant`\n\nKunQuant supports Windows (MSVC needs to be installed) and Linux (g++ or clang needs to be installed). Please make sure a working C++ compiler with C++11 support is properly installed and configured in your system.\n\nOn x86-64 CPUs, AVX2-FMA is used by default in the installed KunQuant core library. To run on older CPUs with AVX and without AVX2, you need to build and install KunQuant from source and set environment variable `KUN_NO_AVX2=1` before building. See `Build from source and developing tips` below.\n\n## Example: Build \u0026 Run Alpha101\n\nThis section serves as an example for compiling an existing factor library, Alpha101, and running it. Building and running your own factors will be similar. 
If you are only interested in how you can run Alpha101 factors, this section is all you need.\n\nFirst, import KunQuant and necessary modules\n\n```python\nfrom KunQuant.jit import cfake\nfrom KunQuant.Driver import KunCompilerConfig\nfrom KunQuant.Op import Builder, Input, Output\nfrom KunQuant.Stage import Function\nfrom KunQuant.predefined import Alpha101\nfrom KunQuant.runner import KunRunner as kr\n```\n\nThen build a `Function` object and generate predefined factor `alpha001` in Alpha101:\n\n```python\nbuilder = Builder()\nwith builder:\n    vclose = Input(\"close\")\n    low = Input(\"low\")\n    high = Input(\"high\")\n    vopen = Input(\"open\")\n    amount = Input(\"amount\")\n    vol = Input(\"volume\")\n    all_data = Alpha101.AllData(low=low,high=high,close=vclose,open=vopen, amount=amount, volume=vol)\n    Output(Alpha101.alpha001(all_data), \"alpha001\")\nf = Function(builder.ops)\n```\n\nYou can review the `alpha001` expression with `print(f)`, which produces the following output:\n\n```\nv0 = Input@{name:close}()\nv2 = Div@(v0,v1)\nv3 = SubConst@{value:1.0}(v2)\nv4 = LessThanConst@{value:0.0}(v3)\nv5 = WindowedStddev@{window:20}(v3)\nv6 = Select@(v4,v5,v0)\nv7 = Mul@(v6,v6)\nv8 = TsArgMax@{window:5}(v7)\nv9 = Rank@(v8)\nv10 = Output@{name:alpha001}(v9)\n```\n\nThen compile it into an executable object (it may take a few seconds to compile; if you encounter a subprocess error, please make sure MSVC or g++ is installed).\n\n```python\nlib = cfake.compileit([(\"alpha101\", f, KunCompilerConfig(input_layout=\"TS\", output_layout=\"TS\"))], \"out_first_lib\", cfake.CppCompilerConfig())\nmodu = lib.getModule(\"alpha101\")\n```\n\nWe will explain the function `cfake.compileit` in [Customize.md](./doc/Customize.md). Let's continue to see how to use the compiled `lib`.\n\nLoad your stock data. In this example, we load it from local files with pandas. 
We assume the open, close, high, low, volume and amount data for different stocks are stored in different files.\n\n```python\nimport pandas as pd\n\n# we need a multiple of 8 number of stocks\nwatch_list = [\"000002\", \"000063\", ...]\nnum_stocks = len(watch_list)\ndf = []\n\nfor stockid in watch_list:\n    d = pd.read_hdf(f\"{stockid}.hdf5\")\n    df.append(d)\n\nprint(df[0])\n\ncols = df[0].columns.values\ncol2idx = dict(zip(cols, range(len(cols))))\nprint(\"columns to index\", col2idx)\nnum_time = len(df[0])\nprint(\"dimension in time\", num_time)\n```\n\nHere we printed the data frame of the first stock and the column-index mapping. The output should look like:\n\n```\n                 open       high        low      close       volume        amount\ndate                                                                             \n2020-01-02  32.799999  33.599998  32.509998  32.560001  101213040.0  3.342374e+09\n2020-01-03  32.709999  32.810001  31.780001  32.049999   80553632.0  2.584310e+09\n2020-01-06  31.750000  31.760000  31.250000  31.510000   87684056.0  2.761449e+09\n...               ...        ...        ...        ...          ...           ...\n2024-01-30  10.000000  10.050000   9.790000   9.790000   79792704.0  7.903654e+08\n2024-01-31   9.770000   9.850000   9.560000   9.600000   67478864.0  6.527274e+08\n2024-02-01   9.530000   9.660000   9.420000   9.440000   62786032.0  5.980486e+08\n\n[993 rows x 6 columns]\ncolumns to index {'open': 0, 'high': 1, 'low': 2, 'close': 3, 'volume': 4, 'amount': 5}\ndimension in time 993\n```\n\n\nTransform your pandas data to a numpy array of shape `[features, stocks, time]`. 
Feature here means the columns for open, close, high, low, volume and amount.\n\n```python\nimport numpy as np\n\n# [features, stocks, time]\ncollected = np.empty((len(col2idx), num_stocks, len(df[0])), dtype=\"float32\")\nfor stockidx, data in enumerate(df):\n    for colname, colidx in col2idx.items():\n        mat = data[colname].to_numpy()\n        collected[colidx, stockidx, :] = mat\n```\n\nTranspose the matrix to `[features, time, stocks]`\n```python\n# [features, stocks, time] =\u003e [features, time, stocks]\ntransposed = collected.transpose((0, 2, 1))\ntransposed = np.ascontiguousarray(transposed)\n```\n\nNow fill the input data in a dict of `{\"open\": matrix_open, \"close\": ...}`\n\n```python\ninput_dict = dict()\nfor colname, colidx in col2idx.items():\n    input_dict[colname] = transposed[colidx]\n```\n\nCreate an executor and compute the factors!\n\n```python\n# using 4 threads\nexecutor = kr.createMultiThreadExecutor(4)\nout = kr.runGraph(executor, modu, input_dict, 0, num_time)\nprint(\"Result of alpha001\", out[\"alpha001\"])\nprint(\"Shape of alpha001\", out[\"alpha001\"].shape)\n```\n\nEach output factor is computed into an array of shape `[time, stocks]`. The output of the above code may look like:\n\n```\nResult of alpha001 [[   nan    nan    nan ...    nan    nan    nan]\n [   nan    nan    nan ...    nan    nan    nan]\n [   nan    nan    nan ...    nan    nan    nan]\n ...\n [0.6875 0.1875 0.1875 ... 0.6875 0.6875 0.6875]\n [0.6875 0.1875 0.1875 ... 0.6875 0.6875 0.6875]\n [0.4375 1.     0.875  ... 0.4375 0.4375 0.4375]]\nShape of alpha001 (993, 8)\n```\n\nBy default, `runGraph` will allocate a numpy array for each output factor. 
However, you can preallocate a numpy array and tell KunRunner to fill in this array instead of creating new ones.\n\n```python\noutnames = modu.getOutputNames()\nout_dict = dict()\n# [Factors, Time, Stock]\nsharedbuf = np.empty((len(outnames), num_time, num_stocks), dtype=\"float32\")\nfor idx, name in enumerate(outnames):\n    out_dict[name] = sharedbuf[idx]\nout = kr.runGraph(executor, modu, input_dict, 0, num_time, out_dict)\n# results are in \"out\" and \"sharedbuf\"\n```\n\nNote that the executors are reusable. A multithread executor is actually a thread pool internally. If you want to run on multiple batches of data, you don’t need to create new executors for each batch.\n\n\n## Customized factors\n\nKunQuant is a tool for general expressions. See [Customize.md](./doc/Customize.md) for how to compile your own customized factors. This document also provides information on\n * building and keeping the compilation result for later use\n * loading an existing compiled factor library\n * enabling AVX512\n * selecting data types (float/double)\n * memory layout\n\n\n## Build from source and developing tips\n\nThis section is for developers who would like to build KunQuant from source, instead of installing via pip.\n\n### Dependency\n\n* pybind11 (automatically cloned via git as a submodule)\n* Python (3.7+ with f-string and dataclass support)\n* cmake\n* A working C++ compiler with C++11 support (e.g. clang, g++, msvc)\n* x86-64 CPU with at least AVX instruction set (AVX2-FMA is preferred and required by default), or ARM CPU with NEON instruction set.\n* Optionally requires AVX512 on CPU for better performance\n\n### Build and install\n\n```shell\ngit clone https://github.com/Menooker/KunQuant --recursive\ncd KunQuant\npip install .\n```\n### Build in develop mode\n\nDevelop mode lets you install KunQuant and keep editing it. 
You can use the `editable` mode of pip.\n\nLinux:\n\n```shell\n# export KUN_BUILD_TYPE=Debug    # to build debug version of KunQuant Runtime\n# in the root directory of KunQuant\nKUN_BUILD_TESTS=1 pip install -e .\n```\n\nWindows PowerShell:\n\n```shell\n# in the root directory of KunQuant\n$env:KUN_BUILD_TESTS=1\npip install -e .\n```\n\nYou can also set environment variable `KUN_BUILD_TYPE=Debug` before `pip install -e .` to enable debug build of KunQuant. It will provide debug info of the KunQuant runtime but also slow down the execution.\n\nOn x86-64 CPUs, AVX2-FMA is used by default in the built KunQuant core library. To run on older CPUs with AVX and without AVX2, you need to export environment variable `KUN_NO_AVX2=1` before `pip install -e .`.\n\n### Useful environment variables\n\n * `KUN_DEBUG=1` Print the internal results of each compiler pass\n * `KUN_DEBUG_JIT=1` Print the C++ compilation internals, including command lines, temp results, etc.\n\n## Streaming mode\n\nKunQuant can be configured to generate factor libraries for streaming, where the data arrives one at a time. See [Stream.md](./doc/Stream.md)\n\n## Utility functions\n\nTo compute row-to-row correlation (for IC/IR calculation) and aggregating functions (like `pd.groupby(...)`), please see [Utility.md](./doc/Utility.md).\n\n## Using C-style APIs\n\nKunQuant provides C-style APIs to call the generated factor code in shared libraries. 
See [CAPI.md](./doc/CAPI.md)\n\n\n## Operator definitions\n\nSee [Operators.md](./doc/Operators.md)\n\nA few TA-Lib indicators (TRANGE, ATR, SAR) are also implemented as composite ops; see [TA-Lib compatible ops](./doc/Operators.md#ta-lib-compatible-ops).\n\nTo add new operators, see [NewOperators.md](./doc/NewOperators.md)\n\n## Testing and validation\n\nUnit tests for some of the internal IR transformations:\n\n```\npython tests/test.py\npython tests/test2.py\n```\n\nUnit tests for C++ runtime:\n\n```\npython tests/test_runtime.py\n```\n\nTo run the runtime UTs, you need to make sure you have built the cmake target `KunTest` by\n\n```bash\ncmake --build . --target KunTest\n```\n\nCorrectness test of Alpha101\n\n```bash\n# current dir should be at the base directory of KunQuant\npython tests/test_alpha101.py\n```\n\nThe input data are randomly generated and the results are checked against a modified (corrected) version of [Pandas-based code](https://github.com/yli188/WorldQuant_alpha101_code/blob/master/101Alpha_code_1.py). Note that some of the factors like `alpha013` are very sensitive to numerical changes in the intermediate results, because `rank` operators are used. The result may be very different after `rank` even if the input is very close. Hence, the tolerance of these factors will be high to avoid false positives.\n\nTo test Alpha158, you first need to download the input data and reference result files: [alpha158.npz](https://github.com/Menooker/KunQuant/releases/download/alpha158/alpha158.npz) and [input.npz](https://github.com/Menooker/KunQuant/releases/download/alpha158/input.npz).\n\nThen run\n\n```bash\n# current dir should be at the base directory of KunQuant\npython tests/test_alpha158.py --inputs /PATH/TO/input.npz --ref /PATH/TO/alpha158.npz\n```\n\nThis script runs alpha158 in double precision mode in KunQuant. 
It feeds the library with predefined values from `input.npz` and checks the result against `alpha158.npz`, which is computed by `qlib`.\n\nTo generate another Alpha158 result with another randomly generated input, you can run\n\n```bash\n# current dir should be at the base directory of KunQuant\npython ./tests/gen_alpha158.py --tmp /tmp/a158 --qlib /path/to/source/of/qlib --out /tmp\n```\n\nIt will create the random input at `/tmp/input.npz` and result at `/tmp/alpha158.npz`\n\n\n## Acknowledgement\n\nThe implementation and testing code for Alpha101 is based on https://github.com/yli188/WorldQuant_alpha101_code\n\nThe implementation code for Alpha158 is based on https://github.com/microsoft/qlib/blob/main/qlib/contrib/data/handler.py. Licensed under the MIT License.\n\nThe AVX vector operators at `cpp/KunSIMD/cpu` were developed based on [x86simd](https://github.com/uxlfoundation/oneDNN/tree/2eb3dd1082db767fab171e934c551c609008289a/src/graph/backend/graph_compiler/core/src/runtime/kernel_include/x86simd) as a component of GraphCompiler, a backend of oneDNN Graph API. Licensed under the Apache License, Version 2.0 (the \"License\").\n\nThe MSVC environment configuration originated from cupy, licensed under the MIT License: https://github.com/cupy/cupy/blob/main/cupy/cuda/compiler.py","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmenooker%2Fkunquant","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmenooker%2Fkunquant","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmenooker%2Fkunquant/lists"}