{"id":17113057,"url":"https://github.com/emmt/ylapack","last_synced_at":"2026-02-19T05:02:49.244Z","repository":{"id":71510289,"uuid":"80740253","full_name":"emmt/YLapack","owner":"emmt","description":"Yorick support for LAPack (or GotoBLAS) library","archived":false,"fork":false,"pushed_at":"2020-03-20T11:48:12.000Z","size":102,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-10-14T12:55:34.254Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/emmt.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2017-02-02T15:47:43.000Z","updated_at":"2020-03-20T11:38:59.000Z","dependencies_parsed_at":"2023-04-07T12:47:05.156Z","dependency_job_id":null,"html_url":"https://github.com/emmt/YLapack","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/emmt/YLapack","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emmt%2FYLapack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emmt%2FYLapack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emmt%2FYLapack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emmt%2FYLapack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/emmt","
download_url":"https://codeload.github.com/emmt/YLapack/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/emmt%2FYLapack/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279018582,"owners_count":26086583,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T17:02:38.901Z","updated_at":"2025-10-14T12:55:35.326Z","avatar_url":"https://github.com/emmt.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"               __   ___                            _\n               \\ \\ / / |     __ _ _ __   __ _  ___| | __\n                \\ V /| |    / _` | '_ \\ / _` |/ __| |/ /\n                 | | | |___| (_| | |_) | (_| | (__|   \u003c\n                 |_| |_____|\\__,_| .__/ \\__,_|\\___|_|\\_\\\n                                 |_|\n\nYLapack is a semi-low level Yorick interface to BLAS and LAPACK libraries.\nThere are several reasons for this interface:\n\n1. Yorick only provides general matrix inversion and resolution of linear\n   systems of equations (LUsolve), resolution of tridiagonal systems of\n   equations (TDsolve), solving linear least squares (QRsolve or SVsolve) and\n   singular value decomposition (SVdec).  
Lapack gives many more: eigenvalues\n   and eigenvectors, symmetric matrices, etc.\n\n2. There are a number of faster implementations of BLAS and LAPACK than the one\n   built into Yorick (an automatic Fortran to C conversion by a smart\n   Emacs-lisp script written by Dave Munro plus hand editing).  You may also\n   want to benefit from the speed-up of basic linear algebra operations such as\n   applying a matrix-vector or matrix-matrix multiplication, computing the\n   scalar product of two vectors (sum(x*y) in Yorick), computing the Euclidean\n   norm of a vector, etc.\n\n3. Although it should be possible to link Yorick with these fast libraries to\n   benefit from the speed in LUsolve, QRsolve, TDsolve, SVsolve and SVdec, you\n   may also want to have a fancier interface to these functions which\n   generalizes the rules for the dot product so that it can be applied on\n   several consecutive dimensions at the same time.\n\nYLapack started as a proof of concept to check whether Yorick can call\nmulti-threaded functions and benefit from the speed-up of one of the LAPACK\nimplementations for your machine: GotoBlas, MKL, Atlas, etc.  The main\nlimitation is that only a subset of BLAS and LAPACK functions has been\ninterfaced.\n\n\n## INSTALLATION\n\nUnpack the archive\n[ylapack-0.0.5.tar.bz2](https://github.com/emmt/YLAPack/releases/download/v0.0.5/ylapack-0.0.5.tar.bz2)\nand run the `configure` script:\n\n    cd ylapack-0.0.5\n    ./configure [OPTIONS ...]\n\nwhere `[OPTIONS ...]` denotes a number of configuration settings.  Option `--help`\ncan be used to get a short description of the available options, but essentially,\nyou want to indicate, with option `--interface`, the type of LAPACK interface\n(Atlas, OpenBLAS, GotoBLAS, MKL, etc.) 
that is to be used with the plug-in.\nKnown interfaces are:\n\n- `lapack` for Lapack libraries;\n- `atlas` for Atlas libraries;\n- `gotoblas2` for GotoBLAS libraries;\n- `openblas` for OpenBLAS libraries;\n- `mkl_gf` for MKL libraries for 32-bit machine, 32-bit integers, GNU compiler;\n- `mkl_gf_lp64` for MKL libraries for 64-bit machine, 32-bit integers, GNU\n  compiler;\n- `mkl_gf_ilp64` for MKL libraries for 64-bit machine, 64-bit integers, GNU\n  compiler;\n- `mkl_intel` for MKL libraries for 32-bit machine, 32-bit integers, Intel\n  compiler;\n- `mkl_intel_lp64` for MKL libraries for 64-bit machine, 32-bit integers, Intel\n  compiler;\n- `mkl_intel_ilp64` for MKL libraries for 64-bit machine, 64-bit integers,\n  Intel compiler.\n\nRunning the `configure` script creates a `Makefile` in the current directory\nwhich is used for compiling and installing the plug-in.  After running the `configure`\nscript, you may edit the `Makefile` to fix the configuration.  Compilation and installation\n(in the Yorick directory tree) are then as simple as:\n\n    make\n    make install\n\nThe `configure` script can be run from elsewhere to build the plug-in without\nchanging the files in the source directory.  For instance:\n\n    mkdir -p build\n    cd build\n    ../configure --interface=openblas\n    make\n    make install\n\n\n## PORTABILITY\n\nThere are a lot of macro definitions to make this software as portable as\npossible, to link with different implementations of LAPACK and even compile\nthe plugin with a different compiler than the one used for Yorick itself.  
For\ninstance, I was able to run the tests with a plugin compiled with ICC (the\nIntel C compiler) and linked with the MKL inside a Yorick compiled with GCC\n(the GNU C compiler).\n\n\n## DESIGN\n\nLAPACK functions which return a status (generally in an integer variable\ncalled INFO) are mapped to Yorick functions which return the status; it is\nthe caller's responsibility to check the returned value (non-zero means error).\nHowever, when such a function is called as a subroutine, the caller has no means to\nrealize that an error occurred, so the Yorick wrapper raises an error.\n\nDimensions are automatically guessed from the arguments.\n\n\n## LINKING WITH GOTOBLAS2\n\n[GotoBLAS2](http://www.tacc.utexas.edu/tacc-projects/gotoblas2) is a very fast\nBLAS library (plus some LAPACK functions) developed by Kazushige Goto for a\nnumber of processors.  You can download the last release made by Kazushige\nGoto (BSD license) at:\n\n    http://cms.tacc.utexas.edu/fileadmin/images/GotoBLAS2-1.13_bsd.tar.gz\n\nUnfortunately, GotoBLAS2 is no longer maintained by its author.  To my\nknowledge, there are two open-source projects which aim at maintaining\nGotoBLAS2:\n\n* OpenBLAS at http://xianyi.github.com/OpenBLAS/\n\n* SurviveGotoBLAS2 at http://prs.ism.ac.jp/~nakama/SurviveGotoBLAS2/\n\nThey are worth a look, especially if you have a recent processor not\nsupported by GotoBLAS2-1.13.  They also fix a number of bugs in GotoBLAS2.  We\ncurrently use the OpenBLAS (0.2.5) variant with no problem.  
See notes below if\nyou want to use the SurviveGotoBLAS2 variant.\n\nAfter unpacking the archive, enter the source directory of GotoBLAS2 and just\ntype:\n\n    shell\u003e make\n\nto build the library for your processor and compiler; if you want to support\nmultiple architectures, build the library with:\n\n    shell\u003e make DYNAMIC_ARCH=1\n\nIf you want the dynamic library:\n\n    shell\u003e make [OPTIONS] shared\n\nwhere `OPTIONS` are the same (e.g., `DYNAMIC_ARCH=1`) as used to build the\nstatic library.  Note that the static library is built as position independent\ncode (PIC), so it is perfectly usable for making a Yorick plugin.\n\nIf you plan to use multi-threaded FFTW, add the options:\n\n    USE_THREAD=1 USE_OPENMP=1\n\nwhen building GotoBLAS2.\n\nNote that to successfully compile the GotoBLAS2-1.13_bsd version for multiple\narchitectures, I had to patch the file `driver/others/dynamic.c`;\nthe differences are:\n\n```\n71c71\n\u003c static int get_vendor(void){\n---\n\u003e static int get_vendor(void) {\n77,79c77,79\n\u003c   *(int *)(\u0026vendor[0]) = ebx;\n\u003c   *(int *)(\u0026vendor[4]) = edx;\n\u003c   *(int *)(\u0026vendor[8]) = ecx;\n---\n\u003e   memcpy(\u0026vendor[0], \u0026ebx, 4);\n\u003e   memcpy(\u0026vendor[4], \u0026edx, 4);\n\u003e   memcpy(\u0026vendor[8], \u0026ecx, 4);\n201c201\n\u003c   if (gotoblas == NULL) gotoblas = gotoblas_KATMAI;\n---\n\u003e   if (gotoblas == NULL) gotoblas = \u0026gotoblas_KATMAI;\n203c203\n\u003c   if (gotoblas == NULL) gotoblas = gotoblas_PRESCOTT;\n---\n\u003e   if (gotoblas == NULL) gotoblas = \u0026gotoblas_PRESCOTT;\n```\n\nThese have been fixed in SurviveGotoBLAS2 which has the advantage of taking\ninto account the newest LAPACK version (3.3.1 as of writing) while\nGotoBLAS2-1.13_bsd is stuck at LAPACK version 3.1.1.\n\nTo build SurviveGotoBLAS2, download the latest version, unpack it, and:\n\n    shell\u003e make DYNAMIC_ARCH=1 LAPACK_VERSION=3.3.1\n\nI do not use options `NO_WARMUP=1 NO_AFFINITY=1 
NUM_THREADS=48` since I got a\nsegmentation fault in the plugin (in lpk_gesv) when using the library compiled\nwith these options (though the tests passed successfully and I do not\nknow which of these options is responsible for the problem).  Avoid\nDYNAMIC_ARCH=1 if you just want a version for your machine.\n\nOnce you have built the GotoBLAS library, copy it (or make a symbolic link or\nchange the rules in `rules/Make.gotoblas2`) into the directory of YLapack\nsource as `libgoto2.a` and run:\n\n    shell\u003e yorick -batch make.i\n    shell\u003e make clean\n    shell\u003e make MODEL=gotoblas2\n\nthen you can test the plugin (if you have used MKL or shared libraries, you\nmay have to set `LD_LIBRARY_PATH` accordingly):\n\n    shell\u003e yorick\n\n    yorick\u003e include, \"lapack-test.i\";\n    yorick\u003e lpk_test_gesv, 3000, nloops=20;\n\nand enjoy the speedup ;-)\n\n\n## PERFORMANCES\n\nExecution times on my laptop -- Intel Core i7 (Q820) at 1.73GHz -- (the\npercentage is the rate of CPU occupation, more than 100% means\nmulti-threading; the speed-up w.r.t. Yorick is given between the square\nbrackets).\n\n### Scalar product\n```\n-----------------------------------------------------------------------\n             Yorick            GotoBlas2              MKL\n   size     sum(x*y)           lpk_dot              lpk_dot\n                          (µs = microseconds)\n-----------------------------------------------------------------------\n   10,000    21 µs (100%)   5.4 µs (100%) [4.0]     6.7 µs (396%) [3.1]\n  100,000   220 µs (100%)    55 µs (100%) [4.1]      46 µs (393%) [4.8]\n1,000,000  3700 µs (100%)  1200 µs (100%) [3.1]     948 µs (398%) [3.9]\n  1024^2   4000 µs (100%)  1300 µs (100%) [3.1]    1000 µs (399%) [4.0]\n-----------------------------------------------------------------------\n```\nFor this BLAS level 1 operation, GotoBlas2 is not multi-threaded.\nMKL is the fastest (for vectors of size \u003e= 100,000).  
MKL and GotoBlas2\nprovide a speed-up between 3 and 5.\n\n\n### Resolution of a linear system of equations\n```\n----------------------------------------------------------------------\n            Yorick            GotoBlas2              MKL\n size      LUsolve            lpk_gesv             lpk_gesv\n----------------------------------------------------------------------\n   100  0.39 ms  (99%)   0.25 ms (189%)  [1.6]   0.22 ms (392%)  [1.7]\n   500    39 ms  (99%)     10 ms (212%)  [3.9]    6.1 ms (397%)  [6.3]\n 1,000   0.30 s (100%)   0.066 s (338%)  [4.5]   0.038 s (397%)  [8.1]\n 2,000    2.3 s (100%)    0.27 s (355%)  [8.6]    0.27 s (398%)  [8.6]\n 3,000    8.0 s (100%)    0.86 s (355%)  [9.4]    0.91 s (387%)  [8.9]\n 5,000     38 s (100%)     3.8 s (367%) [10.1]     4.0 s (392%)  [9.6]\n10,000    303 s (100%)      30 s (379%) [10.2]      28 s (392%) [11.1]\n----------------------------------------------------------------------\n```\nHence for moderate-size matrices MKL is the fastest, but for very large\nmatrices GotoBlas2 and MKL have a similar speed-up of ~ 10.\n\n### Tests with GotoBLAS2 on Linux with CPU Intel Core i7-2600 @ 3.40GHz\n```\n---------------------------------------------------------------------\n                  Yorick                  Lapack\n size             LUsolve                lpk_gesv\n---------------------------------------------------------------------\n     50  3.53E-05 +/- 6E-06 ( 96%)   2.99E-05 +/- 4E-05 (139%)  [1.2]\n    100  2.41E-04 +/- 1E-05 (100%)   9.75E-05 +/- 5E-06 (156%)  [2.5]\n    200  1.76E-03 +/- 3E-05 ( 99%)   4.37E-04 +/- 2E-05 (178%)  [4.0]\n    300  5.51E-03 +/- 8E-05 ( 99%)   1.12E-03 +/- 2E-04 (204%)  [4.9]\n    500  2.41E-02 +/- 4E-04 (100%)   4.30E-03 +/- 1E-04 (216%)  [5.6]\n  1,000  1.90E-01 +/- 9E-04 (100%)   2.12E-02 +/- 4E-04 (305%)  [9.0]\n  2,000  1.47E+00 +/- 1E-03 (100%)   1.33E-01 +/- 2E-03 (358%) [11.0]\n  3,000  5.01E+00 +/- 4E-03 ( 99%)   4.34E-01 +/- 5E-03 (353%) [11.5]\n  5,000  2.32E+01 +/- 
1E-02 ( 99%)   1.83E+00 +/- 3E-03 (369%) [12.7]\n 10,000  1.84E+02 +/- 6E-02 (100%)   1.36E+01 +/- 7E-02 (382%) [13.5]\n 20,000  1.76E+03 +/- 1E+01 ( 99%)   1.04E+02 +/- 8E-02 (390%) [17.0]\n---------------------------------------------------------------------\n```\nNote: I have tested Yorick built-in LUsolve (compiled with GCC 4.5.2)\nversus DGESV in Lapack 3.3.1 (compiled with Intel ifort XE 2011 SP1.6.233)\nand noticed a speed-up of ~ 1.5 for DGESV.\n\n\n### Cholesky factorization\n\nPerform Cholesky factorization (DPOTRF) and solve a linear system with a\npositive definite symmetric left hand side matrix with GotoBLAS2.\n```\n----------------------------------------------------------------\n         Size                    laptop           server\n----------------------------------------------------------------\nDPOTRF  5,000x5,000      2.2 sec (320%)     1.3 sec (330%)\nDPOTRF 10,000x10,000      (not done)       10.5 sec (320%)\nDPOTRF 12,000x12,000     23 sec (387%)     15.6 sec (368%)\n----------------------------------------------------------------\nDPOTRS 5,000x5,000      0.2 sec\n----------------------------------------------------------------\n```\n* laptop = Linux laptop with Intel Core i7 Q820 at 1.73GHz\n* server = Linux server with Intel Xeon X3450 at 2.67GHz\n* dpotrf = Cholesky decomposition (done once for a given C)\n* dpotrs = solve the system given the Cholesky decomposition\n\nThese times can be compared to LUsolve.\n\n\n# WORK IN PROGRESS\n\n## LINKING WITH MKL\n\nWhen linking with the Math Kernel Library (MKL), you need to use 3 or 4\nlibraries:\n\n1. Interface layer.\n\n   For the IA-32 and Intel(R) MIC targets, there is only one MKL interface\n   with 32-bit integers.  
For the Intel(R) 64 target, there are two possible\n   MKL interfaces: lp64 with 32-bit integers and ilp64 with 64-bit integers.\n\n   - `libmkl_gf_ilp64.a` for GNU Fortran compiler with 64-bit integers on\n     64-bit processor\n\n   - `libmkl_intel_ilp64.a` for Intel compiler\n\n2. Threading layer: choose a multi-thread library.\n\n3. Computational layer: a multi-thread library.\n\n4. Run-time libraries (only with MPI?).\n\nTo figure out which libraries to use with MKL, you can have a look at \"Intel(c)\nMath Kernel Library Link Line Advisor\":\n\nhttp://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/\n\nTo start Yorick with a given `LD_LIBRARY_PATH`:\n\n    LD_LIBRARY_PATH=/opt/intel/lib/intel64:/usr/local/lib rlwrap yorick\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femmt%2Fylapack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Femmt%2Fylapack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Femmt%2Fylapack/lists"}