{"id":15048068,"url":"https://github.com/jerinphilip/slimt","last_synced_at":"2025-04-10T01:12:06.158Z","repository":{"id":187737946,"uuid":"677420510","full_name":"jerinphilip/slimt","owner":"jerinphilip","description":"Inference slice of marian for bergamot's tiny11 models. Faster to compile, and wield. Fewer model-archs than bergamot-translator.","archived":false,"fork":false,"pushed_at":"2024-10-24T16:30:40.000Z","size":396,"stargazers_count":11,"open_issues_count":3,"forks_count":2,"subscribers_count":7,"default_branch":"main","last_synced_at":"2025-04-10T01:12:00.375Z","etag":null,"topics":["cpp20","inference-engine","machine-translation","pybind11","python"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jerinphilip.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-11T14:28:03.000Z","updated_at":"2025-03-05T16:26:23.000Z","dependencies_parsed_at":"2023-12-20T14:14:44.027Z","dependency_job_id":"ab12d3db-4e1e-491b-80d8-ee3c12904b5d","html_url":"https://github.com/jerinphilip/slimt","commit_stats":null,"previous_names":["jerinphilip/slimt"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jerinphilip%2Fslimt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jerinphilip%2Fslimt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jerinphilip%2Fslimt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposito
ries/jerinphilip%2Fslimt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jerinphilip","download_url":"https://codeload.github.com/jerinphilip/slimt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248137888,"owners_count":21053775,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cpp20","inference-engine","machine-translation","pybind11","python"],"created_at":"2024-09-24T21:07:39.674Z","updated_at":"2025-04-10T01:12:06.130Z","avatar_url":"https://github.com/jerinphilip.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# slimt\n\n**slimt** (_slɪm tiː_) is an inference frontend for\n[tiny](https://github.com/browsermt/students/tree/master/deen/ende.student.tiny11)\n[models](https://github.com/browsermt/students) trained as part of the\n[Bergamot project](https://browser.mt/).\n\n[bergamot-translator](https://github.com/browsermt/bergamot-translator/) builds\non top of [marian-dev](https://github.com/marian-nmt/marian-dev) and uses the\ninference code-path from marian-dev. While marian is a capable neural network\nlibrary with a focus on machine translation, all the bells and whistles that come\nwith it are not necessary to run inference on client machines (e.g., autograd,\nmultiple sequence-to-sequence architecture support, beam-search). For some use\ncases, like an input-method engine doing translation (see\n[lemonade](https://github.com/jerinphilip/lemonade)), single-threaded operation\nalongside other processes on the system suffices. 
This is the\nmotivation for this transplant repository. There's not much novel here except\nthat it is easier to wield. This repository is simply the _tiny_ part of marian.\nCode is reused where possible.\n\nThis effort is inspired by contemporary efforts like\n[ggerganov/ggml](https://github.com/ggerganov/ggml) and\n[karpathy/llama2.c](https://github.com/karpathy/llama2.c). _tiny_ models roughly\nfollow the [transformer architecture](https://arxiv.org/abs/1706.03762), with\n[Simpler Simple Recurrent Units](https://aclanthology.org/D19-5632/) (SSRU) in\nthe decoder. The same models are used in Mozilla Firefox's [offline translation\naddon](https://addons.mozilla.org/en-US/firefox/addon/firefox-translations/).\n\nBoth `tiny` and `base` models have 6 encoder-layers and 2 decoder-layers, and\nfor most existing pairs a vocabulary size of 32000 (with tied embeddings). The\nfollowing table briefly summarizes some architectural differences between\n`tiny` and `base` models:\n\n| Variant | emb | ffn  | params | f32   | i8   |\n| ------- | --- | ---- | ------ | ----- | ---- |\n| `base`  | 512 | 2048 | 39.0M  | 149MB | 38MB |\n| `tiny`  | 256 | 1536 | 15.7M  | 61MB  | 17MB |\n\nThe `i8` models, quantized to 8-bit and as small as 17MB, are used to provide\ntranslation for Mozilla Firefox's offline translation addon, among other\nthings.\n\nThe models are described in more detail in the following papers:\n\n* [From Research to Production and Back: Ludicrously Fast Neural Machine Translation](https://aclanthology.org/D19-5632)\n* [Edinburgh’s Submissions to the 2020 Machine Translation Efficiency Task](https://aclanthology.org/2020.ngt-1.26/)\n\n\nThe large list of dependencies from bergamot-translator has currently been\nreduced to:\n\n* For `int8_t` matrix-multiply [intgemm](https://github.com/kpu/intgemm)\n  (`x86_64`) or [ruy](https://github.com/google/ruy) (`aarch64`) or\n  [xsimd](https://github.com/xtensor-stack/xsimd) via\n  
[gemmology](https://github.com/mozilla/gemmology).\n* For vocabulary - [sentencepiece](https://github.com/browsermt/sentencepiece). \n* For sentence-splitting using regular-expressions\n  [PCRE2](https://github.com/PCRE2Project/pcre2).\n* For `sgemm` - Whatever BLAS provider is found via CMake (openblas,\n  intel-oneapimkl, cblas).  Feel free to provide\n  [hints](https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors). \n* [CLI11](https://github.com/CLIUtils/CLI11/) (only a dependency for cmdline) \n\nThe source code has been made public at a stage where basic functionality\n(text translation) works for English-German tiny models. Parity in features and\nspeed with marian and bergamot-translator (where relevant) is a work in progress.\nEventual support for `base` models is planned. Contributions are welcome and\nappreciated.\n\n\n## Getting started\n\nClone with submodules.\n\n```\ngit clone --recursive https://github.com/jerinphilip/slimt.git\n```\n\nConfigure and build. `slimt` is still experimenting with CMake and\ndependencies. The following, which is being prepared for packaging in Linux\ndistributions, should work at the moment:\n\n```bash\n# Configure to use xsimd via gemmology\nARGS=(\n    # Use gemmology\n    -DWITH_GEMMOLOGY=ON               \n\n    # On x86_64 machines use the following to enable a faster matrix\n    # multiplication backend using SIMD. 
All of these can co-exist; the best\n    # one for the detected CPU is dispatched at runtime.\n    -DUSE_AVX512=ON -DUSE_AVX2=ON -DUSE_SSSE3=ON -DUSE_SSE2=ON\n\n    # For aarch64 (or armv7+neon), comment out the x86_64 line above and\n    # uncomment the line below.\n    # -DUSE_NEON=ON \n\n    # Use sentencepiece installed via system.\n    -DUSE_BUILTIN_SENTENCEPIECE=OFF        \n\n    # Exports slimtConfig.cmake (cmake) and slimt.pc.in (pkg-config)\n    -DSLIMT_PACKAGE=ON \n\n    # Customize installation prefix if need be.\n    -DCMAKE_INSTALL_PREFIX=/usr/local\n)\n\ncmake -B build -S \"$PWD\" -DCMAKE_BUILD_TYPE=Release \"${ARGS[@]}\"\ncmake --build build --target all\n\n# Requires sudo, since /usr/local is usually writable only by root.\nsudo cmake --build build --target install \n```\n\nThe above run expects the packages `sentencepiece`, `xsimd` and a BLAS provider\nto come from the system's package manager. Examples of this in distributions\ninclude:\n\n```bash\n# Debian-based systems\nsudo apt-get install -y libxsimd-dev libsentencepiece-dev libopenblas-dev\n\n# Arch Linux\npacman -S openblas xsimd\nyay -S sentencepiece-git\n```\n\nA successful build generates two executables, `slimt-cli` and `slimt-test`, for\ncommand-line usage and testing respectively. \n\n```bash\nbuild/bin/slimt-cli                           \\\n    --root \u003cpath/to/folder\u003e                   \\\n    --model \u003c/relative/path/to/model\u003e         \\\n    --vocabulary \u003c/relative/path/to/vocab\u003e    \\\n    --shortlist \u003c/relative/path/to/shortlist\u003e\n\nbuild/slimt-test \u003ctest-name\u003e\n```\n\nThis is still very much a work in progress, towards being able to make\n[lemonade](https://github.com/jerinphilip/lemonade) available in distributions.\nHelp is much appreciated; please get in touch if you can contribute.\n\n### Python\n\nPython bindings to the C++ code are available.  
The Python bindings provide a layer\nto download models and use them via the command-line entry point `slimt` (the core\nslimt library only has the inference code).\n\n```bash\npython3 -m venv env\nsource env/bin/activate\npython3 -m pip install wheel\npython3 setup.py bdist_wheel\npython3 -m pip install dist/\u003cwheel-name\u003e.whl\n\n# Download en-de-tiny and de-en-tiny models.\nslimt download -m en-de-tiny\nslimt download -m de-en-tiny\n```\n\nFind an example of the built wheel running on Colab below:\n\n[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/12wFMVwOTzOyRjoeWtett2DTDhwNAbvBZ?usp=sharing)\n\nYou may pass custom CMake variables via the `CMAKE_ARGS` environment variable.\n\n```bash\nCMAKE_ARGS='-D...' python3 setup.py bdist_wheel\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjerinphilip%2Fslimt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjerinphilip%2Fslimt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjerinphilip%2Fslimt/lists"}