{"id":23288081,"url":"https://github.com/mlverse/lang","last_synced_at":"2026-03-10T12:35:54.217Z","repository":{"id":266542418,"uuid":"888672200","full_name":"mlverse/lang","owner":"mlverse","description":"Uses LLMs to translate R help docs on the fly","archived":false,"fork":false,"pushed_at":"2025-11-06T18:29:37.000Z","size":4743,"stargazers_count":32,"open_issues_count":1,"forks_count":3,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-11-06T20:27:28.725Z","etag":null,"topics":["llm","r","translations"],"latest_commit_sha":null,"homepage":"https://mlverse.github.io/lang/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mlverse.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-14T20:00:01.000Z","updated_at":"2025-11-06T18:28:39.000Z","dependencies_parsed_at":"2024-12-04T20:22:39.064Z","dependency_job_id":"3a358baa-6261-4420-b089-7287457f87af","html_url":"https://github.com/mlverse/lang","commit_stats":null,"previous_names":["edgararuiz/lang","mlverse/lang"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/mlverse/lang","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlverse%2Flang","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlverse%2Flang/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlverse%2Flang/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/mlverse%2Flang/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mlverse","download_url":"https://codeload.github.com/mlverse/lang/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mlverse%2Flang/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30333573,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-10T05:25:20.737Z","status":"ssl_error","status_checked_at":"2026-03-10T05:25:17.430Z","response_time":106,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["llm","r","translations"],"created_at":"2024-12-20T03:13:43.592Z","updated_at":"2026-03-10T12:35:54.207Z","avatar_url":"https://github.com/mlverse.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n\nlibrary(lang)\nlibrary(ellmer)\nchat \u003c- chat_openai(model = \"gpt-4o\")\nlang_use(backend = chat, .lang = \"spanish\")\n```\n\n\u003cimg src=\"man/figures/logo.png\" align=\"right\" alt=\"lang's hex logo\" width=\"120\" /\u003e\n\n# lang \n\u003c!-- badges: start --\u003e\n[![R-CMD-check](https://github.com/mlverse/lang/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/mlverse/lang/actions/workflows/R-CMD-check.yaml)\n[![Codecov test coverage](https://codecov.io/gh/mlverse/lang/branch/main/graph/badge.svg)](https://app.codecov.io/gh/mlverse/lang?branch=main)\n[![CRAN status](https://www.r-pkg.org/badges/version/lang)](https://CRAN.R-project.org/package=lang)\n[![Codecov test coverage](https://codecov.io/gh/mlverse/lang/graph/badge.svg)](https://app.codecov.io/gh/mlverse/lang)\n\u003c!-- badges: end --\u003e\n\nUse an **LLM to translate a function's help documentation on the fly**. `lang` \noverrides the `?` and `help()` functions in your R session. If you are using \nRStudio or Positron, the translated help page will appear in the 'Help' \npane. \n\n## Installing\n\nTo install the CRAN version of `lang` use:\n\n```r\ninstall.packages(\"lang\")\n```\n\nTo install the GitHub version of `lang`, use:\n\n```r\ninstall.packages(\"pak\")\npak::pak(\"mlverse/lang\")\n```\n\n## Using `lang`\n\nIn order to work, `lang` needs two things:\n  \n  1. An LLM connection\n  \n  1. A target language (e.g.: Spanish, French, Korean)\n\nThese two can be defined using `lang_use()`.  
For example,  the following code\nshows  how to use OpenAI's GPT-4o model to translate `lm()`'s help  into Spanish:\n\n```r\nlibrary(lang)\n\nchat \u003c- ellmer::chat_openai(model = \"gpt-4o\")\n\nlang_use(backend = chat, .lang = \"spanish\")\n\n?lm\n#\u003e [1/7] ■■                                 4% | Title\n```\n\u003cimg src=\"man/figures/lm-spanish.png\" align=\"center\" width=\"100%\"\nalt=\"Screenshot of the lm function's help page in Spanish\"/\u003e\n\nAfter setup, simply use `?` to trigger and display the translated documentation.\nDuring translation, `lang` will display its progress by showing which section\nof the documentation is currently being translated. During the R session, if you\nrequest the same R function's help more than once, `lang` will use\nits cached results, which are returned immediately. \n\nR enforces the printed names of each section, so they cannot be\ntranslated. This means that titles such as \"Description\", \"Usage\" and \"Arguments\" \nwill always remain untranslated. \n\n\n### LLM connections\n\nThere are two ways to define the LLM in `lang_use()`:\n\n1. Use an  [`ellmer`](https://ellmer.tidyverse.org/) chat object: \n\n    ```r\n    lang_use(backend = ellmer::chat_openai(model = \"gpt-4o\"))\n    ```\n\n1. Use local LLMs available through [Ollama](https://ollama.com/). Pass `\"ollama\"` \nas the `backend` argument, and specify which installed  model to use:\n\n    ```r\n    lang_use(backend = \"ollama\", model = \"llama3.2\", seed = 100)\n    ```\n    \n    Under the hood, `lang` uses the  [`ollamar`](https://hauselin.github.io/ollama-r/) \n    package to integrate with Ollama. Any additional arguments, such as `seed` \n    as shown above, will be passed as-is to `ollamar`'s `chat()` function. \n\n### Target language\n\n\nIn order of priority, these are the ways in which `lang` determines the language\nit will translate to: \n\n1. Value in `.lang` when calling `lang_use()`\n1. `LANGUAGE` environment variable\n1. 
`LANG` environment variable\n\nIt is likely that your `LANG` variable  already defaults to your locale. \nFor example, mine is set to `en_US.UTF-8` (that is, English, United States). \nFor someone in France, the locale would be  something such as `fr_FR.UTF-8`. \nLlama3.2 recognizes these locale codes, so with `lang` in use, calling `?` will \ntranslate the function's help documentation into French. \n\nIf both environment variables are set, and are different from each other, \n`lang` will display a one-time message indicating which value it will use. \nIf the target language is English, `lang` will re-route help calls back to base\nR.\n\nTo check the current target language at any point during the R session, \nsimply run `lang_use()` with no arguments, and it will print out the \ncurrent settings, which include the language:\n\n```{r}\nlang_use()\n```\n\n## Tips\n\n### Caching\n\nBy default, `lang` will cache the translations it performs in a temporary folder.\nIf R is restarted, a new folder will be used. \n\nIf you notice that you are translating the same function's help over and over and\nacross different R sessions, then fixing the cache location would be helpful. Use\n`.cache` to define the folder:\n\n```r\nlang::lang_use(\n  backend = \"ollama\", \n  model = \"llama3.2\", \n  .cache = \"~/help-translations/\", \n  .lang = \"spanish\"\n  )\n```\n\n\n### Auto-initialize at startup\n\nIf `lang` becomes a regular part of your workflow, and running `lang_use()` at\nthe beginning of every R session becomes cumbersome, then consider letting R\nconnect at startup. \n\nIf present, the *.Rprofile* file runs at the beginning of any R session. If you\nwish to automatically set the model and language to use, add a call to `lang_use()`\nto this file.  You can call `usethis::edit_r_profile()` to open your .Rprofile\nfile so you can add the call. 
\n\nHere is an example of such a call that could be used in the .Rprofile file:\n\n```r\nlang::lang_use(\n  backend = \"ollama\", \n  model = \"llama3.2\", \n  .cache = \"~/help-translations/\", \n  .lang = \"spanish\",\n  .silent = TRUE\n  )\n```\n\nIn the example, we set `.silent` to `TRUE` so that there is no message every time \nthe R session is restarted. \n\n## Considerations\n\n### Translations are not perfect\n\nAs you can imagine, the quality of translation will mostly depend on the LLM \nbeing used. This solution is meant to be as helpful as possible, but \nwe acknowledge that at this stage of LLMs, only a human-curated translation\nwill be the best solution. Having said that, I believe that even an imperfect\ntranslation could go a long way with someone who is struggling to understand\nhow to use a specific function in a package and may also struggle with the\nEnglish language.\n\n### Debug\n\nIf the original English help page displays, check your environment variables:\n\n```{r}\nSys.getenv(\"LANG\")\nSys.getenv(\"LANGUAGE\")\n```\n\nIn my case, `lang` recognizes that the environment is set to English, because\nof the `en` code in the variable. If your `LANG` variable is set to `en_...` \nthen no translation will occur.\n\nIf that is the case, set the `LANGUAGE` variable to your preference. You can\nuse the full language name, such as 'spanish' or 'french'.  You can use\n`Sys.setenv(LANGUAGE = \"[my language]\")`, or, for a more permanent solution, \nadd the entry to your .Renviron file (`usethis::edit_r_environ()`). \n\n### Interaction with `mall`\n\n`lang` uses the `mall` package to produce the translations. To avoid conflicts\nin the setup and use of both packages during the R session, `lang` runs `mall`\nin a separate R process which is only alive while translating the documentation.\nThis means that you can have a specific LLM setup for `lang`, and a different\none for `mall` during your R session. 
\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlverse%2Flang","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmlverse%2Flang","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmlverse%2Flang/lists"}