{"id":17586849,"url":"https://github.com/sparkfish/modelc","last_synced_at":"2025-09-04T03:32:53.840Z","repository":{"id":56936221,"uuid":"248859167","full_name":"sparkfish/modelc","owner":"sparkfish","description":"modelc is an R model object to SQL compiler","archived":false,"fork":false,"pushed_at":"2020-06-18T19:07:04.000Z","size":39,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-02-04T18:48:29.486Z","etag":null,"topics":["compiler","generalized-linear-models","linear-models","r","sql","transpiler"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sparkfish.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-03-20T21:54:11.000Z","updated_at":"2020-08-01T15:27:50.000Z","dependencies_parsed_at":"2022-08-21T01:10:18.030Z","dependency_job_id":null,"html_url":"https://github.com/sparkfish/modelc","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fmodelc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fmodelc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fmodelc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fmodelc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sparkfish","download_url":"https://codeload.github.com/sparkfish/modelc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://
github.com","kind":"github","repositories_count":246223267,"owners_count":20743158,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["compiler","generalized-linear-models","linear-models","r","sql","transpiler"],"created_at":"2024-10-22T03:06:33.514Z","updated_at":"2025-03-29T17:43:24.702Z","avatar_url":"https://github.com/sparkfish.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# modelc\n\n[![R build\nstatus](https://github.com/sparkfish/modelc/workflows/R-CMD-check/badge.svg)](https://github.com/sparkfish/modelc/actions)\n\nmodelc is an R model object to SQL compiler. 
It generates SQL select statements from linear and generalized linear models.\n\nIts interface currently consists of a single function, `modelc`, which takes a single input, namely an `lm` or `glm` model object.\n\nIt currently supports Gaussian and gamma family distributions using log or identity link functions.\n\nTo import linear models directly to your SQL Server database, consider using [Castpack](https://github.com/sparkfish/castpack), which depends on `modelc`.\n\n# Usage\n\nSupposing the following data\n\n```R\na \u003c- 1:10\nb \u003c- 2*1:10 + runif(1) * 1.5\nc \u003c- as.factor(1:10)\ndf \u003c- data.frame(a,b,c)\nformula = b ~ a + c\n```\n\nA vanilla linear model\n\n```R\nlinear_model \u003c- lm(formula, data=df)\nmodelc(linear_model)\n\n```\n\ngenerates the following SQL\n\n\n``` sql\n  0.231808555545287 + 2 * `a` + (\n    CASE\n      WHEN c = 2 THEN -0.00000000000000193216758587821 * c\n      WHEN c = 3 THEN -0.000000000000000776180314897008 * c\n      WHEN c = 4 THEN -0.000000000000000665297412768863 * c\n      WHEN c = 5 THEN -0.00000000000000055441451064072 * c\n      WHEN c = 6 THEN -0.000000000000000887620818362638 * c\n      WHEN c = 7 THEN -0.000000000000000332648706384432 * c\n      WHEN c = 8 THEN -0.00000000000000110994422395641 * c\n      WHEN c = 9 THEN -0.00000000000000188723974152839 * c\n      WHEN c = 10 THEN 0 * c\n    END\n  )\n```\n\nGLMs are also supported with log or identity link functions\n\n\n```R\nglm_model \u003c- glm(formula, data=df, family=Gamma(link=\"log\"))\nmodelc(glm_model)\n```\n\n``` sql\n  EXP(\n    0.557874070609732 + 0.244938197625494 * `a` + (\n      CASE\n        WHEN c = 2 THEN 0.394878990324516 * c\n        WHEN c = 3 THEN 0.536977925025217 * c\n        WHEN c = 4 THEN 0.570378881020516 * c\n        WHEN c = 5 THEN 0.542936294999294 * c\n        WHEN c = 6 THEN 0.476536561025273 * c\n        WHEN c = 7 THEN 0.383038044594683 * c\n        WHEN c = 8 THEN 0.269593156578649 * c\n        WHEN c = 9 THEN 
0.140849942185343 * c\n        WHEN c = 10 THEN 0 * c\n      END\n    )\n  )\n```\n\n\n```R\nglm_model_idlink \u003c- glm(formula, data=df, family=Gamma(link=\"identity\"))\nmodelc(glm_model_idlink)\n```\n\n``` sql\n  0.231808555545287 + 2 * `a` + (\n    CASE\n      WHEN c = 2 THEN 0.00000000000000139594865689472 * c\n      WHEN c = 3 THEN -0.000000000000000581567338978993 * c\n      WHEN c = 4 THEN -0.00000000000000111588502938831 * c\n      WHEN c = 5 THEN 0.000000000000000967650035758108 * c\n      WHEN c = 6 THEN -0.00000000000000149265067586469 * c\n      WHEN c = 7 THEN -0.000000000000000100985345060517 * c\n      WHEN c = 8 THEN -0.0000000000000000673235633736781 * c\n      WHEN c = 9 THEN 0.00000000000000199047558220559 * c\n      WHEN c = 10 THEN 0 * c\n    END\n  )\n```\n\nTo avoid generating invalid SQL (for example, coefficients printed in scientific notation), `modelc` temporarily sets your `scipen` option to 999.\n\n# Installing\n\nUsing `remotes` (`devtools` is only needed to run the tests):\n\n```R\ninstall.packages(\"devtools\")\ninstall.packages(\"remotes\")\nremotes::install_github(\"sparkfish/modelc\")\n```\n\n# Precision\n\nYou may see minor differences between the output of your R model and the generated SQL, depending on the precision with which numeric types are represented in your database. To ensure parity between the two, numeric types should have a precision of at least 17.\n\n# Tests\n\nTests are written using `testthat`. To run them:\n\n``` R\ndevtools::test()\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsparkfish%2Fmodelc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsparkfish%2Fmodelc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsparkfish%2Fmodelc/lists"}