{"id":23148307,"url":"https://github.com/sparkfish/castpack","last_synced_at":"2025-08-17T17:33:48.766Z","repository":{"id":104041357,"uuid":"255494802","full_name":"sparkfish/castpack","owner":"sparkfish","description":"Magical packager of R linear models for your MS SQL Server database ✨","archived":false,"fork":false,"pushed_at":"2020-08-18T22:40:37.000Z","size":57,"stargazers_count":5,"open_issues_count":0,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-05-01T09:37:50.139Z","etag":null,"topics":["deploy","forecast-models","glm","linear-models","ms-sql","packager","r","sql-server"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sparkfish.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null}},"created_at":"2020-04-14T02:52:56.000Z","updated_at":"2021-02-05T03:44:21.000Z","dependencies_parsed_at":"2023-05-27T11:30:12.732Z","dependency_job_id":null,"html_url":"https://github.com/sparkfish/castpack","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fcastpack","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fcastpack/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fcastpack/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sparkfish%2Fcastpack/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sparkfish","download_url":"https://codeload.github.com/sparkf
ish/castpack/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":230151966,"owners_count":18181327,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deploy","forecast-models","glm","linear-models","ms-sql","packager","r","sql-server"],"created_at":"2024-12-17T17:10:08.485Z","updated_at":"2024-12-17T17:10:09.219Z","avatar_url":"https://github.com/sparkfish.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\u003cimg width=\"196\" src=\"https://user-images.githubusercontent.com/1108065/82249535-90bbb000-990f-11ea-9183-d24870f828af.png\" alt=\"Castpack logo\"\u003e\u003c/p\u003e\n\nCastpack is a magical R library that lets you effortlessly package linear forecast models and deploy them for use directly in your Microsoft SQL Server database.\n\nLeveraging the powerful open-source [modelc](https://github.com/team-sparkfish/modelc) library, Castpack will transpile models consisting of hundreds of parameters to performant ANSI SQL in mere seconds, and load them into your database in the blink of an eye. Just bring your models as `.rds` files, tell Castpack about your database with a simple configuration file, and let her rip!\n\nUnlike other libraries and tools, Castpack was purpose-built for predictive linear and generalized linear models. 
This focus on linear models keeps Castpack lightweight, and allows it to support linear models and GLMs that other libraries choke on.\n\nIt was inspired by and builds upon the venerable [tidypredict](https://tidymodels.github.io/tidypredict/) library.\n\n## Installation\n\nUsing `devtools` and `remotes`:\n\n```{R}\ninstall.packages(\"devtools\")\ninstall.packages(\"remotes\")\nremotes::install_github(\"team-sparkfish/Castpack\", dependencies=TRUE)\n```\n\nPrepare a workspace directory:\n\n```{shell}\n$ mkdir workspace\n```\n\nCopy the `example.models.yml` and `example.db.yml` configuration files to `workspace/models.yml` and `workspace/db.yml` respectively, and fill in the details for your database and model.\n\nSet your R working directory to your workspace, and run\n\n```{R}\nCastpack::prepare_registry()\n```\n\nThis will create the necessary objects for models to be loaded and run inside your database.\n\n## How it works\n\nCastpack is simple to use because it is opinionated (in a \"convention over configuration\" sense) about how models are represented in your database.\n\nWhen you run `Castpack::prepare_registry()`, Castpack creates two objects: a `${schema}.Models` table (where `${schema}` is the schema you specified in your configuration file), along with `${schema}.Predict`, a stored procedure for running predictions inside the database.\n\nThe `Predict` procedure takes as arguments a model name and a datasource name. The latter must correspond to an existing view or table.\n\nThe models specified in `models.yml` are then transpiled from `.rds` format files into ANSI SQL queries, which are upserted into the `Models` table. 
From there, you can run the `Predict` procedure against the model and a table or view in your database.\n\nBecause the models are nothing more than formulas represented as select statements, they are blazing fast.\n\n## Making Predictions\n\nTo make predictions, use the `Predict` procedure that is created when `Castpack::prepare_registry()` is run.\n\nIt takes two arguments:\n\n``` sql\n@modelName NVARCHAR(128),\n@dataSourceViewName NVARCHAR(258)\n```\n\n`@dataSourceViewName` should be the name of an existing table or view.\n\n## Model Configuration\n\nUse `models.yml` to configure your models. There should be a top-level key for each model to be imported, consisting of the following attributes:\n\n- `name` The model name is used by the `Predict` procedure to apply the model against the specified dataset\n- `path` The path to the model file. The model should live on disk as a `.Rds` formatted file\n- `datasource` The data source should be an existing table or view the model should be applied against\n- `auxiliary_columns` These are additional columns to be returned in the output of `Predict`\n- `response_column` This specifies the alias of the response column in the output of `Predict`\n- `raw` _(optional)_ Any additional SQL (e.g., a `WHERE` or `ORDER BY` clause) can be added here\n\nSee `example.models.yml` for an example.\n\n## API\n\n- `Castpack::prepare_registry()` creates the `${schema}.Models` table and `${schema}.Predict` procedure\n- `Castpack::deploy_models()` upserts the models specified in `config.r` to the `Models` table. This function depends on a `models` variable defined in `config.r` that tells Castpack about the models you'd like to load into your database. 
See `example.config.r` for an example configuration.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsparkfish%2Fcastpack","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsparkfish%2Fcastpack","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsparkfish%2Fcastpack/lists"}