{"id":19079904,"url":"https://github.com/cimentadaj/tidyflow","last_synced_at":"2025-07-29T06:32:55.837Z","repository":{"id":42996892,"uuid":"240671889","full_name":"cimentadaj/tidyflow","owner":"cimentadaj","description":"A simplified and fresh workflow for doing machine learning with tidymodels","archived":false,"fork":false,"pushed_at":"2022-10-19T05:40:48.000Z","size":3715,"stargazers_count":8,"open_issues_count":15,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-30T05:44:13.332Z","etag":null,"topics":["machine-learning","r","statistics","tidymodels"],"latest_commit_sha":null,"homepage":"https://cimentadaj.github.io/tidyflow/","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cimentadaj.png","metadata":{"files":{"readme":"README.Rmd","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-02-15T08:43:53.000Z","updated_at":"2024-01-30T13:07:58.000Z","dependencies_parsed_at":"2023-01-20T02:17:37.691Z","dependency_job_id":null,"html_url":"https://github.com/cimentadaj/tidyflow","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/cimentadaj/tidyflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cimentadaj%2Ftidyflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cimentadaj%2Ftidyflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cimentadaj%2Ftidyflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cimentadaj%2Ftidyflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cimentadaj"
,"download_url":"https://codeload.github.com/cimentadaj/tidyflow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cimentadaj%2Ftidyflow/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267639569,"owners_count":24119780,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","r","statistics","tidymodels"],"created_at":"2024-11-09T02:16:22.895Z","updated_at":"2025-07-29T06:32:55.787Z","avatar_url":"https://github.com/cimentadaj.png","language":"HTML","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# tidyflow\n\n\u003c!-- badges: start --\u003e\n[![R build status](https://github.com/cimentadaj/tidyflow/workflows/R-CMD-check/badge.svg)](https://github.com/cimentadaj/tidyflow/actions)\n[![Codecov test coverage](https://codecov.io/gh/cimentadaj/tidyflow/branch/master/graph/badge.svg)](https://codecov.io/gh/cimentadaj/tidyflow?branch=master)\n\u003c!-- badges: end --\u003e\n\n## What is a tidyflow? 
\n\nA tidyflow is a fork of [workflows](https://workflows.tidymodels.org/) that can bundle together your data, splitting, resampling, preprocessing, modeling, and grid search. Keeping all these steps in separate objects can be difficult to manage. One can predict on the testing data by mistake, forget whether the recipe has been baked or not, or simply not remember the names of all the tuning parameters to specify in the grid. `tidyflow` is a package aimed at bundling all of these steps into a coherent flow, as shown below:\n\n\u003cimg src='man/figures/stages_arrows_plug_complete_code.png'\u003e\n\nAmong the advantages are:\n\n * You don't have to keep track of separate objects in your workspace.\n\n * The split, resample, recipe prepping, model fitting and grid search can be executed using a single call to `fit()`.\n \n## Installation\n\nYou can install the development version from [GitHub](https://github.com/) with: \n\n```{r, eval = FALSE}\n# install.packages(\"devtools\")\ndevtools::install_github(\"cimentadaj/tidyflow\")\n```\n \n## Example\n\n`tidyflow` builds upon the work in `tidymodels` to create an expressive workflow for doing machine learning. Suppose we want to fit the linear model `mpg ~ .` on the training data of `mtcars`. We can define the split (training/testing), define our formula, define the statistical model and fit the tidyflow:\n\n```{r, message = FALSE, eval = FALSE}\nlibrary(tidymodels)\nlibrary(tidyflow)\n```\n\n```{r, echo = FALSE, message = FALSE}\nlibrary(rsample)\nlibrary(tune)\nlibrary(parsnip)\nlibrary(dials)\nlibrary(tidyflow)\n```\n\n```{r}\n# Build tidyflow\ntflow \u003c-\n  mtcars %\u003e%\n  tidyflow() %\u003e%\n  plug_split(initial_split) %\u003e%\n  plug_formula(mpg ~ .) 
%\u003e%\n  plug_model(linear_reg() %\u003e% set_engine(\"lm\"))\n\n# Fit model\nfit_m \u003c- fit(tflow)\n\nfit_m\n```\n\n`tidyflow` executes the steps in this order: data -\u003e split training/testing -\u003e apply the formula and model to the training data. With this final model we can use the `predict_training` function to automatically predict on the training data:\n\n```{r}\n# Predict on training\nfit_m %\u003e%\n  predict_training()\n```\n\nSimilarly, you can use `predict_testing` for predicting on the testing data.\n\nHowever, the usefulness of `tidyflow` is clearer when we perform more complex modelling. Let's extend the previous model to include a cross-validation resample and to perform a grid search for a regularized regression:\n\n```{r}\n# Grid search will be performed on the penalty and mixture arguments\nregularized_mod \u003c- linear_reg(penalty = tune(), mixture = tune()) %\u003e% set_engine(\"glmnet\")\n\n# Build tidyflow\ntflow \u003c-\n  mtcars %\u003e% # Start with the data\n  tidyflow() %\u003e%\n  plug_split(initial_split) %\u003e% # Split into training/testing\n  plug_formula(mpg ~ .) %\u003e% # Define model specification\n  plug_resample(vfold_cv) %\u003e% # Specify resample: cross-validation\n  plug_grid(grid_regular) %\u003e%  # Define type of grid search\n  plug_model(regularized_mod) # Define the type of model\n\n# Fit model\nfit_m \u003c- fit(tflow)\nfit_m\n```\n\nThe result is **not** a final model as before, but rather a grid search result. We can extract that and visualize it:\n\n```{r}\n# Extract tuning grid\nfit_m %\u003e%\n  pull_tflow_fit_tuning() %\u003e%\n  autoplot()\n```\n\nYou can finalize the `tidyflow` with `complete_tflow`, which selects the best tuning parameters and trains the model on the entire training data. 
This final model can be used for predicting on the training data and on the testing data automatically:\n\n```{r}\n# Fit best model on the entire training data\nfinal_m \u003c-\n  fit_m %\u003e%\n  complete_tflow(metric = \"rmse\")\n\n# Predict on train\nfinal_m %\u003e%\n  predict_training()\n\n# Predict on testing\nfinal_m %\u003e%\n  predict_testing()\n```\n\n## Code of Conduct\n\nPlease note that the tidyflow project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcimentadaj%2Ftidyflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcimentadaj%2Ftidyflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcimentadaj%2Ftidyflow/lists"}