{"id":14066366,"url":"https://github.com/poissonconsulting/newdata","last_synced_at":"2025-08-23T09:08:14.392Z","repository":{"id":54573556,"uuid":"68297001","full_name":"poissonconsulting/newdata","owner":"poissonconsulting","description":"An R Package to Generate New Data Frames for Prediction","archived":false,"fork":false,"pushed_at":"2025-07-31T23:50:57.000Z","size":8855,"stargazers_count":4,"open_issues_count":3,"forks_count":1,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-08-01T01:38:41.741Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://poissonconsulting.github.io/newdata/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/poissonconsulting.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":".github/SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-09-15T13:37:55.000Z","updated_at":"2025-07-31T23:46:15.000Z","dependencies_parsed_at":"2023-09-27T07:33:21.528Z","dependency_job_id":"d00b910a-13b2-4bd8-a9ae-a2252db2da17","html_url":"https://github.com/poissonconsulting/newdata","commit_stats":null,"previous_names":[],"tags_count":6,"template":false,"template_full_name":null,"purl":"pkg:github/poissonconsulting/newdata","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poissonconsulting%2Fnewdata","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poissonconsulting%2Fnewdata/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poissonconsulting%2Fnewdata/releases","manife
sts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poissonconsulting%2Fnewdata/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/poissonconsulting","download_url":"https://codeload.github.com/poissonconsulting/newdata/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poissonconsulting%2Fnewdata/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271746301,"owners_count":24813556,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-13T07:05:03.885Z","updated_at":"2025-08-23T09:08:14.373Z","avatar_url":"https://github.com/poissonconsulting.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/README-\",\n  out.width = \"100%\"\n)\n```\n\n# newdata \u003cimg src=\"man/figures/logo.png\" align=\"right\" alt= \"Poisson Consulting logo\"/\u003e\n\n\u003c!-- badges: start --\u003e\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![R-CMD-check](https://github.com/poissonconsulting/newdata/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/poissonconsulting/newdata/actions/workflows/R-CMD-check.yaml)\n[![codecov](https://codecov.io/gh/poissonconsulting/newdata/graph/badge.svg?token=pJO8edj5Wu)](https://codecov.io/gh/poissonconsulting/newdata)\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/license/mit)\n[![CRAN status](https://www.r-pkg.org/badges/version/newdata)](https://cran.r-project.org/package=newdata)\n\u003c!-- badges: end --\u003e\n\n## Introduction\n\n`newdata` is an R package to generate new data frames by varying some variables while holding the others constant.\n\nBy default, all specified variables vary across their range \nwhile all other variables are held constant at a reference value.\nThe user can specify the length of each sequence, require that only\nobserved values and combinations are used and add new variables.\nTypes, classes, factor levels and time zones are always preserved.\n\nConsider the following observed 'old' data frame.\n```{r}\nlibrary(newdata)\n\nnewdata::old_data\n```\n\n### Reference Value\n\nBy default all variables are set to a reference value.\n```{r}\nxnew_data(old_data)\n```\n\nThe reference value depends on the class of the variable, by default:\n\n- logical vectors are FALSE;\n- double vectors are the mean;\n- integer, Date, POSIXct and hms vectors are the floored mean;\n- character vectors are the most 
common value or, if tied, the first of the most common values when sorted;\n- factor and ordered vectors are the first level.\n\n### Sequences\n\nSpecifying a variable causes it to vary sequentially across its range.\n```{r}\nxnew_data(old_data, int)\n```\n\nBy default the sequence depends on the class of the variable:\n\n- logical vectors are length 2 (TRUE and FALSE);\n- double vectors are 30 equally spaced values from the minimum value to the maximum value;\n- integer, Date, POSIXct and hms vectors are up to 30 discrete values from the minimum to the maximum value as evenly spaced as possible;\n- character vectors are the number of unique values;\n- factor and ordered vectors are the number of levels.\n\nThese values can be overridden by setting the following options:\n\n- `new_data.length_out_lgl`, which is 2 by default, for logical vectors;\n- `new_data.length_out_dbl`, which is 30 by default, for double vectors;\n- `new_data.length_out_int`, which is 30 by default, for integer, Date, POSIXct and hms vectors^1;\n- `new_data.length_out_chr`, which is Inf by default, for character, factor and ordered vectors.\n\n1. 
The length of Date, POSIXct and hms sequences is controlled by `new_data.length_out_int` as they are treated as integers for the purpose of generating a sequence.\n\nWhen programming, it is strongly recommended that the user explicitly specify the length of each sequence individually.\n```{r}\nxnew_data(old_data, lgl, xnew_seq(int, length_out = 3))\n```\n\nA third alternative is to specify the length of all the sequences in the data set, but this can result in less common character strings or later factor or ordered levels being dropped.\n\n```{r}\nxnew_data(old_data, dbl, int, .length_out = 2)\n```\n\n### Observed Values\n\nThe user can also indicate whether only observed values should be used in the sequence.\n```{r}\nxnew_data(old_data, xnew_seq(int, length_out = 3, obs_only = TRUE))\n```\n\nThe `xobs_only()` function can be used to filter out unobserved values after the sequence has been generated.\n```{r}\nxnew_data(old_data, xobs_only(xnew_seq(int, length_out = 3)))\n```\n\nWhen two or more variables are specified, all combinations are used.\n```{r}\nxnew_data(old_data, int, fct)\n```\n\nUse `xobs_only()` to get only the observed combinations.\n```{r}\nxnew_data(old_data, xobs_only(int, fct))\n```\n\n### Modifying Variables\n\nAdding a new variable or modifying an existing one is simple.\n```{r}\nxnew_data(old_data, lgl = median(lgl, na.rm = TRUE), extra = c(TRUE, FALSE))\n```\n\n### Casting Variables\n\nCasting variables to be the same class as the original is achieved as follows.\n```{r}\nxnew_data(old_data, xcast(lgl = 1, int = 7, dbl = 10L, fct = \"a rarity\", hms = \"00:00:02\"))\n```\n\n### A Simple Wrapper\n\nAlthough superseded, `new_data()` is also provided for consistency with existing code. It is a simple wrapper on `xnew_data()` that allows the user to pass a character vector and to specify the length of all the sequences.\n\n```{r}\nnew_data(old_data, seq = c(\"int\", \"fct\"), length_out = 5)\n```\n\n## Installation\n\nTo install the latest release version 
from CRAN.\n```r\ninstall.packages(\"newdata\")\n```\n\nTo install the latest development version from [GitHub](https://github.com/poissonconsulting/newdata)\n```r\n# install.packages(\"pak\")\npak::pak(\"poissonconsulting/newdata\")\n```\n\nor from [r-universe](https://poissonconsulting.r-universe.dev/newdata).\n```r\ninstall.packages(\"newdata\", repos = c(\"https://poissonconsulting.r-universe.dev\", \"https://cloud.r-project.org\"))\n```\n\n## Contribution\n\nPlease report any [issues](https://github.com/poissonconsulting/newdata/issues).\n\n[Pull requests](https://github.com/poissonconsulting/newdata/pulls) are always welcome.\n\n## Code of Conduct\n\nPlease note that the newdata project is released with a [Contributor Code of Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html). By contributing to this project, you agree to abide by its terms.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoissonconsulting%2Fnewdata","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoissonconsulting%2Fnewdata","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoissonconsulting%2Fnewdata/lists"}