{"id":14067857,"url":"https://github.com/data-cleaning/validatetools","last_synced_at":"2025-10-22T04:06:12.192Z","repository":{"id":56936536,"uuid":"98531483","full_name":"data-cleaning/validatetools","owner":"data-cleaning","description":null,"archived":false,"fork":false,"pushed_at":"2024-06-14T09:32:49.000Z","size":6432,"stargazers_count":15,"open_issues_count":6,"forks_count":3,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-10-13T12:44:00.603Z","etag":null,"topics":["data-cleaning","r","rules","validation"],"latest_commit_sha":null,"homepage":"https://data-cleaning.github.io/validatetools","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/data-cleaning.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-07-27T12:10:17.000Z","updated_at":"2024-07-21T22:47:49.000Z","dependencies_parsed_at":"2024-02-19T19:16:12.789Z","dependency_job_id":"1807aeb5-aee5-4344-b1ac-ab227edb2ff0","html_url":"https://github.com/data-cleaning/validatetools","commit_stats":{"total_commits":151,"total_committers":2,"mean_commits":75.5,"dds":"0.019867549668874163","last_synced_commit":"a114074b9ab48ba269f28cb4748c56bd00005f00"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidatetools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidatetools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalida
tetools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidatetools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/data-cleaning","download_url":"https://codeload.github.com/data-cleaning/validatetools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228075624,"owners_count":17865506,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-cleaning","r","rules","validation"],"created_at":"2024-08-13T07:05:48.940Z","updated_at":"2025-10-22T04:06:07.159Z","avatar_url":"https://github.com/data-cleaning.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include=FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"README-\"\n)\nlibrary(validatetools)\n```\n\n\u003c!-- badges: start --\u003e\n[![R-CMD-check](https://github.com/data-cleaning/validatetools/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/data-cleaning/validatetools/actions/workflows/R-CMD-check.yaml)\n[![CRAN status](https://www.r-pkg.org/badges/version/validatetools)](https://CRAN.R-project.org/package=validatetools)\n[![Mentioned in Awesome Official Statistics](https://awesome.re/mentioned-badge.svg)](http://www.awesomeofficialstatistics.org)\n[![Codecov test coverage](https://codecov.io/gh/data-cleaning/validatetools/graph/badge.svg)](https://app.codecov.io/gh/data-cleaning/validatetools)\n\u003c!-- badges: end --\u003e\n\n# validatetools\n\n`validatetools` is a utility package for managing validation rule sets that are defined with `validate`.\nIn production systems validation rule sets tend to grow organically and accumulate redundant or (partially)\ncontradictory rules. 
`validatetools` helps to identify problems with large rule sets and includes simplification\nmethods for resolving issues.\n\n## Installation\n\n`validatetools` is available from CRAN and can be installed with:\n\n```r\ninstall.packages(\"validatetools\")\n```\n\nThe adventurous can install an (unstable) development version of `validatetools` from GitHub with:\n\n``` r\n# install.packages(\"devtools\")\ndevtools::install_github(\"data-cleaning/validatetools\")\n```\nor from r-universe with:\n\n```r\ninstall.packages('validatetools', repos = c('https://data-cleaning.r-universe.dev', 'https://cloud.r-project.org'))\n```\n\n## Example\n\n### Check for feasibility\n\n```{r}\nrules \u003c- validator( x \u003e 0)\nis_infeasible(rules)\n\nrules \u003c- validator(\n  rule1 = x \u003e 0,\n  rule2 = x \u003c 0\n)\nis_infeasible(rules)\n\ndetect_infeasible_rules(rules, verbose=TRUE)\n# find out which rules conflict with rule1\nis_contradicted_by(rules, \"rule1\", verbose=TRUE)\n\n# we prefer to keep rule1, so we can give rule1 Inf weight\ndetect_infeasible_rules(\n  rules, \n  weight=c(rule1 = Inf), \n  verbose=TRUE\n)\n\nmake_feasible(rules, weight=c(rule1=Inf), verbose=TRUE)\n```\n\n### Finding contradicting if rules\n\n\n```{r}\nrules \u003c- validator(\n  rule1 = if (income \u003e 0) job == \"yes\",\n  rule2 = if (job == \"yes\") income == 0\n)\n    \nis_infeasible(rules, verbose=TRUE)\nconflicts \u003c- detect_contradicting_if_rules(rules, verbose=TRUE)\n```\n\n\n```{r}\nprint(conflicts)\n```\n\n## Simplifying \n\nThe function `simplify_rules` combines most simplification methods of `validatetools` to simplify a rule set.\nFor example, it reduces the following rule set to a simpler form:\n\n```{r}\nrules \u003c- validator(\n  rule1 = if (age \u003c 16) income == 0,\n  rule2 = job %in% c(\"yes\", \"no\"),\n  rule3 = if (job == \"yes\") income \u003e 0\n)\n\nsimplify_rules(rules, age = 13)\n# or\nsimplify_rules(rules, job = \"yes\")\n```\n\n`simplify_rules` combines the following simplification 
and substitution methods:\n\n\n### Value substitution\n\n```{r}\nrules \u003c- validator( \n  rule1 = height \u003e 4,\n  rule2 = height \u003c= max_height,\n  rule3 = if (gender == \"male\") weight \u003e 100,\n  rule4 = gender %in% c(\"male\", \"female\")\n)\nsubstitute_values(rules, max_height = 6, gender = \"male\")\n```\n\n### Finding fixed values\n\n```{r}\nrules \u003c- validator( \n  rule1 = x \u003e= 0, \n  rule2 = x \u003c= 0\n)\ndetect_fixed_variables(rules)\nsimplify_fixed_variables(rules)\n\nrules \u003c- validator(\n  rule1 = x1 + x2 + x3 == 0,\n  rule2 = x1 + x2 \u003e= 0,\n  rule3 = x3 \u003e= 0\n)\nsimplify_fixed_variables(rules)\n```\n\n### Simplifying conditional statements\n\n```{r}\n# superfluous conditions\nrules \u003c- validator(\n  r1 = if (age \u003e 18) age \u003c= 67,\n  r2 = if (income \u003e 0 \u0026\u0026 income \u003e 1000) job == TRUE \n)\n# r1 implies that age is always \u003c= 67\nsimplify_conditional(rules)\n\n\n\n# non-relaxing clause\nrules \u003c- validator( \n  r1 = if (income \u003e 0) age \u003e= 16,\n  r2 = age \u003c 12\n)\n# age \u003e= 16 is always FALSE (r2 requires age \u003c 12), so r1 can be simplified\nsimplify_conditional(rules)\n\n\n# non-constraining clause\nrules \u003c- validator( \n  rule1 = if (age \u003c 16) income == 0,\n  rule2 = if (age \u003e= 16) income \u003e= 0\n)\nsimplify_conditional(rules)\n```\n\n### Removing redundant rules\n\n```{r}\nrules \u003c- validator(\n  rule1 = age \u003e 12,\n  rule2 = age \u003e 18\n)\n\n# rule1 is superfluous\nremove_redundancy(rules, verbose=TRUE)\n\nrules \u003c- validator(\n  rule1 = age \u003e 12,\n  rule2 = age \u003e 12\n)\n\n# rule1 and rule2 are identical: the first rule wins\nremove_redundancy(rules, verbose=TRUE)\n\n# Note that detection flags both rules!\ndetect_redundancy(rules, 
verbose=TRUE)\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-cleaning%2Fvalidatetools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdata-cleaning%2Fvalidatetools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-cleaning%2Fvalidatetools/lists"}