{"id":13399185,"url":"https://github.com/data-cleaning/validate","last_synced_at":"2026-02-18T22:02:13.192Z","repository":{"id":14347910,"uuid":"17057497","full_name":"data-cleaning/validate","owner":"data-cleaning","description":"Professional data validation for the R environment","archived":false,"fork":false,"pushed_at":"2025-12-10T15:35:20.000Z","size":6684,"stargazers_count":430,"open_issues_count":51,"forks_count":42,"subscribers_count":17,"default_branch":"master","last_synced_at":"2026-01-14T05:01:48.983Z","etag":null,"topics":["data-cleaning","r","validation"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/data-cleaning.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2014-02-21T13:50:42.000Z","updated_at":"2026-01-07T21:45:11.000Z","dependencies_parsed_at":"2024-04-23T12:38:20.869Z","dependency_job_id":"83b53977-74b7-4c5f-9f5b-044ebfe85dd8","html_url":"https://github.com/data-cleaning/validate","commit_stats":{"total_commits":731,"total_committers":13,"mean_commits":56.23076923076923,"dds":"0.12585499316005477","last_synced_commit":"9e2f4f2e324e16878a9cb299ed2c8c8f3c9544e9"},"previous_names":["edwindj/validate"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/data-cleaning/validate","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposit
ories/data-cleaning%2Fvalidate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/data-cleaning","download_url":"https://codeload.github.com/data-cleaning/validate/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/data-cleaning%2Fvalidate/sbom","scorecard":{"id":323943,"data":{"date":"2025-08-11","repo":{"name":"github.com/data-cleaning/validate","commit":"87dd34643f78ad5295e3e9f62021e166fa217f90"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3.1,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Code-Review","score":3,"reason":"Found 6/19 approved changesets -- score normalized to 3","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":1,"reason":"1 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 1","details":null,"documentation":{"short":"Determines if the project is 
\"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build 
process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":0,"reason":"license file not detected","details":["Warn: project does not have a license file"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 17 are checked with a SAST tool"],"documentation":{"short":"Determines if the project 
uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-18T02:02:33.531Z","repository_id":14347910,"created_at":"2025-08-18T02:02:33.531Z","updated_at":"2025-08-18T02:02:33.531Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29596329,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-18T20:59:56.587Z","status":"ssl_error","status_checked_at":"2026-02-18T20:58:41.434Z","response_time":162,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-cleaning","r","validation"],"created_at":"2024-07-30T19:00:34.979Z","updated_at":"2026-02-18T22:02:13.176Z","avatar_url":"https://github.com/data-cleaning.png","language":"R","funding_links":[],"categories":["Open-Source Software","R","HTML"],"sub_categories":["Data Quality Control"],"readme":"\n[![CRAN](http://www.r-pkg.org/badges/version/validate)](http://cran.r-project.org/package=validate/)\n[![Downloads](https://cranlogs.r-pkg.org/badges/validate)](http://cran.r-project.org/package=validate/)\n[![status](https://tinyverse.netlify.app/badge/validate)](https://CRAN.R-project.org/package=validate)\n[![Mentioned in Awesome Official Statistics ](https://awesome.re/mentioned-badge.svg)](http://www.awesomeofficialstatistics.org)\n\n\nEasy data validation for 
the masses.\n-----------------------------------\n\nThe `validate` R-package makes it super-easy to check whether data lives up to expectations you have based on domain knowledge. It works by allowing you to define data validation rules independent of the code or data set. Next you can confront a dataset, or various versions thereof, with the rules. Results can be summarized, plotted, and so on. Below is a simple example.\n\n```r\n\u003e library(validate)\n\u003e check_that(iris, Sepal.Width \u003c 0.5*Sepal.Length) |\u003e summary()\n  rule items passes fails nNA error warning                       expression\n1   V1   150     79    71   0 FALSE   FALSE Sepal.Width \u003c 0.5 * Sepal.Length\n```\n\nWith `validate`, data validation rules are treated as first-class citizens.\nThis means you can import, export, annotate, investigate and manipulate data\nvalidation rules in a meaningful way. \n\nTo get started: see our [data validation cookbook](https://data-cleaning.github.io/validate/).\n\n\n#### Citing\n\nPlease cite the [JSS article](https://www.jstatsoft.org/article/view/v097i10)\n\n```\n@article{van2021data,\n  title={Data validation infrastructure for R},\n  author={van der Loo, Mark PJ and de Jonge, Edwin},\n  journal={Journal of Statistical Software},\n  year={2021},\n  volume ={97},\n  issue = {10},\n  pages = {1-33},\n  doi={10.18637/jss.v097.i10},\n  url = {https://www.jstatsoft.org/article/view/v097i10}\n}\n```\n\nTo cite the theory, please cite our [Wiley StatsRef](https://arxiv.org/abs/2012.12028) chapter.\n\n```\n@article{loo2020data,\n  title = {Data Validation},\n  year = {2020},\n  journal = {Wiley StatsRef: Statistics Reference Online},\n  author = {M.P.J. van der Loo and E. 
de Jonge},\n  pages = {1--7},\n  doi = {https://doi.org/10.1002/9781118445112.stat08255},\n  url = {https://onlinelibrary.wiley.com/doi/10.1002/9781118445112.stat08255}\n}\n```\n\n\n#### Other Resources\n\n- [Tutorial material](https://github.com/markvanderloo/2024uRos) from the tutorial at _uRos2024_ (Greece)\n- [Tutorial material](https://github.com/data-cleaning/validate) from our tutorial at _useR!_2021\n- [The Data Validation Cookbook](https://data-cleaning.github.io/validate)\n- [Slides](http://www.slideshare.net/MarkVanDerLoo/data-validation-infrastructure-the-validate-package) of the [useR2016](http://www.useR2016.org) talk (Stanford University, June 28 2016).\n- [Video](https://www.youtube.com/watch?v=RMCc2Iu0UIQ) of the [satRdays](https://budapest.satRdays.org) talk (Hungarian Academy of Sciences, Sept 3 2016).\n- [Slides and exercises](https://github.com/data-cleaning/useR2019_tutorial) from the [useR2018](https://user2018.r-project.org/) tutorial.\n- [Materials](https://github.com/data-cleaning/uRos2018_tutorial) for the [uRos2018](http://r-project.ro/conference2018.html) workshop (The Hague, 2018)\n- [Materials](https://github.com/data-cleaning/EESW2019_tutorial) for the [ENBES|EESW](https://statswiki.unece.org/display/ENBES/EESW19) workshop (Bilbao, 2019)\n- [Materials](https://github.com/data-cleaning/ISM2020_tutorial) for the planned workshop at the [Institute for Statistical Mathematics](https://www.ism.ac.jp/index_e.html) (Tokyo, 2020 - cancelled because of the COVID-19 situation)\n\n#### Installation\n\n\nThe latest release can be installed from the R command line:\n```r\ninstall.packages(\"validate\")\n```\n\nThe development version can be installed as follows.\n```bash\ngit clone https://github.com/data-cleaning/validate\ncd validate\nmake install\n```\n\nNote that the development version likely contains bugs (please report them!) 
and interfaces that may not be stable.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-cleaning%2Fvalidate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdata-cleaning%2Fvalidate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdata-cleaning%2Fvalidate/lists"}