{"id":13665843,"url":"https://github.com/markvanderloo/simputation","last_synced_at":"2025-10-22T06:06:29.149Z","repository":{"id":54160470,"uuid":"63790418","full_name":"markvanderloo/simputation","owner":"markvanderloo","description":"Making imputation easy","archived":false,"fork":false,"pushed_at":"2024-08-02T12:08:55.000Z","size":756,"stargazers_count":91,"open_issues_count":13,"forks_count":11,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-03-19T05:03:31.731Z","etag":null,"topics":["data-science","imputation","officialstatistics","r","rstats"],"latest_commit_sha":null,"homepage":"","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/markvanderloo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-07-20T14:58:27.000Z","updated_at":"2025-03-04T02:40:29.000Z","dependencies_parsed_at":"2022-08-13T08:00:31.942Z","dependency_job_id":"15969ce1-9cc7-4fd9-b1f8-80437c9f44c7","html_url":"https://github.com/markvanderloo/simputation","commit_stats":{"total_commits":174,"total_committers":3,"mean_commits":58.0,"dds":"0.011494252873563204","last_synced_commit":"2b864b4a4aed4b7033679e7437da75cabae924b8"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markvanderloo%2Fsimputation","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markvanderloo%2Fsimputation/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markvanderloo%2Fsimputation/releases","
manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/markvanderloo%2Fsimputation/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/markvanderloo","download_url":"https://codeload.github.com/markvanderloo/simputation/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250961107,"owners_count":21514581,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","imputation","officialstatistics","r","rstats"],"created_at":"2024-08-02T06:00:51.962Z","updated_at":"2025-10-22T06:06:24.128Z","avatar_url":"https://github.com/markvanderloo.png","language":"R","funding_links":[],"categories":["R"],"sub_categories":[],"readme":"\n\n[![CRAN](http://www.r-pkg.org/badges/version/simputation)](https://CRAN.R-project.org/package=simputation)[![status](https://tinyverse.netlify.app/badge/simputation)](https://CRAN.R-project.org/package=simputation)\n[![Downloads](http://cranlogs.r-pkg.org/badges/simputation)](https://CRAN.R-project.org/package=simputation)[![Mentioned in Awesome Official Statistics ](https://awesome.re/mentioned-badge.svg)](http://www.awesomeofficialstatistics.org)\n\n\n# simputation\nAn R package to make imputation simple. 
Currently supported methods include\n\n- Model based (optionally add [non-]parametric random residual)\n    - linear regression \n    - robust linear regression (M-estimation)\n    - ridge/elasticnet/lasso regression (from version \u003e= 0.2.1)\n    - CART models\n    - Random forest\n- Model based, multivariate\n    - Imputation based on EM-estimated parameters (from version \u003e= 0.2.1)\n    - [missForest](https://CRAN.R-project.org/package=missForest) (from version \u003e= 0.2.1)\n- Donor imputation (including various donor pool specifications)\n  - k-nearest neighbour (based on [gower](https://cran.r-project.org/package=gower)'s distance)\n  - sequential hotdeck (LOCF, NOCB)\n  - random hotdeck\n  - Predictive mean matching\n- Other\n  - (groupwise) median imputation (optional random residual)\n  - Proxy imputation (copy from another variable) \n\n\n### Installation\n\nTo install simputation and all packages needed to support the various imputation\nmodels, do the following.\n```r\ninstall.packages(\"simputation\", dependencies=TRUE)\n```\n\nTo install the development version, clone the repository and run `make install` from its root directory.\n\n```{bash}\ngit clone https://github.com/markvanderloo/simputation\ncd simputation\nmake install\n```\n\n\n### Example usage\n\nCreate some data with missing values.\n```r\nlibrary(simputation) # current package\n\ndat \u003c- iris\n# empty a few fields\ndat[1:3,1] \u003c- dat[3:7,2] \u003c- dat[8:10,5] \u003c- NA\nhead(dat,10)\n```\nNow impute `Sepal.Length` and `Sepal.Width` by regression on `Petal.Length` and `Species`, and impute `Species` using a CART model that uses all other variables (including the imputed variables in this case).\n```r\ndat |\u003e\n  impute_lm(Sepal.Length + Sepal.Width ~ Petal.Length + Species) |\u003e\n  impute_cart(Species ~ .) 
|\u003e # use all variables except 'Species' as predictor\n  head(10)\n```\n\n### Materials\n\n- The introductory [vignette](https://cran.r-project.org/web/packages/simputation/vignettes/intro.html)\n- [slides](https://markvanderloo.eu/files/share/loo2017easy.pdf) from my [useR2017](https://user2017.brussels/) talk.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkvanderloo%2Fsimputation","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmarkvanderloo%2Fsimputation","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmarkvanderloo%2Fsimputation/lists"}