{"id":37078119,"url":"https://github.com/stonegor/ae-imputer","last_synced_at":"2026-01-14T09:03:46.092Z","repository":{"id":226499604,"uuid":"768863252","full_name":"stonegor/ae-imputer","owner":"stonegor","description":" A python package used for missing data imputation via autoencoders.","archived":false,"fork":false,"pushed_at":"2024-03-21T18:18:15.000Z","size":31,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-21T23:55:46.708Z","etag":null,"topics":["data-science","deep-learning","imputation","machine-learning","python","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stonegor.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-03-07T21:51:48.000Z","updated_at":"2024-12-05T04:50:23.000Z","dependencies_parsed_at":"2024-03-08T17:28:55.263Z","dependency_job_id":"46dbb421-484f-4677-98b2-02679a3ef77b","html_url":"https://github.com/stonegor/ae-imputer","commit_stats":null,"previous_names":["stonegor/ae-impute","stonegor/ae-imputer"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/stonegor/ae-imputer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stonegor%2Fae-imputer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stonegor%2Fae-imputer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stonegor%2Fae-imputer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stonegor%2F
ae-imputer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stonegor","download_url":"https://codeload.github.com/stonegor/ae-imputer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stonegor%2Fae-imputer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28414736,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-14T08:38:59.149Z","status":"ssl_error","status_checked_at":"2026-01-14T08:38:43.588Z","response_time":107,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","deep-learning","imputation","machine-learning","python","pytorch"],"created_at":"2026-01-14T09:03:45.408Z","updated_at":"2026-01-14T09:03:46.085Z","avatar_url":"https://github.com/stonegor.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)![NumPy](https://img.shields.io/badge/numpy-%23013243.svg?style=for-the-badge\u0026logo=numpy\u0026logoColor=white)![PyTorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?style=for-the-badge\u0026logo=PyTorch\u0026logoColor=white)![scikit-learn](https://img.shields.io/badge/scikit--learn-%23F7931E.svg?style=for-the-badge\u0026logo=scikit-learn\u0
026logoColor=white)\n# ae-imputer\nae-imputer is a Python package for missing data imputation via autoencoders.\n\nAs of now, only numerical values are supported for imputation.\n\nThe method used is based on the paper:\n\n[John T. McCoy, Steve Kroon, Lidia Auret: Variational Autoencoders for Missing Data Imputation with Application to a Simulated Milling Circuit, IFAC-PapersOnLine, 2018](https://www.sciencedirect.com/science/article/pii/S2405896318320949)\n\n## Installing\n\nNote that **ae-imputer** uses **PyTorch** for all of its underlying AutoEncoder implementations.\n\nRequirements:\n\n* Python 3.8 or greater\n* numpy\n* scikit-learn\n* pytorch\n\n```bash\npip install ae-imputer\n```\n\n## Usage\n\nThe ae-imputer package is designed to match the calling API of scikit-learn imputers.\n\n```python\nimport numpy as np\nfrom aeimputer import AEImputer\n\n# Toy data; np.nan marks the missing entries\nX = [[1,2,3],[2,np.nan,4],[np.nan,5,6],[np.nan,2,3],[2,3,4],[4,5,6]]\nimputer = AEImputer(n_layers=5)\n\n# Fit the autoencoder and fill in the missing values\nX_imputed = imputer.fit_transform(X)\n```\nIt is recommended to normalize your data before fitting and imputation.\nUnlike the example above, AEImputer is meant to be used with much larger amounts of data,\nin order to properly utilize its capabilities.\n\nThere are a number of parameters that can be set for the AEImputer class; the\nmajor ones are as follows:\n\n -  ``model_type`` : 'variational' or 'vanilla', default='variational'\n        Type of AutoEncoder architecture to use.\n\n -  ``n_layers`` : int, default=3\n        The number of layers in the AutoEncoder network.\n\n -  ``hidden_dims`` : list of int, default=None\n        The number of neurons for each hidden layer in the AutoEncoder network. 
If None, will be\n        determined automatically.\n\n -  ``preimpute_at_train`` : bool, default=False\n        AEImputer uses only complete rows of data during fitting by default.\n        If set to True, the missing values are imputed with 'preimpute_strategy' before training.\n        Advised if the fraction of rows with missing values is significant.\n\n -  ``max_epochs`` : int, default=1000\n        The maximum number of epochs to train the AutoEncoder.\n\n -  ``lr`` : float, default=1e-3\n        The learning rate for the optimizer during training.\n\n\n```bibtex\n@article{MCCOY2018141,\n    title = {Variational Autoencoders for Missing Data Imputation with Application to a Simulated Milling Circuit},\n    journal = {IFAC-PapersOnLine},\n    volume = {51},\n    number = {21},\n    pages = {141-146},\n    year = {2018},\n    note = {5th IFAC Workshop on Mining, Mineral and Metal Processing MMM 2018},\n    issn = {2405-8963},\n    doi = {https://doi.org/10.1016/j.ifacol.2018.09.406},\n    url = {https://www.sciencedirect.com/science/article/pii/S2405896318320949},\n    author = {John T. McCoy and Steve Kroon and Lidia Auret},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstonegor%2Fae-imputer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstonegor%2Fae-imputer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstonegor%2Fae-imputer/lists"}