{"id":15107592,"url":"https://github.com/pharo-ai/data-imputers","last_synced_at":"2026-01-18T20:02:02.126Z","repository":{"id":213302047,"uuid":"614488348","full_name":"pharo-ai/data-imputers","owner":"pharo-ai","description":"This project contains transformers for missing value imputation","archived":false,"fork":false,"pushed_at":"2023-12-19T14:41:48.000Z","size":45,"stargazers_count":1,"open_issues_count":2,"forks_count":0,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-04-04T04:41:27.376Z","etag":null,"topics":["ai","data","data-science","imputer","pharo","pharo-smalltalk","smalltalk"],"latest_commit_sha":null,"homepage":"","language":"Smalltalk","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pharo-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-03-15T17:26:26.000Z","updated_at":"2023-03-22T06:26:54.000Z","dependencies_parsed_at":"2023-12-19T17:57:38.292Z","dependency_job_id":"c4f4b55e-5fd3-4cd2-a26c-9ebfd0f7abb5","html_url":"https://github.com/pharo-ai/data-imputers","commit_stats":null,"previous_names":["pharo-ai/data-imputers"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/pharo-ai/data-imputers","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Fdata-imputers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Fdata-imputers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Fdata-imputers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Fdata-imputers/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pharo-ai","download_url":"https://codeload.github.com/pharo-ai/data-imputers/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pharo-ai%2Fdata-imputers/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28549732,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-18T19:56:05.265Z","status":"ssl_error","status_checked_at":"2026-01-18T19:55:54.685Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","data","data-science","imputer","pharo","pharo-smalltalk","smalltalk"],"created_at":"2024-09-25T21:25:49.357Z","updated_at":"2026-01-18T20:02:02.099Z","avatar_url":"https://github.com/pharo-ai.png","language":"Smalltalk","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Imputers\n\n[![CI](https://github.com/pharo-ai/data-imputers/actions/workflows/ci.yml/badge.svg)](https://github.com/pharo-ai/data-imputers/actions/workflows/ci.yml)\n[![Coverage Status](https://coveralls.io/repos/github/pharo-ai/data-imputers/badge.svg?branch=master)](https://coveralls.io/github/pharo-ai/data-imputers?branch=master)\n[![Pharo version](https://img.shields.io/badge/Pharo-9-%23aac9ff.svg)](https://pharo.org/download)\n[![Pharo 
version](https://img.shields.io/badge/Pharo-10-%23aac9ff.svg)](https://pharo.org/download)\n[![Pharo version](https://img.shields.io/badge/Pharo-11-%23aac9ff.svg)](https://pharo.org/download)\n[![Pharo version](https://img.shields.io/badge/Pharo-12-%23aac9ff.svg)](https://pharo.org/download)\n[![License](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/PharoAI/data-imputers/master/LICENSE)\n\nThis is a Pharo library for transforming data to manage missing values.\n\n## How to install it?\n\nTo install the project, go to the Playground (Ctrl+OW) in your [Pharo](https://pharo.org/) image and execute the following Metacello script (select it and press the Do-it button or Ctrl+D):\n\n```Smalltalk\nMetacello new\n  baseline: 'AIDataImputers';\n  repository: 'github://pharo-ai/data-imputers/src';\n  load.\n```\n\n## How to depend on it?\n\nIf you want to add a dependency on this project to your project, include the following lines in your baseline method:\n\n```Smalltalk\nspec\n  baseline: 'AIDataImputers'\n  with: [ spec repository: 'github://pharo-ai/data-imputers/src' ].\n```\n\nIf you are new to baselines and Metacello, check out this wonderful [Baselines](https://github.com/pharo-open-documentation/pharo-wiki/blob/master/General/Baselines.md) tutorial on Pharo Wiki.\n\n## Quick Start\n\nI can be used to fill the missing values of a collection like this:\n\n```st\n| collection |\ncollection := #( #( 7 2 5 6 ) #( 7 nil 5 9 ) #( 10 2 nil 6 ) ).\n\t\nAISimpleImputer new\n\tuseMostFrequent;\n\tfit: collection;\n\ttransform: collection \"#( #( 7 2 5 6 ) #( 7 2 5 9 ) #( 10 2 5 6 ) )\"\n```\n\nI can also be used to fill missing values of a [`DataFrame`](https://github.com/PolyMathOrg/DataFrame):\n\n```st\nAISimpleImputer mostFrequent fitAndTransform: (DataFrame withRows: #( #( 7 2 5 6 ) #( 7 nil 5 9 ) #( 10 2 nil 6 ) )) \n```\n\n## Simple Imputer\n\nI am a simple imputer whose goal is to fill missing values in 2D collections.\n\nTo use me 
you need 3 steps. The first one is to define the value to replace the missing values with:\n- `#useAverage` (Default value)\n- `#useMedian`\n- `#useMostFrequent`\n- `#useConstant:`\n\nThen you need to use `#fit:` to let me compute the replacement values. Once that is done, you can use `#statistics` to get those values.\n\nFinally you can use `#transform:` to fill the missing values of a 2D collection. \n\nAn alternative is to use `#fitAndTransform:` if you want to compute the replacement values from the same collection that you are filling.\n\nExample:\n\n```st\n| collection |\ncollection := #( #( 7 2 5 6 ) #( 7 nil 5 9 ) #( 10 2 nil 6 ) ).\n\t\nAISimpleImputer new\n\tuseMostFrequent;\n\tfit: collection;\n\tstatistics; \"This method returns the replacement values once the imputer is fitted. In this case =\u003e #( 7 2 5 6 )\"\n\ttransform: collection \"#( #( 7 2 5 6 ) #( 7 2 5 9 ) #( 10 2 5 6 ) )\"\n```\n\nor\n\n```st\nAISimpleImputer new\n\tuseMostFrequent;\n\tfitAndTransform: #( #( 7 2 5 6 ) #( 7 nil 5 9 ) #( 10 2 nil 6 ) ) \"#( #( 7 2 5 6 ) #( 7 2 5 9 ) #( 10 2 5 6 ) )\"\n```\n\nI can also be used with a [`DataFrame`](https://github.com/PolyMathOrg/DataFrame):\n\n```st\nAISimpleImputer new\n\tuseMostFrequent;\n\tfitAndTransform: (DataFrame withRows: #( #( 7 2 5 6 ) #( 7 nil 5 9 ) #( 10 2 nil 6 ) )) \n```\n\nIt is also possible to change the missing value in case you want to replace something other than nil values:\n\n```st\nAISimpleImputer new\n\tuseMostFrequent;\n\tmissingValue: false;\n\tfitAndTransform: #( #( 7 2 5 6 ) #( 7 false 5 9 ) #( 10 2 false 6 ) ) \"#( #( 7 2 5 6 ) #( 7 2 5 9 ) #( 10 2 5 6 ) )\"\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpharo-ai%2Fdata-imputers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpharo-ai%2Fdata-imputers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpharo-ai%2Fdata-imputers/lists"}