{"id":15297368,"url":"https://github.com/mannasoumya/imputerapi","last_synced_at":"2025-03-25T13:15:02.469Z","repository":{"id":62570816,"uuid":"378590391","full_name":"mannasoumya/imputerApi","owner":"mannasoumya","description":"Data Imputer API in Python ","archived":false,"fork":false,"pushed_at":"2022-08-19T08:30:15.000Z","size":37,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-25T13:14:59.192Z","etag":null,"topics":["api","data-cleaning","data-science","datapreprocessing","dataprocessing","imputer","machine-learning","machine-learning-algorithms","matrix","python3"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mannasoumya.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-20T07:59:49.000Z","updated_at":"2022-08-19T08:08:32.000Z","dependencies_parsed_at":"2022-11-03T18:25:46.545Z","dependency_job_id":null,"html_url":"https://github.com/mannasoumya/imputerApi","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mannasoumya%2FimputerApi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mannasoumya%2FimputerApi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mannasoumya%2FimputerApi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mannasoumya%2FimputerApi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mannasoumya","download_url":"htt
ps://codeload.github.com/mannasoumya/imputerApi/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245467615,"owners_count":20620216,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api","data-cleaning","data-science","datapreprocessing","dataprocessing","imputer","machine-learning","machine-learning-algorithms","matrix","python3"],"created_at":"2024-09-30T19:16:56.552Z","updated_at":"2025-03-25T13:15:02.429Z","avatar_url":"https://github.com/mannasoumya.png","language":"HTML","readme":"# Data Imputer API in Python\r\n\r\n[![Generic badge](https://img.shields.io/badge/imputerApi-passing-\u003cCOLOR\u003e.svg)](https://pypi.org/project/ImputerApi/)\r\n[![Downloads](https://pepy.tech/badge/imputerapi)](https://pepy.tech/project/imputerapi)\r\n[![Downloads](https://pepy.tech/badge/imputerapi/month)](https://pepy.tech/project/imputerapi)\r\n\r\n\r\nCheck out the [Wiki](\u003chttps://en.wikipedia.org/wiki/Imputation_(statistics)\u003e) here.\r\n\r\n### \u003ca href=\"https://mannasoumya.github.io/imputerApi/\" target=\"_blank\"\u003e 'imputerApi' Documentation. 
\u003c/a\u003e\r\n\r\n## Installation\r\n```console\r\n$ python3 -m venv venv\r\n$ source venv/bin/activate\r\n(venv) $ pip install ImputerApi\r\n```\r\n\r\n## Currently Supported Strategies:\r\n\r\n- Mean\r\n- Median\r\n- Most-Frequent\r\n- Constant\r\n- K Nearest Neighbors\r\n\r\n## Usage:\r\n\r\n#### Read from csv file:\r\n\r\n```python\r\nfrom ImputerAPI.imputerApi import ImputerApi\r\n\r\n# Create instance of class\r\nimm_api = ImputerApi(path_to_file=\"data.csv\",strategy='mean', headers=True)\r\n# Print data in console\r\nimm_api.print_table(imm_api.data)\r\n# Transform data by replacing missing values with mean\r\n# and selecting only columns Age and Salary with indexes 1 and 2\r\nreplaced_data = imm_api.transform(column_indexes=[1, 2])\r\n# Print replaced data in console\r\nimm_api.print_table(replaced_data)\r\n# Write new data to csv file\r\nimm_api.dump_data_to_csv('datanew_mean.csv', replaced_data,use_header_from_data=True, override=True)\r\n```\r\n\r\n#### Read from a Two Dimensional Matrix (Python List):\r\n\r\n```python\r\nfrom ImputerAPI.imputerApi import ImputerApi\r\n\r\nmatrix_2d = [\r\n    ['Country', 'Age', 'Salary', 'Purchased'],\r\n    ['France', 44, 72000, 'No'],\r\n    ['Spain', 27, 48000, 'Yes'],\r\n    ['Germany', 30, 54000, 'No'],\r\n    ['Spain', 38, 61000, 'No'],\r\n    ['Germany', 40, '', 'Yes'],\r\n    ['France', 35, 58000, 'Yes'],\r\n    ['Spain', '', 52000, 'No'],\r\n    ['France', 48, 79000, 'Yes'],\r\n    ['Germany', 50, 83000, 'No'],\r\n    ['France', 37, 67000, 'Yes']\r\n]\r\n# Create instance of class\r\nimm_api = ImputerApi(matrix_2D=matrix_2d, strategy='median', headers=True)\r\n# Print data in console\r\nimm_api.print_table(imm_api.data)\r\n# Transform data by replacing missing values with median\r\n# and selecting only columns Age and Salary\r\nreplaced_data = imm_api.transform(columns_by_header_name=[\"Age\",\"Salary\"])\r\n# Print replaced data in console\r\nimm_api.print_table(replaced_data)\r\n# Write new data to csv 
file\r\nimm_api.dump_data_to_csv('datanew_median.csv', replaced_data,use_header_from_data=True,override=True)\r\n# Create instance with strategy most-frequent\r\nimm_api_most_freq = ImputerApi(path_to_file='datanew_median.csv',strategy=\"most-frequent\",headers=True)\r\nimm_api_most_freq.print_table(imm_api_most_freq.data)\r\n# Transform data by replacing missing values with most-frequent\r\n# and selecting only column Purchased\r\nreplaced_data = imm_api_most_freq.transform(columns_by_header_name=[\"Purchased\"])\r\nimm_api_most_freq.print_table(replaced_data)\r\n# Write new table to csv file\r\nimm_api_most_freq.dump_data_to_csv('datanew_most_frequent.csv', replaced_data,\r\n                         use_header_from_data=True, override=True)\r\n```\r\n\r\n#### Integrating with pandas and numpy:\r\n\r\n```python\r\nfrom ImputerAPI.imputerApi import ImputerApi\r\nimport numpy as np\r\nimport pandas as pd\r\n# Read csv data as Pandas DataFrame\r\ndf = pd.read_csv('data.csv')\r\n# Convert Pandas DataFrame to Numpy Array\r\narr = df.values\r\n# Convert Numpy Array to Python List\r\narr_list = arr.tolist()\r\n# Pass List to ImputerApi in parameter matrix_2D; headers=False since df.values does not include the header row\r\nimputer_api = ImputerApi(matrix_2D=arr_list,strategy=\"mean\",headers=False)\r\n# Replacing missing value 'np.nan' with mean\r\nreplaced_data = imputer_api.transform(column_indexes=[1,2],missing_value=np.nan)\r\n# Print to console\r\nimputer_api.print_table(arr_2D=replaced_data)\r\n# Write data to CSV file\r\nimputer_api.dump_data_to_csv(\"data2.csv\",replaced_data,override=True)\r\n\r\n```\r\n\r\n#### Using K-Nearest Neighbors\r\n\r\n```python\r\nfrom ImputerAPI.imputerApi import ImputerApi\r\n\r\n# Loading Data\r\nimputer_api = ImputerApi(\"data.csv\",strategy=\"knn\",headers=True)\r\n# Imputing Purchased Column containing Text Categorical Values\r\n# using knn technique and distance method 'Levenshtein'\r\nreplaced_data = 
imputer_api.transform(columns_by_header_name=[\"Purchased\"],missing_value=\"\",knn_method=\"levenshtein\",knn_selection=\"most-frequent\")\r\n# Creating new instance of ImputerApi using replaced_data\r\nimputer_api2 = ImputerApi(matrix_2D=replaced_data,strategy=\"knn\",headers=False)\r\n# Imputing columns 1 and 2 using knn and distance method 'Euclidian'\r\nreplaced_data = imputer_api2.transform(column_indexes=[1,2],missing_value=\"\",knn_method=\"Euclidian\",knn_selection=\"median\")\r\n# Writing replaced data to file\r\nimputer_api.dump_data_to_csv(\"data2.csv\",replaced_data,override=True,use_header_from_data=True)\r\n```\r\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmannasoumya%2Fimputerapi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmannasoumya%2Fimputerapi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmannasoumya%2Fimputerapi/lists"}