{"id":18786092,"url":"https://github.com/ritzvik/oversampling-techniques","last_synced_at":"2026-05-11T05:31:58.489Z","repository":{"id":120181701,"uuid":"151610035","full_name":"ritzvik/oversampling-techniques","owner":"ritzvik","description":null,"archived":false,"fork":false,"pushed_at":"2019-02-01T14:59:18.000Z","size":66,"stargazers_count":4,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2024-12-29T12:43:36.586Z","etag":null,"topics":["data","machine-learning","machine-learning-algorithms","oversampling","python","r","smote","smote-ipf","smoteipf","spider","spider2"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ritzvik.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-04T17:25:57.000Z","updated_at":"2022-10-29T01:54:46.000Z","dependencies_parsed_at":null,"dependency_job_id":"453095c3-91b8-4493-9efe-1c05c9029c34","html_url":"https://github.com/ritzvik/oversampling-techniques","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ritzvik%2Foversampling-techniques","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ritzvik%2Foversampling-techniques/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ritzvik%2Foversampling-techniques/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ritzvik%2Foversampling-tech
niques/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ritzvik","download_url":"https://codeload.github.com/ritzvik/oversampling-techniques/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239702582,"owners_count":19683122,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","machine-learning","machine-learning-algorithms","oversampling","python","r","smote","smote-ipf","smoteipf","spider","spider2"],"created_at":"2024-11-07T20:50:32.444Z","updated_at":"2025-12-22T02:30:23.125Z","avatar_url":"https://github.com/ritzvik.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# oversampling-techniques\n\n## SMOTE\nThe input file should contain a header row and a Yes/No class column. The header of the Yes/No column should be named 'Class'. Entries in the Yes/No column should be either 'Y' or 'N'. The input file should be in CSV format.\n\nChange k in KNN for SMOTE by changing the variable 'k_knn' in smote.r. Change the 'YNcolumn' variable to indicate the column number of the Yes/No column. Column numbers start from 1.\n\nFile Usage : Rscript smote.r \u003cinput.csv\u003e \u003coutput.csv\u003e\n\n\n## SMOTE-IPF\nThe input file should contain a header row and a Yes/No class column. The header of the Yes/No column should be named 'Class'. Entries in the Yes/No column should be either 'Y' or 'N'. The input file should be in CSV format.\n\nChange k in KNN for SMOTE by changing the variable 'k_knn' in smoteipf.r. Change the 'YNcolumn' variable to indicate the column number of the Yes/No column. 
Column numbers start from 1.\n\nAlso, the variables 'n', 'k', 'voting' and 'p' can be changed as needed. For information on these variables, see: https://www.sciencedirect.com/science/article/pii/S0020025514008561\n\nFile Usage : Rscript smoteipf.r \u003cinput.csv\u003e \u003coutput.csv\u003e\n\n## SPIDER2\nThe input file should contain a header row and a Yes/No class column. The header of the Yes/No column should be named 'Class'. Entries in the Yes/No column should be either 'Y' or 'N'. The input file should be in CSV format.\n\nChange k in KNN by changing the variable 'k' in spider.py. Change the 'YNcolumn' variable to indicate the column number of the Yes/No column. Column numbers start from 1.\n\nOther variables can be changed below the comment \"#change below parameters according to requirment\". For information on these variables, see: https://link.springer.com/chapter/10.1007/978-3-642-13529-3_18\n\nFile Usage : python3 spider.py \u003cinput.csv\u003e\n\nOutput : k-r-a0.csv, k-r-a1.csv, k-r-a2.csv where a0, a1 and a2 represent no amplification, weak amplification and strong amplification respectively.\n\n\n#### A sample file named 'sample.csv' is uploaded for reference.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritzvik%2Foversampling-techniques","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fritzvik%2Foversampling-techniques","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritzvik%2Foversampling-techniques/lists"}