{"id":15540052,"url":"https://github.com/blaizzy/cancer_classifier","last_synced_at":"2025-03-29T00:14:21.730Z","repository":{"id":178452202,"uuid":"168742423","full_name":"Blaizzy/Cancer_classifier","owner":"Blaizzy","description":"Data science, AI and Machine Learning","archived":false,"fork":false,"pushed_at":"2019-03-19T18:26:39.000Z","size":925,"stargazers_count":1,"open_issues_count":0,"forks_count":3,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-03-28T16:46:49.720Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Blaizzy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-01T18:30:12.000Z","updated_at":"2019-02-03T15:58:16.000Z","dependencies_parsed_at":null,"dependency_job_id":"b0137654-2f65-47e6-94f9-f30826d66421","html_url":"https://github.com/Blaizzy/Cancer_classifier","commit_stats":null,"previous_names":["blaizzy/cancer_classifier"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2FCancer_classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2FCancer_classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2FCancer_classifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Blaizzy%2FCancer_classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Blaizzy","down
load_url":"https://codeload.github.com/Blaizzy/Cancer_classifier/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246117761,"owners_count":20726069,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-02T12:12:20.336Z","updated_at":"2025-03-29T00:14:21.725Z","avatar_url":"https://github.com/Blaizzy.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cancer_classifier\n\nGitHub profile:  https://github.com/Blaizzy\nMedium profile:  https://medium.com/@prince.canuma\n\n    Dataset:\n    [Wisconsin Breast Cancer Database](https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/breast-cancer-wisconsin.names)\n    This breast cancer database was obtained from the University of Wisconsin Hospitals, Madison from Dr William H. Wolberg on January 8, 1991.\n    \n    Citation:\n     K. P. Bennett \u0026 O. L. Mangasarian: \"Robust linear programming\n     discrimination of two linearly inseparable sets\", Optimization Methods\n    and Software 1, 1992, 23-34 (Gordon \u0026 Breach Science Publishers)\n    \n# Class Dataset\nI created a class (Data.py) optimized for this dataset.\nAll data preprocessing and postprocessing is done automatically for you. \nYou can contribute by forking this repo and extending the class to suit your needs.\n    \nPlease don't forget to cite this repo. 
:+1:\n\n## Methods\n\n**load()**: Reads the text in the .txt file, creates a pandas DataFrame (DF),\ncopies it to the class variable df, and returns the DataFrame.\n\n## Class Methods\n\n**scatter_plot()**: Uses the class copy of the pandas DF to create a scatter plot showing the correlation and histogram\nof all columns.\n\n**df_scatter_plot(*DataFrame as args)**: Receives a pandas DF and creates a scatter plot showing the correlation and histogram of all columns.\n\n**class_distribution(*Dataframe as args)**: Receives a pandas DF and plots the label distribution.\n\n**correlation_matrix()**: Uses the class copy of the DF and displays a correlation matrix between attributes.\n\n**decode_preds(*predictions as args)**: Receives an array of predictions (0s or 1s) from the test set and returns the names of the\nclasses (Benign or Malignant).\n\n**confusion_matrix(*true_labels, *predictions)**: Receives two arguments: the first is an array of the true labels\nand the second is an array of the predicted labels.\n    \n    \n# Data\n![sample data](https://github.com/Blaizzy/Cancer_classifier/blob/Blaizzy-beta/img/Screenshot%20from%202019-02-03%2018-19-32.png)\n\n# Classifier \n\n![pred](https://github.com/Blaizzy/Cancer_classifier/blob/Blaizzy-beta/img/precision_50%25.png)\n\n**My classifier mislabels only 8 (Benign) cancer samples out of 220 and 7 (Malignant) cancer samples out of 219.**\nThere is room for improvement. \nI will iteratively improve this algorithm until it reaches 99%, so follow my GitHub profile for updates.\n\nYou can download the model I created and use it on another dataset with the same distribution. 
Here is the [download link](https://github.com/Blaizzy/Cancer_classifier/blob/Blaizzy-beta/models/saved_models/WiscosinBreastCancerClf.joblib).\n\nYou can run the classifier via this notebook: [models/BreastCancer(Sklearn)](https://github.com/Blaizzy/Cancer_classifier/blob/Blaizzy-beta/models/BreastCancer(Sklearn).ipynb)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblaizzy%2Fcancer_classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fblaizzy%2Fcancer_classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fblaizzy%2Fcancer_classifier/lists"}