{"id":16778676,"url":"https://github.com/ritvik19/toxic-comment-classification","last_synced_at":"2025-10-13T16:37:22.908Z","repository":{"id":104587630,"uuid":"262729682","full_name":"Ritvik19/Toxic-Comment-Classification","owner":"Ritvik19","description":null,"archived":false,"fork":false,"pushed_at":"2020-07-28T16:48:19.000Z","size":10167,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-16T19:48:25.314Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ritvik19.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-10T06:53:40.000Z","updated_at":"2021-10-25T02:45:13.000Z","dependencies_parsed_at":null,"dependency_job_id":"46e73ed5-b6a2-4438-9e45-32aad6e90a25","html_url":"https://github.com/Ritvik19/Toxic-Comment-Classification","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Ritvik19/Toxic-Comment-Classification","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2FToxic-Comment-Classification","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2FToxic-Comment-Classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2FToxic-Comment-Classification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2FToxic-Comment-Classification/manifests",
"owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ritvik19","download_url":"https://codeload.github.com/Ritvik19/Toxic-Comment-Classification/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ritvik19%2FToxic-Comment-Classification/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279016095,"owners_count":26085802,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-13T07:28:25.699Z","updated_at":"2025-10-13T16:37:22.888Z","avatar_url":"https://github.com/Ritvik19.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Toxic-Comment-Classification\n\nDiscussing things you care about can be difficult. The threat of abuse and harassment online means that many people stop expressing themselves and give up on seeking different opinions. 
Platforms struggle to effectively facilitate conversations, leading many communities to limit or completely shut down user comments.\n\nSome characteristics that can signify that a text is toxic:\n\n* Has a non-neutral tone\n  * Has an exaggerated tone to underscore a point about a group of people\n  * Is rhetorical and meant to imply a statement about a group of people\n* Is disparaging or inflammatory\n  * Suggests a discriminatory idea against a protected class of people, or seeks confirmation of a stereotype\n  * Makes disparaging attacks/insults against a specific person or group of people\n  * Is based on an outlandish premise about a group of people\n  * Disparages a characteristic that is not fixable and not measurable\n* Isn't grounded in reality\n  * Is based on false information, or contains absurd assumptions\n* Uses sexual content (incest, bestiality, pedophilia) for shock value\n\n**Problem Statement:** to build a multi-headed model that’s capable of detecting different types of toxicity like threats, obscenity, insults, and identity-based hate\n\n**Sources:** [Kaggle-Toxic Comment Classification Challenge](https://www.kaggle.com/c/jigsaw-toxic-comment-classification-challenge/) and [Kaggle-Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/c/jigsaw-unintended-bias-in-toxicity-classification/)\n\n**Project Objective:** a model to perform advanced sentiment analysis\n\n___\n\n### Approach Summary\n\n**Performance Measure:** Area Under the Receiver Operating Characteristic Curve (AUROC)\n\n**Feature Extraction:** Sublinear, Smoothed TF-IDF\n\n**Algorithm:** One-vs-Rest (OvR) Logistic Regression\n\n___\n\n### Performance Summary\n\nApproach | Algorithm | Mean AUROC | Mean Accuracy | Mean F1\n:---|:---|---:|---:|---:\nSampled Data | Logistic Regression | 0.9745 | 0.7812 | 0.8926\nSampled Data | Bagging Classifier | 0.9680 | 0.7616 | 0.8793\nComplete Data | Logistic Regression | 0.9717 | 0.8687 | 0.9046\nComplete Data | Stacking Classifier | 0.9729 | 0.7940 | 
0.8903\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritvik19%2Ftoxic-comment-classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fritvik19%2Ftoxic-comment-classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fritvik19%2Ftoxic-comment-classification/lists"}