{"id":22845172,"url":"https://github.com/toscdom/spam_detection","last_synced_at":"2026-04-27T21:31:15.517Z","repository":{"id":267753946,"uuid":"902245662","full_name":"ToscDom/SPAM_Detection","owner":"ToscDom","description":"This repository contains a project focused on analyzing and classifying emails to detect SPAM. It includes:  Training a machine learning classifier for SPAM detection. Identifying key topics in SPAM emails using NLP techniques. Calculating semantic distances to evaluate topic similarity. Tools used include Python libraries like nlp frameworks","archived":false,"fork":false,"pushed_at":"2024-12-12T07:43:37.000Z","size":396,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-06T09:47:48.862Z","etag":null,"topics":["classifier","nlp","nltk","scikit-learn","semantic-analysis","spam-detection"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ToscDom.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-12T07:37:23.000Z","updated_at":"2024-12-12T08:25:04.000Z","dependencies_parsed_at":"2024-12-12T08:29:47.570Z","dependency_job_id":"2fdd806b-5b2c-4ae8-a057-785aa5914402","html_url":"https://github.com/ToscDom/SPAM_Detection","commit_stats":null,"previous_names":["toscdom/spam_detection"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToscDom%2FSPAM_Detection","tags_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/ToscDom%2FSPAM_Detection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToscDom%2FSPAM_Detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ToscDom%2FSPAM_Detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ToscDom","download_url":"https://codeload.github.com/ToscDom/SPAM_Detection/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246420962,"owners_count":20774414,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classifier","nlp","nltk","scikit-learn","semantic-analysis","spam-detection"],"created_at":"2024-12-13T03:15:57.931Z","updated_at":"2026-04-27T21:31:15.470Z","avatar_url":"https://github.com/ToscDom.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Email Analysis and Classification for SPAM Detection ##\nThis project focuses on analyzing and classifying emails to identify SPAM and extract key insights from the data. \n\nThe main objectives of the project are:\n1. __Train a Classifier__ : Develop a machine learning model to accurately classify emails as SPAM or NOT SPAM.\n2. __Topic Identification__: Analyze the content of SPAM emails to identify the main topics or recurring themes\n3. 
__Semantic Analysis__: Calculate the semantic distance between identified topics to evaluate their similarity.\n\n__Features__\n- *Data Preparation*: Load, clean, and preprocess email datasets for analysis.\n- *Model Training*: Use supervised learning techniques to build a robust SPAM detection classifier.\n- *Topic Modeling*: Use natural language processing (NLP) methods to identify and group similar topics within SPAM emails.\n- *Semantic Distance Calculation*: Measure the semantic distance between topics to show how closely related the recurring SPAM themes are.\n\n__Tools and Libraries__\n- *Core Libraries*: pandas, numpy, and collections for data manipulation and analysis.\n- *Visualization*: matplotlib and seaborn for exploratory data analysis and visual representation of findings.\n- *Machine Learning and NLP*: Frameworks such as scikit-learn and NLTK for building and evaluating the SPAM classifier and topic models.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftoscdom%2Fspam_detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftoscdom%2Fspam_detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftoscdom%2Fspam_detection/lists"}