{"id":28923958,"url":"https://github.com/sabin74/spam_mail_detection","last_synced_at":"2026-05-04T12:33:11.732Z","repository":{"id":298742109,"uuid":"1000964832","full_name":"sabin74/spam_mail_detection","owner":"sabin74","description":"A machine learning project to classify SMS messages as Spam or Ham (Not Spam) using Natural Language Processing (NLP) techniques and Scikit-learn. This binary classification task uses the UCI SMS Spam Collection Dataset and implements various models including Naive Bayes, SVM, and Logistic Regression with performance tuning.","archived":false,"fork":false,"pushed_at":"2025-06-12T15:50:04.000Z","size":403,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-22T10:02:42.927Z","etag":null,"topics":["gridsearchcv","nltk","python","scikit-learn","smote","sms-spam-detection","uci-machine-learning"],"latest_commit_sha":null,"homepage":"https://archive.ics.uci.edu/dataset/228/sms+spam+collection","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sabin74.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-12T15:39:22.000Z","updated_at":"2025-06-12T15:50:45.000Z","dependencies_parsed_at":"2025-06-12T16:49:56.561Z","dependency_job_id":"d8ffb078-baab-45b3-8f09-62d63fce3e02","html_url":"https://github.com/sabin74/spam_mail_detection","commit_stats":null,"previous_names":["sabin74/spam_mail_detection"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sabin74/spam_mail_detection","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sabin74%2Fspam_mail_detection","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sabin74%2Fspam_mail_detection/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sabin74%2Fspam_mail_detection/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sabin74%2Fspam_mail_detection/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sabin74","download_url":"https://codeload.github.com/sabin74/spam_mail_detection/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sabin74%2Fspam_mail_detection/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32607526,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["gridsearchcv","nltk","python","scikit-learn","smote","sms-spam-detection","uci-machine-learning"],"created_at":"2025-06-22T10:02:26.708Z","updated_at":"2026-05-04T12:33:11.727Z","avatar_url":"https://github.com/sabin74.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 📧 Spam Email Detection\n\nA machine learning project to classify SMS messages as **Spam** or **Ham** (Not Spam) using **Natural Language Processing (NLP)** techniques and **Scikit-learn**. This binary classification task uses the **UCI SMS Spam Collection Dataset** and implements various models including Naive Bayes, SVM, and Logistic Regression with performance tuning.\n\n---\n\n## 🚀 Features\n\n- Text preprocessing and cleaning\n- Feature extraction using TF-IDF with n-grams\n- Handling imbalanced classes using **SMOTE**\n- Hyperparameter tuning with **GridSearchCV**\n- Model comparison: Naive Bayes, SVM, Logistic Regression\n- Save \u0026 load trained model and vectorizer\n- Predict new SMS messages\n\n---\n\n## 🛠️ Tools \u0026 Libraries\n\n- Python\n- Pandas, NumPy\n- Scikit-learn\n- NLTK (for stopword removal)\n- Imbalanced-learn (for SMOTE)\n- Matplotlib, Seaborn (for visualization)\n- Joblib (for model persistence)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsabin74%2Fspam_mail_detection","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsabin74%2Fspam_mail_detection","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsabin74%2Fspam_mail_detection/lists"}