{"id":15157797,"url":"https://github.com/amiriiw/text_classification","last_synced_at":"2026-01-20T18:56:14.525Z","repository":{"id":252891257,"uuid":"841811391","full_name":"amiriiw/text_classification","owner":"amiriiw","description":"Welcome to the Text Classification Project! This project is designed to train a model for classifying texts based on their emotional content and then using it to categorize new texts into corresponding emotional categories.","archived":false,"fork":false,"pushed_at":"2024-10-28T07:34:39.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T17:18:09.007Z","etag":null,"topics":["keras","numpy","pandas","pickle","scikit-learn","tensorflow","text-classification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/amiriiw.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-13T05:03:12.000Z","updated_at":"2024-10-28T07:34:42.000Z","dependencies_parsed_at":null,"dependency_job_id":"892bfb79-33ab-4756-ac5f-71826d106fbd","html_url":"https://github.com/amiriiw/text_classification","commit_stats":{"total_commits":3,"total_committers":1,"mean_commits":3.0,"dds":0.0,"last_synced_commit":"54ae6edc45d66cbb376ec93bf71793e8134a919b"},"previous_names":["amiriiw/text_classification"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amiriiw%2Ftext_classification","tags_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amiriiw%2Ftext_classification/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amiriiw%2Ftext_classification/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/amiriiw%2Ftext_classification/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/amiriiw","download_url":"https://codeload.github.com/amiriiw/text_classification/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247675633,"owners_count":20977376,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["keras","numpy","pandas","pickle","scikit-learn","tensorflow","text-classification"],"created_at":"2024-09-26T20:03:44.555Z","updated_at":"2025-04-07T14:46:49.344Z","avatar_url":"https://github.com/amiriiw.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Sentiment Detection using LSTM and PostgreSQL\n\nThis project includes two main scripts for detecting sentiment (positive or negative) in text: `train.py` for training the model and `detect.py` for predicting the sentiment in new text inputs, then saving the results to a PostgreSQL database.\n\n## Features\n- **Sentiment Classification**: Classifies text into positive or negative sentiment using an LSTM model.\n- **Database Integration**: Stores detected sentiments in a PostgreSQL database for record-keeping.\n- **Tokenization**: Uses tokenizers to prepare text data for model input.\n\n## Libraries Used\n- 
[TensorFlow](https://www.tensorflow.org/) for deep learning model development and training.\n- [Pandas](https://pandas.pydata.org/) for data loading and preprocessing.\n- [scikit-learn](https://scikit-learn.org/) for data splitting into training and testing sets.\n- [Psycopg2](https://www.psycopg.org/) for database interaction with PostgreSQL.\n- [NumPy](https://numpy.org/) for numerical operations and data handling.\n\n## Introduction\n\n### File: `train.py`\nThis file is responsible for training an LSTM model to detect sentiment in text. Key classes and functions include:\n\n- **Class ModelTrainer**: Manages tokenizer setup, data preparation, model building, and training.\n  - `__init__(self, vocab_size, max_length, embedding_dim)`: Initializes model parameters.\n  - `load_data(self, file_path, test_size)`: Loads data from a CSV file and splits it for training and testing.\n  - `build_model(self)`: Builds and configures the LSTM model.\n  - `train_model(self, train_data, train_labels, test_data, test_labels, epochs, batch_size)`: Trains the model with input data.\n  - `save_model(self, file_path)`: Saves the trained model to the specified path.\n  - `save_tokenizers(self, tokenizer_path, label_tokenizer_path)`: Saves the text and label tokenizers.\n\n### File: `detect.py`\nThis file is used for detecting sentiment in new text inputs and saving the results to a PostgreSQL database. 
Key classes and functions include:\n\n- **Class EmotionClassifier**: Manages the model, tokenizer, database connection, and prediction functions.\n  - `__init__(self, model_path, tokenizer_path, label_tokenizer_path, db_params)`: Initializes the model, tokenizers, and database connection.\n  - `connect_db(self)`: Connects to a PostgreSQL database.\n  - `create_table(self)`: Creates a table for storing sentiment results if it doesn’t already exist.\n  - `predict_emotion(self, sentence)`: Predicts the sentiment of a sentence.\n  - `classify_sentences(self, input_file)`: Classifies sentences from a file and saves them to the database.\n  - `close_db(self)`: Closes the database connection.\n\n## Usage\n\n### Training the Model\n1. Ensure there is a `dataset.csv` file with `text` and `label` columns in the project directory. Labels should be \"positive\" or \"negative.\"\n2. Run `train.py`:\n\n    ```bash\n    python3 train.py\n    ```\n\n3. This will train the model and save the trained model and tokenizers to the current directory.\n\n### Detecting Sentiment\n1. Ensure PostgreSQL is set up and connection parameters are configured in `detect.py`.\n2. Run `detect.py`:\n\n    ```bash\n    python3 detect.py\n    ```\n\n3. This script will classify the sentences in `text.txt` and save the sentiment results to the database.\n\n## Installation\n1. Clone this repository:\n\n    ```bash\n    git clone https://github.com/amiriiw/text_classification\n    cd text_classification\n    cd Text_classification\n    ```\n\n2. Install the required packages:\n\n    ```bash\n    pip3 install -r requirements.txt\n    ```\n\n3. Ensure PostgreSQL is installed, and create a database for this project.\n\n4. Download the dataset via this link: [Drive](https://drive.google.com/drive/folders/1wazzbRMNZFaLdFZCWYyLhKHUBzfc4adV?usp=sharing)\n\n\n## License\nThis project is licensed under the MIT License. 
See the [LICENSE](LICENSE) file for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famiriiw%2Ftext_classification","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Famiriiw%2Ftext_classification","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Famiriiw%2Ftext_classification/lists"}