{"id":17990270,"url":"https://github.com/doragd/text-classification-pytorch","last_synced_at":"2025-04-23T15:54:31.897Z","repository":{"id":37630804,"uuid":"202774986","full_name":"Doragd/Text-Classification-PyTorch","owner":"Doragd","description":"Implementation of papers for text classification task on SST-1/SST-2","archived":false,"fork":false,"pushed_at":"2024-07-25T10:16:14.000Z","size":19,"stargazers_count":63,"open_issues_count":1,"forks_count":10,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-30T01:12:15.947Z","etag":null,"topics":["bilstm-attention","nlp","sentiment-classification","text-classification","textcnn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Doragd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-16T17:51:49.000Z","updated_at":"2025-03-21T14:16:09.000Z","dependencies_parsed_at":"2022-08-18T03:05:25.163Z","dependency_job_id":null,"html_url":"https://github.com/Doragd/Text-Classification-PyTorch","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Doragd%2FText-Classification-PyTorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Doragd%2FText-Classification-PyTorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Doragd%2FText-Classification-PyTorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Doragd%2FText-Classification-PyTorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/owners/Doragd","download_url":"https://codeload.github.com/Doragd/Text-Classification-PyTorch/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250467759,"owners_count":21435445,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bilstm-attention","nlp","sentiment-classification","text-classification","textcnn"],"created_at":"2024-10-29T19:17:15.348Z","updated_at":"2025-04-23T15:54:31.865Z","avatar_url":"https://github.com/Doragd.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Text-Classification-PyTorch :whale2:\n\nHere is a new boy :bow: who wants to become an NLPer, and this is his repository for text classification. Besides TextCNN and TextAttnBiLSTM, more models will be added in the near future. 
\n\nThanks for your Star:star:, Fork and Watch!\n\n## Dataset\n\n* [Stanford Sentiment Treebank (SST)](https://nlp.stanford.edu/sentiment/code.html)\n  * SST-1: 5 classes (fine-grained), SST-2: 2 classes (binary)\n* Preprocess\n  * Map sentiment values to labels\n  * Remove tokens consisting of all non-alphanumeric characters, such as `...`\n\n## Pre-trained Word Vectors\n\n* [Word2Vec](https://code.google.com/archive/p/word2vec/) : `GoogleNews-vectors-negative300.bin`\n* [GloVe](https://nlp.stanford.edu/projects/glove/) : `glove.840B.300d.txt`\n  * Because *GloVe* has a lower OOV rate than *Word2Vec* and also performs better in our experiments, we use *GloVe* as the pre-trained word vectors.\n  * Options for word vectors in other formats are still preserved in the code.\n\n## Model\n\n* TextCNN\n  \n  * Paper: [Convolutional Neural Networks for Sentence Classification](https://www.aclweb.org/anthology/D14-1181)\n  * See: `models/TextCNN.py`\n  \n  ![](https://ws1.sinaimg.cn/large/72cf269fly1g6229o5a47j20m609c74t.jpg)\n  \n* TextAttnBiLSTM\n  \n  * Paper: [Attention-Based Bidirection LSTM for Text Classification](https://www.aclweb.org/anthology/P16-2034)\n  * See: `models/TextAttnBiLSTM.py`\n\n![](https://ws1.sinaimg.cn/large/72cf269fly1g622af7rxij20la0axq3g.jpg)\n\n## Result\n\n* Baseline from the paper\n\n| model            | SST-1    | SST-2    |\n| ---------------- | -------- | -------- |\n| CNN-rand         | 45.0     | 82.7     |\n| CNN-static       | 45.5     | 86.8     |\n| CNN-non-static   | **48.0** | 87.2     |\n| CNN-multichannel | 47.4     | **88.1** |\n\n* Re-Implementation\n\n| model              | SST-1      | SST-2      |\n| ------------------ | ---------- | ---------- |\n| CNN-rand           | 34.841     | 74.500     |\n| CNN-static         | 45.056     | 84.125     |\n| CNN-non-static     | 46.974     | 85.886     |\n| CNN-multichannel   | 45.129     | **85.993** |\n| Attention + BiLSTM | 47.015     | 85.632     |\n| Attention + BiGRU  
| **47.854** | 85.102     |\n\n## Requirement\n\nPlease install the following library requirements first.\n\n```markdown\npandas==0.24.2\ntorch==1.1.0\nfire==0.1.3\nnumpy==1.16.2\ngensim==3.7.3\n```\n\n## Structure\n\n```text\n│  .gitignore\n│  config.py            # Global Configuration\n│  datasets.py          # Create Dataloader\n│  main.py \n│  preprocess.py\n│  README.md\n│  requirements.txt\n│  utils.py   \n│  \n├─checkpoints           # Save checkpoint and best model\n│      \n├─data                  # pretrained word vectors and datasets\n│  │  glove.6B.300d.txt\n│  │  GoogleNews-vectors-negative300.bin\n│  └─stanfordSentimentTreebank # datasets folder\n│          \n├─models\n│      TextAttnBiLSTM.py\n│      TextCNN.py\n│      __init__.py\n│      \n└─output_data           # Preprocessed data and vocabulary, etc.\n```\n\n## Usage\n\n* Set global configuration parameters in `config.py`\n\n* Preprocess the datasets \n\n```shell\n$ python preprocess.py\n```\n\n* Train\n\n```shell\n$ python main.py run\n```\n\nYou can override the parameters defined in `config.py` and in `models/TextCNN.py` or `models/TextAttnBiLSTM.py` on the command line.\n\n```shell\n$ python main.py run [--option=VALUE]\n```\n\nFor example,\n\n```shell\n$ python main.py run --status='train' --use_model=\"TextAttnBiLSTM\"\n```\n\n* Test\n\n```shell\n$ python main.py run --status='test' --best_model=\"checkpoints/BEST_checkpoint_SST-2_TextCNN.pth\"\n```\n\n## Conclusion\n\n* The `TextCNN` model extracts features with n-gram-like convolution kernels, while the `TextAttnBiLSTM` model uses a BiLSTM to capture semantics and long-term dependencies, combined with an attention mechanism for classification.\n* TextCNN parameter tuning:\n  * GloVe works better than Word2Vec\n  * Use a smaller batch size\n  * Add weight decay ($l_2$ constraint), learning rate decay, early stopping, etc.\n  * Do not set `padding_idx=0` in the embedding layer\n* TextAttnBiLSTM\n  * Apply dropout on the embedding layer, LSTM layer, and fully-connected 
layer\n\n## Acknowledgements\n\n* Motivated by https://github.com/TobiasLee/Text-Classification\n* Thanks to https://github.com/bigboNed3/chinese_text_cnn\n* Thanks to https://github.com/ShawnyXiao/TextClassification-Keras\n\n## References\n\n[1] [Convolutional Neural Networks for Sentence Classification](http://www.aclweb.org/anthology/D14-1181)\n\n[2] [A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification](https://arxiv.org/pdf/1510.03820)\n\n[3] [Attention-Based Bidirection LSTM for Text Classification](https://www.aclweb.org/anthology/P16-2034)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoragd%2Ftext-classification-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdoragd%2Ftext-classification-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdoragd%2Ftext-classification-pytorch/lists"}