Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/freedomintelligence/textclassificationbenchmark
A Benchmark of Text Classification in PyTorch
- Host: GitHub
- URL: https://github.com/freedomintelligence/textclassificationbenchmark
- Owner: FreedomIntelligence
- License: mit
- Created: 2017-12-13T09:08:33.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-04-20T01:36:55.000Z (7 months ago)
- Last Synced: 2024-07-11T04:18:14.399Z (4 months ago)
- Topics: attention-is-all-you-need, benchmark, capusle, cnn, cnn-classification, crnn, lstm, lstm-sentiment-analysis, pytorch, quantum, rcnn, text-classification
- Language: Python
- Size: 1.76 MB
- Stars: 596
- Watchers: 32
- Forks: 138
- Open Issues: 20
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# Text Classification Benchmark
A Benchmark of Text Classification in PyTorch

## Motivation
We are building a benchmark for text classification that includes:
>Many text classification **datasets**, covering sentiment and topic classification in popular languages (e.g. English and Chinese). A basic word embedding is also provided.
>Implementations of many popular and state-of-the-art **models**, especially deep neural networks.
## Completed work
The following datasets and models are implemented so far.
### Datasets done
- IMDB
- SST
- TREC

### Models done
- FastText
- BasicCNN (KimCNN, MultiLayerCNN, Multi-perspective CNN)
- InceptionCNN
- LSTM (BiLSTM, StackLSTM)
- LSTM with Attention (Self Attention / Quantum Attention)
- Hybrids between CNN and RNN (RCNN, C-LSTM)
- Transformer - Attention is all you need
- ConvS2S
- Capsule
- Quantum-inspired NN

## Library
You should have [these libraries](docs/windows_torch_en.md) installed:
- python3
- torch
- torchtext (optional)

## Dataset
Datasets are configured automatically in the current path. Alternatively, download the data manually by following the step-by-step guide in [Dataset](docs/data_config_en.md). The data includes:
- GloVe embeddings
- The IMDB sentiment classification dataset

## Usage
Run with the default settings: `python main.py`

CNN: `python main.py --model cnn`

LSTM: `python main.py --model lstm`
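
The `--model` flag above selects which architecture to train. As a rough illustration, a flag like this could be parsed with `argparse` as sketched below. The function name `parse_opts`, the list of choices, and the default value are assumptions made for this example and are not taken from the repository's `opts.py`.

```python
import argparse

# Hypothetical subset of model names; the real list lives in the repository.
MODEL_CHOICES = ["cnn", "lstm", "fasttext"]


def parse_opts(argv=None):
    """Parse command-line options (illustrative, not the repository's opts.py)."""
    parser = argparse.ArgumentParser(description="Text classification benchmark")
    parser.add_argument(
        "--model",
        choices=MODEL_CHOICES,
        default="cnn",  # assumed default; the actual default may differ
        help="name of the model to train",
    )
    # Passing argv explicitly keeps the function easy to call from tests.
    return parser.parse_args(argv)


# Equivalent to running: python main.py --model lstm
opts = parse_opts(["--model", "lstm"])
print(f"Selected model: {opts.model}")
```

Restricting `--model` to a fixed set of `choices` makes a mistyped model name fail early with a clear error message instead of failing later during training.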
## Road Map
- [X] Data preprocessing framework
- [X] Model modules
- [ ] Loss, Estimator and hyper-parameter tuning.
- [ ] Test modules
- [ ] More datasets
- [ ] More models

## Organisation of the repository
The core of this repository is the models and datasets.

* ```dataloader/```: loads all datasets, such as ```IMDB``` and ```SST```
* ```models/```: creates all models, such as ```FastText```, ```LSTM```, ```CNN```, ```Capsule```, ```QuantumCNN```, ```Multi-Head Attention```
* ```opts.py```: Parameter and config info.
* ```utils.py```: tools.
* ```dataHelper```: data helper
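
One common way to connect a model name given on the command line to a class under `models/` is a small registry. The sketch below shows the idea. The names `MODEL_REGISTRY`, `register`, and `build_model` are hypothetical, and the two classes are placeholders rather than the repository's actual PyTorch modules.

```python
# Hypothetical registry mapping a lowercase model name to its class.
MODEL_REGISTRY = {}


def register(name):
    """Class decorator that records a model class under a lowercase name."""
    def decorator(cls):
        MODEL_REGISTRY[name.lower()] = cls
        return cls
    return decorator


@register("cnn")
class CNN:
    """Placeholder standing in for a real PyTorch module."""
    def __init__(self, opts):
        self.opts = opts


@register("lstm")
class LSTM:
    """Placeholder standing in for a real PyTorch module."""
    def __init__(self, opts):
        self.opts = opts


def build_model(name, opts=None):
    """Look up a model class by name (case-insensitive) and instantiate it."""
    try:
        cls = MODEL_REGISTRY[name.lower()]
    except KeyError:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model '{name}'. Known models: {known}") from None
    return cls(opts)


model = build_model("lstm")
print(type(model).__name__)
```

With this layout, adding a new model only requires defining the class and decorating it. The training script does not need to change.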
## Contributors
- [@Allenzhai](https://github.com/zhaizheng)
- [@JaredWei](https://github.com/jacobwei)
- [@AlexMeng](https://github.com/EdwardLorenz)
- [@Lilianwang](https://github.com/WangLilian)
- [@ZhanSu](https://github.com/shuishen112)
- [@Wabywang](https://github.com/Wabyking)

Issues and contributions are welcome!