https://github.com/freedomintelligence/textclassificationbenchmark

A Benchmark of Text Classification in PyTorch
Host: GitHub
URL: https://github.com/freedomintelligence/textclassificationbenchmark
Owner: FreedomIntelligence
License: mit
Created: 2017-12-13T09:08:33.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2024-04-20T01:36:55.000Z (about 1 year ago)
Last Synced: 2025-04-13T00:49:13.192Z (3 months ago)
Topics: attention-is-all-you-need, benchmark, capusle, cnn, cnn-classification, crnn, lstm, lstm-sentiment-analysis, pytorch, quantum, rcnn, text-classification
Language: Python
Size: 1.76 MB
Stars: 603
Watchers: 32
Forks: 137
Open Issues: 21
Metadata Files:
- Readme: README.md
- License: LICENSE.txt

# Text Classification Benchmark

A Benchmark of Text Classification in PyTorch

## Motivation

We are building a benchmark for text classification that includes:

> Many text classification **datasets**, covering sentiment and topic classification in popular languages (e.g. English and Chinese). A basic word embedding is also provided.

> Implementations of many popular and state-of-the-art **models**, especially deep neural networks.

## Done so far

Some datasets and models are already implemented.

### Datasets done

- IMDB

- SST

- TREC

### Models done

- FastText

- BasicCNN (KimCNN, MultiLayerCNN, Multi-perspective CNN)

- InceptionCNN

- LSTM (BiLSTM, StackLSTM)

- LSTM with Attention (Self Attention / Quantum Attention)

- Hybrids between CNN and RNN (RCNN, C-LSTM)

- Transformer ("Attention Is All You Need")

- ConvS2S

- Capsule

- Quantum-inspired NN
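
As a rough illustration of the simplest entry in this list, the sketch below shows a FastText-style classifier in PyTorch: it averages the word embeddings of each text and feeds the result to a single linear layer. It is not the repository's implementation; the class name `FastTextClassifier` and all sizes are assumptions made for the example, and padding tokens are not handled.

```python
import torch
import torch.nn as nn


class FastTextClassifier(nn.Module):
    """Average the word embeddings of a text, then apply one linear layer."""

    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        # EmbeddingBag with mode="mean" looks up and averages in one step.
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, mode="mean")
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: LongTensor of shape (batch, seq_len), one bag per row
        return self.fc(self.embedding(token_ids))


# Illustrative sizes only (assumptions, not the benchmark's settings).
model = FastTextClassifier(vocab_size=100, embed_dim=16, num_classes=2)
batch = torch.randint(0, 100, (4, 12))  # 4 texts of 12 token ids each
logits = model(batch)
print(logits.shape)  # torch.Size([4, 2])
```

The output holds one row of class logits per input text, which can be passed directly to `nn.CrossEntropyLoss` during training.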

## Library

You should have [these libraries](docs/windows_torch_en.md) installed:

- python3

- torch

- torchtext (optional)



## Dataset

Datasets are configured automatically in the current path. Alternatively, download your data manually by following the step-by-step guide in [Dataset](docs/data_config_en.md).

The data includes:

- GloVe embedding

- Sentiment classification dataset IMDB

## Usage

Run with the default settings:

`python main.py`

CNN:

`python main.py --model cnn`

LSTM:

`python main.py --model lstm`
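
To compare several models in one go, the commands above can be driven from a small script. The sketch below only uses the `--model` values documented here (`cnn` and `lstm`); the helper names `build_command` and `run_all` are hypothetical, and the script defaults to a dry run that just lists the commands.

```python
import subprocess
import sys


def build_command(model=None, script="main.py"):
    """Return the argv list for one run; model=None means the default setting."""
    cmd = [sys.executable, script]
    if model is not None:
        cmd += ["--model", model]
    return cmd


def run_all(models, dry_run=True):
    """Run (or, with dry_run, only collect) one command per model."""
    commands = [build_command(m) for m in models]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


# Only the `cnn` and `lstm` values are documented in this README.
for cmd in run_all([None, "cnn", "lstm"]):
    print(" ".join(cmd[1:]))
```

Calling `run_all([...], dry_run=False)` from the repository root would execute the runs one after another.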


## Road Map

- [X] Data preprocessing framework

- [X] Model modules

- [ ] Loss, estimator, and hyper-parameter tuning

- [ ] Test modules

- [ ] More datasets

- [ ] More models

## Organisation of the repository

The core of this repository is its models and datasets.

* ```dataloader/```: loads all datasets, such as ```IMDB``` and ```SST```

* ```models/```: creates all models, such as ```FastText```, ```LSTM```, ```CNN```, ```Capsule```, ```QuantumCNN```, and ```Multi-Head Attention```

* ```opts.py```: parameters and configuration

* ```utils.py```: utility functions

* ```dataHelper```: data helper functions
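
The contents of `opts.py` are not shown here. As a hedged illustration of what a parameter and config module for this kind of command line might look like, here is a sketch built on `argparse`. Only `--model` appears in this README; its default value and the `--dataset`, `--batch_size`, and `--learning_rate` options are assumptions.

```python
import argparse


def parse_opts(argv=None):
    """Parse command-line options; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser(
        description="Text classification benchmark (illustrative sketch)"
    )
    # Documented flag; the default value here is an assumption.
    parser.add_argument("--model", default="cnn", help="model name, e.g. cnn or lstm")
    # Hypothetical options, not taken from the repository:
    parser.add_argument("--dataset", default="imdb")
    parser.add_argument("--batch_size", type=int, default=64)
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    return parser.parse_args(argv)


opt = parse_opts(["--model", "lstm", "--batch_size", "32"])
print(opt.model, opt.dataset, opt.batch_size)  # lstm imdb 32
```

Keeping every option in one namespace object makes it easy to pass the same configuration to the data loader and the model constructor.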

## Contributors

-	[@Allenzhai](https://github.com/zhaizheng)

-	[@JaredWei](https://github.com/jacobwei)

-	[@AlexMeng](https://github.com/EdwardLorenz)

-	[@Lilianwang](https://github.com/WangLilian)

-	[@ZhanSu](https://github.com/shuishen112)

-	[@Wabywang](https://github.com/Wabyking)

Issues and contributions are welcome!
