https://github.com/tata1661/shine-emnlp21
Codes for SHINE published in EMNLP 2021.
- Host: GitHub
- URL: https://github.com/tata1661/shine-emnlp21
- Owner: tata1661
- Created: 2021-09-10T12:13:18.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2022-07-01T04:29:38.000Z (almost 4 years ago)
- Last Synced: 2025-03-27T22:12:31.376Z (about 1 year ago)
- Topics: graph-neural-networks, graph-structure-learning, hierarchical-graph-neural-network, semi-supervised-learning, short-text-classification, text-classification
- Language: Python
- Homepage:
- Size: 46 MB
- Stars: 38
- Watchers: 3
- Forks: 5
- Open Issues: 5
Metadata Files:
- Readme: README.md
# SHINE-EMNLP21

This repository provides the source code for ["Hierarchical Heterogeneous Graph Representation Learning for Short Text Classification"](https://aclanthology.org/2021.emnlp-main.247/), published in *EMNLP 2021* as a long paper.
Please cite our paper if you find it helpful. Thanks.
```
@inproceedings{wang-etal-2021-hierarchical,
title = "Hierarchical Heterogeneous Graph Representation Learning for Short Text Classification",
author = "Wang, Yaqing and
Wang, Song and
Yao, Quanming and
Dou, Dejing",
booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
month = nov,
year = "2021",
address = "Online and Punta Cana, Dominican Republic",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.emnlp-main.247",
doi = "10.18653/v1/2021.emnlp-main.247",
pages = "3091--3101",
abstract = "Short text classification is a fundamental task in natural language processing. It is hard due to the lack of context information and labeled data in practice. In this paper, we propose a new method called SHINE, which is based on graph neural network (GNN), for short text classification. First, we model the short text dataset as a hierarchical heterogeneous graph consisting of word-level component graphs which introduce more semantic and syntactic information. Then, we dynamically learn a short document graph that facilitates effective label propagation among similar short texts. Thus, comparing with existing GNN-based methods, SHINE can better exploit interactions between nodes of the same types and capture similarities between short texts. Extensive experiments on various benchmark short text datasets show that SHINE consistently outperforms state-of-the-art methods, especially with fewer labels.",
}
```
## Environment
We provide both the **PyTorch** and **PaddlePaddle** implementations of SHINE in this repository:
### Torch Version:
- Python 3.7
- PyTorch 1.2
### Paddle Version:
- Python 3.7
- PaddlePaddle 2.2
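
To confirm which of the two frameworks is available before running anything, a small check like the following may help. It is only a sketch and not part of this repository: the helper names `installed_version` and `report` are made up for illustration, and it assumes the frameworks are importable as `torch` and `paddle`.

```python
import sys
from importlib import import_module


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def report():
    """Collect the interpreter and framework versions relevant to this README."""
    return {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": installed_version("torch"),    # PyTorch import name
        "paddle": installed_version("paddle"),  # PaddlePaddle import name
    }


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```

Compare the printed versions with the ones listed above; only the framework for the version you intend to run needs to be installed.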
## Quick Start
If you want to run the Torch version:
```
cd SHINE-Torch
```
Or if you want to run the PaddlePaddle version:
```
cd SHINE-Paddle
```
Then, you can quickly check how SHINE operates on the Twitter dataset by:
```
python train.py
```
You can choose a specific dataset by:
```
python train.py --dataset snippets
```
Likewise, you can choose the specific GPU by:
```
python train.py --dataset snippets --gpu 2
```
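
For readers unfamiliar with this style of command line, flags such as `--dataset` and `--gpu` are typically parsed along the lines below. This is a hypothetical sketch using the standard `argparse` module, not the actual contents of `train.py`: the default values (`"twitter"` and `0`) and the helper `device_string` are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of SHINE-style training flags")
    # Assumed default: the README says a bare run uses the Twitter dataset.
    parser.add_argument("--dataset", type=str, default="twitter",
                        help="name of the dataset to train on")
    # Assumed default: first GPU.
    parser.add_argument("--gpu", type=int, default=0,
                        help="index of the GPU to use")
    return parser


def device_string(gpu_index):
    """Format a GPU index as a device string such as 'cuda:2'."""
    return f"cuda:{gpu_index}"


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"dataset={args.dataset} device={device_string(args.gpu)}")
```

Under these assumptions, `--dataset snippets --gpu 2` would select the snippets dataset and the device `cuda:2`.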
## Use Your Own Datasets
If you want to try SHINE on your own datasets, you need to format your data in the same form as "snippets_split.json".
The pretrained NELL entity embeddings and GloVe 6B word embeddings used in SHINE can be downloaded from [here](https://drive.google.com/file/d/1gzIsN6XVqEXPJQR8MXVolbmKqlPgU_YA/view?usp=sharing).
Afterwards, you can preprocess the data by:
```
python preprocess.py
```