https://github.com/wakafengfan/simcse-pytorch
PyTorch implementation of the unsupervised SimCSE semantic similarity model
- Host: GitHub
- URL: https://github.com/wakafengfan/simcse-pytorch
- Owner: wakafengfan
- License: mit
- Created: 2021-04-28T08:57:03.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-13T04:20:43.000Z (about 5 years ago)
- Last Synced: 2024-11-16T07:33:14.825Z (over 1 year ago)
- Language: Python
- Size: 13.7 KB
- Stars: 22
- Watchers: 2
- Forks: 3
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - wakafengfan/simcse-pytorch
README
# simcse-pytorch
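SimCSE is an unsupervised semantic representation model that has recently attracted wide attention. As before, this is a PyTorch port adapted from Su Jianlin's Keras version.
This repository is a placeholder for now: more experiments will be added later, along with results for Danqi Chen's PyTorch version on Chinese data.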
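
For readers new to SimCSE, here is a minimal sketch of the unsupervised objective as described in the SimCSE paper: each sentence is encoded twice with dropout active, the two outputs form a positive pair, and the other sentences in the batch serve as negatives. It is not this repository's training code. The function name `simcse_unsup_loss`, the toy `encoder`, and the temperature of 0.05 are assumptions made for illustration only.

```python
import torch
import torch.nn.functional as F


def simcse_unsup_loss(z1: torch.Tensor, z2: torch.Tensor,
                      temperature: float = 0.05) -> torch.Tensor:
    """Contrastive (InfoNCE) loss with in-batch negatives.

    z1, z2: (batch, dim) embeddings of the same sentences, obtained from
    two forward passes with dropout enabled.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    # Cosine similarity of every z1[i] with every z2[j], scaled by temperature.
    sim = z1 @ z2.t() / temperature
    # The positive for row i is column i.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)


# Toy stand-in for a real encoder such as roberta-wwm (hypothetical).
torch.manual_seed(0)
encoder = torch.nn.Sequential(
    torch.nn.Linear(16, 32),
    torch.nn.Dropout(p=0.1),
    torch.nn.Linear(32, 8),
)
encoder.train()  # keep dropout active so the two passes differ

x = torch.randn(4, 16)                            # fake features for 4 sentences
loss = simcse_unsup_loss(encoder(x), encoder(x))  # two independent dropout masks
```

With a real model, `x` would be tokenized sentences and the encoder output would be pooled into one vector per sentence before the loss is computed.
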
So far only `roberta-wwm` has been tested, with unsupervised training on LCQMC; the evaluation metric is Spearman correlation.
| Model                      | Settings                                                       | Spearman correlation |
| -------------------------- | -------------------------------------------------------------- | -------------------- |
| `roberta-wwm`              | dropout_rate=0.1, learning_rate=1e-5, pooling: first-last-avg  | 0.67029              |
| `roberta-wwm(no training)` | pooling: first-last-avg                                        | 0.60377              |
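
The scores above are Spearman correlations between predicted sentence-pair similarities and the gold labels. A minimal, dependency-free sketch of that evaluation is shown below, assuming cosine similarity between the two sentence embeddings is used as the prediction. The helper names and the toy embeddings are made up for illustration and are not taken from this repository. LCQMC labels are binary, so ties are given average ranks.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Toy sentence-pair embeddings and binary labels (made up for illustration).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([0.5, 0.5], [0.6, 0.4]),
    ([1.0, 0.0], [-1.0, 0.2]),
]
labels = [1, 0, 1, 0]

scores = [cosine(a, b) for a, b in pairs]
rho = spearman(scores, labels)
```

In practice a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written version; the explicit form is only meant to show what the metric measures.
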
### Environment
- python==3.6.*
- pytorch==1.8
- transformers==4.4.2
### References
- SimCSE: Simple Contrastive Learning of Sentence Embeddings: https://arxiv.org/pdf/2104.08821.pdf
- https://kexue.fm/archives/8348