https://github.com/taishan1994/pytorch_chinese_generate

Chinese text generation based on PyTorch.

Host: GitHub
URL: https://github.com/taishan1994/pytorch_chinese_generate
Owner: taishan1994
Created: 2022-07-07T07:56:25.000Z
Default Branch: main
Last Pushed: 2022-12-20T07:19:29.000Z
Last Synced: 2023-03-04T13:53:09.611Z
Language: Python
Size: 27.5 MB
Stars: 4
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# pytorch_Chinese_Generate
Chinese text generation based on PyTorch.

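Today's mainstream pretraining frameworks fall into three categories:
- Autoregressive models, typified by GPT, are essentially left-to-right language models and are commonly used for unconditional generation tasks.
- Autoencoding models are language encoders trained with a denoising objective (such as masked language modeling), for example BERT, ALBERT, and DeBERTa. They excel at natural language understanding tasks and are often used to produce contextual representations of sentences.
- Encoder-decoder models use a complete Transformer structure with one encoder and one decoder, typified by T5 and BART, and are commonly used for conditional generation tasks.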
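
One way to make the difference between the first two families concrete is the self-attention mask each one uses. The sketch below is plain Python for illustration only and is not taken from this repository; the function names `causal_mask` and `bidirectional_mask` are hypothetical. An encoder-decoder model typically combines both: a bidirectional mask in the encoder and a causal mask in the decoder, plus cross-attention from decoder to encoder.

```python
def causal_mask(n):
    """Autoregressive (GPT-style) mask.

    Entry [i][j] is 1 when position i may attend to position j,
    which is only allowed for j <= i (left-to-right).
    """
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n):
    """Autoencoding (BERT-style) mask: every position sees every position."""
    return [[1] * n for _ in range(n)]


if __name__ == "__main__":
    print("causal:")
    for row in causal_mask(4):
        print(row)
    print("bidirectional:")
    for row in bidirectional_mask(4):
        print(row)
```

The causal mask is lower triangular, which is what makes left-to-right generation possible; the bidirectional mask is all ones, which is why such models suit understanding tasks rather than generation.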

bert_unilm effectively adapts BERT into an encoder-decoder structure; it can, of course, also be adapted to the autoencoding approach.
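
UniLM-style models usually get encoder-decoder behavior out of a single BERT by changing only the attention mask: source tokens attend to each other bidirectionally, while target tokens attend to the whole source and to earlier target tokens. The following is a minimal sketch of that mask in plain Python, not this repository's implementation. It assumes segment ids of 0 for source tokens and 1 for target tokens, with the source placed before the target; the function name `unilm_seq2seq_mask` is hypothetical.

```python
from itertools import accumulate


def unilm_seq2seq_mask(segment_ids):
    """Build a UniLM-style seq2seq attention mask.

    segment_ids: list of 0 (source) / 1 (target), source first (assumed).
    Entry [i][j] is 1 when position i may attend to position j.
    """
    # The running sum is 0 across the source and increases by 1 at each
    # target token, so comparing sums encodes both visibility rules at once.
    cum = list(accumulate(segment_ids))
    n = len(cum)
    return [[1 if cum[j] <= cum[i] else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Two source tokens followed by two target tokens.
    for row in unilm_seq2seq_mask([0, 0, 1, 1]):
        print(row)
```

With this mask the source block behaves like an encoder and the target block like a decoder, so no separate decoder stack or cross-attention layer is needed.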

### References
> Theory: https://zhuanlan.zhihu.com/p/532851481
