https://github.com/voidful/tfkit
🤖📇 Handling multiple NLP tasks in one pipeline
multi-label-classification multi-task nlp tagger tagging text-classification text-generation text-processing transformer-models transformers
- Host: GitHub
- URL: https://github.com/voidful/tfkit
- Owner: voidful
- License: apache-2.0
- Created: 2019-12-21T10:58:39.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2025-06-30T01:52:17.000Z (7 months ago)
- Last Synced: 2025-08-30T18:35:07.023Z (5 months ago)
- Topics: multi-label-classification, multi-task, nlp, tagger, tagging, text-classification, text-generation, text-processing, transformer-models, transformers
- Language: Python
- Homepage: https://voidful.github.io/TFkit/
- Size: 15.9 MB
- Stars: 56
- Watchers: 5
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
## What is it
TFKit is a toolkit mainly for language generation.
It applies transformer models to many tasks, with different models, in one all-in-one framework.
Switching between tasks or models only takes a small change of config.
## Supported Tasks
With transformer models such as BERT, ALBERT, T5, BART, and others:
| | |
|-|-|
| Text Generation | :memo: seq2seq language model |
| Text Generation | :pen: causal language model |
| Text Generation | :printer: once generation model / once generation model with ctc loss |
| Text Generation | :pencil: onebyone generation model |
# Getting Started
Learn more from the [documentation](https://voidful.github.io/TFkit/).
## How To Use
### Step 0: Install
Install with pip from the `refactor-dataset` branch on GitHub:
```bash
pip install git+https://github.com/voidful/TFkit.git@refactor-dataset
```
### Step 1: Prepare a dataset in CSV format
[Task format](https://voidful.tech/TFkit/tasks/)
```
input, target
```
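As a rough illustration of the two-column layout above, the following Python sketch writes a small `training_data.csv`. The example rows and the `write_dataset` helper are hypothetical, and the sketch assumes no header row is needed; check the task format page for what each task actually expects.

```python
import csv

# Hypothetical example rows: (input text, target label).
rows = [
    ("I loved this movie", "positive"),
    ("The plot was dull", "negative"),
]


def write_dataset(path, rows):
    """Write (input, target) pairs as a two-column CSV file.

    Assumption: no header row is written; adjust if your task needs one.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerows(rows)


write_dataset("training_data.csv", rows)
```

Using the `csv` module rather than joining strings by hand keeps inputs that contain commas or quotes correctly escaped.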
### Step 2: Train model
```bash
tfkit-train \
--task clas \
--config xlm-roberta-base \
--train training_data.csv \
--test testing_data.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classifier
```
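If you launch training from a script, the same command can be assembled in Python. This is only a sketch: `build_train_command` is a hypothetical helper, not part of TFKit, and it uses only the flags and values shown in the command above.

```python
import shlex


def build_train_command(task, config, train, test, savedir,
                        lr="4e-5", maxlen=384, epoch=10):
    """Assemble a tfkit-train argument list (hypothetical helper).

    Only the flags shown in the README command are used.
    """
    return [
        "tfkit-train",
        "--task", task,
        "--config", config,
        "--train", train,
        "--test", test,
        "--lr", str(lr),
        "--maxlen", str(maxlen),
        "--epoch", str(epoch),
        "--savedir", savedir,
    ]


cmd = build_train_command(
    task="clas",
    config="xlm-roberta-base",
    train="training_data.csv",
    test="testing_data.csv",
    savedir="roberta_sentiment_classifier",
)

# Print the command for inspection; to actually run it, pass the list
# to subprocess.run(cmd, check=True).
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems when file names contain spaces.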
### Step 3: Evaluate
```bash
tfkit-eval \
--task roberta_sentiment_classifier/1.pt \
--metric clas \
--valid testing_data.csv
```
## Advanced features
Multi-task training: list several values after `--task`, `--train`, and `--test`.
```bash
tfkit-train \
--task clas clas \
--config xlm-roberta-base \
--train training_data_taskA.csv training_data_taskB.csv \
--test testing_data_taskA.csv testing_data_taskB.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classifier_multi_task
```
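The multi-task command lists two tasks, two training files, and two test files. The Python sketch below checks that these lists line up before building the command. It assumes positional pairing (the i-th task uses the i-th train and test file), which the example suggests but the README does not state; `build_multitask_command` is a hypothetical helper, not a TFKit function.

```python
def build_multitask_command(tasks, train_files, test_files, config, savedir,
                            lr="4e-5", maxlen=384, epoch=10):
    """Build a multi-task tfkit-train argument list (hypothetical helper).

    Assumption: the i-th task is paired with the i-th train and test file.
    """
    if not (len(tasks) == len(train_files) == len(test_files)):
        raise ValueError("need exactly one train file and one test file per task")
    return [
        "tfkit-train",
        "--task", *tasks,
        "--config", config,
        "--train", *train_files,
        "--test", *test_files,
        "--lr", str(lr),
        "--maxlen", str(maxlen),
        "--epoch", str(epoch),
        "--savedir", savedir,
    ]


multi_cmd = build_multitask_command(
    tasks=["clas", "clas"],
    train_files=["training_data_taskA.csv", "training_data_taskB.csv"],
    test_files=["testing_data_taskA.csv", "testing_data_taskB.csv"],
    config="xlm-roberta-base",
    savedir="roberta_sentiment_classifier_multi_task",
)
```

Failing early on mismatched list lengths is cheaper than discovering the problem after a model has started loading.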
## Unmaintained Tasks
Due to time constraints, the following tasks are temporarily not supported:
| | |
|-|-|
| Classification | :label: multi-class and multi-label classification |
| Question Answering | :page_with_curl: extractive qa |
| Question Answering | :radio_button: multiple-choice qa |
| Tagging | :eye_speech_bubble: sequence level tagging / sequence level with crf |
| Self-supervised Learning | :diving_mask: masked language model |
## Supplement
- [transformers models list](https://huggingface.co/models): find pretrained models here.
- [nlprep](https://github.com/voidful/NLPrep): download and preprocess data in one line.
- [nlp2go](https://github.com/voidful/nlp2go): create a demo API as quickly as possible.
## Contributing
Thanks for your interest. There are many ways to contribute to this project. Get started [here](https://github.com/voidful/tfkit/blob/master/CONTRIBUTING.md).
## License 
* [License](https://github.com/voidful/tfkit/blob/master/LICENSE)
## Icons reference
Icons modified from Freepik, from www.flaticon.com
Icons modified from Nikita Golubev, from www.flaticon.com