https://github.com/voidful/tfkit

🤖📇 handling multiple nlp task in one pipeline

Host: GitHub
URL: https://github.com/voidful/tfkit
Owner: voidful
License: apache-2.0
Created: 2019-12-21T10:58:39.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2025-06-30T01:52:17.000Z (about 1 year ago)
Last Synced: 2025-08-30T18:35:07.023Z (10 months ago)
Topics: multi-label-classification, multi-task, nlp, tagger, tagging, text-classification, text-generation, text-processing, transformer-models, transformers
Language: Python
Homepage: https://voidful.github.io/TFkit/
Size: 15.9 MB
Stars: 56
Watchers: 5
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

## What is it
TFKit is a toolkit mainly for language generation.
It applies transformer models to many tasks, with different models, in one all-in-one framework.
Switching tasks or models takes only a small change to the config.

# Getting Started
Learn more from the [documentation](https://voidful.github.io/TFkit/).

## How To Use

### Step 0: Install
Install directly from GitHub (the `refactor-dataset` branch):
```bash
pip install git+https://github.com/voidful/TFkit.git@refactor-dataset
```

### Step 1: Prepare a dataset in CSV format
[Task format](https://voidful.tech/TFkit/tasks/)
```
input, target
```
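As a rough illustration of the two-column layout above, the following sketch writes a small dataset with Python's standard `csv` module. The sentiment sentences and labels are hypothetical, and the sketch assumes one `input, target` pair per row with no header row; check the task format page for the exact layout each task expects.

```python
import csv
import io

# Hypothetical sentiment examples: (input text, target label).
rows = [
    ("The movie was wonderful, I loved it", "positive"),
    ("Terrible service, and the food was cold", "negative"),
]


def write_dataset(rows, fileobj):
    """Write (input, target) pairs as CSV rows.

    csv.writer quotes any field containing a comma, so the input text
    stays in a single column.
    """
    writer = csv.writer(fileobj)
    for text, label in rows:
        writer.writerow([text, label])


# Written to an in-memory buffer here so the sketch has no side effects.
buffer = io.StringIO()
write_dataset(rows, buffer)
print(buffer.getvalue())
```

To save a real file, pass `open("training_data.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer.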

### Step 2: Train model
```bash
tfkit-train \
--task clas \
--config xlm-roberta-base \
--train training_data.csv \
--test testing_data.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classifier
```

### Step 3: Evaluate
```bash
tfkit-eval \
--task roberta_sentiment_classifier/1.pt \
--metric clas \
--valid testing_data.csv
```

## Advanced features

### Multi-task training

```bash
tfkit-train \
--task clas clas \
--config xlm-roberta-base \
--train training_data_taskA.csv training_data_taskB.csv \
--test testing_data_taskA.csv testing_data_taskB.csv \
--lr 4e-5 \
--maxlen 384 \
--epoch 10 \
--savedir roberta_sentiment_classifier_multi_task
```
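The multi-task command differs from the single-task one only in passing several values to `--task`, `--train`, and `--test`. The sketch below assembles that argument list from Python so it can be generated per experiment. The helper `build_train_command` is hypothetical and not part of TFKit; it uses only the flags shown above and assumes one training file and one test file per task, as in the example.

```python
import shlex


def build_train_command(tasks, train_files, test_files, config, savedir,
                        lr="4e-5", maxlen=384, epoch=10):
    """Assemble a tfkit-train argument list (hypothetical helper).

    Assumption: each task is paired with exactly one train file and one
    test file, in the same order, as in the README example.
    """
    if not (len(tasks) == len(train_files) == len(test_files)):
        raise ValueError("expected one train file and one test file per task")
    return [
        "tfkit-train",
        "--task", *tasks,
        "--config", config,
        "--train", *train_files,
        "--test", *test_files,
        "--lr", str(lr),
        "--maxlen", str(maxlen),
        "--epoch", str(epoch),
        "--savedir", savedir,
    ]


command = build_train_command(
    tasks=["clas", "clas"],
    train_files=["training_data_taskA.csv", "training_data_taskB.csv"],
    test_files=["testing_data_taskA.csv", "testing_data_taskB.csv"],
    config="xlm-roberta-base",
    savedir="roberta_sentiment_classifier_multi_task",
)

# Print the command as a single shell-safe line for inspection.
print(shlex.join(command))
```

With TFKit installed, the list can be passed to `subprocess.run(command, check=True)` to start training.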

## Supplement
- [transformers models list](https://huggingface.co/models): browse the available pretrained models here.
- [nlprep](https://github.com/voidful/NLPrep): download and preprocess data in one line.
- [nlp2go](https://github.com/voidful/nlp2go): create a demo API as quickly as possible.

## Contributing
Thanks for your interest. There are many ways to contribute to this project. Get started [here](https://github.com/voidful/tfkit/blob/master/CONTRIBUTING.md).

## License ![PyPI - License](https://img.shields.io/github/license/voidful/tfkit)

* [License](https://github.com/voidful/tfkit/blob/master/LICENSE)

## Icons reference
Icons modified from Freepik, from www.flaticon.com
Icons modified from Nikita Golubev, from www.flaticon.com
