https://github.com/jhj0517/document_classification

finetune text classification model
ai deep-learning document-classification open-source text-classification

Last synced: about 1 year ago
Host: GitHub
URL: https://github.com/jhj0517/document_classification
Owner: jhj0517
Created: 2023-12-07T12:59:18.000Z (over 2 years ago)
Default Branch: master
Last Pushed: 2023-12-10T10:04:34.000Z (over 2 years ago)
Last Synced: 2025-02-12T10:18:22.749Z (over 1 year ago)
Topics: ai, deep-learning, document-classification, open-source, text-classification
Language: Jupyter Notebook
Homepage:
Size: 4.9 MB
Stars: 2
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# Document Classification
This repository is for fine-tuning text classification models.
It primarily focuses on fine-tuning the pre-trained BERT model, utilizing the [ratsnlp](https://github.com/ratsgo/ratsnlp) package.

# Notebook
To try it in Google Colab, open the [notebook](https://colab.research.google.com/github/jhj0517/document_classification/blob/master/notebook/document_classification.ipynb).

# Dataset
Prepare a dataset with two columns: `document`, which holds the text, and `label`, which holds its class. An example of the dataset format is as follows:

| label | document |
|----------|----------------|
| sadness | I'm so sad |
| happiness | I'm happy!! |

For a more detailed understanding, please refer to the [example dataset](https://github.com/jhj0517/document_classification/tree/master/example_data).
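As a quick sanity check before fine-tuning, the two-column format can be validated with a few lines of Python. This is only a minimal sketch: it assumes the data is stored as comma-separated text with a header row, which may not match the files in `example_data`, and `load_rows` is a hypothetical helper that is not part of this repository or of ratsnlp.

```python
import csv
import io
from collections import Counter

# Column names taken from the dataset format described in this README.
REQUIRED_COLUMNS = ("label", "document")


def load_rows(csv_text):
    """Parse CSV text and return a list of (label, document) pairs.

    Hypothetical helper for illustration; assumes comma-separated text
    with a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    rows = []
    for row in reader:
        label = (row["label"] or "").strip()
        document = (row["document"] or "").strip()
        if label and document:  # skip rows with an empty label or text
            rows.append((label, document))
    return rows


# The two example rows from the table in this README.
sample = "label,document\nsadness,I'm so sad\nhappiness,I'm happy!!\n"
rows = load_rows(sample)

# Count examples per label to spot missing or badly imbalanced classes.
label_counts = Counter(label for label, _ in rows)
print(label_counts)
```

Checking the per-label counts is useful because a class with very few examples is hard to learn during fine-tuning.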

This repository includes a very small sample dataset sourced from Kaggle, available here: [Kaggle Dataset](https://www.kaggle.com/datasets/pashupatigupta/emotion-detection-from-text)

Note: The sample in the repo is very small, so it is recommended to prepare a much larger dataset.
