https://github.com/taylor-eos/bert-classifier

Text classification experiment using language model

Host: GitHub
URL: https://github.com/taylor-eos/bert-classifier
Owner: Taylor-eOS
Created: 2024-09-22T18:25:01.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2024-10-29T09:10:08.000Z (over 1 year ago)
Last Synced: 2025-01-14T18:41:52.468Z (over 1 year ago)
Language: Python
Homepage:
Size: 26.4 KB
Stars: 0
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

This was a learning project that uses DistilBERT to classify the blocks of text in a PDF by type (header, body, footer, quote). The script used to predict with an accuracy in the vicinity of 97%, but then I added neighboring blocks as context, and now the model doesn't know what to focus on and predictions are very bad. I made a [manual tool](https://github.com/Taylor-eOS/manual-classifier) instead, which is more robust. Machine learning is not the right tool for a task where you have limited ground truth and need accurate results. But the project runs, and it might be an interesting application of the technology to learn from.
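
To make the "neighboring blocks as context" idea concrete, here is a minimal sketch of one way a target block could be joined with its neighbors before tokenization. It is not the repository's code: the helper `build_context_input`, the `window` parameter, and the use of `[SEP]` as a plain-text separator are all assumptions for illustration, and the order of `LABELS` is arbitrary.

```python
from typing import List

# Label set taken from the README; the order here is an assumption.
LABELS = ["header", "body", "footer", "quote"]


def build_context_input(blocks: List[str], index: int, window: int = 1,
                        sep: str = " [SEP] ") -> str:
    """Join the target block with up to `window` neighbors on each side.

    Hypothetical helper: the repository's real input format is not
    described in the README.
    """
    if not 0 <= index < len(blocks):
        raise IndexError("index out of range")
    # Clamp the window to the document boundaries.
    start = max(0, index - window)
    end = min(len(blocks), index + window + 1)
    return sep.join(blocks[start:end])


blocks = ["Chapter 1", "It was a quiet morning.", "12"]
print(build_context_input(blocks, 1))            # target with both neighbors
print(build_context_input(blocks, 0))            # first block: no left neighbor
print(build_context_input(blocks, 1, window=0))  # no context at all
```

In a layout like this the target block is not marked, and its position shifts at the edges of the document (first for block 0, in the middle otherwise). That ambiguity may be one reason a model struggles to tell which block it is supposed to label, though the README does not say how the context was actually encoded.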
