https://github.com/bradley-butcher/topcon
TopCon^2 - Conformal Constrained Topic Classification
- Host: GitHub
- URL: https://github.com/bradley-butcher/topcon
- Owner: Bradley-Butcher
- Created: 2023-07-15T22:04:45.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2023-08-02T12:43:44.000Z (about 2 years ago)
- Last Synced: 2025-03-31T15:00:59.334Z (6 months ago)
- Topics: conformal-prediction, gpt-3, language-model, topic-classification, topic-modeling
- Language: Python
- Homepage:
- Size: 69.3 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Conformal Constrained Topic Classification - TopCon^2
I created this repository because I have a feeling that GPT-3.5 with constrained generation, plus some calibration via conformal prediction, could do very well on topic classification.
Either way, it's generic, so you can plug in whatever topic classification function you want.
I called it topcon^2 because it's 11pm, I'm tired, and lacking imagination apparently.
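
For anyone new to conformal prediction, here is a minimal sketch of the general split conformal recipe for classification: score a held-out calibration set, pick a threshold, then return every topic that clears it. This is not the code in this repository. The names `calibrate_threshold` and `prediction_set`, the dict-of-probabilities input format, and the toy data are all assumptions made for illustration.

```python
import math


def calibrate_threshold(cal_probs, cal_labels, alpha=0.1):
    """Return a conformal score threshold from a held-out calibration set.

    cal_probs: list of dicts mapping topic -> probability, one per example.
    cal_labels: list of true topics, aligned with cal_probs.
    alpha: target miscoverage rate (0.1 aims for roughly 90% coverage).
    """
    # Nonconformity score: one minus the probability given to the true topic.
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration examples for this alpha: every topic must be kept.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    """Return every topic whose nonconformity score is at or below the threshold."""
    return {topic for topic, p in probs.items() if 1.0 - p <= threshold}


# Toy calibration data, made up for illustration only.
cal_probs = [
    {"sports": 0.9, "science": 0.1},
    {"sports": 0.2, "science": 0.8},
    {"sports": 0.6, "science": 0.4},
    {"sports": 0.3, "science": 0.7},
    {"sports": 0.45, "science": 0.55},
]
cal_labels = ["sports", "science", "sports", "science", "sports"]

threshold = calibrate_threshold(cal_probs, cal_labels, alpha=0.2)
print(prediction_set({"sports": 0.5, "science": 0.5}, threshold))
print(prediction_set({"sports": 0.9, "science": 0.1}, threshold))
```

The point of returning a set rather than a single label is that an uncertain input gets several candidate topics, while a confident one gets just one.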
## Installation
Install with poetry:
```bash
pip install poetry
poetry install
```

## Usage
The predictor must be calibrated on a calibration set before it can make predictions. For ease of use, there is a class method that calibrates directly from a Hugging Face `datasets` dataset. Below is an example.
```python
from topcon.predict import topic_proba
from topcon.calibration import ConformalPredictor

topic_conformer = ConformalPredictor.from_hf_datasets(
    hf_repo_name='yahoo_answers_topics',
    topic_column='topic',
    text_columns=['question_title', 'question_content'],
    topic_proba=topic_proba,  # the topic classification function
    calibration_size=1000,
    save_path='save.pkl',
)
```

You can then predict on new text:
```python
topic_conformer.get_prediction_sets(
    text='I think football is really amazing because man kick ball good',
)
```

You can also load a previously calibrated model:
```python
topic_conformer = ConformalPredictor.load('save.pkl')
```

## TODO
- [ ] Add Tests
- [ ] Perform extensive evaluation
- [ ] Get a job at DeepMind