Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/strickvl/isafpr_finetune
Finetuning an LLM for structured data extraction from press releases
- Host: GitHub
- URL: https://github.com/strickvl/isafpr_finetune
- Owner: strickvl
- Created: 2024-06-06T21:08:03.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-06-27T07:46:00.000Z (6 months ago)
- Last Synced: 2024-06-27T09:01:16.845Z (6 months ago)
- Topics: fine-tuning, finetuning, llm, llms
- Language: Jupyter Notebook
- Size: 5.19 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# ISAF Press Releases Finetuning
I'll be running a few finetunes on a dataset I annotated a few years back, which makes for an interesting structured data extraction use case. Some links for context:
- https://mlops.systems/posts/2024-03-24-publishing-afghanistan-dataset-huggingface.html is a blog post I wrote about the dataset.
- https://huggingface.co/datasets/strickvl/isafpressreleases is the original dataset.
- https://mlops.systems/posts/2024-06-02-isafpr-prompting-baseline.html describes the context of the task I want to finetune for.
- https://mlops.systems/posts/2024-06-03-isafpr-evaluating-baseline.html is a blog post where I examine the baseline performance of GPT-4-Turbo at extracting entities from the text, which is what I hope to achieve with finetuning.
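
To give a rough sense of what structured data extraction involves, here is a minimal sketch of turning a model's JSON reply into a typed record. Everything in it is an assumption made for illustration: the `ExtractedEvent` class, its field names, the `parse_model_output` helper, and the sample reply are hypothetical and are not the dataset's actual schema or the code used in this repo.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractedEvent:
    # Hypothetical fields for illustration only; not the dataset's real schema.
    province: str
    killed: int = 0
    captured: int = 0
    event_types: list[str] = field(default_factory=list)


def parse_model_output(raw: str) -> ExtractedEvent:
    """Parse a model's JSON reply into a typed record.

    Raises json.JSONDecodeError for invalid JSON, ValueError if the top-level
    value is not an object, and KeyError if 'province' is missing.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return ExtractedEvent(
        province=str(data["province"]),
        killed=int(data.get("killed", 0)),
        captured=int(data.get("captured", 0)),
        event_types=[str(t) for t in data.get("event_types", [])],
    )


# Made-up model reply, not taken from the dataset.
reply = '{"province": "Kandahar", "killed": 0, "captured": 2, "event_types": ["detention"]}'
event = parse_model_output(reply)
print(event)
```

Parsing into a fixed record like this makes it straightforward to compare a model's output field by field against the annotations, whichever model produced it.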