https://github.com/strickvl/isafpr_finetune

Finetuning an LLM for structured data extraction from press releases

# ISAF Press Releases Finetuning

I'll be running a few finetunes on a dataset I annotated a few years ago, which makes an interesting use case for structured data extraction. Some links for context:

https://mlops.systems/posts/2024-03-24-publishing-afghanistan-dataset-huggingface.html is a blog post I wrote about the dataset.

https://huggingface.co/datasets/strickvl/isafpressreleases is the original dataset.

https://mlops.systems/posts/2024-06-02-isafpr-prompting-baseline.html describes the context of the task I want to finetune for.

https://mlops.systems/posts/2024-06-03-isafpr-evaluating-baseline.html is a blog post
where I examine the baseline performance of GPT-4-Turbo at extracting entities
from the text (the same task I hope to handle with finetuning).
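
To make "structured data extraction" and "baseline performance" a bit more concrete, here is a minimal sketch of how a model's JSON output could be parsed and scored against an annotation. It is an illustration only: the `ExtractedEvent` fields, `parse_event` and `field_accuracy` are hypothetical names I'm assuming for the example. They are not the dataset's actual schema or the evaluation code from the blog posts.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ExtractedEvent:
    # Hypothetical fields for illustration; not the dataset's real schema.
    start_date: Optional[str] = None
    province: Optional[str] = None
    min_killed: Optional[int] = None


def parse_event(raw: str) -> Optional[ExtractedEvent]:
    """Parse a model's JSON output; return None if it is not a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    # Ignore any keys the model invented that are not part of the schema.
    known = {f.name for f in fields(ExtractedEvent)}
    return ExtractedEvent(**{k: v for k, v in data.items() if k in known})


def field_accuracy(predicted: Optional[ExtractedEvent], gold: ExtractedEvent) -> float:
    """Fraction of fields where the prediction exactly matches the annotation."""
    if predicted is None:
        # Unparseable output counts as getting every field wrong.
        return 0.0
    names = [f.name for f in fields(ExtractedEvent)]
    hits = sum(getattr(predicted, n) == getattr(gold, n) for n in names)
    return hits / len(names)
```

Treating unparseable output as a score of zero is one possible design choice; it means a model is penalised for malformed JSON as well as for wrong values.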