Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


https://github.com/strickvl/isafpr_finetune

Finetuning an LLM for structured data extraction from press releases

fine-tuning finetuning llm llms

Last synced: about 1 month ago

Host: GitHub
URL: https://github.com/strickvl/isafpr_finetune
Owner: strickvl
Created: 2024-06-06T21:08:03.000Z (7 months ago)
Default Branch: main
Last Pushed: 2024-06-27T07:46:00.000Z (6 months ago)
Last Synced: 2024-06-27T09:01:16.845Z (6 months ago)
Topics: fine-tuning, finetuning, llm, llms
Language: Jupyter Notebook
Homepage:
Size: 5.19 MB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# ISAF Press Releases Finetuning

I'll be doing a few finetunes on a dataset I annotated a few years back, which should make an interesting use case for structured data extraction. Some links for context:

https://mlops.systems/posts/2024-03-24-publishing-afghanistan-dataset-huggingface.html is a blog post I wrote about the dataset.

https://huggingface.co/datasets/strickvl/isafpressreleases is the original dataset.

https://mlops.systems/posts/2024-06-02-isafpr-prompting-baseline.html describes the context of the task for which I want to fine-tune.

https://mlops.systems/posts/2024-06-03-isafpr-evaluating-baseline.html is a blog
post where I examine the baseline performance of GPT-4-Turbo at extracting entities
from the text, which is what I hope to achieve with finetuning.
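To make the extraction-and-evaluation idea concrete, here is a minimal sketch of how a model's JSON reply could be parsed into a structured record and scored field by field against an annotation. The field names (`province`, `killed`, `captured`, `event_types`) and the helper names are hypothetical, chosen for illustration; they are not taken from the dataset's actual schema or from the evaluation code in the blog posts.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractedEvent:
    # Hypothetical fields for illustration only; not the dataset's real schema.
    province: str = ""
    killed: int = 0
    captured: int = 0
    event_types: set = field(default_factory=set)


def _to_int(value) -> int:
    """Coerce a value to int, treating anything unparseable as 0."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return 0


def parse_model_output(raw: str) -> ExtractedEvent:
    """Parse a model's JSON reply; return an empty event if it is not a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ExtractedEvent()
    if not isinstance(data, dict):
        return ExtractedEvent()
    types = data.get("event_types")
    if not isinstance(types, list):
        types = []
    return ExtractedEvent(
        province=str(data.get("province", "")).strip().lower(),
        killed=_to_int(data.get("killed")),
        captured=_to_int(data.get("captured")),
        event_types={str(t).strip().lower() for t in types},
    )


def field_accuracy(predictions, references) -> dict:
    """Per-field exact-match rate between predicted and reference events."""
    names = ["province", "killed", "captured", "event_types"]
    total = len(references)
    if total == 0:
        return {name: 0.0 for name in names}
    return {
        name: sum(
            getattr(p, name) == getattr(r, name)
            for p, r in zip(predictions, references)
        ) / total
        for name in names
    }
```

Scoring each field separately, rather than requiring the whole record to match, shows which parts of the extraction a model gets wrong, which is useful when comparing a baseline against a finetuned model.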