https://github.com/stefan-it/georgian-ner

Resources about Named Entity Recognition for Georgian
flair georgian named-entity-recognition

Host: GitHub
URL: https://github.com/stefan-it/georgian-ner
Owner: stefan-it
Created: 2023-11-15T21:09:56.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2023-11-17T01:12:47.000Z (over 1 year ago)
Last Synced: 2025-02-08T07:13:23.525Z (3 months ago)
Topics: flair, georgian, named-entity-recognition
Language: Jupyter Notebook
Homepage:
Size: 91.8 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# 🇬🇪 Georgian NER

გამარჯობა! This repository contains my resources about Named Entity Recognition for Georgian.

# English-Georgian NER Model with Flair

We fine-tune a NER model with Flair on the English and Georgian training splits of the WikiANN dataset
([Rahimi et al.](https://www.aclweb.org/anthology/P19-1015) splits).

Based on [this repository](https://github.com/stefan-it/autotrain-flair-mobie), the fine-tuning is done with the
awesome [Flair](https://github.com/flairNLP/flair) library, including support for Hugging Face's [AutoTrain](https://github.com/huggingface/autotrain-advanced).

We use a basic hyper-parameter search with the following configuration:

| Parameter     | Value             |
|---------------|-------------------|
| Learning Rate | `5e-06`           |
| Batch Size    | `4`               |
| Epoch         | `10`              |
| Seeds         | `[1, 2, 3, 4, 5]` |

We use [XLM-R Large](https://huggingface.co/xlm-roberta-large) as base model.
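As a rough illustration of what this search amounts to, the sketch below expands the configuration into individual runs. The values are the ones listed above; the dictionary keys and the `build_runs` helper are hypothetical names chosen for this example and are not taken from `script.py`.

```python
from itertools import product

# Values copied from the configuration above. The key names are illustrative
# assumptions, not identifiers from the repository's script.py.
SEARCH_SPACE = {
    "learning_rate": [5e-06],
    "batch_size": [4],
    "epochs": [10],
    "seed": [1, 2, 3, 4, 5],
}
BASE_MODEL = "xlm-roberta-large"


def build_runs(search_space, base_model):
    """Return one configuration dict per combination in the search space."""
    keys = list(search_space)
    runs = []
    for values in product(*(search_space[key] for key in keys)):
        run = dict(zip(keys, values))
        run["base_model"] = base_model
        runs.append(run)
    return runs


RUNS = build_runs(SEARCH_SPACE, BASE_MODEL)
for run in RUNS:
    print(run)
```

Because learning rate, batch size, and epoch count each have a single value, the grid collapses to five runs that differ only in their seed.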

The following environment variables need to be set when using AutoTrain:

| Environment Variable | Description                                                                                       |
|----------------------|---------------------------------------------------------------------------------------------------|
| `HF_TOKEN`           | Hugging Face User Access Token, which can be found [here](https://huggingface.co/settings/tokens) |
| `HUB_ORG_NAME`       | Username or organization under which models will be uploaded                                      |

The fine-tuning can then be started by running the `script.py` script.
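Before launching a run, it can help to confirm that both variables are present. The following is a minimal sketch of such a check; the `missing_env_vars` helper is a hypothetical name for this example and is not part of `script.py`.

```python
import os

# The two variables named in this README.
REQUIRED_ENV_VARS = ("HF_TOKEN", "HUB_ORG_NAME")


def missing_env_vars(env=None):
    """Return the required variables that are unset or empty.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment and returns an empty list when both variables are set.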

## Model Card Creation

The [`ModelCardCreation.ipynb`](ModelCardCreation.ipynb) notebook shows how to automatically generate model cards for
all uploaded models. This also includes a results overview table with links to the models.
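To give an idea of what a results overview table with linked models might look like, here is a small sketch that renders one as Markdown. It is not the notebook's code: the `render_results_table` function, the column names, and the `(model_name, seed, score)` input shape are all assumptions made for this example.

```python
def render_results_table(results, hub_org):
    """Render a Markdown table with one linked row per model.

    `results` is an iterable of (model_name, seed, score) tuples. This
    shape is an assumption for the sketch, not the notebook's format.
    """
    lines = [
        "| Model | Seed | F1-Score |",
        "|-------|------|----------|",
    ]
    for model_name, seed, score in results:
        url = f"https://huggingface.co/{hub_org}/{model_name}"
        lines.append(f"| [`{model_name}`]({url}) | {seed} | {score:.4f} |")
    return "\n".join(lines)
```

The Hub organization would typically be the value of `HUB_ORG_NAME`, so that each row links to the model uploaded under that account.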

## Fine-tuned Models

All fine-tuned models are released on the Hugging Face Hub, incl. a nice inference widget:

![Inference Widget](images/inference-widget.png)

The fine-tuned models can be found [here](https://huggingface.co/models?search=autotrain-flair-georgian). Additionally,
they can be found in [this collection](https://huggingface.co/collections/stefan-it/georgian-ner-models-6556bd33dd1c096392074791).
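A released model can presumably be loaded with Flair's `SequenceTagger`. The sketch below assumes that the models are standard Flair sequence taggers and that they use the label type `"ner"`, neither of which is confirmed by this README. The `tag_text` helper is a hypothetical name, and the model identifier has to be supplied by the caller from the Hub listing.

```python
def tag_text(text, model_id):
    """Tag `text` with a Flair NER model and return (span, tag, score) tuples.

    `model_id` is a Hugging Face Hub identifier chosen by the caller.
    Flair is imported lazily so that defining this function does not
    require the library to be installed.
    """
    from flair.data import Sentence
    from flair.models import SequenceTagger

    tagger = SequenceTagger.load(model_id)
    sentence = Sentence(text)
    tagger.predict(sentence)
    # Assumption: the model stores its predictions under the "ner" label type.
    return [(span.text, span.tag, span.score) for span in sentence.get_spans("ner")]
```

Calling `tag_text("...", "<hub-org>/<model-name>")` downloads the model on first use, so the first call can take a while with an XLM-R Large based tagger.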

# Changelog

* 17.11.2023: Add model card creation and fine-tuned models sections. Mention fine-tuned models on Hub.

* 15.11.2023: Initial version of this repository.
