https://github.com/natlibfi/fintoai-data-yso

DVC pipeline for YSO projects of Finto AI

annif dvc dvc-pipeline glam subject-indexing text-classification

Last synced: over 1 year ago

Host: GitHub
URL: https://github.com/natlibfi/fintoai-data-yso
Owner: NatLibFi
License: cc0-1.0
Created: 2022-08-08T11:45:33.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2024-12-12T07:32:52.000Z (over 1 year ago)
Last Synced: 2025-01-21T14:46:15.460Z (over 1 year ago)
Topics: annif, dvc, dvc-pipeline, glam, subject-indexing, text-classification
Language: Jupyter Notebook
Homepage: https://ai.finto.fi
Size: 2.36 MB
Stars: 1
Watchers: 4
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# FintoAI-data-YSO

Configurations for maintaining the Annif projects that use the YSO vocabulary in the [Finto AI service](https://ai.finto.fi/), together with the [analysis notebook](/repository-metrics-analysis/analyse-theseus-tietolinja.ipynb) of Annif suggestions in the [Theseus repository](https://www.theseus.fi/).

The projects are trained and evaluated using a [DVC (Data Version Control) pipeline](https://dvc.org/doc/start/data-management/data-pipelines) defined in [dvc.yaml](/dvc.yaml).

The public training corpora can be found in the [Annif-corpora repository](https://github.com/NatLibFi/Annif-corpora/).

The pipeline takes care of

1. installing Annif in a venv,

2. loading YSO vocabulary,

3. training the projects,

4. evaluating the projects.
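
DVC derives the execution order of a pipeline from the dependencies and outputs that each stage declares. As a rough illustration only, the four steps above can be modelled as a small dependency graph and ordered topologically. The stage names below are hypothetical and are not taken from this repository's `dvc.yaml`.

```python
from graphlib import TopologicalSorter

# Hypothetical stage names mirroring the four steps above; the real
# stage names and dependencies are defined in dvc.yaml.
# Each key maps a stage to the set of stages it depends on.
STAGES = {
    "install-annif": set(),
    "load-vocab": {"install-annif"},
    "train": {"load-vocab"},
    "evaluate": {"train"},
}


def execution_order(stages):
    """Return stage names in an order that respects their dependencies."""
    return list(TopologicalSorter(stages).static_order())


if __name__ == "__main__":
    for position, stage in enumerate(execution_order(STAGES), start=1):
        print(f"{position}. {stage}")
```

Because each stage here depends only on the previous one, there is exactly one valid order. A real pipeline with several independent projects would allow more than one.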

When the necessary vocabulary and training corpora are in place, the pipeline can be run with the command:

    dvc repro
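
When the pipeline is driven from a Python script or a notebook rather than a shell, the same command can be called through `subprocess`. The sketch below assumes that `dvc` is installed and available on the `PATH`; the helper names `repro_command` and `run_repro` are illustrative and not part of this repository. The `--dry` option makes DVC print the commands it would run without executing them.

```python
import shutil
import subprocess


def repro_command(dry_run=False):
    """Build the argument list for `dvc repro`, optionally as a dry run."""
    command = ["dvc", "repro"]
    if dry_run:
        command.append("--dry")
    return command


def run_repro(dry_run=False):
    """Run `dvc repro` in the current directory (the repository root)."""
    if shutil.which("dvc") is None:
        raise RuntimeError("dvc was not found on PATH")
    # check=True raises CalledProcessError if DVC exits with an error.
    return subprocess.run(repro_command(dry_run), check=True)
```

Calling `run_repro(dry_run=True)` from the repository root previews the stages that would be reproduced, and `run_repro()` executes them.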

    

For more information about using DVC with Annif projects, see the [DVC exercise of the Annif tutorial](https://github.com/NatLibFi/Annif-tutorial/blob/master/exercises/OPT_dvc.md).
