Projects in Awesome Lists by techiaith
A curated list of projects in awesome lists by techiaith.
https://github.com/techiaith/pyfestival
Amlapiwr Python C ar gyfer hwyluso rhaglennu gyda Festival | A Python C wrapper for simple coding with Festival
cymraeg speech text-to-speech welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/lecsicon-cymraeg-bangor
Lecsicon cynhwysfawr o eirffurfiau'r Gymraeg yn seiliedig ar ddata gwirydd sillafu a gramadeg Cysill // A comprehensive lexicon of Welsh-language wordforms based on data from the Cysill spelling and grammar checker
cc0 dictionary lexicon nlp welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/hunspell-cy
Fersiwn wedi'i ddiweddaru o'r fersiwn Cymraeg o wirydd sillafu Hunspell. | An updated version of the Welsh version of the Hunspell spellchecker.
cymraeg dictionary hunspell nlp spellchecker welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/model-tagiwr-spacy-cy
Model spaCy (2.3.2) sy'n cynnwys tagiwr rhan ymadrodd Cymraeg cychwynnol sydd â chywirdeb o 91% | A Welsh-language spaCy (2.3.2) model featuring a POS tagger that achieves 91% accuracy on unseen data.
Last synced: 17 Jan 2026
https://github.com/techiaith/macsen-sgwrsfot
Cydran adnabod bwriad a sgwrsfot yr ap Macsen // Intent parser and chatbot component for the Macsen app
chatbot cymraeg intent-parser macsen welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/llais_festival
Data llais Cymraeg ddeuffon er mwyn llwytho i lawr a’u rhedeg o fewn gosodiad lleol o Festival (ar Linux, Raspberry Pi neu Windows) | Welsh diphone voice data for loading into your local installation of Festival (on Linux, Raspberry Pi or Windows)
Last synced: 17 Jan 2026
https://github.com/techiaith/geiriau-mwyaf-aml
Rhestrau geiriau mwyaf aml y Gymraeg a Saesneg // Wordlists of the most common wordforms in Welsh, and the most common English words used in Welsh.
lexicon nlp welsh word-frequency-count
Last synced: 17 Jan 2026
https://github.com/techiaith/welsh-lts
Rheolau ynganu Cymraeg | Welsh letter-to-sound rules
Last synced: 17 Jan 2026
https://github.com/techiaith/brawddegau-tagiedig
Corpws o frawddegau CC0 mewn fformat jsonl, gyda rhannau ymadrodd y tocynnau (geiriau etc.) wedi'u tagio â thagiau Universal Dependencies. // A corpus of CC0 sentences in JSONL format, tagged with Universal Dependencies part-of-speech tags.
annotated cc0 commonvoice data nlp welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/lecsicon-cymraeg-bangor-enghreifftiau
Enghreifftiau o ddefnyddio Lecsicon Cymraeg Bangor // Examples of code utilising the Bangor University Welsh language lexicon.
lemmatization lexicon morphological-analysis nlp spellchecker welsh wordle
Last synced: 17 Jan 2026
https://github.com/techiaith/trawsgrifiwr-arlein
Cod gwefan Trawsgrifiwr Ar-lein gan Uned Technolegau Iaith, Prifysgol Bangor // The code for the Trawsgrifiwr Ar-lein website by the Language Technologies Unit, Bangor University
cymraeg speech speech-recognition transcription welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy-lang-cy
Ffolder lang 'cy' ar gyfer ychwanegu'r Gymraeg i spaCy 2.3.2 | A 'cy' lang folder that adds Welsh to spaCy 2.3.2
Last synced: 17 Jan 2026
https://github.com/techiaith/trawsgrifiwr-windows
Ap Windows sy'n trawsgrifio lleferydd Cymraeg i destun / A Windows app that transcribes Welsh language speech to text.
cymraeg speech speech-to-text welsh windows-desktop
Last synced: 17 Jan 2026
https://github.com/techiaith/docker-coqui-tts-cy
Lleisiau synthetig testun i leferydd dwyieithog Cymraeg a Saesneg // Bilingual Welsh and English synthetic text-to-speech voices
coqui-ai cymraeg text-to-speech welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/julius-cy
Ffeiliau ar gyfer adnabod lleferydd Cymraeg drwy Julius | Files for realising Welsh speech recognition with Julius
Last synced: 17 Jan 2026
https://github.com/techiaith/piper-cy
Lleisiau all-lein Cymraeg | Welsh offline voices
speech speech-synthesis speech-to-text
Last synced: 17 Jan 2026
https://github.com/techiaith/hunspell-cy-llafar
Fersiwn Cymraeg llafar o wirydd sillafu Hunspell. | Spoken Welsh version of the Hunspell spellchecker.
cymraeg dictionary hunspell nlp spellchecker welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/parsiwr-dibyniaethau
Parsiwr dibyniaethau Cymraeg ar gyfer spaCy. | A Welsh language dependency parser for spaCy.
Last synced: 17 Jan 2026
https://github.com/techiaith/ataleiriau
Rhestr o ataleiriau Cymraeg | Welsh Stopwords List
Last synced: 17 Jan 2026
https://github.com/techiaith/speech-corpus-builder
Build a speech corpus from YouTube and other websites' videos.
Last synced: 17 Jan 2026
https://github.com/techiaith/word2vec-cy
Model Iaith Fectorau Word2vec ar sail corpora ymchwil yr Uned Technolegau Iaith a gasglwyd o ffynonellau amrywiol at ddibenion ymchwil fel cynhyrchu modelau iaith. | A Word2vec Language Model based on the Language Technologies Unit's research corpora.
Last synced: 17 Jan 2026
https://github.com/techiaith/corpws-cc0
Corpws o frawddegau o destun Cymraeg wedi'u trwyddedu o dan drwydded CC0 | A corpus of Welsh texts licensed under the CC0 licence
cc0 commonvoice corpus nlp welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/kaldi-cy
Adnabod lleferydd Cymraeg gyda Kaldi ASR | Welsh language speech recognition using Kaldi ASR
kaldi speech speech-recognition welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/offer-trin-iaith
Offer ar gyfer hwyluso trin agweddau o destunau Cymraeg, er enghraifft treigladau ac ati | Tools for facilitating the manipulation of Welsh texts, including mutation.
Last synced: 17 Jan 2026
https://github.com/techiaith/seilwaith
Offer hwyluso creu Adnabod Lleferydd Cymraeg gyda HTK, IRSTLM, Julius a Docker | Welsh Speech Recognition with HTK, IRSTLM, Julius and Docker
htk speech speech-recognition welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy_cy_tag_lem_ner_lg
Model tagio POS, NER a fectorau Cymraeg spaCy 3.5 | A Welsh POS tagging, NER and vector model for spaCy 3.5
Last synced: 17 Jan 2026
https://github.com/techiaith/parsiwr_dibyniaethau_enwol_berfol
Parsiwr dibyniaethau sy'n ceisio gwahaniaethu rhwng defnydd enwol a berfol o'r berfenw // A dependency parser which attempts to differentiate between nominal and verbal verbnouns
Last synced: 17 Jan 2026
https://github.com/techiaith/lecsicon-dwyieithog-giza
Lecsicon dwyieithog Cymraeg / Saesneg Giza ar gyfer Bitextor // A Bilingual Welsh / English Giza Lexicon for Bitextor
Last synced: 17 Jan 2026
https://github.com/techiaith/techiaith-tts
Testun I Leferydd Techiaith. // Techiaith Text To Speech.
Last synced: 14 Jan 2026
https://github.com/techiaith/corpws-meincnodi-rhannau-ymadrodd
Corpws ar gyfer meincnodi tagwyr rhannau ymadrodd Cymraeg | A corpus for benchmarking Welsh part-of-speech taggers
corpus nlp part-of-speech-tagging welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/urls_cymraeg
Casgliad cychwynnol o URLs sy'n cynnwys testun Cymraeg // An initial collection of URLs containing Welsh-language texts
Last synced: 17 Jan 2026
https://github.com/techiaith/word2vec-cy-demutated
Model Iaith Fectorau Word2vec ar sail corpora ymchwil enfawr lle dad-dreiglwyd y geiriau perthnasol | A Word2vec Language Model based on large demutated research corpora.
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy
Mae spaCy yn llyfrgell ar gyfer Prosesu Iaith Naturiol uwch yn Python a Cython. // spaCy is a library for advanced Natural Language Processing in Python and Cython.
Last synced: 17 Jan 2026
https://github.com/techiaith/mimic
Llyfrgell Python ar gyfer weithio gyda Modelau Iaith Fawr i'r Gymraeg // A Python library for working with Large Language Models for Welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/talpiwr_enwol_cymraeg
Talpiwr enwol Cymraeg ar gyfer spaCy / A Welsh-language noun chunker for spaCy
Last synced: 17 Jan 2026
https://github.com/techiaith/anonymeiddiwr-beta
Anonymeiddiwr Beta ar gyfer testunau dwyieithog Saesneg-Cymraeg a thestunau Cymraeg uniaith. | A beta anonymizer for bilingual English-Welsh texts and monolingual Welsh texts.
Last synced: 17 Jan 2026
https://github.com/techiaith/techiaith-utils
Llyfrgell Python yn darparu offer ar gyfer weithio gyda brojectau NMT | Python library for working with NMT projects.
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy-tagiwr-ency
Tagiwr arbrofol dwyieithog ar gyfer testunau Cymraeg a Saesneg | An experimental bilingual tagger for English and Welsh texts
Last synced: 17 Jan 2026
https://github.com/techiaith/wikipedia-extractor
Estyn cynnwys Wicipedia / Extract Wikipedia content
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy-wales-en-ner-model
Model adnabod enwau endidau Saesneg a hyfforddwyd ar endidau Cymreig | An English named entity recognition model further trained on entities specific to Wales
Last synced: 17 Jan 2026
https://github.com/techiaith/whisper-mobile-mic
Trawsgrifio ar gael drwy’r eicon microffon o fewn bysellfwrdd arferol ffôn symudol | Transcription available via the microphone icon in a mobile phone's standard keyboard
Last synced: 17 Jan 2026
https://github.com/techiaith/corpws-sgwrsfot-cysgliad
Corpws o sgyrsiau cymorth Cysgliad | A Corpus of support chat messages for the Cysgliad software
Last synced: 17 Jan 2026
https://github.com/techiaith/spacy-lookups-lemmatizer-cy
Fersiwn wedi'i becynnu o spacy-lookups-data gyda data lemateiddio Cymraeg | A packaged version of spacy-lookups-data including Welsh lemmatization data
Last synced: 17 Jan 2026
https://github.com/techiaith/docker-prosodylabaligner-cy
Defnyddio Prosodylab-Aligner Cymraeg yn hwylus gyda Docker. | Easy to use Prosodylab Aligner with Welsh support via Docker
Last synced: 17 Jan 2026
https://github.com/techiaith/docker-kaldi-cy
Amgylchedd hwyluso hyfforddi adnabod lleferydd Kaldi Cymraeg | An environment that facilitates training Welsh speech recognition with Kaldi
cymraeg docker kaldi speech speech-recognition welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/docker-deepspeech-cy-server
Gweinydd syml ar gyfer ddarparu gwasanaeth API at modelau adnabod lleferydd DeepSpeech // Simple server for providing API access to DeepSpeech speech recognition models.
api-server cymraeg speech speech-recognition welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/festival_msapi
Integreiddio testun i leferydd Festival i Windows drwy MSAPI | Integrate Festival TTS into Windows via MSAPI
Last synced: 17 Jan 2026
https://github.com/techiaith/docker-coqui-stt-cy
Hyfforddi a defnyddio modelau adnabod lleferydd Cymraeg coqui-stt a KenLM // Train and use coqui-stt and KenLM based Welsh language speech recognition models.
api-server commonvoice coqui-ai cymraeg speech speech-recognition training welsh
Last synced: 17 Jan 2026
https://github.com/techiaith/tebygrwydd_brawddegau_cymraeg
Enghraifft o god ar gyfer cyfrifo tebygrwydd brawddegau Cymraeg gan ddefnyddio spaCy / Code examples for calculating Welsh sentence similarity using spaCy
Last synced: 17 Jan 2026
https://github.com/techiaith/paldaruo
Cod yr ap Paldaruo i iOS ar gyfer torfoli casglu corpws lleferydd | Code for the Paldaruo speech corpus crowdsourcing app for iOS
corpus-tools crowdsourcing speech welsh
Last synced: 17 Jan 2026