Projects in Awesome Lists tagged with parallel-corpus
A curated list of projects in awesome lists tagged with parallel-corpus.
https://github.com/NiuTrans/Classical-Modern
A very comprehensive parallel corpus of Classical Chinese (ancient Chinese) and Modern Chinese
corpus parallel-corpus traditional-and-simplified-chinese traditional-chinese
Last synced: 09 May 2025
https://github.com/niutrans/classical-modern
A very comprehensive parallel corpus of Classical Chinese (ancient Chinese) and Modern Chinese
corpus parallel-corpus traditional-and-simplified-chinese traditional-chinese
Last synced: 08 Apr 2025
https://github.com/kirralabs/indonesian-NLP-resources
Data resources for Indonesian-language NLP
corpus corpus-linguistics crawler dataset dependency-parser indonesian indonesian-language named-entity-recognition nlp parallel-corpus pos-tagging sentiment-analysis
Last synced: 15 Apr 2025
https://github.com/Helsinki-NLP/OpusFilter
OpusFilter - Parallel corpus processing toolkit
corpus-processing corpus-tools machine-translation natural-language-processing nlp parallel-corpus
Last synced: 19 Nov 2025
https://github.com/matbahasa/TALPCo
TUFS Asian Language Parallel Corpus
addressee bahasa-indonesia bahasa-melayu burmese constituency-tree english indonesian interpersonal japanese javanese korean malay meaning myanmar parallel-corpus thai tiengviet tokenized-sentences treebank vietnamese
Last synced: 16 Nov 2025
https://github.com/bramvanroy/astred
An easy-to-use library for linguistically comparing one sentence and its words to another, in the same language or a different one. It is useful, for instance, for comparing a translation with the original text, finding differences and similarities between two translations, or seeing how a machine translation differs from a reference translation.
alignment linguistics nlp parallel-corpus parsing spacy stanza translation
Last synced: 12 Apr 2025
https://github.com/korenyoni/opus-api
OPUS (opus.nlpl.eu) Python3 API
api corpora corporate corpus language-model machine-learning opus parallel-corpora parallel-corpus python
Last synced: 15 Apr 2025
https://github.com/michmech/irish-sentence-bank
4,500 sentences in Irish, tokenized, manually lemmatized, translated into English.
gaeilge irish lemmatization parallel-corpus
Last synced: 17 Jul 2025
https://github.com/spraakbanken/swell-editor
Editor for normalising learner texts (error annotation and tagging).
annotation-tool parallel-corpus second-language-acquisition sla swell swell-editor
Last synced: 30 Apr 2025
https://github.com/uudigitalhumanitieslab/perfectextractor
Extracting present perfects (and related forms) from parallel corpora
extraction parallel-corpus xpath
Last synced: 25 Jul 2025
https://github.com/uudigitalhumanitieslab/timealign
Parallel corpus annotation and visualization
annotation django-application parallel-corpus visualization
Last synced: 07 May 2025
https://github.com/tanloong/interlaced.nvim
Neovim plugin for aligning bilingual parallel texts
corpus-linguistics nvim-plugin parallel-corpus sentence-alignment
Last synced: 07 May 2025
https://github.com/pythainlp/thai-lao-parallel-corpus
Thai-Lao parallel corpus
corpus lao-language parallel-corpus thai-language
Last synced: 05 Mar 2025
https://github.com/KurdishBLARK/InterdialectCorpus
A parallel corpus of Sorani, Kurmanji and English
corpus kurdish kurdish-language-processing machine-translation natural-language-processing parallel-corpus
Last synced: 07 May 2025
https://github.com/deeptiman/php-dom-parser-translation-tool
A simple DOM parser and translation tool built with PHP, HTML, and MySQL. The translation model supports English-to-Odia translation, backed by a built-in dictionary.
apache corpus corpus-tool dom-parser linguist linguistic-analysis linguistic-corpora moses-machine-translation mysql odia-language parallel-corpus parser-generator parser-library php phpmyadmin statistical-machine-translation tomcat-server translation-service translation-tool
Last synced: 04 Oct 2025
https://github.com/gederajeg/constructional-equivalence
Repository of supplementary materials and an RStudio project for the paper on a corpus-based approach to measuring constructional equivalence.
construction-grammar constructionist-approach corpus-linguistics english-indonesian-translation open-code open-data open-science open-subtitle parallel-corpora parallel-corpus quantitative-linguistics r-programming r-programming-projects translation-equivalence translation-studies udayana-university universitas-udayana verbal-near-synonyms
Last synced: 11 Sep 2025