Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.


Projects in Awesome Lists tagged with linguistics

A curated list of projects in awesome lists tagged with linguistics.

https://github.com/Tatoeba/tatoeba2

Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.

languages linguistics

Last synced: 01 Nov 2024

https://github.com/proycon/pynlpl

PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).

computational-linguistics evaluation-metrics folia language-modelling library linguistics machine-learning natural-language-processing nlp nlp-library python search-algorithms text-processing

Last synced: 15 Dec 2024

https://github.com/tshatrov/ichiran

Linguistic tools for texts in the Japanese language

common-lisp dictionary grammar japanese japanese-language language linguistics

Last synced: 19 Nov 2024

https://github.com/quadrismegistus/prosodic

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.

finnish-language-analysis linguistics metrical-parser nlp poetry rhythm

Last synced: 18 Dec 2024

https://github.com/hangulize/hangulize

Hangulize transcribes non-Korean words into Hangul

korean linguistics transcription

Last synced: 14 Nov 2024

https://github.com/MaxBittker/nyt-first-said

Tweets when words are published for the first time in the NYT

civic-tech journalism linguistics newsroom politics python scraper twitter

Last synced: 02 Dec 2024

https://github.com/what-studio/tossi

Chooses correct Korean particle morphs for arbitrary words.

korean linguistics localization python

Last synced: 17 Nov 2024
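
As a rough illustration of what choosing a particle morph involves (this is not tossi's API, which is not shown here): many Korean particles have two forms, and the choice depends on whether the preceding syllable ends in a final consonant. The sketch below assumes the word ends in a precomposed Hangul syllable and covers only the topic particle; the function names are hypothetical.

```python
HANGUL_BASE = 0xAC00   # code point of the first precomposed syllable, "가"
HANGUL_LAST = 0xD7A3   # code point of the last precomposed syllable, "힣"
FINALS_PER_BLOCK = 28  # 27 possible final consonants plus "no final"


def has_final_consonant(word: str) -> bool:
    """Return True if the last syllable of `word` ends in a consonant."""
    code = ord(word[-1])
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        raise ValueError("word must end in a precomposed Hangul syllable")
    # Within each block of 28 syllables, offset 0 means "no final consonant".
    return (code - HANGUL_BASE) % FINALS_PER_BLOCK != 0


def attach_topic_particle(word: str) -> str:
    """Append the topic particle: 은 after a consonant, 는 after a vowel."""
    return word + ("은" if has_final_consonant(word) else "는")
```

Words that end in digits, Latin letters, or punctuation need extra rules, which is the kind of case a dedicated library is meant to handle.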

https://github.com/CoEDL/elpis

🙊 software for creating speech recognition models.

automatic-speech-recognition computational-linguistics docker kaldi linguistics python transcription

Last synced: 15 Nov 2024

https://github.com/pyconll/pyconll

A minimal, pure Python library to interface with CoNLL-U format files.

annotation conllu dependency-parsing linguistics minimal python universal-dependencies

Last synced: 27 Nov 2024

https://github.com/hbuschme/TextGridTools

Read, write, and manipulate Praat TextGrid files with Python

annotation data-analysis elan linguistics praat python textgrid

Last synced: 27 Nov 2024

https://github.com/proycon/colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller`` which allows you to build, view, manipulate and query pattern models.

c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing

Last synced: 17 Dec 2024
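
To make the n-gram and skipgram terminology concrete, here is a minimal pure-Python sketch. It does not use Colibri Core and does not reflect its API; the gap marker and function names are assumptions made for this example, and only fixed-size gaps are covered.

```python
from collections import Counter
from typing import Iterator

GAP = "{*}"  # placeholder for one skipped token (notation chosen for this sketch)


def ngrams(tokens: list[str], n: int) -> Iterator[tuple[str, ...]]:
    """Yield every contiguous n-gram in `tokens`."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def skipgrams(tokens: list[str], n: int, gap: int) -> Iterator[tuple[str, ...]]:
    """Yield n-token windows in which `gap` interior tokens are masked.

    The first and last token of each window are always kept, so the gap
    is a contiguous interior run of fixed size.
    """
    for window in ngrams(tokens, n):
        for start in range(1, n - gap):
            yield window[:start] + (GAP,) * gap + window[start + gap:]


tokens = "to be or not to be".split()
bigram_counts = Counter(ngrams(tokens, 2))          # frequency list of bigrams
skipgram_counts = Counter(skipgrams(tokens, 3, 1))  # patterns like ("to", "{*}", "or")
```

Counting patterns with a plain `Counter` is fine for toy input; it makes no attempt at the memory efficiency the description above emphasises.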

https://github.com/proycon/flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.

annotation-tool clariah clarin computational-linguistics folia javascript linguistic-annotation-framework linguistics nlp python web-application

Last synced: 17 Dec 2024

https://github.com/yohasebe/rsyntaxtree

Syntax tree generator for linguistic research

linguistics ruby rubynlp svg syntax-tree visualization

Last synced: 16 Dec 2024

https://github.com/ars-linguistica/mlconjug3

A Python library to conjugate verbs in French, English, Spanish, Italian, Portuguese and Romanian (more soon) using Machine Learning techniques.

conjugation conjugator devops linguistics machine-learning nlp nlp-library nlp-machine-learning python3 test-driven-development

Last synced: 20 Dec 2024

https://github.com/koskenni/beta

An open source reimplementation of Benny Brodda's BETA in Python

benny-brodda beta corpus-tools hyphenation linguistics open-source string-manipulation string-rewriting

Last synced: 12 Nov 2024

https://github.com/proycon/folia

FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.

computational-linguistics corpus file-format folia language library linguistic-annotation-framework linguistics nlp python xml

Last synced: 14 Oct 2024

https://github.com/kyubyong/koparadigm

KoParadigm: Korean Inflectional Paradigm Generator

inflection korean linguistics morphology nlp paradigm

Last synced: 10 Nov 2024

https://github.com/knadh/indic.page

A directory of Indic (Indian) language computing resources.

datasets indian-language indic-languages language linguistics nlp

Last synced: 16 Dec 2024

https://github.com/ropensci/lingtypology

R package for linguistic cartography and typological database search

abvd afbo atlas autotype bivaltyp clld glottolog-database linguistic-maps linguistics phoible r r-package sails typology wals

Last synced: 22 Nov 2024

https://github.com/anna-hope/phonemes

Jason Riggle's chart of phonological features in JSON format + extras

computational-linguistics ipa-symbols linguistics phonemes phonetics phonological-features phonology

Last synced: 19 Dec 2024

https://github.com/sillsdev/libpalaso

Palaso Library: A set of .NET libraries useful for developers of Language Software.

hacktoberfest languages linguistics linux windows

Last synced: 21 Dec 2024

https://github.com/zamgi/lingvo--ner-ru

Named entity recognition (NER) in Russian texts

linguistics lingvo named-entity-recognition natural-language-processing ner nlp nlp-machine-learning

Last synced: 05 Nov 2024

https://github.com/yuhr/langue

A modern platform for conlanging. Currently in the planning stage.

conlang conlinguistics conscript conworld dictionary language langue linguistics ontology speech-recognition speech-synthesis translation

Last synced: 14 Oct 2024

https://github.com/dialpad/inclusive-language

Inclusive language guide as developed by Dialpad linguists

inclusion inclusive-language linguistics

Last synced: 09 Dec 2024

https://github.com/proiel/proiel-treebank

Official releases of the PROIEL treebank of ancient Indo-European languages

ancient-greek ancient-languages armenian corpus gothic2 language latin linguistics new-testament old-church-slavonic treebank

Last synced: 28 Oct 2024

https://github.com/kdelwat/onset

A language evolution simulator, using realistic phonetic changes.

flask linguistics phonetics phonology python vue

Last synced: 19 Nov 2024

https://github.com/omarsar/clinical_nlp_elastic

Clinical NLP Analysis with Elasticsearch and Kibana

elastic elasticsearch kibana linguistics machine-learning mental-health nlp

Last synced: 28 Oct 2024

https://github.com/dveselov/mystem

CGo bindings to Yandex.Mystem

cgo-bindings linguistics mystem russian-specific

Last synced: 26 Oct 2024

https://github.com/agmmnn/syn

🌾 Get synonyms and antonyms of words from Thesaurus.com and other sources in your terminal, with rich output.

cli command-line datamuse dictionary linguistics python rich synonyms terminal thesaurus wordsearch

Last synced: 27 Oct 2024

https://github.com/liulalemx/felig-toolkit

A toolset for Amharic Language pre-processing. Includes an Amharic Stemmer, Transliterator, Stopword remover, Lexical analyzer, Corpus indexer and Term weighter.

amharic amharic-corpus amharic-nlp amharic-stemmer corpus lexical-analyzer linguistics stopword-removal transliterator

Last synced: 18 Nov 2024

https://github.com/orgtre/google-books-ngram-frequency

Word/n-gram frequency lists for the Google Books Ngram Corpus (v3, all languages) with Python code

google language-learning linguistics ngrams wordlist

Last synced: 14 Oct 2024

https://github.com/dbklim/stressrnn

Modified version of RusStress (https://github.com/MashaPo/russtress), a Python package for placing stress in Russian text using an RNN (BiLSTM) and the "Grammatical Dictionary" by A. A. Zaliznyak (from http://odict.ru/).

accent bilstm emphasis linguistic linguistics lstm nlp rnn russian russian-accent russian-stress russtress rustress stress

Last synced: 11 Nov 2024

https://github.com/korpling/pepper

A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used stand-alone as a command line interface, or be integrated as an API into other software products.

annotations converter format java linguistic-formats linguistics nlp pepper

Last synced: 15 Nov 2024

https://github.com/bramvanroy/astred

An easy-to-use library to linguistically compare one sentence and its words to another, in the same language or a different one. It is useful, for instance, for comparing a translation with the original text, for finding differences and similarities between two different translations, or for seeing how a machine translation differs from a reference translation.

alignment linguistics nlp parallel-corpus parsing spacy stanza translation

Last synced: 14 Oct 2024

https://github.com/orgtre/top-open-subtitles-sentences

Most common sentences and words for all languages in the OpenSubtitles2018 corpus with Python code

language-learning linguistics opensubtitles wordlist

Last synced: 14 Oct 2024

https://github.com/DevelopersTree/KurdishResources

A repository for resources in Kurdish Language

bot kurdish kurdish-oss linguistics wordlist

Last synced: 14 Nov 2024

https://github.com/xylous/grzegorz

A command-line phonetics tool for finding minimal pairs

anki cli command-line language-learning linguistics minimal-pairs phonology python utility

Last synced: 14 Oct 2024
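
For readers unfamiliar with the term, a minimal pair is two words whose pronunciations differ in exactly one sound. The sketch below shows one simple way to detect such pairs; it is not taken from grzegorz, the function names are hypothetical, and the toy transcriptions are hand-written for illustration.

```python
from itertools import combinations


def is_minimal_pair(a: tuple[str, ...], b: tuple[str, ...]) -> bool:
    """True if two phoneme sequences differ in exactly one position."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


def find_minimal_pairs(lexicon: dict[str, tuple[str, ...]]) -> list[tuple[str, str]]:
    """Return all pairs of words whose transcriptions form a minimal pair."""
    return [
        (w1, w2)
        for (w1, p1), (w2, p2) in combinations(lexicon.items(), 2)
        if is_minimal_pair(p1, p2)
    ]


# Toy lexicon with hand-written broad IPA transcriptions (illustrative only).
lexicon = {
    "pat": ("p", "æ", "t"),
    "bat": ("b", "æ", "t"),
    "bad": ("b", "æ", "d"),
    "spat": ("s", "p", "æ", "t"),
}
pairs = find_minimal_pairs(lexicon)
```

Comparing every pair of words is quadratic in the lexicon size, which is acceptable for a small word list but would need indexing for a full dictionary.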

https://github.com/shujian2015/neural-net-linguistics

Papers about NN and linguistics

linguistics nlp papers

Last synced: 09 Nov 2024

https://github.com/alvations/expletives

Expletives vomiting library...

bad-words expletives linguistics nlp python vulgarities

Last synced: 29 Nov 2024

https://github.com/dativebase/dative

Dative: software for linguistic fieldwork

coffeescript linguistics spa

Last synced: 15 Nov 2024

https://gitlab.com/smc/mlmorph

Malayalam Morphological Analyzer using Finite State Transducer https://morph.smc.org.in

Malayalam fst hfst linguistics morphology analyser sfst

Last synced: 19 Nov 2024

https://github.com/zamgi/lingvo--classify

Automatic classification of Russian-language text

classification linguistics lingvo natural-language-processing nlp nlp-machine-learning text-classification

Last synced: 05 Nov 2024

https://github.com/tallguyjenks/runes

🧙‍♀️ ᚱᚢᚾᛖᛋ in your R Documents!

bryan-jenks cran elder-futhark-runes futhark futhark-runes linguistics nordic r rstats rstudio rune runes

Last synced: 04 Dec 2024

https://github.com/rshrc/varnamala

A personal app to teach oneself any language, Duolingo style.

flutter-apps kannada language learning linguistics

Last synced: 25 Nov 2024

https://github.com/digitallinguistics/transliterate

A small JavaScript library for transliterating strings between different orthographies

digital-humanities digital-linguistics dlx linguistics transliteration

Last synced: 30 Nov 2024

https://github.com/derintelligence/az-summarization

Abstractive summarization for Azerbaijani language

azerbaijan dataset language linguistics nlp summarization

Last synced: 13 Nov 2024

https://github.com/adamliter/latex-workshop

Materials for workshop on LaTeX aimed at linguists

latex latex-examples linguistics tutorial

Last synced: 11 Oct 2024

https://github.com/PaddiM8/GlossVisualiser

Displays interlinear gloss in a more readable way with HTML.

gloss linguistics

Last synced: 13 Nov 2024

https://github.com/zamgi/lingvo--textsegmenter

Text segmentation into separate words using a simple unigram model and the Viterbi algorithm

linguistics lingvo natural-language-processing nlp text-segmentation viterbi-algorithm

Last synced: 05 Nov 2024
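
The general idea of unigram-based segmentation can be sketched as a Viterbi-style dynamic program over character positions. The Python code below is an independent illustration, not code from the repository; the function name, the `max_len` parameter, and the toy probabilities are all assumptions.

```python
import math


def segment(text: str, word_probs: dict[str, float], max_len: int = 20) -> list[str]:
    """Split `text` into the most probable word sequence under a unigram model."""
    n = len(text)
    # best[i] = highest log-probability of any segmentation of text[:i]
    best = [0.0] + [-math.inf] * n
    # back[i] = start index of the last word in that best segmentation
    back = [0] * (n + 1)

    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in word_probs and best[start] > -math.inf:
                score = best[start] + math.log(word_probs[word])
                if score > best[end]:
                    best[end], back[end] = score, start

    if best[n] == -math.inf:
        raise ValueError("text cannot be segmented with the given vocabulary")

    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    end = n
    while end > 0:
        words.append(text[back[end]:end])
        end = back[end]
    return words[::-1]


# Toy unigram probabilities, invented for illustration.
word_probs = {"the": 0.3, "them": 0.05, "me": 0.1, "men": 0.1, "n": 0.01, "eat": 0.2}
result = segment("themeneat", word_probs)
```

Working in log-probabilities avoids numeric underflow on long inputs, and the back-pointer array keeps memory linear in the length of the text.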

https://github.com/groverburger/sapling

An intuitive graphical linguistics syntax tree editor that runs in your browser.

editor linguistics sapling syntax tree

Last synced: 14 Oct 2024

https://github.com/matthias-stemmler/annimate

Annimate - Your Friendly ANNIS Match Exporter

application desktop linguistics react rust tauri typescript

Last synced: 18 Nov 2024

https://github.com/alicerunsonfedora/sniglet

Generate sniglets with machine learning!

abysima linguistics machine-learning word-generation

Last synced: 23 Oct 2024

https://github.com/ggteixeira/plural-generator

Linguistic algorithm whose main goal is to generate plurals for Brazilian Portuguese.

linguistics morphology nlp plural python

Last synced: 12 Nov 2024

https://github.com/tmalsburg/selfhost_ling_expts

A guide and templates for self-hosted experiments designed with jsPsych and served using Python

behavioral-research crowdsourcing jspsych linguistics psycholinguistics

Last synced: 28 Oct 2024

https://github.com/mounta11n/vowelreconstruct

An easy-to-use, easy-to-understand method for the average user to test various aspects of an LLM's intelligence in a single run.

guanaco linguistics llama llamacpp llm

Last synced: 07 Nov 2024

https://github.com/zamgi/lingvo--syntax-ru

Identification of the syntactic roles of words in a sentence in Russian-language text

linguistics lingvo natural-language-processing nlp nlp-machine-learning pos-tagging syntax syntax-analysis

Last synced: 05 Nov 2024

https://github.com/zamgi/lingvo--postagger-ru

Part-of-speech tagging / text normalization: reducing all words to their dictionary form in Russian-language text

linguistics lingvo morphological-analysis morphologies morphology natural-language-processing nlp nlp-machine-learning part-of-speech-tagging pos-tagger pos-tagging

Last synced: 05 Nov 2024

https://github.com/digitallinguistics/scription

A specification for formatting interlinear glossed texts in a way that is computationally parseable

digital-humanities digital-linguistics dlx documentary-linguistics glosses language language-documentation linguistics scription scription-files

Last synced: 30 Nov 2024

https://github.com/rshrc/words625

A personal app to teach oneself any language, Duolingo style.

flutter-apps kannada language learning linguistics

Last synced: 10 Nov 2024

https://github.com/davidfoerster/synesketch

Software library with synesthetic abilities, made for Processing digital artists. Its code serves as a medium between words, emotions, and images.

affective-computing java linguistics processing-library synesthesia

Last synced: 10 Nov 2024

https://github.com/boltomli/MyShinyApps

R apps that run on shinyapps.io or RStudio Connect

audio linguistics python r rstudio rstudio-connect shinyapps speech

Last synced: 22 Nov 2024

https://github.com/arjo129/langcluster

A visualization of cognates in various languages and how they spread

artificial-intelligence azure-functions clustering d3-visualization linguistic-analysis linguistics

Last synced: 10 Nov 2024

https://github.com/bluebie/nzsl-training-data-generator

Tool for reading the NZSL-Dictionary dataset and using the PoseNet ML model to extract information and images from videos of NZSL sign performances, in order to generate datasets for training CNNs to recognise traits of visual signed languages

linguistics ml nzsl posenet sign-language

Last synced: 22 Oct 2024

https://github.com/jweinst1/corplet

A binary-corpus system for word tagging

corpus-linguistics database linguistics nlp nlp-library

Last synced: 08 Nov 2024

https://github.com/dcavar/geoling

GeoLing: GIS app for mailing list announcements via LINGUIST List

django gis linguist-list linguistics listserver python

Last synced: 07 Nov 2024

https://github.com/orgtre/google-books-words

Words in the Google Books Ngram Corpus (v3, all languages) with metadata and Python code

dictionary google language-learning linguistics wordlist words

Last synced: 29 Nov 2024

https://github.com/nanxstats/tea-sea-cha-land

Spatial-temporal dataset on how the word "tea" spread around the globe: tea if by sea, cha if by land.

dataset linguistics map-visualization

Last synced: 16 Nov 2024

https://github.com/davidfoerster/kaleidok-examples

KaleidOk invites participants to use a new kind of interactive media tool and take part in an emerging experience which explores speech recognition, media retrieval and visual generation in a collaborative context (between people, and between people and machines).

affective-computing art java linguistics processing-library speech-processing synesthesia

Last synced: 10 Nov 2024

https://github.com/sergeyt/scraper

Declarative web scraper in JavaScript primarily designed to extract linguistics data

linguistics scraper

Last synced: 02 Nov 2024

https://github.com/stdlib-js/nlp

Standard library natural language processing.

javascript language lib library linguistics modeling natural nlp node node-js nodejs standard stdlib

Last synced: 20 Nov 2024

https://github.com/zamgi/lingvo--postagger-ner-ru-dnn

Part-of-speech tagging and named-entity recognition for the Russian language using a deep neural network, in C# for .NET

csharp deep-learning linguistics lingvo machine-learning morphology named-entity-recognition natural-language-processing ner net neural-network nlp nlp-machine-learning pos-tagger pos-tagging russian

Last synced: 05 Nov 2024