Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/agmmnn/syn
Get synonyms and antonyms of words from Thesaurus.com and other sources in your terminal, with rich output.
cli command-line datamuse dictionary linguistics python rich synonyms terminal thesaurus wordsearch
Last synced: 02 Jul 2024
![](https://github.com/agmmnn.png)
https://github.com/proiel/proiel-treebank
Official releases of the PROIEL treebank of ancient Indo-European languages
ancient-greek ancient-languages armenian corpus gothic2 language latin linguistics new-testament old-church-slavonic treebank
Last synced: 01 Jul 2024
![](https://github.com/proiel.png)
https://github.com/google/corpuscrawler
Crawler for linguistic corpora
corpus-builder corpus-linguistics crawling linguistics minority-language
Last synced: 30 Jun 2024
![](https://github.com/google.png)
https://github.com/willianantunes/transcriber-wrapper
Wrapper of well-known transcribers that transform text into phoneme codes
arpabet espeak-ng festival-speech-synthesis international-phonetic-alphabet ipa linguistics mypy pytest transcriber transcription
Last synced: 27 Jun 2024
![](https://github.com/willianantunes.png)
https://github.com/delph-in/pydmrs
A library for manipulating DMRS structures
computational-linguistics delph-in dependency-graph dmrs formal-semantics hpsg linguistics minimal-recursion-semantics mrs natural-language natural-language-processing nlp python semantics
Last synced: 20 Jun 2024
![](https://github.com/delph-in.png)
https://github.com/Tatoeba/tatoeba2
Tatoeba is a platform whose purpose is to create a collaborative and open dataset of sentences and their translations.
Last synced: 12 Jun 2024
![](https://github.com/Tatoeba.png)
https://github.com/tallguyjenks/runes
🧙‍♂️ ᚱᚢᚾᛖᛋ in your R Documents!
bryan-jenks cran elder-futhark-runes futhark futhark-runes linguistics nordic r rstats rstudio rune runes
Last synced: 10 Jun 2024
![](https://github.com/tallguyjenks.png)
https://github.com/rime/rime-cantonese
Rime Cantonese input schema | 粵語拼音輸入方案
cantonese cantonese-dictionary cantonese-language chinese chinese-language chinese-nlp input-method jyutping linguistics rime rime-schema
Last synced: 09 Jun 2024
![](https://github.com/rime.png)
https://github.com/geoffbacon/cerberus
Cerberus is an app that reduces the annotation burden of linguists
allennlp linguistics natural-language-processing streamlit
Last synced: 07 Jun 2024
![](https://github.com/geoffbacon.png)
https://github.com/xiamx/awesome-sentiment-analysis
A curated list of Sentiment Analysis methods, implementations and misc.
awesome-list deep-learning linguistics machine-learning nlp python sentiment-analysis supervised-machine-learning
Last synced: 31 May 2024
![](https://github.com/xiamx.png)
http://proycon.github.io/folia/
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this contains higher-level tools that use the library as well as the full documentation, validation schemas, and set definitions.
computational-linguistics corpus file-format folia language library linguistic-annotation-framework linguistics nlp python xml
Last synced: 31 May 2024
![](https://github.com/proycon.png)
https://github.com/what-studio/tossi
Chooses correct Korean particle morphs for arbitrary words.
korean linguistics localization python
Last synced: 27 May 2024
![](https://github.com/what-studio.png)
https://github.com/sublee/hangulize
Korean Alphabet Transcription
hangul korean linguistics localization python transcription translation
Last synced: 27 May 2024
![](https://github.com/sublee.png)
https://github.com/milangritta/Pragmatic-Guide-to-Geoparsing-Evaluation
Full resources supporting the publication "A Pragmatic Guide to Geoparsing Evaluation."
analysis data evaluation geocoder geocoding geography geoparser geoparsing google-cloud linguistics location machine-learning named-entity-recognition places spacy-nlp taxonomy toponym-resolution toponyms toponymy training-data
Last synced: 26 May 2024
![](https://github.com/milangritta.png)
https://github.com/proycon/flat
FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/folia), a rich XML-based format for linguistic annotation. Flat allows users to view annotated FoLiA documents and enrich these documents with new annotations; a wide variety of linguistic annotation types is supported through the FoLiA paradigm.
annotation-tool clariah clarin computational-linguistics folia javascript linguistic-annotation-framework linguistics nlp python web-application
Last synced: 22 May 2024
![](https://github.com/proycon.png)
https://github.com/ropensci/lingtypology
R package for linguistic cartography and typological databases search
abvd afbo atlas autotype bivaltyp clld glottolog-database linguistic-maps linguistics phoible r r-package sails typology wals
Last synced: 20 May 2024
![](https://github.com/ropensci.png)
https://github.com/tshatrov/ichiran
Linguistic tools for texts in Japanese language
common-lisp dictionary grammar japanese japanese-language language linguistics
Last synced: 19 May 2024
![](https://github.com/tshatrov.png)
https://github.com/derintelligence/en-az-parallel-corpus
English-Azerbaijani parallel language corpus
azerbaijan azerbaijani-translation corpus language linguistics nlp parallel translation
Last synced: 13 May 2024
![](https://github.com/derintelligence.png)
https://github.com/derintelligence/az-summarization
Abstractive summarization for Azerbaijani language
azerbaijan dataset language linguistics nlp summarization
Last synced: 13 May 2024
![](https://github.com/derintelligence.png)
https://github.com/proycon/colibri-core
Colibri core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.
c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing
Last synced: 12 May 2024
![](https://github.com/proycon.png)
https://github.com/proycon/pynlpl
PyNLPl, pronounced as 'pineapple', is a Python library for Natural Language Processing. It contains various modules useful for common, and less common, NLP tasks. PyNLPl can be used for basic tasks such as the extraction of n-grams and frequency lists, and to build simple language models. There are also more complex data types and algorithms. Moreover, there are parsers for file formats common in NLP (e.g. FoLiA/Giza/Moses/ARPA/Timbl/CQL). There are also clients to interface with various NLP-specific servers. PyNLPl most notably features a very extensive library for working with FoLiA XML (Format for Linguistic Annotation).
computational-linguistics evaluation-metrics folia language-modelling library linguistics machine-learning natural-language-processing nlp nlp-library python search-algorithms text-processing
Last synced: 12 May 2024
![](https://github.com/proycon.png)
https://github.com/hangulize/hangulize
Hangulize transcribes non-Korean words into Hangul
korean linguistics transcription
Last synced: 08 May 2024
![](https://github.com/hangulize.png)
https://github.com/psychopy/psychopy
For running psychology and neuroscience experiments
experiment experiment-control experimental-design linguistics neuroscience psycholinguistics psychology psychophysics psychopy python science
Last synced: 08 May 2024
![](https://github.com/psychopy.png)
https://github.com/theimpossibleastronaut/awesome-linguistics
A curated list of anything remotely related to linguistics
awesome-list language linguistics resources
Last synced: 05 May 2024
![](https://github.com/theimpossibleastronaut.png)
https://github.com/meyersbs/SPLAT
Speech Processing & Linguistic Analysis Tool
berkeley-parser command-line-tool library linguistics natural-language-processing nltk pypi
Last synced: 04 May 2024
![](https://github.com/meyersbs.png)
https://github.com/hbuschme/TextGridTools
Read, write, and manipulate Praat TextGrid files with Python
annotation data-analysis elan linguistics praat python textgrid
Last synced: 02 May 2024
![](https://github.com/hbuschme.png)
https://github.com/jacksonllee/pycantonese
Cantonese Linguistics and NLP
cantonese computational-linguistics jyutping linguistics natural-language-processing nlp part-of-speech-tagging pycantonese python stop-words word-segmentation
Last synced: 01 May 2024
![](https://github.com/jacksonllee.png)
https://github.com/dveselov/mystem
CGo bindings to Yandex.Mystem
cgo-bindings linguistics mystem russian-specific
Last synced: 29 Apr 2024
![](https://github.com/dveselov.png)
https://github.com/pyconll/pyconll
A minimal, pure Python library to interface with CoNLL-U format files.
annotation conllu dependency-parsing linguistics minimal python universal-dependencies
Last synced: 27 Apr 2024
![](https://github.com/pyconll.png)
https://github.com/josecannete/spanish-corpora
Unannotated Spanish 3 Billion Words Corpora
corpora linguistics natural-language-processing nlp spanish spanish-language
Last synced: 27 Apr 2024
![](https://github.com/josecannete.png)
https://github.com/MaxBittker/nyt-first-said
Tweets when words are published for the first time in the NYT
civic-tech journalism linguistics newsroom politics python scraper twitter
Last synced: 26 Apr 2024
![](https://github.com/MaxBittker.png)
https://github.com/DevelopersTree/KurdishResources
A repository for resources in Kurdish Language
bot kurdish kurdish-oss linguistics wordlist
Last synced: 23 Apr 2024
![](https://github.com/DevelopersTree.png)
https://github.com/CoEDL/elpis
Software for creating speech recognition models.
automatic-speech-recognition computational-linguistics docker kaldi linguistics python transcription
Last synced: 17 Apr 2024
![](https://github.com/CoEDL.png)
https://github.com/proycon/folia
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this contains higher-level tools that use the library as well as the full documentation, validation schemas, and set definitions.
computational-linguistics corpus file-format folia language library linguistic-annotation-framework linguistics nlp python xml
Last synced: 17 Apr 2024
![](https://github.com/proycon.png)
https://github.com/digitallinguistics/data-format
The Data Format for Digital Linguistics (DaFoDiL)
corpora corpus-linguistics daffodil digital-humanities digital-linguistics dlx dlx-format json json-schema language languages linguistics natural-language schema
Last synced: 17 Apr 2024
![](https://github.com/digitallinguistics.png)
https://github.com/CUNY-CL/wikipron
Massively multilingual pronunciation mining
computational-linguistics g2p language linguistics nlp phonetics phonology pronunciation python-api scraped-data speech
Last synced: 17 Apr 2024
![](https://github.com/CUNY-CL.png)
https://github.com/korpling/pepper
A highly extensible platform for conversion and manipulation of linguistic data between an unbound set of formats. Pepper can be used stand-alone as a command line interface, or be integrated as an API into other software products.
annotations converter format java linguistic-formats linguistics nlp pepper
Last synced: 17 Apr 2024
![](https://github.com/korpling.png)
https://github.com/dativebase/old
The Online Linguistic Database (OLD): software for linguistic fieldwork.
data-management linguistics linguistics-field python
Last synced: 17 Apr 2024
![](https://github.com/dativebase.png)
https://github.com/dativebase/old-pyramid
Online Linguistic Database (OLD)
linguistics linguistics-databases pyramid-framework python3
Last synced: 17 Apr 2024
![](https://github.com/dativebase.png)
https://github.com/sillsdev/libpalaso
Palaso Library: A set of .Net libraries useful for developers of Language Software.
hacktoberfest languages linguistics linux windows
Last synced: 17 Apr 2024
![](https://github.com/sillsdev.png)
https://github.com/dativebase/dative
Dative: software for linguistic fieldwork
Last synced: 17 Apr 2024
![](https://github.com/dativebase.png)
https://github.com/boltomli/MyShinyApps
R apps that run on shinyapps.io or RStudio Connect
audio linguistics python r rstudio rstudio-connect shinyapps speech
Last synced: 13 Apr 2024
![](https://github.com/boltomli.png)
https://github.com/LexPredict/lexpredict-lexnlp
LexNLP by LexPredict
analytics contracts data law legal legaltech linguistics ml nlp
Last synced: 13 Apr 2024
![](https://github.com/LexPredict.png)
https://github.com/yohasebe/rsyntaxtree
Syntax tree generator for linguistic research
linguistics ruby rubynlp svg syntax-tree visualization
Last synced: 10 Apr 2024
![](https://github.com/yohasebe.png)
https://gitlab.com/smc/mlmorph
Malayalam Morphological Analyzer using Finite State Transducer https://morph.smc.org.in
Malayalam fst hfst linguistics morphology analyser sfst
Last synced: 03 Apr 2024
https://github.com/koskenni/beta
An open source reimplementation of Benny Brodda's BETA in Python
benny-brodda beta corpus-tools hyphenation linguistics open-source string-manipulation string-rewriting
Last synced: 01 Apr 2024
![](https://github.com/koskenni.png)
https://github.com/open-dict-data/ipa-dict
Monolingual wordlists with pronunciation information in IPA
dictionaries g2p grapheme-to-phoneme ipa ipa-data ipa-dictionary language linguistics phonemic-transcription phonetic-transcriptions wordlist
Last synced: 24 Mar 2024
![](https://github.com/open-dict-data.png)
https://github.com/nonamestreet/weixin_public_corpus
WeChat official account corpus (微信公众号语料库)
chinese-nlp corpora corpus linguistics natural-language-processing nlp wei-xin weixin weixin-data yu-liao yu-liao-ku
Last synced: 21 Mar 2024
![](https://github.com/nonamestreet.png)