Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by LanguageMachines
A curated list of projects in awesome lists by LanguageMachines.
https://github.com/LanguageMachines/frog
Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on TiMBL, the Tilburg memory-based learning software package.
computational-linguistics dependency-parser dutch folia lemmatiser morphological-analyser morphology named-entity-recognition natural-language-processing nlp pos-tagger syntax text-processing
Last synced: 31 Jul 2024
https://github.com/LanguageMachines/ucto
Unicode tokeniser. Ucto tokenises text files: it separates words from punctuation and splits sentences. It also offers several basic preprocessing steps, such as changing case, that you can use to make your text suited for further processing such as indexing, part-of-speech tagging, or machine translation. Ucto comes with tokenisation rules for several languages and can be easily extended to suit other languages. It has been incorporated for tokenising Dutch text in Frog, our Dutch morpho-syntactic processor. http://ilk.uvt.nl/ucto
computational-linguistics folia language natural-language-processing nlp punctuation tokeniser
Last synced: 31 Jul 2024
https://github.com/LanguageMachines/PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow
Last synced: 01 Aug 2024
https://languagemachines.github.io/timbl/
TiMBL implements several memory-based learning algorithms.
c-plus-plus classification decision-tree ib1 ib1-ig igtree k-nearest-neighbours knn learning-algorithm learning-algorithms machine-learning nearest-neighbours timbl
Last synced: 03 Aug 2024
https://github.com/LanguageMachines/libfolia
FoLiA library for C++
folia library natural-language-processing nlp
Last synced: 31 Jul 2024