Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with ngram

A curated list of projects in awesome lists tagged with ngram.

https://github.com/zhezhaoa/ngram2vec

Four word embedding models implemented in Python, supporting arbitrary context features.

analogy chinese embedding glove n-gram ngram ngram2vec ppmi svd word word-embedding word2vec

Last synced: 01 Aug 2024

https://github.com/lonePatient/albert_pytorch

A Lite BERT for Self-Supervised Learning of Language Representations

albert bert language-model mask ngram nlp pytorch

Last synced: 01 Aug 2024

https://github.com/wintermute-cell/ngrrram

A TUI tool to help you type faster and learn new layouts. Includes a free cat.

cat cli colemak dvorak layout ngram rust touchtyping tui typing

Last synced: 01 Aug 2024

https://github.com/ranelpadon/ngram-type

Touch typing trainer using N-grams as data source, with options to customize the auto-generated lessons and specify the minimum typing performance needed. There are sound/color effects as well.

amphetype colemak dvorak keybr lesson-generator monkeytype ngram norman qwerty touch-typing vue

Last synced: 03 Aug 2024

https://github.com/proycon/colibri-core

Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e., patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool ``colibri-patternmodeller``, which allows you to build, view, manipulate and query pattern models.

c-plus-plus computational-linguistics corpus library linguistics ngram ngrams nlp pattern-recognition python skipgram text-processing

Last synced: 29 Sep 2024

https://github.com/ChrisMuir/refinr

Cluster and merge similar string values: an R implementation of OpenRefine clustering algorithms

approximate-string-matching clustering cran data-cleaning data-clustering fuzzy-matching ngram openrefine r rstats

Last synced: 31 Jul 2024

https://github.com/wrathematics/ngram

Fast n-Gram Tokenization

ngram r text text-mining

Last synced: 30 Jul 2024

https://github.com/vickumar1981/stringdistance

A fuzzy matching string distance library for Scala and Java that includes Levenshtein distance, Jaro distance, Jaro-Winkler distance, Dice coefficient, N-Gram similarity, Cosine similarity, Jaccard similarity, Longest common subsequence, Hamming distance, and more.

cosine-similarity cosine-similarity-scores dice-coefficient fuzzy-matching hacktoberfest hamming-distance jaccard jaccard-similarity jaro jaro-distance jaro-winkler jaro-winkler-distance levenshtein levenshtein-distance longest-common-subsequence ngram sorensen-dice-distance soundex soundex-algorithm string-similarity

Last synced: 02 Oct 2024

https://github.com/JackHCC/Chinese-Tokenization

Chinese word segmentation implemented with traditional methods (n-gram, HMM, etc.), neural network methods (CNN, LSTM, etc.) and pre-training methods (BERT, etc.).

bert-crf bilstm-crf hmm-viterbi-algorithm ngram nlp tokenization

Last synced: 03 Aug 2024

https://github.com/0xVavaldi/gramify

Create n-grams of wordlists based on words, characters, or charsets to use in offline password attacks and data analysis

hashcat jtr mdxfind ngram password password-analysis password-cracking

Last synced: 01 Aug 2024

https://github.com/mkearney/chr

🔤 Lightweight R package for manipulating [string] characters

character chr extract mkearney-r-package ngram r r-package regex rstats string-manipulation strings text-processing

Last synced: 13 Aug 2024

https://github.com/mochi-co/ngrams

A Go n-gram indexer for natural language processing with modular tokenizers and data stores

bigrams go golang language-model natural-language-processing ngram ngrams nlp tokenization trigrams

Last synced: 02 Aug 2024

https://github.com/euskadi31/go-ngram

An n-gram is a contiguous sequence of n items from a given sequence of text or speech.

go golang golang-library machine-learning ngram ngram-analysis ngrams nlp

Last synced: 02 Aug 2024
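
For readers new to the tag, the definition quoted in the last entry (a contiguous sequence of n items from a given sequence) can be illustrated with a short sketch. This is plain Python written for this page, not the API of any project listed above; the function name `ngrams` and the sample inputs are assumptions chosen for illustration.

```python
from typing import Iterator, Sequence, Tuple, TypeVar

T = TypeVar("T")


def ngrams(items: Sequence[T], n: int) -> Iterator[Tuple[T, ...]]:
    """Yield every contiguous run of n items from `items` (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence of length L has L - n + 1 windows of size n (none if L < n).
    for start in range(len(items) - n + 1):
        yield tuple(items[start:start + n])


# Word-level bigrams: the items are whitespace-separated tokens.
words = "the quick brown fox".split()
bigrams = list(ngrams(words, 2))

# Character-level trigrams: the items are the characters of one string.
char_trigrams = ["".join(gram) for gram in ngrams("ngram", 3)]
```

The same function covers both word-level and character-level n-grams because it only assumes a sliceable sequence; what counts as an "item" is decided by the caller's tokenization.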