Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by stephantul
A curated list of projects in awesome lists by stephantul.
https://github.com/stephantul/somber
Recursive Self-Organizing Map/Neural Gas.
cython kohonen machine-learning neural-gas ng plsom recsom recurrent-neural-networks som unsupervised
Last synced: 08 Feb 2025
https://github.com/stephantul/reach
Load embeddings and featurize your sentences.
embeddings numpy vectorization word2vec
Last synced: 08 Feb 2025
https://github.com/stephantul/piecelearn
Learning BPE embeddings by first learning a segmentation model and then training word2vec
bpe embeddings sentencepiece word2vec wordpiece
Last synced: 08 Feb 2025
https://github.com/stephantul/unitoken
Tokenization across languages. Useful as preprocessing for subword tokenization.
Last synced: 08 Feb 2025
https://github.com/stephantul/quickumls_pred
Predict semantic types using QuickUMLS
Last synced: 08 Feb 2025
https://github.com/stephantul/old20
Calculate Yarkoni, Balota & Yap's OLD20.
Last synced: 08 Feb 2025
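As an illustration of the measure named above: OLD20 is commonly defined as the mean Levenshtein distance from a word to its 20 closest neighbors in a lexicon. The sketch below follows that definition. The function names `levenshtein` and `old_n` are hypothetical and are not taken from the old20 repository, whose actual interface may differ.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,                       # deletion
                    current[j - 1] + 1,                    # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def old_n(word: str, lexicon: list[str], n: int = 20) -> float:
    """Mean distance from `word` to its n closest neighbors (hypothetical helper)."""
    # The word itself is excluded so it does not count as its own neighbor.
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    if len(distances) < n:
        raise ValueError("lexicon has fewer than n other words")
    return sum(distances[:n]) / n
```

With `n=20` this gives the OLD20 value for a word. The brute-force comparison against every lexicon entry is quadratic over a full lexicon, so a real implementation would likely use a faster distance routine.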
https://github.com/stephantul/orst
A pixel sorting program, written in Python 3.x.
Last synced: 08 Feb 2025
https://github.com/stephantul/vicinage
Fast implementations of various string- and vector-based neighborhood metrics
Last synced: 08 Feb 2025
https://github.com/stephantul/trnsps
Transpose words
deletion levenshtein levenshtein-distance spelling substitution transposition
Last synced: 08 Feb 2025
https://github.com/stephantul/tacosdetection
Contains the supplementary materials from the paper: "A Dictionary-based Approach to Racism Detection in Dutch Social Media", under review for the TACOS workshop at LREC 2016.
Last synced: 08 Feb 2025
https://github.com/stephantul/ruly
A short script to generate stuff based on binary cellular automata.
Last synced: 08 Feb 2025
https://github.com/stephantul/torchic
Simple linear thing in Torch, with a scikit-learn compatible API.
Last synced: 08 Feb 2025
https://github.com/stephantul/lrec2018
Code for the experiments in the LREC 2018 paper "WordKit: a Python Package for Orthographic and Phonological Featurization"
Last synced: 08 Feb 2025
https://github.com/stephantul/hashing_split
Stable train/test splits using hashing
Last synced: 08 Feb 2025
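The general idea behind hash-based splitting can be sketched as follows: each item is assigned to a split by hashing a stable key, so the assignment does not change when the dataset is reordered or extended. This is a minimal sketch under assumptions, not the hashing_split repository's API. The function name `assign_split`, the choice of SHA-256, and the `test_fraction` parameter are all hypothetical.

```python
import hashlib


def assign_split(key: str, test_fraction: float = 0.2) -> str:
    """Deterministically map a key to 'train' or 'test' (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it into [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


# Assumed example data: 1000 document identifiers.
ids = [f"doc-{i}" for i in range(1000)]
splits = {item_id: assign_split(item_id) for item_id in ids}
```

Because the split depends only on the key, adding new items never moves existing ones between train and test. The realized test share is only approximately `test_fraction`.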