Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by stephantul

A curated list of projects in awesome lists by stephantul.

https://github.com/stephantul/reach

Load embeddings and featurize your sentences.

embeddings numpy vectorization word2vec

Last synced: 08 Feb 2025

https://github.com/stephantul/piecelearn

Learning BPE embeddings by first learning a segmentation model and then training word2vec

bpe embeddings sentencepiece word2vec wordpiece

Last synced: 08 Feb 2025

https://github.com/stephantul/unitoken

Tokenization across languages. Useful as preprocessing for subword tokenization.

Last synced: 08 Feb 2025

https://github.com/stephantul/quickumls_pred

Predict semantic types using QuickUMLS

Last synced: 08 Feb 2025

https://github.com/stephantul/plate

holographic reduced representations

Last synced: 08 Feb 2025

https://github.com/stephantul/old20

Calculate Yarkoni, Balota & Yap's OLD20.

Last synced: 08 Feb 2025
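
For readers unfamiliar with the measure: OLD20 is the mean Levenshtein (edit) distance from a word to its 20 closest orthographic neighbors in a lexicon. The sketch below illustrates that definition only. It is not taken from the old20 repository, and the function names `levenshtein` and `old_n` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def old_n(word: str, lexicon: list[str], n: int = 20) -> float:
    """Mean edit distance from word to its n closest neighbors in lexicon.

    Hypothetical helper: the word itself is excluded from its own neighbors.
    """
    distances = sorted(levenshtein(word, other) for other in lexicon if other != word)
    closest = distances[:n]
    if not closest:
        raise ValueError("lexicon contains no other words")
    return sum(closest) / len(closest)
```

With n=20 this gives OLD20. A lexicon of realistic size would likely need a faster neighbor search than this all-pairs loop.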

https://github.com/stephantul/orst

A pixel sorting program, written in Python 3.x.

art pixel-sorting

Last synced: 08 Feb 2025

https://github.com/stephantul/vicinage

Fast implementations of various string- and vector-based neighborhood metrics

Last synced: 08 Feb 2025

https://github.com/stephantul/cobweb

cobweb plot

Last synced: 08 Feb 2025

https://github.com/stephantul/wavesom

Base part of the global space model.

dynamic som systems

Last synced: 08 Feb 2025

https://github.com/stephantul/tacosdetection

Contains the supplementary materials from the paper: "A Dictionary-based Approach to Racism Detection in Dutch Social Media", under review for the TACOS workshop at LREC 2016.

Last synced: 08 Feb 2025

https://github.com/stephantul/ruly

A short script to generate stuff based on binary cellular automata.

automata wolfram

Last synced: 08 Feb 2025
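
As an illustration of what a binary cellular automaton does, here is a minimal sketch of a one-dimensional, two-state automaton driven by a Wolfram rule number. It assumes the elementary (three-cell neighborhood) variant with wrap-around edges, which the listing does not confirm. The names `step` and `run` are hypothetical and not taken from the ruly repository.

```python
def step(cells: list[int], rule: int) -> list[int]:
    """Apply one update of an elementary cellular automaton.

    cells holds 0/1 values; rule is a Wolfram rule number in 0..255.
    Edges wrap around (an assumption made for this sketch).
    """
    size = len(cells)
    updated = []
    for i in range(size):
        left = cells[(i - 1) % size]
        center = cells[i]
        right = cells[(i + 1) % size]
        # The neighborhood read as a 3-bit number selects one bit of the rule.
        index = (left << 2) | (center << 1) | right
        updated.append((rule >> index) & 1)
    return updated


def run(cells: list[int], rule: int, generations: int) -> list[list[int]]:
    """Return the initial row followed by the requested number of generations."""
    history = [list(cells)]
    for _ in range(generations):
        history.append(step(history[-1], rule))
    return history


# Print a few generations of rule 90 starting from a single live cell.
for row in run([0, 0, 0, 1, 0, 0, 0], rule=90, generations=3):
    print("".join("#" if cell else "." for cell in row))
```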

https://github.com/stephantul/torchic

Simple linear thing in Torch, with a scikit-learn compatible API.

Last synced: 08 Feb 2025

https://github.com/stephantul/rd

representation distance

Last synced: 08 Feb 2025

https://github.com/stephantul/lrec2018

Code for the experiments in the LREC 2018 paper "WordKit: a Python Package for Orthographic and Phonological Featurization"

Last synced: 08 Feb 2025

https://github.com/stephantul/hashing_split

Stable train/test splits using hashing

Last synced: 08 Feb 2025
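
The idea behind a hashing-based split can be sketched as follows: each example's identifier is hashed, and the hash value alone decides whether the example goes to train or test, so the assignment stays the same across runs and as the dataset grows. This is a minimal sketch under that assumption, not the hashing_split repository's API. The function name `assign_split` and its parameters are hypothetical.

```python
import hashlib


def assign_split(key: str, test_fraction: float = 0.2) -> str:
    """Deterministically assign a key to 'train' or 'test'.

    Hypothetical helper: uses SHA-256 so the result does not depend on
    Python's per-process hash randomization.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    threshold = int(test_fraction * 2**64)
    return "test" if bucket < threshold else "train"


examples = ["doc-1", "doc-2", "doc-3", "doc-4"]
splits = {key: assign_split(key) for key in examples}
```

Because the assignment depends only on the key, adding new examples never moves existing ones between splits. The realized test share only approximates test_fraction, most noticeably on small datasets.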

https://github.com/stephantul/charlm

Character LM

Last synced: 08 Feb 2025