Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/akretion/nfelib

nfelib - Python bindings for reading and managing XML for NF-e, national NFS-e, CT-e, MDF-e, BP-e

bpe brasil cte mdfe nfe nfse nota-fiscal-eletronica python sped

Last synced: 04 Jul 2024

https://github.com/ankane/youtokentome-ruby

High performance unsupervised text tokenization for Ruby

bpe byte-pair-encoding npl tokenization unsupervised-learning word-segmentation

Last synced: 02 Jun 2024

https://github.com/zurawiki/tiktoken-rs

Ready-made tokenizer library for working with GPT and tiktoken

bpe openai rust tokenizer

Last synced: 29 Apr 2024

https://github.com/niieani/gpt-tokenizer

JavaScript BPE Tokenizer Encoder Decoder for OpenAI's GPT-2 / GPT-3 / GPT-4. Port of OpenAI's tiktoken with additional features.

bpe decoder encoder gpt-2 gpt-3 gpt-4 machine-learning openai tokenizer

Last synced: 08 Apr 2024

https://github.com/vkcom/youtokentome

Unsupervised text tokenizer focused on computational efficiency

bpe natural-language-processing nlp tokenization word-segmentation

Last synced: 23 Mar 2024

https://github.com/rsennrich/subword-nmt

Unsupervised Word Segmentation for Neural Machine Translation and Text Generation

bpe machine-translation neural-machine-translation nmt segmentation subword-units

Last synced: 17 Mar 2024