Projects in Awesome Lists tagged with text-normalization
A curated list of projects in awesome lists tagged with text-normalization.
https://github.com/jfilter/clean-text
🧹 Python package for text cleaning
natural-language-processing nlp python python-package scraping text-cleaning text-normalization text-preprocessing user-generated-content
Last synced: 15 May 2025
https://github.com/nvidia/nemo-text-processing
NeMo text processing for ASR and TTS
inverse-text-n text-normalization
Last synced: 13 Apr 2025
https://github.com/ikegami-yukino/neologdn
Japanese text normalizer for mecab-neologd
japanese-language mecab-ipadic-neologd nlp preprocessing text-normalization
Last synced: 12 Mar 2025
https://github.com/snakers4/russian_stt_text_normalization
Russian text normalization pipeline for speech-to-text and other applications based on tagging s2s networks
python3 pytorch russian-language speech speech-to-text text-normalization torchscript
Last synced: 27 Nov 2024
https://github.com/greenlikeorange/knayi-myscript
Myanmar Language Script Library
burmese-nlp fontconvert fontdetect myanmar text-normalization unicode zawgyi
Last synced: 14 Mar 2025
https://github.com/tomaarsen/ttstextnormalization
Convert English text from written expressions into spoken forms
competition nlp normalization spoken-forms text-normalization tts
Last synced: 23 Apr 2025
https://github.com/sugatagh/E-commerce-Text-Classification
Proper categorization of e-commerce products enhances the user experience and achieves better results with external search engines. The objective of the project is to classify a product into four given categories, based on its description available on an e-commerce platform.
e-commerce natural-language-processing product-categorization text-classification text-normalization tf-idf word2vec
Last synced: 13 Apr 2025
https://github.com/seavleu/khmer-utils
A 🇰🇭 utility library for number formatting, currency display, date localization, text normalization, and script transliteration, built for Cambodian developers.
currency-conversion date-localization khmer-language locale number-formatting text-normalization transliteration
Last synced: 10 Apr 2025
https://github.com/mvakili/tokenizer
Spelling corrector and text normalizer
natural-language-processing text-normalization tokenizer
Last synced: 14 Apr 2025
https://github.com/curegit/unicodecheck
Simple tool to check if Unicode text files are Unicode-normalized
character-encoding text-normalization unicode
Last synced: 28 Jan 2025
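
Several entries above deal with Unicode normalization. As a rough illustration of what "checking if text is Unicode-normalized" can mean, here is a minimal sketch using only Python's standard unicodedata module. It is not taken from any of the listed projects; the helper name is_nfc is hypothetical, and NFC is assumed as the target form purely for the example.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Return True if text is already in NFC (hypothetical helper for illustration)."""
    # A string is NFC-normalized exactly when normalizing it changes nothing.
    return unicodedata.normalize("NFC", text) == text


# 'e' followed by a combining acute accent (two code points, decomposed form).
decomposed = "cafe\u0301"
# NFC composes the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

print(is_nfc(decomposed))
print(is_nfc(composed))
```

The two strings render identically but compare unequal until one of them is normalized, which is why tools in this category check files before they are compared, indexed, or searched.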