Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
awesome-khmer-language
A large collection of Khmer language resources. Khmer is the official language of Cambodia.
https://github.com/seanghay/awesome-khmer-language
-
Awesome Khmer Language
-
6. Blog / Slides
- Building a Khmer Spelling Checker
- Introduction to kNN algorithm by experiment on Khmer Handwriting classification using Java 8
- Issues in Khmer syllable validation
- Khmer Machine Learning (ML) Experiment
- How domnung.com Ranks Khmer News
- Text Classification with scikit-learn on Khmer Documents
- Multi-Class Text Classification on Khmer News Articles
- Word Segmentation of Khmer Text Using Conditional Random Fields
- NLP: Text Segmentation Using Dictionary Based Algorithms
- NLP: Text Segmentation with Ngram
- NLP: Text Segmentation Using Naive Bayes
- NLP: Text Segmentation Using Hidden Markov Model
- NLP: Text Segmentation Using Maximum Entropy Markov Model
- NLP: Text Segmentation Using Conditional Random Fields
- Khmer Language Model Using ULMFiT (Feb 2020)
- Creating a Khmer Language Model using BERT
- khmerlang.com
- Khmer word spell correction using BK-Tree data structure and Levenshtein distance
- Adding Khmer Script to Unicode (1996 document)
- Using AI to Generate Khmer Baby Names
-
3. Datasets
- Khmer Bible Recordings
- ParaCrawl Corpus
- Asian Language Treebank (ALT) Project
- google/language-resources
- Illustrations and recordings for language learning
- seanghay/khmer-dictionary-44k
- seanghay/km-speech-corpus
- seanghay/bookmebus-reviews
- seanghay/khmer_mpwt_speech
- seanghay/khmer_kheng_info_speech
- seanghay/khmer_grkpp_speech
- High quality TTS data for Khmer
- Google FLEURS
- mc4
- Khmer tesseract-ocr
- Khmerlang Mobile Keyboard data
- Khmer annotation
- khPOS (Khmer Part-of-Speech) Corpus for Khmer NLP Research and Developments
- phylypo/segmentation-crf-khmer
- Khmer LineBreaking Dictionary
- SleukRith Set
-
1. Specification
-
2. Toolkit
- khmer-dictionary-tools
- automatic-phonemic-and-phonetic-transcription
- Khmer Word Segmentation - Rina Buoy
- Khmer natural language processing toolkit
- Khmer Limon to Unicode
- seanghay/split-khmer
- seanghay/khmertokenizer
- seanghay/khmerword
- seanghay/khmernumber
- seanghay/khmernormalizer
- khmer-ocr-benchmark-dataset
- Khmer utility functions
- Trey314159/KhmerSyllableReordering
- nota/split-graphemes
- khmercut
- Socret360/akara-python - Source Khmer Spell Checker
- khmer-latin-name-transformer
- native-khmer-g2p
- khmerphonemizer
- kfa
- khmer-unicode-converter
- khmerpunctuate
- khmerocr_tools
- Socret360/jaws
- seanghay/khmersegment
- seanghay/khmerpronounce
- seanghay/khmer2number
- sillsdev/khmer-normalizer - khmer-encoding.pdf
- NextSpell - Spell checking, Khmer OCR, word segmentation
- sosap (សូរសព្ទ)
- seanghay/khmer-acoustic-model-mfa
- seanghay/tha - A Khmer Text Normalization and Verbalization Toolkit
-
4. Research Papers
- An End-to-End Khmer Optical Character Recognition using Sequence-to-Sequence with Attention
- Khmer Word Search: Challenges, Solutions, and Semantic-Aware Search
- Khmer Text Classification Using Word Embedding and Neural Networks
- Joint Khmer Word Segmentation and Part-of-Speech Tagging Using Deep Learning
- Building WFST based Grapheme to Phoneme Conversion for Khmer
- Query Expansion for Khmer Information Retrieval
- Building a Syllable Database to Solve the Problem of Khmer Word Segmentation
- Khmer Word Segmentation based on Bi-Directional Maximal Matching for Plaintext and Microsoft Word Document
- Khmer printed character recognition using attention-based Seq2Seq network
- Khmer Word Segmentation Using Conditional Random Fields
- A Large-scale Study of Statistical Machine Translation Methods for Khmer Language
- A Rule-based Approach for Khmer Word Extraction
- The Standard Khmer vowel system: An acoustic study
- Towards deep learning on speech recognition for Khmer language
- A review of Khmer word segmentation and part-of-speech tagging and an experimental study using bidirectional long short-term memory
- Bi-directional Maximal Matching Algorithm to Segment Khmer Words in Sentence
- Detection and Correction of Homophonous Error Word for Khmer Language
- No Language Left Behind (NLLB)
- Phonological Principles And Automatic Phonemic And Phonetic Transcription Of Khmer Words
- Multi-lingual Transformer Training for Khmer Automatic Speech Recognition
- TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies
- Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language
- Khmer pronouncing dictionary: standard Khmer and Phnom Penh dialect
- ViTSTR-Transducer: Cross-Attention-Free Vision Transformer Transducer for Scene Text Recognition
- Explainable Connectionist-Temporal-Classification-Based Scene Text Recognition
- Toward a Low-Resource Non-Latin-Complete Baseline: An Exploration of Khmer Optical Character Recognition
-
5. Projects/Models
- facebookresearch/fairseq/mms
- Khmer Language Model using ULMFiT
- KHMER WORD SEARCH BASE ON SEMANTIC RELATION
- Khmer Audio Dictionary
- Khmer to IPA Converter
- Khmer Phonemizer
- Khmer Text-to-Speech MMS
- Khmer Part of Speech Tagging with XLM RoBERTa
- Whisper Small Khmer Fine-tuned
- vitouphy/wav2vec2-xls-r-300m-khmer
- vitouphy/wav2vec2-xls-r-1b-khmer
- Khmer Text Classification
- Fast Khmer Dictionary
- Khmer Single Word TTS
- SeaLLMs
-
7. Misc
-
Keywords
- khmer (17)
- cambodia (9)
- khmer-language (5)
- nlp (3)
- phonetisaurus (2)
- g2p (2)
- khmernumber (2)
- khmer-unicode (2)
- crf (2)
- normalization (1)
- tokenizer (1)
- nodejs (1)
- word-segmenter (1)
- sentence-segmenter (1)
- segmentation (1)
- part-of-speech-tagging (1)
- nlp-library (1)
- phonological regularities (1)
- phonetic transcription (1)
- phonemic transcription (1)
- automatic transcription (1)
- Thrax transducer (1)
- Khmer language (1)
- python (1)
- word-segmentation (1)
- crfpp (1)
- xlm-roberta (1)
- sentence-segmentation (1)
- punctuation-restoration (1)
- khmer-punct (1)
- khmer-limon (1)
- wav2vec2 (1)
- forced-alignment (1)
- alignment (1)
- phonemes (1)
- thrax (1)
- openfst (1)
- pynini (1)
- crfsuite (1)
- ocr (1)
- dataset (1)
- verbalization (1)
- normalizer (1)