https://github.com/tuan6100/language-identification-tool
machine-learning n-grams naive-bayes-classifier
Last synced: 5 months ago
- Host: GitHub
- URL: https://github.com/tuan6100/language-identification-tool
- Owner: tuan6100
- Created: 2025-05-05T13:32:49.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-05-15T15:49:51.000Z (5 months ago)
- Last Synced: 2025-05-15T16:28:18.331Z (5 months ago)
- Topics: machine-learning, n-grams, naive-bayes-classifier
- Language: Python
- Homepage:
- Size: 7.75 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Language Identification
A language identification model based on a Naive Bayes classifier.
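
To give a feel for the general technique, here is a minimal sketch of a multinomial Naive Bayes classifier over character n-grams, written in plain Python. It is not this repository's implementation: the names `char_ngrams` and `NgramNaiveBayes`, the trigram size, the Laplace smoothing constant, and the two toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Illustrative multinomial Naive Bayes over character n-gram counts."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha                        # Laplace smoothing constant (assumed)
        self.ngram_counts = defaultdict(Counter)  # language -> n-gram counts
        self.doc_counts = Counter()               # language -> number of training texts
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def log_scores(self, text):
        """Return log P(language) + sum of log P(n-gram | language) per language."""
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, counts in self.ngram_counts.items():
            total = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in grams:
                # Smoothed likelihood, so unseen n-grams do not zero out the product.
                score += math.log(
                    (counts[gram] + self.alpha) / (total + self.alpha * vocab_size)
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, far too small for real use.
model = NgramNaiveBayes(n=3).fit(
    [
        "the quick brown fox jumps over the lazy dog",
        "le renard brun saute par-dessus le chien paresseux",
    ],
    ["en", "fr"],
)
print(model.predict("the lazy dog"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps an n-gram that never appeared for a language from ruling that language out entirely.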
## ⚡ CUDA Acceleration (Optional)
If you'd like to leverage **CUDA** to accelerate computations:
1. Download and install the [NVIDIA CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
2. Ensure that the `$CUDA_PATH` environment variable is properly set.
3. Then run the following commands to complete the setup:

```bash
pip install nvidia-cuda-runtime-cu12
pip install -U pip setuptools wheel
```

You may need to restart your computer for all CUDA features to take effect.
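
As a quick sanity check for step 2, a short script along these lines can report whether `CUDA_PATH` is set and whether `nvcc` is discoverable. The helper name `cuda_environment_report` is hypothetical and not part of this project. It only inspects environment variables and does not verify that a GPU or driver actually works.

```python
import os
import shutil


def cuda_environment_report(env=None):
    """Summarize CUDA-related settings found in the given environment mapping."""
    env = os.environ if env is None else env
    cuda_path = env.get("CUDA_PATH")
    return {
        "CUDA_PATH": cuda_path,
        # True only if the variable is set and points at an existing directory.
        "cuda_path_exists": bool(cuda_path) and os.path.isdir(cuda_path),
        # True if the nvcc compiler can be found on the search path.
        "nvcc_on_path": shutil.which("nvcc", path=env.get("PATH")) is not None,
    }


if __name__ == "__main__":
    for key, value in cuda_environment_report().items():
        print(f"{key}: {value}")
```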
## References
- [https://www.kaggle.com/code/mehmetlaudatekman/naive-bayes-based-language-identification-system](https://www.kaggle.com/code/mehmetlaudatekman/naive-bayes-based-language-identification-system)
- [https://huggingface.co/papluca/xlm-roberta-base-language-detection](https://huggingface.co/papluca/xlm-roberta-base-language-detection)