https://github.com/ralfbrown/whatlang2
Trainable language identifier
- Host: GitHub
- URL: https://github.com/ralfbrown/whatlang2
- Owner: ralfbrown
- License: gpl-3.0
- Created: 2019-07-07T18:51:45.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2019-08-23T18:49:30.000Z (almost 7 years ago)
- Last Synced: 2025-07-25T21:44:39.036Z (11 months ago)
- Topics: langid, nlp
- Language: C++
- Size: 1.29 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README
- License: COPYING
README
This is the 'whatlang' language identifier, updated to use FramepaC-ng
and compile properly in C++11 under more recent compilers than the
version originally included as part of LA-Strings (the language-aware
text-string extractor).

The makefile assumes that the 'framepac' subdirectory contains a copy
of FramepaC-ng. If you already have a copy installed, make 'framepac'
a symlink to it (git has been told to ignore 'framepac'). Otherwise,
you can use 'git submodule' to install a copy from the FramepaC-ng repo.
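
The symlink option can be sketched as a small shell helper. This is only
an illustration: the function name link_framepac and both of its
arguments are hypothetical and are not part of this repository. It
assumes you already have a FramepaC-ng checkout somewhere on disk.

```shell
#!/bin/sh
# link_framepac SRC TREE   (hypothetical helper, not shipped with whatlang)
# Create TREE/framepac as a symlink to an existing FramepaC-ng checkout
# at SRC, which is where the makefile expects to find FramepaC-ng.
link_framepac() {
    src=$1    # path to your existing FramepaC-ng copy (assumed to exist)
    tree=$2   # top of the whatlang2 working tree

    if [ ! -d "$src" ]; then
        echo "no FramepaC-ng checkout at $src" >&2
        return 1
    fi
    # Refuse to overwrite an existing directory, file, or symlink.
    if [ -e "$tree/framepac" ] || [ -L "$tree/framepac" ]; then
        echo "$tree/framepac already exists; leaving it alone" >&2
        return 1
    fi
    ln -s "$src" "$tree/framepac"
}
```

For example, from the top of the whatlang2 tree you might run
link_framepac "$HOME/src/framepac-ng" . (the source path is a
placeholder). Pass an absolute path for SRC, since a relative symlink
target is resolved relative to the directory that contains the link.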

Due to the size of the data files for training, they have not been
copied to GitHub. You can still retrieve them from SourceForge, at
https://sourceforge.net/projects/la-strings/files/Language-Data/