https://github.com/varnamproject/govarnam
Easily type Indic languages on computer and mobile. GoVarnam is a cross-platform transliteration library. Manglish -> Malayalam, Thanglish -> Tamil, Hinglish -> Hindi plus another 10 languages. GoVarnam is a near-Go port of libvarnam
- Host: GitHub
- URL: https://github.com/varnamproject/govarnam
- Owner: varnamproject
- License: other
- Created: 2021-08-17T18:21:04.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2024-12-14T19:01:31.000Z (10 months ago)
- Last Synced: 2025-03-29T11:03:00.156Z (7 months ago)
- Topics: hindi, ime, input-method, keyboard, language, linux, mac, malayalam, tamil, transliteration, windows
- Language: Go
- Homepage: https://varnamproject.com
- Size: 373 KB
- Stars: 147
- Watchers: 5
- Forks: 11
- Open Issues: 28
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Varnam
Varnam is an Indian language transliteration library. GoVarnam is a brand new Go port of [libvarnam](https://github.com/varnamproject/libvarnam) with some core architectural changes.
It is stable to use daily as an input method. Try out different languages here: https://varnamproject.github.io/editor/
Malayalam has really good support in Varnam. We welcome improvements to all languages in Varnam.
* An [Input Method Engine](https://en.wikipedia.org/wiki/Input_method) for GNU/Linux operating systems via IBus is available here: https://github.com/varnamproject/govarnam-ibus
* For macOS, there is a [Varnam IME too](https://github.com/varnamproject/varnam-macOS).
* Windows: Need Help

## Installation & Usage
See the instructions on the website: https://varnamproject.github.io/download/
FAQ: https://varnamproject.github.io/docs/faq/

## Development
Proceed through these sections one by one:
### Videos
See this video to understand more about Varnam (DebConf21):
* PeerTube: https://peertube.debian.social/w/vWwMGcmTZG9n1UWv8ZdimB?s=1
* YouTube: https://www.youtube.com/watch?v=pJpOWlD_7OI

### Files & Folders
* `govarnam` - The library files
* `main.go, c-shared*` - Files that help build GoVarnam as a C shared library
* `govarnamgo` - Go bindings for the library. For use with other Go projects
* `cli` - A CLI tool written in Go for Varnam. Uses `govarnamgo` to interface with the library.

### Build Library
Requires minimum Go version 1.16.
This repository has three parts:
1. GoVarnam library
2. GoVarnam Command Line Utility (CLI)
3. Go bindings for GoVarnam

GoVarnam is written in Go, but to make it a standard library usable from any other programming language, we compile it to a C shared library. This is done with:
```bash
go build -buildmode "c-shared" -o libgovarnam.so
```

(A shortcut for the above is `make library`.)

The output `libgovarnam.so` is a shared library that can be dynamically linked from any other programming language using its header file `libgovarnam.h`. Some examples:
* Go bindings for GoVarnam: see the `govarnamgo` folder in this repo
* Java bindings for GoVarnam: https://github.com/varnamproject/govarnam-java/

Wait, does this mean we need another Go package just to interface with the GoVarnam library? Yes, because we're interfacing with a C shared library and not with the Go library directly. `govarnamgo` acts as this interface for Go apps that use GoVarnam.
### CLI (Command Line Utility)
After building `libgovarnam.so`, you can build the CLI that uses GoVarnam:
```
make cli
```

The command line utility (CLI) is written in Go and uses `govarnamgo` to interface with the library.
You can build both the library and the CLI with just `make`.
### Language Support
Varnam uses a `.vst` (Varnam Symbol Table) file for language support. You can get it from the `schemes` folder in [a release](https://github.com/varnamproject/schemes/releases). Place VST files in **one of these** locations (listed from highest to lowest priority):
* `$PWD/schemes` (PWD is Present Working Directory)
* `/usr/local/share/varnam/schemes`
* `/usr/share/varnam/schemes`

Now we can use `varnamcli`:
```
# Tell the linker where to search for libgovarnam.so
export LD_LIBRARY_PATH=$(realpath ./):$LD_LIBRARY_PATH

./varnamcli -s ml namaskaaram
```

The `ml` above is the scheme ID. It should match the VST filename.
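
The way a scheme ID maps to a VST file through the prioritized locations can be sketched in Go as follows. This only illustrates the search order and is not GoVarnam's actual code: the helper names `findVST`, `schemeDirs` and `fileExists` are hypothetical, and the sketch assumes the file is named `<scheme ID>.vst`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// findVST returns the first "<schemeID>.vst" found in dirs, searching in the
// order given (highest priority first). The exists callback makes the lookup
// easy to try out without touching the real filesystem.
func findVST(schemeID string, dirs []string, exists func(string) bool) (string, bool) {
	for _, dir := range dirs {
		candidate := filepath.Join(dir, schemeID+".vst")
		if exists(candidate) {
			return candidate, true
		}
	}
	return "", false
}

// fileExists reports whether path exists and is not a directory.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && !info.IsDir()
}

// schemeDirs lists the locations named in this README, highest priority first.
func schemeDirs() []string {
	dirs := []string{"/usr/local/share/varnam/schemes", "/usr/share/varnam/schemes"}
	if wd, err := os.Getwd(); err == nil {
		// $PWD/schemes takes precedence over the system-wide locations.
		dirs = append([]string{filepath.Join(wd, "schemes")}, dirs...)
	}
	return dirs
}

func main() {
	if path, ok := findVST("ml", schemeDirs(), fileExists); ok {
		fmt.Println("using", path)
	} else {
		fmt.Println("no ml.vst found in any scheme directory")
	}
}
```

Running it next to a `schemes` folder containing `ml.vst` should report that local copy, since `$PWD/schemes` comes first in the list.
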
You can symlink the library into `/usr/local/lib` to avoid exporting `LD_LIBRARY_PATH` every time:
```
sudo ln -s $PWD/libgovarnam.so /usr/local/lib/libgovarnam.so
```

Now any software can find the GoVarnam library.
### Testing
You can run tests (to make sure nothing broke) with:
```bash
make test
```

### Use Varnam Live
It's good to install an IME to test changes you make to the library live.
* Linux IME: https://github.com/varnamproject/govarnam-ibus
* Mac IME (Coming Soon...): https://github.com/varnamproject/govarnam/issues/8
* Windows IME (Coming Soon...): https://github.com/varnamproject/govarnam/issues/7

### Changes from libvarnam
* `ml.vst` has been changed to add a new `weight` column in the `symbols` table. Get the new `ml.vst` here. The symbol with the lowest weight has the most significance. Weights are calculated from popularity in a corpus. You can populate an `ml.vst` with weight values using a Python script; see it in the subfolder. The earlier Ruby script is still used to build the VST; that part is unchanged. **`ml.vst` from libvarnam is incompatible with govarnam**.
* `patterns_content` is renamed to `patterns` in GoVarnam
* The `patterns` table in the learnings DB won't store Malayalam patterns. Instead, for each input, all possible Malayalam words are calculated (from `symbols`, VARNAM_MATCH_ALL) and searched for in `words`. The matches are returned as suggestions. Previously, the table stored every pattern-to-word mapping (english => malayalam).
* `patterns` in govarnam is used solely for English words, e.g. `Computer => കമ്പ്യൂട്ടർ`. These English words don't work with our VST tokenizer because their spelling is not really transliterable in our language; the transliterable form would be `kambyoottar`.
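
To make the last two points concrete, here is a toy Go sketch of the suggestion flow: look up English words in `patterns` first, otherwise build every possible word from the symbol alternatives, keep those present in `words`, and order them by weight. Everything in it is an assumption for illustration: the types, the two-letter tokenizer, the sample data, and the choice to sum symbol weights per word are invented and do not reflect GoVarnam's real implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// symbol is a hypothetical stand-in for one row of the VST `symbols` table.
type symbol struct {
	value  string
	weight int // lower weight means more significant
}

// candidate is one possible output word plus its accumulated weight.
type candidate struct {
	word   string
	weight int
}

// Toy data, invented for this example only.
var toySymbols = map[string][]symbol{
	"ma": {{"മ", 0}, {"മാ", 1}},
	"la": {{"ല", 0}, {"ള", 1}},
}
var learnedWords = map[string]bool{"മല": true, "മാല": true}
var englishPatterns = map[string]string{"computer": "കമ്പ്യൂട്ടർ"}

// toyTokenize splits the input into two-letter chunks and looks each one up.
// A real tokenizer is far more involved.
func toyTokenize(input string) [][]symbol {
	var tokens [][]symbol
	for i := 0; i+2 <= len(input); i += 2 {
		tokens = append(tokens, toySymbols[input[i:i+2]])
	}
	return tokens
}

// combine builds every word that the per-token alternatives can form
// (the "match all" idea), summing symbol weights as an assumed scoring rule.
func combine(tokens [][]symbol) []candidate {
	results := []candidate{{"", 0}}
	for _, alternatives := range tokens {
		var next []candidate
		for _, prefix := range results {
			for _, s := range alternatives {
				next = append(next, candidate{prefix.word + s.value, prefix.weight + s.weight})
			}
		}
		results = next
	}
	return results
}

// suggest checks the English patterns first; otherwise it keeps only the
// generated candidates found in the learned words, lowest weight first.
func suggest(input string) []string {
	if word, ok := englishPatterns[input]; ok {
		return []string{word}
	}
	var known []candidate
	for _, c := range combine(toyTokenize(input)) {
		if learnedWords[c.word] {
			known = append(known, c)
		}
	}
	sort.SliceStable(known, func(i, j int) bool { return known[i].weight < known[j].weight })
	out := make([]string, 0, len(known))
	for _, c := range known {
		out = append(out, c.word)
	}
	return out
}

func main() {
	fmt.Println(suggest("mala"))
	fmt.Println(suggest("computer"))
}
```

With this toy data, `mala` expands to four candidate words, of which only the two stored in `learnedWords` survive, while `computer` never reaches the tokenizer at all.
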