https://github.com/vipranarayan14/aksharas
An utility for analysing akṣaras and varṇas in an Devanagari text.
https://github.com/vipranarayan14/aksharas
character-counter characters devanagari indic-languages indic-scripts sanskrit sanskrit-language syllabification syllables
Last synced: about 8 hours ago
JSON representation
An utility for analysing akṣaras and varṇas in an Devanagari text.
- Host: GitHub
- URL: https://github.com/vipranarayan14/aksharas
- Owner: vipranarayan14
- License: mit
- Created: 2022-08-08T19:22:13.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-03-22T08:38:41.000Z (almost 3 years ago)
- Last Synced: 2025-10-21T23:30:02.703Z (4 months ago)
- Topics: character-counter, characters, devanagari, indic-languages, indic-scripts, sanskrit, sanskrit-language, syllabification, syllables
- Language: TypeScript
- Homepage:
- Size: 603 KB
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 6
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Aksharas
[](https://www.npmjs.com/package/@vipran/aksharas)  
**Aksharas** is an utility for analysing *akṣaras* and *varṇas* in a Devanagari text.
## Installation
```sh
npm i @vipran/aksharas
```
## Usage
```js
import Aksharas from "@vipran/aksharas";
// OR for CommonJS:
// const Aksharas = require("@vipran/aksharas").default;
const input = "सर्वे भवन्तु सुखिनः।"
const results = Aksharas.analyse(input);
const aksharas = results.aksharas.map(akshara => akshara.value);
console.log(aksharas); // "स", "र्वे", "भ", "व", "न्तु", "सु", "खि", "नः"
```
## API
### `Aksharas.analyse()`
Accepts a `string` input and returns a [`Results`](#results) object.
```ts
const input: string = 'नमः';
const results: Results = Aksharas.analyse(input);
```
### `Aksharas.TokenType`
It is an enum with the following values:
- `TokenType.Akshara`
- `TokenType.Symbol`
- `TokenType.Whitespace`
- `TokenType.Invalid`
- `TokenType.Unrecognised`
These can be used to filter the tokens in the [`Results`](#results) object. Example:
```js
import Aksharas from "@vipran/aksharas";
// OR import Aksharas, { TokenType } ...
const input = "हे! हरेऽत्र नागच्छ।";
const results = Aksharas.analyse(input);
const symbols = results.all
.filter((token) => token.type === Aksharas.TokenType.Symbol)
.map((token) => token.value);
console.log(symbols); // "ऽ", "।"
```
### `Aksharas.VarnaType`
It is an enum with the following values:
- `VarnaType.Svara`
- `VarnaType.Vyanjana`
These can be used to filter the varnas in [`Results.varnas`](#results). Example:
```js
import Aksharas from "@vipran/aksharas";
// OR import Aksharas, { VarnaType } ...
const input = "गुरुः";
const results = Aksharas.analyse(input);
const svaras = results.varnas
.filter((varna) => varna.type === Aksharas.VarnaType.Svara)
.map((varna) => varna.value);
console.log(svaras); // "उ", "उः"
```
### `Results`
The `Results` object contains the following properties:
- **all**
- type: `Token[]`
- An array of [`Token`](#token) objects containing all the tokens analysed from `input` string. It includes Devanagari *akṣaras*, Devanagari symbols (१, २, ।, ॥, etc.) and non-devangari characters (i.e. characters in other scripts, special characters, whitespace characters, etc.)
- **aksharas**
- type: `Token[]`
- Devanagari syllables like रा, सी, etc. Here, *halanta* consonants such as क्, च्, य्, etc. are also considered as `aksharas` when they are at the end of a word.
- **varnas**
- type: `Varna[]`
- Devanagari consonants and vowels in the `input`. *(Only in v0.4.0 or above.)*
- **symbols**
- type: `Token[]`
- Devanagari symbols such as १, २, ।, ॥, etc.
- **whitespaces**
- type: `Token[]`
- All whitespace characters: `\s`, `\t`, `\n`, etc.
- **invalid**
- type: `Token[]`
- All Devanagari characters whose occurance in the `input` string do not conform to the definition of an *akṣara*. For example, a *virāma* or a vowel mark which is not preceded by a consonant is invalid. ("अ्", "गोु", etc.)
- **unrecognised**
- type: `Token[]`
- Non-devangari characters (i.e. characters in other scripts and special characters such as @, #, etc.)
- **chars**
- type: `string[]`
- All Unicode characters in the `input` string. Same as `String.prototype.split()`.
### `Token`
Many of the properties in the `Results` object consists of an array of `Token`-s. A `Token` object has the following properties:
- **type**
- type: `TokenType`
- Type of the token. One of the values of [`Aksharas.TokenType`](#aksharastokentype).
- **value**
- type: `string`
- Conatins an analysed part of the `input` string.
- **from**
- type: `number`
- From index - representing the start position of the token in the `input` string.
- **to**
- type: `number`
- To index - representing the end position of the token in the `input` string.
- **attributes**
- type: `Record`
- An optional key-value object which may contain other attributes of the token. It is currently used only in the `Akshara` tokens for storing the `varnas` in that akshara.
### `Varna`
`Results.varnas` consists of an array of `Varna` objects. A `Varna` object has the following properties:
- **type**
- type: `VarnaType`
- Type of the token. One of the values of [`Aksharas.VarnaType`](#aksharasvarnatype).
- **value**
- type: `string`
- Conatins an analysed part of the `input` string.
## License
MIT © [Prasanna Venkatesh T S](https://github.com/vipranarayan14)