{"id":13817665,"url":"https://github.com/brothersincode/virastar","last_synced_at":"2025-12-12T03:38:26.172Z","repository":{"id":29292370,"uuid":"32825225","full_name":"brothersincode/virastar","owner":"brothersincode","description":"Cleaning-up Persian Texts!","archived":false,"fork":false,"pushed_at":"2025-05-02T12:02:26.000Z","size":1358,"stargazers_count":140,"open_issues_count":3,"forks_count":15,"subscribers_count":7,"default_branch":"master","last_synced_at":"2025-08-10T23:30:22.335Z","etag":null,"topics":["farsi","javascript","persian","persian-language","spelling-correction","text-processing","virastar"],"latest_commit_sha":null,"homepage":"https://virastar.brothersincode.ir","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":"beberlei/DoctrineExtensions","license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/brothersincode.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2015-03-24T21:07:54.000Z","updated_at":"2025-08-02T04:28:49.000Z","dependencies_parsed_at":"2022-08-31T09:42:14.637Z","dependency_job_id":"be77a2e1-2510-4156-8428-88e2d867d376","html_url":"https://github.com/brothersincode/virastar","commit_stats":null,"previous_names":["juvee/virastar"],"tags_count":26,"template":false,"template_full_name":null,"purl":"pkg:github/brothersincode/virastar","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brothersincode%2Fvirastar","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brothersincode%2Fvirastar/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/broth
ersincode%2Fvirastar/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brothersincode%2Fvirastar/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/brothersincode","download_url":"https://codeload.github.com/brothersincode/virastar/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/brothersincode%2Fvirastar/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27675827,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-12T02:00:06.775Z","response_time":129,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["farsi","javascript","persian","persian-language","spelling-correction","text-processing","virastar"],"created_at":"2024-08-04T06:00:53.204Z","updated_at":"2025-12-12T03:38:26.152Z","avatar_url":"https://github.com/brothersincode.png","language":"JavaScript","funding_links":[],"categories":["جاواسکریپت Javascript"],"sub_categories":[],"readme":"# Virastar (ویراستار)\nVirastar is a Persian text cleaner. It started as a JavaScript port of [aziz/virastar](https://github.com/aziz/virastar), with lots of help from [ebraminio/persiantools](https://github.com/ebraminio/persiantools). 
See live [demo](https://virastar.brothersincode.ir).\n\n[![NPM version](https://img.shields.io/npm/v/virastar.svg?style=flat-square)](https://www.npmjs.com/package/virastar)\n[![GitHub issues](https://img.shields.io/github/issues/brothersincode/virastar.svg?style=flat-square)](https://github.com/brothersincode/virastar/issues)\n[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg?style=flat-square)](https://raw.githubusercontent.com/brothersincode/virastar/master/LICENSE)\n[![js-semistandard-style](https://img.shields.io/badge/code%20style-semistandard-brightgreen.svg?style=flat-square)](https://github.com/Flet/semistandard)\n\n## Install\n```bash\nnpm install virastar\n```\n\n## Usage\n```js\nvar Virastar = require('virastar');\nvar virastar = new Virastar();\n\nvirastar.cleanup(\"فارسي را كمی درست تر می نويسيم\"); // Outputs: \"فارسی را کمی درست‌تر می‌نویسیم\"\n```\n\n### Browser\n```html\n\u003cscript src=\"lib/virastar.js\"\u003e\u003c/script\u003e\n\u003cscript\u003e\n  var virastar = new Virastar();\n  alert(virastar.cleanup(\"فارسي را كمی درست تر می نويسيم\"));\n\u003c/script\u003e\n```\n\n#### Virastar([text] [,options])\n\n##### text\nType: `string`\n\nstring of persian source to be cleaned.\n\n##### options\nType: `object`\n\n```js\nVirastar(\"سلام 123\" ,{\"fix_english_numbers\":false}); // Outputs: \"سلام 123\"\n```\n\n## Options and Specifications\nVirastar comes with a list of options to control its behavior.\n\n#### `normalize_eol`\n_default_: `true`\n- replaces windows end of lines with unix eol (`\\n`)\n\n#### `decode_htmlentities`\n_default_: `true`\n- converts numeral and selected html character-sets into original characters\n\n#### `fix_dashes`\n_default_: `true`\n- replaces triple dash to mdash\n- replaces double dash to ndash\n\n#### `fix_three_dots`\n_default_: `true`\n- removes spaces between dots \n- replaces three dots with ellipsis character\n\n#### `normalize_ellipsis`\n_default_: `true`\n- replaces more than one ellipsis 
with one\n- replaces more than one space before ellipsis with one space\n- replaces (space|tab|zwnj) after ellipsis with one space\n\n#### `remove_spaces_before_ellipsis`\n_default_: `true`\n- removes spaces before ellipsis\n\n#### `normalize_dates`\n_default_: `true`\n- re-orders date parts with slash as delimiter\n\n#### `fix_english_quotes_pairs`\n_default_: `true`\n- replaces english quote pairs (`“”`) with their persian equivalent (`«»`)\n\n#### `fix_english_quotes`\n_default_: `true`\n- replaces english quote marks with their persian equivalent\n\n#### `fix_hamzeh`\n_default_: `true`\n- replaces `ه` followed by (space|ZWNJ|lrm) followed by `ی` with `هٔ`\n- replaces `ه` followed by (space|ZWNJ|lrm|nothing) followed by `ء` with `هٔ`\n- replaces `هٓ` or single-character `ۀ` with the standard `هٔ`\n\n#### `fix_hamzeh_arabic`\n_default_: `false`\n- converts arabic hamzeh `ة` to `هٔ`\n\n#### `cleanup_rlm`\n_default_: `true`\n- converts Right-to-left marks followed by persian characters to zero-width non-joiners (ZWNJ)\n\n#### `cleanup_zwnj`\n_default_: `true`\n- converts all soft hyphens (`\u0026shy;`) into zwnj\n- converts all angled dashes (`\u0026not;`) into zwnj\n- removes more than one zwnj\n- cleans zwnj after characters that don't connect to the next\n- cleans zwnj before and after numbers, english words, spaces and punctuations\n- removes unnecessary zwnj on start/end of each line\n\n#### `fix_arabic_numbers`\n_default_: `true`\n- replaces arabic numbers with their persian equivalent\n\n#### `fix_english_numbers`\n_default_: `true`\n- replaces english numbers with their persian equivalent\n\n#### `fix_numeral_symbols`\n_default_: `true`\n- replaces english percent signs with their persian equivalent (U+066A)\n- replaces dots between numbers with the decimal separator (U+066B)\n- replaces commas between numbers with the thousands separator (U+066C)\n\n#### `fix_misc_non_persian_chars`\n_default_: `true`\n- replaces arabic normal/swash kaf with its persian equivalent\n- replaces 
arabic/urdu/pushtu/uyghur/barree yeh with its persian equivalent\n- replaces kurdish he with its persian equivalent\n\n#### `fix_punctuations`\n_default_: `true`\n- replaces `,`, `;` with their persian equivalents\n\n#### `fix_question_mark`\n_default_: `true`\n- replaces question marks with their persian equivalent\n\n#### `fix_perfix_spacing`\n_default_: `true`\n- puts zwnj between the word and the prefix:\n\t- `mi*`, `nemi*`, `bi*`\n\n#### `fix_suffix_spacing`\n_default_: `true`\n- puts zwnj between the word and the suffix:\n\t- `*ha`, `*haye`\n\t- `*am`, `*at`, `*ash`, `*ei`, `*eid`, `*eem`, `*and`, `*man`, `*tan`, `*shan`\n\t- `*tar`, `*tari`, `*tarin`\n\t- `*hayee`, `*hayam`, `*hayat`, `*hayash`, `*hayetan`, `*hayeman`, `*hayeshan`\n\n#### `fix_suffix_misc`\n_default_: `true`\n- replaces `ه` followed by `ئ` or `ی`, and then by `ی`, with `ه‌ای`\n\n#### `fix_spacing_for_braces_and_quotes`\n_default_: `true`\n- removes inside spaces and more than one outside for `()`, `[]`, `{}`, `“”` and `«»`\n\n#### `fix_spacing_for_punctuations`\n_default_: `true`\n- removes space before punctuations\n- removes more than one space after punctuations, except when followed by new-lines\n- removes space after colon that separates time parts\n- removes space after dots in numbers\n- removes space before some common domain tlds\n- removes space between question and exclamation marks\n- removes space between same marks\n\n#### `fix_diacritics`\n_default_: `true`\n- cleans zwnj before diacritic characters\n- cleans more than one of each diacritic character\n- cleans spaces before diacritic characters\n\n#### `remove_diacritics`\n_default_: `false`\n- removes all diacritic characters\n\n#### `fix_persian_glyphs`\n_default_: `true`\n- converts incorrect persian glyphs to standard characters\n\n#### `fix_misc_spacing`\n_default_: `true`\n- removes space before parentheses on misc cases\n- removes space before braces containing numbers\n\n#### `cleanup_spacing`\n_default_: `true`\n- replaces more 
than one space with just a single one\n- cleans tab/space/zwnj/zwj/nbsp between two new-lines\n\n#### `cleanup_line_breaks`\n_default_: `true`\n- cleans whitespace/zwnj between new-lines\n- cleans more than **two** contiguous line breaks\n\n#### `cleanup_begin_and_end`\n_default_: `true`\n- removes space/tab/zwnj/nbsp from the beginning of the new-lines\n- removes spaces, tabs, zwnj, direction marks and new lines from the beginning and end of text\n\n### markdown\n#### `markdown_normalize_braces`\n_default_: `true`\n- removes spaces between `[]` and `()` (`[text] (link)` into `[text](link)`)\n- removes space between `!` and opening brace (`! [alt](src)` into `![alt](src)`)\n- removes spaces inside double `()`, `[]`, `{}` (`[[ text ]]` into `[[text]]`)\n- removes spaces between double `()`, `[]`, `{}` (`[[text] ]` into `[[text]]`)\n\n#### `markdown_normalize_lists`\n_default_: `true`\n- removes extra lines between two items on a markdown list beginning with `-`, `*` or `#`\n\n#### `skip_markdown_ordered_lists_numbers_conversion`\n_default_: `false`\n- skips converting english numbers of ordered lists in markdown\n\n### aggressive editing\n#### `cleanup_extra_marks`\n_default_: `true`\n- replaces more than one exclamation mark with just one\n- replaces more than one english or persian question mark with just one\n- re-orders consecutive marks: `?!` into `!?`\n\n#### `kashidas_as_parenthetic`\n_default_: `true`\n- replaces kashidas to ndash in parenthetic\n\n#### `cleanup_kashidas`\n_default_: `true`\n- converts kashida between numbers to ndash\n- removes all kashidas between non-whitespace characters\n\n### extras\n#### `preserve_frontmatter`\n_default_: `true`\n- preserves frontmatter data in the text\n\n#### `preserve_HTML`\n_default_: `true`\n- preserves all html tags in the text\n\n#### `preserve_comments`\n_default_: `true`\n- preserves all html comments in the text\n\n#### `preserve_entities`\n_default_: `true`\n- preserves all html entities in the text\n\n#### 
`preserve_URIs`\n_default_: `true`\n- preserves all uri strings in the text\n\n#### `preserve_brackets`\n_default_: `false`\n- preserves strings inside square brackets (`[]`)\n\n#### `preserve_braces`\n_default_: `false`\n- preserves strings inside curly braces (`{}`)\n\n#### `preserve_nbsps`\n_default_: `true`\n- preserves all no-break space entities in the text\n\n## License\nThis software is licensed under the MIT License. [View the license](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrothersincode%2Fvirastar","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbrothersincode%2Fvirastar","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbrothersincode%2Fvirastar/lists"}