{"id":13827131,"url":"https://github.com/alihoseiny/word_cloud_fa","last_synced_at":"2025-07-09T03:30:56.689Z","repository":{"id":45097127,"uuid":"187226020","full_name":"alihoseiny/word_cloud_fa","owner":"alihoseiny","description":"A wrapper for wordcloud module for creating Persian word clouds.","archived":false,"fork":false,"pushed_at":"2025-07-06T08:50:27.000Z","size":1842,"stargazers_count":145,"open_issues_count":0,"forks_count":13,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-06T09:35:32.095Z","etag":null,"topics":["data-visualization","python","python3","text-processing"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/alihoseiny.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null},"funding":{"custom":["https://payping.ir/@alihoseiny"]}},"created_at":"2019-05-17T14:00:06.000Z","updated_at":"2025-07-06T08:50:24.000Z","dependencies_parsed_at":"2025-07-06T09:27:33.510Z","dependency_job_id":"87cdd999-1214-4dba-b84c-91199b926c3d","html_url":"https://github.com/alihoseiny/word_cloud_fa","commit_stats":{"total_commits":40,"total_committers":6,"mean_commits":6.666666666666667,"dds":"0.19999999999999996","last_synced_commit":"6dabf88be046189d289ed86b9b1033dc97d1dfd1"},"previous_names":[],"tags_count":8,"template":false,"template_full_name":null,"purl":"pkg:github/alihoseiny/word_cloud_fa","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/alihoseiny%2Fword_cloud_fa","tags_url":"https://repos.ecosyste.
# WordCloudFa
[![Downloads](https://pepy.tech/badge/wordcloud-fa)](https://pepy.tech/project/wordcloud-fa)
![](https://img.shields.io/pypi/v/wordcloud-fa.svg?style=popout)


![](https://github.com/alihoseiny/word_cloud_fa/raw/master/Examples/masked-example.png)


This module is an easy-to-use wrapper around the [word_cloud module](https://github.com/amueller/word_cloud).

The original module doesn't support Farsi text.
But with **WordCloudFa**, you can generate word clouds from texts that contain both Persian and English words.

This module is not just a wrapper; it also adds some features to the original module.

<!-- toc -->

- [How to Install](#how-to-install)
- [How to Use](#how-to-use)
  * [Generating Word Cloud from Text](#generating-word-cloud-from-text)
  * [Generating Word Cloud from Frequencies](#generating-word-cloud-from-frequencies)
  * [Working with Stopwords](#working-with-stopwords)
  * [Mask Image](#mask-image)
  * [Reshaping words](#reshaping-words)
  * [Avoiding Dangerous non-ASCII characters](#Avoiding-Dangerous-non-ASCII-characters)
- [Examples](#examples)
- [Font](#font)
- [Persian Tutorial](#persian-tutorial)
- [Contribution](#contribution)
- [Common Problems](#Common-Problems)
  * [Farsi Letters are separated](#Farsi-Letters-are-separated)
  * [I See Repeated Farsi Words](#I-See-Repeated-Farsi-Words)
  * [I Have Problem in Running Example Scripts](#I-Have-Problem-in-Running-Example-Scripts)
- [There is any problem?](#there-is-any-problem)
- [Citations](#citations)

<!-- tocstop -->

# How to Install

To install this module, simply run:

`pip install wordcloud-fa`

This module is tested on Python 3.

*WordCloudFa* depends on `numpy` and `pillow`. You also need the `Hazm` module. Normally, all of them are installed automatically when you install this module with `pip`, as described at the beginning of this section.

To save the word cloud into a file, `matplotlib` can also be installed.

**Attention**

You need `python-dev` for Python 3 on your system. If you don't have it, you can install it on operating systems that use `apt` as the package manager (like Ubuntu) with this command:

`sudo apt-get install python3-dev`

On operating systems that use `yum` as the package manager (like RedHat and Fedora),
you can use the following command:

`sudo yum install python3-devel`

# How to Use

To create a word cloud from a text, first import the class into your code:

`from wordcloud_fa import WordCloudFa`

Then create an instance of the class:

`wordcloud = WordCloudFa()`

You can pass different parameters to the constructor. For their full documentation, see the [WordCloud documentation](https://amueller.github.io/word_cloud/).

There are three parameters that are not in the original class.

The first one is `persian_normalize`. If you pass `True` for this parameter, your data will be normalized with the [Hazm normalizer](https://github.com/sobhe/hazm). It is recommended to always pass this parameter: it replaces Arabic letters with Persian ones and applies some other fixes. The default value of this parameter is `False`.

`wordcloud = WordCloudFa(persian_normalize=True)`

The second parameter is `include_numbers`, which is not in the published original module. If you set it to `False`, all Persian, Arabic and English numbers will be removed from your data. The default value of this parameter is `True`.

`wordcloud = WordCloudFa(include_numbers=False)`

**Common problem hint:**

The last and very important parameter is `no_reshape`. Its default value is `False`. But if the letters of Farsi words appear separated on your local system, you should pass `True` for this parameter.

```python
wordcloud = WordCloudFa(no_reshape=True)
```

## Generating Word Cloud from Text

To generate a word cloud from a string, simply call the `generate` method of your instance:

```python
wordcloud = WordCloudFa(persian_normalize=True)
wc = wordcloud.generate(text)
image = wc.to_image()
image.show()
image.save('wordcloud.png')
```

## Generating Word Cloud from Frequencies

You can generate a word cloud from frequencies.
You can use the output of the `process_text` method as frequencies, or any similar dictionary.

```python
wordcloud = WordCloudFa()
frequencies = wordcloud.process_text(text)
wc = wordcloud.generate_from_frequencies(frequencies)
```

The `generate_from_frequencies` method in this module excludes stopwords, while the original module does not exclude them when you use this method. You can also use Persian words as keys in the frequencies dict without any problem.

## Working with Stopwords

Stopwords are the words that we don't want to consider. If you don't pass any stopwords, the default words in the [stopwords](https://github.com/alihoseiny/word_cloud_fa/blob/master/wordcloud_fa/stopwords) file will be used.

If you don't want to use them at all and want to choose your own stopwords, simply set the `stopwords` parameter when creating an instance of `WordCloudFa` and pass a `set` of words into it.

```python
stop_words = set(['کلمه‌ی اول', 'کلمه‌ی دوم'])
wc = WordCloudFa(stopwords=stop_words)
```

If you want to add words to the default stopwords, call the `add_stop_words` method on your instance of `WordCloudFa` and pass an iterable (`list`, `set`, ...) into it.

```python
wc = WordCloudFa()
wc.add_stop_words(['کلمه‌ی اول', 'کلمه‌ی دوم'])
```

You can also add stopwords from a file. The file should contain one stopword per line.

For that, use the `add_stop_words_from_file` method. Its only parameter is the relative or absolute path to the stopwords file.

```python
wc = WordCloudFa()
wc.add_stop_words_from_file("stopwords.txt")
```

## Mask Image

You can mask the final word cloud with an image. For example, the first image of this document is a word cloud masked by an image of the map of Iran.
To set a mask, you should pass the `mask` parameter.

But first, make sure you have a black and white image, because other images will not produce a good result.

Then, convert that image to a numpy array, like this:

```python
import numpy as np
from PIL import Image

mask_array = np.array(Image.open("mask.png"))
```

You only need to add those two imports; you don't need to worry about installing them, because they are installed as dependencies of this module.

Then, pass that array to the constructor of the `WordCloudFa` class to mask the result.

```python
wordcloud = WordCloudFa(mask=mask_array)
```

Now you can use your wordcloud instance as before.

## Reshaping words

When you pass your texts into an instance of this class, all words are reshaped so that they display properly, avoiding invalid shapes of Persian or Arabic words (split and reversed letters).

If you want to do the same thing outside of this module, you can call the `reshape_words` static method.

```python
reshaped_words = WordCloudFa.reshape_words(['کلمه‌ی اول', 'کلمه‌ی دوم'])
```

This method takes an `Iterable` as input and returns a list of reshaped words.

**DON'T FORGET THAT YOU SHOULD NOT PASS RESHAPED WORDS TO THE METHODS OF THIS CLASS. THIS STATIC METHOD IS ONLY FOR USE OUTSIDE OF THIS MODULE.**

## Avoiding Dangerous non-ASCII characters

Some non-ASCII characters, like emojis, cause errors. By default, those characters are removed from the input text (except when you use the `generate_from_frequencies` method).

To disable this feature, set the `remove_unhandled_utf_characters` parameter to `False` when creating a new instance of `WordCloudFa`.

You can also access the compiled regex pattern of those characters through the `unhandled_characters_regex` class attribute.
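As an illustration of the stripping behavior described above, here is a minimal sketch using only the standard library. The pattern below is an assumption for demonstration purposes, not the module's actual `unhandled_characters_regex`:

```python
import re

# Hypothetical pattern covering common emoji/symbol blocks; the real
# unhandled_characters_regex in WordCloudFa may differ.
EMOJI_LIKE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # symbols, pictographs, emoji
    "\U00002600-\U000027BF"  # misc symbols and dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator (flag) letters
    "]+",
    flags=re.UNICODE,
)

def strip_unhandled(text: str) -> str:
    """Remove emoji-like characters that can break word cloud rendering."""
    return EMOJI_LIKE.sub("", text)

print(strip_unhandled("سلام 😀 wordcloud"))  # the emoji is removed
```

This mirrors what the module does automatically when `remove_unhandled_utf_characters` is left at its default value.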


# Examples

You can see [example codes in the Examples directory](https://github.com/alihoseiny/word_cloud_fa/tree/master/Examples).

![](https://github.com/alihoseiny/word_cloud_fa/raw/master/Examples/english-example.png)
![](https://github.com/alihoseiny/word_cloud_fa/raw/master/Examples/mixed-example.png)
![](https://github.com/alihoseiny/word_cloud_fa/raw/master/Examples/persian-example.png)

# Font

The default font supports both Persian and English letters, so you don't need to pass a font to get results. But if you want to change the font, you can pass the `font_path` parameter.

# Persian Tutorial

If you want to read a brief tutorial in Farsi (Persian) about how to use this package, you can [click on this link](https://blog.alihoseiny.ir/%da%86%da%af%d9%88%d9%86%d9%87-%d8%a8%d8%a7-%d9%be%d8%a7%db%8c%d8%aa%d9%88%d9%86-%d8%a7%d8%a8%d8%b1-%da%a9%d9%84%d9%85%d8%a7%d8%aa-%d9%81%d8%a7%d8%b1%d8%b3%db%8c-%d8%a8%d8%b3%d8%a7%d8%b2%db%8c%d9%85%d8%9f/?utm_source=github&utm_medium=readme&utm_campaign=wordcloudfa).

# Contribution

We want to keep this library fresh and useful for all Iranian developers, so we need your help adding new features, fixing bugs and adding more documentation.

Wondering how you can contribute to this project? Here is a list of what you can do:

1. Documentation is not enough? You can help us by adding more.
2. The current code could be better? You can make it cleaner or faster.
3. Do you think a useful feature is missing? You can open an issue and tell us about it.
4. Did you find a good, open and free font that supports Farsi and English?
You can notify us with a pull request or by opening an issue.

# Common Problems

## Farsi Letters are separated

If you see separated Farsi letters in your output, pass the `no_reshape=True` parameter to your `WordCloudFa` constructor:

```python
wordcloud = WordCloudFa(no_reshape=True)
```

## I See Repeated Farsi Words

In some cases you may see repeated Farsi words in the output. To solve that problem, pass the `collocations=False` parameter to your `WordCloudFa` constructor:

```python
wordcloud = WordCloudFa(collocations=False)
```

## I Have Problem in Running Example Scripts

On some operating systems, like Windows, you should specify the encoding of the example text files. If you cannot open the example files, add `encoding="utf-8"` to your open statements:

```python
with open('persian-example.txt', 'r', encoding="utf-8") as file:
```

# There is any problem?

If you have questions, find a bug, or need a feature, you can open an issue and tell us. If for some strange reason that is not possible, contact me at this email: `salam@alihoseiny.ir`.

# Citations

Texts in the `Example` directory are from [this](https://fa.wikipedia.org/wiki/%D8%A7%DB%8C%D8%B1%D8%A7%D9%86) and [this](https://en.wikipedia.org/wiki/Iran) Wikipedia page.