https://github.com/AshkanArabim/persian-word-extractor

This script creates a list of unique words from Persian text. Words can be sorted by frequency or in alphabetical order. This is a new project, so there could be major bugs in the code.

persian python3 word-extraction


## persian-word-extractor
This script creates a list of unique words from Persian text. Words can be sorted by how often they appear in the source text or in alphabetical order. This is a new project, so there could be major bugs in the code.
**Words with accent marks are excluded from results.**
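
One way to picture the exclusion rule is sketched below. It assumes that "accent marks" means Persian/Arabic diacritics such as the kasra, which Unicode classifies as nonspacing marks (category `Mn`). The helper name `has_accent_marks` is hypothetical and is not taken from `main.py`, which may detect these words differently.

```python
import unicodedata


def has_accent_marks(word: str) -> bool:
    """Return True if `word` contains a nonspacing mark (Unicode category Mn)."""
    return any(unicodedata.category(ch) == "Mn" for ch in word)


# Escapes are used so the diacritic is visible in the source.
plain = "\u06A9\u062A\u0627\u0628"          # "ketab" written without diacritics
marked = "\u06A9\u0650\u062A\u0627\u0628"   # same word with a kasra (U+0650)

# Keep only the words that carry no diacritics.
kept = [w for w in (plain, marked) if not has_accent_marks(w)]
```

Here `kept` ends up holding only the unmarked spelling.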

## Features:
* Sort by frequency or alphabetical order
* Extract words from source.txt or online links
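
The two sort orders can be sketched roughly as follows. This is an illustration only: `sort_unique` is a hypothetical name, the input is assumed to be a list of words that has already been extracted, and the actual script may order ties or letters differently.

```python
from collections import Counter


def sort_unique(words, order="frequency"):
    """Return the unique words, sorted by frequency or alphabetically."""
    counts = Counter(words)
    if order == "frequency":
        # Most frequent first; ties fall back to code point order.
        return sorted(counts, key=lambda w: (-counts[w], w))
    # Plain code point order. This only approximates the Persian alphabet,
    # because letters such as پ, چ, ژ and گ sit at higher code points
    # than most of the Arabic letters.
    return sorted(counts)


sample = "سلام دنیا سلام".split()
by_frequency = sort_unique(sample)
alphabetical = sort_unique(sample, "alphabetical")
```

A proper Persian collation would need a locale-aware sort key, which this sketch leaves out.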

## How to use:
1. Create a file named 'source.txt' in the root directory and paste the source text into it.
2. Run 'main.py'.
3. Follow the CLI instructions.
4. Results will be written to 'output.txt' in the root directory.
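
The file flow in the steps above might look something like the sketch below. It is not the code in `main.py`: the function name `extract_to_file` is hypothetical, words are split on whitespace only, and the CLI prompts and the sorting step are left out. The one detail worth copying is the explicit UTF-8 encoding, since Persian text is not ASCII.

```python
from pathlib import Path


def extract_to_file(source="source.txt", output="output.txt"):
    """Read `source`, write each unique word on its own line to `output`."""
    text = Path(source).read_text(encoding="utf-8")
    # dict.fromkeys drops duplicates while keeping first-appearance order.
    words = list(dict.fromkeys(text.split()))
    Path(output).write_text("\n".join(words) + "\n", encoding="utf-8")
    return len(words)
```

Calling `extract_to_file()` with its defaults reads `source.txt` from the current directory and writes `output.txt` next to it, returning the number of unique words written.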

**Feel free to tweak the code to suit your needs.**

## How did I use it?
I ran this script on a large body of Persian text to extract words for contribution to [Monkeytype](https://monkeytype.com). I added the "Persian 1k" & "Persian 5k" tests. My first open-source contribution!!