https://github.com/motazsaad/process-arabic-text

# Pre-process Arabic Text
Pre-process Arabic text by removing diacritics, punctuation, and repeated characters.

## Usage
```
Usage: clean_arabic_text.py [-h] -i INFILE -o OUTFILE

Pre-process arabic text (remove diacritics, punctuations, and repeating
characters).

optional arguments:
  -h, --help            show this help message and exit
  -i INFILE, --infile INFILE
                        input file.
  -o OUTFILE, --outfile OUTFILE
                        out file.
```
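
The help text above is in the format produced by Python's `argparse` module. As a rough illustration only, a parser with the same two required options could be declared as follows. The function name `build_parser` is hypothetical and this is not taken from the repository's source.

```python
import argparse


def build_parser():
    """Build a parser with the options listed in the help text (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="clean_arabic_text.py",
        description="Pre-process arabic text (remove diacritics, "
                    "punctuations, and repeating characters).",
    )
    # Both options are required, matching "-i INFILE -o OUTFILE" in the usage line.
    parser.add_argument("-i", "--infile", required=True, help="input file.")
    parser.add_argument("-o", "--outfile", required=True, help="out file.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-i", "infile.txt", "-o", "outfile.txt"])
print(args.infile, args.outfile)
```

Marking both options as `required=True` makes `argparse` exit with an error message when either file is missing from the command line.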

## Example

```
python clean_arabic_text.py -i infile.txt -o outfile.txt
```
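
The command above reads `infile.txt` and writes the cleaned text to `outfile.txt`. To give a feel for the three cleaning steps, here is a minimal sketch under stated assumptions. The function names, the exact character sets, and the choice to shorten runs of three or more identical characters to two are all assumptions made for illustration; the script in this repository may behave differently.

```python
import re
import string

# Assumption: "diacritics" means the marks U+064B..U+0652
# (fathatan through sukun, including shadda).
ARABIC_DIACRITICS = re.compile(r"[\u064B-\u0652]")

# Assumption: ASCII punctuation plus the Arabic comma, semicolon, and question mark.
ARABIC_PUNCTUATION = "\u060C\u061B\u061F"
PUNCTUATION_TABLE = str.maketrans("", "", string.punctuation + ARABIC_PUNCTUATION)

# Assumption: a "repeating character" is a run of three or more identical characters.
REPEATED_CHARS = re.compile(r"(.)\1{2,}")


def remove_diacritics(text):
    """Delete the Arabic diacritic marks defined above."""
    return ARABIC_DIACRITICS.sub("", text)


def remove_punctuation(text):
    """Delete ASCII and Arabic punctuation characters."""
    return text.translate(PUNCTUATION_TABLE)


def collapse_repeats(text):
    """Shorten any run of three or more identical characters to two."""
    return REPEATED_CHARS.sub(r"\1\1", text)


def clean_text(text):
    """Apply the three steps in order: diacritics, punctuation, repeats."""
    return collapse_repeats(remove_punctuation(remove_diacritics(text)))


print(clean_text("hellooooo, world!!!"))
```

Diacritics are removed first so that letters separated only by marks become adjacent before repeated runs are collapsed. Keeping two characters rather than one avoids damaging words that legitimately contain a doubled letter.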

## How to contribute
Contributions that improve the code are welcome. Please follow the steps below.
1. Fork the project.
2. Modify the code and test it to make sure it works.
3. Open a pull request.

If you need help with any of these steps, consult [GitHub Help](https://help.github.com/).