https://github.com/linuxscout/arabic-stemmers-tester
Arabic test for stemmers
- Host: GitHub
- URL: https://github.com/linuxscout/arabic-stemmers-tester
- Owner: linuxscout
- License: gpl-3.0
- Created: 2018-08-26T08:01:39.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-06-21T21:30:12.000Z (almost 3 years ago)
- Last Synced: 2023-03-11T10:12:25.931Z (about 2 years ago)
- Language: Python
- Size: 5.6 MB
- Stars: 7
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
## Test Stemmers
### Datasets
* nafis
* gold
* qi: quranic_corpus
* qc: quran_index
* qcw: quran corpus words
* kb: kabi

## Config file
A config file in the scripts directory lets you choose which stemmers to test.
### Run tests on a dataset
For an individual dataset:
```
make gold
```
* Replace gold with qi, qc or nafis to test another dataset.
* Results are stored in output/${dataset}.csv.
* Statistics are stored in output/${dataset}.csv.stats.

Run a stemmer on a dataset:
```
make khoja_nafis
```

Run a stemmer on all datasets:
```
make khoja_all
make moataz_all
make assem_all
make farasa_all
```
Results are stored in output/processed.

For Tashaphyne stemmers on all datasets:
```
make test_all
```
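The naming conventions above can be sketched in Python as a quick way to see which files a run should produce. This is only an illustration: the helper names `dataset_command` and `result_paths` are hypothetical and not part of the repository, and the sketch covers only the per-dataset targets and output paths stated in this README.

```python
from pathlib import Path

# Dataset targets named in this README for individual runs.
DATASETS = ["gold", "qi", "qc", "nafis"]


def dataset_command(dataset):
    """Return the make command for one dataset, e.g. ['make', 'gold']."""
    return ["make", dataset]


def result_paths(dataset, output_dir="output"):
    """Return (results, statistics) paths following output/${dataset}.csv[.stats]."""
    results = Path(output_dir) / f"{dataset}.csv"
    stats = results.with_name(results.name + ".stats")
    return results, stats


if __name__ == "__main__":
    # Dry run: print each command and the files it is expected to write.
    for name in DATASETS:
        results, stats = result_paths(name)
        print(" ".join(dataset_command(name)), "->", results, stats)
```

To actually launch a run instead of printing it, each command list could be passed to `subprocess.run(cmd, check=True)` from the repository root.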
### Merge all files
To merge all result files into joined files:
```
make join_all
```
The merged files are stored in output/joined.
### Collect test statistics after tests
You can collect statistics without running the tests again.
***Individual datasets***
```
make eval_gold
```
Change to eval_qi, eval_qc or eval_nafis for the other datasets.
***All datasets***
```
make eval_all
```
Statistics are stored in output/stats.
## Visualize statistics in LaTeX and Excel
Results can be visualized and converted into Excel and LaTeX files:
```
make visualize
```
Global stats are stored in the output/visuale directory, which contains:
* a tex file
* chart images
* pivot tables of the evaluation
### Run Statistics on Datasets
Show dataset statistics:
```
make show_gold
```
Replace gold with qi, qc or nafis for the other datasets.
To show stats for all datasets:
```
make show_all
```
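The all-dataset targets in this README could be chained from Python as in the minimal sketch below. The ordering (test, join, evaluate, visualize) is an assumption based on the section order, and `run_pipeline` is a hypothetical helper, not something the repository provides. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Make targets taken from this README; the order is an assumption.
PIPELINE = ["test_all", "join_all", "eval_all", "visualize"]


def run_pipeline(targets=PIPELINE, dry_run=True):
    """Run (or just print) one make command per target, stopping on the first failure."""
    commands = [["make", target] for target in targets]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if make exits with a non-zero status.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

Calling `run_pipeline(dry_run=False)` from the repository root would execute the targets for real.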