Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/haroonshakeel/roman_urdu_hate_speech
- Host: GitHub
- URL: https://github.com/haroonshakeel/roman_urdu_hate_speech
- Owner: haroonshakeel
- License: MIT
- Created: 2020-10-02T06:23:15.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-09-03T05:40:20.000Z (almost 3 years ago)
- Last Synced: 2024-01-21T04:05:06.758Z (5 months ago)
- Size: 882 KB
- Stars: 10
- Watchers: 2
- Forks: 8
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Lists
- indicnlp_catalog - Roman Urdu Offensive Language Detection, 2020 (Text Corpora / Hate Speech and Offensive Comments)
- awesome-urdu - Hate Speech & Offensive Language Detection, 2020 - 10k tweets (Urdu Datasets / Urdu Sentiment Datasets)
README
# Roman Urdu Hate-speech and Offensive Language Detection
### Embeddings
The embeddings can be found at: https://drive.google.com/drive/folders/1_ZeoYMyBTb2sROeKmxS0quVrbvJBrct2?usp=sharing

Note that ``ver1`` is trained on only ``0.3 million`` tweets, while ``ver2`` is trained on ``4.7 million`` tweets.
``label_definitions.txt`` contains the label mappings for both tasks (i.e., the coarse-grained and fine-grained labels).
# Reference
```bibtex
@inproceedings{rizwan2020hate,
title={Hate-speech and offensive language detection in roman Urdu},
author={Rizwan, Hammad and Shakeel, Muhammad Haroon and Karim, Asim},
booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
pages={2512--2522},
year={2020}
}
```