Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rania-azad/FakeKurdNews---Fake-Kurdish-News-Dataset
- Host: GitHub
- URL: https://github.com/rania-azad/FakeKurdNews---Fake-Kurdish-News-Dataset
- Owner: rania-azad
- Created: 2021-05-26T06:42:30.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2021-05-26T06:47:36.000Z (over 3 years ago)
- Last Synced: 2024-08-04T01:17:35.677Z (4 months ago)
- Size: 2.82 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-kurdish - FakeKurdNews
README
# FakeKurdNews---Fake-Kurdish-News-Dataset
Context
Most previous studies have focused on detecting fake news in English, owing to the well-known, openly available annotated fake news corpora and the variety of fact-checkers around the world, while less-resourced languages such as Kurdish have been left behind. Although Kurdish is spoken by more than 30 million people worldwide, it is still considered less-resourced in the Natural Language Processing (NLP) domain because NLP tools are inaccessible and labeled corpora are scarce or unavailable.
This repository hosts a fake news dataset created for a research project at the College of Informatics, Sulaimani Polytechnic University, Iraq. Full details about data collection, pre-processing, and the classifiers used on this dataset are given in the accompanying paper.
Content
Our dataset consists of 3 sets of news articles crawled from Facebook pages, in the Kurdish language only, covering different subjects.
Each article is labeled 0 (fake) or 1 (credible). The dataset consists of:
- 5000 articles labeled as true
- 5000 articles labeled as false
- 5000 articles automatically modified from the real articles to create fake news
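
As a quick sanity check on the labeling scheme described above, the following minimal sketch tallies articles per label from a CSV file. It is only an illustration: the README does not state the file format or column names, so the CSV layout, the `label` column name, and the helper `count_labels` are all assumptions, and only the 0 (fake) / 1 (credible) encoding comes from the description above.

```python
import csv
from collections import Counter
from typing import TextIO

# Label encoding as described in the README: 0 = fake, 1 = credible.
LABEL_NAMES = {"0": "fake", "1": "credible"}


def count_labels(handle: TextIO, label_column: str = "label") -> Counter:
    """Count articles per label in a CSV file-like object.

    The column name "label" is a hypothetical default; adjust it to
    match the actual header of the dataset files.
    """
    reader = csv.DictReader(handle)
    counts: Counter = Counter()
    for row in reader:
        raw = row[label_column].strip()
        if raw not in LABEL_NAMES:
            # Fail loudly on anything other than the two documented labels.
            raise ValueError(f"unexpected label {raw!r}")
        counts[LABEL_NAMES[raw]] += 1
    return counts
```

With a real file, this could be called as `count_labels(open(path, encoding="utf-8", newline=""))`, where `path` points to one of the dataset files. Opening the file as UTF-8 matters here because the article text is in Kurdish.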