https://github.com/deemeetree/datascrape

data scrape of NPOV Wikipedia articles for further discourse diversity analysis

# datascrape

`WikipediaScrape-ContestedNeutrality.ipynb`

This notebook scrapes some of the articles listed on the https://en.wikipedia.org/w/index.php?title=Category:All_NPOV_disputes page, which lists all articles whose neutrality is contested. It saves them to an XML file, which can then be processed by https://gitlab.com/mattiasostmar/discoursediversity to identify correlations between discourse diversity and NPOV measures. NPOV is set to FALSE for these articles.
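As a rough illustration of the listing step, the sketch below collects article titles from the category through the MediaWiki API (`action=query`, `list=categorymembers`). This is an assumption made for the example: the notebook may instead parse the category page's HTML. The function names, the `User-Agent` string, and the default limits are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def category_query_url(category, limit=50, cont=None):
    """Build a MediaWiki API URL that lists the pages in `category`."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": 0,  # main (article) namespace only
        "cmlimit": limit,
        "format": "json",
    }
    if cont:
        # Continuation token returned by the previous request, if any.
        params["cmcontinue"] = cont
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_members(response):
    """Return (titles, continuation token or None) from a decoded API response."""
    titles = [m["title"] for m in response["query"]["categorymembers"]]
    cont = response.get("continue", {}).get("cmcontinue")
    return titles, cont


def fetch_titles(category, max_titles=100):
    """Collect up to `max_titles` article titles (performs network requests)."""
    titles, cont = [], None
    while len(titles) < max_titles:
        req = urllib.request.Request(
            category_query_url(category, cont=cont),
            headers={"User-Agent": "datascrape-example/0.1"},  # hypothetical UA
        )
        with urllib.request.urlopen(req) as resp:
            batch, cont = parse_members(json.load(resp))
        titles.extend(batch)
        if cont is None:  # no further result pages
            break
    return titles[:max_titles]
```

Calling `fetch_titles("Category:All_NPOV_disputes", max_titles=100)` would return a sample of titles; the article text for each title still has to be downloaded separately.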

`WikipediaScrape-GoodArticles.ipynb`

This notebook scrapes some of the articles listed on the https://en.wikipedia.org/wiki/Wikipedia:Good_articles/all page, which lists all articles that have been marked as "good". We assume that these articles are sufficiently neutral (NPOV stands for neutral point of view). It saves them to an XML file, which can then be processed by https://gitlab.com/mattiasostmar/discoursediversity to identify correlations between discourse diversity and NPOV measures. NPOV is set to TRUE for these articles.
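The labelling and export step shared by both notebooks can be sketched as follows. The XML layout here (an `articles` root, `article` elements with an `npov` attribute, and `title` and `text` children) is a hypothetical schema chosen for illustration; the format actually expected by discoursediversity is not described in this README.

```python
import xml.etree.ElementTree as ET


def articles_to_xml(articles, npov):
    """Serialize (title, text) pairs to an XML string, labelling each with `npov`."""
    root = ET.Element("articles")  # hypothetical root element
    label = "TRUE" if npov else "FALSE"
    for title, text in articles:
        node = ET.SubElement(root, "article", {"npov": label})
        ET.SubElement(node, "title").text = title
        ET.SubElement(node, "text").text = text
    # ElementTree escapes characters such as & and < in the text nodes.
    return ET.tostring(root, encoding="unicode")
```

Under this assumed layout, the good-articles notebook would call `articles_to_xml(articles, npov=True)` and the contested-neutrality notebook would pass `npov=False`, so the two output files differ only in their label.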