Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/deemeetree/datascrape
data scrape of NPOV Wikipedia articles for further discourse diversity analysis
- Host: GitHub
- URL: https://github.com/deemeetree/datascrape
- Owner: deemeetree
- Created: 2019-04-16T18:16:30.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-04-16T22:45:51.000Z (over 5 years ago)
- Last Synced: 2024-08-02T14:10:33.567Z (3 months ago)
- Language: Jupyter Notebook
- Size: 646 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# datascrape
`WikipediaScrape-ContestedNeutrality.ipynb`
This notebook scrapes some of the articles listed on https://en.wikipedia.org/w/index.php?title=Category:All_NPOV_disputes, a category page that contains all articles whose neutrality is contested. It saves them to an XML file, which can then be processed by https://gitlab.com/mattiasostmar/discoursediversity to look for a correlation between the diversity of the discourse and NPOV measures. NPOV is set to FALSE in this notebook.
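One possible way to list the articles in that category is the MediaWiki `categorymembers` query. The sketch below is not taken from the notebook: the helper names (`category_members_url`, `titles_from_response`, `fetch_titles`) and the `User-Agent` string are assumptions made for illustration, and only `fetch_titles` touches the network.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def category_members_url(category, limit=50):
    """Build a MediaWiki API URL that lists pages in a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmnamespace": 0,  # main (article) namespace only
        "cmlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def titles_from_response(payload):
    """Extract page titles from a decoded categorymembers response."""
    members = payload.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]


def fetch_titles(category, limit=50):
    """Fetch one batch of titles (performs a network request)."""
    request = urllib.request.Request(
        category_members_url(category, limit),
        # Hypothetical client name; Wikipedia asks clients to identify themselves.
        headers={"User-Agent": "datascrape-example/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        return titles_from_response(json.load(response))
```

Calling `fetch_titles("Category:All_NPOV_disputes", limit=20)` would return a single batch of titles; a full scrape would also need to follow the continuation token that the API returns for large categories.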
`WikipediaScrape-GoodArticles.ipynb`
This notebook scrapes some of the articles listed on https://en.wikipedia.org/wiki/Wikipedia:Good_articles/all, a page that contains all articles marked as "good". We assume these articles are sufficiently neutral (NPOV stands for neutral point of view). It saves them to an XML file, which can then be processed by https://gitlab.com/mattiasostmar/discoursediversity to look for a correlation between the diversity of the discourse and NPOV measures. NPOV is set to TRUE in this notebook.
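To illustrate how scraped articles might be written to XML with an NPOV label, here is a minimal sketch using the standard library. The element and attribute names (`articles`, `article`, `title`, `text`, `npov`) and the function `articles_to_xml` are assumptions for this example; the notebooks' actual XML schema is not described here.

```python
import xml.etree.ElementTree as ET


def articles_to_xml(articles, npov):
    """Serialize (title, text) pairs to an XML string with an NPOV label.

    The schema is hypothetical: one <article> per page, labeled with
    npov="TRUE" or npov="FALSE".
    """
    root = ET.Element("articles")
    label = "TRUE" if npov else "FALSE"
    for title, text in articles:
        article = ET.SubElement(root, "article", {"npov": label})
        ET.SubElement(article, "title").text = title
        ET.SubElement(article, "text").text = text
    return ET.tostring(root, encoding="unicode")


# Example: label two placeholder articles as neutral (NPOV = TRUE).
xml_string = articles_to_xml(
    [("Example A", "First body."), ("Example B", "Second body.")],
    npov=True,
)
```

Under this assumed schema, the contested-neutrality notebook would call the same function with `npov=False`, so both outputs share one format and differ only in the label.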