https://github.com/carlcolglazier/kaskeda
A machine learning project investigating global media, information diffusion, and polarization on news aggregation websites.
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/carlcolglazier/kaskeda
- Owner: CarlColglazier
- Created: 2018-07-06T15:58:27.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2018-07-10T17:52:32.000Z (over 6 years ago)
- Last Synced: 2024-10-15T18:00:48.017Z (2 months ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 22.5 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
# "Kaskeda"
At the moment, this is a project investigating media, information
diffusion, propaganda, and polarization on [reddit]. The data being
used mostly comes from [Pushshift], which is maintained by Jason
Baumgartner.

The software in this repository is designed to facilitate large-scale
analytics on reddit data. It has been tested on a server with 64 cores
and can iterate through almost the entire history of reddit
submissions (2005-2017) in under a minute.

---
## Why?
According to a [2017 Pew Research Center report][pew2017], 6% of U.S.
adults use reddit and 4% of U.S. adults get news from the site.

---
## Set up
Reddit archives can be obtained from [Pushshift]. The [file server]
contains a mostly complete dataset of reddit comments, submissions,
and subreddits.

```sh
git clone git@github.com:CarlColglazier/kaskeda.git
cd kaskeda
mkdir data
```

[reddit]: https://reddit.com/ "reddit: the front page of the internet"
[Pushshift]: https://pushshift.io/
[file server]: https://files.pushshift.io/reddit/
[pew2017]: http://www.journalism.org/2017/09/07/news-use-across-social-media-platforms-2017/
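
As a rough illustration of what can be done once archives are placed in
`data/`, the sketch below streams submission files and counts link
domains. It is not taken from this repository. The `RS_*.bz2` file
naming, the bz2 compression, the one-JSON-object-per-line layout, and the
`domain` field are all assumptions about the Pushshift dumps, and
`iter_submissions`, `count_domains`, and `count_domains_in_dir` are
hypothetical helper names.

```python
import bz2
import json
from collections import Counter
from pathlib import Path


def iter_submissions(path):
    """Yield one dict per non-empty line of a bz2-compressed NDJSON file."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def count_domains(path):
    """Count how often each link domain appears in a single archive file."""
    counts = Counter()
    for submission in iter_submissions(path):
        # Assumption: submissions carry a "domain" key; skip records without one.
        domain = submission.get("domain")
        if domain:
            counts[domain] += 1
    return counts


def count_domains_in_dir(data_dir="data", pattern="RS_*.bz2"):
    """Merge the per-file counts for every matching archive in data_dir."""
    total = Counter()
    for archive in sorted(Path(data_dir).glob(pattern)):
        total.update(count_domains(archive))
    return total


if __name__ == "__main__":
    # Print the ten most common domains across all archives found in data/.
    print(count_domains_in_dir().most_common(10))
```

Because each archive file is counted independently, the per-file step
could be spread across worker processes, for example with
`multiprocessing.Pool`, and the resulting counters merged afterwards.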