https://github.com/rattletat/caption-party
Topic modeling of German parties based on YouTube video data
caption command-line-tool natural-language-processing party politics subtitles topic-modeling youtube-api youtube-dl
- Host: GitHub
- URL: https://github.com/rattletat/caption-party
- Owner: rattletat
- License: mit
- Created: 2018-09-20T21:52:44.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2019-09-05T16:16:22.000Z (over 6 years ago)
- Last Synced: 2025-01-18T18:53:48.942Z (over 1 year ago)
- Topics: caption, command-line-tool, natural-language-processing, party, politics, subtitles, topic-modeling, youtube-api, youtube-dl
- Language: Jupyter Notebook
- Homepage:
- Size: 90.7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# caption-party
This script downloads subtitles and other metadata
from YouTube channels using the [YouTube API v3](https://developers.google.com/youtube/v3/docs/) and [youtube-dl](https://github.com/ytdl-org/youtube-dl/) (which avoids heavy API quota costs).
Topic and metadata analysis is done in [Jupyter notebooks](https://jupyter.org/) in `analysis\`.
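
As a rough illustration of the youtube-dl side of the download, the sketch below builds options that save captions without downloading the video itself. The helper names `build_subtitle_options` and `download_captions`, the default language `de`, and the output template are assumptions made for this example and may differ from what `connect.py` actually does; the option keys themselves are standard youtube-dl options.

```python
def build_subtitle_options(output_dir, languages=("de",)):
    """Return youtube-dl options that fetch captions but skip the video itself.

    Hypothetical helper: the real script may configure youtube-dl differently.
    """
    return {
        "skip_download": True,      # do not download the video/audio stream
        "writesubtitles": True,     # save uploader-provided subtitles
        "writeautomaticsub": True,  # also save auto-generated captions
        "subtitleslangs": list(languages),
        "outtmpl": f"{output_dir}/%(id)s.%(ext)s",  # one file per video id
        "quiet": True,
    }


def download_captions(video_url, output_dir):
    """Download only the captions of a single video (assumed workflow)."""
    # Imported lazily so the options helper works without youtube-dl installed.
    import youtube_dl

    with youtube_dl.YoutubeDL(build_subtitle_options(output_dir)) as ydl:
        ydl.download([video_url])
```

Because the captions come from youtube-dl, the YouTube API would only be needed for listing videos and their metadata, which is presumably where the quota savings come from.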
![Top 20 words in the three months before the Bundestag elections using [TF-IDF](https://en.wikipedia.org/wiki/Tf%E2%80%93idf)](rsc/tfidf.png)
## Script arguments:
- **fetch**:
*Download video captions for one or more parties to
`subtitles\{party}\{video}`.
Use the argument `all` to fetch videos
from every party specified in the JSON file.*
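
The handling of `all` and the output layout described above could be sketched as follows. The function names and the shape of the party mapping (party name to channel id) are hypothetical; the layout of the real JSON file is not documented here.

```python
from pathlib import Path


def resolve_parties(requested, channels):
    """Expand the special value 'all' and reject unknown party names.

    `channels` is an assumed mapping of party name -> channel id,
    as it might be loaded from the JSON file.
    """
    if "all" in requested:
        return sorted(channels)
    unknown = [name for name in requested if name not in channels]
    if unknown:
        raise ValueError(f"Unknown parties: {', '.join(unknown)}")
    return list(requested)


def caption_dir(party, video_id, root="subtitles"):
    """Build the target path subtitles/{party}/{video} in a portable way."""
    return Path(root) / party / video_id
```

For example, with a mapping containing `spd` and `cdu`, `resolve_parties(["all"], channels)` would return both names, and `caption_dir("spd", "abc")` would point at the `abc` folder under `subtitles` and `spd`.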
## Examples:
- `python connect.py fetch spd cdu --after-date 01.01.2017 --before-date 01.01.2018`
- `python connect.py fetch all --videos-per-channel -1 --key client_secret.json`
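
The date options in the first example have to be turned into timestamps before they can be used as API filters. A minimal sketch, assuming the dates are given as day.month.year and that the script filters via the `publishedAfter` and `publishedBefore` parameters of the YouTube Data API `search.list` endpoint, which expect RFC 3339 timestamps. Both points are assumptions; the function names are hypothetical.

```python
from datetime import datetime, timezone


def to_rfc3339(date_text):
    """Convert 'DD.MM.YYYY' (assumed input format) to an RFC 3339 UTC timestamp."""
    day = datetime.strptime(date_text, "%d.%m.%Y").replace(tzinfo=timezone.utc)
    return day.strftime("%Y-%m-%dT%H:%M:%SZ")


def build_date_filter(after_date=None, before_date=None):
    """Return only the date parameters that were actually supplied."""
    params = {}
    if after_date:
        params["publishedAfter"] = to_rfc3339(after_date)
    if before_date:
        params["publishedBefore"] = to_rfc3339(before_date)
    return params


print(build_date_filter("01.01.2017", "01.01.2018"))
# {'publishedAfter': '2017-01-01T00:00:00Z', 'publishedBefore': '2018-01-01T00:00:00Z'}
```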
## Requirements:
- [Google API Client](https://github.com/googleapis/google-api-python-client)
- [Google Auth Library](https://github.com/googleapis/google-auth-library-python)
- [youtube-dl](https://github.com/ytdl-org/youtube-dl)
- [click](https://github.com/pallets/click)
- [word_cloud](https://github.com/amueller/word_cloud)
`sudo pip install google-api-python-client google-auth youtube-dl click wordcloud`