https://github.com/seanbreckenridge/overrustle_parser
extract my messages from the overrustlelogs archive (twitch chat logs)
chatlog chatlogs overrustlelogs twitch twitch-tv
- Host: GitHub
- URL: https://github.com/seanbreckenridge/overrustle_parser
- Owner: seanbreckenridge
- License: mit
- Created: 2021-04-30T22:09:17.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-09-23T19:58:08.000Z (over 1 year ago)
- Last Synced: 2023-09-24T00:52:20.117Z (over 1 year ago)
- Topics: chatlog, chatlogs, overrustlelogs, twitch, twitch-tv
- Language: Python
- Homepage:
- Size: 12.7 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# overrustle_parser
Some code to extract my messages from [this overrustle logs archive torrent](https://web.archive.org/web/20210920224341/https://www.reddit.com/r/Destiny/comments/gcapu0/overrustle_logs_archive_torrent_as_of_april_30/). For those unaware, OverRustle collated logs from popular twitch channels for a couple of years but was shut down in 2020 -- so this is just to grab some of my old messages so I still have access to them.
Thought the [twitch data request](https://www.twitch.tv/p/en/legal/privacy-choices/#user-privacy-requests) would've given me my chat logs but sadly did not.
Expects:
- the logs directory (which has a bunch of `.7z` files in it) as the first argument
- your twitch username as the second argument

Extracts the `.7z` files one by one into a temporary directory in the current directory, finds any of my logs, then removes the temporary directory. Can take multiple days to run depending on your computer; it is a *lot* of data (`~48G` when compressed).
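The extract, scan, clean up cycle described above could be sketched roughly like this. It is not the code from `parse.py`: the `extract` callable, the `workdir` parameter, the `*.txt` file layout and the `[timestamp] nick: message` line shape are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List


def matching_lines(root: Path, username: str) -> Iterator[str]:
    """Yield each log line under root that appears to be written by username.

    Assumption: logs are plain-text files whose lines look like
    '[timestamp] nick: message'. The real archive layout may differ.
    """
    needle = f"] {username.lower()}: "
    for path in sorted(root.rglob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as f:
            for line in f:
                if needle in line.lower():
                    yield line.rstrip("\n")


def scan_archive(
    archive: Path,
    username: str,
    extract: Callable[[Path, Path], None],  # hypothetical: unpacks archive into a directory
    workdir: Path = Path("."),
) -> List[str]:
    """Extract one archive into a temporary directory, scan it, then clean up."""
    # TemporaryDirectory deletes the extracted files when the block exits,
    # even if scanning raises, so only one archive is unpacked at a time.
    with tempfile.TemporaryDirectory(dir=workdir) as tmp:
        extract(archive, Path(tmp))
        return list(matching_lines(Path(tmp), username))
```

Passing the extractor in as a callable keeps the sketch independent of any particular 7z library. If the `py7zr` package happens to be installed, one option would be `lambda archive, dest: py7zr.SevenZipFile(archive, mode="r").extractall(path=dest)`.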
Saves results to a `./` directory -- one JSON file per channel. A file is saved even if no logs are found for that channel, so if this crashes, it can be restarted and already-processed files will be skipped. To combine those into a single file, you can use [`jq`](https://github.com/stedolan/jq), like `jq '.[]' <.//* | jq -r --slurp > comments.json`
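A minimal sketch of that resume behaviour, plus a Python alternative to the `jq` merge, might look like the following. It assumes each per-channel file holds a JSON array of message objects. The names `process_all`, `combine`, `find_messages` and `out_dir` are hypothetical and not taken from `parse.py`.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def process_all(
    channels: Iterable[str],
    out_dir: Path,
    find_messages: Callable[[str], List[dict]],  # hypothetical per-channel search
) -> None:
    """Write one JSON file per channel, skipping channels already done."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for channel in channels:
        target = out_dir / f"{channel}.json"
        if target.exists():
            continue  # finished in a previous run
        messages = find_messages(channel)
        # Write even an empty list, so the file doubles as a "done" marker.
        # Writing to a temporary name first avoids leaving a half-written
        # marker behind if the process dies mid-write.
        tmp = target.with_suffix(".json.tmp")
        tmp.write_text(json.dumps(messages), encoding="utf-8")
        tmp.replace(target)


def combine(out_dir: Path) -> List[dict]:
    """Flatten every per-channel JSON array into a single list."""
    combined: List[dict] = []
    for path in sorted(out_dir.glob("*.json")):
        combined.extend(json.loads(path.read_text(encoding="utf-8")))
    return combined
```

With the combined list in hand, `len(combine(out_dir))` gives the number of comments, and `len({m["channel"] for m in combine(out_dir)})` gives the number of distinct channels, assuming each message carries a `channel` field.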
Created to be used as part of [HPI](https://github.com/seanbreckenridge/HPI)
### Example Usage
```bash
git clone https://github.com/seanbreckenridge/overrustle_parser
cd ./overrustle_parser
python3 -m pip install -r ./requirements.txt
python3 parse.py ~/Downloads/OverrustleLogs\ Archive/ moobot
```
Personally resulted in:
```bash
$ jq <* '.[] | .dt' | wc -l
1585 # number of comments
$ jq -r <* '.[] | .channel' | sort -u | wc -l
43 # from this many channels
```
To run tests:
```bash
python3 -m pip install pytest
python3 -m pytest parse.py
```