https://github.com/willdunklin/lecture-web-scraper
Last synced: 19 days ago
- Host: GitHub
- URL: https://github.com/willdunklin/lecture-web-scraper
- Owner: willdunklin
- Created: 2021-09-20T22:59:57.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-09-21T00:20:28.000Z (almost 5 years ago)
- Last Synced: 2025-03-05T01:45:11.964Z (over 1 year ago)
- Language: Python
- Size: 2.93 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# lecture-web-scraper
Recently, one of my favorite online classes of the pandemic period ended, and I wanted to save the lecture videos for posterity.
Unluckily for me, the video host Echo360 made it remarkably difficult to download any content on their site.
So I set out to make this project: a Python web scraper with an ffmpeg step that will download Echo360 lectures to your heart's content.
## How it works
The project uses Selenium, more specifically selenium-wire, to open content on Echo360 and track the network traffic.
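As a rough illustration of the traffic-tracking idea, selenium-wire exposes the requests a browser session made through `driver.requests`, where each entry has a `url` and a `response` that is `None` until a response arrives. The sketch below is not the project's actual code: the helper name `completed_request_urls` is hypothetical, and a stand-in object replaces the real browser so the snippet runs without one.

```python
from types import SimpleNamespace


def completed_request_urls(driver):
    """Return the URLs of captured requests that received a response.

    `driver` is assumed to look like a selenium-wire webdriver: it has a
    `requests` list whose items carry `url` and `response` attributes.
    """
    return [req.url for req in driver.requests if req.response is not None]


# Stand-in for a real selenium-wire driver (assumption, for illustration only).
fake_driver = SimpleNamespace(
    requests=[
        SimpleNamespace(url="https://example.invalid/lecture", response=object()),
        SimpleNamespace(url="https://example.invalid/pending", response=None),
    ]
)

print(completed_request_urls(fake_driver))
```

With a real session, the driver would come from `seleniumwire.webdriver` (for example `webdriver.Chrome()`) after calling `driver.get(...)` on the lecture page.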
I investigated the source of the media and found it to be two tailored URLs, one for audio and the other for video (even though both were in the same .m4s file format).
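Picking the media URLs out of the captured traffic could look something like the following minimal sketch, which uses `re` to keep only URLs whose path ends in `.m4s`. The function name `find_m4s_urls` and the example URLs are made up for illustration. How to tell the audio URL from the video URL is not described here, so the sketch does not attempt it.

```python
import re

# Matches ".m4s" at the end of the URL or right before a query string/fragment.
M4S_PATTERN = re.compile(r"\.m4s(?=$|[?#])")


def find_m4s_urls(urls):
    """Return the unique .m4s URLs from `urls`, preserving first-seen order."""
    found = []
    for url in urls:
        if M4S_PATTERN.search(url) and url not in found:
            found.append(url)
    return found


# Hypothetical captured URLs; real Echo360 URLs will look different.
captured = [
    "https://example.invalid/page.html",
    "https://example.invalid/media/track-a.m4s?token=abc",
    "https://example.invalid/media/track-b.m4s",
    "https://example.invalid/media/track-a.m4s?token=abc",
    "https://example.invalid/m4s/info.json",
]

for url in find_m4s_urls(captured):
    print(url)
```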
After downloading the files, all that is left to do is to combine them using ffmpeg to output a shiny, brand new mp4.
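The merge step can be sketched as a plain ffmpeg invocation that takes both files as inputs and copies the streams into one container without re-encoding. This example calls the ffmpeg command-line tool through `subprocess` rather than going through ffmpeg-python, and the file names and function names are placeholders, so treat it as an approximation of the idea rather than the project's own code.

```python
import subprocess


def build_merge_command(video_path, audio_path, output_path):
    """Build an ffmpeg argument list that muxes one video and one audio file."""
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it already exists
        "-i", video_path,  # first input: the video track
        "-i", audio_path,  # second input: the audio track
        "-c", "copy",      # copy streams as-is instead of re-encoding
        output_path,
    ]


def merge(video_path, audio_path, output_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_merge_command(video_path, audio_path, output_path), check=True)


# Placeholder file names (assumptions, not the project's actual paths).
print(build_merge_command("video.m4s", "audio.m4s", "lecture.mp4"))
```

Calling `merge("video.m4s", "audio.m4s", "lecture.mp4")` requires the ffmpeg binary to be on the PATH. Stream copy keeps the merge fast, but it assumes the downloaded codecs are ones the mp4 container accepts.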
### Dependencies
- ffmpeg
- ffmpeg-python
- selenium-wire
- re (part of the Python standard library)
- requests