Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/euberdeveloper/tum-conf-scraper
This is a scraper written in Node.js and using Puppeteer that gets the videos served on TUM conf (zoom)
https://github.com/euberdeveloper/tum-conf-scraper
npm puppeteer scraper typescript video-scraper
Last synced: 20 days ago
JSON representation
This is a scraper written in Node.js and using Puppeteer that gets the videos served on TUM conf (zoom)
- Host: GitHub
- URL: https://github.com/euberdeveloper/tum-conf-scraper
- Owner: euberdeveloper
- License: mit
- Created: 2022-08-09T19:52:14.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-09-15T12:38:07.000Z (over 2 years ago)
- Last Synced: 2024-12-09T22:59:13.426Z (about 2 months ago)
- Topics: npm, puppeteer, scraper, typescript, video-scraper
- Language: TypeScript
- Homepage: https://tum-conf-scraper.euber.dev
- Size: 490 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
![Build](https://github.com/euberdeveloper/tum-conf-scraper/workflows/Build/badge.svg)
[![Commitizen friendly](https://img.shields.io/badge/commitizen-friendly-brightgreen.svg)](http://commitizen.github.io/cz-cli/)
[![License](https://img.shields.io/npm/l/tum-conf-scraper.svg)](https://github.com/euberdeveloper/tum-conf-scraper/blob/main/LICENSE)
[![GitHub issues](https://img.shields.io/github/issues/euberdeveloper/tum-conf-scraper.svg)](https://github.com/euberdeveloper/tum-conf-scraper/issues)
[![GitHub stars](https://img.shields.io/github/stars/euberdeveloper/tum-conf-scraper.svg)](https://github.com/euberdeveloper/tum-conf-scraper/stargazers)
![npm](https://img.shields.io/npm/v/tum-conf-scraper.svg)# tum-conf-scraper
This is a scraper written in Node.js and using Puppeteer that gets the videos served by [Tum Conf](https://tum-conf.zoom.us) services.## Install
To install tum-conf-scraper, run:
```bash
$ npm install tum-conf-scraper
```## Project purpose
This module is written because videos hosted on [Tum Conf](tum-conf.zoom.us) are difficult to download and watchable only in the browser. By using the module **[video-scraper-core](https://www.npmjs.com/package/video-scraper-core)**, I created this module, that allows those videos to be recorderd.
## Project usage
To scrape a video available at "https://tum-conf.zoom.us/rec/share/myvideo" and save it to "./saved.webm":
```javascript
const { TumConfVideoScraper } = require('tum-conf-scraper');async function main() {
// Create an instance of the scraper
const scraper = new TumConfVideoScraper('mypasscode', {
debug: true
});
// Launch the Chrome browser
await scraper.launch();
// Scrape and save the video
await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
// Close the browser
await scraper.close();
}
main();
```To scrape and download more than one video:
```javascript
const { TumConfVideoScraper } = require('tum-conf-scraper');async function main() {
// Create an instance of the scraper
const scraper = new TumConfVideoScraper('mypasscode', {
debug: true
});
// Launch the Chrome browser
await scraper.launch();
// Scrape and save the first video
await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm');
// Scrape and save the second video
await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm');
// Close the browser
await scraper.close();
}
main();
```To scrape and download in parallel more than one video:
```javascript
const { TumConfVideoScraper } = require('tum-conf-scraper');async function scrape(dest, link) {
// Create an instance of the scraper
const scraper = new TumConfVideoScraper('mypasscode', {
debug: true
});
// Launch the Chrome browser
await scraper.launch();
// Scrape and save the video
await scraper.scrape(link, dest);
// Close the browser
await scraper.close();
}async function main() {
const tasks = [
['./saved.webm', 'https://tum-conf.zoom.us/rec/share/myvideo'],
['./saved_bis.webm', 'https://tum-conf.zoom.us/rec/share/myvideo-bis']
].map(([dest, link]) => scrape(dest, link));
await Promise.all(tasks);
}
main();
```With custom options:
```javascript
const { TumConfVideoScraper } = require('tum-conf-scraper');async function main() {
// Browser options
const scraper = new TumConfVideoScraper('mypasscode', {
debug: true,
debugScope: 'This will be written as scope of the euberlog debug',
windowSize: {
width: 1000,
height: 800
},
browserExecutablePath: '/usr/bin/firefox'
});
await scraper.launch();// Scraping options
await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo', './saved.webm', { duration: 1000 });
await scraper.scrape('https://tum-conf.zoom.us/rec/share/myvideo-bis', './saved_bis.webm', {
audio: false,
delayAfterVideoStarted: 3000,
delayAfterVideoFinished: 2000
});await scraper.close();
}
main();
```...all the options can be seen in the API section or with the Typescript definitions.
## API
The documentation site is: [tum-conf-scraper documentation](https://tum-conf-scraper.euber.dev)
The documentation for development site is: [tum-conf-scraper dev documentation](https://tum-conf-scraper-dev.euber.dev)
### TumConfVideoScraper
The TumConfVideoScraper class, that scrapes a video from a "BBB WebKonferenze" and saves it to a file.
**Syntax:**
`const scraper = new TumConfVideoScraper(passcode, options);`
**Parameters:**
* __passcode__: A `string` that specifies the passcode to access the video page.
* __options__: Optional. A `BrowserOptions` object that specifies the options for this instance.**Methods:**
* __setBrowserOptions(options: BrowserOptions): void__: Changes the browser options with the ones given by the `options` parameter.
* __launch(): Promise__: Launches the browser window.
* __close(): Promise__: Closes the browser window.
* __scrape(url: string, destPath: string, options: ScrapingOptions): Promise__: Scrapes the video in `url` and saves it to `destPath`. Some ScrapingOptions can be passed.### BrowserOptions
The options given to the TumConfVideoScraper constructor, see [video-scraper-core](https://github.com/euberdeveloper/video-scraper-core#browseroptions) for more information.
### ScrapingOptions
The options passing to a scrape method, see [video-scraper-core](https://github.com/euberdeveloper/video-scraper-core#scrapingoptions) for more information.
### Errors
There are also some error classes that can be thrown by this module, see [video-scraper-core](https://github.com/euberdeveloper/video-scraper-core#errors) for more information.
## Tests
The package is tested by using **[jest](https://jestjs.io/)** and **[ts-jest](https://www.npmjs.com/package/ts-jest)**. The tests try for real to download some videos and check if they are saved, therefore, are not run in the CI because they are not headless.
## Notes
* The default browser is **Google Chrome** on `/usr/bin/google-chrome`, because Chromium did not support the BBB videos. You can always change the browser executable path on the configurations.
* By default (if the **duration** option is `null`), the **duration of the recording will be automatically detected** by looking at the vjs player of the page and by adding a stopping delay of 15 seconds.
* This module can be uses only in **headful mode**.