Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/arstgit/videox
Download HTML5 videos from a website page.
- Host: GitHub
- URL: https://github.com/arstgit/videox
- Owner: arstgit
- License: mit
- Created: 2020-10-02T03:23:52.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2020-10-08T22:58:09.000Z (about 4 years ago)
- Last Synced: 2024-04-24T14:29:40.095Z (7 months ago)
- Topics: mediasource-extensions
- Language: JavaScript
- Homepage:
- Size: 11.7 KB
- Stars: 4
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# videox
Download HTML5 videos from a website page using Media Source Extensions (MSE).
Note:
1. videox is designed for pages that use the Media Source Extensions (MSE) technique. For pages that use other techniques, for example simply embedding an HTTP URL in the video tag, videox will throw an error.
2. Some pages serve video ads using the same technique as the actual video content, MSE. videox can't distinguish them, so by default it downloads all video ads along with the actual video. The easiest way to deal with this is to use a browser with an ad-blocking extension. Alternatively, you can modify this program as needed, since it's just a web crawler based on puppeteer.
# Prerequisites
- Chrome. Needed if the website serves the video you want as MP4, which is usually the case. Otherwise, the Chromium that puppeteer downloads automatically is enough.
# Design
[https://www.tiaoxingyubolang.com/zh/article/2020-10-09_mediasource](https://www.tiaoxingyubolang.com/zh/article/2020-10-09_mediasource)
# Usage
```js
const path = require('path');
const Videox = require('videox');

const targetUrl = 'https://www.youtube.com/watch?v=h32FxBqmu_U';

(async () => {
  const videox = new Videox({
    debug: true,
    headless: true,
    downloadBrowser: false,
    logTo: process.stdout,
    browserExecutePath: '/usr/bin/chromium',
    browserArgs: ['--no-sandbox'],
    downloadAsFile: true,
    downloadPath: path.join(__dirname, 'download'),
    checkCompleteLoopInterval: 100,
    waitForNextDataTimeout: 8000,
  });

  // Launch the browser, download the videos on the page, then clean up.
  await videox.init();
  await videox.get(targetUrl);
  await videox.destroy();
})();
```
# API
## Class: Videox
### Event: 'data'
- `objectURL` The URL created by `URL.createObjectURL`; it usually starts with `blob`.
- `mimeCodec` The corresponding mimeCodec.
- `chunk` The data received from the page.

If `options.downloadAsFile` is set to `false`, you must listen for this event to receive the media data.
Together, `objectURL` and `mimeCodec` identify the media file that `chunk` belongs to.
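As a minimal sketch of consuming this event, the snippet below groups chunks by the `objectURL` and `mimeCodec` pair. It assumes that a `Videox` instance behaves like a Node.js `EventEmitter`, that the listener receives `objectURL`, `mimeCodec`, and `chunk` as positional arguments in the order listed above, and that `chunk` is a `Buffer` or `Uint8Array`. None of these is confirmed here. `collectChunks` is a hypothetical helper, and a plain `EventEmitter` stands in for the instance so the sketch runs without a browser.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: group incoming chunks by the (objectURL, mimeCodec)
// pair that identifies one media file.
function collectChunks(emitter) {
  const files = new Map();
  // Assumption: the listener receives (objectURL, mimeCodec, chunk) in this order.
  emitter.on('data', (objectURL, mimeCodec, chunk) => {
    const key = `${objectURL}|${mimeCodec}`;
    if (!files.has(key)) files.set(key, []);
    // Assumption: chunk is a Buffer or Uint8Array.
    files.get(key).push(Buffer.from(chunk));
  });
  return files;
}

// Stand-in emitter, used here instead of a real Videox instance.
const fakeVideox = new EventEmitter();
const files = collectChunks(fakeVideox);

fakeVideox.emit('data', 'blob:video-1', 'video/mp4', Buffer.from([1, 2]));
fakeVideox.emit('data', 'blob:audio-1', 'audio/mp4', Buffer.from([3]));
fakeVideox.emit('data', 'blob:video-1', 'video/mp4', Buffer.from([4]));

for (const [key, chunks] of files) {
  console.log(key, Buffer.concat(chunks).length, 'bytes');
}
```

With a real instance, the same helper would be attached before calling `videox.get()`, with `downloadAsFile` set to `false`.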
### new Videox([options])
- `options`
  - `debug` Default: false.
  - `headless` Default: true.
  - `downloadBrowser` Default: false.
  - `logTo` Default: process.stdout.
  - `browserExecutePath` Default: '/usr/bin/chromium'.
  - `browserArgs` Default: [].
  - `downloadAsFile` Default: true.
  - `downloadPath` Default: ''.
  - `checkCompleteLoopInterval` The interval between checks of whether the current download is complete, in milliseconds. Default: 100.
  - `waitForNextDataTimeout` The timeout when waiting for the next media data, in milliseconds. Default: 3000.
- `Returns`:

Usually `downloadBrowser` is false and `browserExecutePath` is set to the path of an installed browser, so that MP4 video can be downloaded using a browser other than the default Chromium. See the `puppeteer` package for more information.
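One plausible reading of the two timing options is a polling loop. Every `checkCompleteLoopInterval` milliseconds it checks whether data has stopped arriving for at least `waitForNextDataTimeout` milliseconds. The sketch below illustrates that idea only. `waitUntilIdle` and `getLastDataTime` are hypothetical names, and videox's actual completion check may work differently.

```javascript
// Hypothetical helper: resolves once no new data has been seen for
// `waitForNextDataTimeout` ms, checking every `checkCompleteLoopInterval` ms.
// `getLastDataTime` is a caller-supplied function returning the timestamp
// (in ms) at which the most recent chunk arrived.
function waitUntilIdle(
  getLastDataTime,
  { checkCompleteLoopInterval = 100, waitForNextDataTimeout = 3000 } = {}
) {
  return new Promise((resolve) => {
    const timer = setInterval(() => {
      const idleFor = Date.now() - getLastDataTime();
      if (idleFor >= waitForNextDataTimeout) {
        clearInterval(timer);
        resolve();
      }
    }, checkCompleteLoopInterval);
  });
}

// Demo with short timings: one "chunk" arrives after 20 ms, then nothing.
let lastDataTime = Date.now();
setTimeout(() => { lastDataTime = Date.now(); }, 20);

waitUntilIdle(() => lastDataTime, {
  checkCompleteLoopInterval: 10,
  waitForNextDataTimeout: 50,
}).then(() => console.log('no data for 50 ms, treating download as complete'));
```

Under this reading, a larger `waitForNextDataTimeout` tolerates slow networks better but delays the point at which a finished download is recognized.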
### videox.init()
- `Returns`:
### videox.get(pageUrl)
- `pageUrl` Required.
- `Returns`:
### videox.destroy()
- `Returns`:
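
Because `init()`, `get()`, and `destroy()` are awaited in sequence in the usage example, one way to make sure the browser is always cleaned up is to wrap `get()` in `try`/`finally`. This is a sketch under that assumption. `downloadPage` is a hypothetical helper, not part of videox, and a fake object stands in for a real instance so the snippet runs on its own.

```javascript
// Hypothetical wrapper: runs the init/get/destroy sequence and calls
// destroy() even when get() rejects.
async function downloadPage(videox, pageUrl) {
  // If init() itself fails, destroy() is not called in this sketch.
  await videox.init();
  try {
    await videox.get(pageUrl);
  } finally {
    await videox.destroy();
  }
}

// Stand-in object with the same three methods as the documented API.
const fakeInstance = {
  init: async () => console.log('init'),
  get: async (url) => console.log('get', url),
  destroy: async () => console.log('destroy'),
};

downloadPage(fakeInstance, 'https://example.com/video-page')
  .catch((err) => console.error('download failed:', err.message));
```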