https://github.com/gildas-lormeau/single-filez-cli
- Host: GitHub
- URL: https://github.com/gildas-lormeau/single-filez-cli
- Owner: gildas-lormeau
- License: agpl-3.0
- Created: 2022-05-31T21:24:20.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-02-15T17:33:18.000Z (over 1 year ago)
- Last Synced: 2025-04-30T17:08:30.750Z (5 months ago)
- Language: JavaScript
- Size: 10.2 MB
- Stars: 38
- Watchers: 2
- Forks: 3
- Open Issues: 2
Metadata Files:
- Readme: README.MD
- License: LICENSE
# SingleFileZ CLI (Command Line Interface)
## Introduction
SingleFileZ can be launched from the command line by running it in a
(headless) browser. It runs through Node.js as a standalone script injected into
the web page instead of being embedded into a WebExtension. To connect to the
browser, it can use [Puppeteer](https://github.com/GoogleChrome/puppeteer) or
[Selenium WebDriver](https://www.npmjs.com/package/selenium-webdriver).
Alternatively, it can emulate a browser with JavaScript disabled by using
[jsdom](https://github.com/jsdom/jsdom).

## Installation with Docker
- Installation from Docker Hub
`docker pull capsulecode/singlefilez`
`docker tag capsulecode/singlefilez singlefilez`
- Manual installation
`git clone --depth 1 --recursive https://github.com/gildas-lormeau/single-filez-cli.git`
`cd single-filez-cli`
`docker build --no-cache -t singlefilez .`
- Run
`docker run singlefilez "https://www.wikipedia.org"`
- Run and redirect the result into a file
`docker run singlefilez "https://www.wikipedia.org" > wikipedia.html`
(Linux/UNIX or Windows with `cmd.exe`)
- Run and mount a volume to get the saved file in the current directory
  - Save one page
    `docker run -v %cd%:/usr/src/app/out singlefilez "https://www.wikipedia.org" wikipedia.html`
    (Windows)
    `docker run -v $(pwd):/usr/src/app/out singlefilez "https://www.wikipedia.org" wikipedia.html`
    (Linux/UNIX)
  - Save one or multiple pages by using the filename template (see
    `--filename-template` option)
    `docker run -v %cd%:/usr/src/app/out singlefilez "https://www.wikipedia.org" --dump-content=false`
    (Windows)
    `docker run -v $(pwd):/usr/src/app/out singlefilez "https://www.wikipedia.org" --dump-content=false`
    (Linux/UNIX)

## Manual installation
- Make sure Chrome or Firefox is installed and the executable can be found
  through the `PATH` environment variable. Otherwise you will need to set the
  `--browser-executable-path` option to help SingleFileZ locate it. As an
  alternative to Chrome and Firefox, you can use jsdom by setting the
  `--back-end` option to `jsdom`.
- Install [Node.js](https://nodejs.org)
- There are 3 ways to download the code of SingleFileZ; choose the one you
  prefer (`npm` is installed with Node.js):
  - Download and install globally with `npm`
    `npm install -g "gildas-lormeau/single-filez-cli"`
  - Download and manually unzip the
    [master archive](https://github.com/gildas-lormeau/single-filez-cli/archive/master.zip)
    provided by GitHub
    `unzip master.zip`
    `cd single-filez-cli-master`
    `npm install`
  - Download with `git`
    `git clone --depth 1 --recursive https://github.com/gildas-lormeau/single-filez-cli.git`
    `cd single-filez-cli`
    `npm install`
- Make `single-filez` executable (Linux/Unix/BSD etc.) if SingleFileZ is not
  installed globally.
  `chmod +x single-filez`
- To use Firefox instead of Chrome, you must download the
  [Selenium WebDriver](https://www.npmjs.com/package/selenium-webdriver)
  component (i.e. `geckodriver` for Firefox). Make sure it can be found through
  the `PATH` environment variable or the `cli` folder. Otherwise you will need
  to set the `--web-driver-executable-path` option to help WebDriver locate
  the executable.

## Run
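
Before the first run, it can help to confirm that the prerequisites from the manual installation are actually reachable through `PATH`. The following is a minimal sketch, not part of SingleFileZ: the helper name `first_on_path` is hypothetical, and the browser executable names are assumptions that vary between systems and distributions.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first command from the
# argument list that is found on PATH; return non-zero if none is found.
first_on_path() {
  for name in "$@"; do
    if found=$(command -v "$name" 2>/dev/null); then
      printf '%s\n' "$found"
      return 0
    fi
  done
  return 1
}

# The candidate names below are assumptions; adjust them for your system.
if browser=$(first_on_path google-chrome chromium chromium-browser firefox); then
  echo "Browser found: $browser"
else
  echo "No browser on PATH: set --browser-executable-path or use --back-end=jsdom"
fi

# geckodriver is only needed for the Firefox (webdriver-gecko) back end.
if ! first_on_path geckodriver >/dev/null; then
  echo "geckodriver not on PATH: set --web-driver-executable-path to use Firefox"
fi
```

The script only reports what it finds and does not change anything. It also does not look in the `cli` folder, so a `geckodriver` placed there is reported as missing from `PATH`.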
- Syntax
`single-filez <url> [output] [options ...]`
- Display help
`single-filez --help`
- Examples
  - Save https://www.wikipedia.org into `wikipedia.html` in the current folder
    `single-filez https://www.wikipedia.org wikipedia.html`
  - Save https://www.wikipedia.org into `wikipedia.html` in the current folder
    with Firefox instead of Chrome
    `single-filez https://www.wikipedia.org wikipedia.html --back-end=webdriver-gecko`
  - Save a list of URLs stored in `list-urls.txt` in the current folder
    `single-filez --urls-file=list-urls.txt`
  - Save https://www.wikipedia.org and crawl its internal links with the query
    parameters removed from the URL
    `single-filez https://www.wikipedia.org --crawl-links=true --crawl-inner-links-only=true --crawl-max-depth=1 --crawl-rewrite-rule="^(.*)\\?.*$ $1"`
  - Save https://www.wikipedia.org and external links only
    `single-filez https://www.wikipedia.org --crawl-links=true --crawl-inner-links-only=false --crawl-external-links-max-depth=1 --crawl-rewrite-rule="^.*wikipedia.*$"`
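
To see what the query-stripping rewrite rule from the examples does to a URL, the same regular expression can be tried with `sed`. This is a sketch under an assumption: it treats the rule as a pattern followed by a space and a replacement in which `$1` is the first capture group, which is how the example reads but is not confirmed here. The function name `strip_query` is hypothetical. The second half shows how a POSIX shell treats the rule's quoting, since `$1` inside double quotes is expanded by the shell before the program sees it.

```shell
#!/bin/sh
# Hypothetical helper: emulate the rule "^(.*)\?.*$ $1" with sed,
# keeping everything before the "?" (sed writes the group as \1).
strip_query() {
  printf '%s\n' "$1" | sed -E 's/^(.*)\?.*$/\1/'
}

strip_query 'https://www.wikipedia.org/wiki/Main_Page?action=history'
strip_query 'https://www.wikipedia.org/'   # no "?", so the URL is unchanged

# Quoting check: make $1 empty, as it usually is in an interactive shell.
set -- ''
double="^(.*)\\?.*$ $1"   # the shell collapses \\ to \ and expands $1
single='^(.*)\?.*$ $1'    # single quotes pass every character through
printf 'double-quoted: [%s]\n' "$double"
printf 'single-quoted: [%s]\n' "$single"
```

In a POSIX shell the double-quoted form loses the `$1` replacement, while the single-quoted form keeps it. Whether that difference matters depends on how `single-filez` parses the rule and on the shell in use (`cmd.exe` does not expand `$1`).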
## Troubleshooting
- If the error message
  `UnhandledPromiseRejectionWarning: Error: Browser is not downloaded. Run "npm install" or "yarn install" at ChromeLauncher.launch`
  is displayed, it probably means that `single-filez` was not able to find the
  executable of the browser. Passing the complete path of the executable to
  `single-filez` with the `--browser-executable-path` option fixes this issue.
- If saving a page takes an unusually long time, this may be due to a timeout
  error that was automatically recovered. Setting `--browser-wait-until` to a
  lower value (e.g. `networkidle0` or `load` instead of `networkidle2`) fixes
  this issue.

## License
SingleFileZ is licensed under AGPL. Code derived from third-party projects is
licensed under MIT. Please contact me at gildas.lormeau <at> gmail.com if
you are interested in licensing the SingleFileZ code for a commercial service or
product.