https://github.com/hrbrmstr/gepetto
🎎 ScrapingHub Splash-like REST API for Headless Chrome
hapi hapijs headless-chrome node-js nodejs npm r-cyber splash
- Host: GitHub
- URL: https://github.com/hrbrmstr/gepetto
- Owner: hrbrmstr
- Created: 2018-08-23T16:40:43.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-08-24T05:01:05.000Z (about 6 years ago)
- Last Synced: 2024-07-31T19:20:49.258Z (3 months ago)
- Topics: hapi, hapijs, headless-chrome, node-js, nodejs, npm, r-cyber, splash
- Language: JavaScript
- Homepage: https://hrbrmstr.github.io/gepetto/index.html
- Size: 243 KB
- Stars: 15
- Watchers: 3
- Forks: 4
- Open Issues: 1
Metadata Files:
- Readme: README.md
# gepetto
[ScrapingHub Splash](https://github.com/scrapinghub/splash)-like REST API for [Headless Chrome](https://developers.google.com/web/updates/2017/04/headless-chrome) based on [Puppeteer](https://github.com/GoogleChrome/puppeteer/blob/v1.7.0/docs/api.md)
## Description
Splash is a lightweight, scriptable browser as a service with an HTTP API. This project aims to create the same for Headless Chrome, duplicating the high-level Splash API and offering functionality similar to that provided by Splash's Lua interface (without supporting Lua).
The goal is not to become a full [WebDriver](https://www.w3.org/TR/webdriver/) (i.e., it's not aiming to replace [Selenium](https://www.seleniumhq.org/projects/webdriver/)) but to provide a straightforward, concise facility for loading URLs in a JavaScript context and getting HTML, PDF, or screenshot data back.
It requires a recent installation of [Node.js](https://nodejs.org/en/) and [npm](https://www.npmjs.com/).
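
To make the stated goal concrete, here is a minimal sketch of the underlying idea written directly against Puppeteer. It is not `gepetto`'s own code and does not use its REST API. The names `normalizeFormat` and `render` are hypothetical, the set of formats is an assumption drawn from the description above, and `puppeteer` is assumed to be installed separately.

```javascript
// Hypothetical sketch: load a URL in Headless Chrome and return HTML, PDF,
// or screenshot data. This is NOT gepetto's implementation.

const FORMATS = ['html', 'pdf', 'png'];

// Validate and normalize the requested output format.
function normalizeFormat(format) {
  const f = String(format).trim().toLowerCase();
  if (!FORMATS.includes(f)) {
    throw new Error(`Unsupported format: ${format}`);
  }
  return f;
}

// Render `url` and resolve to an HTML string, or to binary data for
// 'pdf' and 'png' (a Buffer or Uint8Array, depending on the Puppeteer version).
async function render(url, format = 'html') {
  const fmt = normalizeFormat(format);
  // Required lazily so normalizeFormat works without Puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until the network is idle so script-generated content is present.
    await page.goto(url, { waitUntil: 'networkidle0' });
    if (fmt === 'html') return await page.content();
    if (fmt === 'pdf') return await page.pdf();
    return await page.screenshot({ fullPage: true });
  } finally {
    // Always release the browser, even if navigation fails.
    await browser.close();
  }
}
```

A call such as `render('https://example.com', 'pdf')` would then resolve to the PDF bytes. A service like `gepetto` puts this kind of logic behind HTTP endpoints so that callers do not need Puppeteer themselves.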
## Installation
    npm install https://gitlab.com/hrbrmstr/gepetto.git --global
or:

    git clone https://github.com/hrbrmstr/gepetto
    cd gepetto
    npm install [-g] # use -g for global installation, which may require sudo on some systems

This will grab all the dependencies and also download a module-local copy of Chromium for the platform you are on.
## What's Inside the Tin?
### Starting the Server
If you only performed a local installation, then you can fire up `gepetto` in the module directory with:

    $ node index
    🚀 Launch browser!
    👍 gepetto running on: http://localhost:3000

If you already have a service running on TCP port 3000, then you can change the port `gepetto` uses via:

    $ PORT=#### node index

If you performed a global installation, you now have a `gepetto` command on your `PATH` and can just do:

    $ gepetto

or:

    $ PORT=#### gepetto

You can use the `HOST` environment variable to change what IP address the service listens on.
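
As an illustration of how a Node.js service can honor these two variables, the sketch below resolves a listen address from the environment. It is a guess at the general pattern, not `gepetto`'s actual startup code. The function name `resolveListenAddress` is hypothetical, the default port of 3000 comes from the startup message above, and the default host of `localhost` is an assumption based on that same message.

```javascript
// Hypothetical sketch: resolve the listen address from HOST and PORT.
// This is NOT taken from gepetto's source.
function resolveListenAddress(env = process.env) {
  // Assumed default; the startup message above shows "localhost".
  const host = env.HOST || 'localhost';
  // Fall back to 3000 when PORT is unset or empty.
  const port =
    env.PORT === undefined || env.PORT === '' ? 3000 : Number(env.PORT);
  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    throw new RangeError(`Invalid PORT: ${env.PORT}`);
  }
  return { host, port };
}
```

Validating the port up front means a typo such as `PORT=80a0` fails with a clear message at startup instead of a confusing bind error later.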
### API Documentation
There is online API documentation for `gepetto` at the `/documentation` endpoint. For example, if you're running with the defaults, you can go to http://localhost:3000/documentation to see the API documentation.
Static documentation is available at `docs/index.html` from the module's top-level directory.
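
If the service runs on a non-default host or port, the address of the online documentation changes accordingly. The small helper below is a sketch of how that address could be derived. The name `documentationUrl` is hypothetical, and the defaults assume the `localhost:3000` address shown in the startup message.

```javascript
// Hypothetical helper: build the URL of the /documentation endpoint
// for a given host and port (defaults assumed from the startup message).
function documentationUrl(host = 'localhost', port = 3000) {
  return new URL('/documentation', `http://${host}:${port}`).href;
}

console.log(documentationUrl()); // prints "http://localhost:3000/documentation"
```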