https://github.com/stopsopa/html-scraper-browserless

Host: GitHub
URL: https://github.com/stopsopa/html-scraper-browserless
Owner: stopsopa
Created: 2018-07-11T13:07:26.000Z (almost 8 years ago)
Default Branch: master
Last Pushed: 2023-09-01T23:29:55.000Z (almost 3 years ago)
Last Synced: 2025-02-06T08:13:16.734Z (over 1 year ago)
Language: JavaScript
Size: 558 KB
Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: Readme.md

# Deprecated
(Deprecated -> use the better https://github.com/stopsopa/html-scraper-browserless) Microservice tool for scraping HTML from "any" page.

I wouldn't recommend it now; it's just old. But I'll leave it here.

# Installation:

Clone this repository and go to its main directory, then run:
make install
cp config.js.dist config.js

Manually change the basic auth password in config.js, then run:

make start

# Usage:

Just visit:

http://localhost:8811/generate
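
If you prefer to call the endpoint from a script instead of a browser, the following is a minimal sketch, not code from this repository. It assumes that `/generate` is protected by the basic auth password set in `config.js`; the function names, the username, and the password are hypothetical placeholders.

```javascript
const http = require('http');

// Build the value of an HTTP Basic Authorization header.
function basicAuthHeader(user, password) {
  return 'Basic ' + Buffer.from(`${user}:${password}`).toString('base64');
}

// Build request options for the /generate endpoint.
// Host and port default to the values shown in this README.
function generateRequestOptions(user, password, host = 'localhost', port = 8811) {
  return {
    host,
    port,
    path: '/generate',
    method: 'GET',
    headers: { Authorization: basicAuthHeader(user, password) },
  };
}

// Perform the request and resolve with the status code and the response body.
// Nothing is sent until this function is called.
function requestGenerate(user, password) {
  return new Promise((resolve, reject) => {
    const req = http.request(generateRequestOptions(user, password), (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve({ status: res.statusCode, body }));
    });
    req.on('error', reject);
    req.end();
  });
}
```

With the service running, `requestGenerate('someuser', 'somepassword').then(console.log)` would print the status and body; replace both credentials with whatever your `config.js` expects.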

# Current execution environment:

- node v8.9.4
- yarn
- Docker version 18.03.1-ce, build 9ee9f40

# Ping:

http://xx.xx.xx.xx:8811/html-scraper-ping
http://slowwly.robertomurray.co.uk/delay/32000/url/https://github.com/stopsopa/docker-puppeteer-pdf-generator
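
The ping URL can also be checked from a script. This is a rough sketch under stated assumptions: the README does not say what the ping endpoint returns, so treating any 2xx status as "alive" is an assumption, and `pingUrl`, `isAlive`, and `ping` are hypothetical helper names. The host argument is whatever replaces `xx.xx.xx.xx` in your deployment.

```javascript
const http = require('http');

// Build the ping URL for a given host; the port defaults to the one used in this README.
function pingUrl(host, port = 8811) {
  return `http://${host}:${port}/html-scraper-ping`;
}

// Assumption: any 2xx status code means the service is up.
function isAlive(statusCode) {
  return statusCode >= 200 && statusCode < 300;
}

// Resolve to true if the service answers with a 2xx status before the timeout,
// and to false on a non-2xx status, a connection error, or a timeout.
function ping(host, timeoutMs = 5000) {
  return new Promise((resolve) => {
    const req = http.get(pingUrl(host), (res) => {
      res.resume(); // discard the body, only the status matters here
      resolve(isAlive(res.statusCode));
    });
    req.setTimeout(timeoutMs, () => req.destroy(new Error('timeout')));
    req.on('error', () => resolve(false));
  });
}
```

For a local instance, `ping('localhost').then(console.log)` prints `true` or `false` depending on whether the service responds.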

# Useful things (irrelevant now):

docker run -it --rm puppeteer-alpine-generate-pdf /usr/bin/chromium-browser --version
$ Chromium 64.0.3282.168

or, if you follow the node:8-slim approach (https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md#running-puppeteer-in-docker):
docker run -it --rm --cap-add=SYS_ADMIN puppeteer-chrome-linux /usr/bin/google-chrome-unstable --version
Google Chrome 68.0.3438.3 dev

on macOS:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --version
$ Google Chrome 66.0.3359.181
/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --version
$ Google Chrome 69.0.3445.0 canary
