Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Prerender your JavaScript web page for better SEO with Google Chrome. ⚠️ [eats too much memory]
https://github.com/itbdw/server-render-javascript-chrome
chrome prerender server-side-rendering
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/itbdw/server-render-javascript-chrome
- Owner: itbdw
- Created: 2017-08-05T17:42:21.000Z (over 7 years ago)
- Default Branch: master
- Last Pushed: 2020-10-14T16:16:09.000Z (about 4 years ago)
- Last Synced: 2024-10-11T11:45:17.256Z (3 months ago)
- Topics: chrome, prerender, server-side-rendering
- Language: JavaScript
- Homepage:
- Size: 26.4 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:

Do not try this!

:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:
:warning::warning::warning::warning::warning::warning::warning::warning::warning::warning:

# server-render-javascript-chrome
Prerender your JavaScript web page for better SEO with Google Chrome.

## Dependency
Required:

- `Chrome`
- `NodeJS`

Suggested:

- `pm2`, to start the server, keep it running, and more.
## Install
> Suppose you are using an Ubuntu server with nginx as the web server.
1. NodeJS
```
curl -sL https://deb.nodesource.com/setup_6.x | sudo -E bash -
sudo apt-get install -y nodejs
```
2. Chrome (the latest stable version is needed for better performance)
```
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
dpkg -i google-chrome-stable_current_amd64.deb
apt-get install -f
apt install libosmesa6
ln -s /usr/lib/x86_64-linux-gnu/libOSMesa.so.6 /opt/google/chrome/libosmesa.so
```
3. Install and Run
Download this code to your server, say `/var/server/spider`. The directory
structure looks like this:

```
/var/server/spider/
craw.js
package.json
spider.js
```
```
cd /var/server/spider
npm install
npm install pm2 -g
PORT=3000 pm2 -f start spider.js
```

After it starts, you can use `pm2 logs` to monitor logs, `pm2 list` to display services, and more.
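For reference, here is a minimal sketch (an assumption for illustration, not the actual `spider.js`) showing how the `PORT` environment variable passed by the `pm2` command above could be consumed:

```
// Hypothetical sketch only, not the real spider.js implementation.
// It shows how PORT=3000 from the pm2 command above could be picked up.
const http = require('http');

const port = process.env.PORT || 3000; // falls back to 3000 if PORT is unset

http.createServer((req, res) => {
  // A real implementation would render req.url with headless Chrome here.
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end(`would prerender: ${req.url}\n`);
}).listen(port, () => {
  console.log(`prerender server listening on port ${port}`);
});
```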
4. Proxy Request
This assumes you use nginx as the web server and run Node.js and nginx on the same server.
```
upstream spider {
    server localhost:3000;
}

server {
    ...

    set $is_spider 0;
    set $is_server_render 0;

    if ($http_user_agent ~ Baiduspider) {
        set $is_spider 1;
    }

    if ($http_user_agent ~ Googlebot) {
        set $is_spider 1;
    }

    if ($http_user_agent ~ ServerRender) {
        set $is_server_render 1;
    }

    set $is_spider_is_render $is_spider$is_server_render;

    location / {
        ...

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # "10" means the request comes from a spider and not from the render
        # server itself, which avoids an infinite proxy loop.
        if ($is_spider_is_render = 10) {
            proxy_pass http://spider;
        }

        ...
    }

    ...
}
```

After changing the nginx config, don't forget to reload nginx.
You can make a request and check your nginx access log to verify that everything works:
`curl -A 'fake Googlebot by server-render-javascript' http://yourwebsite.com/abc`
You should see two lines in the nginx access log: one for your request with user-agent `fake Googlebot by server-render-javascript`, and one made by your upstream server with user-agent `ServerRenderJavascript` (if you have not changed the default user-agent in craw.js).

## What and how it works
Your website is rendered with JavaScript, but search engines (like Baidu, Sogou, and 360) do not handle such pages well, and some cannot understand JavaScript content at all.
So, why not run a browser on the server side? When a spider like Googlebot comes to your website, proxy the request to an upstream server (a `nodejs server`, for example). The upstream server handles the request with a headless browser, making a new request to the site just as a human would with Safari or Chrome, and returns the rendered content.

The workflow looks like this:
```
GoogleBot => Web Server => NodeJS Server => Make A Request Again With Server Browser => Get Web Content And Return
```
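To make the workflow concrete, below is a minimal sketch of such an upstream prerender server. It uses Puppeteer to drive headless Chrome, which is an assumption for illustration only; the actual `spider.js`/`craw.js` may work differently. The `SITE_ORIGIN` constant and the per-request browser launch are likewise illustrative choices, and the `ServerRenderJavascript` user-agent mirrors the default mentioned above.

```
// Hypothetical upstream prerender server sketch (not the actual spider.js/craw.js).
// It fetches the requested path from the real site with headless Chrome and
// returns the fully rendered HTML to nginx.
const http = require('http');
const puppeteer = require('puppeteer');

const SITE_ORIGIN = 'http://yourwebsite.com'; // assumption: the site being prerendered
const port = process.env.PORT || 3000;

http.createServer(async (req, res) => {
  let browser;
  try {
    browser = await puppeteer.launch({ headless: true });
    const page = await browser.newPage();
    // Tag the outgoing request so nginx recognizes it and does not proxy it again.
    await page.setUserAgent('ServerRenderJavascript');
    await page.goto(SITE_ORIGIN + req.url, { waitUntil: 'networkidle0' });
    const html = await page.content(); // HTML after client-side JavaScript has run
    res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
    res.end(html);
  } catch (err) {
    res.writeHead(500);
    res.end('render failed');
  } finally {
    if (browser) await browser.close();
  }
}).listen(port);
```

Launching a full Chrome instance for every request keeps the sketch simple but is expensive, which is consistent with the memory warning at the top of this README.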